AI & ML New Capability

Identifies that 'attention imbalance' across modalities and tokens drives object hallucinations and proposes a decoding-time rectification (AIR) to fix it.

March 26, 2026

Original Paper

Mitigating Object Hallucinations in LVLMs via Attention Imbalance Rectification

Han Sun, Qin Li, Peixin Wang, Min Zhang

arXiv · 2603.24058

The Takeaway

This is a lightweight, training-free intervention that reduces hallucination rates in Large Vision-Language Models (LVLMs) by up to 35%. It gives practitioners a way to improve the reliability of deployed multimodal models in high-stakes scenarios, such as medical imaging or autonomous driving, without expensive retraining.

From the abstract

Object hallucination in Large Vision-Language Models (LVLMs) severely compromises their reliability in real-world applications, posing a critical barrier to their deployment in high-stakes scenarios such as autonomous driving and medical image analysis. Through systematic empirical investigation, we identify that the imbalanced attention allocation, both across modalities (i.e., vision and language) and within modalities (among individual tokens), exhibits a strong causal correlation with the occurrence of object hallucinations.
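The excerpt does not spell out AIR's exact rectification rule, but the core idea, reweighting an attention distribution so one modality no longer starves the other, can be sketched in a few lines. The function below is a hypothetical illustration, not the paper's method: it takes a decoder attention distribution, a mask marking which positions are vision tokens, and an assumed target share of attention mass for the vision modality, then rescales and renormalizes.

```python
def rebalance_attention(attn, vision_mask, target_vision_share=0.5):
    """Illustrative cross-modal attention rebalancing (not the paper's AIR rule).

    attn: list of non-negative attention weights over all tokens (sums to ~1).
    vision_mask: list of bools, True where the token comes from the image.
    target_vision_share: assumed fraction of total mass vision tokens should get.
    """
    # Current attention mass on each modality.
    v_mass = sum(a for a, is_v in zip(attn, vision_mask) if is_v)
    t_mass = sum(a for a, is_v in zip(attn, vision_mask) if not is_v)

    out = []
    for a, is_v in zip(attn, vision_mask):
        if is_v and v_mass > 0:
            # Scale vision tokens up (or down) to hit the target share.
            out.append(a * target_vision_share / v_mass)
        elif not is_v and t_mass > 0:
            # Give the remaining mass to language tokens.
            out.append(a * (1.0 - target_vision_share) / t_mass)
        else:
            out.append(a)

    # Renormalize so the result is a valid distribution.
    total = sum(out)
    return [a / total for a in out]

# Example: language tokens dominate (0.9 of the mass) before rectification.
attn = [0.05, 0.05, 0.45, 0.45]          # two vision, two language tokens
mask = [True, True, False, False]
fixed = rebalance_attention(attn, mask)   # vision share becomes 0.5
```

Within each modality the relative weights among tokens are preserved; only the cross-modal split changes. The paper additionally rectifies imbalance *within* modalities, which this sketch does not attempt.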