Detection and Correction of Object Hallucination using Attention Map and Gradient Information in LVLMs

Kazuki Yamaji

2:40 PM - 3:00 PM

[4I3-GS-7-03] Detection and Correction of Object Hallucination using Attention Map and Gradient Information in LVLMs

〇Kazuki Yamaji¹, Tomohiro Takagi¹ (1. Meiji University)

Keywords: Object Hallucination, multimodal, Large Vision-Language Models

Inspired by the strong language processing capabilities of Large Language Models (LLMs), recent work has sought to build Large Vision-Language Models (LVLMs) that incorporate powerful LLMs to improve performance on complex multimodal tasks. However, these LVLMs suffer from Object Hallucination: they recognize and describe objects that do not exist in the image, or they misrepresent the relationships between objects.
To address this problem, we propose a framework that detects and corrects Object Hallucination. The framework uses Attention Maps and gradient information within the LVLM to identify the specific parts of an image that cause Object Hallucination, and then applies corrections. Experiments evaluated with multiple quantitative metrics confirm that the proposed method reduces the occurrence of Object Hallucination.
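The abstract does not say how the Attention Maps and gradient information are combined, so the following is only an illustrative sketch and not the authors' method. It assumes one common choice from transformer explainability work, gradient-weighted attention: attention weights are multiplied element-wise by their gradients, negative contributions are clipped, and the result is averaged over heads to give a per-patch relevance score. The function names `patch_relevance` and `top_patches`, and the input shapes, are hypothetical.

```python
import numpy as np


def patch_relevance(attn, grad):
    """Gradient-weighted attention relevance per image patch (illustrative).

    attn, grad: arrays of shape (num_heads, num_patches). `attn` holds the
    attention weights from one generated token to the image patches, and
    `grad` holds the gradient of that token's score with respect to those
    weights. Both inputs are assumptions made for this sketch.
    """
    attn = np.asarray(attn, dtype=float)
    grad = np.asarray(grad, dtype=float)
    if attn.shape != grad.shape:
        raise ValueError("attn and grad must have the same shape")
    # Keep only positive contributions, then average over attention heads.
    rel = np.maximum(attn * grad, 0.0).mean(axis=0)
    total = rel.sum()
    # Normalize to sum to 1 when any patch has positive relevance.
    return rel / total if total > 0 else rel


def top_patches(relevance, k=1):
    """Return indices of the k most relevant patches, most relevant first."""
    return np.argsort(relevance)[::-1][:k].tolist()


if __name__ == "__main__":
    # Toy values: 2 heads, 3 image patches.
    attn = [[0.1, 0.6, 0.3], [0.2, 0.5, 0.3]]
    grad = [[1.0, 2.0, -1.0], [0.0, 1.0, 0.5]]
    rel = patch_relevance(attn, grad)
    print(rel, top_patches(rel, k=2))
```

In a real LVLM the two arrays would come from the model's cross-modal attention and a backward pass on the logit of the suspect object token. Patches with high relevance for a token naming an absent object would then be candidates for the image regions driving the hallucination.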


Presentation information

[4I3-GS-7] Language media processing:
