Bibliographic Details
Main Authors: Tomita, Masayo; Hayashi, Katsuhiko; Kaneko, Tomoyuki
Format: Preprint
Published: 2025
Online Access: https://arxiv.org/abs/2502.15389
Summary:
  • Vision-Language Models (VLMs) occasionally generate outputs that contradict input images, which limits their reliability in real-world applications. Visual prompting is reported to suppress hallucinations by augmenting prompts with the relevant area of an image, but how the choice of that area affects its effectiveness remains unclear. This study analyzes success and failure cases of attention-driven visual prompting in object hallucination and finds that preserving background context is crucial for mitigating it.