Bibliographic Details
Main Authors: Tomita, Masayo; Hayashi, Katsuhiko; Kaneko, Tomoyuki
Format: Preprint
Published: 2025
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2502.15389
author Tomita, Masayo
Hayashi, Katsuhiko
Kaneko, Tomoyuki
contents Vision-Language Models (VLMs) occasionally generate outputs that contradict their input images, which limits their reliability in real-world applications. Visual prompting is reported to suppress hallucinations by augmenting prompts with the relevant area of an image, but how the choice of area affects its effectiveness remains uncertain. This study analyzes success and failure cases of attention-driven visual prompting in object hallucination and finds that preserving background context is crucial for mitigating object hallucination.
format Preprint
id arxiv_https___arxiv_org_abs_2502_15389
institution arXiv
publishDate 2025
record_format arxiv
title The Role of Background Information in Reducing Object Hallucination in Vision-Language Models: Insights from Cutoff API Prompting
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2502.15389