:: Library Catalog

Bibliographic Details
Main Authors:	Xing, Wenbin; Zha, Quanxing; Zu, Lizheng; Li, Mengran; Li, Ming; Yan, Junchi
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2602.00559

Similar Items

Mitigating Behavioral Hallucination in Multimodal Large Language Models for Sequential Images
by: You, Liangliang, et al.
Published: (2025)

Causal Decoding for Hallucination-Resistant Multimodal Large Language Models
by: Tan, Shiwei, et al.
Published: (2026)

ConVis: Contrastive Decoding with Hallucination Visualization for Mitigating Hallucinations in Multimodal Large Language Models
by: Park, Yeji, et al.
Published: (2024)

Mitigating Hallucinations in Video Large Language Models via Spatiotemporal-Semantic Contrastive Decoding
by: Gao, Yuansheng, et al.
Published: (2026)

Model Composition for Multimodal Large Language Models
by: Chen, Chi, et al.
Published: (2024)

SDCD: Structure-Disrupted Contrastive Decoding for Mitigating Hallucinations in Large Vision-Language Models
by: Xia, Yuxuan, et al.
Published: (2026)

Woodpecker: Hallucination Correction for Multimodal Large Language Models
by: Yin, Shukang, et al.
Published: (2023)

Temporal Insight Enhancement: Mitigating Temporal Hallucination in Multimodal Large Language Models
by: Sun, Li, et al.
Published: (2024)

Soften the Mask: Adaptive Temporal Soft Mask for Efficient Dynamic Facial Expression Recognition
by: Li, Meng-zhu, et al.
Published: (2025)

Self-Introspective Decoding: Alleviating Hallucinations for Large Vision-Language Models
by: Huo, Fushuo, et al.
Published: (2024)

OAD-Promoter: Enhancing Zero-shot VQA using Large Language Models with Object Attribute Description
by: Xu, Quanxing, et al.
Published: (2025)

ViBe: A Text-to-Video Benchmark for Evaluating Hallucination in Large Multimodal Models
by: Rawte, Vipula, et al.
Published: (2024)

Hallucination Augmented Contrastive Learning for Multimodal Large Language Model
by: Jiang, Chaoya, et al.
Published: (2023)

Mitigating Hallucination in Multimodal Large Language Model via Hallucination-targeted Direct Preference Optimization
by: Fu, Yuhan, et al.
Published: (2024)

Robust Multimodal Large Language Models Against Modality Conflict
by: Zhang, Zongmeng, et al.
Published: (2025)

Speculative Decoding Reimagined for Multimodal Large Language Models
by: Lin, Luxi, et al.
Published: (2025)

Investigating and Mitigating the Multimodal Hallucination Snowballing in Large Vision-Language Models
by: Zhong, Weihong, et al.
Published: (2024)

MESH -- Understanding Videos Like Human: Measuring Hallucinations in Large Video Models
by: Yang, Garry, et al.
Published: (2025)

TangramPuzzle: Evaluating Multimodal Large Language Models with Compositional Spatial Reasoning
by: Liu, Daixian, et al.
Published: (2026)

Dynamic Multimodal Activation Steering for Hallucination Mitigation in Large Vision-Language Models
by: Yin, Jianghao, et al.
Published: (2026)

NoiseBoost: Alleviating Hallucination with Noise Perturbation for Multimodal Large Language Models
by: Wu, Kai, et al.
Published: (2024)

CLASP: Class-Adaptive Layer Fusion and Dual-Stage Pruning for Multimodal Large Language Models
by: Dang, Yunkai, et al.
Published: (2026)

Mitigating Hallucinations in Large Vision-Language Models via Summary-Guided Decoding
by: Min, Kyungmin, et al.
Published: (2024)

Delve into Visual Contrastive Decoding for Hallucination Mitigation of Large Vision-Language Models
by: Lee, Yi-Lun, et al.
Published: (2024)

Mixture of Decoding: An Attention-Inspired Adaptive Decoding Strategy to Mitigate Hallucinations in Large Vision-Language Models
by: Chen, Xinlong, et al.
Published: (2025)

VERHallu: Evaluating and Mitigating Event Relation Hallucination in Video Large Language Models
by: Zhang, Zefan, et al.
Published: (2026)

Mitigating Hallucinations in Large Vision-Language Models with Instruction Contrastive Decoding
by: Wang, Xintong, et al.
Published: (2024)

Residual Decoding: Mitigating Hallucinations in Large Vision-Language Models via History-Aware Residual Guidance
by: Chen, Xinrong, et al.
Published: (2026)

SpaCE-10: A Comprehensive Benchmark for Multimodal Large Language Models in Compositional Spatial Intelligence
by: Gong, Ziyang, et al.
Published: (2025)

White-box Multimodal Jailbreaks Against Large Vision-Language Models
by: Wang, Ruofan, et al.
Published: (2024)

SEASON: Mitigating Temporal Hallucination in Video Large Language Models via Self-Diagnostic Contrastive Decoding
by: Wu, Chang-Hsun, et al.
Published: (2025)

Mitigating Hallucinations in Large Vision-Language Models (LVLMs) via Language-Contrastive Decoding (LCD)
by: Manevich, Avshalom, et al.
Published: (2024)

VidHalluc: Evaluating Temporal Hallucinations in Multimodal Large Language Models for Video Understanding
by: Li, Chaoyu, et al.
Published: (2024)

Efficient Contrastive Decoding with Probabilistic Hallucination Detection - Mitigating Hallucinations in Large Vision Language Models -
by: Fieback, Laura, et al.
Published: (2025)

Retrieval Visual Contrastive Decoding to Mitigate Object Hallucinations in Large Vision-Language Models
by: Lee, Jihoon, et al.
Published: (2025)

Mitigating Hallucinations in Large Vision-Language Models via Entity-Centric Multimodal Preference Optimization
by: Wu, Jiulong, et al.
Published: (2025)

Prefill-Time Intervention for Mitigating Hallucination in Large Vision-Language Models
by: Zhang, Chengsheng, et al.
Published: (2026)

Demystifying the Visual Quality Paradox in Multimodal Large Language Models
by: Xing, Shuo, et al.
Published: (2025)

CrossVid: A Comprehensive Benchmark for Evaluating Cross-Video Reasoning in Multimodal Large Language Models
by: Li, Jingyao, et al.
Published: (2025)

Instinct vs. Reflection: Unifying Token and Verbalized Confidence in Multimodal Large Models
by: Dang, Yunkai, et al.
Published: (2026)