:: Library Catalog

Bibliographic details
Main creators:	Si, Guangzong; Yin, Hao; Li, Xianfei; Ding, Qing; Liao, Wenlong; He, Tao; Peng, Pai
Format:	Preprint
Published / Created:	2025
Subjects:	Computer Vision and Pattern Recognition
Online access:	https://arxiv.org/abs/2509.00371

Similar items

The Mirage of Performance Gains: Why Contrastive Decoding Fails to Mitigate Object Hallucinations in MLLMs?
by: Yin, Hao, et al.
Published / Created: (2025)

Lifting the Veil on Visual Information Flow in MLLMs: Unlocking Pathways to Faster Inference
by: Yin, Hao, et al.
Published / Created: (2025)

Generative Planning with 3D-vision Language Pre-training for End-to-End Autonomous Driving
by: Li, Tengpeng, et al.
Published / Created: (2025)

Open-Vocabulary Object Detection via Neighboring Region Attention Alignment
by: Qiang, Sunyuan, et al.
Published / Created: (2024)

ClearSight: Visual Signal Enhancement for Object Hallucination Mitigation in Multimodal Large language Models
by: Yin, Hao, et al.
Published / Created: (2025)

Rethinking Temporal Fusion with a Unified Gradient Descent View for 3D Semantic Occupancy Prediction
by: Chen, Dubing, et al.
Published / Created: (2025)

You Only Click Once: Single Point Weakly Supervised 3D Instance Segmentation for Autonomous Driving
by: Jiang, Guangfeng, et al.
Published / Created: (2025)

The DAWN of World-Action Interactive Models
by: Lu, Hongbo, et al.
Published / Created: (2026)

Semantic Causality-Aware Vision-Based 3D Occupancy Prediction
by: Chen, Dubing, et al.
Published / Created: (2025)

ARGUS: Hallucination and Omission Evaluation in Video-LLMs
by: Rawal, Ruchit, et al.
Published / Created: (2025)

VisionNVS: Self-Supervised Inpainting for Novel View Synthesis under the Virtual-Shift Paradigm
by: Lu, Hongbo, et al.
Published / Created: (2026)

Attention Reallocation: Towards Zero-cost and Controllable Hallucination Mitigation of MLLMs
by: Tu, Chongjun, et al.
Published / Created: (2025)

One Token, Two Fates: A Unified Framework via Vision Token Manipulation Against MLLMs Hallucination
by: Fa, Zhan, et al.
Published / Created: (2026)

FREAK: A Fine-grained Hallucination Evaluation Benchmark for Advanced MLLMs
by: Yin, Zhihan, et al.
Published / Created: (2026)

TARS: MinMax Token-Adaptive Preference Strategy for Hallucination Reduction in MLLMs
by: Zhang, Kejia, et al.
Published / Created: (2025)

RedundancyLens: Revealing and Exploiting Visual Token Processing Redundancy for Efficient Decoder-Only MLLMs
by: Li, Hongliang, et al.
Published / Created: (2025)

A Real-Time On-Device Defect Detection Framework for Laser Power-Meter Sensors via Unsupervised Learning
by: Zheng, Dongqi, et al.
Published / Created: (2025)

Clinical Cognition Alignment for Gastrointestinal Diagnosis with Multimodal LLMs
by: Zheng, Huan, et al.
Published / Created: (2026)

Mitigating Object Hallucinations in MLLMs via Multi-Frequency Perturbations
by: Li, Shuo, et al.
Published / Created: (2025)

Seeing Far and Clearly: Mitigating Hallucinations in MLLMs with Attention Causal Decoding
by: Tang, Feilong, et al.
Published / Created: (2025)

Taming Hallucinations: Boosting MLLMs' Video Understanding via Counterfactual Video Generation
by: Huang, Zhe, et al.
Published / Created: (2025)

Explore the Hallucination on Low-level Perception for MLLMs
by: Sun, Yinan, et al.
Published / Created: (2024)

Interpreting and Mitigating Hallucination in MLLMs through Multi-agent Debate
by: Lin, Zheng, et al.
Published / Created: (2024)

Instruction-Oriented Preference Alignment for Enhancing Multi-Modal Comprehension Capability of MLLMs
by: Wang, Zitian, et al.
Published / Created: (2025)

One Model, Two Minds: Task-Conditioned Reasoning for Unified Image Quality and Aesthetic Assessment
by: Yin, Wen, et al.
Published / Created: (2026)

Deep Learning for Inertial Positioning: A Survey
by: Chen, Changhao, et al.
Published / Created: (2023)

Mitigating Visual Hallucinations via Semantic Curriculum Preference Optimization in MLLMs
by: Li, Yuanshuai, et al.
Published / Created: (2025)

On Discriminative vs. Generative classifiers: Rethinking MLLMs for Action Understanding
by: Pang, Zhanzhong, et al.
Published / Created: (2026)

NOAH: Benchmarking Narrative Prior driven Hallucination and Omission in Video Large Language Models
by: Lee, Kyuho, et al.
Published / Created: (2025)

Diagnosing and Correcting Concept Omission in Multimodal Diffusion Transformers
by: Baek, Kanghyun, et al.
Published / Created: (2026)

Detecting Omissions in Geographic Maps through Computer Vision
by: Nguyen, Phuc D. A., et al.
Published / Created: (2024)

ReEXplore: Improving MLLMs for Embodied Exploration with Contextualized Retrospective Experience Replay
by: Zhang, Gengyuan, et al.
Published / Created: (2025)

Ground What You See: Hallucination-Resistant MLLMs via Caption Feedback, Diversity-Aware Sampling, and Conflict Regularization
by: Pan, Miao, et al.
Published / Created: (2026)

Mitigating Object Hallucination in MLLMs via Data-augmented Phrase-level Alignment
by: Sarkar, Pritam, et al.
Published / Created: (2024)

Tell Model Where to Look: Mitigating Hallucinations in MLLMs by Vision-Guided Attention
by: Zhao, Jianfei, et al.
Published / Created: (2025)

FINER: MLLMs Hallucinate under Fine-grained Negative Queries
by: Xiao, Rui, et al.
Published / Created: (2026)

Improving the Reasoning of Multi-Image Grounding in MLLMs via Reinforcement Learning
by: Zhang, Bob, et al.
Published / Created: (2025)

Harnessing the Power of MLLMs for Transferable Text-to-Image Person ReID
by: Tan, Wentao, et al.
Published / Created: (2024)

Vision-Language Introspection: Mitigating Overconfident Hallucinations in MLLMs via Interpretable Bi-Causal Steering
by: Liu, Shuliang, et al.
Published / Created: (2026)

When Looking Is Not Enough: Visual Attention Structure Reveals Hallucination in MLLMs
by: Cao, Fanpu, et al.
Published / Created: (2026)