Library Catalog

Bibliographic Details
Main Authors:	Liu, Hude; Hu, Jerry Yao-Chieh; Zhang, Jennifer Yuntong; Song, Zhao; Liu, Han
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Artificial Intelligence; Computation and Language; Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2509.21473

Similar Items

On Flow Matching KL Divergence
by: Su, Maojiang, et al.
Published: (2025)

On Statistical Rates of Conditional Diffusion Transformers: Approximation, Estimation and Minimax Optimality
by: Hu, Jerry Yao-Chieh, et al.
Published: (2024)

In-Context Algorithm Emulation in Fixed-Weight Transformers
by: Hu, Jerry Yao-Chieh, et al.
Published: (2025)

Steering LVLMs via Sparse Autoencoder for Hallucination Mitigation
by: Hua, Zhenglin, et al.
Published: (2025)

Why are Visually-Grounded Language Models Bad at Image Classification?
by: Zhang, Yuhui, et al.
Published: (2024)

STanHop: Sparse Tandem Hopfield Model for Memory-Enhanced Time Series Prediction
by: Wu, Dennis, et al.
Published: (2023)

On Structured State-Space Duality
by: Hu, Jerry Yao-Chieh, et al.
Published: (2025)

Counterfactual Segmentation Reasoning: Diagnosing and Mitigating Pixel-Grounding Hallucination
by: Li, Xinzhuo, et al.
Published: (2025)

Logical Closed Loop: Uncovering Object Hallucinations in Large Vision-Language Models
by: Wu, Junfei, et al.
Published: (2024)

Mitigating Object Hallucination in Large Vision-Language Models via Image-Grounded Guidance
by: Zhao, Linxi, et al.
Published: (2024)

Skip \n: A Simple Method to Reduce Hallucination in Large Vision-Language Models
by: Han, Zongbo, et al.
Published: (2024)

Woodpecker: Hallucination Correction for Multimodal Large Language Models
by: Yin, Shukang, et al.
Published: (2023)

Efficient Contrastive Decoding with Probabilistic Hallucination Detection - Mitigating Hallucinations in Large Vision Language Models -
by: Fieback, Laura, et al.
Published: (2025)

Steering the Verifiability of Multimodal AI Hallucinations
by: Pang, Jianhong, et al.
Published: (2026)

ALOHa: A New Measure for Hallucination in Captioning Models
by: Petryk, Suzanne, et al.
Published: (2024)

DefAn: Definitive Answer Dataset for LLMs Hallucination Evaluation
by: Rahman, A B M Ashikur, et al.
Published: (2024)

FactCHD: Benchmarking Fact-Conflicting Hallucination Detection
by: Chen, Xiang, et al.
Published: (2023)

When Prompts Override Vision: Prompt-Induced Hallucinations in LVLMs
by: Khayatan, Pegah, et al.
Published: (2026)

Unified Triplet-Level Hallucination Evaluation for Large Vision-Language Models
by: Wu, Junjie, et al.
Published: (2024)

Toward More Reliable Artificial Intelligence: Reducing Hallucinations in Vision-Language Models
by: Sanogo, Kassoum, et al.
Published: (2025)

SegSub: Evaluating Robustness to Knowledge Conflicts and Hallucinations in Vision-Language Models
by: Carragher, Peter, et al.
Published: (2025)

Beyond Superficial Unlearning: Sharpness-Aware Robust Erasure of Hallucinations in Multimodal LLMs
by: Fang, Xianya, et al.
Published: (2026)

MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation
by: Wang, Chenxi, et al.
Published: (2024)

Mitigating Object and Action Hallucinations in Multimodal LLMs via Self-Augmented Contrastive Alignment
by: Chang, Kai-Po, et al.
Published: (2025)

Streaming-dLLM: Accelerating Diffusion LLMs via Suffix Pruning and Dynamic Decoding
by: Xiao, Zhongyu, et al.
Published: (2026)

Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback
by: Xiao, Wenyi, et al.
Published: (2024)

R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization
by: Zhang, Jingyi, et al.
Published: (2025)

R1-SyntheticVL: Is Synthetic Data from Generative Models Ready for Multimodal Large Language Model?
by: Zhang, Jingyi, et al.
Published: (2026)

Contextual Experience Replay for Self-Improvement of Language Agents
by: Liu, Yitao, et al.
Published: (2025)

SEASON: Mitigating Temporal Hallucination in Video Large Language Models via Self-Diagnostic Contrastive Decoding
by: Wu, Chang-Hsun, et al.
Published: (2025)

T2VTextBench: A Human Evaluation Benchmark for Textual Control in Video Generation Models
by: Guo, Xuyang, et al.
Published: (2025)

T2VPhysBench: A First-Principles Benchmark for Physical Consistency in Text-to-Video Generation
by: Guo, Xuyang, et al.
Published: (2025)

Jump Your Steps: Optimizing Sampling Schedule of Discrete Diffusion Models
by: Park, Yong-Hyun, et al.
Published: (2024)

PerceptionComp: A Video Benchmark for Complex Perception-Centric Reasoning
by: Li, Shaoxuan, et al.
Published: (2026)

Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study
by: Wang, Chenguang, et al.
Published: (2024)

Dynamic-LLaVA: Efficient Multimodal Large Language Models via Dynamic Vision-language Context Sparsification
by: Huang, Wenxuan, et al.
Published: (2024)

Learning to Instruct for Visual Instruction Tuning
by: Zhou, Zhihan, et al.
Published: (2025)

From EduVisBench to EduVisAgent: A Benchmark and Multi-Agent Framework for Reasoning-Driven Pedagogical Visualization
by: Ji, Haonian, et al.
Published: (2025)

Squeeze Out Tokens from Sample for Finer-Grained Data Governance
by: Lin, Weixiong, et al.
Published: (2025)

HLL: Can Agents Cross Humanity's Last Line of Verification?
by: Song, Xinhao, et al.
Published: (2026)