| Main Authors: | Liu, Hude, Hu, Jerry Yao-Chieh, Zhang, Jennifer Yuntong, Song, Zhao, Liu, Han |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2509.21473 |
Similar Items
On Flow Matching KL Divergence
by: Su, Maojiang, et al.
Published: (2025)
On Statistical Rates of Conditional Diffusion Transformers: Approximation, Estimation and Minimax Optimality
by: Hu, Jerry Yao-Chieh, et al.
Published: (2024)
In-Context Algorithm Emulation in Fixed-Weight Transformers
by: Hu, Jerry Yao-Chieh, et al.
Published: (2025)
Steering LVLMs via Sparse Autoencoder for Hallucination Mitigation
by: Hua, Zhenglin, et al.
Published: (2025)
Why are Visually-Grounded Language Models Bad at Image Classification?
by: Zhang, Yuhui, et al.
Published: (2024)
STanHop: Sparse Tandem Hopfield Model for Memory-Enhanced Time Series Prediction
by: Wu, Dennis, et al.
Published: (2023)
On Structured State-Space Duality
by: Hu, Jerry Yao-Chieh, et al.
Published: (2025)
Counterfactual Segmentation Reasoning: Diagnosing and Mitigating Pixel-Grounding Hallucination
by: Li, Xinzhuo, et al.
Published: (2025)
Logical Closed Loop: Uncovering Object Hallucinations in Large Vision-Language Models
by: Wu, Junfei, et al.
Published: (2024)
Mitigating Object Hallucination in Large Vision-Language Models via Image-Grounded Guidance
by: Zhao, Linxi, et al.
Published: (2024)
Skip \n: A Simple Method to Reduce Hallucination in Large Vision-Language Models
by: Han, Zongbo, et al.
Published: (2024)
Woodpecker: Hallucination Correction for Multimodal Large Language Models
by: Yin, Shukang, et al.
Published: (2023)
Efficient Contrastive Decoding with Probabilistic Hallucination Detection - Mitigating Hallucinations in Large Vision Language Models -
by: Fieback, Laura, et al.
Published: (2025)
Steering the Verifiability of Multimodal AI Hallucinations
by: Pang, Jianhong, et al.
Published: (2026)
ALOHa: A New Measure for Hallucination in Captioning Models
by: Petryk, Suzanne, et al.
Published: (2024)
DefAn: Definitive Answer Dataset for LLMs Hallucination Evaluation
by: Rahman, A B M Ashikur, et al.
Published: (2024)
FactCHD: Benchmarking Fact-Conflicting Hallucination Detection
by: Chen, Xiang, et al.
Published: (2023)
When Prompts Override Vision: Prompt-Induced Hallucinations in LVLMs
by: Khayatan, Pegah, et al.
Published: (2026)
Unified Triplet-Level Hallucination Evaluation for Large Vision-Language Models
by: Wu, Junjie, et al.
Published: (2024)
Toward More Reliable Artificial Intelligence: Reducing Hallucinations in Vision-Language Models
by: Sanogo, Kassoum, et al.
Published: (2025)
SegSub: Evaluating Robustness to Knowledge Conflicts and Hallucinations in Vision-Language Models
by: Carragher, Peter, et al.
Published: (2025)
Beyond Superficial Unlearning: Sharpness-Aware Robust Erasure of Hallucinations in Multimodal LLMs
by: Fang, Xianya, et al.
Published: (2026)
MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation
by: Wang, Chenxi, et al.
Published: (2024)
Mitigating Object and Action Hallucinations in Multimodal LLMs via Self-Augmented Contrastive Alignment
by: Chang, Kai-Po, et al.
Published: (2025)
Streaming-dLLM: Accelerating Diffusion LLMs via Suffix Pruning and Dynamic Decoding
by: Xiao, Zhongyu, et al.
Published: (2026)
Detecting and Mitigating Hallucination in Large Vision Language Models via Fine-Grained AI Feedback
by: Xiao, Wenyi, et al.
Published: (2024)
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization
by: Zhang, Jingyi, et al.
Published: (2025)
R1-SyntheticVL: Is Synthetic Data from Generative Models Ready for Multimodal Large Language Model?
by: Zhang, Jingyi, et al.
Published: (2026)
Contextual Experience Replay for Self-Improvement of Language Agents
by: Liu, Yitao, et al.
Published: (2025)
SEASON: Mitigating Temporal Hallucination in Video Large Language Models via Self-Diagnostic Contrastive Decoding
by: Wu, Chang-Hsun, et al.
Published: (2025)
T2VTextBench: A Human Evaluation Benchmark for Textual Control in Video Generation Models
by: Guo, Xuyang, et al.
Published: (2025)
T2VPhysBench: A First-Principles Benchmark for Physical Consistency in Text-to-Video Generation
by: Guo, Xuyang, et al.
Published: (2025)
$\textit{Jump Your Steps}$: Optimizing Sampling Schedule of Discrete Diffusion Models
by: Park, Yong-Hyun, et al.
Published: (2024)
PerceptionComp: A Video Benchmark for Complex Perception-Centric Reasoning
by: Li, Shaoxuan, et al.
Published: (2026)
Benchmarking Zero-Shot Robustness of Multimodal Foundation Models: A Pilot Study
by: Wang, Chenguang, et al.
Published: (2024)
Dynamic-LLaVA: Efficient Multimodal Large Language Models via Dynamic Vision-language Context Sparsification
by: Huang, Wenxuan, et al.
Published: (2024)
Learning to Instruct for Visual Instruction Tuning
by: Zhou, Zhihan, et al.
Published: (2025)
From EduVisBench to EduVisAgent: A Benchmark and Multi-Agent Framework for Reasoning-Driven Pedagogical Visualization
by: Ji, Haonian, et al.
Published: (2025)
Squeeze Out Tokens from Sample for Finer-Grained Data Governance
by: Lin, Weixiong, et al.
Published: (2025)
HLL: Can Agents Cross Humanity's Last Line of Verification?
by: Song, Xinhao, et al.
Published: (2026)