| Main authors: | Zhao, Jinkun; Huang, Lei; Ge, Haixin; Wu, Wenjun |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online access: | https://arxiv.org/abs/2511.13400 |
Similar items
ViBe: A Text-to-Video Benchmark for Evaluating Hallucination in Large Multimodal Models
by: Rawte, Vipula, et al.
Published: (2024)
ReactBench: A Cause-Driven Benchmark for Multimodal Hallucination via Systematic Evaluation
by: Zhou, Shizhe, et al.
Published: (2026)
Exploring Hallucination of Large Multimodal Models in Video Understanding: Benchmark, Analysis and Mitigation
by: Gao, Hongcheng, et al.
Published: (2025)
MIHBench: Benchmarking and Mitigating Multi-Image Hallucinations in Multimodal Large Language Models
by: Li, Jiale, et al.
Published: (2025)
SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension
by: Li, Bohao, et al.
Published: (2024)
GenColorBench: A Color Evaluation Benchmark for Text-to-Image Generation Models
by: Butt, Muhammad Atif, et al.
Published: (2025)
MCTBench: Multimodal Cognition towards Text-Rich Visual Scenes Benchmark
by: Shan, Bin, et al.
Published: (2024)
Towards Efficient and Effective Deep Clustering with Dynamic Grouping and Prototype Aggregation
by: Zhang, Haixin, et al.
Published: (2024)
Localizing Before Answering: A Hallucination Evaluation Benchmark for Grounded Medical Multimodal LLMs
by: Nguyen, Dung, et al.
Published: (2025)
MIRAGE: Assessing Hallucination in Multimodal Reasoning Chains of MLLM
by: Dong, Bowen, et al.
Published: (2025)
VaseVQA: Multimodal Agent and Benchmark for Ancient Greek Pottery
by: Ge, Jinchao, et al.
Published: (2025)
ClotheDreamer: Text-Guided Garment Generation with 3D Gaussians
by: Liu, Yufei, et al.
Published: (2024)
From Where Things Are to What They Are For: Benchmarking Spatial-Functional Intelligence in Multimodal LLMs
by: Zhang, Le, et al.
Published: (2026)
Towards Generalized Multimodal Homography Estimation
by: You, Jinkun, et al.
Published: (2026)
Text-Guided Layer Fusion Mitigates Hallucination in Multimodal LLMs
by: Lin, Chenchen, et al.
Published: (2026)
Lyapunov Probes for Hallucination Detection in Large Foundation Models
by: Luan, Bozhi, et al.
Published: (2026)
Benchmarking Robustness of Multimodal Image-Text Models under Distribution Shift
by: Qiu, Jielin, et al.
Published: (2022)
Dual-Level Cross-Modal Contrastive Clustering
by: Zhang, Haixin, et al.
Published: (2024)
What's in Common? Multimodal Models Hallucinate When Reasoning Across Scenes
by: Ross, Candace, et al.
Published: (2025)
Seeing is Believing? Mitigating OCR Hallucinations in Multimodal Large Language Models
by: He, Zhentao, et al.
Published: (2025)
TinyLLaVA-Video-R1: Towards Smaller LMMs for Video Reasoning
by: Zhang, Xingjian, et al.
Published: (2025)
Investigating and Mitigating the Multimodal Hallucination Snowballing in Large Vision-Language Models
by: Zhong, Weihong, et al.
Published: (2024)
Evaluating Durability: Benchmark Insights into Multimodal Watermarking
by: Qiu, Jielin, et al.
Published: (2024)
FREAK: A Fine-grained Hallucination Evaluation Benchmark for Advanced MLLMs
by: Yin, Zhihan, et al.
Published: (2026)
A Survey of Multimodal Hallucination Evaluation and Detection
by: Chen, Zhiyuan, et al.
Published: (2025)
ColorConceptBench: A Benchmark for Probabilistic Color-Concept Understanding in Text-to-Image Models
by: Ruan, Chenxi, et al.
Published: (2026)
MMR-V: What's Left Unsaid? A Benchmark for Multimodal Deep Reasoning in Videos
by: Zhu, Kejian, et al.
Published: (2025)
Hallucination Benchmark in Medical Visual Question Answering
by: Wu, Jinge, et al.
Published: (2024)
AutoHallusion: Automatic Generation of Hallucination Benchmarks for Vision-Language Models
by: Wu, Xiyang, et al.
Published: (2024)
When Semantics Mislead Vision: Mitigating Large Multimodal Models Hallucinations in Scene Text Spotting and Understanding
by: Shu, Yan, et al.
Published: (2025)
Hallucination of Multimodal Large Language Models: A Survey
by: Bai, Zechen, et al.
Published: (2024)
Hallucination Augmented Contrastive Learning for Multimodal Large Language Model
by: Jiang, Chaoya, et al.
Published: (2023)
When Text Hijacks Vision: Benchmarking and Mitigating Text Overlay-Induced Hallucination in Vision Language Models
by: Yakun, Cui, et al.
Published: (2026)
SELECT: Detecting Label Errors in Real-world Scene Text Data
by: Liu, Wenjun, et al.
Published: (2025)
FTII-Bench: A Comprehensive Multimodal Benchmark for Flow Text with Image Insertion
by: Ruan, Jiacheng, et al.
Published: (2024)
ColorizeDiffusion: Adjustable Sketch Colorization with Reference Image and Text
by: Yan, Dingkun, et al.
Published: (2024)
CutPaste&Find: Efficient Multimodal Hallucination Detector with Visual-aid Knowledge Base
by: Nguyen, Cong-Duy, et al.
Published: (2025)
Steering the Verifiability of Multimodal AI Hallucinations
by: Pang, Jianhong, et al.
Published: (2026)
Text2Vis: A Challenging and Diverse Benchmark for Generating Multimodal Visualizations from Text
by: Rahman, Mizanur, et al.
Published: (2025)
Combating Multimodal LLM Hallucination via Bottom-Up Holistic Reasoning
by: Wu, Shengqiong, et al.
Published: (2024)