Bibliographic Details
Main Authors: Nakada, Shota, Saito, Kazuhiro, Ishikawa, Yuchi, Munakata, Hokuto, Komatsu, Tatsuya, Kondo, Masayoshi
Format: Preprint
Published: 2025
Online Access: https://arxiv.org/abs/2510.25225
Table of Contents:
  • We propose a novel task, hallucination localization in video captioning, which aims to identify hallucinations in video captions at the span level (i.e., individual words or phrases). This allows a more detailed analysis of hallucinations than the existing sentence-level hallucination detection task. To establish a benchmark for hallucination localization, we construct HLVC-Dataset, a carefully curated dataset created by manually annotating 1,167 video-caption pairs whose captions were generated by VideoLLMs. We further implement a VideoLLM-based baseline method and conduct quantitative and qualitative evaluations to benchmark current performance on hallucination localization.
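
To make the span-level versus sentence-level distinction concrete, the following is a minimal illustrative sketch, not the authors' annotation format or baseline. The `Span` class, the helper names, the example caption, and the choice of hallucinated phrases are all hypothetical; the record above does not describe the HLVC-Dataset schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical span annotation: character offsets into the caption."""
    start: int  # inclusive
    end: int    # exclusive


def find_span(caption: str, phrase: str) -> Span:
    """Locate the first occurrence of `phrase` and return it as a Span."""
    start = caption.find(phrase)
    if start < 0:
        raise ValueError(f"phrase not found in caption: {phrase!r}")
    return Span(start, start + len(phrase))


def span_texts(caption: str, spans: list[Span]) -> list[str]:
    """Span-level view: the exact words or phrases marked as hallucinated."""
    return [caption[s.start:s.end] for s in spans]


def sentence_label(spans: list[Span]) -> bool:
    """Sentence-level view: only says whether any hallucination exists."""
    return len(spans) > 0


# Made-up example: assume the video shows a blue jacket and a bicycle.
caption = "A man in a red jacket rides a horse on the beach."
spans = [find_span(caption, "red"), find_span(caption, "horse")]

hallucinated = span_texts(caption, spans)
flagged = sentence_label(spans)
```

Under these assumptions, the sentence-level label reduces the caption to a single yes/no flag, while the span-level annotation keeps track of which words are unsupported by the video.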