Bibliographic details
Main Authors: Cao, Rui; Deng, Zhenyun; Chen, Yulong; Schlichtkrull, Michael; Vlachos, Andreas
Format: Preprint
Published: 2026
Subjects: Computation and Language
Online access: https://arxiv.org/abs/2602.11221
author Cao, Rui
Deng, Zhenyun
Chen, Yulong
Schlichtkrull, Michael
Vlachos, Andreas
contents The Automatic Verification of Image-Text Claims (AVerImaTeC) shared task aims to advance system development for retrieving evidence and verifying real-world image-text claims. Participants were allowed to either employ external knowledge sources, such as web search engines, or leverage the curated knowledge store provided by the organizers. System performance was evaluated using the AVerImaTeC score, defined as a conditional verdict accuracy in which a verdict is considered correct only when the associated evidence score exceeds a predefined threshold. The shared task attracted 14 submissions during the development phase and 6 submissions during the testing phase. All participating systems in the testing phase outperformed the baseline provided. The winning team, HUMANE, achieved an AVerImaTeC score of 0.5455. This paper provides a detailed description of the shared task, presents the complete evaluation results, and discusses key insights and lessons learned.
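The scoring rule described in the abstract (a verdict counts as correct only when its evidence score exceeds a threshold) can be illustrated with a minimal Python sketch. This is not the organizers' scorer: the `Prediction` class, its field names, the verdict labels, and the threshold value of 0.3 are all hypothetical and chosen only for illustration, and the way the evidence score itself is computed is not covered here.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Prediction:
    """One system output for a claim (hypothetical structure, for illustration)."""
    predicted_verdict: str
    gold_verdict: str
    evidence_score: float  # assumed to be computed elsewhere


def conditional_verdict_accuracy(
    predictions: Sequence[Prediction], threshold: float
) -> float:
    """Fraction of claims whose verdict is right AND whose evidence score
    strictly exceeds the threshold. Returns 0.0 for an empty input."""
    if not predictions:
        return 0.0
    correct = sum(
        1
        for p in predictions
        if p.evidence_score > threshold
        and p.predicted_verdict == p.gold_verdict
    )
    return correct / len(predictions)


# Illustrative data only; labels and the 0.3 threshold are assumptions.
examples = [
    Prediction("Supported", "Supported", 0.8),  # counted: right verdict, good evidence
    Prediction("Refuted", "Refuted", 0.1),      # not counted: evidence below threshold
    Prediction("Supported", "Refuted", 0.9),    # not counted: wrong verdict
    Prediction("Refuted", "Refuted", 0.5),      # counted
]
score = conditional_verdict_accuracy(examples, threshold=0.3)
print(score)
```

Under this reading, a system cannot score by guessing the verdict alone: the second example has the right label but contributes nothing because its evidence falls below the assumed threshold.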
format Preprint
id arxiv_https___arxiv_org_abs_2602_11221
institution arXiv
publishDate 2026
record_format arxiv
title The Automatic Verification of Image-Text Claims (AVerImaTeC) Shared Task
topic Computation and Language
url https://arxiv.org/abs/2602.11221