Staff View: Library Catalog


Bibliographic Details
Main Authors:	Cao, Rui; Deng, Zhenyun; Chen, Yulong; Schlichtkrull, Michael; Vlachos, Andreas
Format:	Preprint
Published:	2026
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2602.11221

_version_	1866911469342294016
author	Cao, Rui; Deng, Zhenyun; Chen, Yulong; Schlichtkrull, Michael; Vlachos, Andreas
author_facet	Cao, Rui; Deng, Zhenyun; Chen, Yulong; Schlichtkrull, Michael; Vlachos, Andreas
contents	The Automatic Verification of Image-Text Claims (AVerImaTeC) shared task aims to advance system development for retrieving evidence and verifying real-world image-text claims. Participants were allowed to either employ external knowledge sources, such as web search engines, or leverage the curated knowledge store provided by the organizers. System performance was evaluated using the AVerImaTeC score, defined as a conditional verdict accuracy in which a verdict is considered correct only when the associated evidence score exceeds a predefined threshold. The shared task attracted 14 submissions during the development phase and 6 submissions during the testing phase. All participating systems in the testing phase outperformed the baseline provided. The winning team, HUMANE, achieved an AVerImaTeC score of 0.5455. This paper provides a detailed description of the shared task, presents the complete evaluation results, and discusses key insights and lessons learned.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_11221
institution	arXiv
publishDate	2026
record_format	arxiv
spellingShingle	The Automatic Verification of Image-Text Claims (AVerImaTeC) Shared Task Cao, Rui Deng, Zhenyun Chen, Yulong Schlichtkrull, Michael Vlachos, Andreas Computation and Language The Automatic Verification of Image-Text Claims (AVerImaTeC) shared task aims to advance system development for retrieving evidence and verifying real-world image-text claims. Participants were allowed to either employ external knowledge sources, such as web search engines, or leverage the curated knowledge store provided by the organizers. System performance was evaluated using the AVerImaTeC score, defined as a conditional verdict accuracy in which a verdict is considered correct only when the associated evidence score exceeds a predefined threshold. The shared task attracted 14 submissions during the development phase and 6 submissions during the testing phase. All participating systems in the testing phase outperformed the baseline provided. The winning team, HUMANE, achieved an AVerImaTeC score of 0.5455. This paper provides a detailed description of the shared task, presents the complete evaluation results, and discusses key insights and lessons learned.
title	The Automatic Verification of Image-Text Claims (AVerImaTeC) Shared Task
topic	Computation and Language
url	https://arxiv.org/abs/2602.11221
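
Note on the AVerImaTeC score: the abstract in the contents field defines the score as a conditional verdict accuracy, where a verdict counts as correct only when the associated evidence score exceeds a predefined threshold. The following is a minimal sketch of that idea, not the organizers' evaluation code. The record does not state the threshold value, how the evidence score is computed, or the verdict label set, so the `threshold` argument, the `Prediction` fields, and the labels used below are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    # Hypothetical container; field names are not taken from the shared task.
    predicted_verdict: str
    gold_verdict: str
    evidence_score: float  # assumed to be precomputed by some evidence metric


def conditional_verdict_accuracy(predictions, threshold):
    """Share of claims with a matching verdict AND evidence score above threshold."""
    if not predictions:
        return 0.0
    correct = sum(
        1
        for p in predictions
        if p.evidence_score > threshold and p.predicted_verdict == p.gold_verdict
    )
    return correct / len(predictions)


# Illustrative data only; labels and the 0.3 threshold are assumptions.
preds = [
    Prediction("Supported", "Supported", 0.8),  # right verdict, evidence passes
    Prediction("Refuted", "Refuted", 0.1),      # right verdict, evidence too weak
    Prediction("Supported", "Refuted", 0.9),    # wrong verdict
    Prediction("Refuted", "Refuted", 0.5),      # right verdict, evidence passes
]
score = conditional_verdict_accuracy(preds, threshold=0.3)
print(score)
```

Under this reading, a system that guesses the right verdict without retrieving adequate evidence gets no credit for that claim, which is why the second prediction above does not count.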

Similar Items