Bibliographic Details
Main Authors: Devadiga, Prasanna, Suneesh, Arya, Rajpoot, Pawan Kumar, Hazarika, Bharatdeep, Baliga, Aditya U
Format: Preprint
Published: 2025
Subjects:
Online Access: https://arxiv.org/abs/2504.16627
author Devadiga, Prasanna
Suneesh, Arya
Rajpoot, Pawan Kumar
Hazarika, Bharatdeep
Baliga, Aditya U
contents We address the challenge of retrieving previously fact-checked claims in monolingual and crosslingual settings, a critical task given the global prevalence of disinformation. Our approach follows a two-stage strategy: a reliable baseline retrieval system using a fine-tuned embedding model, followed by an LLM-based reranker. Our key contribution is demonstrating how LLM-based translation can overcome the hurdles of multilingual information retrieval. Additionally, we focus on ensuring that the bulk of the pipeline can be replicated on a consumer GPU. Our final integrated system achieved success@10 scores of 0.938 and 0.81025 on the monolingual and crosslingual test sets, respectively.
format Preprint
id arxiv_https___arxiv_org_abs_2504_16627
institution arXiv
publishDate 2025
record_format arxiv
title TIFIN India at SemEval-2025: Harnessing Translation to Overcome Multilingual IR Challenges in Fact-Checked Claim Retrieval
topic Computation and Language
url https://arxiv.org/abs/2504.16627