| Main Authors: | Devadiga, Prasanna; Suneesh, Arya; Rajpoot, Pawan Kumar; Hazarika, Bharatdeep; Baliga, Aditya U |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | Computation and Language |
| Online Access: | https://arxiv.org/abs/2504.16627 |
| _version_ | 1866914092952846336 |
|---|---|
| author | Devadiga, Prasanna; Suneesh, Arya; Rajpoot, Pawan Kumar; Hazarika, Bharatdeep; Baliga, Aditya U |
| author_facet | Devadiga, Prasanna; Suneesh, Arya; Rajpoot, Pawan Kumar; Hazarika, Bharatdeep; Baliga, Aditya U |
| contents | We address the challenge of retrieving previously fact-checked claims in monolingual and crosslingual settings, a critical task given the global prevalence of disinformation. Our approach follows a two-stage strategy: a reliable baseline retrieval system built on a fine-tuned embedding model, followed by an LLM-based reranker. Our key contribution is demonstrating how LLM-based translation can overcome the hurdles of multilingual information retrieval. Additionally, we focus on ensuring that the bulk of the pipeline can be replicated on a consumer GPU. Our final integrated system achieved success@10 scores of 0.938 and 0.81025 on the monolingual and crosslingual test sets, respectively. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2504_16627 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| title | TIFIN India at SemEval-2025: Harnessing Translation to Overcome Multilingual IR Challenges in Fact-Checked Claim Retrieval |
| topic | Computation and Language |
| url | https://arxiv.org/abs/2504.16627 |
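
The abstract reports results as success@10. As a minimal sketch, assuming the usual definition (a query counts as a success if at least one relevant fact-check appears among its top 10 retrieved candidates), the metric could be computed as follows. The function name `success_at_k`, the dictionary layout, and the toy identifiers are hypothetical and are not taken from the paper or the task's official scorer.

```python
from typing import Dict, List, Set


def success_at_k(
    ranked: Dict[str, List[str]],
    relevant: Dict[str, Set[str]],
    k: int = 10,
) -> float:
    """Fraction of queries with at least one relevant item in the top k."""
    if not ranked:
        return 0.0
    hits = 0
    for query_id, candidates in ranked.items():
        gold = relevant.get(query_id, set())
        # A query is a success if any of its top-k candidates is relevant.
        if any(c in gold for c in candidates[:k]):
            hits += 1
    return hits / len(ranked)


# Toy data, made up for illustration only (not from the paper).
ranked = {
    "post_1": ["fc_7", "fc_2", "fc_9"],
    "post_2": ["fc_4", "fc_5", "fc_6"],
}
relevant = {"post_1": {"fc_2"}, "post_2": {"fc_1"}}
score = success_at_k(ranked, relevant, k=10)
```

In this toy case only `post_1` has a relevant fact-check among its candidates, so one of the two queries counts as a success.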