Staff View: Use of Retrieval-Augmented Large Language Model Agent for Long-Form COVID-19 Fact-Checking :: Library Catalog

Bibliographic Details
Main Authors:	Huang, Jingyi; Yang, Yuyi; Ji, Mengmeng; Alba, Charles; Zhang, Sheng; An, Ruopeng
Format:	Preprint
Published:	2025
Subjects:	Information Retrieval; Artificial Intelligence; Computation and Language
Online Access:	https://arxiv.org/abs/2512.00007

_version_	1866915644486713344
author	Huang, Jingyi; Yang, Yuyi; Ji, Mengmeng; Alba, Charles; Zhang, Sheng; An, Ruopeng
author_facet	Huang, Jingyi; Yang, Yuyi; Ji, Mengmeng; Alba, Charles; Zhang, Sheng; An, Ruopeng
contents	The COVID-19 infodemic calls for scalable fact-checking solutions that handle long-form misinformation with accuracy and reliability. This study presents SAFE (system for accurate fact extraction and evaluation), an agent system that combines large language models with retrieval-augmented generation (RAG) to improve automated fact-checking of long-form COVID-19 misinformation. SAFE includes two agents - one for claim extraction and another for claim verification using LOTR-RAG, which leverages a 130,000-document COVID-19 research corpus. An enhanced variant, SAFE (LOTR-RAG + SRAG), incorporates Self-RAG to refine retrieval via query rewriting. We evaluated both systems on 50 fake news articles (2-17 pages) containing 246 annotated claims (M = 4.922, SD = 3.186), labeled as true (14.1%), partly true (14.4%), false (27.0%), partly false (2.2%), and misleading (21.0%) by public health professionals. SAFE systems significantly outperformed baseline LLMs in all metrics (p < 0.001). For consistency (0-1 scale), SAFE (LOTR-RAG) scored 0.629, exceeding both SAFE (+SRAG) (0.577) and the baseline (0.279). In subjective evaluations (0-4 Likert scale), SAFE (LOTR-RAG) also achieved the highest average ratings in usefulness (3.640), clearness (3.800), and authenticity (3.526). Adding SRAG slightly reduced overall performance, except for a minor gain in clearness. SAFE demonstrates robust improvements in long-form COVID-19 fact-checking by addressing LLM limitations in consistency and explainability. The core LOTR-RAG design proved more effective than its SRAG-augmented variant, offering a strong foundation for scalable misinformation mitigation.
format	Preprint
id	arxiv_https___arxiv_org_abs_2512_00007
institution	arXiv
publishDate	2025
record_format	arxiv
title	Use of Retrieval-Augmented Large Language Model Agent for Long-Form COVID-19 Fact-Checking
topic	Information Retrieval; Artificial Intelligence; Computation and Language
url	https://arxiv.org/abs/2512.00007

Similar Items