Bibliographic Details
Main Authors: Vladika, Juraj; Soydemir, Ihsan; Matthes, Florian
Format: Preprint
Published: 2025
Online Access: https://arxiv.org/abs/2506.19607
Abstract:
  • While large language models (LLMs) have shown remarkable capabilities to generate coherent text, they suffer from hallucinations -- factually inaccurate statements. Among the numerous approaches to tackling hallucinations, self-correcting methods are especially promising. They leverage the multi-turn nature of LLMs to iteratively generate verification questions that seek additional evidence, answer them with internal or external knowledge, and use the answers to refine the original response. These methods have been explored for encyclopedic generation, but less so for domains like news summarization. In this work, we investigate two state-of-the-art self-correcting systems by applying them to correct hallucinated summaries using evidence from three search engines. We analyze the results and provide insights into the systems' performance, revealing practical findings on the benefits of search engine snippets and few-shot prompts, as well as a high alignment between G-Eval and human evaluation.
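
The self-correction loop summarized above (generate verification questions, answer them from some knowledge source, refine the response) can be sketched roughly as follows. This is a minimal illustration, not the implementation of either system studied in the paper: the function `self_correct`, its three callable parameters, the `max_rounds` limit, and the toy data are all hypothetical names chosen for this example. In a real setup the callables would presumably wrap LLM prompts and a search engine.

```python
from typing import Callable, List, Tuple

# Hypothetical interfaces (assumptions for this sketch, not from the paper):
QuestionGenerator = Callable[[str], List[str]]        # response -> verification questions
Answerer = Callable[[str], str]                       # question -> evidence text
Refiner = Callable[[str, List[Tuple[str, str]]], str] # response + (question, answer) pairs -> revision


def self_correct(
    draft: str,
    generate_questions: QuestionGenerator,
    answer: Answerer,
    refine: Refiner,
    max_rounds: int = 2,
) -> str:
    """Iteratively verify and revise a draft until nothing changes or rounds run out."""
    current = draft
    for _ in range(max_rounds):
        questions = generate_questions(current)
        if not questions:
            break  # nothing left to verify
        evidence = [(q, answer(q)) for q in questions]
        revised = refine(current, evidence)
        if revised == current:
            break  # the evidence did not change the response
        current = revised
    return current


# Toy stand-ins with made-up data, used only to make the sketch runnable.
TOY_EVIDENCE = {"On which day was the budget approved?": "The budget was approved on Tuesday."}


def toy_questions(text: str) -> List[str]:
    # Pretend the verifier only doubts the claim that mentions Monday.
    return ["On which day was the budget approved?"] if "Monday" in text else []


def toy_answer(question: str) -> str:
    # Stand-in for a search engine lookup.
    return TOY_EVIDENCE.get(question, "")


def toy_refine(text: str, evidence: List[Tuple[str, str]]) -> str:
    # Apply the single correction that the toy evidence supports.
    for _, found in evidence:
        if "Tuesday" in found:
            text = text.replace("Monday", "Tuesday")
    return text


summary = "The council approved the budget on Monday."
corrected = self_correct(summary, toy_questions, toy_answer, toy_refine)
print(corrected)
```

Passing the three steps in as callables keeps the loop independent of any particular model or search backend, so the evidence source can be swapped without touching the control flow.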