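Staff View: Exploring the Generalizability of Factual Hallucination Mitigation via Enhancing Precise Knowledge Utilization :: Library Catalog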


Bibliographic Details
Main Authors:	Zhang, Siyuan; Zhang, Yichi; Dong, Yinpeng; Su, Hang
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2502.19127

_version_	1866909836027887616
author	Zhang, Siyuan; Zhang, Yichi; Dong, Yinpeng; Su, Hang
author_facet	Zhang, Siyuan; Zhang, Yichi; Dong, Yinpeng; Su, Hang
contents	Large Language Models (LLMs) often struggle to align their responses with objective facts, resulting in factual hallucinations that can be difficult to detect and can mislead users who lack the relevant knowledge. Although post-training techniques have been employed to mitigate the issue, existing methods usually suffer from poor generalization and trade-offs with other capabilities. In this paper, we propose to address these problems by directly augmenting the LLM's fundamental ability to precisely leverage its knowledge, and introduce PKUE (Precise Knowledge Utilization Enhancement), which fine-tunes the model on self-generated responses to precise and simple factual questions through preference optimization. Furthermore, we construct FactualBench, a comprehensive and precise factual QA dataset containing 181k Chinese samples spanning 21 domains, to facilitate both evaluation and training. Extensive experiments demonstrate that PKUE significantly improves overall LLM performance, with consistent gains across factual tasks of various forms, general tasks beyond factuality, and tasks in different languages.
format	Preprint
id	arxiv_https___arxiv_org_abs_2502_19127
institution	arXiv
publishDate	2025
record_format	arxiv
title	Exploring the Generalizability of Factual Hallucination Mitigation via Enhancing Precise Knowledge Utilization
topic	Computation and Language
url	https://arxiv.org/abs/2502.19127
