Staff View: Retrieval-Augmented Linguistic Calibration :: Library Catalog

Bibliographic Details
Main Authors:	Yeh, Yi-Fan; Tao, Linwei; Dong, Minjing; Huang, Tao; Yu, Jialin; Torr, Philip; Xu, Chang
Format:	Preprint
Published:	2026
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2605.19344

_version_	1866910272797540352
author	Yeh, Yi-Fan; Tao, Linwei; Dong, Minjing; Huang, Tao; Yu, Jialin; Torr, Philip; Xu, Chang
author_facet	Yeh, Yi-Fan; Tao, Linwei; Dong, Minjing; Huang, Tao; Yu, Jialin; Torr, Philip; Xu, Chang
contents	Linguistic cues such as "I believe" and "probably" offer an intuitive interface for communicating confidence, yet a generalisable, principled calibration framework for linguistic confidence expressions remains underexplored. In particular, co-occurring linguistic cues, contextual variation, and subjective audience interpretation pose unique challenges. We therefore model linguistic confidence as a distribution over plausible perceived probability values that a statement is correct, capturing interpretation variability that scalar representations discard. Within this distributional framework, we introduce faithfulness as a complementary evaluation dimension and present Faithfulness Divergence (FD), an information-theoretic metric quantifying the surprise induced in audience beliefs upon truth revelation. Building on these foundations, we present Retrieval-Augmented Linguistic Calibration (RALC), a lightweight post-hoc pipeline that propagates calibrated confidence signals back into natural language via retrieval-augmented rewriting. Across three QA benchmarks and five LLM families, RALC improves in-domain faithfulness and calibration up to 66% and 58%, respectively, outperforming black-box and grey-box calibration baselines.
format	Preprint
id	arxiv_https___arxiv_org_abs_2605_19344
institution	arXiv
publishDate	2026
record_format	arxiv
title	Retrieval-Augmented Linguistic Calibration
topic	Computation and Language
url	https://arxiv.org/abs/2605.19344
