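Staff View: Entropy trajectory shape predicts LLM reasoning reliability: A diagnostic study of uncertainty dynamics in chain-of-thought :: Library Catalog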

Bibliographic Details
Main Author:	Zhao, Xinghao
Format:	Preprint
Published:	2026
Subjects:	Computation and Language; Machine Learning
Online Access:	https://arxiv.org/abs/2603.18940

_version_	1866910077472997376
author	Zhao, Xinghao
author_facet	Zhao, Xinghao
contents	Understanding uncertainty in chain-of-thought reasoning is critical for reliable deployment of large language models. In this work, we propose a simple yet effective diagnostic approach based on trajectory shape rather than scalar magnitude. We show that this signal is practical, interpretable, and inexpensive to obtain in black-box settings, while remaining robust across models and datasets. Through extensive ablations and cross-domain replications, we demonstrate its utility for selective prediction and triage. Our findings offer a generalizable insight into uncertainty dynamics in reasoning tasks, with particular focus on numeric and discrete-answer settings.
format	Preprint
id	arxiv_https___arxiv_org_abs_2603_18940
institution	arXiv
publishDate	2026
record_format	arxiv
title	Entropy trajectory shape predicts LLM reasoning reliability: A diagnostic study of uncertainty dynamics in chain-of-thought
topic	Computation and Language; Machine Learning
url	https://arxiv.org/abs/2603.18940

Similar Items