Bibliographic Details
Main Author: Zhao, Xinghao
Format: Preprint
Published: 2026
Subjects: Computation and Language; Machine Learning
Online Access: https://arxiv.org/abs/2603.18940
author Zhao, Xinghao
contents Understanding uncertainty in chain-of-thought reasoning is critical for reliable deployment of large language models. In this work, we propose a simple yet effective diagnostic approach based on trajectory shape rather than scalar magnitude. We show that this signal is practical, interpretable, and inexpensive to obtain in black-box settings, while remaining robust across models and datasets. Through extensive ablations and cross-domain replications, we demonstrate its utility for selective prediction and triage. Our findings offer a generalizable insight into uncertainty dynamics in reasoning tasks, with particular focus on numeric and discrete-answer settings.
format Preprint
id arxiv_https___arxiv_org_abs_2603_18940
institution arXiv
publishDate 2026
record_format arxiv
title Entropy trajectory shape predicts LLM reasoning reliability: A diagnostic study of uncertainty dynamics in chain-of-thought
topic Computation and Language
Machine Learning
url https://arxiv.org/abs/2603.18940