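Staff View: Reasoning as State Transition: A Representational Analysis of Reasoning Evolution in Large Language Models :: Library Catalog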

Bibliographic Details
Main Authors:	Zhang, Siyuan; Li, Jialian; Zhang, Yichi; Yang, Xiao; Dong, Yinpeng; Su, Hang
Format:	Preprint
Published:	2026
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2602.00770

_version_	1866914298465353728
author	Zhang, Siyuan; Li, Jialian; Zhang, Yichi; Yang, Xiao; Dong, Yinpeng; Su, Hang
author_facet	Zhang, Siyuan; Li, Jialian; Zhang, Yichi; Yang, Xiao; Dong, Yinpeng; Su, Hang
contents	Large Language Models have achieved remarkable performance on reasoning tasks, motivating research into how this ability evolves during training. Prior work has primarily analyzed this evolution via explicit generation outcomes, treating the reasoning process as a black box and obscuring internal changes. To address this opacity, we introduce a representational perspective to investigate the dynamics of the model's internal states. Through comprehensive experiments across models at various training stages, we discover that post-training yields only limited improvement in static initial representation quality. Furthermore, we reveal that, distinct from non-reasoning tasks, reasoning involves a significant continuous distributional shift in representations during generation. Comparative analysis indicates that post-training empowers models to drive this transition toward a better distribution for task solving. To clarify the relationship between internal states and external outputs, statistical analysis confirms a high correlation between generation correctness and the final representations, while counterfactual experiments identify the semantics of the generated tokens, rather than additional computation during inference or intrinsic parameter differences, as the dominant driver of the transition. Collectively, we offer a novel understanding of the reasoning process and the effect of training on reasoning enhancement, providing valuable insights for future model analysis and optimization.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_00770
institution	arXiv
publishDate	2026
record_format	arxiv
title	Reasoning as State Transition: A Representational Analysis of Reasoning Evolution in Large Language Models
topic	Computation and Language
url	https://arxiv.org/abs/2602.00770

Similar Items