Staff View: LLM Latent Reasoning as Chain of Superposition :: Library Catalog

Bibliographic Details
Main Authors:	Deng, Jingcheng; Pang, Liang; Wei, Zihao; Xu, Shicheng; Duan, Zenghao; Xu, Kun; Song, Yang; Shen, Huawei; Cheng, Xueqi
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2510.15522

_version_	1866914292761100288
author	Deng, Jingcheng; Pang, Liang; Wei, Zihao; Xu, Shicheng; Duan, Zenghao; Xu, Kun; Song, Yang; Shen, Huawei; Cheng, Xueqi
author_facet	Deng, Jingcheng; Pang, Liang; Wei, Zihao; Xu, Shicheng; Duan, Zenghao; Xu, Kun; Song, Yang; Shen, Huawei; Cheng, Xueqi
contents	Latent reasoning offers a computation-efficient alternative to Chain-of-Thought but often suffers from performance degradation due to distributional misalignment and ambiguous chain definitions. Ideally, latent reasoning should function as a superposition of multiple reasoning paths. To realize this, we introduce Latent-SFT, a unified framework addressing challenges at three levels: token, chain, and learning. First, we define the Latent-Vocab to constrain hidden states within the pre-trained vocab-space. Second, we construct the Latent-Chain via Induction-Supervision Masking to ensure semantic compactness and sufficiency. Third, we employ Latent-Optim with stochastic Gumbel-Softmax to guide the model toward generalizable solutions. Empirical results demonstrate that Latent-SFT consistently outperforms explicit SFT across six mathematical benchmarks (e.g., GSM8k, AIME24) while achieving a 2.7x to 5.5x reduction in reasoning length. Analysis confirms that our method effectively captures a superposition of diverse reasoning trajectories rather than merely compressing a single path.
format	Preprint
id	arxiv_https___arxiv_org_abs_2510_15522
institution	arXiv
publishDate	2025
record_format	arxiv
title	LLM Latent Reasoning as Chain of Superposition
topic	Computation and Language
url	https://arxiv.org/abs/2510.15522
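
Illustrative note on the contents field: the abstract describes constraining latent states to the pre-trained vocabulary space and optimizing with stochastic Gumbel-Softmax. The sketch below is not the authors' implementation; it only shows, under simple assumptions, how a Gumbel-Softmax sample over token logits can yield a latent vector that is a weighted mixture (a "superposition") of vocabulary embeddings. The function names, the toy three-token vocabulary, the toy logits, and the temperature value are all hypothetical.

```python
import math
import random


def gumbel_softmax(logits, tau=1.0, rng=random):
    """Draw a relaxed one-hot sample over tokens via the Gumbel-Softmax trick.

    Hypothetical helper for illustration; tau is the softmax temperature.
    """
    noisy = []
    for logit in logits:
        # Clamp the uniform draw away from 0 and 1 so both logs are finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))  # Gumbel(0, 1) noise
        noisy.append((logit + gumbel) / tau)
    # Numerically stable softmax over the perturbed logits.
    peak = max(noisy)
    exps = [math.exp(x - peak) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def soft_embedding(weights, embedding_table):
    """Mix vocabulary embeddings with the given weights.

    With non-negative weights that sum to 1, the result stays inside the
    convex hull of the vocabulary embeddings (an assumed reading of
    "constrained to the vocab-space").
    """
    dim = len(embedding_table[0])
    return [
        sum(w * row[d] for w, row in zip(weights, embedding_table))
        for d in range(dim)
    ]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 tokens, dim 2
    toy_logits = [2.0, 1.0, 0.1]
    weights = gumbel_softmax(toy_logits, tau=0.5, rng=rng)
    latent = soft_embedding(weights, toy_embeddings)
    print("token weights:", weights)
    print("latent vector:", latent)
```

Lower values of tau push the weights toward a single token (close to ordinary discrete decoding), while higher values spread mass over several tokens, which is the sense in which one latent step can stand for several candidate continuations at once.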

Similar Items