Bibliographic Details
Main Authors:	Liu, Bingyang Kelvin; Chen, Ziyu Patrick; Woodruff, David P.
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2512.19171

Abstract:

Current autoregressive language models couple high-level reasoning and low-level token generation into a single sequential process, making the reasoning trajectory vulnerable to compounding expression errors. We propose JEPA-Reasoner, a novel architectural paradigm that decouples these tasks using a Joint-Embedding Predictive Architecture (JEPA) for pure latent-space reasoning and a separate Talker module for linguistic reconstruction. By isolating the reasoning engine from the discrete token-sampling process, our architecture enables: (1) Error Containment, where token-level failures cannot propagate into the latent reasoning chain; (2) Continuous Guidance, providing the generator with access to the entire lossless reasoning trajectory; and (3) Representation of Uncertainty, allowing the model to maintain multiple hypotheses via mixed latent vectors. Controlled experiments on synthetic and natural language tasks demonstrate that this decoupling enables a 0.9B model to achieve a 149.5% improvement in 8-shot GSM8K accuracy over a coupled Transformer baseline trained on identical data. This work shifts the focus from scaling coupled models to investigating decoupled architectures as a more robust foundation for complex reasoning.