Bibliographic Details
Main Author: Zixi, Li
Format: Digital resource
Language:
Published: Zenodo 2025
Online Access: https://doi.org/10.57967/hf/7066
_version_ 1866901413663080448
author Zixi, Li
author_facet Zixi, Li
contents <p>We present LeftAndRight, a diagnostic framework using four algorithmic primitives (>>, <<, 1, 0) to reveal a fundamental property of transformer representations: they geometrically collapse backward operations, regardless of attention architecture. The counterintuitive discovery: We initially hypothesized that causal attention masks cause this collapse. Through systematic validation across three levels (attention patterns, token embeddings, and sentence embeddings), we discovered that even bidirectional models collapse backward operations. DistilBERT, which can attend to future tokens (36.2% future attention), shows zero backward primitives (<< = 0%) at both token and sentence levels. This reveals that the collapse is not caused by attention masks, but by representation geometry itself. Our experiments on 25 boundary problems (OpenXOR, TSP, SAT) and three model architectures (MiniLM, Pythia, DistilBERT) show universal collapse (A = 1.000 across all tests). We demonstrate that learned representations encode inherent temporal directionality (possibly from positional encodings, training data ordering, or fundamental properties of sequential modeling) that prevents encoding of backward operations even when attention is bidirectional. This is not about causal attention. This is about how representations form. The 4 atoms revealed a deeper geometric truth than expected: transformers fail at backtracking not because of attention architecture, but because their representation space is geometrically unidirectional.</p>
format Digital resource
id zenodo_https___doi_org_10_57967_hf_7066
institution Zenodo
language
publishDate 2025
publisher Zenodo
record_format zenodo
title Why Reasoning Models Collapse Themselves in Reasoning
url https://doi.org/10.57967/hf/7066