Staff View: From Embeddings to Dyson Series: Transformer Mechanics as Non-Hermitian Operator Theory :: Library Catalog

Bibliographic Details
Main Author:	Chang, Po-Hao
Format:	Preprint
Published:	2026
Subjects:	Disordered Systems and Neural Networks
Online Access:	https://arxiv.org/abs/2603.11322

_version_	1866912969133129728
author	Chang, Po-Hao
author_facet	Chang, Po-Hao
contents	Transformer architectures are typically described in algorithmic and statistical terms, leaving their internal mechanics without a familiar structural language for researchers trained in physical theories. To bridge this gap, we develop a complementary operator-theoretic framework that recasts their mechanics in a language familiar to many-body physics. Beginning from the token as a discrete index without intrinsic geometry, we show that embedding corresponds to a basis transformation into a continuous representation space. Once such a reference basis is established, self-attention naturally assumes the role of a non-Hermitian interaction operator, and network depth implements an ordered composition of these interactions. Within this formulation, several empirical properties of deep Transformers -- including stability at large depth, representational saturation, and the effectiveness of multi-head decomposition -- find natural structural interpretations as consequences of regulated operator composition. Together, channel factorization and normalization emerge as organizing structural logic rather than isolated architectural choices. This perspective does not rely on post-hoc analogy, but follows a constructive path where each parallel arises from the preceding structural step. By recasting Transformer mechanics in operator language, the framework lowers the conceptual barrier between deep learning and many-body physics through shared mathematical structure, making tools and intuitions from each domain more readily legible to the other.
format	Preprint
id	arxiv_https___arxiv_org_abs_2603_11322
institution	arXiv
publishDate	2026
record_format	arxiv
title	From Embeddings to Dyson Series: Transformer Mechanics as Non-Hermitian Operator Theory
topic	Disordered Systems and Neural Networks
url	https://arxiv.org/abs/2603.11322