MARC21: :: Library Catalog

Bibliographic Details
Main Authors:	Moisescu-Pareja, Gabriela; McCracken, Gavin; Wiltzer, Harley; Létourneau, Vincent; Daniels, Colin; Precup, Doina; Love, Jonathan
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2512.25060

_version_	1866917178521944064
author	Moisescu-Pareja, Gabriela; McCracken, Gavin; Wiltzer, Harley; Létourneau, Vincent; Daniels, Colin; Precup, Doina; Love, Jonathan
author_facet	Moisescu-Pareja, Gabriela; McCracken, Gavin; Wiltzer, Harley; Létourneau, Vincent; Daniels, Colin; Precup, Doina; Love, Jonathan
contents	The Clock and Pizza interpretations, associated with architectures differing in either uniform or learnable attention, were introduced to argue that different architectural designs can yield distinct circuits for modular addition. In this work, we show that this is not the case, and that both uniform attention and trainable attention architectures implement the same algorithm via topologically and geometrically equivalent representations. Our methodology goes beyond the interpretation of individual neurons and weights. Instead, we identify all of the neurons corresponding to each learned representation and then study the collective group of neurons as one entity. This method reveals that each learned representation is a manifold that we can study utilizing tools from topology. Based on this insight, we can statistically analyze the learned representations across hundreds of circuits to demonstrate the similarity between learned modular addition circuits that arise naturally from common deep learning paradigms.
format	Preprint
id	arxiv_https___arxiv_org_abs_2512_25060
institution	arXiv
publishDate	2025
record_format	arxiv
title	On the geometry and topology of representations: the manifolds of modular addition
topic	Machine Learning
url	https://arxiv.org/abs/2512.25060
