Bibliographic Details
Main Authors: Wang, George; Baker, Garrett; Gordon, Andrew; Murfet, Daniel
Format: Preprint
Published: 2025
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2508.00331
_version_ 1866916875061952512
author Wang, George
Baker, Garrett
Gordon, Andrew
Murfet, Daniel
contents Understanding how language models develop their internal computational structure is a central problem in the science of deep learning. While susceptibilities, drawn from statistical physics, offer a promising analytical tool, their full potential for visualizing network organization remains untapped. In this work, we introduce an embryological approach, applying UMAP to the susceptibility matrix to visualize the model's structural development over training. Our visualizations reveal the emergence of a clear "body plan," charting the formation of known features like the induction circuit and discovering previously unknown structures, such as a "spacing fin" dedicated to counting space tokens. This work demonstrates that susceptibility analysis can move beyond validation to uncover novel mechanisms, providing a powerful, holistic lens for studying the developmental principles of complex neural networks.
format Preprint
id arxiv_https___arxiv_org_abs_2508_00331
institution arXiv
publishDate 2025
record_format arxiv
title Embryology of a Language Model
topic Machine Learning
url https://arxiv.org/abs/2508.00331