Staff View: Emergence of Computational Structure in a Neural Network Physics Simulator :: Library Catalog

Bibliographic Details
Main Authors:	Hitchcock, Rohan; Delaney, Gary W.; Manton, Jonathan H.; Scalzo, Richard; Zhu, Jingge
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2504.11830

_version_	1866915245852721152
author	Hitchcock, Rohan; Delaney, Gary W.; Manton, Jonathan H.; Scalzo, Richard; Zhu, Jingge
author_facet	Hitchcock, Rohan; Delaney, Gary W.; Manton, Jonathan H.; Scalzo, Richard; Zhu, Jingge
contents	Neural networks often have identifiable computational structures - components of the network which perform an interpretable algorithm or task - but the mechanisms by which these emerge and the best methods for detecting these structures are not well understood. In this paper we investigate the emergence of computational structure in a transformer-like model trained to simulate the physics of a particle system, where the transformer's attention mechanism is used to transfer information between particles. We show that (a) structures emerge in the attention heads of the transformer which learn to detect particle collisions, (b) the emergence of these structures is associated with degenerate geometry in the loss landscape, and (c) the dynamics of this emergence follow a power law. This suggests that these components are governed by a degenerate "effective potential". These results have implications for the convergence time of computational structure within neural networks and suggest that the emergence of computational structure can be detected by studying the dynamics of network components.
format	Preprint
id	arxiv_https___arxiv_org_abs_2504_11830
institution	arXiv
publishDate	2025
record_format	arxiv
title	Emergence of Computational Structure in a Neural Network Physics Simulator
topic	Machine Learning
url	https://arxiv.org/abs/2504.11830
