Bibliographic Details
Main Authors: Hitchcock, Rohan, Delaney, Gary W., Manton, Jonathan H., Scalzo, Richard, Zhu, Jingge
Format: Preprint
Published: 2025
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2504.11830
_version_ 1866915245852721152
author Hitchcock, Rohan
Delaney, Gary W.
Manton, Jonathan H.
Scalzo, Richard
Zhu, Jingge
contents Neural networks often have identifiable computational structures - components of the network which perform an interpretable algorithm or task - but the mechanisms by which these emerge and the best methods for detecting these structures are not well understood. In this paper we investigate the emergence of computational structure in a transformer-like model trained to simulate the physics of a particle system, where the transformer's attention mechanism is used to transfer information between particles. We show that (a) structures emerge in the attention heads of the transformer which learn to detect particle collisions, (b) the emergence of these structures is associated to degenerate geometry in the loss landscape, and (c) the dynamics of this emergence follows a power law. This suggests that these components are governed by a degenerate "effective potential". These results have implications for the convergence time of computational structure within neural networks and suggest that the emergence of computational structure can be detected by studying the dynamics of network components.
format Preprint
id arxiv_https___arxiv_org_abs_2504_11830
institution arXiv
publishDate 2025
record_format arxiv
title Emergence of Computational Structure in a Neural Network Physics Simulator
topic Machine Learning
url https://arxiv.org/abs/2504.11830