Staff View: Topological Invariance and Breakdown in Learning :: Library Catalog

Bibliographic Details
Main Authors:	Yang, Yongyi; Poggio, Tomaso; Chuang, Isaac; Ziyin, Liu
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2510.02670

_version_	1866915531772133376
author	Yang, Yongyi; Poggio, Tomaso; Chuang, Isaac; Ziyin, Liu
author_facet	Yang, Yongyi; Poggio, Tomaso; Chuang, Isaac; Ziyin, Liu
contents	We prove that for a broad class of permutation-equivariant learning rules (including SGD, Adam, and others), the training process induces a bi-Lipschitz mapping between neurons and strongly constrains the topology of the neuron distribution during training. This result reveals a qualitative difference between small and large learning rates $η$. With a learning rate below a topological critical point $η^*$, the training is constrained to preserve all topological structure of the neurons. In contrast, above $η^*$, the learning process allows for topological simplification, making the neuron manifold progressively coarser and thereby reducing the model's expressivity. Viewed in combination with the recent discovery of the edge of stability phenomenon, the learning dynamics of neural networks under gradient descent can be divided into two phases: first they undergo smooth optimization under topological constraints, and then enter a second phase where they learn through drastic topological simplifications. A key feature of our theory is that it is independent of specific architectures or loss functions, enabling the universal application of topological methods to the study of deep learning.
format	Preprint
id	arxiv_https___arxiv_org_abs_2510_02670
institution	arXiv
publishDate	2025
record_format	arxiv
spellingShingle	Topological Invariance and Breakdown in Learning Yang, Yongyi Poggio, Tomaso Chuang, Isaac Ziyin, Liu Machine Learning We prove that for a broad class of permutation-equivariant learning rules (including SGD, Adam, and others), the training process induces a bi-Lipschitz mapping between neurons and strongly constrains the topology of the neuron distribution during training. This result reveals a qualitative difference between small and large learning rates $η$. With a learning rate below a topological critical point $η^*$, the training is constrained to preserve all topological structure of the neurons. In contrast, above $η^*$, the learning process allows for topological simplification, making the neuron manifold progressively coarser and thereby reducing the model's expressivity. Viewed in combination with the recent discovery of the edge of stability phenomenon, the learning dynamics of neural networks under gradient descent can be divided into two phases: first they undergo smooth optimization under topological constraints, and then enter a second phase where they learn through drastic topological simplifications. A key feature of our theory is that it is independent of specific architectures or loss functions, enabling the universal application of topological methods to the study of deep learning.
title	Topological Invariance and Breakdown in Learning
topic	Machine Learning
url	https://arxiv.org/abs/2510.02670

Similar Items