Bibliographic Details
Main Authors: Bu, Wei, Kol, Uri, Liu, Ziming
Format: Preprint
Published: 2025
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2501.09659
_version_ 1866916780065161216
author Bu, Wei
Kol, Uri
Liu, Ziming
author_facet Bu, Wei
Kol, Uri
Liu, Ziming
contents The dynamical evolution of a neural network during training has been an incredibly fascinating subject of study. First-principles derivations of the generic evolution of variables in statistical physics systems have proved useful for describing training dynamics conceptually, which in practice means numerically solving equations such as the Fokker-Planck equation. Simulating entire networks inevitably runs into the curse of dimensionality. In this paper, we use the Fokker-Planck equation to simulate the probability density evolution of individual weight matrices in the bottleneck layers of a simple 2-bottleneck-layered auto-encoder and compare the theoretical evolutions against the empirical ones by examining the output data distributions. We also derive physically relevant partial differential equations, such as the Callan-Symanzik and Kardar-Parisi-Zhang equations, from the dynamical equation we have.
format Preprint
id arxiv_https___arxiv_org_abs_2501_09659
institution arXiv
publishDate 2025
record_format arxiv
title Fokker-Planck to Callan-Symanzik: evolution of weight matrices under training
topic Machine Learning
url https://arxiv.org/abs/2501.09659