Staff View: :: Library Catalog

Bibliographic Details
Main Authors:	Bu, Wei; Kol, Uri; Liu, Ziming
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2501.09659

_version_	1866916780065161216
author	Bu, Wei; Kol, Uri; Liu, Ziming
author_facet	Bu, Wei; Kol, Uri; Liu, Ziming
contents	The dynamical evolution of a neural network during training has been a fascinating subject of study. First-principles derivations of the generic evolution of variables in statistical physics systems have proved useful for describing training dynamics conceptually, which in practice means numerically solving equations such as the Fokker-Planck equation. Simulating entire networks inevitably runs into the curse of dimensionality. In this paper, we use the Fokker-Planck equation to simulate the probability density evolution of individual weight matrices in the bottleneck layers of a simple auto-encoder with two bottleneck layers, and we compare the theoretical evolutions against the empirical ones by examining the output data distributions. We also derive physically relevant partial differential equations, such as the Callan-Symanzik and Kardar-Parisi-Zhang equations, from this dynamical equation.
format	Preprint
id	arxiv_https___arxiv_org_abs_2501_09659
institution	arXiv
publishDate	2025
record_format	arxiv
title	Fokker-Planck to Callan-Symanzik: evolution of weight matrices under training
topic	Machine Learning
url	https://arxiv.org/abs/2501.09659

Similar Items