Staff view :: Library Catalog

Bibliographic Details
Main authors:	Yang, Xiguang; Arora, Krish; Bachmann, Michael
Format:	Preprint
Published:	2025
Subjects:	Disordered Systems and Neural Networks; Statistical Mechanics; Machine Learning; Computational Physics
Online access:	https://arxiv.org/abs/2501.08341

_version_	1866917892871684096
author	Yang, Xiguang Arora, Krish Bachmann, Michael
author_facet	Yang, Xiguang Arora, Krish Bachmann, Michael
contents	We investigate the loss landscape and backpropagation dynamics of convergence for the simplest possible artificial neural network representing the logical exclusive-OR (XOR) gate. Cross-sections of the loss landscape in the nine-dimensional parameter space are found to exhibit distinct features, which help to understand why backpropagation efficiently achieves convergence toward zero loss, whereas values of weights and biases keep drifting. Differences in shapes of cross-sections obtained by nonrandomized and randomized batches are discussed. In reference to statistical physics, we introduce the microcanonical entropy as a unique quantity that allows one to characterize the phase behavior of the network. Learning in neural networks can thus be thought of as an annealing process that experiences the analogue of phase transitions known from thermodynamic systems. It also reveals how the loss landscape simplifies as more hidden neurons are added to the network, eliminating entropic barriers caused by finite-size effects.
format	Preprint
id	arxiv_https___arxiv_org_abs_2501_08341
institution	arXiv
publishDate	2025
record_format	arxiv
title	Dissecting a Small Artificial Neural Network
topic	Disordered Systems and Neural Networks Statistical Mechanics Machine Learning Computational Physics
url	https://arxiv.org/abs/2501.08341
