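Staff View: Unified Description of Learning Dynamics in the Soft Committee Machine from Finite to Ultra-Wide Regimes :: Library Catalog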

Bibliographic Details
Main Authors:	Afanah, Assem; Rosenow, Bernd
Format:	Preprint
Published:	2025
Subjects:	Disordered Systems and Neural Networks
Online Access:	https://arxiv.org/abs/2512.16556

_version_	1866915693940703232
author	Afanah, Assem; Rosenow, Bernd
author_facet	Afanah, Assem; Rosenow, Bernd
contents	We study the learning dynamics of the soft committee machine (SCM) with Rectified Linear Unit (ReLU) activation using a statistical-mechanics approach within the annealed approximation. The SCM consists of a student network with $N$ input units and $K$ hidden units trained to reproduce the output of a teacher network with $M$ hidden units. We introduce a reduced set of macroscopic order parameters that yields a unified description valid from the conventional regime $K \ll N$ to the ultra-wide limit $K \ge N$. The control parameter $\alpha$, proportional to the ratio of training samples to adjustable weights, serves as an effective measure of dataset size. For small $\gamma = M/N$, we recover a continuous phase transition at $\alpha_{c} \approx 2\pi$ from an unspecialized, permutation-symmetric state to a specialized state in which student units align with the teacher. For finite $\gamma$, the transition disappears and the generalization error decreases smoothly with dataset size, reaching a low plateau when $\gamma = 1$. In the asymptotic limit $\alpha \to \infty$, the error scales as $\varepsilon_{g} \propto 1/\alpha$, independent of $\gamma$ and $K$. The results highlight the central role of network dimensions in SCM learning and provide a framework extendable to other activations and quenched analyses.
format	Preprint
id	arxiv_https___arxiv_org_abs_2512_16556
institution	arXiv
publishDate	2025
record_format	arxiv
title	Unified Description of Learning Dynamics in the Soft Committee Machine from Finite to Ultra-Wide Regimes
topic	Disordered Systems and Neural Networks
url	https://arxiv.org/abs/2512.16556

Similar Items