| Main Authors: | Afanah, Assem; Rosenow, Bernd |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | Disordered Systems and Neural Networks |
| Online Access: | https://arxiv.org/abs/2512.16556 |
| _version_ | 1866915693940703232 |
|---|---|
| author | Afanah, Assem; Rosenow, Bernd |
| author_facet | Afanah, Assem; Rosenow, Bernd |
| contents | We study the learning dynamics of the soft committee machine (SCM) with Rectified Linear Unit (ReLU) activation using a statistical-mechanics approach within the annealed approximation. The SCM consists of a student network with $N$ input units and $K$ hidden units trained to reproduce the output of a teacher network with $M$ hidden units. We introduce a reduced set of macroscopic order parameters that yields a unified description valid from the conventional regime $K \ll N$ to the ultra-wide limit $K \ge N$. The control parameter $\alpha$, proportional to the ratio of training samples to adjustable weights, serves as an effective measure of dataset size. For small $\gamma = M/N$, we recover a continuous phase transition at $\alpha_{c} \approx 2\pi$ from an unspecialized, permutation-symmetric state to a specialized state in which student units align with the teacher. For finite $\gamma$, the transition disappears and the generalization error decreases smoothly with dataset size, reaching a low plateau when $\gamma = 1$. In the asymptotic limit $\alpha \to \infty$, the error scales as $\varepsilon_{g} \propto 1/\alpha$, independent of $\gamma$ and $K$. The results highlight the central role of network dimensions in SCM learning and provide a framework extendable to other activations and quenched analyses. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2512_16556 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| title | Unified Description of Learning Dynamics in the Soft Committee Machine from Finite to Ultra-Wide Regimes |
| topic | Disordered Systems and Neural Networks |
| url | https://arxiv.org/abs/2512.16556 |
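
The student-teacher setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' method (the paper uses an analytical statistical-mechanics treatment); it only shows what a ReLU soft committee machine computes and how a generalization error between a student and a teacher could be estimated by sampling. The function names, the $1/\sqrt{N}$ scaling of the hidden-unit pre-activations, the factor $1/2$ in the error, and the standard Gaussian inputs are assumptions chosen for illustration and are not taken from the record.

```python
import math
import random


def relu(x):
    """Rectified Linear Unit activation."""
    return x if x > 0.0 else 0.0


def scm_output(weights, x):
    """Output of a soft committee machine: the sum of its hidden-unit activations.

    `weights` holds one weight vector per hidden unit. The 1/sqrt(N) scaling
    of the pre-activation is an assumed convention, not stated in the record.
    """
    n = len(x)
    return sum(
        relu(sum(w_i * x_i for w_i, x_i in zip(w, x)) / math.sqrt(n))
        for w in weights
    )


def generalization_error(student, teacher, n_samples=2000, seed=0):
    """Monte Carlo estimate of 0.5 * E[(student(x) - teacher(x))^2].

    Inputs are drawn as i.i.d. standard Gaussians (an assumption), and the
    factor 0.5 is an assumed normalization of the squared error.
    """
    rng = random.Random(seed)
    n = len(teacher[0])
    total = 0.0
    for _ in range(n_samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        diff = scm_output(student, x) - scm_output(teacher, x)
        total += 0.5 * diff * diff
    return total / n_samples


if __name__ == "__main__":
    rng = random.Random(1)
    n_inputs, m_teacher, k_student = 20, 2, 4  # N, M, K (illustrative sizes)
    teacher = [[rng.gauss(0.0, 1.0) for _ in range(n_inputs)] for _ in range(m_teacher)]
    student = [[rng.gauss(0.0, 1.0) for _ in range(n_inputs)] for _ in range(k_student)]
    print("untrained student error:", generalization_error(student, teacher))
    print("student equal to teacher:", generalization_error(teacher, teacher))
```

Because the output is a plain sum over hidden units, swapping the order of the student's units leaves the error unchanged. This is the permutation symmetry the abstract refers to when it describes the unspecialized state.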