| Main Authors: | Bordelon, Blake, Mori, Francesco |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2602.04774 |
Similar Items
How Feature Learning Can Improve Neural Scaling Laws
by: Bordelon, Blake, et al.
Published: (2024)
A Dynamical Model of Neural Scaling Laws
by: Bordelon, Blake, et al.
Published: (2024)
Theory of Scaling Laws for In-Context Regression: Depth, Width, Context and Time
by: Bordelon, Blake, et al.
Published: (2025)
Disordered Dynamics in High Dimensions: Connections to Random Matrices and Machine Learning
by: Bordelon, Blake, et al.
Published: (2026)
Transfer Learning in Infinite Width Feature Learning Networks
by: Lauditi, Clarissa, et al.
Published: (2025)
Spectral Dynamics in Deep Networks: Feature Learning, Outlier Escape, and Learning Rate Transfer
by: Lauditi, Clarissa, et al.
Published: (2026)
Deep Linear Network Training Dynamics from Random Initialization: Data, Width, Depth, and Hyperparameter Transfer
by: Bordelon, Blake, et al.
Published: (2025)
Adaptive kernel predictors from feature-learning infinite limits of neural networks
by: Lauditi, Clarissa, et al.
Published: (2025)
Infinite Limits of Multi-head Transformer Dynamics
by: Bordelon, Blake, et al.
Published: (2024)
No Free Lunch From Random Feature Ensembles: Scaling Laws and Near-Optimality Conditions
by: Ruben, Benjamin S., et al.
Published: (2024)
Grokking as the Transition from Lazy to Rich Training Dynamics
by: Kumar, Tanishq, et al.
Published: (2023)
Dropout Universality: Scaling Laws and Optimal Scheduling at the Edge-of-Chaos
by: Sarmiento, Lucas Fernandez
Published: (2026)
Two-Point Deterministic Equivalence for Stochastic Gradient Dynamics in Linear Models
by: Atanasov, Alexander, et al.
Published: (2025)
Dynamically Learning to Integrate in Recurrent Neural Networks
by: Bordelon, Blake, et al.
Published: (2025)
Optimal Protocols for Continual Learning via Statistical Physics and Control Theory
by: Mori, Francesco, et al.
Published: (2024)
From Kernels to Features: A Multi-Scale Adaptive Theory of Feature Learning
by: Rubin, Noa, et al.
Published: (2025)
Scaling Laws and Representation Learning in Simple Hierarchical Languages: Transformers vs. Convolutional Architectures
by: Cagnetta, Francesco, et al.
Published: (2025)
Explaining Neural Scaling Laws
by: Bahri, Yasaman, et al.
Published: (2021)
Scaling Laws and Spectra of Shallow Neural Networks in the Feature Learning Regime
by: Defilippis, Leonardo, et al.
Published: (2025)
Asymmetric Scaling Laws from Sparse Features
by: Sous, John, et al.
Published: (2026)
Neural Scaling Laws Rooted in the Data Distribution
by: Brill, Ari
Published: (2024)
Deep Learning as Neural Low-Degree Filtering: A Spectral Theory of Hierarchical Feature Learning
by: Dandi, Yatin, et al.
Published: (2026)
A Random-Matrix Criterion for Initializing Gated Recurrent Neural Networks
by: Fioratti, Tommaso, et al.
Published: (2026)
Analytic theory of dropout regularization
by: Mori, Francesco, et al.
Published: (2025)
Asymptotics of Learning with Deep Structured (Random) Features
by: Schröder, Dominik, et al.
Published: (2024)
Random Features Hopfield Networks generalize retrieval to previously unseen examples
by: Kalaj, Silvio, et al.
Published: (2024)
Modeling Structured Data Learning with Restricted Boltzmann Machines in the Teacher-Student Setting
by: Thériault, Robin, et al.
Published: (2024)
Single-Head Attention in High Dimensions: A Theory of Generalization, Weights Spectra, and Scaling Laws
by: Boncoraglio, Fabrizio, et al.
Published: (2025)
Why Warmup the Learning Rate? Underlying Mechanisms and Improvements
by: Kalra, Dayal Singh, et al.
Published: (2024)
How Deep Networks Learn Sparse and Hierarchical Data: the Sparse Random Hierarchy Model
by: Tomasini, Umberto, et al.
Published: (2024)
The Quantization Model of Neural Scaling
by: Michaud, Eric J., et al.
Published: (2023)
Learning curves theory for hierarchically compositional data with power-law distributed features
by: Cagnetta, Francesco, et al.
Published: (2025)
Optimal Spectral Transitions in High-Dimensional Multi-Index Models
by: Defilippis, Leonardo, et al.
Published: (2025)
The Effect of Optimal Self-Distillation in Noisy Gaussian Mixture Model
by: Takanami, Kaito, et al.
Published: (2025)
EB-RANSAC: Random Sample Consensus based on Energy-Based Model
by: Yasuda, Muneki, et al.
Published: (2026)
Theory of Speciation Transitions in Diffusion Models with General Class Structure
by: Achilli, Beatrice, et al.
Published: (2026)
Small Singular Values Matter: A Random Matrix Analysis of Transformer Models
by: Staats, Max, et al.
Published: (2024)
Statistical Physics of Deep Neural Networks: Generalization Capability, Beyond the Infinite Width, and Feature Learning
by: Ariosto, Sebastiano
Published: (2025)
Random features and polynomial rules
by: Aguirre-López, Fabián, et al.
Published: (2024)
Supervised Hebbian Learning
by: Alemanno, Francesco, et al.
Published: (2022)