| Main Authors: | Arnaboldi, Luca, Krzakala, Florent, Loureiro, Bruno, Stephan, Ludovic |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Online Access: | https://arxiv.org/abs/2305.18502 |
Similar Items
Asymptotics of SGD in Sequence-Single Index Models and Single-Layer Attention Networks
by: Arnaboldi, Luca, et al.
Published: (2025)
Repetita Iuvant: Data Repetition Allows SGD to Learn High-Dimensional Multi-Index Functions
by: Arnaboldi, Luca, et al.
Published: (2024)
Universality laws for Gaussian mixtures in generalized linear models
by: Dandi, Yatin, et al.
Published: (2023)
Online Learning and Information Exponents: On The Importance of Batch size, and Time/Complexity Tradeoffs
by: Arnaboldi, Luca, et al.
Published: (2024)
How Two-Layer Neural Networks Learn, One (Giant) Step at a Time
by: Dandi, Yatin, et al.
Published: (2023)
Asymptotics of feature learning in two-layer networks after one gradient-step
by: Cui, Hugo, et al.
Published: (2024)
Optimal scaling laws in learning hierarchical multi-index models
by: Defilippis, Leonardo, et al.
Published: (2026)
Gaussian Universality of Perceptrons with Random Labels
by: Gerace, Federica, et al.
Published: (2022)
A Noise Sensitivity Exponent Controls Large Statistical-to-Computational Gaps in Single- and Multi-Index Models
by: Defilippis, Leonardo, et al.
Published: (2026)
The Benefits of Reusing Batches for Gradient Descent in Two-Layer Networks: Breaking the Curse of Information and Leap Exponents
by: Dandi, Yatin, et al.
Published: (2024)
Deep Learning as Neural Low-Degree Filtering: A Spectral Theory of Hierarchical Feature Learning
by: Dandi, Yatin, et al.
Published: (2026)
A High Dimensional Statistical Model for Adversarial Training: Geometry and Trade-Offs
by: Tanner, Kasimir, et al.
Published: (2024)
The committee machine: Computational to statistical gaps in learning a two-layers neural network
by: Aubin, Benjamin, et al.
Published: (2018)
A Random Matrix Theory Perspective on the Spectrum of Learned Features and Asymptotic Generalization Capabilities
by: Dandi, Yatin, et al.
Published: (2024)
Optimal Spectral Transitions in High-Dimensional Multi-Index Models
by: Defilippis, Leonardo, et al.
Published: (2025)
Fundamental computational limits of weak learnability in high-dimensional multi-index models
by: Troiani, Emanuele, et al.
Published: (2024)
Scaling Laws from Sequential Feature Recovery: A Solvable Hierarchical Model
by: Wortsman-Zurich, Arie, et al.
Published: (2026)
A phase transition between positional and semantic learning in a solvable model of dot-product attention
by: Cui, Hugo, et al.
Published: (2024)
Analysis of Bootstrap and Subsampling in High-dimensional Regularized Regression
by: Clarté, Lucas, et al.
Published: (2024)
Analysis of learning a flow-based generative model from limited sample complexity
by: Cui, Hugo, et al.
Published: (2023)
Fundamental limits of learning in sequence multi-index models and deep attention networks: High-dimensional asymptotics and sharp thresholds
by: Troiani, Emanuele, et al.
Published: (2025)
The Computational Advantage of Depth: Learning High-Dimensional Hierarchical Functions with Gradient Descent
by: Dandi, Yatin, et al.
Published: (2025)
Deep Learning of Compositional Targets with Hierarchical Spectral Methods
by: Tabanelli, Hugo, et al.
Published: (2026)
ColBERT-Zero: To Pre-train Or Not To Pre-train ColBERT models
by: Chaffin, Antoine, et al.
Published: (2026)
Escape dynamics and implicit bias of one-pass SGD in overparameterized quadratic networks
by: Bocchi, Dario, et al.
Published: (2026)
Bayes-optimal learning of an extensive-width neural network from quadratically many samples
by: Maillard, Antoine, et al.
Published: (2024)
Spectral Phase Transition and Optimal PCA in Block-Structured Spiked models
by: Mergny, Pierre, et al.
Published: (2024)
Sharp description of local minima in the loss landscape of high-dimensional two-layer ReLU neural networks
by: Huang, Jie, et al.
Published: (2026)
Sampling with flows, diffusion and autoregressive neural networks: A spin-glass perspective
by: Ghio, Davide, et al.
Published: (2023)
Scaling Laws and Spectra of Shallow Neural Networks in the Feature Learning Regime
by: Defilippis, Leonardo, et al.
Published: (2025)
Learning with Restricted Boltzmann Machines: Asymptotics of AMP and GD in High Dimensions
by: Xu, Yizhou, et al.
Published: (2025)
Asymptotic Characterisation of Robust Empirical Risk Minimisation Performance in the Presence of Outliers
by: Vilucchio, Matteo, et al.
Published: (2023)
Fundamental limits of Non-Linear Low-Rank Matrix Estimation
by: Mergny, Pierre, et al.
Published: (2024)
Breaking the curse of dimensionality for linear rules: optimal predictors over the ellipsoid
by: Ayme, Alexis, et al.
Published: (2025)
Fast Escape, Slow Convergence: Learning Dynamics of Phase Retrieval under Power-Law Data
by: Braun, Guillaume, et al.
Published: (2025)
Kernel ridge regression under power-law data: spectrum and generalization
by: Wortsman, Arie, et al.
Published: (2025)
The merged-staircase property: a necessary and nearly sufficient condition for SGD learning of sparse functions on two-layer neural networks
by: Abbe, Emmanuel, et al.
Published: (2022)
Provable Learning of Random Hierarchy Models and Hierarchical Shallow-to-Deep Chaining
by: Ren, Yunwei, et al.
Published: (2026)
Computational Thresholds in Multi-Modal Learning via the Spiked Matrix-Tensor Model
by: Tabanelli, Hugo, et al.
Published: (2025)
Convergence, Sticking and Escape: Stochastic Dynamics Near Critical Points in SGD
by: Dudukalov, Dmitry, et al.
Published: (2025)