| Main Authors: | Kunin, Daniel; Marchetti, Giovanni Luca; Chen, Feng; Karkada, Dhruva; Simon, James B.; DeWeese, Michael R.; Ganguli, Surya; Miolane, Nina |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2506.06489 |
Similar Items
Closed-Form Training Dynamics Reveal Learned Features and Linear Structure in Word2Vec-like Models
by: Karkada, Dhruva, et al.
Published: (2025)
Sequential Group Composition: A Window into the Mechanics of Deep Learning
by: Marchetti, Giovanni Luca, et al.
Published: (2026)
Revisiting Policy Gradients for Restricted Policy Classes: Escaping Myopic Local Optima with $k$-step Policy Gradients
by: DeWeese, Alex, et al.
Published: (2026)
A Theory of Saddle Escape in Deep Nonlinear Networks
by: Rawal, Divit, et al.
Published: (2026)
The lazy (NTK) and rich ($μ$P) regimes: a gentle tutorial
by: Karkada, Dhruva
Published: (2024)
Stochastic Collapse: How Gradient Noise Attracts SGD Dynamics Towards Simpler Subnetworks
by: Chen, Feng, et al.
Published: (2023)
Status Concerns and Library Professionalism
by: DeWeese, L. Carroll
Published: (1972)
Thinking Beyond Visibility: A Near-Optimal Policy Framework for Locally Interdependent Multi-Agent MDPs
by: DeWeese, Alex, et al.
Published: (2025)
Locally Interdependent Multi-Agent MDP: Theoretical Framework for Decentralized Agents with Dynamic Dependencies
by: DeWeese, Alex, et al.
Published: (2024)
A Paradigm of Commitment
by: DeWeese, Lemuel Carroll, III
Published: (1970)
A Paradigm of Commitment: Toward Professional Identity for Librarians.
by: DeWeese, Lemuel Carroll, III
Published: (1970)
Beyond Linear Response: Equivalence between Thermodynamic Geometry and Optimal Transport
by: Zhong, Adrianne, et al.
Published: (2024)
Temperature and flow data from a sediment tank experiment and numerical Advection-Dispersion Model code
by: Luce, Charles, et al.
Published: (2017)
Predicting kernel regression learning curves from only raw data statistics
by: Karkada, Dhruva, et al.
Published: (2025)
More is Better in Modern Machine Learning: when Infinite Overparameterization is Optimal and Overfitting is Obligatory
by: Simon, James B., et al.
Published: (2023)
Higher-order response theory in optimal stochastic thermodynamics
by: D'Ambrosia, Samuel H., et al.
Published: (2025)
Time-Asymmetric Fluctuation Theorem and Efficient Free Energy Estimation
by: Zhong, Adrianne, et al.
Published: (2023)
Get rich quick: exact solutions reveal how unbalanced initializations promote rapid feature learning
by: Kunin, Daniel, et al.
Published: (2024)
On the Emergence of Linear Analogies in Word Embeddings
by: Korchinski, Daniel J., et al.
Published: (2025)
The Entropy of Floating-Point Numbers
by: Daniels, Sultan, et al.
Published: (2026)
There Will Be a Scientific Theory of Deep Learning
by: Simon, Jamie, et al.
Published: (2026)
A General Framework for Robust G-Invariance in G-Equivariant Networks
by: Sanborn, Sophia, et al.
Published: (2023)
The Thermodynamic Costs of Simple Linear Regression
by: D'Ambrosia, Samuel H., et al.
Published: (2026)
Symmetry in language statistics shapes the geometry of model representations
by: Karkada, Dhruva, et al.
Published: (2026)
Meta-Learning for Better Learning: Using Meta-Learning Methods to Automatically Label Exam Questions with Detailed Learning Objectives
by: Zur, Amir, et al.
Published: (2023)
TopoTune : A Framework for Generalized Combinatorial Complex Neural Networks
by: Papillon, Mathilde, et al.
Published: (2024)
An efficient algorithm for the Riemannian logarithm on the Stiefel manifold for a family of Riemannian metrics
by: Mataigne, Simon, et al.
Published: (2024)
Features are fate: a theory of transfer learning in high-dimensional regression
by: Tahir, Javan, et al.
Published: (2024)
Learning on a Razor's Edge: Identifiability and Singularity of Polynomial Neural Networks
by: Shahverdi, Vahid, et al.
Published: (2025)
Architectures of Topological Deep Learning: A Survey of Message-Passing Topological Neural Networks
by: Papillon, Mathilde, et al.
Published: (2023)
An analytic theory of creativity in convolutional diffusion models
by: Kamb, Mason, et al.
Published: (2024)
Fooling LLM graders into giving better grades through neural activity guided adversarial prompting
by: Yamamura, Atsushi, et al.
Published: (2024)
Harmonics of Learning: Universal Fourier Features Emerge in Invariant Networks
by: Marchetti, Giovanni Luca, et al.
Published: (2023)
The Selective Disk Bispectrum and Its Inversion, with Application to Multi-Reference Alignment
by: Myers, Adele, et al.
Published: (2025)
Rubik's Abstract Polytopes
by: Marchetti, Giovanni Luca
Published: (2025)
On the approximation of the Riemannian barycenter
by: Mataigne, Simon, et al.
Published: (2025)
Bounds on the geodesic distances on the Stiefel manifold for a family of Riemannian metrics
by: Mataigne, Simon, et al.
Published: (2024)
The Selective G-Bispectrum and its Inversion: Applications to G-Invariant Networks
by: Mataigne, Simon, et al.
Published: (2024)
Deriving Neural Scaling Laws from the statistics of natural language
by: Cagnetta, Francesco, et al.
Published: (2026)
Causal Interpretation of Neural Network Computations with Contribution Decomposition
by: Melander, Joshua Brendan, et al.
Published: (2026)