| Main Authors: | Yang, Yongyi, Poggio, Tomaso, Chuang, Isaac, Ziyin, Liu |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2510.02670 |
Similar Items
Heterosynaptic Circuits Are Universal Gradient Machines
by: Ziyin, Liu, et al.
Published: (2025)
Parameter Symmetry Potentially Unifies Deep Learning Theory
by: Ziyin, Liu, et al.
Published: (2025)
Formation of Representations in Neural Networks
by: Ziyin, Liu, et al.
Published: (2024)
Ubiquity of Emergent Hebbian Dynamics in Regularized Learning
by: Koplow, David, et al.
Published: (2025)
A universal compression theory for lottery ticket hypothesis and neural scaling laws
by: Wang, Hong-Yi, et al.
Published: (2025)
Does Feedback Alignment Work at Biological Timescales?
by: Bacvanski, Marc Gong, et al.
Published: (2025)
On efficiently computable functions, deep networks and sparse compositionality
by: Poggio, Tomaso
Published: (2025)
An Equivariance Toolbox for Learning Dynamics
by: Yang, Yongyi, et al.
Published: (2025)
Proof of a perfect platonic representation hypothesis
by: Ziyin, Liu, et al.
Published: (2025)
How Neural Networks Learn the Support is an Implicit Regularization Effect of SGD
by: Beneventano, Pierfrancesco, et al.
Published: (2024)
Remove Symmetries to Control Model Expressivity and Improve Optimization
by: Ziyin, Liu, et al.
Published: (2024)
On Generalization Bounds for Neural Networks with Low Rank Layers
by: Pinto, Andrea, et al.
Published: (2024)
Neural Thermodynamics: Entropic Forces in Deep and Universal Representation Learning
by: Ziyin, Liu, et al.
Published: (2025)
Learning Sparse Compositional Functions with Norm-Constrained Neural Networks
by: Huang, Shuo, et al.
Published: (2026)
Hierarchical Reasoning Models: Perspectives and Misconceptions
by: Ge, Renee, et al.
Published: (2025)
pAI/MSc: ML Theory Research with Humans on the Loop
by: Abdelmoneum, Mahmoud, et al.
Published: (2026)
Unraveling Syntax: How Language Models Learn Context-Free Grammars
by: Schulz, Laura Ying, et al.
Published: (2025)
Does SGD Seek Flatness or Sharpness? An Exactly Solvable Model
by: Xu, Yizhou, et al.
Published: (2026)
SGD and Weight Decay Secretly Minimize the Rank of Your Neural Network
by: Galanti, Tomer, et al.
Published: (2022)
Thermodynamic Irreversibility of Training Algorithms
by: Ziyin, Liu, et al.
Published: (2026)
Does Weight Decay Enhance Training Stability?
by: Saether, Marius, et al.
Published: (2026)
Iterative regularization in classification via hinge loss diagonal descent
by: Apidopoulos, Vassilis, et al.
Published: (2022)
Learning Multi-Index Models with Hyper-Kernel Ridge Regression
by: Huang, Shuo, et al.
Published: (2025)
Position: A Theory of Deep Learning Must Include Compositional Sparsity
by: Danhofer, David A., et al.
Published: (2025)
On the Invariance and Generality of Neural Scaling Laws
by: Han, Xing, et al.
Published: (2026)
Too Sharp, Too Sure: When Calibration Follows Curvature
by: Morosini, Alessandro, et al.
Published: (2026)
Symmetry Induces Structure and Constraint of Learning
by: Ziyin, Liu
Published: (2023)
Momentum Further Constrains Sharpness at the Edge of Stochastic Stability
by: Andreyev, Arseniy, et al.
Published: (2026)
The Generalized Turing Test: A Foundation for Comparing Intelligence
by: Mitropolsky, Daniel, et al.
Published: (2026)
Same Error, Different Function: The Optimizer as an Implicit Prior in Financial Time Series
by: Cortesi, Federico Vittorio, et al.
Published: (2026)
Do Deep Networks Forget Initialization? A Forgetting-Time View of Practical Inductive Bias
by: Das, Mohua, et al.
Published: (2026)
Three Mechanisms of Feature Learning in a Linear Network
by: Xu, Yizhou, et al.
Published: (2024)
Information Filtering Networks: Theoretical Foundations, Generative Methodologies, and Real-World Applications
by: Aste, Tomaso
Published: (2025)
Training the Untrainable: Introducing Inductive Bias via Representational Alignment
by: Subramaniam, Vighnesh, et al.
Published: (2024)
Self-Assembly of a Biologically Plausible Learning Circuit
by: Liao, Qianli, et al.
Published: (2024)
Manifold Topological Deep Learning for Biomedical Data
by: Liu, Xiang, et al.
Published: (2025)
Continuous Invariance Learning
by: Lin, Yong, et al.
Published: (2023)
Provable Low-Frequency Bias of In-Context Learning of Representations
by: Yang, Yongyi, et al.
Published: (2025)
Unraveling the Enigma of Double Descent: An In-depth Analysis through the Lens of Learned Feature Space
by: Gu, Yufei, et al.
Published: (2023)
New Evidence of the Two-Phase Learning Dynamics of Neural Networks
by: Zhou, Zhanpeng, et al.
Published: (2025)