| Main Authors: | Wang, Lawrence; Roberts, Stephen J. |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2511.12558 |
Similar Items
Can Stability be Detrimental? Better Generalization through Gradient Descent Instabilities
by: Wang, Lawrence, et al.
Published: (2024)
The Implicit Bias of Gradient Descent on Separable Multiclass Data
by: Ravi, Hrithik, et al.
Published: (2024)
The Implicit Bias of Gradient Descent on Separable Data
by: Soudry, Daniel, et al.
Published: (2017)
The Implicit Bias of Steepest Descent with Mini-batch Stochastic Gradient
by: Li, Jichu, et al.
Published: (2026)
Understanding Gradient Descent through the Training Jacobian
by: Belrose, Nora, et al.
Published: (2024)
Streaming Krylov-Accelerated Stochastic Gradient Descent
by: Thomas, Stephen
Published: (2025)
Gradient Descent as a Shrinkage Operator for Spectral Bias
by: Lucey, Simon
Published: (2025)
Implicit Bias of Gradient Descent for Non-Homogeneous Deep Networks
by: Cai, Yuhang, et al.
Published: (2025)
Convergence and Implicit Bias of Gradient Descent on Continual Linear Classification
by: Jung, Hyunji, et al.
Published: (2025)
Bias of Stochastic Gradient Descent or the Architecture: Disentangling the Effects of Overparameterization of Neural Networks
by: Peleg, Amit, et al.
Published: (2024)
Refining Covariance Matrix Estimation in Stochastic Gradient Descent Through Bias Reduction
by: Wei, Ziyang, et al.
Published: (2026)
Stochastic Normalized Gradient Descent with Momentum for Large-Batch Training
by: Zhao, Shen-Yi, et al.
Published: (2020)
Feature Averaging: An Implicit Bias of Gradient Descent Leading to Non-Robustness in Neural Networks
by: Li, Binghui, et al.
Published: (2024)
Gradient Descent Algorithm Survey
by: Fucheng, Deng, et al.
Published: (2025)
Occam Gradient Descent
by: Kausik, B. N.
Published: (2024)
Trained Mamba Emulates Online Gradient Descent in In-Context Linear Regression
by: Jiang, Jiarui, et al.
Published: (2025)
First-ish Order Methods: Hessian-aware Scalings of Gradient Descent
by: Smee, Oscar, et al.
Published: (2025)
Superpositional Gradient Descent: Harnessing Quantum Principles for Model Training
by: Pamuk, Ahmet Erdem, et al.
Published: (2025)
Neutron Reflectometry by Gradient Descent
by: Champneys, Max D., et al.
Published: (2025)
Anti-Correlated Noise in Epoch-Based Stochastic Gradient Descent: Implications for Weight Variances in Flat Directions
by: Kühn, Marcel, et al.
Published: (2023)
Step by Step: Adaptive Gradient Descent for Training L-Lipschitz Neural Networks
by: Sung, Kyle, et al.
Published: (2025)
The Global Empirical NTK: Self-Referential Bias and Dimensionality of Gradient Descent Learning
by: Hazelden, James, et al.
Published: (2026)
Stochastic Adaptive Gradient Descent Without Descent
by: Aujol, Jean-François, et al.
Published: (2025)
Thermodynamic Natural Gradient Descent
by: Donatella, Kaelan, et al.
Published: (2024)
Stacking as Accelerated Gradient Descent
by: Agarwal, Naman, et al.
Published: (2024)
Corner Gradient Descent
by: Yarotsky, Dmitry
Published: (2025)
Adjacent Leader Decentralized Stochastic Gradient Descent
by: He, Haoze, et al.
Published: (2024)
Approximation and Gradient Descent Training with Neural Networks
by: Welper, G.
Published: (2024)
Convergence of Implicit Gradient Descent for Training Two-Layer Physics-Informed Neural Networks
by: Xu, Xianliang, et al.
Published: (2024)
Gradient Descent Robustly Learns the Intrinsic Dimension of Data in Training Convolutional Neural Networks
by: Zhang, Chenyang, et al.
Published: (2025)
Distributed Gradient Descent for Functional Learning
by: Yu, Zhan, et al.
Published: (2023)
Robust Gradient Descent for Phase Retrieval
by: Buna, Alex, et al.
Published: (2024)
On the Generalization of Stochastic Gradient Descent with Momentum
by: Ramezani-Kebrya, Ali, et al.
Published: (2018)
Dual Natural Gradient Descent for Scalable Training of Physics-Informed Neural Networks
by: Jnini, Anas, et al.
Published: (2025)
How Does the ReLU Activation Affect the Implicit Bias of Gradient Descent on High-dimensional Neural Network Regression?
by: Lai, Kuo-Wei, et al.
Published: (2026)
Adaptive Conditional Gradient Descent
by: Khademi, Abbas, et al.
Published: (2025)
$k$-SVD with Gradient Descent
by: Jedra, Yassir, et al.
Published: (2025)
Transformers Trained via Gradient Descent Can Provably Learn a Class of Teacher Models
by: Zhang, Chenyang, et al.
Published: (2026)
Quantum Equilibrium Propagation: Gradient-Descent Training of Quantum Systems
by: Scellier, Benjamin
Published: (2024)
Stochastic Gradient Descent with Momentum is Algorithmically Stable
by: Lei, Yunwen, et al.
Published: (2026)