:: Library Catalog

Bibliographic Details
Main Authors:	Wang, Lawrence; Roberts, Stephen J.
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2412.17613

Similar Items

Training Instabilities Induce Flatness Bias in Gradient Descent
by: Wang, Lawrence, et al.
Published: (2025)

Can Gradient Descent Simulate Prompting?
by: Zhang, Eric, et al.
Published: (2025)

Product-Stability: Provable Convergence for Gradient Descent on the Edge of Stability
by: Gan, Eric
Published: (2026)

Streaming Krylov-Accelerated Stochastic Gradient Descent
by: Thomas, Stephen
Published: (2025)

Understanding Gradient Descent through the Training Jacobian
by: Belrose, Nora, et al.
Published: (2024)

Can LLMs predict the convergence of Stochastic Gradient Descent?
by: Zekri, Oussama, et al.
Published: (2024)

On the Generalization of Stochastic Gradient Descent with Momentum
by: Ramezani-Kebrya, Ali, et al.
Published: (2018)

Non-Euclidean Gradient Descent Operates at the Edge of Stability
by: Islamov, Rustem, et al.
Published: (2026)

Outlier Gradient Analysis: Efficiently Identifying Detrimental Training Samples for Deep Learning Models
by: Chhabra, Anshuman, et al.
Published: (2024)

Generalization Guarantees of Gradient Descent for Multi-Layer Neural Networks
by: Wang, Puyu, et al.
Published: (2023)

Occam Gradient Descent
by: Kausik, B. N.
Published: (2024)

Gradient Descent Algorithm Survey
by: Fucheng, Deng, et al.
Published: (2025)

Generalization Bounds of Stochastic Gradient Descent in Homogeneous Neural Networks
by: Ma, Wenquan, et al.
Published: (2026)

First-ish Order Methods: Hessian-aware Scalings of Gradient Descent
by: Smee, Oscar, et al.
Published: (2025)

Can Looped Transformers Learn to Implement Multi-step Gradient Descent for In-context Learning?
by: Gatmiry, Khashayar, et al.
Published: (2024)

Characterizing Dynamical Stability of Stochastic Gradient Descent in Overparameterized Learning
by: Chemnitz, Dennis, et al.
Published: (2024)

Generalized Gradient Descent is a Hypergraph Functor
by: Hanks, Tyler, et al.
Published: (2024)

Neutron Reflectometry by Gradient Descent
by: Champneys, Max D., et al.
Published: (2025)

Convergence Rates for Gradient Descent on the Edge of Stability in Overparametrised Least Squares
by: MacDonald, Lachlan Ewen, et al.
Published: (2025)

Improving Energy Natural Gradient Descent through Woodbury, Momentum, and Randomization
by: Guzmán-Cordero, Andrés, et al.
Published: (2025)

Stochastic Adaptive Gradient Descent Without Descent
by: Aujol, Jean-François, et al.
Published: (2025)

Armijo Line-search Can Make (Stochastic) Gradient Descent Provably Faster
by: Vaswani, Sharan, et al.
Published: (2025)

Thermodynamic Natural Gradient Descent
by: Donatella, Kaelan, et al.
Published: (2024)

Stacking as Accelerated Gradient Descent
by: Agarwal, Naman, et al.
Published: (2024)

Corner Gradient Descent
by: Yarotsky, Dmitry
Published: (2025)

Adjacent Leader Decentralized Stochastic Gradient Descent
by: He, Haoze, et al.
Published: (2024)

On the Optimization and Generalization of Two-layer Transformers with Sign Gradient Descent
by: Li, Bingrui, et al.
Published: (2024)

Transformers Trained via Gradient Descent Can Provably Learn a Class of Teacher Models
by: Zhang, Chenyang, et al.
Published: (2026)

Controlling the Flow: Stability and Convergence for Stochastic Gradient Descent with Decaying Regularization
by: Kassing, Sebastian, et al.
Published: (2025)

Gradient Flossing: Improving Gradient Descent through Dynamic Control of Jacobians
by: Engelken, Rainer
Published: (2023)

Type-II Saddles and Probabilistic Stability of Stochastic Gradient Descent
by: Ziyin, Liu, et al.
Published: (2023)

Robust Gradient Descent for Phase Retrieval
by: Buna, Alex, et al.
Published: (2024)

Distributed Gradient Descent for Functional Learning
by: Yu, Zhan, et al.
Published: (2023)

Optimal Rates for Generalization of Gradient Descent for Deep ReLU Classification
by: Li, Yuanfan, et al.
Published: (2025)

Multiclass Loss Geometry Matters for Generalization of Gradient Descent in Separable Classification
by: Schliserman, Matan, et al.
Published: (2025)

Generalization Guarantees on Data-Driven Tuning of Gradient Descent with Langevin Updates
by: Goyal, Saumya, et al.
Published: (2026)

The Implicit Bias of Gradient Descent on Separable Multiclass Data
by: Ravi, Hrithik, et al.
Published: (2024)

Algorithmic Stability of Stochastic Gradient Descent with Momentum under Heavy-Tailed Noise
by: Dang, Thanh, et al.
Published: (2025)

Adaptive Conditional Gradient Descent
by: Khademi, Abbas, et al.
Published: (2025)

$k$-SVD with Gradient Descent
by: Jedra, Yassir, et al.
Published: (2025)