:: Library Catalog

Bibliographic Details
Main Authors:	Fang, Cheng; Dixit, Rishabh; Bajwa, Waheed U.; Gurbuzbalaban, Mert
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Optimization and Control
Online Access:	https://arxiv.org/abs/2502.07977

Similar Items

Accelerated Gradient Methods for Nonconvex Optimization: Escape Trajectories From Strict Saddle Points and Convergence to Local Minima
by: Dixit, Rishabh, et al.
Published: (2023)

DIGing-SGLD: Decentralized and Scalable Langevin Sampling over Time-Varying Networks
by: Bajwa, Waheed U., et al.
Published: (2025)

Accelerated Gradient Methods with Biased Gradient Estimates: Risk Sensitivity, High-Probability Guarantees, and Large Deviation Bounds
by: Gürbüzbalaban, Mert, et al.
Published: (2025)

Heavy-Tail Phenomenon in Decentralized SGD
by: Gurbuzbalaban, Mert, et al.
Published: (2022)

Algorithmic Stability of Stochastic Gradient Descent with Momentum under Heavy-Tailed Noise
by: Dang, Thanh, et al.
Published: (2025)

Generalized EXTRA stochastic gradient Langevin dynamics
by: Gurbuzbalaban, Mert, et al.
Published: (2024)

Faster Convergence of Stochastic Accelerated Gradient Descent under Interpolation
by: Mishkin, Aaron, et al.
Published: (2024)

Decentralized Stochastic Gradient Descent Ascent for Finite-Sum Minimax Problems
by: Gao, Hongchang
Published: (2022)

An Accelerated Primal Dual Algorithm with Backtracking for Decentralized Constrained Optimization
by: Xu, Qiushui, et al.
Published: (2025)

Stochastic Adaptive Gradient Descent Without Descent
by: Aujol, Jean-François, et al.
Published: (2025)

Corner Gradient Descent
by: Yarotsky, Dmitry
Published: (2025)

Learning Provably Improves the Convergence of Gradient Descent
by: Song, Qingyu, et al.
Published: (2025)

Enhancing Fractional Gradient Descent with Learned Optimizers
by: Sobotka, Jan, et al.
Published: (2025)

Scalable Decentralized Learning with Teleportation
by: Takezawa, Yuki, et al.
Published: (2025)

Anytime Acceleration of Gradient Descent
by: Zhang, Zihan, et al.
Published: (2024)

On the Convergence of Gradient Descent on Learning Transformers with Residual Connections
by: Qin, Zhen, et al.
Published: (2025)

Gauss-Newton Natural Gradient Descent for Shape Learning
by: King, James, et al.
Published: (2026)

Adaptive Conditional Gradient Descent
by: Khademi, Abbas, et al.
Published: (2025)

$k$-SVD with Gradient Descent
by: Jedra, Yassir, et al.
Published: (2025)

Active Learning of Deep Neural Networks via Gradient-Free Cutting Planes
by: Zhang, Erica, et al.
Published: (2024)

Gradient is All You Need? How Consensus-Based Optimization can be Interpreted as a Stochastic Relaxation of Gradient Descent
by: Riedl, Konstantin, et al.
Published: (2023)

Communication-Efficient Gradient Descent-Accent Methods for Distributed Variational Inequalities: Unified Analysis and Local Updates
by: Zhang, Siqi, et al.
Published: (2023)

Stochastic Gradient Descent with Strategic Querying
by: Jiang, Nanfei, et al.
Published: (2025)

Stochastic Gradient Descent with Adaptive Data
by: Che, Ethan, et al.
Published: (2024)

Unraveling the Gradient Descent Dynamics of Transformers
by: Song, Bingqing, et al.
Published: (2024)

Reconstructing Physics-Informed Machine Learning for Traffic Flow Modeling: a Multi-Gradient Descent and Pareto Learning Approach
by: Lei, Yuan-Zheng, et al.
Published: (2025)

Mirror and Preconditioned Gradient Descent in Wasserstein Space
by: Bonet, Clément, et al.
Published: (2024)

Derivatives of Stochastic Gradient Descent in parametric optimization
by: Iutzeler, Franck, et al.
Published: (2024)

Convergence of Alternating Gradient Descent for Matrix Factorization
by: Ward, Rachel, et al.
Published: (2023)

On Penalty-based Bilevel Gradient Descent Method
by: Shen, Han, et al.
Published: (2023)

A Local Polyak-Lojasiewicz and Descent Lemma of Gradient Descent For Overparametrized Linear Models
by: Xu, Ziqing, et al.
Published: (2025)

Using Stochastic Gradient Descent to Smooth Nonconvex Functions: Analysis of Implicit Graduated Optimization
by: Sato, Naoki, et al.
Published: (2023)

Increasing Both Batch Size and Learning Rate Accelerates Stochastic Gradient Descent
by: Umeda, Hikaru, et al.
Published: (2024)

AutoGD: Automatic Learning Rate Selection for Gradient Descent
by: Surjanovic, Nikola, et al.
Published: (2025)

Scaling Laws for Gradient Descent and Sign Descent for Linear Bigram Models under Zipf's Law
by: Kunstner, Frederik, et al.
Published: (2025)

On the Inherent Privacy of Zeroth Order Projected Gradient Descent
by: Gupta, Devansh, et al.
Published: (2025)

Convergence Analysis of Stochastic Gradient Descent with MCMC Estimators
by: Li, Tianyou, et al.
Published: (2023)

The Sample Complexity of Gradient Descent in Stochastic Convex Optimization
by: Livni, Roi
Published: (2024)

On Gradient Descent Ascent for Nonconvex-Concave Minimax Problems
by: Lin, Tianyi, et al.
Published: (2019)

Gradient Descent's Last Iterate is Often (slightly) Suboptimal
by: Kornowski, Guy, et al.
Published: (2026)