Library Catalog

Bibliographic Details
Main Authors:	Garg, Sachin; Dereziński, Michał
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2605.18609

Similar Items

Second-order Information Promotes Mini-Batch Robustness in Variance-Reduced Gradients
by: Garg, Sachin, et al.
Published: (2024)

Faster Low-Rank Approximation and Kernel Ridge Regression via the Block-Nyström Method
by: Garg, Sachin, et al.
Published: (2025)

Stochastic Variance-Reduced Newton: Accelerating Finite-Sum Minimization with Large Batches
by: Dereziński, Michał
Published: (2022)

Distributed Least Squares in Small Space via Sketching and Bias Reduction
by: Garg, Sachin, et al.
Published: (2024)

Last-Iterate Convergence of Randomized Kaczmarz and SGD with Greedy Step Size
by: Dereziński, Michał, et al.
Published: (2026)

Debiasing Random Oblique Projections for Subsampled OLS and Fast CUR in High Dimensions
by: Niu, Chengmei, et al.
Published: (2026)

Turbocharging Gaussian Process Inference with Approximate Sketch-and-Project
by: Rathore, Pratik, et al.
Published: (2025)

Accelerating Power Method with Fast Sketching for Stronger Low-Rank Approximation
by: Chenakkod, Shabarish, et al.
Published: (2026)

SGD with Adaptive Preconditioning: Unified Analysis and Momentum Acceleration
by: Kovalev, Dmitry
Published: (2025)

From One-Pass SGD to Data Reuse: Mini-Batch Scaling Laws in Sketched Linear Regression
by: Chen, Ziyan, et al.
Published: (2026)

Understanding Outer Optimizers in Local SGD: Learning Rates, Momentum, and Acceleration
by: Khaled, Ahmed, et al.
Published: (2025)

Recent and Upcoming Developments in Randomized Numerical Linear Algebra for Machine Learning
by: Dereziński, Michał, et al.
Published: (2024)

Ordered Momentum for Asynchronous SGD
by: Shi, Chang-Wei, et al.
Published: (2024)

Approaching Optimality for Solving Dense Linear Systems with Low-Rank Structure
by: Dereziński, Michał, et al.
Published: (2025)

Solving Dense Linear Systems Faster Than via Preconditioning
by: Dereziński, Michał, et al.
Published: (2023)

Dimension-adapted Momentum Outscales SGD
by: Ferbach, Damien, et al.
Published: (2025)

Stochastic Newton Proximal Extragradient Method
by: Jiang, Ruichen, et al.
Published: (2024)

HERTA: A High-Efficiency and Rigorous Training Algorithm for Unfolded Graph Neural Networks
by: Yang, Yongyi, et al.
Published: (2024)

On the Provable Suboptimality of Momentum SGD in Nonstationary Stochastic Optimization
by: Sahu, Sharan, et al.
Published: (2026)

The Marginal Value of Momentum for Small Learning Rate SGD
by: Wang, Runzhe, et al.
Published: (2023)

Signal Processing Meets SGD: From Momentum to Filter
by: Yao, Zhipeng, et al.
Published: (2023)

Optimal Oblivious Subspace Embeddings with Near-optimal Sparsity
by: Chenakkod, Shabarish, et al.
Published: (2024)

Optimal Subspace Embeddings: Resolving Nelson-Nguyen Conjecture Up to Sub-Polylogarithmic Factors
by: Chenakkod, Shabarish, et al.
Published: (2025)

Debiasing Mini-Batch Quadratics for Applications in Deep Learning
by: Tatzel, Lukas, et al.
Published: (2024)

Towards Universal Convergence of Backward Error in Linear System Solvers
by: Dereziński, Michał, et al.
Published: (2026)

Faster Linear Systems and Matrix Norm Approximation via Multi-level Sketched Preconditioning
by: Dereziński, Michał, et al.
Published: (2024)

High-dimensional limit theorems for SGD: Momentum and Adaptive Step-sizes
by: Jagannath, Aukosh, et al.
Published: (2025)

Mini-Batch Kernel $k$-means
by: Jourdan, Ben, et al.
Published: (2024)

Mini-Batch Class Composition Bias in Link Prediction
by: Maguire, Kieran, et al.
Published: (2026)

Enhancing SignSGD: Small-Batch Convergence Analysis and a Hybrid Switching Strategy
by: Chen, Haoran, et al.
Published: (2026)

SGD for Variational Inference: Tackling Unbounded Variance via Preconditioning and Dynamic Batching
by: Labarrière, Hippolyte, et al.
Published: (2026)

Stochastic Normalized Gradient Descent with Momentum for Large-Batch Training
by: Zhao, Shen-Yi, et al.
Published: (2020)

Optimal Embedding Dimension for Sparse Subspace Embeddings
by: Chenakkod, Shabarish, et al.
Published: (2023)

Can Microcanonical Langevin Dynamics Leverage Mini-Batch Gradient Noise?
by: Sommer, Emanuel, et al.
Published: (2026)

Bringing Order to Asynchronous SGD: Towards Optimality under Data-Dependent Delays with Momentum
by: Dahan, Tehila, et al.
Published: (2026)

Hierarchical Rectified Flow Matching with Mini-Batch Couplings
by: Zhang, Yichi, et al.
Published: (2025)

Leveraging Coordinate Momentum in SignSGD and Muon: Memory-Optimized Zero-Order
by: Petrov, Egor, et al.
Published: (2025)

Well-Conditioned Oblivious Perturbations in Linear Space
by: Chenakkod, Shabarish, et al.
Published: (2026)

Pseudo-Asynchronous Local SGD: Robust and Efficient Data-Parallel Training
by: Naganuma, Hiroki, et al.
Published: (2025)

Optimal Growth Schedules for Batch Size and Learning Rate in SGD that Reduce SFO Complexity
by: Umeda, Hikaru, et al.
Published: (2025)