| Main Authors: | Garg, Sachin, Dereziński, Michał |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2605.18609 |
Similar Items
Second-order Information Promotes Mini-Batch Robustness in Variance-Reduced Gradients
by: Garg, Sachin, et al.
Published: (2024)
Faster Low-Rank Approximation and Kernel Ridge Regression via the Block-Nyström Method
by: Garg, Sachin, et al.
Published: (2025)
Stochastic Variance-Reduced Newton: Accelerating Finite-Sum Minimization with Large Batches
by: Dereziński, Michał
Published: (2022)
Distributed Least Squares in Small Space via Sketching and Bias Reduction
by: Garg, Sachin, et al.
Published: (2024)
Last-Iterate Convergence of Randomized Kaczmarz and SGD with Greedy Step Size
by: Dereziński, Michał, et al.
Published: (2026)
Debiasing Random Oblique Projections for Subsampled OLS and Fast CUR in High Dimensions
by: Niu, Chengmei, et al.
Published: (2026)
Turbocharging Gaussian Process Inference with Approximate Sketch-and-Project
by: Rathore, Pratik, et al.
Published: (2025)
Accelerating Power Method with Fast Sketching for Stronger Low-Rank Approximation
by: Chenakkod, Shabarish, et al.
Published: (2026)
SGD with Adaptive Preconditioning: Unified Analysis and Momentum Acceleration
by: Kovalev, Dmitry
Published: (2025)
From One-Pass SGD to Data Reuse: Mini-Batch Scaling Laws in Sketched Linear Regression
by: Chen, Ziyan, et al.
Published: (2026)
Understanding Outer Optimizers in Local SGD: Learning Rates, Momentum, and Acceleration
by: Khaled, Ahmed, et al.
Published: (2025)
Recent and Upcoming Developments in Randomized Numerical Linear Algebra for Machine Learning
by: Dereziński, Michał, et al.
Published: (2024)
Ordered Momentum for Asynchronous SGD
by: Shi, Chang-Wei, et al.
Published: (2024)
Approaching Optimality for Solving Dense Linear Systems with Low-Rank Structure
by: Dereziński, Michał, et al.
Published: (2025)
Solving Dense Linear Systems Faster Than via Preconditioning
by: Dereziński, Michał, et al.
Published: (2023)
Dimension-adapted Momentum Outscales SGD
by: Ferbach, Damien, et al.
Published: (2025)
Stochastic Newton Proximal Extragradient Method
by: Jiang, Ruichen, et al.
Published: (2024)
HERTA: A High-Efficiency and Rigorous Training Algorithm for Unfolded Graph Neural Networks
by: Yang, Yongyi, et al.
Published: (2024)
On the Provable Suboptimality of Momentum SGD in Nonstationary Stochastic Optimization
by: Sahu, Sharan, et al.
Published: (2026)
The Marginal Value of Momentum for Small Learning Rate SGD
by: Wang, Runzhe, et al.
Published: (2023)
Signal Processing Meets SGD: From Momentum to Filter
by: Yao, Zhipeng, et al.
Published: (2023)
Optimal Oblivious Subspace Embeddings with Near-optimal Sparsity
by: Chenakkod, Shabarish, et al.
Published: (2024)
Optimal Subspace Embeddings: Resolving Nelson-Nguyen Conjecture Up to Sub-Polylogarithmic Factors
by: Chenakkod, Shabarish, et al.
Published: (2025)
Debiasing Mini-Batch Quadratics for Applications in Deep Learning
by: Tatzel, Lukas, et al.
Published: (2024)
Towards Universal Convergence of Backward Error in Linear System Solvers
by: Dereziński, Michał, et al.
Published: (2026)
Faster Linear Systems and Matrix Norm Approximation via Multi-level Sketched Preconditioning
by: Dereziński, Michał, et al.
Published: (2024)
High-dimensional limit theorems for SGD: Momentum and Adaptive Step-sizes
by: Jagannath, Aukosh, et al.
Published: (2025)
Mini-Batch Kernel $k$-means
by: Jourdan, Ben, et al.
Published: (2024)
Mini-Batch Class Composition Bias in Link Prediction
by: Maguire, Kieran, et al.
Published: (2026)
Enhancing SignSGD: Small-Batch Convergence Analysis and a Hybrid Switching Strategy
by: Chen, Haoran, et al.
Published: (2026)
SGD for Variational Inference: Tackling Unbounded Variance via Preconditioning and Dynamic Batching
by: Labarrière, Hippolyte, et al.
Published: (2026)
Stochastic Normalized Gradient Descent with Momentum for Large-Batch Training
by: Zhao, Shen-Yi, et al.
Published: (2020)
Optimal Embedding Dimension for Sparse Subspace Embeddings
by: Chenakkod, Shabarish, et al.
Published: (2023)
Can Microcanonical Langevin Dynamics Leverage Mini-Batch Gradient Noise?
by: Sommer, Emanuel, et al.
Published: (2026)
Bringing Order to Asynchronous SGD: Towards Optimality under Data-Dependent Delays with Momentum
by: Dahan, Tehila, et al.
Published: (2026)
Hierarchical Rectified Flow Matching with Mini-Batch Couplings
by: Zhang, Yichi, et al.
Published: (2025)
Leveraging Coordinate Momentum in SignSGD and Muon: Memory-Optimized Zero-Order
by: Petrov, Egor, et al.
Published: (2025)
Well-Conditioned Oblivious Perturbations in Linear Space
by: Chenakkod, Shabarish, et al.
Published: (2026)
Pseudo-Asynchronous Local SGD: Robust and Efficient Data-Parallel Training
by: Naganuma, Hiroki, et al.
Published: (2025)
Optimal Growth Schedules for Batch Size and Learning Rate in SGD that Reduce SFO Complexity
by: Umeda, Hikaru, et al.
Published: (2025)