:: Library Catalog

Bibliographic Details
Main Authors:	Gruntkowska, Kaja; Gaponov, Alexander; Tovmasyan, Zhirayr; Richtárik, Peter
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Optimization and Control
Online Access:	https://arxiv.org/abs/2510.00643

Similar Items

Drop-Muon: Update Less, Converge Faster
by: Gruntkowska, Kaja, et al.
Published: (2025)

Non-Euclidean Broximal Point Method: A Blueprint for Geometry-Aware Optimization
by: Gruntkowska, Kaja, et al.
Published: (2025)

Gluon: Making Muon & Scion Great Again! (Bridging Theory and Practice of LMO-based Optimizers for LLMs)
by: Riabinin, Artem, et al.
Published: (2025)

Improving the Worst-Case Bidirectional Communication Complexity for Nonconvex Distributed Optimization under Function Similarity
by: Gruntkowska, Kaja, et al.
Published: (2024)

Freya PAGE: First Optimal Time Complexity for Large-Scale Nonconvex Finite-Sum Optimization with Heterogeneous Asynchronous Computations
by: Tyurin, Alexander, et al.
Published: (2024)

Rennala MVR: Improved Time Complexity for Parallel Stochastic Optimization via Momentum-Based Variance Reduction
by: Tovmasyan, Zhirayr, et al.
Published: (2026)

Local LMO: Constrained Gradient Optimization via a Local Linear Minimization Oracle
by: Richtárik, Peter, et al.
Published: (2026)

Tighter Performance Theory of FedExProx
by: Anyszka, Wojciech, et al.
Published: (2024)

The Ball-Proximal (="Broximal") Point Method: a New Algorithm, Convergence Theory, and Applications
by: Gruntkowska, Kaja, et al.
Published: (2025)

Revisiting Stochastic Proximal Point Methods: Generalized Smoothness and Similarity
by: Tovmasyan, Zhirayr, et al.
Published: (2025)

Communication Compression for Byzantine Robust Learning: New Efficient Algorithms and Improved Rates
by: Rammal, Ahmad, et al.
Published: (2023)

Stabilized Proximal Point Method via Trust Region Control
by: Li, Hanmin, et al.
Published: (2026)

Broximal Alignment for Global Non-Convex Optimization
by: Gruntkowska, Kaja, et al.
Published: (2026)

Byzantine-Robust and Differentially Private Federated Optimization under Weaker Assumptions
by: Islamov, Rustem, et al.
Published: (2026)

Muon is Provably Faster with Momentum Variance Reduction
by: Qian, Xun, et al.
Published: (2025)

Beyond the Ideal: Analyzing the Inexact Muon Update
by: Shulgin, Egor, et al.
Published: (2025)

Double Momentum and Error Feedback for Clipping with Fast Rates and Differential Privacy
by: Islamov, Rustem, et al.
Published: (2025)

EF21 with Bells & Whistles: Six Algorithmic Extensions of Modern Error Feedback
by: Fatkhullin, Ilyas, et al.
Published: (2021)

Improved Convergence in Parameter-Agnostic Error Feedback through Momentum
by: Sadiev, Abdurakhmon, et al.
Published: (2025)

A Computation and Communication Efficient Method for Distributed Nonconvex Problems in the Partial Participation Setting
by: Tyurin, Alexander, et al.
Published: (2022)

Convergence Analysis of the PAGE Stochastic Algorithm for Weakly Convex Finite-Sum Optimization
by: Condat, Laurent, et al.
Published: (2025)

MARINA-P: Superior Performance in Non-smooth Federated Optimization with Adaptive Stepsizes
by: Sokolov, Igor, et al.
Published: (2024)

Shadowheart SGD: Distributed Asynchronous SGD with Optimal Time Complexity Under Arbitrary Computation and Communication Heterogeneity
by: Tyurin, Alexander, et al.
Published: (2024)

Second-order Optimization under Heavy-Tailed Noise: Hessian Clipping and Sample Complexity Limits
by: Sadiev, Abdurakhmon, et al.
Published: (2025)

A Unified Theory of Stochastic Proximal Point Methods without Smoothness
by: Richtárik, Peter, et al.
Published: (2024)

BiCoLoR: Communication-Efficient Optimization with Bidirectional Compression and Local Training
by: Condat, Laurent, et al.
Published: (2026)

On the Convergence of DP-SGD with Adaptive Clipping
by: Shulgin, Egor, et al.
Published: (2024)

First Provable Guarantees for Practical Private FL: Beyond Restrictive Assumptions
by: Shulgin, Egor, et al.
Published: (2025)

Better LMO-based Momentum Methods with Second-Order Information
by: Khirirat, Sarit, et al.
Published: (2025)

Sparse-ProxSkip: Accelerated Sparse-to-Sparse Training in Federated Learning
by: Meinhardt, Georg, et al.
Published: (2024)

TAMUNA: Doubly Accelerated Distributed Optimization with Local Training, Compression, and Partial Participation
by: Condat, Laurent, et al.
Published: (2023)

Ringmaster ASGD: The First Asynchronous SGD with Optimal Time Complexity
by: Maranjyan, Artavazd, et al.
Published: (2025)

A Novel Unified Parametric Assumption for Nonconvex Optimization
by: Riabinin, Artem, et al.
Published: (2025)

Differentially Private Random Block Coordinate Descent
by: Maranjyan, Artavazd, et al.
Published: (2024)

Phases of Muon: When Muon Eclipses SignSGD
by: Paquette, Elliot, et al.
Published: (2026)

Ringleader ASGD: The First Asynchronous SGD with Optimal Time Complexity under Data Heterogeneity
by: Maranjyan, Artavazd, et al.
Published: (2025)

Towards a Better Theoretical Understanding of Independent Subnetwork Training
by: Shulgin, Egor, et al.
Published: (2023)

Modular Distributed Nonconvex Learning with Error Feedback
by: Carnevale, Guido, et al.
Published: (2025)

MuonBP: Faster Muon via Block-Periodic Orthogonalization
by: Khaled, Ahmed, et al.
Published: (2025)

LiMuon: Light and Fast Muon Optimizer for Large Models
by: Huang, Feihu, et al.
Published: (2025)