| Main Authors: | Wilson, Paul, Zanasi, Fabio, Constantinides, George |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2512.04051 |
Similar Items
Drop-Muon: Update Less, Converge Faster
by: Gruntkowska, Kaja, et al.
Published: (2025)
Convergence of Distributed Adaptive Optimization with Local Updates
by: Cheng, Ziheng, et al.
Published: (2024)
Adam Converges Without Any Modification On Update Rules
by: Zhang, Yushun, et al.
Published: (2026)
Improving Convergence and Generalization Using Parameter Symmetries
by: Zhao, Bo, et al.
Published: (2023)
QLABGrad: a Hyperparameter-Free and Convergence-Guaranteed Scheme for Deep Learning
by: Fu, Minghan, et al.
Published: (2023)
Coupling-based Convergence Diagnostic and Stepsize Scheme for Stochastic Gradient Descent
by: Li, Xiang, et al.
Published: (2024)
FOCUS: First Order Concentrated Updating Scheme
by: Liu, Yizhou, et al.
Published: (2025)
Less is More: Convergence Benefits of Fewer Data Weight Updates over Longer Horizon
by: Das, Rudrajit, et al.
Published: (2026)
Improved Convergence in Parameter-Agnostic Error Feedback through Momentum
by: Sadiev, Abdurakhmon, et al.
Published: (2025)
Policy Gradient Methods for Discrete Time Linear Quadratic Regulator With Random Parameters
by: Li, Deyue
Published: (2023)
Sketch-and-Project Meets Newton Method: Global $\mathcal O(k^{-2})$ Convergence with Low-Rank Updates
by: Hanzely, Slavomír
Published: (2023)
Beyond Discretization: Learning the Optimal Solution Path
by: Dong, Qiran, et al.
Published: (2024)
DOGE-Train: Discrete Optimization on GPU with End-to-end Training
by: Abbas, Ahmed, et al.
Published: (2022)
Learning Over-Relaxation Policies for ADMM with Convergence Guarantees
by: Lin, Junan, et al.
Published: (2026)
Global Convergence of Multiplicative Updates for the Matrix Mechanism: A Collaborative Proof with Gemini 3
by: Rush, Keith
Published: (2026)
Temporal Difference Learning with Compressed Updates: Error-Feedback meets Reinforcement Learning
by: Mitra, Aritra, et al.
Published: (2023)
Stochastic Gradient Methods with Preconditioned Updates
by: Sadiev, Abdurakhmon, et al.
Published: (2022)
Over-parameterised Shallow Neural Networks with Asymmetrical Node Scaling: Global Convergence Guarantees and Feature Learning
by: Caron, Francois, et al.
Published: (2023)
Beyond the Ideal: Analyzing the Inexact Muon Update
by: Shulgin, Egor, et al.
Published: (2025)
Fundamental Benefit of Alternating Updates in Minimax Optimization
by: Lee, Jaewook, et al.
Published: (2024)
The Role of Target Update Frequencies in Q-Learning
by: Weissmann, Simon, et al.
Published: (2026)
GANs as Gradient Flows that Converge
by: Huang, Yu-Jui, et al.
Published: (2022)
Convergence Rate Analysis of LION
by: Dong, Yiming, et al.
Published: (2024)
Convergence of Muon with Newton-Schulz
by: Kim, Gyu Yeol, et al.
Published: (2026)
Discrete and Continuous Difference of Submodular Minimization
by: Orfanides, George, et al.
Published: (2025)
Reusing Historical Trajectories in Natural Policy Gradient via Importance Sampling: Convergence and Convergence Rate
by: Lin, Yifan, et al.
Published: (2024)
Optimization of Discrete Parameters Using the Adaptive Gradient Method and Directed Evolution
by: Beinarovich, Andrei, et al.
Published: (2024)
Optimizing Asynchronous Federated Learning: A Delicate Trade-Off Between Model-Parameter Staleness and Update Frequency
by: Alahyane, Abdelkrim, et al.
Published: (2025)
Improved Analysis for Sign-based Methods with Momentum Updates
by: Jiang, Wei, et al.
Published: (2025)
Block Sparse Bayesian Learning: A Diversified Scheme
by: Zhang, Yanhao, et al.
Published: (2024)
On PI Controllers for Updating Lagrange Multipliers in Constrained Optimization
by: Sohrabi, Motahareh, et al.
Published: (2024)
Automatic Differentiation of Optimization Algorithms with Time-Varying Updates
by: Mehmood, Sheheryar, et al.
Published: (2024)
Dynamics of Stochastic Momentum with Sparse Updates in High Dimensions
by: Everett, Katie, et al.
Published: (2026)
Adaptive Delayed-Update Cyclic Algorithm for Variational Inequalities
by: Wei, Yi, et al.
Published: (2026)
The Convergence of Dynamic Routing between Capsules
by: Ye, Daoyuan, et al.
Published: (2025)
Provably Convergent Federated Trilevel Learning
by: Jiao, Yang, et al.
Published: (2023)
AdaGrad Meets Muon: Adaptive Stepsizes for Orthogonal Updates
by: Zhang, Minxin, et al.
Published: (2025)
Learning to Shuffle: Block Reshuffling and Reversal Schemes for Stochastic Optimization
by: Nguyen, Lam M., et al.
Published: (2026)
The Effectiveness of Local Updates for Decentralized Learning under Data Heterogeneity
by: Wu, Tongle, et al.
Published: (2024)
Enhancing GNNs Performance on Combinatorial Optimization by Recurrent Feature Update
by: Pugacheva, Daria, et al.
Published: (2024)