| Main Author: | Nguyen, Quan |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2510.03478 |
Similar Items
Understanding Adam Optimizer via Online Learning of Updates: Adam is FTRL in Disguise
by: Ahn, Kwangjun, et al.
Published: (2024)
Adam Converges Without Any Modification On Update Rules
by: Zhang, Yushun, et al.
Published: (2026)
Provable Adaptivity of Adam under Non-uniform Smoothness
by: Wang, Bohan, et al.
Published: (2022)
HomeAdam: Adam and AdamW Algorithms Sometimes Go Home to Obtain Better Provable Generalization
by: Huang, Feihu, et al.
Published: (2026)
Adam-HNAG: A Convergent Reformulation of Adam with Accelerated Rate
by: Yu, Yaxin, et al.
Published: (2026)
Adam on Local Time: Addressing Nonstationarity in RL with Relative Adam Timesteps
by: Ellis, Benjamin, et al.
Published: (2024)
A Theoretical and Empirical Study on the Convergence of Adam with an "Exact" Constant Step Size in Non-Convex Settings
by: Mazumder, Alokendu, et al.
Published: (2023)
Projection-free Online Learning over Strongly Convex Sets
by: Wan, Yuanyu, et al.
Published: (2020)
Majorization-minimization for Sparse Nonnegative Matrix Factorization with the $β$-divergence
by: Marmin, Arthur, et al.
Published: (2022)
Adam-SHANG: A Convergent Adam-Type Method for Stochastic Smooth Convex Optimization
by: Yu, Yaxin, et al.
Published: (2026)
The Sample Complexity of Online Reinforcement Learning: A Multi-model Perspective
by: Muehlebach, Michael, et al.
Published: (2025)
Level Set Teleportation: An Optimization Perspective
by: Mishkin, Aaron, et al.
Published: (2024)
Online Inventory Problems: Beyond the i.i.d. Setting with Online Convex Optimization
by: Hihat, Massil, et al.
Published: (2023)
Efficient First-Order Optimization on the Pareto Set for Multi-Objective Learning under Preference Guidance
by: Chen, Lisha, et al.
Published: (2025)
The Rich and the Simple: On the Implicit Bias of Adam and SGD
by: Vasudeva, Bhavya, et al.
Published: (2025)
Convergence rates for the Adam optimizer
by: Dereich, Steffen, et al.
Published: (2024)
Adam-family Methods for Nonsmooth Optimization with Convergence Guarantees
by: Xiao, Nachuan, et al.
Published: (2023)
On Convergence of Adam for Stochastic Optimization under Relaxed Assumptions
by: Hong, Yusu, et al.
Published: (2024)
Efficient Online Large-Margin Classification via Dual Certificates
by: Ho-Nguyen, Nam, et al.
Published: (2025)
Adam with model exponential moving average is effective for nonconvex optimization
by: Ahn, Kwangjun, et al.
Published: (2024)
Convergence of Steepest Descent and Adam under Non-Uniform Smoothness
by: Vaswani, Sharan, et al.
Published: (2026)
A Semantic-Loss Function Modeling Framework With Task-Oriented Machine Learning Perspectives
by: Nguyen, Ti Ti, et al.
Published: (2025)
From Distributional Robustness to Robust Statistics: A Confidence Sets Perspective
by: Chan, Gabriel, et al.
Published: (2024)
On the Convergence of Adam-Type Algorithm for Bilevel Optimization under Unbounded Smoothness
by: Gong, Xiaochuan, et al.
Published: (2025)
Implicit Bias of AdamW: $\ell_\infty$ Norm Constrained Optimization
by: Xie, Shuo, et al.
Published: (2024)
A Comprehensive Framework for Analyzing the Convergence of Adam: Bridging the Gap with SGD
by: Jin, Ruinan, et al.
Published: (2024)
Muon Outperforms Adam in Tail-End Associative Memory Learning
by: Wang, Shuche, et al.
Published: (2025)
On the $O(\frac{\sqrt{d}}{K^{1/4}})$ Convergence Rate of AdamW Measured by $\ell_1$ Norm
by: Li, Huan, et al.
Published: (2025)
Is your batch size the problem? Revisiting the Adam-SGD gap in language modeling
by: Srećković, Teodora, et al.
Published: (2025)
On the Convergence of Adam under Non-uniform Smoothness: Separability from SGDM and Beyond
by: Wang, Bohan, et al.
Published: (2024)
ODE approximation for the Adam algorithm: General and overparametrized setting
by: Dereich, Steffen, et al.
Published: (2025)
Learning of Linear Dynamical Systems as a Non-Commutative Polynomial Optimization Problem
by: Zhou, Quan, et al.
Published: (2020)
Clipping Improves Adam-Norm and AdaGrad-Norm when the Noise Is Heavy-Tailed
by: Chezhegov, Savelii, et al.
Published: (2024)
Dynamic Regret via Discounted-to-Dynamic Reduction with Applications to Curved Losses and Adam Optimizer
by: Xie, Yan-Feng, et al.
Published: (2026)
Towards Quantifying the Preconditioning Effect of Adam
by: Das, Rudrajit, et al.
Published: (2024)
Fully Unconstrained Online Learning
by: Cutkosky, Ashok, et al.
Published: (2024)
On the Implicit Bias of Adam
by: Cattaneo, Matias D., et al.
Published: (2023)
Upper-Linearizability of Online Non-Monotone DR-Submodular Maximization over Down-Closed Convex Sets
by: Lu, Yiyang, et al.
Published: (2026)
Online Optimization Perspective on First-Order and Zero-Order Decentralized Nonsmooth Nonconvex Stochastic Optimization
by: Sahinoglu, Emre, et al.
Published: (2024)
Safe Online Control-Informed Learning
by: Zhou, Tianyu, et al.
Published: (2025)