:: Library Catalog

Bibliographic Details
Main Authors:	Nguyen, Son; Liu, Bo; Chen, Lizhang; Liu, Qiang
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2502.07488

Similar Items

Memory-Efficient Optimization with Factorized Hamiltonian Descent
by: Nguyen, Son, et al.
Published: (2024)

Cautious Optimizers: Improving Training with One Line of Code
by: Liang, Kaizhao, et al.
Published: (2024)

Lion Secretly Solves Constrained Optimization: As Lyapunov Predicts
by: Chen, Lizhang, et al.
Published: (2023)

Muon Optimizes Under Spectral Norm Constraints
by: Chen, Lizhang, et al.
Published: (2025)

Memory-Efficient LLM Training with Online Subspace Descent
by: Liang, Kaizhao, et al.
Published: (2024)

DeMo: Decoupled Momentum Optimization
by: Peng, Bowen, et al.
Published: (2024)

Training-Free Looped Transformers
by: Chen, Lizhang, et al.
Published: (2026)

Structured Preconditioners in Adaptive Optimization: A Unified Analysis
by: Xie, Shuo, et al.
Published: (2025)

Adaptive Preconditioners Trigger Loss Spikes in Adam
by: Bai, Zhiwei, et al.
Published: (2025)

Momentum Guidance: Plug-and-Play Guidance for Flow Models
by: Liao, Runlong, et al.
Published: (2026)

Communication Efficient Distributed Training with Distributed Lion
by: Liu, Bo, et al.
Published: (2024)

Taming Preconditioner Drift: Unlocking the Potential of Second-Order Optimizers for Federated Learning on Non-IID Data
by: Liu, Junkang, et al.
Published: (2026)

$ϕ$-Balancing for Mixture-of-Experts Training
by: Chen, Lizhang, et al.
Published: (2026)

AdaFlow: Imitation Learning with Variance-Adaptive Flow-Based Policies
by: Hu, Xixi, et al.
Published: (2024)

Cautious Weight Decay
by: Chen, Lizhang, et al.
Published: (2025)

Muon$^2$: Boosting Muon via Adaptive Second-Moment Preconditioning
by: Liu, Ziyue, et al.
Published: (2026)

SAMix: Calibrated and Accurate Continual Learning via Sphere-Adaptive Mixup and Neural Collapse
by: Dang, Trung-Anh, et al.
Published: (2025)

Curvature-Informed SGD via General Purpose Lie-Group Preconditioners
by: Pooladzandi, Omead, et al.
Published: (2024)

CUPID in the Model Zoo: Online Matchmaking for Selecting Your Dream LLM
by: Nguyen, Son, et al.
Published: (2026)

Diagonally-Weighted Generalized Method of Moments Estimation for Gaussian Mixture Modeling
by: Zhang, Liu, et al.
Published: (2025)

Graph Neural Preconditioners for Iterative Solutions of Sparse Linear Systems
by: Chen, Jie
Published: (2024)

Learning Sparse Approximate Inverse Preconditioners for Conjugate Gradient Solvers on GPUs
by: Yang, Zherui, et al.
Published: (2025)

A New Perspective on Shampoo's Preconditioner
by: Morwani, Depen, et al.
Published: (2024)

Gaussian Processes Sampling with Sparse Grids under Additive Schwarz Preconditioner
by: Chen, Haoyuan, et al.
Published: (2024)

Improving Probabilistic Diffusion Models With Optimal Diagonal Covariance Matching
by: Ou, Zijing, et al.
Published: (2024)

Adam Improves Muon: Adaptive Moment Estimation with Orthogonalized Momentum
by: Zhang, Minxin, et al.
Published: (2026)

Regime-Adaptive Bayesian Optimization via Dirichlet Process Mixtures of Gaussian Processes
by: Zhang, Yan, et al.
Published: (2026)

A Self-Attentive Meta-Optimizer with Group-Adaptive Learning Rates and Weight Decay
by: Zhao, JiangBo, et al.
Published: (2026)

An Experimental Study of Semantic Continuity for Deep Learning Models
by: Wu, Shangxi, et al.
Published: (2020)

Adaptive Estimation and Inference in Conditional Moment Models via the Discrepancy Principle
by: Tan, Jiyuan, et al.
Published: (2026)

Spectral Embeddings Leak Graph Topology: Theory, Benchmark, and Adaptive Reconstruction
by: Nguyen-Cong, Thinh, et al.
Published: (2026)

Diagonal Adaptive Non-local Observables on Quantum Neural Networks
by: Tseng, Huan-Hsin, et al.
Published: (2026)

Optimization Insights into Deep Diagonal Linear Networks
by: Labarrière, Hippolyte, et al.
Published: (2024)

Generative modeling of Sparse Approximate Inverse Preconditioners
by: Li, Mou, et al.
Published: (2024)

Diagonal Over-parameterization in Reproducing Kernel Hilbert Spaces as an Adaptive Feature Model: Generalization and Adaptivity
by: Li, Yicheng, et al.
Published: (2025)

Preconditioners for the Stochastic Training of Neural Fields
by: Chng, Shin-Fang, et al.
Published: (2024)

Improving Rectified Flow with Boundary Conditions
by: Hu, Xixi, et al.
Published: (2025)

Improving Deep Knowledge Tracing via Gated Architectures and Adaptive Optimization
by: Shukurlu, Altun
Published: (2025)

Adaptive Moment Estimation Optimization Algorithm Using Projection Gradient for Deep Learning
by: Li, Yongqi, et al.
Published: (2025)

GADPN: Graph Adaptive Denoising and Perturbation Networks via Singular Value Decomposition
by: Deng, Hao, et al.
Published: (2026)