| Main Authors: | Liu, Yifeng; Yuan, Angela; Gu, Quanquan |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2510.21800 |
Similar Items
MARS: Unleashing the Power of Variance Reduction for Training Large Models
by: Yuan, Huizhuo, et al.
Published: (2024)
Variance-Aware Feel-Good Thompson Sampling for Contextual Bandits
by: Li, Xuheng, et al.
Published: (2025)
Towards Simple and Provable Parameter-Free Adaptive Gradient Methods
by: Tao, Yuanzhe, et al.
Published: (2024)
Variance-Aware Regret Bounds for Stochastic Contextual Dueling Bandits
by: Di, Qiwei, et al.
Published: (2023)
Understanding SGD with Exponential Moving Average: A Case Study in Linear Regression
by: Li, Xuheng, et al.
Published: (2025)
Optimal Horizon-Free Reward-Free Exploration for Linear Mixture MDPs
by: Zhang, Junkai, et al.
Published: (2023)
Feel-Good Thompson Sampling for Contextual Dueling Bandits
by: Li, Xuheng, et al.
Published: (2024)
Unified Convergence Analysis for Score-Based Diffusion Models with Deterministic Samplers
by: Li, Runjia, et al.
Published: (2024)
A Nearly Optimal and Low-Switching Algorithm for Reinforcement Learning with General Function Approximation
by: Zhao, Heyang, et al.
Published: (2023)
On the Convergence of Adaptive Gradient Methods for Nonconvex Optimization
by: Zhou, Dongruo, et al.
Published: (2018)
Pessimistic Nonlinear Least-Squares Value Iteration for Offline Reinforcement Learning
by: Di, Qiwei, et al.
Published: (2023)
Dimension-Independent Convergence of Underdamped Langevin Monte Carlo in KL Divergence
by: Zhang, Shiyuan, et al.
Published: (2026)
Muon is Provably Faster with Momentum Variance Reduction
by: Qian, Xun, et al.
Published: (2025)
Stochastic Gradient Langevin Dynamics with Variance Reduction
by: Huang, Zhishen, et al.
Published: (2021)
Gradient Estimation and Variance Reduction in Stochastic and Deterministic Models
by: Keane, Ronan
Published: (2024)
Matching the Statistical Query Lower Bound for $k$-Sparse Parity Problems with Sign Stochastic Gradient Descent
by: Kou, Yiwen, et al.
Published: (2024)
TRSVR: An Adaptive Stochastic Trust-Region Method with Variance Reduction
by: Fang, Yuchen, et al.
Published: (2026)
Adaptive Variance Reduction for Stochastic Optimization under Weaker Assumptions
by: Jiang, Wei, et al.
Published: (2024)
Provably Efficient Representation Selection in Low-rank Markov Decision Processes: From Online to Offline RL
by: Zhang, Weitong, et al.
Published: (2021)
Accelerating RLHF Training with Reward Variance Increase
by: Yang, Zonglin, et al.
Published: (2025)
VAMO: Efficient Zeroth-Order Variance Reduction for SGD with Faster Convergence
by: Chen, Jiahe, et al.
Published: (2025)
Efficient Sign-Based Optimization: Accelerating Convergence via Variance Reduction
by: Jiang, Wei, et al.
Published: (2024)
Policy-based Primal-Dual Methods for Concave CMDP with Variance Reduction
by: Ying, Donghao, et al.
Published: (2022)
Infeasible Deterministic, Stochastic, and Variance-Reduction Algorithms for Optimization under Orthogonality Constraints
by: Ablin, Pierre, et al.
Published: (2023)
Projected Forward Gradient-Guided Frank-Wolfe Algorithm via Variance Reduction
by: Rostami, M., et al.
Published: (2024)
Drago: Primal-Dual Coupled Variance Reduction for Faster Distributionally Robust Optimization
by: Mehta, Ronak, et al.
Published: (2024)
Global Convergence of Natural Policy Gradient with Hessian-aided Momentum Variance Reduction
by: Feng, Jie, et al.
Published: (2024)
Global Convergence and Rich Feature Learning in $L$-Layer Infinite-Width Neural Networks under $μ$P Parametrization
by: Chen, Zixiang, et al.
Published: (2025)
Variance Reduction and Low Sample Complexity in Stochastic Optimization via Proximal Point Method
by: Liang, Jiaming
Published: (2024)
Projection-Free Variance Reduction Methods for Stochastic Constrained Multi-Level Compositional Optimization
by: Jiang, Wei, et al.
Published: (2024)
PPI-SVRG: Unifying Prediction-Powered Inference and Variance Reduction for Semi-Supervised Optimization
by: Ao, Ruicheng, et al.
Published: (2026)
When Deep Learning Meets Polyhedral Theory: A Survey
by: Huchette, Joey, et al.
Published: (2023)
Reinforcement Learning from Human Feedback with Active Queries
by: Ji, Kaixuan, et al.
Published: (2024)
Variance Reduction Methods Do Not Need to Compute Full Gradients: Improved Efficiency through Shuffling
by: Medyakov, Daniil, et al.
Published: (2025)
Double Variance Reduction: A Smoothing Trick for Composite Optimization Problems without First-Order Gradient
by: Di, Hao, et al.
Published: (2024)
Adapprox: Adaptive Approximation in Adam Optimization via Randomized Low-Rank Matrices
by: Zhao, Pengxiang, et al.
Published: (2024)
Beyond Bounded Variance: Variance-Reduced Normalized Methods for Nonconvex Optimization under Blum-Gladyshev Noise
by: Upadhyay, Antesh, et al.
Published: (2026)
A Variance-Reduced Stochastic Gradient Tracking Algorithm for Decentralized Optimization with Orthogonality Constraints
by: Wang, Lei, et al.
Published: (2022)
Learning When to Restart: Nonstationary Newsvendor from Uncensored to Censored Demand
by: Chen, Xin, et al.
Published: (2025)
Towards Weaker Variance Assumptions for Stochastic Optimization
by: Alacaoglu, Ahmet, et al.
Published: (2025)