| Main Authors: | Zuo, Qian; Wang, Zhiyong; He, Fengxiang |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.10917 |
Similar Items
Ensuring Safety in an Uncertain Environment: Constrained MDPs via Stochastic Thresholds
by: Zuo, Qian, et al.
Published: (2025)
Beyond Slater's Condition in Online CMDPs with Stochastic and Adversarial Constraints
by: Stradi, Francesco Emanuele, et al.
Published: (2025)
Model-Free, Regret-Optimal Best Policy Identification in Online CMDPs
by: Zhou, Zihan, et al.
Published: (2023)
Learning Weakly Communicating Average-Reward CMDPs: Strong Duality and Improved Regret
by: Yu, Kihyun, et al.
Published: (2026)
Near-Optimal Last-Iterate Convergence for Zero-Sum Games with Bandit Feedback and Opponent Actions
by: Hait, Soumita, et al.
Published: (2026)
On the Last-Iterate Convergence of Shuffling Gradient Methods
by: Liu, Zijian, et al.
Published: (2024)
Near-Optimal Primal-Dual Algorithm for Learning Linear Mixture CMDPs with Adversarial Rewards
by: Yu, Kihyun, et al.
Published: (2026)
Augmented Lagrangian Method for Last-Iterate Convergence for Constrained MDPs
by: Lu, Michael, et al.
Published: (2026)
Convergence Rate of the Last Iterate of Stochastic Proximal Algorithms
by: Vaidyan, Kevin Kurian Thomas, et al.
Published: (2026)
Revisiting the Last-Iterate Convergence of Stochastic Gradient Methods
by: Liu, Zijian, et al.
Published: (2023)
Efficient Last-Iterate Convergence in Regret Minimization via Adaptive Reward Transformation
by: Ren, Hang, et al.
Published: (2025)
Last-Iterate Global Convergence of Policy Gradients for Constrained Reinforcement Learning
by: Montenegro, Alessandro, et al.
Published: (2024)
Last-Iterate Convergence of General Parameterized Policies in Constrained MDPs
by: Mondal, Washim Uddin, et al.
Published: (2024)
Fast Last-Iterate Convergence of SGD in the Smooth Interpolation Regime
by: Attia, Amit, et al.
Published: (2025)
Last Iterate Convergence of Incremental Methods and Applications in Continual Learning
by: Cai, Xufeng, et al.
Published: (2024)
XAI for In-hospital Mortality Prediction via Multimodal ICU Data
by: Li, Xingqiao, et al.
Published: (2023)
Rationality Measurement and Theory for Reinforcement Learning Agents
by: Qian, Kejiang, et al.
Published: (2026)
Primal-Dual Policy Optimization for Linear CMDPs with Adversarial Losses
by: Yu, Kihyun, et al.
Published: (2026)
Best-of-Both-Worlds Policy Optimization for CMDPs with Bandit Feedback
by: Stradi, Francesco Emanuele, et al.
Published: (2024)
On Separation Between Best-Iterate, Random-Iterate, and Last-Iterate Convergence of Learning in Games
by: Cai, Yang, et al.
Published: (2025)
Last-Iterate Convergence Properties of Regret-Matching Algorithms in Games
by: Cai, Yang, et al.
Published: (2023)
Last-Iterate Convergence of No-Regret Learning for Equilibria in Bargaining Games
by: Kamp, Serafina, et al.
Published: (2025)
Improved Last-Iterate Convergence of Shuffling Gradient Methods for Nonsmooth Convex Optimization
by: Liu, Zijian, et al.
Published: (2025)
Last-Iterate Convergence of Randomized Kaczmarz and SGD with Greedy Step Size
by: Dereziński, Michał, et al.
Published: (2026)
Provable Last-Iterate Convergence for Multi-Objective Safe LLM Alignment via Optimistic Primal-Dual
by: Li, Yining, et al.
Published: (2026)
From Average-Iterate to Last-Iterate Convergence in Games: A Reduction and Its Applications
by: Cai, Yang, et al.
Published: (2025)
Learning Zero-Sum Linear Quadratic Games with Improved Sample Complexity and Last-Iterate Convergence
by: Wu, Jiduan, et al.
Published: (2023)
Integrating LTL Constraints into PPO for Safe Reinforcement Learning
by: Zhang, Maifang, et al.
Published: (2026)
The Harder Path: Last Iterate Convergence for Uncoupled Learning in Zero-Sum Games with Bandit Feedback
by: Fiegel, Côme, et al.
Published: (2026)
Fast Last-Iterate Convergence of Learning in Games Requires Forgetful Algorithms
by: Cai, Yang, et al.
Published: (2024)
Last-Iterate Convergence of Adaptive Riemannian Gradient Descent for Equilibrium Computation
by: Cai, Yang, et al.
Published: (2023)
Iteration and Stochastic First-order Oracle Complexities of Stochastic Gradient Descent using Constant and Decaying Learning Rates
by: Imaizumi, Kento, et al.
Published: (2024)
Last-Iterate Convergent Policy Gradient Primal-Dual Methods for Constrained MDPs
by: Ding, Dongsheng, et al.
Published: (2023)
Extragradient Preference Optimization (EGPO): Beyond Last-Iterate Convergence for Nash Learning from Human Feedback
by: Zhou, Runlong, et al.
Published: (2025)
Last-Iterate Convergence of Payoff-Based Independent Learning in Zero-Sum Stochastic Games
by: Chen, Zaiwei, et al.
Published: (2024)
Last-Iterate Convergence: Zero-Sum Games and Constrained Min-Max Optimization
by: Daskalakis, Constantinos, et al.
Published: (2018)
Uniform Last-Iterate Guarantee for Bandits and Reinforcement Learning
by: Liu, Junyan, et al.
Published: (2024)
Structure-Dependent Regret and Constraint Violation Bounds for Online Convex Optimization with Time-Varying Constraints
by: Liu, Xiufeng, et al.
Published: (2026)
Closing the Gap: Achieving Global Convergence (Last Iterate) of Actor-Critic under Markovian Sampling with Neural Network Parametrization
by: Gaur, Mudit, et al.
Published: (2024)
Convergence Rate for the Last Iterate of Stochastic Gradient Descent Schemes
by: Hudiani, Marcel
Published: (2025)