:: Library Catalog

Bibliographic Details
Main Authors:	Zuo, Qian; Wang, Zhiyong; He, Fengxiang
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2602.10917

Similar Items

Ensuring Safety in an Uncertain Environment: Constrained MDPs via Stochastic Thresholds
by: Zuo, Qian, et al.
Published: (2025)

Beyond Slater's Condition in Online CMDPs with Stochastic and Adversarial Constraints
by: Stradi, Francesco Emanuele, et al.
Published: (2025)

Model-Free, Regret-Optimal Best Policy Identification in Online CMDPs
by: Zhou, Zihan, et al.
Published: (2023)

Learning Weakly Communicating Average-Reward CMDPs: Strong Duality and Improved Regret
by: Yu, Kihyun, et al.
Published: (2026)

Near-Optimal Last-Iterate Convergence for Zero-Sum Games with Bandit Feedback and Opponent Actions
by: Hait, Soumita, et al.
Published: (2026)

On the Last-Iterate Convergence of Shuffling Gradient Methods
by: Liu, Zijian, et al.
Published: (2024)

Near-Optimal Primal-Dual Algorithm for Learning Linear Mixture CMDPs with Adversarial Rewards
by: Yu, Kihyun, et al.
Published: (2026)

Augmented Lagrangian Method for Last-Iterate Convergence for Constrained MDPs
by: Lu, Michael, et al.
Published: (2026)

Convergence Rate of the Last Iterate of Stochastic Proximal Algorithms
by: Vaidyan, Kevin Kurian Thomas, et al.
Published: (2026)

Revisiting the Last-Iterate Convergence of Stochastic Gradient Methods
by: Liu, Zijian, et al.
Published: (2023)

Efficient Last-Iterate Convergence in Regret Minimization via Adaptive Reward Transformation
by: Ren, Hang, et al.
Published: (2025)

Last-Iterate Global Convergence of Policy Gradients for Constrained Reinforcement Learning
by: Montenegro, Alessandro, et al.
Published: (2024)

Last-Iterate Convergence of General Parameterized Policies in Constrained MDPs
by: Mondal, Washim Uddin, et al.
Published: (2024)

Fast Last-Iterate Convergence of SGD in the Smooth Interpolation Regime
by: Attia, Amit, et al.
Published: (2025)

Last Iterate Convergence of Incremental Methods and Applications in Continual Learning
by: Cai, Xufeng, et al.
Published: (2024)

XAI for In-hospital Mortality Prediction via Multimodal ICU Data
by: Li, Xingqiao, et al.
Published: (2023)

Rationality Measurement and Theory for Reinforcement Learning Agents
by: Qian, Kejiang, et al.
Published: (2026)

Primal-Dual Policy Optimization for Linear CMDPs with Adversarial Losses
by: Yu, Kihyun, et al.
Published: (2026)

Best-of-Both-Worlds Policy Optimization for CMDPs with Bandit Feedback
by: Stradi, Francesco Emanuele, et al.
Published: (2024)

On Separation Between Best-Iterate, Random-Iterate, and Last-Iterate Convergence of Learning in Games
by: Cai, Yang, et al.
Published: (2025)

Last-Iterate Convergence Properties of Regret-Matching Algorithms in Games
by: Cai, Yang, et al.
Published: (2023)

Last-Iterate Convergence of No-Regret Learning for Equilibria in Bargaining Games
by: Kamp, Serafina, et al.
Published: (2025)

Improved Last-Iterate Convergence of Shuffling Gradient Methods for Nonsmooth Convex Optimization
by: Liu, Zijian, et al.
Published: (2025)

Last-Iterate Convergence of Randomized Kaczmarz and SGD with Greedy Step Size
by: Dereziński, Michał, et al.
Published: (2026)

Provable Last-Iterate Convergence for Multi-Objective Safe LLM Alignment via Optimistic Primal-Dual
by: Li, Yining, et al.
Published: (2026)

From Average-Iterate to Last-Iterate Convergence in Games: A Reduction and Its Applications
by: Cai, Yang, et al.
Published: (2025)

Learning Zero-Sum Linear Quadratic Games with Improved Sample Complexity and Last-Iterate Convergence
by: Wu, Jiduan, et al.
Published: (2023)

Integrating LTL Constraints into PPO for Safe Reinforcement Learning
by: Zhang, Maifang, et al.
Published: (2026)

The Harder Path: Last Iterate Convergence for Uncoupled Learning in Zero-Sum Games with Bandit Feedback
by: Fiegel, Côme, et al.
Published: (2026)

Fast Last-Iterate Convergence of Learning in Games Requires Forgetful Algorithms
by: Cai, Yang, et al.
Published: (2024)

Last-Iterate Convergence of Adaptive Riemannian Gradient Descent for Equilibrium Computation
by: Cai, Yang, et al.
Published: (2023)

Iteration and Stochastic First-order Oracle Complexities of Stochastic Gradient Descent using Constant and Decaying Learning Rates
by: Imaizumi, Kento, et al.
Published: (2024)

Last-Iterate Convergent Policy Gradient Primal-Dual Methods for Constrained MDPs
by: Ding, Dongsheng, et al.
Published: (2023)

Extragradient Preference Optimization (EGPO): Beyond Last-Iterate Convergence for Nash Learning from Human Feedback
by: Zhou, Runlong, et al.
Published: (2025)

Last-Iterate Convergence of Payoff-Based Independent Learning in Zero-Sum Stochastic Games
by: Chen, Zaiwei, et al.
Published: (2024)

Last-Iterate Convergence: Zero-Sum Games and Constrained Min-Max Optimization
by: Daskalakis, Constantinos, et al.
Published: (2018)

Uniform Last-Iterate Guarantee for Bandits and Reinforcement Learning
by: Liu, Junyan, et al.
Published: (2024)

Structure-Dependent Regret and Constraint Violation Bounds for Online Convex Optimization with Time-Varying Constraints
by: Liu, Xiufeng, et al.
Published: (2026)

Closing the Gap: Achieving Global Convergence (Last Iterate) of Actor-Critic under Markovian Sampling with Neural Network Parametrization
by: Gaur, Mudit, et al.
Published: (2024)

Convergence Rate for the Last Iterate of Stochastic Gradient Descent Schemes
by: Hudiani, Marcel
Published: (2025)