:: Library Catalog

Bibliographic Details
Main Authors:	Yu, Kihyun; Baek, Beomhan; Lee, Dabeen
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Optimization and Control
Online Access:	https://arxiv.org/abs/2605.11586

Similar Items

Near-Optimal Primal-Dual Algorithm for Learning Linear Mixture CMDPs with Adversarial Rewards
by: Yu, Kihyun, et al.
Published: (2026)

Improved Regret Bound for Safe Reinforcement Learning via Tighter Cost Pessimism and Reward Optimism
by: Yu, Kihyun, et al.
Published: (2024)

Provably Efficient Infinite-Horizon Average-Reward Reinforcement Learning with Linear Function Approximation
by: Chae, Woojin, et al.
Published: (2024)

Learning Infinite-Horizon Average-Reward Linear Mixture MDPs of Bounded Span
by: Chae, Woojin, et al.
Published: (2024)

Implicit Bias of Per-sample Adam on Separable Data: Departure from the Full-batch Regime
by: Baek, Beomhan, et al.
Published: (2025)

Primal-Dual Policy Optimization for Linear CMDPs with Adversarial Losses
by: Yu, Kihyun, et al.
Published: (2026)

On Convergence of Average-Reward Q-Learning in Weakly Communicating Markov Decision Processes
by: Wan, Yi, et al.
Published: (2024)

Through the River: Understanding the Benefit of Schedule-Free Methods for Language Model Training
by: Song, Minhak, et al.
Published: (2025)

Parameter-Free Algorithms for Performative Regret Minimization under Decision-Dependent Distributions
by: Park, Sungwoo, et al.
Published: (2024)

Achieving Tractable Minimax Optimal Regret in Average Reward MDPs
by: Boone, Victor, et al.
Published: (2024)

Stochastic-Constrained Stochastic Optimization with Markovian Data
by: Kim, Yeongjong, et al.
Published: (2023)

Infinite-Horizon Reinforcement Learning with Multinomial Logistic Function Approximation
by: Park, Jaehyun, et al.
Published: (2024)

Span-Based Optimal Sample Complexity for Weakly Communicating and General Average Reward MDPs
by: Zurek, Matthew, et al.
Published: (2024)

Chebyshev Center-Based Direction Selection for Multi-Objective Optimization and Training PINNs
by: Yoon, Hoyeol, et al.
Published: (2026)

Sample Complexity of Distributionally Robust Average-Reward Reinforcement Learning
by: Chen, Zijun, et al.
Published: (2025)

Quotient-Categorical Representations for Bellman-Compatible Average-Reward Distributional Reinforcement Learning
by: Kaya, Ege C., et al.
Published: (2026)

Optimal Sample Complexity for Average Reward Markov Decision Processes
by: Wang, Shengbo, et al.
Published: (2023)

Non-Rectangular Average-Reward Robust MDPs: Optimal Policies and Their Transient Values
by: Wang, Shengbo, et al.
Published: (2026)

Tail Distribution of Regret in Optimistic Reinforcement Learning
by: Khodadadian, Sajad, et al.
Published: (2025)

Reinforcement Learning and Regret Bounds for Admission Control
by: Weber, Lucas, et al.
Published: (2024)

Bellman Optimality of Average-Reward Robust Markov Decision Processes with a Constant Gain
by: Wang, Shengbo, et al.
Published: (2025)

Learning Decentralized Linear Quadratic Regulators with $\sqrt{T}$ Regret
by: Ye, Lintao, et al.
Published: (2022)

Probabilistic Safety Guarantee for Stochastic Control Systems Using Average Reward MDPs
by: Omidi, Saber, et al.
Published: (2025)

Span-Based Optimal Sample Complexity for Average Reward MDPs
by: Zurek, Matthew, et al.
Published: (2023)

Regret Lower Bounds for Learning Linear Quadratic Gaussian Systems
by: Ziemann, Ingvar, et al.
Published: (2022)

Adaptivity and Universality: Problem-dependent Universal Regret for Online Convex Optimization
by: Zhao, Peng, et al.
Published: (2025)

Doubly Optimal No-Regret Online Learning in Strongly Monotone Games with Bandit Feedback
by: Ba, Wenjia, et al.
Published: (2021)

Regret of exploratory policy improvement and $q$-learning
by: Tang, Wenpin, et al.
Published: (2024)

Lagrangian Index Policy for Restless Bandits with Average Reward
by: Avrachenkov, Konstantin, et al.
Published: (2024)

Robust Regression over Averaged Uncertainty
by: Bertsimas, Dimitris, et al.
Published: (2023)

Regret Analysis: a control perspective
by: Gibson, Travis E., et al.
Published: (2025)

Wasserstein Distributionally Robust Regret Optimization
by: Fiechtner, Lukas-Benedikt, et al.
Published: (2025)

Rethinking PCA Through Duality
by: Quan, Jan, et al.
Published: (2025)

Planning and Learning in Average Risk-aware MDPs
by: Wang, Weikai, et al.
Published: (2025)

Kernel Mean Embedding Topology: Weak and Strong Forms for Stochastic Kernels and Implications for Model Learning
by: Saldi, Naci, et al.
Published: (2025)

The Plug-in Approach for Average-Reward and Discounted MDPs: Optimal Sample Complexity Analysis
by: Zurek, Matthew, et al.
Published: (2024)

Span-Agnostic Optimal Sample Complexity and Oracle Inequalities for Average-Reward RL
by: Zurek, Matthew, et al.
Published: (2025)

Almost Surely $\sqrt{T}$ Regret for Adaptive LQR
by: Lu, Yiwen, et al.
Published: (2023)

Sublinear Regret for a Class of Continuous-Time Linear-Quadratic Reinforcement Learning Problems
by: Huang, Yilie, et al.
Published: (2024)

Rate-Optimal Regret for the Safe Learning-based Control of the Constrained Linear Quadratic Regulator
by: Hutchinson, Spencer, et al.
Published: (2026)