:: Library Catalog

Bibliographic Details
Main Authors:	Tian, Haoxing, Chen, Zaiwei, Paschalidis, Ioannis Ch., Olshevsky, Alex
Format:	Preprint
Published:	2026
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2605.02103

Similar Items

One-Shot Averaging for Distributed TD($λ$) Under Markov Sampling
by: Tian, Haoxing, et al.
Published: (2024)

Closing the gap between SVRG and TD-SVRG with Gradient Splitting
by: Mustafin, Arsenii, et al.
Published: (2022)

On Value Iteration Convergence in Connected MDPs
by: Mustafin, Arsenii, et al.
Published: (2024)

Analysis of Value Iteration Through Absolute Probability Sequences
by: Mustafin, Arsenii, et al.
Published: (2025)

Geometric Re-Analysis of Classical MDP Solving Algorithms
by: Mustafin, Arsenii, et al.
Published: (2025)

MDP Geometry, Normalization and Reward Balancing Solvers
by: Mustafin, Arsenii, et al.
Published: (2024)

Distributionally Robust Learning in Survival Analysis
by: Jin, Yeping, et al.
Published: (2025)

Adversarial Imitation Learning from Visual Observations using Latent Information
by: Giammarino, Vittorio, et al.
Published: (2023)

Visually Robust Adversarial Imitation Learning from Videos with Contrastive Learning
by: Giammarino, Vittorio, et al.
Published: (2024)

Provably Efficient Off-Policy Adversarial Imitation Learning with Convergence Guarantees
by: Chen, Yilei, et al.
Published: (2024)

Multiple-policy Evaluation via Density Estimation
by: Chen, Yilei, et al.
Published: (2024)

Improving Adaptive Online Learning Using Refined Discretization
by: Zhang, Zhiyu, et al.
Published: (2023)

From Set Convergence to Pointwise Convergence: Finite-Time Guarantees for Average-Reward Q-Learning with Adaptive Stepsizes
by: Chen, Zaiwei, et al.
Published: (2025)

Distributionally Robust Token Optimization in RLHF
by: Jin, Yeping, et al.
Published: (2026)

DRO-Augment Framework: Robustness by Synergizing Wasserstein Distributionally Robust Optimization and Data Augmentation
by: Hu, Jiaming, et al.
Published: (2025)

A Model-Based Approach for Improving Reinforcement Learning Efficiency Leveraging Expert Observations
by: Ozcan, Erhan Can, et al.
Published: (2024)

Network Epidemic Control via Model Predictive Control: Extended Version
by: Talaei, Mahtab, et al.
Published: (2026)

Generalized Policy Improvement Algorithms with Theoretically Supported Sample Reuse
by: Queeney, James, et al.
Published: (2022)

Achieving $\varepsilon^{-2}$ Dependence for Average-Reward Q-Learning with a New Contraction Principle
by: Chen, Zijun, et al.
Published: (2026)

Optimal Transport Perturbations for Safe Reinforcement Learning with Robustness Guarantees
by: Queeney, James, et al.
Published: (2023)

Convex SGD: Generalization Without Early Stopping
by: Hendrickx, Julien, et al.
Published: (2024)

Smooth Ranking SVM via Cutting-Plane Method
by: Ozcan, Erhan Can, et al.
Published: (2024)

Analyzing and Bridging the Gap between Maximizing Total Reward and Discounted Reward in Deep Reinforcement Learning
by: Yin, Shuyu, et al.
Published: (2024)

Reducing Blackwell and Average Optimality to Discounted MDPs via the Blackwell Discount Factor
by: Grand-Clément, Julien, et al.
Published: (2023)

A Minimal-Assumption Analysis of Q-Learning with Time-Varying Policies
by: Nanda, Phalguni, et al.
Published: (2025)

Towards General Preference Alignment: Diffusion Models at Nash Equilibrium
by: Hu, Jiaming, et al.
Published: (2026)

Sample Complexity of the Linear Quadratic Regulator: A Reinforcement Learning Lens
by: Moghaddam, Amirreza Neshaei, et al.
Published: (2024)

Network-Based Epidemic Control Through Optimal Travel and Quarantine Management
by: Talaei, Mahtab, et al.
Published: (2024)

Learning to Reason Efficiently with Discounted Reinforcement Learning
by: Ayoub, Alex, et al.
Published: (2025)

Achieving $ε^{-2}$ Sample Complexity for Single-Loop Actor-Critic under Minimal Assumptions
by: Hamza, Ishaq, et al.
Published: (2026)

Natural Policy Gradient as Doubly Smoothed Policy Iteration: A Bellman-Operator Framework
by: Nanda, Phalguni, et al.
Published: (2026)

ASPEST: Bridging the Gap Between Active Learning and Selective Prediction
by: Chen, Jiefeng, et al.
Published: (2023)

Bridging the Gap Between Bayesian Deep Learning and Ensemble Weather Forecasts
by: Xiong, Xinlei, et al.
Published: (2025)

Closing the Gap between TD Learning and Supervised Learning with $Q$-Conditioned Maximization
by: Lei, Xing, et al.
Published: (2025)

Non-Asymptotic Convergence of Stochastic Iterative Algorithms: A Lyapunov Framework
by: Chen, Zaiwei, et al.
Published: (2026)

Sample Complexity of Linear Quadratic Regulator Without Initial Stability
by: Moghaddam, Amirreza Neshaei, et al.
Published: (2025)

Personalized Multi-Agent Average Reward TD-Learning via Joint Linear Approximation
by: Wang, Leo Muxing, et al.
Published: (2026)

Data Deletion Can Help in Adaptive RL
by: Budhraja, Param, et al.
Published: (2026)

Revisiting Value Iteration: Unified Analysis of Discounted and Average-Reward Cases
by: Mustafin, Arsenii, et al.
Published: (2025)

Closing the Gap between TD Learning and Supervised Learning -- A Generalisation Point of View
by: Ghugare, Raj, et al.
Published: (2024)