:: Library Catalog

Bibliographic Details
Main Authors:	Liu, Zhishuai; Wang, Weixin; Xu, Pan
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2409.20521

Similar Items

Sample Complexity of Distributionally Robust Off-Dynamics Reinforcement Learning with Online Interaction
by: He, Yiting, et al.
Published: (2025)

Minimax Optimal and Computationally Efficient Algorithms for Distributionally Robust Offline Reinforcement Learning
by: Liu, Zhishuai, et al.
Published: (2024)

Return Augmented Decision Transformer for Off-Dynamics Reinforcement Learning
by: Wang, Ruhan, et al.
Published: (2024)

Linear Mixture Distributionally Robust Markov Decision Processes
by: Liu, Zhishuai, et al.
Published: (2025)

Robust Offline Reinforcement Learning with Linearly Structured f-Divergence Regularization
by: Tang, Cheng, et al.
Published: (2024)

Distributionally Robust Off-Dynamics Reinforcement Learning: Provable Efficiency with Linear Function Approximation
by: Liu, Zhishuai, et al.
Published: (2024)

Policy Regularized Distributionally Robust Markov Decision Processes with Linear Function Approximation
by: Gu, Jingwen, et al.
Published: (2025)

MOBODY: Model Based Off-Dynamics Offline Reinforcement Learning
by: Guo, Yihong, et al.
Published: (2025)

Off-Dynamics Reinforcement Learning via Domain Adaptation and Reward Augmented Imitation
by: Guo, Yihong, et al.
Published: (2024)

Localized Dynamics-Aware Domain Adaption for Off-Dynamics Offline Reinforcement Learning
by: Xia, Zhangjie, et al.
Published: (2026)

How to Provably Improve Return Conditioned Supervised Learning?
by: Liu, Zhishuai, et al.
Published: (2025)

Cross-Domain Energy-Guided Diffusion Generation for Off-Dynamics Reinforcement Learning
by: Yang, Yu, et al.
Published: (2026)

ODRL: A Benchmark for Off-Dynamics Reinforcement Learning
by: Lyu, Jiafei, et al.
Published: (2024)

Provable Anytime Ensemble Sampling Algorithms in Nonlinear Contextual Bandits
by: Sun, Jiazheng, et al.
Published: (2025)

Behaviour Policy Optimization: Provably Lower Variance Return Estimates for Off-Policy Reinforcement Learning
by: Goodall, Alexander W., et al.
Published: (2025)

Understanding and Improving Adversarial Robustness of Neural Probabilistic Circuits
by: Chen, Weixin, et al.
Published: (2025)

Inverse Reinforcement Learning with Dynamic Reward Scaling for LLM Alignment
by: Cheng, Ruoxi, et al.
Published: (2025)

Single-Trajectory Distributionally Robust Reinforcement Learning
by: Liang, Zhipeng, et al.
Published: (2023)

Quadratic Upper Bound for Boosting Robustness
by: You, Euijin, et al.
Published: (2026)

Rethinking Langevin Thompson Sampling from A Stochastic Approximation Perspective
by: Wang, Weixin, et al.
Published: (2025)

Bridging Distributional and Risk-sensitive Reinforcement Learning with Provable Regret Bounds
by: Liang, Hao, et al.
Published: (2022)

Towards Off-Policy Reinforcement Learning for Ranking Policies with Human Feedback
by: Xiao, Teng, et al.
Published: (2024)

Upper Entropy for 2-Monotone Lower Probabilities
by: Vu, Tuan-Anh, et al.
Published: (2026)

On Distributional Reinforcement Learning in Chaotic Dynamical Systems
by: Rudd-Jones, James, et al.
Published: (2026)

Reinforcement Learning Interventions on Boundedly Rational Human Agents in Frictionful Tasks
by: Nofshin, Eura, et al.
Published: (2024)

Regret Bounds and Reinforcement Learning Exploration of EXP-based Algorithms
by: Xu, Mengfan, et al.
Published: (2020)

Breaking the Curse of Repulsion: Optimistic Distributionally Robust Policy Optimization for Off-Policy Generative Recommendation
by: Jiang, Jie, et al.
Published: (2026)

TADPO: Reinforcement Learning Goes Off-road
by: Wu, Zhouchonghao, et al.
Published: (2026)

Dynamic Adversarial Reinforcement Learning for Robust Multimodal Large Language Models
by: Bao, Yicheng, et al.
Published: (2026)

On-Policy RL Meets Off-Policy Experts: Harmonizing Supervised Fine-Tuning and Reinforcement Learning via Dynamic Weighting
by: Zhang, Wenhao, et al.
Published: (2025)

Low-Rank Adaptation for Critic Learning in Off-Policy Reinforcement Learning
by: Zhuang, Yuan, et al.
Published: (2026)

OffSeeker: Online Reinforcement Learning Is Not All You Need for Deep Research Agents
by: Zhou, Yuhang, et al.
Published: (2026)

Distributionally Robust Reinforcement Learning with Human Feedback
by: Mandal, Debmalya, et al.
Published: (2025)

Scaling Off-Policy Reinforcement Learning with Batch and Weight Normalization
by: Palenicek, Daniel, et al.
Published: (2025)

Distributionally Robust Model-based Reinforcement Learning with Large State Spaces
by: Ramesh, Shyam Sundhar, et al.
Published: (2023)

Open Knowledge Base Canonicalization with Multi-task Learning
by: Liu, Bingchen, et al.
Published: (2024)

Kernel-Based Distributed Q-Learning: A Scalable Reinforcement Learning Approach for Dynamic Treatment Regimes
by: Wang, Di, et al.
Published: (2023)

Adapting Critic Match Loss Landscape Visualization to Off-policy Reinforcement Learning
by: Liu, Jingyi, et al.
Published: (2026)

Towards Robust Deep Reinforcement Learning against Environmental State Perturbation
by: Wang, Chenxu, et al.
Published: (2025)

ADG: Ambient Diffusion-Guided Dataset Recovery for Corruption-Robust Offline Reinforcement Learning
by: Liu, Zeyuan, et al.
Published: (2025)