| Main Authors: | Liu, Zifan, Li, Xinran, Chen, Shibo, Zhang, Jun |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2602.05323 |
Similar Items
Puzzle it Out: Local-to-Global World Model for Offline Multi-Agent Reinforcement Learning
by: Li, Sijia, et al.
Published: (2026)
Reinforcement Learning with Intrinsically Motivated Feedback Graph for Lost-sales Inventory Control
by: Liu, Zifan, et al.
Published: (2024)
Individual Contributions as Intrinsic Exploration Scaffolds for Multi-agent Reinforcement Learning
by: Li, Xinran, et al.
Published: (2024)
Advancing Safe Mechanical Ventilation Using Offline RL With Hybrid Actions and Clinically Aligned Rewards
by: Yousuf, Muhammad Hamza, et al.
Published: (2025)
OMG-RL:Offline Model-based Guided Reward Learning for Heparin Treatment
by: Lim, Yooseok, et al.
Published: (2024)
Model-Based Proactive Cost Generation for Learning Safe Policies Offline with Limited Violation Data
by: Xue, Ruiqi, et al.
Published: (2026)
Enhancing Generative Auto-bidding with Offline Reward Evaluation and Policy Search
by: Mou, Zhiyu, et al.
Published: (2025)
Offline RL with Smooth OOD Generalization in Convex Hull and its Neighborhood
by: Yao, Qingmao, et al.
Published: (2025)
Safe Offline Reinforcement Learning with Real-Time Budget Constraints
by: Lin, Qian, et al.
Published: (2023)
STO-RL: Offline RL under Sparse Rewards via LLM-Guided Subgoal Temporal Order
by: Gu, Chengyang, et al.
Published: (2026)
Are Expressive Models Truly Necessary for Offline RL?
by: Wang, Guan, et al.
Published: (2024)
Budgeting Counterfactual for Offline RL
by: Liu, Yao, et al.
Published: (2023)
Accelerating Diffusion Planners in Offline RL via Reward-Aware Consistency Trajectory Distillation
by: Duan, Xintong, et al.
Published: (2025)
Dual Alignment Maximin Optimization for Offline Model-based RL
by: Zhou, Chi, et al.
Published: (2025)
Policy-regularized Offline Multi-objective Reinforcement Learning
by: Lin, Qian, et al.
Published: (2024)
A Tractable Inference Perspective of Offline RL
by: Liu, Xuejie, et al.
Published: (2023)
An Empirical Study on the Effectiveness of Incorporating Offline RL As Online RL Subroutines
by: Su, Jianhai, et al.
Published: (2025)
CROP: Conservative Reward for Model-based Offline Policy Optimization
by: Li, Hao, et al.
Published: (2023)
Automatic Reward Shaping from Confounded Offline Data
by: Li, Mingxuan, et al.
Published: (2025)
Safe Offline Reinforcement Learning with Feasibility-Guided Diffusion Model
by: Zheng, Yinan, et al.
Published: (2024)
Decoupled Guidance Diffusion for Adaptive Offline Safe Reinforcement Learning
by: Chen, Rufeng, et al.
Published: (2026)
Scalable Offline Model-Based RL with Action Chunks
by: Park, Kwanyoung, et al.
Published: (2025)
Reward Generation via Large Vision-Language Model in Offline Reinforcement Learning
by: Lee, Younghwan, et al.
Published: (2025)
Selective Uncertainty Propagation in Offline RL
by: Krishnamurthy, Sanath Kumar, et al.
Published: (2023)
Decoupled Prioritized Resampling for Offline RL
by: Yue, Yang, et al.
Published: (2023)
Augmenting Offline RL with Unlabeled Data
by: Wang, Zhao, et al.
Published: (2024)
Balance Reward and Safety Optimization for Safe Reinforcement Learning: A Perspective of Gradient Manipulation
by: Gu, Shangding, et al.
Published: (2024)
Boundary-to-Region Supervision for Offline Safe Reinforcement Learning
by: Su, Huikang, et al.
Published: (2025)
Exploring and Addressing Reward Confusion in Offline Preference Learning
by: Chen, Xin, et al.
Published: (2024)
Towards Improving Reward Design in RL: A Reward Alignment Metric for RL Practitioners
by: Muslimani, Calarina, et al.
Published: (2025)
Policy Agnostic RL: Offline RL and Online RL Fine-Tuning of Any Class and Backbone
by: Mark, Max Sobol, et al.
Published: (2024)
Offline Reinforcement Learning with Imputed Rewards
by: Romeo, Carlo, et al.
Published: (2024)
Design Considerations in Offline Preference-based RL
by: Agarwal, Alekh, et al.
Published: (2025)
OGBench: Benchmarking Offline Goal-Conditioned RL
by: Park, Seohong, et al.
Published: (2024)
Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining
by: Cheng, Jie, et al.
Published: (2024)
Offline vs. Online Learning in Model-based RL: Lessons for Data Collection Strategies
by: Chen, Jiaqi, et al.
Published: (2025)
Scaling Offline RL via Efficient and Expressive Shortcut Models
by: Espinosa-Dice, Nicolas, et al.
Published: (2025)
Decision MetaMamba: Enhancing Selective SSM in Offline RL with Heterogeneous Sequence Mixing
by: Kim, Wall, et al.
Published: (2026)
Decision MetaMamba: Enhancing Selective SSM in Offline RL with Heterogeneous Sequence Mixing
by: Kim, Wall, et al.
Published: (2024)
Accelerating LLM Reasoning via Early Rejection with Partial Reward Modeling
by: Cheshmi, Seyyed Saeid, et al.
Published: (2025)