| Main Authors: | Chen, Xin, Toyer, Sam, Shkurti, Florian |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2407.16025 |
Similar Items
SMAC: Score-Matched Actor-Critics for Robust Offline-to-Online Transfer
by: de Lara, Nathan Samuel, et al.
Published: (2026)
Generalization in VAE and Diffusion Models: A Unified Information-Theoretic Analysis
by: Chen, Qi, et al.
Published: (2025)
Listwise Reward Estimation for Offline Preference-based Reinforcement Learning
by: Choi, Heewoong, et al.
Published: (2024)
Continual Model-Based Reinforcement Learning with Hypernetworks
by: Huang, Yizhou, et al.
Published: (2020)
PROF: An LLM-based Reward Code Preference Optimization Framework for Offline Imitation Learning
by: Sun, Shengjie, et al.
Published: (2025)
Hindsight Preference Learning for Offline Preference-based Reinforcement Learning
by: Gao, Chen-Xiao, et al.
Published: (2024)
Offline Reinforcement Learning with Imputed Rewards
by: Romeo, Carlo, et al.
Published: (2024)
Preference Elicitation for Offline Reinforcement Learning
by: Pace, Alizée, et al.
Published: (2024)
Informing Acquisition Functions via Foundation Models for Molecular Discovery
by: Chen, Qi, et al.
Published: (2025)
Behavior Preference Regression for Offline Reinforcement Learning
by: Srinivasan, Padmanaba, et al.
Published: (2025)
Offline Trajectory Optimization for Offline Reinforcement Learning
by: Zhao, Ziqi, et al.
Published: (2024)
Preferred-Action-Optimized Diffusion Policies for Offline Reinforcement Learning
by: Zhang, Tianle, et al.
Published: (2024)
Reward Learning From Preference With Ties
by: Liu, Jinsong, et al.
Published: (2024)
In-Dataset Trajectory Return Regularization for Offline Preference-based Reinforcement Learning
by: Tu, Songjun, et al.
Published: (2024)
Adversarial Policy Optimization for Offline Preference-based Reinforcement Learning
by: Kang, Hyungkyu, et al.
Published: (2025)
Penalizing Infeasible Actions and Reward Scaling in Reinforcement Learning with Offline Data
by: Kim, Jeonghye, et al.
Published: (2025)
The Confusion is Real: GRAPHIC -- A Network Science Approach to Confusion Matrices in Deep Learning
by: Fröhlich, Johanna S., et al.
Published: (2026)
LEASE: Offline Preference-based Reinforcement Learning with High Sample Efficiency
by: Liu, Xiao-Yin, et al.
Published: (2024)
Should We Ever Prefer Decision Transformer for Offline Reinforcement Learning?
by: Omori, Yumi, et al.
Published: (2025)
OPRIDE: Offline Preference-based Reinforcement Learning via In-Dataset Exploration
by: Yang, Yiqin, et al.
Published: (2026)
Search-Based Credit Assignment for Offline Preference-Based Reinforcement Learning
by: Gao, Xiancheng, et al.
Published: (2025)
Two-Step Offline Preference-Based Reinforcement Learning with Constrained Actions
by: Xu, Yinglun, et al.
Published: (2023)
Rectifying Shortcut Behaviors in Preference-based Reward Learning
by: Ye, Wenqian, et al.
Published: (2025)
OMG-RL:Offline Model-based Guided Reward Learning for Heparin Treatment
by: Lim, Yooseok, et al.
Published: (2024)
Reward Generation via Large Vision-Language Model in Offline Reinforcement Learning
by: Lee, Younghwan, et al.
Published: (2025)
Design Considerations in Offline Preference-based RL
by: Agarwal, Alekh, et al.
Published: (2025)
Latent Adversarial Regularization for Offline Preference Optimization
by: Jiang, Enyi, et al.
Published: (2026)
Preference as Reward, Maximum Preference Optimization with Importance Sampling
by: Jiang, Zaifan, et al.
Published: (2023)
Automatic Reward Shaping from Confounded Offline Data
by: Li, Mingxuan, et al.
Published: (2025)
Robust Offline Reinforcement learning with Heavy-Tailed Rewards
by: Zhu, Jin, et al.
Published: (2023)
Enhancing Generative Auto-bidding with Offline Reward Evaluation and Policy Search
by: Mou, Zhiyu, et al.
Published: (2025)
Preference-Guided Diffusion for Multi-Objective Offline Optimization
by: Annadani, Yashas, et al.
Published: (2025)
Residual Reward Models for Preference-based Reinforcement Learning
by: Cao, Chenyang, et al.
Published: (2025)
CROP: Conservative Reward for Model-based Offline Policy Optimization
by: Li, Hao, et al.
Published: (2023)
Robust Guided Diffusion for Offline Black-Box Optimization
by: Chen, Can Sam, et al.
Published: (2024)
Preference Poisoning Attacks on Reward Model Learning
by: Wu, Junlin, et al.
Published: (2024)
GAS: Enhancing Reward-Cost Balance of Generative Model-assisted Offline Safe RL
by: Liu, Zifan, et al.
Published: (2026)
SafeMIL: Learning Offline Safe Imitation Policy from Non-Preferred Trajectories
by: Burnwal, Returaj, et al.
Published: (2025)
Flexible Blood Glucose Control: Offline Reinforcement Learning from Human Feedback
by: Emerson, Harry, et al.
Published: (2025)
Preference-Based Self-Distillation: Beyond KL Matching via Reward Regularization
by: Yu, Xin, et al.
Published: (2026)