| Main Authors: | Liu, Yi; Datta, Gaurav; Novoseller, Ellen; Brown, Daniel S. |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2301.04741 |
Similar Items
Crowd-PrefRL: Preference-Based Reward Learning from Crowds
by: Chhan, David, et al.
Published: (2024)
GraphAllocBench: A Flexible Benchmark for Preference-Conditioned Multi-Objective Policy Learning
by: Jiang, Zhiheng, et al.
Published: (2026)
Rating-based Reinforcement Learning
by: White, Devin, et al.
Published: (2023)
Multi-Type Preference Learning: Empowering Preference-Based Reinforcement Learning with Equal Preferences
by: Liu, Ziang, et al.
Published: (2024)
Preference-Guided Reinforcement Learning for Efficient Exploration
by: Wang, Guojian, et al.
Published: (2024)
Boosting Robustness in Preference-Based Reinforcement Learning with Dynamic Sparsity
by: Muslimani, Calarina, et al.
Published: (2024)
Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better
by: Menghani, Gaurav
Published: (2021)
Sample-Efficient Preference-based Reinforcement Learning with Dynamics Aware Rewards
by: Metcalf, Katherine, et al.
Published: (2024)
Preference VLM: Leveraging VLMs for Scalable Preference-Based Reinforcement Learning
by: Ghosh, Udita, et al.
Published: (2025)
RA-PbRL: Provably Efficient Risk-Aware Preference-Based Reinforcement Learning
by: Zhao, Yujie, et al.
Published: (2024)
Query-Policy Misalignment in Preference-Based Reinforcement Learning
by: Hu, Xiao, et al.
Published: (2023)
TEACH: Temporal Variance-Driven Curriculum for Reinforcement Learning
by: Chaudhary, Gaurav, et al.
Published: (2025)
Efficient Preference-Based Reinforcement Learning: Randomized Exploration Meets Experimental Design
by: Schlaginhaufen, Andreas, et al.
Published: (2025)
From Reward-Free Representations to Preferences: Rethinking Offline Preference-Based Reinforcement Learning
by: Yang, Jun-Jie, et al.
Published: (2026)
Hindsight Preference Learning for Offline Preference-based Reinforcement Learning
by: Gao, Chen-Xiao, et al.
Published: (2024)
From Novelty to Imitation: Self-Distilled Rewards for Offline Reinforcement Learning
by: Chaudhary, Gaurav, et al.
Published: (2025)
Efficient Multi-Policy Evaluation for Reinforcement Learning
by: Liu, Shuze Daniel, et al.
Published: (2024)
Modeling Behavioral Preferences of Cyber Adversaries Using Inverse Reinforcement Learning
by: Shinde, Aditya, et al.
Published: (2025)
Combinatorial Reinforcement Learning with Preference Feedback
by: Lee, Joongkyu, et al.
Published: (2025)
On Efficient Bayesian Exploration in Model-Based Reinforcement Learning
by: Caron, Alberto, et al.
Published: (2025)
General Preference Reinforcement Learning
by: Umer, Muhammad, et al.
Published: (2026)
ResponseRank: Data-Efficient Reward Modeling through Preference Strength Learning
by: Kaufmann, Timo, et al.
Published: (2025)
AREAL-DTA: Dynamic Tree Attention for Efficient Reinforcement Learning of Large Language Models
by: Zhang, Jiarui, et al.
Published: (2026)
Efficient Policy Evaluation with Safety Constraint for Reinforcement Learning
by: Chen, Claire, et al.
Published: (2024)
Search-Based Credit Assignment for Offline Preference-Based Reinforcement Learning
by: Gao, Xiancheng, et al.
Published: (2025)
Autonomous Assessment of Demonstration Sufficiency via Bayesian Inverse Reinforcement Learning
by: Trinh, Tu, et al.
Published: (2022)
Provable Reward-Agnostic Preference-Based Reinforcement Learning
by: Zhan, Wenhao, et al.
Published: (2023)
PB$^2$: Preference Space Exploration via Population-Based Methods in Preference-Based Reinforcement Learning
by: Driss, Brahim, et al.
Published: (2025)
Reinforcement Learning from Diverse Human Preferences
by: Xue, Wanqi, et al.
Published: (2023)
Preference-based Multi-Objective Reinforcement Learning
by: Mu, Ni, et al.
Published: (2025)
MOORL: A Framework for Integrating Offline-Online Reinforcement Learning
by: Chaudhary, Gaurav, et al.
Published: (2025)
MOBODY: Model Based Off-Dynamics Offline Reinforcement Learning
by: Guo, Yihong, et al.
Published: (2025)
Preference Elicitation for Offline Reinforcement Learning
by: Pace, Alizée, et al.
Published: (2024)
Residual Reward Models for Preference-based Reinforcement Learning
by: Cao, Chenyang, et al.
Published: (2025)
Binary Reward Labeling: Bridging Offline Preference and Reward-Based Reinforcement Learning
by: Xu, Yinglun, et al.
Published: (2024)
Dynamic Preference Multi-Objective Reinforcement Learning for Internet Network Management
by: Heo, DongNyeong, et al.
Published: (2025)
Efficient Reinforcement Learning from Human Feedback via Bayesian Preference Inference
by: Cercola, Matteo, et al.
Published: (2025)
LAPP: Large Language Model Feedback for Preference-Driven Reinforcement Learning
by: Jian, Pingcheng, et al.
Published: (2025)
Two-Step Offline Preference-Based Reinforcement Learning with Constrained Actions
by: Xu, Yinglun, et al.
Published: (2023)
Fine-tuning Behavioral Cloning Policies with Preference-Based Reinforcement Learning
by: Macuglia, Maël, et al.
Published: (2025)