:: Library Catalog

Bibliographic Details
Main Authors:	Liu, Yi; Datta, Gaurav; Novoseller, Ellen; Brown, Daniel S.
Format:	Preprint
Published:	2023
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2301.04741

Similar Items

Crowd-PrefRL: Preference-Based Reward Learning from Crowds
by: Chhan, David, et al.
Published: (2024)

GraphAllocBench: A Flexible Benchmark for Preference-Conditioned Multi-Objective Policy Learning
by: Jiang, Zhiheng, et al.
Published: (2026)

Rating-based Reinforcement Learning
by: White, Devin, et al.
Published: (2023)

Multi-Type Preference Learning: Empowering Preference-Based Reinforcement Learning with Equal Preferences
by: Liu, Ziang, et al.
Published: (2024)

Preference-Guided Reinforcement Learning for Efficient Exploration
by: Wang, Guojian, et al.
Published: (2024)

Boosting Robustness in Preference-Based Reinforcement Learning with Dynamic Sparsity
by: Muslimani, Calarina, et al.
Published: (2024)

Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better
by: Menghani, Gaurav
Published: (2021)

Sample-Efficient Preference-based Reinforcement Learning with Dynamics Aware Rewards
by: Metcalf, Katherine, et al.
Published: (2024)

Preference VLM: Leveraging VLMs for Scalable Preference-Based Reinforcement Learning
by: Ghosh, Udita, et al.
Published: (2025)

RA-PbRL: Provably Efficient Risk-Aware Preference-Based Reinforcement Learning
by: Zhao, Yujie, et al.
Published: (2024)

Query-Policy Misalignment in Preference-Based Reinforcement Learning
by: Hu, Xiao, et al.
Published: (2023)

TEACH: Temporal Variance-Driven Curriculum for Reinforcement Learning
by: Chaudhary, Gaurav, et al.
Published: (2025)

Efficient Preference-Based Reinforcement Learning: Randomized Exploration Meets Experimental Design
by: Schlaginhaufen, Andreas, et al.
Published: (2025)

From Reward-Free Representations to Preferences: Rethinking Offline Preference-Based Reinforcement Learning
by: Yang, Jun-Jie, et al.
Published: (2026)

Hindsight Preference Learning for Offline Preference-based Reinforcement Learning
by: Gao, Chen-Xiao, et al.
Published: (2024)

From Novelty to Imitation: Self-Distilled Rewards for Offline Reinforcement Learning
by: Chaudhary, Gaurav, et al.
Published: (2025)

Efficient Multi-Policy Evaluation for Reinforcement Learning
by: Liu, Shuze Daniel, et al.
Published: (2024)

Modeling Behavioral Preferences of Cyber Adversaries Using Inverse Reinforcement Learning
by: Shinde, Aditya, et al.
Published: (2025)

Combinatorial Reinforcement Learning with Preference Feedback
by: Lee, Joongkyu, et al.
Published: (2025)

On Efficient Bayesian Exploration in Model-Based Reinforcement Learning
by: Caron, Alberto, et al.
Published: (2025)

General Preference Reinforcement Learning
by: Umer, Muhammad, et al.
Published: (2026)

ResponseRank: Data-Efficient Reward Modeling through Preference Strength Learning
by: Kaufmann, Timo, et al.
Published: (2025)

AREAL-DTA: Dynamic Tree Attention for Efficient Reinforcement Learning of Large Language Models
by: Zhang, Jiarui, et al.
Published: (2026)

Efficient Policy Evaluation with Safety Constraint for Reinforcement Learning
by: Chen, Claire, et al.
Published: (2024)

Search-Based Credit Assignment for Offline Preference-Based Reinforcement Learning
by: Gao, Xiancheng, et al.
Published: (2025)

Autonomous Assessment of Demonstration Sufficiency via Bayesian Inverse Reinforcement Learning
by: Trinh, Tu, et al.
Published: (2022)

Provable Reward-Agnostic Preference-Based Reinforcement Learning
by: Zhan, Wenhao, et al.
Published: (2023)

PB$^2$: Preference Space Exploration via Population-Based Methods in Preference-Based Reinforcement Learning
by: Driss, Brahim, et al.
Published: (2025)

Reinforcement Learning from Diverse Human Preferences
by: Xue, Wanqi, et al.
Published: (2023)

Preference-based Multi-Objective Reinforcement Learning
by: Mu, Ni, et al.
Published: (2025)

MOORL: A Framework for Integrating Offline-Online Reinforcement Learning
by: Chaudhary, Gaurav, et al.
Published: (2025)

MOBODY: Model Based Off-Dynamics Offline Reinforcement Learning
by: Guo, Yihong, et al.
Published: (2025)

Preference Elicitation for Offline Reinforcement Learning
by: Pace, Alizée, et al.
Published: (2024)

Residual Reward Models for Preference-based Reinforcement Learning
by: Cao, Chenyang, et al.
Published: (2025)

Binary Reward Labeling: Bridging Offline Preference and Reward-Based Reinforcement Learning
by: Xu, Yinglun, et al.
Published: (2024)

Dynamic Preference Multi-Objective Reinforcement Learning for Internet Network Management
by: Heo, DongNyeong, et al.
Published: (2025)

Efficient Reinforcement Learning from Human Feedback via Bayesian Preference Inference
by: Cercola, Matteo, et al.
Published: (2025)

LAPP: Large Language Model Feedback for Preference-Driven Reinforcement Learning
by: Jian, Pingcheng, et al.
Published: (2025)

Two-Step Offline Preference-Based Reinforcement Learning with Constrained Actions
by: Xu, Yinglun, et al.
Published: (2023)

Fine-tuning Behavioral Cloning Policies with Preference-Based Reinforcement Learning
by: Macuglia, Maël, et al.
Published: (2025)