Bibliographic Details
Main Authors: Liu, Ziang; Xu, Junjie; Wu, Xingjiao; Yang, Jing; He, Liang
Format: Preprint
Published: 2024
Subjects: Machine Learning
Online Access: https://arxiv.org/abs/2409.07268
_version_ 1866913546707664896
author Liu, Ziang
Xu, Junjie
Wu, Xingjiao
Yang, Jing
He, Liang
author_facet Liu, Ziang
Xu, Junjie
Wu, Xingjiao
Yang, Jing
He, Liang
contents Preference-based reinforcement learning (PBRL) learns directly from human teachers' preferences over agent behaviors, without needing meticulously designed reward functions. However, existing PBRL methods learn primarily from explicit preferences and neglect the possibility that a teacher may label two behaviors as equally preferred. This neglect may hinder the agent's understanding of how the teacher views the task and lead to the loss of important information. To address this issue, we introduce the Equal Preference Learning Task, which optimizes the neural network by promoting similar reward predictions when two agent behaviors are labeled as equally preferred. Building on this task, we propose a novel PBRL method, Multi-Type Preference Learning (MTPL), which learns from equal preferences while simultaneously leveraging existing methods for learning from explicit preferences. To validate our approach, we design experiments applying MTPL to four existing state-of-the-art baselines across ten locomotion and robotic manipulation tasks in the DeepMind Control Suite. The experimental results indicate that learning from both equal and explicit preferences simultaneously enables the PBRL method to understand teacher feedback more comprehensively, thereby enhancing feedback efficiency. Project page: https://github.com/FeiCuiLengMMbb/paper_MTPL
format Preprint
id arxiv_https___arxiv_org_abs_2409_07268
institution arXiv
publishDate 2024
record_format arxiv
title Multi-Type Preference Learning: Empowering Preference-Based Reinforcement Learning with Equal Preferences
topic Machine Learning
url https://arxiv.org/abs/2409.07268
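
Illustrative note on the contents field: the abstract describes two training signals, one for explicit preferences and one that promotes similar reward predictions for behaviors labeled as equally preferred, but it does not give the loss functions. The sketch below is one possible reading, not the authors' implementation. It assumes a Bradley-Terry cross-entropy for explicit preferences (common in PBRL) and a squared difference of predicted segment returns for equal preferences. All function names, the `equal_weight` parameter, and the use of `None` as the equal-preference label are hypothetical.

```python
import math


def segment_return(rewards):
    """Sum of predicted per-step rewards over one behavior segment."""
    return sum(rewards)


def explicit_preference_loss(rewards_a, rewards_b, label):
    """Bradley-Terry cross-entropy (assumed form, not taken from the record).

    label is 1.0 if segment A is preferred and 0.0 if segment B is preferred.
    """
    diff = segment_return(rewards_a) - segment_return(rewards_b)
    # log P(A preferred) = log sigmoid(diff), computed in a numerically stable way.
    if diff >= 0:
        log_p_a = -math.log1p(math.exp(-diff))
    else:
        log_p_a = diff - math.log1p(math.exp(diff))
    # log P(B preferred) = log sigmoid(-diff) = log sigmoid(diff) - diff.
    log_p_b = log_p_a - diff
    return -(label * log_p_a + (1.0 - label) * log_p_b)


def equal_preference_loss(rewards_a, rewards_b):
    """Assumed equal-preference term: penalize differing predicted returns."""
    diff = segment_return(rewards_a) - segment_return(rewards_b)
    return diff * diff


def multi_type_loss(batch, equal_weight=1.0):
    """Average loss over (rewards_a, rewards_b, label) triples.

    A label of None marks an equal preference (hypothetical convention).
    """
    total = 0.0
    for rewards_a, rewards_b, label in batch:
        if label is None:
            total += equal_weight * equal_preference_loss(rewards_a, rewards_b)
        else:
            total += explicit_preference_loss(rewards_a, rewards_b, label)
    return total / len(batch)
```

In a real system the per-step rewards would come from a learned reward network and the loss would be minimized by gradient descent; plain Python lists are used here only to keep the example self-contained.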