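Staff View: Multi-Type Preference Learning: Empowering Preference-Based Reinforcement Learning with Equal Preferences :: Library Catalog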

Bibliographic Details
Main Authors:	Liu, Ziang; Xu, Junjie; Wu, Xingjiao; Yang, Jing; He, Liang
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2409.07268

_version_	1866913546707664896
author	Liu, Ziang; Xu, Junjie; Wu, Xingjiao; Yang, Jing; He, Liang
author_facet	Liu, Ziang; Xu, Junjie; Wu, Xingjiao; Yang, Jing; He, Liang
contents	Preference-based reinforcement learning (PBRL) learns directly from the preferences of human teachers regarding agent behaviors, without needing meticulously designed reward functions. However, existing PBRL methods often learn primarily from explicit preferences, neglecting the possibility that teachers may choose equal preferences. This neglect may hinder the agent's understanding of the teacher's perspective on the task, leading to the loss of important information. To address this issue, we introduce the Equal Preference Learning Task, which optimizes the neural network by promoting similar reward predictions when the behaviors of two agents are labeled as equal preferences. Building on this task, we propose a novel PBRL method, Multi-Type Preference Learning (MTPL), which allows simultaneous learning from equal preferences while leveraging existing methods for learning from explicit preferences. To validate our approach, we design experiments applying MTPL to four existing state-of-the-art baselines across ten locomotion and robotic manipulation tasks in the DeepMind Control Suite. The experimental results indicate that simultaneous learning from both equal and explicit preferences enables the PBRL method to more comprehensively understand the feedback from teachers, thereby enhancing feedback efficiency. Project page: https://github.com/FeiCuiLengMMbb/paper_MTPL
format	Preprint
id	arxiv_https___arxiv_org_abs_2409_07268
institution	arXiv
publishDate	2024
record_format	arxiv
title	Multi-Type Preference Learning: Empowering Preference-Based Reinforcement Learning with Equal Preferences
topic	Machine Learning
url	https://arxiv.org/abs/2409.07268

Similar Items