Staff View: Multi-Task Reward Learning from Human Ratings :: Library Catalog

Bibliographic Details
Main Authors:	Wu, Mingkang; White, Devin; Rose, Evelyn; Lawhern, Vernon; Waytowich, Nicholas R; Cao, Yongcan
Format:	Preprint
Published:	2025
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2506.09183

_version_	1866911010364850176
author	Wu, Mingkang; White, Devin; Rose, Evelyn; Lawhern, Vernon; Waytowich, Nicholas R; Cao, Yongcan
author_facet	Wu, Mingkang; White, Devin; Rose, Evelyn; Lawhern, Vernon; Waytowich, Nicholas R; Cao, Yongcan
contents	Reinforcement learning from human feedback (RLHF) has become a key factor in aligning model behavior with users' goals. However, while humans integrate multiple strategies when making decisions, current RLHF approaches often simplify this process by modeling human reasoning through isolated tasks such as classification or regression. In this paper, we propose a novel reinforcement learning (RL) method that mimics human decision-making by jointly considering multiple tasks. Specifically, we leverage human ratings in reward-free environments to infer a reward function, introducing learnable weights that balance the contributions of both classification and regression models. This design captures the inherent uncertainty in human decision-making and allows the model to adaptively emphasize different strategies. We conduct several experiments using synthetic human ratings to validate the effectiveness of the proposed approach. Results show that our method consistently outperforms existing rating-based RL methods, and in some cases, even surpasses traditional RL approaches.
format	Preprint
id	arxiv_https___arxiv_org_abs_2506_09183
institution	arXiv
publishDate	2025
record_format	arxiv
title	Multi-Task Reward Learning from Human Ratings
topic	Machine Learning; Artificial Intelligence
url	https://arxiv.org/abs/2506.09183