Staff View: CueLearner: Bootstrapping and local policy adaptation from relative feedback :: Library Catalog

Bibliographic Details
Main Authors:	Schiavi, Giulio; Cramariuc, Andrei; Ott, Lionel; Siegwart, Roland
Format:	Preprint
Published:	2025
Subjects:	Robotics; Machine Learning
Online Access:	https://arxiv.org/abs/2507.04730

_version_	1866911042329640960
author	Schiavi, Giulio; Cramariuc, Andrei; Ott, Lionel; Siegwart, Roland
author_facet	Schiavi, Giulio; Cramariuc, Andrei; Ott, Lionel; Siegwart, Roland
contents	Human guidance has emerged as a powerful tool for enhancing reinforcement learning (RL). However, conventional forms of guidance such as demonstrations or binary scalar feedback can be challenging to collect or have low information content, motivating the exploration of other forms of human input. Among these, relative feedback (i.e., feedback on how to improve an action, such as "more to the left") offers a good balance between usability and information richness. Previous research has shown that relative feedback can be used to enhance policy search methods. However, these efforts have been limited to specific policy classes and use feedback inefficiently. In this work, we introduce a novel method to learn from relative feedback and combine it with off-policy reinforcement learning. Through evaluations on two sparse-reward tasks, we demonstrate our method can be used to improve the sample efficiency of reinforcement learning by guiding its exploration process. Additionally, we show it can adapt a policy to changes in the environment or the user's preferences. Finally, we demonstrate real-world applicability by employing our approach to learn a navigation policy in a sparse reward setting.
format	Preprint
id	arxiv_https___arxiv_org_abs_2507_04730
institution	arXiv
publishDate	2025
record_format	arxiv
title	CueLearner: Bootstrapping and local policy adaptation from relative feedback
topic	Robotics; Machine Learning
url	https://arxiv.org/abs/2507.04730

Similar Items