Staff View: :: Library Catalog

Bibliographic Details
Main Authors:	Yang, Qianlan; Wang, Yu-Xiong
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Artificial Intelligence; Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2406.04323

_version_	1866929376291979264
author	Yang, Qianlan; Wang, Yu-Xiong
author_facet	Yang, Qianlan; Wang, Yu-Xiong
contents	Training autonomous agents with sparse rewards is a long-standing problem in online reinforcement learning (RL), due to low data efficiency. Prior work overcomes this challenge by extracting useful knowledge from offline data, often accomplished through the learning of action distribution from offline data and utilizing the learned distribution to facilitate online RL. However, since the offline data are given and fixed, the extracted knowledge is inherently limited, making it difficult to generalize to new tasks. We propose a novel approach that leverages offline data to learn a generative diffusion model, coined as Adaptive Trajectory Diffuser (ATraDiff). This model generates synthetic trajectories, serving as a form of data augmentation and consequently enhancing the performance of online RL methods. The key strength of our diffuser lies in its adaptability, allowing it to effectively handle varying trajectory lengths and mitigate distribution shifts between online and offline data. Because of its simplicity, ATraDiff seamlessly integrates with a wide spectrum of RL methods. Empirical evaluation shows that ATraDiff consistently achieves state-of-the-art performance across a variety of environments, with particularly pronounced improvements in complicated settings. Our code and demo video are available at https://atradiff.github.io.
format	Preprint
id	arxiv_https___arxiv_org_abs_2406_04323
institution	arXiv
publishDate	2024
record_format	arxiv
title	ATraDiff: Accelerating Online Reinforcement Learning with Imaginary Trajectories
topic	Machine Learning; Artificial Intelligence; Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2406.04323
