Staff View: An Empirical Study on the Power of Future Prediction in Partially Observable Environments :: Library Catalog

Bibliographic Details
Main Authors:	Kwon, Jeongyeol; Yang, Liu; Nowak, Robert; Hanna, Josiah
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2402.07102

_version_	1866916646576193536
author	Kwon, Jeongyeol; Yang, Liu; Nowak, Robert; Hanna, Josiah
author_facet	Kwon, Jeongyeol; Yang, Liu; Nowak, Robert; Hanna, Josiah
contents	Learning good representations of historical contexts is one of the core challenges of reinforcement learning (RL) in partially observable environments. While self-predictive auxiliary tasks have been shown to improve performance in fully observed settings, their role in partial observability remains underexplored. In this empirical study, we examine the effectiveness of self-predictive representation learning via future prediction, i.e., predicting next-step observations as an auxiliary task for learning history representations, especially in environments with long-term dependencies. We test the hypothesis that future prediction alone can produce representations that enable strong RL performance. To evaluate this, we introduce $\texttt{DRL}^2$, an approach that explicitly decouples representation learning from reinforcement learning, and compare this approach to end-to-end training across multiple benchmarks requiring long-term memory. Our findings provide evidence that this hypothesis holds across different network architectures, reinforcing the idea that future prediction performance serves as a reliable indicator of representation quality and contributes to improved RL performance.
format	Preprint
id	arxiv_https___arxiv_org_abs_2402_07102
institution	arXiv
publishDate	2024
record_format	arxiv
title	An Empirical Study on the Power of Future Prediction in Partially Observable Environments
topic	Machine Learning; Artificial Intelligence
url	https://arxiv.org/abs/2402.07102

Similar Items