Staff View: Model-Based Reinforcement Learning for Atari :: Library Catalog

Bibliographic Details
Main Authors:	Kaiser, Lukasz; Babaeizadeh, Mohammad; Milos, Piotr; Osinski, Blazej; Campbell, Roy H; Czechowski, Konrad; Erhan, Dumitru; Finn, Chelsea; Kozakowski, Piotr; Levine, Sergey; Mohiuddin, Afroz; Sepassi, Ryan; Tucker, George; Michalewski, Henryk
Format:	Preprint
Published:	2019
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/1903.00374

_version_	1866909157844582400
author	Kaiser, Lukasz; Babaeizadeh, Mohammad; Milos, Piotr; Osinski, Blazej; Campbell, Roy H; Czechowski, Konrad; Erhan, Dumitru; Finn, Chelsea; Kozakowski, Piotr; Levine, Sergey; Mohiuddin, Afroz; Sepassi, Ryan; Tucker, George; Michalewski, Henryk
author_facet	Kaiser, Lukasz; Babaeizadeh, Mohammad; Milos, Piotr; Osinski, Blazej; Campbell, Roy H; Czechowski, Konrad; Erhan, Dumitru; Finn, Chelsea; Kozakowski, Piotr; Levine, Sergey; Mohiuddin, Afroz; Sepassi, Ryan; Tucker, George; Michalewski, Henryk
contents	Model-free reinforcement learning (RL) can be used to learn effective policies for complex tasks, such as Atari games, even from image observations. However, this typically requires very large amounts of interaction -- substantially more, in fact, than a human would need to learn the same games. How can people learn so quickly? Part of the answer may be that people can learn how the game works and predict which actions will lead to desirable outcomes. In this paper, we explore how video prediction models can similarly enable agents to solve Atari games with fewer interactions than model-free methods. We describe Simulated Policy Learning (SimPLe), a complete model-based deep RL algorithm based on video prediction models, and present a comparison of several model architectures, including a novel architecture that yields the best results in our setting. Our experiments evaluate SimPLe on a range of Atari games in a low-data regime of 100k interactions between the agent and the environment, which corresponds to two hours of real-time play. In most games, SimPLe outperforms state-of-the-art model-free algorithms, in some games by over an order of magnitude.
format	Preprint
id	arxiv_https___arxiv_org_abs_1903_00374
institution	arXiv
publishDate	2019
record_format	arxiv
title	Model-Based Reinforcement Learning for Atari
topic	Machine Learning
url	https://arxiv.org/abs/1903.00374

Similar Items