Bibliographic Details
Main Authors: Micheli, Vincent, Alonso, Eloi, Fleuret, François
Format: Preprint
Published: 2024
Subjects: Machine Learning, Artificial Intelligence, Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2406.19320
author Micheli, Vincent
Alonso, Eloi
Fleuret, François
contents Scaling up deep Reinforcement Learning (RL) methods presents a significant challenge. Following developments in generative modelling, model-based RL positions itself as a strong contender. Recent advances in sequence modelling have led to effective transformer-based world models, albeit at the price of heavy computations due to the long sequences of tokens required to accurately simulate environments. In this work, we propose $Δ$-IRIS, a new agent with a world model architecture composed of a discrete autoencoder that encodes stochastic deltas between time steps and an autoregressive transformer that predicts future deltas by summarizing the current state of the world with continuous tokens. In the Crafter benchmark, $Δ$-IRIS sets a new state of the art at multiple frame budgets, while being an order of magnitude faster to train than previous attention-based approaches. We release our code and models at https://github.com/vmicheli/delta-iris.
format Preprint
id arxiv_https___arxiv_org_abs_2406_19320
institution arXiv
publishDate 2024
record_format arxiv
title Efficient World Models with Context-Aware Tokenization
topic Machine Learning
Artificial Intelligence
Computer Vision and Pattern Recognition