Bibliographic Details
Main Authors: Morad, Steven; Lu, Chris; Kortvelesy, Ryan; Liwicki, Stephan; Foerster, Jakob; Prorok, Amanda
Format: Preprint
Published: 2024
Subjects: Machine Learning, Artificial Intelligence
Online Access: https://arxiv.org/abs/2402.09900
author Morad, Steven
Lu, Chris
Kortvelesy, Ryan
Liwicki, Stephan
Foerster, Jakob
Prorok, Amanda
contents Memory models such as Recurrent Neural Networks (RNNs) and Transformers address Partially Observable Markov Decision Processes (POMDPs) by mapping trajectories to latent Markov states. Neither model scales particularly well to long sequences, especially compared to an emerging class of memory models called Linear Recurrent Models. We discover that the recurrent update of these models resembles a monoid, leading us to reformulate existing models using a novel monoid-based framework that we call memoroids. We revisit the traditional approach to batching in recurrent reinforcement learning, highlighting theoretical and empirical deficiencies. We leverage memoroids to propose a batching method that improves sample efficiency, increases the return, and simplifies the implementation of recurrent loss functions in reinforcement learning.
format Preprint
id arxiv_https___arxiv_org_abs_2402_09900
institution arXiv
publishDate 2024
record_format arxiv
title Recurrent Reinforcement Learning with Memoroids
topic Machine Learning
Artificial Intelligence
url https://arxiv.org/abs/2402.09900
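
The abstract's remark that a recurrent update "resembles a monoid" can be made concrete with a small sketch. The example below is not taken from the preprint and does not reproduce its memoroid definition; it assumes a generic linear recurrence h_t = a_t * h_(t-1) + b_t and represents each step as a pair (a, b) under an associative combine with an identity element. All names (`combine`, `IDENTITY`, `sequential_scan`, `tree_reduce`, `linear_recurrence`) are hypothetical and chosen for illustration only.

```python
# Illustrative sketch only: a generic affine-update monoid, not the paper's code.
# Each element (a, b) stands for the affine map h -> a * h + b.
IDENTITY = (1.0, 0.0)  # the map h -> h


def combine(left, right):
    """Compose two affine updates: apply `left` first, then `right`."""
    a1, b1 = left
    a2, b2 = right
    return (a1 * a2, a2 * b1 + b2)


def sequential_scan(elements):
    """Left-to-right prefix scan: one combined update per time step."""
    out, acc = [], IDENTITY
    for element in elements:
        acc = combine(acc, element)
        out.append(acc)
    return out


def tree_reduce(elements):
    """Reduce by recursive halving; valid because `combine` is associative."""
    if not elements:
        return IDENTITY
    if len(elements) == 1:
        return elements[0]
    mid = len(elements) // 2
    return combine(tree_reduce(elements[:mid]), tree_reduce(elements[mid:]))


def linear_recurrence(decays, inputs, h0=0.0):
    """Compute h_t = decay_t * h_{t-1} + input_t for every t via the scan."""
    prefixes = sequential_scan(list(zip(decays, inputs)))
    return [a * h0 + b for a, b in prefixes]
```

Because `combine` is associative, the same result can be obtained in any grouping order, which is the property that lets such recurrences be evaluated with a parallel scan rather than a strictly sequential loop. For instance, `linear_recurrence([0.5, 0.5, 0.5], [1.0, 2.0, 3.0])` yields the same final state as `tree_reduce` applied to the same three steps.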