Vista Equipo: :: Library Catalog

Guardado en:

Detalles Bibliográficos
Autores principales:	Rank, Ben, Triantafyllou, Stelios, Mandal, Debmalya, Radanovic, Goran
Formato:	Preprint
Publicado:	2024
Materias:	Machine Learning
Acceso en línea:	https://arxiv.org/abs/2402.09838
Etiquetas:	Agregar Etiqueta Sin Etiquetas, Sea el primero en etiquetar este registro!

_version_	1866914817493696512
author	Rank, Ben Triantafyllou, Stelios Mandal, Debmalya Radanovic, Goran
author_facet	Rank, Ben Triantafyllou, Stelios Mandal, Debmalya Radanovic, Goran
contents	When Reinforcement Learning (RL) agents are deployed in practice, they might impact their environment and change its dynamics. We propose a new framework to model this phenomenon, where the current environment depends on the deployed policy as well as its previous dynamics. This is a generalization of Performative RL (PRL) [Mandal et al., 2023]. Unlike PRL, our framework allows to model scenarios where the environment gradually adjusts to a deployed policy. We adapt two algorithms from the performative prediction literature to our setting and propose a novel algorithm called Mixed Delayed Repeated Retraining (MDRR). We provide conditions under which these algorithms converge and compare them using three metrics: number of retrainings, approximation guarantee, and number of samples per deployment. MDRR is the first algorithm in this setting which combines samples from multiple deployments in its training. This makes MDRR particularly suitable for scenarios where the environment's response strongly depends on its previous dynamics, which are common in practice. We experimentally compare the algorithms using a simulation-based testbed and our results show that MDRR converges significantly faster than previous approaches.
format	Preprint
id	arxiv_https___arxiv_org_abs_2402_09838
institution	arXiv
publishDate	2024
record_format	arxiv
spellingShingle	Performative Reinforcement Learning in Gradually Shifting Environments Rank, Ben Triantafyllou, Stelios Mandal, Debmalya Radanovic, Goran Machine Learning When Reinforcement Learning (RL) agents are deployed in practice, they might impact their environment and change its dynamics. We propose a new framework to model this phenomenon, where the current environment depends on the deployed policy as well as its previous dynamics. This is a generalization of Performative RL (PRL) [Mandal et al., 2023]. Unlike PRL, our framework allows to model scenarios where the environment gradually adjusts to a deployed policy. We adapt two algorithms from the performative prediction literature to our setting and propose a novel algorithm called Mixed Delayed Repeated Retraining (MDRR). We provide conditions under which these algorithms converge and compare them using three metrics: number of retrainings, approximation guarantee, and number of samples per deployment. MDRR is the first algorithm in this setting which combines samples from multiple deployments in its training. This makes MDRR particularly suitable for scenarios where the environment's response strongly depends on its previous dynamics, which are common in practice. We experimentally compare the algorithms using a simulation-based testbed and our results show that MDRR converges significantly faster than previous approaches.
title	Performative Reinforcement Learning in Gradually Shifting Environments
topic	Machine Learning
url	https://arxiv.org/abs/2402.09838

Ejemplares similares