

Bibliographic Details
Main authors:	Chen, Xuyang; Zhao, Lin
Format:	Preprint
Published:	2022
Subjects:	Machine Learning; Optimization and Control
Online access:	https://arxiv.org/abs/2210.09921

_version_	1866929224499068928
author	Chen, Xuyang; Zhao, Lin
author_facet	Chen, Xuyang; Zhao, Lin
contents	Actor-critic methods have achieved significant success in many challenging applications. However, their finite-time convergence is still poorly understood in the most practical single-timescale form. Existing analyses of single-timescale actor-critic have been limited to i.i.d. sampling or the tabular setting for simplicity. We investigate the more practical online single-timescale actor-critic algorithm on a continuous state space, where the critic uses linear function approximation and updates with a single Markovian sample per actor step. Previous analyses have been unable to establish convergence in such a challenging scenario. We demonstrate that the online single-timescale actor-critic method provably finds an $\epsilon$-approximate stationary point with $\widetilde{\mathcal{O}}(\epsilon^{-2})$ sample complexity under standard assumptions, which can be further improved to $\mathcal{O}(\epsilon^{-2})$ under i.i.d. sampling. Our novel framework systematically evaluates and controls the error propagation between the actor and the critic. It also offers a promising approach for analyzing other single-timescale reinforcement learning algorithms.
format	Preprint
id	arxiv_https___arxiv_org_abs_2210_09921
institution	arXiv
publishDate	2022
record_format	arxiv
title	Finite-time analysis of single-timescale actor-critic
topic	Machine Learning; Optimization and Control
url	https://arxiv.org/abs/2210.09921
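
The abstract in the contents field describes an online single-timescale actor-critic: the critic uses linear function approximation and is updated with one Markovian sample per actor step, with both step sizes of the same order. The following is only a minimal sketch of that general idea, not the authors' algorithm. The toy dynamics, the quadratic features, the Gaussian policy, the discounted TD(0) critic, and all names (`features`, `env_step`, `run_single_timescale_ac`) are assumptions made for illustration.

```python
import math
import random


def features(s):
    """Hypothetical critic features for a scalar state: [1, s, s^2]."""
    return [1.0, s, s * s]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def env_step(s, a, rng):
    """Toy continuous-state dynamics (an assumption, not from the paper)."""
    s_next = 0.9 * s + 0.5 * a + 0.1 * rng.gauss(0.0, 1.0)
    s_next = max(-1.0, min(1.0, s_next))  # keep the state bounded
    reward = -s_next * s_next             # reward is highest near s = 0
    return s_next, reward


def run_single_timescale_ac(num_steps=2000, gamma=0.9, seed=0):
    rng = random.Random(seed)
    theta = 0.0          # actor parameter: policy mean is theta * s
    sigma = 0.5          # fixed exploration standard deviation
    w = [0.0, 0.0, 0.0]  # linear critic weights
    s = 1.0
    for t in range(num_steps):
        # Single timescale: both step sizes decay at the same rate,
        # so their ratio stays constant instead of going to zero.
        alpha = 0.05 / math.sqrt(t + 1)  # actor step size
        beta = 0.10 / math.sqrt(t + 1)   # critic step size

        # One Markovian sample: continue the same trajectory, no resets.
        a = rng.gauss(theta * s, sigma)
        s_next, r = env_step(s, a, rng)

        # TD error from the current linear critic.
        phi = features(s)
        delta = r + gamma * dot(w, features(s_next)) - dot(w, phi)

        # Critic: one TD(0) update.
        w = [wi + beta * delta * pi for wi, pi in zip(w, phi)]

        # Actor: one policy-gradient update using the TD error.
        # Score of the Gaussian policy with respect to theta.
        score = (a - theta * s) * s / (sigma * sigma)
        theta += alpha * delta * score

        s = s_next
    return theta, w


if __name__ == "__main__":
    theta, w = run_single_timescale_ac()
    print("actor parameter:", theta)
    print("critic weights:", w)
```

In a two-timescale variant the ratio `alpha / beta` would instead shrink toward zero, so that the critic effectively converges between actor updates. Keeping the ratio constant is what makes the actor and critic errors interact, which is the coupling the abstract says the analysis controls.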

Similar Items