MARC View :: Library Catalog

Bibliographic Details
Main Authors:	Li, Yunxiang; Yuan, Rui; Fan, Chen; Schmidt, Mark; Horváth, Samuel; Gower, Robert M.; Takáč, Martin
Format:	Preprint
Published:	2024
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2404.07525

_version_	1866914748936749056
author	Li, Yunxiang; Yuan, Rui; Fan, Chen; Schmidt, Mark; Horváth, Samuel; Gower, Robert M.; Takáč, Martin
author_facet	Li, Yunxiang; Yuan, Rui; Fan, Chen; Schmidt, Mark; Horváth, Samuel; Gower, Robert M.; Takáč, Martin
contents	Policy gradient is a widely utilized and foundational algorithm in the field of reinforcement learning (RL). Renowned for its convergence guarantees and stability compared to other RL algorithms, its practical application is often hindered by sensitivity to hyper-parameters, particularly the step-size. In this paper, we introduce the integration of the Polyak step-size in RL, which automatically adjusts the step-size without prior knowledge. To adapt this method to RL settings, we address several issues, including unknown f* in the Polyak step-size. Additionally, we showcase the performance of the Polyak step-size in RL through experiments, demonstrating faster convergence and the attainment of more stable policies.
format	Preprint
id	arxiv_https___arxiv_org_abs_2404_07525
institution	arXiv
publishDate	2024
record_format	arxiv
title	Enhancing Policy Gradient with the Polyak Step-Size Adaption
topic	Machine Learning
url	https://arxiv.org/abs/2404.07525
