Staff View: A priori Estimates for Deep Residual Network in Continuous-time Reinforcement Learning :: Library Catalog

Bibliographic Details
Main Authors:	Yin, Shuyu; Zhou, Qixuan; Wen, Fei; Luo, Tao
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2402.16899

_version_	1866911790426750976
author	Yin, Shuyu; Zhou, Qixuan; Wen, Fei; Luo, Tao
author_facet	Yin, Shuyu; Zhou, Qixuan; Wen, Fei; Luo, Tao
contents	Deep reinforcement learning excels in numerous large-scale practical applications. However, existing performance analyses ignore the unique characteristics of continuous-time control problems, cannot directly estimate the generalization error of the Bellman optimal loss, and require a boundedness assumption. Our work focuses on continuous-time control problems and proposes a method that is applicable to all such problems where the transition function satisfies semi-group and Lipschitz properties. Under this method, we can directly analyze the a priori generalization error of the Bellman optimal loss. The core of this method lies in two transformations of the loss function. To complete the transformation, we propose a decomposition method for the maximum operator. Additionally, this analysis method does not require a boundedness assumption. Finally, we obtain an a priori generalization error without the curse of dimensionality.
format	Preprint
id	arxiv_https___arxiv_org_abs_2402_16899
institution	arXiv
publishDate	2024
record_format	arxiv
title	A priori Estimates for Deep Residual Network in Continuous-time Reinforcement Learning
topic	Machine Learning; Artificial Intelligence
url	https://arxiv.org/abs/2402.16899

Similar Items