Bibliographic Details
Main Authors: Yin, Shuyu, Zhou, Qixuan, Wen, Fei, Luo, Tao
Format: Preprint
Published: 2024
Subjects: Machine Learning, Artificial Intelligence
Online Access: https://arxiv.org/abs/2402.16899
author Yin, Shuyu
Zhou, Qixuan
Wen, Fei
Luo, Tao
contents Deep reinforcement learning excels in numerous large-scale practical applications. However, existing performance analyses ignore the unique characteristics of continuous-time control problems, are unable to directly estimate the generalization error of the Bellman optimal loss, and require a boundedness assumption. Our work focuses on continuous-time control problems and proposes a method that is applicable to all such problems where the transition function satisfies semi-group and Lipschitz properties. Under this method, we can directly analyze the a priori generalization error of the Bellman optimal loss. The core of this method lies in two transformations of the loss function. To complete these transformations, we propose a decomposition method for the maximum operator. Additionally, this analysis method does not require a boundedness assumption. Finally, we obtain an a priori generalization error estimate without the curse of dimensionality.
format Preprint
id arxiv_https___arxiv_org_abs_2402_16899
institution arXiv
publishDate 2024
record_format arxiv
title A priori Estimates for Deep Residual Network in Continuous-time Reinforcement Learning
topic Machine Learning
Artificial Intelligence
url https://arxiv.org/abs/2402.16899