| Main Authors: | Yin, Shuyu; Zhou, Qixuan; Wen, Fei; Luo, Tao |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | Machine Learning; Artificial Intelligence |
| Online Access: | https://arxiv.org/abs/2402.16899 |
| _version_ | 1866911790426750976 |
|---|---|
| author | Yin, Shuyu Zhou, Qixuan Wen, Fei Luo, Tao |
| author_facet | Yin, Shuyu Zhou, Qixuan Wen, Fei Luo, Tao |
| contents | Deep reinforcement learning excels in numerous large-scale practical applications. However, existing performance analyses ignore the unique characteristics of continuous-time control problems, cannot directly estimate the generalization error of the Bellman optimal loss, and require a boundedness assumption. Our work focuses on continuous-time control problems and proposes a method that is applicable to all such problems in which the transition function satisfies semi-group and Lipschitz properties. Under this method, we can directly analyze the a priori generalization error of the Bellman optimal loss. The core of this method lies in two transformations of the loss function. To complete these transformations, we propose a decomposition method for the maximum operator. Additionally, this analysis method does not require a boundedness assumption. Finally, we obtain an a priori generalization error without the curse of dimensionality. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2402_16899 |
| institution | arXiv |
| publishDate | 2024 |
| record_format | arxiv |
| title | A priori Estimates for Deep Residual Network in Continuous-time Reinforcement Learning |
| topic | Machine Learning Artificial Intelligence |
| url | https://arxiv.org/abs/2402.16899 |