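Staff View: Robust policy iteration for continuous-time stochastic $H_\infty$ control problem with unknown dynamics :: Library Catalog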

Bibliographic Details
Main Authors:	Sun, Zhongshi; Jia, Guangyan
Format:	Preprint
Published:	2024
Subjects:	Optimization and Control
Online Access:	https://arxiv.org/abs/2402.04721

_version_	1866929522573574144
author	Sun, Zhongshi; Jia, Guangyan
author_facet	Sun, Zhongshi; Jia, Guangyan
contents	In this article, we study a continuous-time stochastic $H_\infty$ control problem using reinforcement learning (RL) techniques; the problem can be viewed as solving a stochastic linear-quadratic two-person zero-sum differential game (LQZSG). First, we propose an RL algorithm that iteratively solves the stochastic game algebraic Riccati equation from collected state and control data when all information about the system dynamics is unknown. In addition, the algorithm needs to collect data only once during the iteration process. Then, we discuss the robustness and convergence of the inner and outer loops of the policy iteration algorithm, respectively, and show that when the error of each iteration is within a certain range, the algorithm converges to a small neighborhood of the saddle point of the stochastic LQZSG problem. Finally, we apply the proposed RL algorithm to two simulation examples to verify its effectiveness.
format	Preprint
id	arxiv_https___arxiv_org_abs_2402_04721
institution	arXiv
publishDate	2024
record_format	arxiv
title	Robust policy iteration for continuous-time stochastic $H_\infty$ control problem with unknown dynamics
topic	Optimization and Control
url	https://arxiv.org/abs/2402.04721
