Bibliographic details
Main authors:	Liang, Zongxia; Luo, Xiaodong; Yu, Xiang
Format:	Preprint
Published:	2025
Subjects:	Optimization and Control; 93E20, 93B47, 49K45
Online access:	https://arxiv.org/abs/2506.22203

Table of contents:

We develop a continuous-time reinforcement learning framework for a class of singular stochastic control problems without entropy regularization. The optimal singular control is characterized by an optimal singular control law, which is a pair of regions in the space of time and augmented states. The goal of learning is to identify these optimal regions through a trial-and-error procedure. In this context, we generalize the existing policy evaluation theory for regular controls to learn the optimal singular control law, and we develop a policy improvement theorem based on region iteration. To facilitate a model-free policy iteration procedure, we further introduce the zero-order and first-order q-functions arising from singular control problems and establish a martingale characterization for the pair of q-functions together with the value function. Based on these theoretical findings, we devise several q-learning algorithms and present a numerical example based on simulation experiments.
