Bibliographic Details
Main Authors: Alsalti, Mohammad; Lopez, Victor G.; Müller, Matthias A.
Format: Preprint
Published: 2023
Subjects: Systems and Control
Online Access: https://arxiv.org/abs/2312.03451
author Alsalti, Mohammad
Lopez, Victor G.
Müller, Matthias A.
contents In this paper, we present a Q-learning algorithm to solve the optimal output regulation problem for discrete-time LTI systems. This off-policy algorithm relies only on persistently exciting input-output data measured offline. No model knowledge or state measurements are needed, and the resulting optimal policy uses only past input-output information. Moreover, our formulation makes the proposed algorithm computationally efficient. We provide conditions that guarantee convergence of the algorithm to the optimal solution. Finally, the performance of our method is compared with that of existing algorithms in the literature.
format Preprint
id arxiv_https___arxiv_org_abs_2312_03451
institution arXiv
publishDate 2023
record_format arxiv
title An efficient data-based off-policy Q-learning algorithm for optimal output feedback control of linear systems
topic Systems and Control
url https://arxiv.org/abs/2312.03451