Staff View: Online Bandit Nonlinear Control with Dynamic Batch Length and Adaptive Learning Rate :: Library Catalog

Bibliographic Details
Main Authors:	Kim, Jihun; Lavaei, Javad
Format:	Preprint
Published:	2024
Subjects:	Systems and Control 68W27, 93C10
Online Access:	https://arxiv.org/abs/2410.03230

_version_	1866914964569063424
author	Kim, Jihun; Lavaei, Javad
author_facet	Kim, Jihun; Lavaei, Javad
contents	This paper is concerned with online bandit nonlinear control, which aims to learn the best stabilizing controller from a pool of stabilizing and destabilizing controllers of unknown types for a given nonlinear dynamical system. We develop an algorithm, named Dynamic Batch length and Adaptive learning Rate (DBAR), and study its stability and regret. Unlike the existing Exp3 algorithm, which requires an exponentially stabilizing controller, DBAR needs only a significantly weaker notion of controller stability, in which case substantial time may be required to certify system stability. The dynamic batch length in DBAR effectively addresses this issue and enables the system to attain asymptotic stability, where the algorithm behaves as if there were no destabilizing controllers. Moreover, the adaptive learning rate in DBAR uses only the state norm information to achieve a tight regret bound even when none of the stabilizing controllers in the pool is exponentially stabilizing.
format	Preprint
id	arxiv_https___arxiv_org_abs_2410_03230
institution	arXiv
publishDate	2024
record_format	arxiv
title	Online Bandit Nonlinear Control with Dynamic Batch Length and Adaptive Learning Rate
topic	Systems and Control 68W27, 93C10
url	https://arxiv.org/abs/2410.03230

Similar Items