Bibliographic Details
Main Authors: Wang, Renzi; Acerbo, Flavia Sofia; Son, Tong Duy; Patrinos, Panagiotis
Format: Preprint
Published: 2024
Subjects: Machine Learning; Optimization and Control
Online Access: https://arxiv.org/abs/2411.08232
_version_ 1866929589245181952
author Wang, Renzi
Acerbo, Flavia Sofia
Son, Tong Duy
Patrinos, Panagiotis
contents This paper presents a novel approach to imitation learning from observations, where an autoregressive mixture of experts model is deployed to fit the underlying policy. The parameters of the model are learned via a two-stage framework. By leveraging the existing dynamics knowledge, the first stage of the framework estimates the control input sequences and hence reduces the problem complexity. At the second stage, the policy is learned by solving a regularized maximum-likelihood estimation problem using the estimated control input sequences. We further extend the learning procedure by incorporating a Lyapunov stability constraint to ensure asymptotic stability of the identified model, for accurate multi-step predictions. The effectiveness of the proposed framework is validated using two autonomous driving datasets collected from human demonstrations, demonstrating its practical applicability in modelling complex nonlinear dynamics.
format Preprint
id arxiv_https___arxiv_org_abs_2411_08232
institution arXiv
publishDate 2024
record_format arxiv
title Imitation Learning from Observations: An Autoregressive Mixture of Experts Approach
topic Machine Learning
Optimization and Control
url https://arxiv.org/abs/2411.08232