Staff View: Markov Balance Satisfaction Improves Performance in Strictly Batch Offline Imitation Learning :: Library Catalog


Bibliographic Details
Main Authors:	Agrawal, Rishabh; Dahlin, Nathan; Jain, Rahul; Nayyar, Ashutosh
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2408.09125

_version_	1866913470937563136
author	Agrawal, Rishabh; Dahlin, Nathan; Jain, Rahul; Nayyar, Ashutosh
author_facet	Agrawal, Rishabh; Dahlin, Nathan; Jain, Rahul; Nayyar, Ashutosh
contents	Imitation learning (IL) is notably effective for robotic tasks where directly programming behaviors or defining optimal control costs is challenging. In this work, we address a scenario where the imitator relies solely on observed behavior and cannot make environmental interactions during learning. It does not have additional supplementary datasets beyond the expert's dataset nor any information about the transition dynamics. Unlike state-of-the-art (SOTA) IL methods, this approach tackles the limitations of conventional IL by operating in a more constrained and realistic setting. Our method uses the Markov balance equation and introduces a novel conditional density estimation-based imitation learning framework. It employs conditional normalizing flows for transition dynamics estimation and aims at satisfying a balance equation for the environment. Through a series of numerical experiments on Classic Control and MuJoCo environments, we demonstrate consistently superior empirical performance compared to many SOTA IL algorithms.
format	Preprint
id	arxiv_https___arxiv_org_abs_2408_09125
institution	arXiv
publishDate	2024
record_format	arxiv
title	Markov Balance Satisfaction Improves Performance in Strictly Batch Offline Imitation Learning
topic	Machine Learning; Artificial Intelligence
url	https://arxiv.org/abs/2408.09125
