Bibliographic Details
Main Authors: Agrawal, Rishabh; Dahlin, Nathan; Jain, Rahul; Nayyar, Ashutosh
Format: Preprint
Published: 2024
Subjects: Machine Learning; Artificial Intelligence
Online Access: https://arxiv.org/abs/2408.09125
_version_ 1866913470937563136
author Agrawal, Rishabh
Dahlin, Nathan
Jain, Rahul
Nayyar, Ashutosh
author_facet Agrawal, Rishabh
Dahlin, Nathan
Jain, Rahul
Nayyar, Ashutosh
contents Imitation learning (IL) is notably effective for robotic tasks where directly programming behaviors or defining optimal control costs is challenging. In this work, we address a scenario where the imitator relies solely on observed behavior and cannot interact with the environment during learning. It has no supplementary datasets beyond the expert's dataset and no information about the transition dynamics. Unlike state-of-the-art (SOTA) IL methods, this approach tackles the limitations of conventional IL by operating in a more constrained and realistic setting. Our method uses the Markov balance equation and introduces a novel conditional density estimation-based imitation learning framework. It employs conditional normalizing flows to estimate the transition dynamics and aims to satisfy a balance equation for the environment. Through a series of numerical experiments on Classic Control and MuJoCo environments, we demonstrate consistently superior empirical performance compared to many SOTA IL algorithms.
format Preprint
id arxiv_https___arxiv_org_abs_2408_09125
institution arXiv
publishDate 2024
record_format arxiv
title Markov Balance Satisfaction Improves Performance in Strictly Batch Offline Imitation Learning
topic Machine Learning
Artificial Intelligence
url https://arxiv.org/abs/2408.09125