Bibliographic Details
Main Authors: Li, Yixiao, Barth, Julia, Kiefer, Thomas, Fraij, Ahmad
Format: Preprint
Published: 2025
Subjects:
Online Access: https://arxiv.org/abs/2510.07562
Table of Contents:
  • Multi-modal behavior cloning faces significant challenges from mode averaging and mode collapse, in which traditional models fail to capture diverse input-output mappings. This problem is critical in applications such as robotics, where modeling multiple valid actions supports both performance and safety. We propose EBGAN-MDN, a framework that integrates energy-based models, Mixture Density Networks (MDNs), and adversarial training. By leveraging a modified InfoNCE loss and an energy-enforced MDN loss, EBGAN-MDN addresses both challenges. Experiments on synthetic and robotic benchmarks demonstrate superior performance, establishing EBGAN-MDN as an effective and efficient solution for multi-modal learning tasks.
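
The mode-averaging problem named in the abstract can be illustrated with a small numerical sketch. This is not the EBGAN-MDN method; it only contrasts a single squared-error prediction with a fixed two-component Gaussian mixture, which is the kind of density an MDN outputs. The data, the helper `gaussian_mixture_nll`, and all parameter values are hypothetical and chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical demonstrations: for one input, two equally valid actions
# cluster near -1 and +1 (e.g. pass an obstacle on the left or the right).
actions = np.concatenate([
    rng.normal(-1.0, 0.05, 500),
    rng.normal(1.0, 0.05, 500),
])

# The best constant prediction under squared error is the sample mean,
# which falls between the two modes: an action nobody demonstrated.
mse_prediction = actions.mean()


def gaussian_mixture_nll(y, weights, means, stds):
    """Average negative log-likelihood of y under a 1-D Gaussian mixture."""
    y = y[:, None]  # shape (N, 1), broadcasts against the K components
    log_comp = (
        np.log(weights)
        - np.log(stds)
        - 0.5 * np.log(2.0 * np.pi)
        - 0.5 * ((y - means) / stds) ** 2
    )
    # Log-sum-exp over components for numerical stability.
    m = log_comp.max(axis=1, keepdims=True)
    log_prob = m[:, 0] + np.log(np.exp(log_comp - m).sum(axis=1))
    return -log_prob.mean()


# A single Gaussian fitted by moments versus a mixture with one
# component per mode (parameters set by hand, not learned).
single_nll = gaussian_mixture_nll(
    actions, np.array([1.0]), np.array([actions.mean()]), np.array([actions.std()])
)
mixture_nll = gaussian_mixture_nll(
    actions, np.array([0.5, 0.5]), np.array([-1.0, 1.0]), np.array([0.05, 0.05])
)

print(f"squared-error prediction: {mse_prediction:.3f}")
print(f"single Gaussian NLL: {single_nll:.3f}, mixture NLL: {mixture_nll:.3f}")
```

In this toy setting the squared-error prediction sits near zero, between the two demonstrated actions, while the mixture assigns much higher likelihood to the data by keeping both modes separate.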