Bibliographic Details
Main Authors: Ahn, Michael; Arenas, Montserrat Gonzalez; Bennice, Matthew; Brown, Noah; Chan, Christine; David, Byron; Francis, Anthony; Gonzalez, Gavin; Hessmer, Rainer; Jackson, Tomas; Joshi, Nikhil J; Lam, Daniel; Lee, Tsang-Wei Edward; Luong, Alex; Maddineni, Sharath; Patel, Harsh; Peralta, Jodilyn; Quiambao, Jornell; Reyes, Diego; Ruano, Rosario M Jauregui; Sadigh, Dorsa; Sanketi, Pannag; Takayama, Leila; Vodenski, Pavel; Xia, Fei
Format: Preprint
Published: 2024
Online Access: https://arxiv.org/abs/2405.16021
Table of Contents:
  • Robots today can exploit the rich world knowledge of large language models to chain simple behavioral skills into long-horizon tasks. However, robots often get interrupted during long-horizon tasks due to primitive skill failures and dynamic environments. We propose VADER, a plan, execute, detect framework with seeking help as a new skill that enables robots to recover and complete long-horizon tasks with the help of humans or other robots. VADER leverages visual question answering (VQA) modules to detect visual affordances and recognize execution errors. It then generates prompts for a language model planner (LMP), which decides when to seek help from another robot or a human to recover from errors in long-horizon task execution. We show the effectiveness of VADER with two long-horizon robotic tasks. Our pilot study showed that VADER is capable of performing complex long-horizon tasks by asking for help from another robot to clear a table. Our user study showed that VADER is capable of performing complex long-horizon tasks by asking for help from a human to clear a path. We gathered feedback from people (N=19) about the performance of VADER vs. a robot that did not ask for help. https://google-vader.github.io/
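The plan, execute, detect loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function `run_with_help`, the callbacks `execute`, `detect_error`, and `seek_help`, the `world` dictionary, and the skill names are all hypothetical. In particular, a simple state check stands in for the VQA-based error detection, and a fixed retry rule stands in for the language model planner's decision about when to seek help.

```python
from typing import Callable, Dict, List

def run_with_help(
    plan: List[str],
    execute: Callable[[str], None],
    detect_error: Callable[[str], bool],
    seek_help: Callable[[str], None],
    max_requests: int = 1,
) -> List[str]:
    """Run each skill in order; on a detected error, ask for help and retry."""
    log: List[str] = []
    for skill in plan:
        execute(skill)
        requests = 0
        while detect_error(skill):
            if requests >= max_requests:
                # Give up once the (hypothetical) help budget is spent.
                log.append(f"abort: {skill}")
                return log
            log.append(f"seek help: {skill}")
            seek_help(skill)
            requests += 1
            execute(skill)  # retry the same skill after help arrives
        log.append(f"done: {skill}")
    return log

# Toy world state (assumption): a blocked path prevents navigation.
world: Dict[str, bool] = {
    "path_blocked": True,
    "at_table": False,
    "holding_cup": False,
}

def execute(skill: str) -> None:
    # A skill only takes effect when its precondition holds.
    if skill == "go to table" and not world["path_blocked"]:
        world["at_table"] = True
    elif skill == "pick up cup" and world["at_table"]:
        world["holding_cup"] = True

def detect_error(skill: str) -> bool:
    # Stand-in for a VQA check: is the skill's expected outcome visible?
    expected = {"go to table": "at_table", "pick up cup": "holding_cup"}
    return not world[expected[skill]]

def seek_help(skill: str) -> None:
    # Stand-in for a human or another robot clearing the path.
    world["path_blocked"] = False

log = run_with_help(["go to table", "pick up cup"], execute, detect_error, seek_help)
print(log)
```

In this toy run the first skill fails because the path is blocked, help is requested once, and the plan then completes. Separating detection from the help decision mirrors the structure the abstract describes, with the decision rule here deliberately simplified.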