Bibliographic Details
Main Authors: Mon-Williams, Ruaridh, Li, Gen, Long, Ran, Du, Wenqian, Lucas, Chris
Format: Preprint
Published: 2024
Subjects: Robotics, Artificial Intelligence, Computation and Language, Machine Learning
Online Access: https://arxiv.org/abs/2406.11231
author Mon-Williams, Ruaridh
Li, Gen
Long, Ran
Du, Wenqian
Lucas, Chris
contents Completing complex tasks in unpredictable settings like home kitchens challenges robotic systems. These challenges include interpreting high-level human commands, such as "make me a hot beverage", and performing actions like pouring a precise amount of water into a moving mug. To address these challenges, we present a novel framework that combines Large Language Models (LLMs), a curated Knowledge Base, and Integrated Force and Visual Feedback (IFVF). Our approach interprets abstract instructions, performs long-horizon tasks, and handles various uncertainties. It utilises GPT-4 to analyse the user's query and surroundings, then generates code that accesses a curated database of functions during execution. It translates abstract instructions into actionable steps. Each step involves generating custom code by employing retrieval-augmented generalisation to pull IFVF-relevant examples from the Knowledge Base. IFVF allows the robot to respond to noise and disturbances during execution. We use coffee making and plate decoration to demonstrate our approach, including components ranging from pouring to drawer opening, each benefiting from distinct feedback types and methods. This novel advancement marks significant progress toward a scalable, efficient robotic framework for completing complex tasks in uncertain environments. Our findings are illustrated in an accompanying video and supported by an open-source GitHub repository (released upon paper acceptance).
format Preprint
id arxiv_https___arxiv_org_abs_2406_11231
institution arXiv
publishDate 2024
record_format arxiv
title Enabling robots to follow abstract instructions and complete complex dynamic tasks
topic Robotics
Artificial Intelligence
Computation and Language
Machine Learning
url https://arxiv.org/abs/2406.11231