Staff View: Human-Object Interaction from Human-Level Instructions :: Library Catalog


Bibliographic Details
Main Authors:	Wu, Zhen; Li, Jiaman; Xu, Pei; Liu, C. Karen
Format:	Preprint
Published:	2024
Subjects:	Artificial Intelligence; Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2406.17840

_version_	1866909745676288000
author	Wu, Zhen Li, Jiaman Xu, Pei Liu, C. Karen
author_facet	Wu, Zhen Li, Jiaman Xu, Pei Liu, C. Karen
contents	Intelligent agents must autonomously interact with the environments to perform daily tasks based on human-level instructions. They need a foundational understanding of the world to accurately interpret these instructions, along with precise low-level movement and interaction skills to execute the derived actions. In this work, we propose the first complete system for synthesizing physically plausible, long-horizon human-object interactions for object manipulation in contextual environments, driven by human-level instructions. We leverage large language models (LLMs) to interpret the input instructions into detailed execution plans. Unlike prior work, our system is capable of generating detailed finger-object interactions, in seamless coordination with full-body movements. We also train a policy to track generated motions in physics simulation via reinforcement learning (RL) to ensure physical plausibility of the motion. Our experiments demonstrate the effectiveness of our system in synthesizing realistic interactions with diverse objects in complex environments, highlighting its potential for real-world applications.
format	Preprint
id	arxiv_https___arxiv_org_abs_2406_17840
institution	arXiv
publishDate	2024
record_format	arxiv
title	Human-Object Interaction from Human-Level Instructions
topic	Artificial Intelligence Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2406.17840
