Bibliographic Details
Main Authors: Zhang, Hanxin, Dhafer, Abdulqader, Hao, Zhou Daniel, Dong, Hongbiao
Format: Preprint
Published: 2025
Subjects: Robotics, Machine Learning, I.2.9
Online Access: https://arxiv.org/abs/2503.03579
_version_ 1866910859702304768
author Zhang, Hanxin
Dhafer, Abdulqader
Hao, Zhou Daniel
Dong, Hongbiao
author_facet Zhang, Hanxin
Dhafer, Abdulqader
Hao, Zhou Daniel
Dong, Hongbiao
contents We propose a novel system for robot-to-human object handover that emulates human coworker interactions. Unlike most existing studies, which focus primarily on grasping strategies and motion planning, our system focuses on (1) inferring human handover intents and (2) imagining the spatial handover configuration. The first component integrates multimodal perception, combining visual and verbal cues, to infer human intent. The second uses a diffusion-based model to generate the handover configuration, which involves the spatial relationship among the robot's gripper, the object, and the human hand, thereby mimicking the cognitive process of motor imagery. Experimental results demonstrate that our approach effectively interprets human cues and achieves fluent, human-like handovers, offering a promising solution for collaborative robotics. Code, videos, and data are available at: https://i3handover.github.io.
format Preprint
id arxiv_https___arxiv_org_abs_2503_03579
institution arXiv
publishDate 2025
record_format arxiv
title A Generative System for Robot-to-Human Handovers: from Intent Inference to Spatial Configuration Imagery
topic Robotics
Machine Learning
I.2.9
url https://arxiv.org/abs/2503.03579