Bibliographic Details
Main Authors: Zhang, Yonghao, He, Qiang, Wan, Yanguang, Zhang, Yinda, Deng, Xiaoming, Ma, Cuixia, Wang, Hongan
Format: Preprint
Published: 2024
Online Access: https://arxiv.org/abs/2412.20657
contents Generating high-quality whole-body human-object interaction motion sequences is becoming increasingly important in fields such as animation, VR/AR, and robotics. The main challenge of this task lies in determining how much each hand should be involved, given objects of complex shape and varying size and their differing motion trajectories, while ensuring realistic grasping and coordinated movement across all body parts. In contrast to existing work, which either generates human interaction motion sequences without detailed hand grasping poses or models only a static grasping pose, we propose a simple yet effective framework that jointly models the relationship between the body, the hands, and the given object motion sequences within a single diffusion model. To guide our network in perceiving the object's spatial position and learning more natural grasping poses, we introduce novel contact-aware losses and incorporate carefully designed, data-driven guidance. Experimental results demonstrate that our approach outperforms the state-of-the-art method and generates plausible whole-body motion sequences.
id arxiv_https___arxiv_org_abs_2412_20657
institution arXiv
record_format arxiv
title Diffgrasp: Whole-Body Grasping Synthesis Guided by Object Motion Using a Diffusion Model
topic Computer Vision and Pattern Recognition