Staff View: Spatial-Related Sensors Matters: 3D Human Motion Reconstruction Assisted with Textual Semantics :: Library Catalog

Bibliographic Details
Main Authors:	Yang, Xueyuan; Yao, Chao; Ban, Xiaojuan
Format:	Preprint
Published:	2023
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence; Signal Processing
Online Access:	https://arxiv.org/abs/2401.05412

_version_	1866911754471079936
author	Yang, Xueyuan; Yao, Chao; Ban, Xiaojuan
author_facet	Yang, Xueyuan; Yao, Chao; Ban, Xiaojuan
contents	Leveraging wearable devices for motion reconstruction has emerged as an economical and viable technique. Certain methodologies employ sparse Inertial Measurement Units (IMUs) on the human body and harness data-driven strategies to model human poses. However, the reconstruction of motion based solely on sparse IMU data is inherently fraught with ambiguity, a consequence of numerous identical IMU readings corresponding to different poses. In this paper, we explore the spatial importance of multiple sensors, supervised by text that describes specific actions. Specifically, uncertainty is introduced to derive weighted features for each IMU. We also design a Hierarchical Temporal Transformer (HTT) and apply contrastive learning to achieve precise temporal and feature alignment of sensor data with textual semantics. Experimental results demonstrate that our proposed approach achieves significant improvements in multiple metrics compared to existing methods. Notably, with textual supervision, our method not only differentiates between ambiguous actions such as sitting and standing but also produces more precise and natural motion.
format	Preprint
id	arxiv_https___arxiv_org_abs_2401_05412
institution	arXiv
publishDate	2023
record_format	arxiv
title	Spatial-Related Sensors Matters: 3D Human Motion Reconstruction Assisted with Textual Semantics
topic	Computer Vision and Pattern Recognition; Artificial Intelligence; Signal Processing
url	https://arxiv.org/abs/2401.05412

Similar Items