Bibliographic Details
Main Authors: Xie, Huilong, Song, Wenwei, Kang, Wenxiong, Lin, Yihong
Format: Preprint
Published: 2024
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2407.05967
author Xie, Huilong
Song, Wenwei
Kang, Wenxiong
Lin, Yihong
contents Recent advancements in both transformer-based methods and spiral neighbor sampling techniques have greatly enhanced hand mesh reconstruction. Transformers excel in capturing complex vertex relationships, and spiral neighbor sampling is vital for utilizing topological structures. This paper ingeniously integrates spiral sampling into the Transformer architecture, enhancing its ability to leverage mesh topology for superior performance in hand mesh reconstruction, resulting in substantial accuracy boosts. STMR employs a single image encoder for model efficiency. To augment its information extraction capability, we design the multi-scale pose feature extraction (MSPFE) module, which facilitates the extraction of rich pose features, ultimately enhancing the model's performance. Moreover, the proposed predefined pose-to-vertex lifting (PPVL) method improves vertex feature representation, further boosting reconstruction performance. Extensive experiments on the FreiHAND dataset demonstrate the state-of-the-art performance and unparalleled inference speed of STMR compared with similar backbone methods, showcasing its efficiency and effectiveness. The code is available at https://github.com/SmallXieGithub/STMR.
format Preprint
id arxiv_https___arxiv_org_abs_2407_05967
institution arXiv
publishDate 2024
record_format arxiv
title STMR: Spiral Transformer for Hand Mesh Reconstruction
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2407.05967