Bibliographic Details
Main Authors: Ma, Yiyao, Chen, Kai, Zhou, Zhongxiang, Song, Zhuheng, Xie, Dongsheng, Tan, Zelong, Xiong, Rong, Dou, Qi
Format: Preprint
Published: 2026
Subjects:
Online Access: https://arxiv.org/abs/2605.29661
_version_ 1866911727155675136
author Ma, Yiyao
Chen, Kai
Zhou, Zhongxiang
Song, Zhuheng
Xie, Dongsheng
Tan, Zelong
Xiong, Rong
Dou, Qi
contents Monocular 3D shape recovery is fundamental to geometric understanding, yet achieving robust generalization across arbitrary viewpoints and unseen object categories remains a significant challenge. In this paper, we present a generalizable deformation learning framework that reconstructs 3D objects by explicitly deforming a category-level shape template to match the target observation. To address complex shape variations between the template and the target, we introduce a geometry-guided feature modeling mechanism. This process first enriches foundation features with template topology to yield a geometry-aware representation, which is then explicitly correlated with the target observation to guide precise deformation. Furthermore, to bridge the disparity between the fixed template and arbitrary target views, we propose a view-adaptive feature aggregation module. This module leverages multi-view template features and their corresponding camera poses to enrich the canonical template representation, ensuring robust feature alignment regardless of the target's perspective. Extensive experiments demonstrate that our approach significantly outperforms state-of-the-art methods in handling large shape variations and diverse viewpoints, exhibiting strong generalization to novel categories and effectively supporting downstream real-world dexterous robotic manipulation tasks. Project homepage: https://GODeform.github.io/
format Preprint
id arxiv_https___arxiv_org_abs_2605_29661
institution arXiv
publishDate 2026
record_format arxiv
title Geometry-Guided Modeling of Foundation Features Enables Generalizable Object Shape Deformation Learning
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2605.29661