Bibliographic Details
Main Authors: Liang, Yiming; Xu, Tianhan; Kikuchi, Yuta
Format: Preprint
Published: 2025
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2504.06210
author Liang, Yiming
Xu, Tianhan
Kikuchi, Yuta
contents We present Hierarchical Motion Representation (HiMoR), a novel deformation representation for 3D Gaussian primitives capable of achieving high-quality monocular dynamic 3D reconstruction. The insight behind HiMoR is that motions in everyday scenes can be decomposed into coarser motions that serve as the foundation for finer details. Using a tree structure, HiMoR's nodes represent different levels of motion detail, with shallower nodes modeling coarse motion for temporal smoothness and deeper nodes capturing finer motion. Additionally, our model uses a few shared motion bases to represent motions of different sets of nodes, aligning with the assumption that motion tends to be smooth and simple. This motion representation design provides Gaussians with a more structured deformation, maximizing the use of temporal relationships to tackle the challenging task of monocular dynamic 3D reconstruction. We also propose using a more reliable perceptual metric as an alternative, given that pixel-level metrics for evaluating monocular dynamic 3D reconstruction can sometimes fail to accurately reflect the true quality of reconstruction. Extensive experiments demonstrate our method's efficacy in achieving superior novel view synthesis from challenging monocular videos with complex motions.
format Preprint
id arxiv_https___arxiv_org_abs_2504_06210
institution arXiv
publishDate 2025
record_format arxiv
title HiMoR: Monocular Deformable Gaussian Reconstruction with Hierarchical Motion Representation
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2504.06210