Staff View: Sparc3D: Sparse Representation and Construction for High-Resolution 3D Shapes Modeling :: Library Catalog

Bibliographic Details
Main Authors:	Li, Zhihao; Wang, Yufei; Zheng, Heliang; Luo, Yihao; Wen, Bihan
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2505.14521

_version_	1866911001162547200
author	Li, Zhihao; Wang, Yufei; Zheng, Heliang; Luo, Yihao; Wen, Bihan
author_facet	Li, Zhihao; Wang, Yufei; Zheng, Heliang; Luo, Yihao; Wen, Bihan
contents	High-fidelity 3D object synthesis remains significantly more challenging than 2D image generation due to the unstructured nature of mesh data and the cubic complexity of dense volumetric grids. Existing two-stage pipelines, which compress meshes with a VAE (using either 2D or 3D supervision) and then sample with latent diffusion, often suffer from severe detail loss caused by inefficient representations and modality mismatches introduced in the VAE. We introduce Sparc3D, a unified framework that combines a sparse deformable marching cubes representation, Sparcubes, with a novel encoder, Sparconv-VAE. Sparcubes converts raw meshes into high-resolution ($1024^3$) surfaces with arbitrary topology by scattering signed distance and deformation fields onto a sparse cube, allowing differentiable optimization. Sparconv-VAE is the first modality-consistent variational autoencoder built entirely upon sparse convolutional networks, enabling efficient and near-lossless 3D reconstruction suitable for high-resolution generative modeling through latent diffusion. Sparc3D achieves state-of-the-art reconstruction fidelity on challenging inputs, including open surfaces, disconnected components, and intricate geometry. It preserves fine-grained shape details, reduces training and inference cost, and integrates naturally with latent diffusion models for scalable, high-resolution 3D generation.
format	Preprint
id	arxiv_https___arxiv_org_abs_2505_14521
institution	arXiv
publishDate	2025
record_format	arxiv
title	Sparc3D: Sparse Representation and Construction for High-Resolution 3D Shapes Modeling
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2505.14521

Similar Items