Bibliographic Details
Main authors: Hema, Vishnu Mani; Aich, Shubhra; Haene, Christian; Bazin, Jean-Charles; de la Torre, Fernando
Format: Preprint
Published: 2024
Subjects:
Online access: https://arxiv.org/abs/2410.09690
_version_ 1866917801833267200
author Hema, Vishnu Mani
Aich, Shubhra
Haene, Christian
Bazin, Jean-Charles
de la Torre, Fernando
contents The advancement in deep implicit modeling and articulated models has significantly enhanced the process of digitizing human figures in 3D from just a single image. While state-of-the-art methods have greatly improved geometric precision, the challenge of accurately inferring texture remains, particularly in obscured areas such as the back of a person in frontal-view images. This limitation in texture prediction largely stems from the scarcity of large-scale and diverse 3D datasets, whereas their 2D counterparts are abundant and easily accessible. To address this issue, our paper proposes leveraging extensive 2D fashion datasets to enhance both texture and shape prediction in 3D human digitization. We incorporate 2D priors from the fashion dataset to learn the occluded back view, refined with our proposed domain alignment strategy. We then fuse this information with the input image to obtain a fully textured mesh of the given person. Through extensive experimentation on standard 3D human benchmarks, we demonstrate the superior performance of our approach in terms of both texture and geometry. Code and dataset are available at https://github.com/humansensinglab/FAMOUS.
format Preprint
id arxiv_https___arxiv_org_abs_2410_09690
institution arXiv
publishDate 2024
record_format arxiv
title FAMOUS: High-Fidelity Monocular 3D Human Digitization Using View Synthesis
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2410.09690