Staff View :: Library Catalog

Bibliographic Details
Main Authors:	Shin, Jisu; Lee, Junmyeong; Lee, Seongmin; Park, Min-Gyu; Kang, Ju-Mi; Yoon, Ju Hong; Jeon, Hae-Gon
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2407.04345

_version_	1866909253275484160
author	Shin, Jisu; Lee, Junmyeong; Lee, Seongmin; Park, Min-Gyu; Kang, Ju-Mi; Yoon, Ju Hong; Jeon, Hae-Gon
author_facet	Shin, Jisu; Lee, Junmyeong; Lee, Seongmin; Park, Min-Gyu; Kang, Ju-Mi; Yoon, Ju Hong; Jeon, Hae-Gon
contents	We present a novel framework for reconstructing animatable human avatars from multiple images, termed CanonicalFusion. Our central concept involves integrating individual reconstruction results into the canonical space. To be specific, we first predict Linear Blend Skinning (LBS) weight maps and depth maps using a shared-encoder-dual-decoder network, enabling direct canonicalization of the 3D mesh from the predicted depth maps. Here, instead of predicting high-dimensional skinning weights, we infer compressed skinning weights, i.e., a 3-dimensional vector, with the aid of pre-trained MLP networks. We also introduce a forward skinning-based differentiable rendering scheme to merge the reconstructed results from multiple images. This scheme refines the initial mesh by reposing the canonical mesh via forward skinning and by minimizing photometric and geometric errors between the rendered and the predicted results. Our optimization scheme considers the position and color of vertices as well as the joint angles for each image, thereby mitigating the negative effects of pose errors. We conduct extensive experiments to demonstrate the effectiveness of our method and compare our CanonicalFusion with state-of-the-art methods. Our source code is available at https://github.com/jsshin98/CanonicalFusion.
format	Preprint
id	arxiv_https___arxiv_org_abs_2407_04345
institution	arXiv
publishDate	2024
record_format	arxiv
title	CanonicalFusion: Generating Drivable 3D Human Avatars from Multiple Images
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2407.04345
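
Note on forward skinning: the abstract describes reposing a canonical mesh via forward skinning. As a rough illustration only, the sketch below shows textbook Linear Blend Skinning, where each posed vertex is the weighted sum of that vertex moved by every joint transform. It is not taken from the CanonicalFusion repository; the function names (rot_z, apply_transform, forward_skin), the two-joint setup, and the hand-picked weights are all assumptions made for this example, and the paper's compressed 3-dimensional weights and MLP decoding are not modeled.

```python
import math

def rot_z(theta, pivot=(0.0, 0.0, 0.0)):
    """Hypothetical helper: 3x4 rigid transform rotating by theta about
    the z-axis through `pivot` (the pivot itself stays fixed)."""
    c, s = math.cos(theta), math.sin(theta)
    R = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    # Translation t = pivot - R @ pivot keeps the pivot in place.
    t = [pivot[i] - sum(R[i][j] * pivot[j] for j in range(3)) for i in range(3)]
    return [R[i] + [t[i]] for i in range(3)]

def apply_transform(T, v):
    """Apply a 3x4 transform [R | t] to a 3D point v."""
    return [sum(T[i][j] * v[j] for j in range(3)) + T[i][3] for i in range(3)]

def forward_skin(vertices, weights, transforms):
    """Standard LBS: v' = sum_k w_k * (T_k v), with one weight per joint.
    Weights for each vertex are assumed to sum to 1."""
    posed = []
    for v, w in zip(vertices, weights):
        out = [0.0, 0.0, 0.0]
        for w_k, T_k in zip(w, transforms):
            p = apply_transform(T_k, v)
            for i in range(3):
                out[i] += w_k * p[i]
        posed.append(out)
    return posed

# Assumed toy rig: joint 0 is fixed, joint 1 bends 90 degrees about (1, 0, 0).
identity = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
transforms = [identity, rot_z(math.pi / 2, pivot=(1.0, 0.0, 0.0))]

canonical_vertices = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [2.0, 0.0, 0.0]]
skin_weights = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

posed_vertices = forward_skin(canonical_vertices, skin_weights, transforms)
```

In this toy setup the first vertex follows only the fixed joint and does not move, the second follows only the bending joint, and the third is blended halfway between the two results. A differentiable version of the same sum is the kind of operation that would let vertex positions and joint angles be optimized against rendered images, as the abstract outlines.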
