Staff View: Learning Dynamic Tetrahedra for High-Quality Talking Head Synthesis :: Library Catalog

Bibliographic Details
Main Authors:	Zhang, Zicheng; Zheng, Ruobing; Liu, Ziwen; Han, Congying; Li, Tianqi; Wang, Meng; Guo, Tiande; Chen, Jingdong; Li, Bonan; Yang, Ming
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2402.17364

_version_	1866913245392011264
author	Zhang, Zicheng; Zheng, Ruobing; Liu, Ziwen; Han, Congying; Li, Tianqi; Wang, Meng; Guo, Tiande; Chen, Jingdong; Li, Bonan; Yang, Ming
author_facet	Zhang, Zicheng; Zheng, Ruobing; Liu, Ziwen; Han, Congying; Li, Tianqi; Wang, Meng; Guo, Tiande; Chen, Jingdong; Li, Bonan; Yang, Ming
contents	Recent works in implicit representations, such as Neural Radiance Fields (NeRF), have advanced the generation of realistic and animatable head avatars from video sequences. These implicit methods still suffer from visual artifacts and jitter, since the lack of explicit geometric constraints poses a fundamental challenge in accurately modeling complex facial deformations. In this paper, we introduce Dynamic Tetrahedra (DynTet), a novel hybrid representation that encodes explicit dynamic meshes with neural networks to ensure geometric consistency across various motions and viewpoints. DynTet is parameterized by coordinate-based networks that learn signed distance, deformation, and material texture, anchoring the training data into a predefined tetrahedra grid. Leveraging Marching Tetrahedra, DynTet efficiently decodes textured meshes with a consistent topology, enabling fast rendering through a differentiable rasterizer and supervision via a pixel loss. To enhance training efficiency, we incorporate classical 3D Morphable Models to facilitate geometry learning and define a canonical space for simplifying texture learning. These advantages are readily achievable owing to the effective geometric representation employed in DynTet. Compared with prior works, DynTet demonstrates significant improvements in fidelity, lip synchronization, and real-time performance according to various metrics. Beyond producing stable and visually appealing synthesis videos, our method also outputs dynamic meshes, which promise to enable many emerging applications.
format	Preprint
id	arxiv_https___arxiv_org_abs_2402_17364
institution	arXiv
publishDate	2024
record_format	arxiv
title	Learning Dynamic Tetrahedra for High-Quality Talking Head Synthesis
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2402.17364

Similar Items