Bibliographic Details
Main Authors: Zhang, Yuhui, Yu, Hui, Liang, Wei, Zhang, Sunjie
Format: Preprint
Published: 2026
Online Access: https://arxiv.org/abs/2601.18849
Abstract:
  • Dynamic Neural Radiance Fields (NeRF) have demonstrated considerable success in generating high-fidelity 3D models of talking portraits. Despite significant advances in rendering speed and generation quality, accurately and efficiently capturing mouth movements in talking portraits remains a challenge. To address this, we propose an automatic method based on blink embedding and hash grid landmarks encoding that can substantially enhance the fidelity of talking faces. Specifically, we encode facial features as conditional features and integrate audio features as residual terms into our model through a Dynamic Landmark Transformer. Furthermore, we employ neural radiance fields to model the entire face, resulting in a lifelike face representation. Experimental evaluations validate the superiority of our approach over existing methods.