:: Library Catalog

Bibliographic Details
Main Authors:	Liu, Haiyang; Hong, Xiaolin; Yang, Xuancheng; Ruan, Yudi; Lian, Xiang; Lingelbach, Michael; Yi, Hongwei; Li, Wei
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2507.18649

Similar Items

MagicInfinite: Generating Infinite Talking Videos with Your Words and Voice
by: Yi, Hongwei, et al.
Published: (2025)

DyStream: Streaming Dyadic Talking Heads Generation via Flow Matching-based Autoregressive Model
by: Chen, Bohong, et al.
Published: (2025)

Real-time One-Step Diffusion-based Expressive Portrait Videos Generation
by: Guo, Hanzhong, et al.
Published: (2024)

RSATalker: Realistic Socially-Aware Talking Head Generation for Multi-Turn Conversation
by: Chen, Peng, et al.
Published: (2026)

Human-Aware 3D Scene Generation with Spatially-constrained Diffusion Models
by: Hong, Xiaolin, et al.
Published: (2024)

DreamTalk: When Emotional Talking Head Generation Meets Diffusion Probabilistic Models
by: Ma, Yifeng, et al.
Published: (2023)

MagicDistillation: Weak-to-Strong Video Distillation for Large-Scale Few-Step Synthesis
by: Shao, Shitong, et al.
Published: (2025)

SoulX-FlashHead: Oracle-guided Generation of Infinite Real-time Streaming Talking Heads
by: Yu, Tan, et al.
Published: (2026)

Magic 1-For-1: Generating One Minute Video Clips within One Minute
by: Yi, Hongwei, et al.
Published: (2025)

Learning Online Scale Transformation for Talking Head Video Generation
by: Hong, Fa-Ting, et al.
Published: (2024)

EmoDiffTalk: Emotion-aware Diffusion for Editable 3D Gaussian Talking Head
by: Liu, Chang, et al.
Published: (2025)

OT-Talk: Animating 3D Talking Head with Optimal Transportation
by: Wang, Xinmu, et al.
Published: (2025)

GaussianTalker: Real-Time High-Fidelity Talking Head Synthesis with Audio-Driven 3D Gaussian Splatting
by: Cho, Kyusun, et al.
Published: (2024)

TalkCLIP: Talking Head Generation with Text-Guided Expressive Speaking Styles
by: Ma, Yifeng, et al.
Published: (2023)

AD-Reasoning: Multimodal Guideline-Guided Reasoning for Alzheimer's Disease Diagnosis
by: Chen, Qiuhui, et al.
Published: (2026)

Enhancing 3D Medical Image Understanding with Pretraining Aided by 2D Multimodal Large Language Models
by: Chen, Qiuhui, et al.
Published: (2025)

Jump Cut Smoothing for Talking Heads
by: Wang, Xiaojuan, et al.
Published: (2024)

Dual Audio-Centric Modality Coupling for Talking Head Generation
by: Fu, Ao, et al.
Published: (2025)

UniAvatar: Taming Lifelike Audio-Driven Talking Head Generation with Comprehensive Motion and Lighting Control
by: Sun, Wenzhang, et al.
Published: (2024)

Synergizing Motion and Appearance: Multi-Scale Compensatory Codebooks for Talking Head Video Generation
by: Zhao, Shuling, et al.
Published: (2024)

SafeRoPE: Risk-specific Head-wise Embedding Rotation for Safe Generation in Rectified Flow Transformers
by: Yang, Xiang, et al.
Published: (2026)

SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis
by: Peng, Ziqiao, et al.
Published: (2023)

Splat-Portrait: Generalizing Talking Heads with Gaussian Splatting
by: Shi, Tong, et al.
Published: (2026)

THEval. Evaluation Framework for Talking Head Video Generation
by: Quignon, Nabyl, et al.
Published: (2025)

VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time
by: Xu, Sicheng, et al.
Published: (2024)

FD2Talk: Towards Generalized Talking Head Generation with Facial Decoupled Diffusion Model
by: Yao, Ziyu, et al.
Published: (2024)

FixTalk: Taming Identity Leakage for High-Quality Talking Head Generation in Extreme Cases
by: Tan, Shuai, et al.
Published: (2025)

ConsistTalk: Intensity Controllable Temporally Consistent Talking Head Generation with Diffusion Noise Search
by: Liu, Zhenjie, et al.
Published: (2025)

MoCoTalk: Multi-Conditional Diffusion with Adaptive Router for Controllable Talking Head Generation
by: Ye, Xinyan, et al.
Published: (2026)

RTGen: Real-Time Generative Detection Transformer
by: Ruan, Chi, et al.
Published: (2025)

TalkVid: A Large-Scale Diversified Dataset for Audio-Driven Talking Head Synthesis
by: Chen, Shunian, et al.
Published: (2025)

Talking Head Generation via AU-Guided Landmark Prediction
by: Chang, Shao-Yu, et al.
Published: (2025)

SyncAnimation: A Real-Time End-to-End Framework for Audio-Driven Human Pose and Talking Head Animation
by: Liu, Yujian, et al.
Published: (2025)

TalkingHeadBench: A Multi-Modal Benchmark & Analysis of Talking-Head DeepFake Detection
by: Xiong, Xinqi, et al.
Published: (2025)

GaussianHeadTalk: Wobble-Free 3D Talking Heads with Audio Driven Gaussian Splatting
by: Agarwal, Madhav, et al.
Published: (2025)

IF-MDM: Implicit Face Motion Diffusion Model for High-Fidelity Realtime Talking Head Generation
by: Yang, Sejong, et al.
Published: (2024)

EmoTalkingGaussian: Continuous Emotion-conditioned Talking Head Synthesis
by: Cha, Junuk, et al.
Published: (2025)

FreeTalk: Emotional Topology-Free 3D Talking Heads
by: Nocentini, Federico, et al.
Published: (2026)

ScanTalk: 3D Talking Heads from Unregistered Scans
by: Nocentini, Federico, et al.
Published: (2024)

Real-Time Generation of Streamable Talking Portrait Video with Reference-Guided Deep Compression VAEs
by: Xu, Sicheng, et al.
Published: (2026)