| Main Authors: | Liu, Haiyang, Hong, Xiaolin, Yang, Xuancheng, Ruan, Yudi, Lian, Xiang, Lingelbach, Michael, Yi, Hongwei, Li, Wei |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2507.18649 |
Similar Items
MagicInfinite: Generating Infinite Talking Videos with Your Words and Voice
by: Yi, Hongwei, et al.
Published: (2025)
DyStream: Streaming Dyadic Talking Heads Generation via Flow Matching-based Autoregressive Model
by: Chen, Bohong, et al.
Published: (2025)
Real-time One-Step Diffusion-based Expressive Portrait Videos Generation
by: Guo, Hanzhong, et al.
Published: (2024)
RSATalker: Realistic Socially-Aware Talking Head Generation for Multi-Turn Conversation
by: Chen, Peng, et al.
Published: (2026)
Human-Aware 3D Scene Generation with Spatially-constrained Diffusion Models
by: Hong, Xiaolin, et al.
Published: (2024)
DreamTalk: When Emotional Talking Head Generation Meets Diffusion Probabilistic Models
by: Ma, Yifeng, et al.
Published: (2023)
MagicDistillation: Weak-to-Strong Video Distillation for Large-Scale Few-Step Synthesis
by: Shao, Shitong, et al.
Published: (2025)
SoulX-FlashHead: Oracle-guided Generation of Infinite Real-time Streaming Talking Heads
by: Yu, Tan, et al.
Published: (2026)
Magic 1-For-1: Generating One Minute Video Clips within One Minute
by: Yi, Hongwei, et al.
Published: (2025)
Learning Online Scale Transformation for Talking Head Video Generation
by: Hong, Fa-Ting, et al.
Published: (2024)
EmoDiffTalk: Emotion-aware Diffusion for Editable 3D Gaussian Talking Head
by: Liu, Chang, et al.
Published: (2025)
OT-Talk: Animating 3D Talking Head with Optimal Transportation
by: Wang, Xinmu, et al.
Published: (2025)
GaussianTalker: Real-Time High-Fidelity Talking Head Synthesis with Audio-Driven 3D Gaussian Splatting
by: Cho, Kyusun, et al.
Published: (2024)
TalkCLIP: Talking Head Generation with Text-Guided Expressive Speaking Styles
by: Ma, Yifeng, et al.
Published: (2023)
AD-Reasoning: Multimodal Guideline-Guided Reasoning for Alzheimer's Disease Diagnosis
by: Chen, Qiuhui, et al.
Published: (2026)
Enhancing 3D Medical Image Understanding with Pretraining Aided by 2D Multimodal Large Language Models
by: Chen, Qiuhui, et al.
Published: (2025)
Jump Cut Smoothing for Talking Heads
by: Wang, Xiaojuan, et al.
Published: (2024)
Dual Audio-Centric Modality Coupling for Talking Head Generation
by: Fu, Ao, et al.
Published: (2025)
UniAvatar: Taming Lifelike Audio-Driven Talking Head Generation with Comprehensive Motion and Lighting Control
by: Sun, Wenzhang, et al.
Published: (2024)
Synergizing Motion and Appearance: Multi-Scale Compensatory Codebooks for Talking Head Video Generation
by: Zhao, Shuling, et al.
Published: (2024)
SafeRoPE: Risk-specific Head-wise Embedding Rotation for Safe Generation in Rectified Flow Transformers
by: Yang, Xiang, et al.
Published: (2026)
SyncTalk: The Devil is in the Synchronization for Talking Head Synthesis
by: Peng, Ziqiao, et al.
Published: (2023)
Splat-Portrait: Generalizing Talking Heads with Gaussian Splatting
by: Shi, Tong, et al.
Published: (2026)
THEval. Evaluation Framework for Talking Head Video Generation
by: Quignon, Nabyl, et al.
Published: (2025)
VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time
by: Xu, Sicheng, et al.
Published: (2024)
FD2Talk: Towards Generalized Talking Head Generation with Facial Decoupled Diffusion Model
by: Yao, Ziyu, et al.
Published: (2024)
FixTalk: Taming Identity Leakage for High-Quality Talking Head Generation in Extreme Cases
by: Tan, Shuai, et al.
Published: (2025)
ConsistTalk: Intensity Controllable Temporally Consistent Talking Head Generation with Diffusion Noise Search
by: Liu, Zhenjie, et al.
Published: (2025)
MoCoTalk: Multi-Conditional Diffusion with Adaptive Router for Controllable Talking Head Generation
by: Ye, Xinyan, et al.
Published: (2026)
RTGen: Real-Time Generative Detection Transformer
by: Ruan, Chi, et al.
Published: (2025)
TalkVid: A Large-Scale Diversified Dataset for Audio-Driven Talking Head Synthesis
by: Chen, Shunian, et al.
Published: (2025)
Talking Head Generation via AU-Guided Landmark Prediction
by: Chang, Shao-Yu, et al.
Published: (2025)
SyncAnimation: A Real-Time End-to-End Framework for Audio-Driven Human Pose and Talking Head Animation
by: Liu, Yujian, et al.
Published: (2025)
TalkingHeadBench: A Multi-Modal Benchmark & Analysis of Talking-Head DeepFake Detection
by: Xiong, Xinqi, et al.
Published: (2025)
GaussianHeadTalk: Wobble-Free 3D Talking Heads with Audio Driven Gaussian Splatting
by: Agarwal, Madhav, et al.
Published: (2025)
IF-MDM: Implicit Face Motion Diffusion Model for High-Fidelity Realtime Talking Head Generation
by: Yang, Sejong, et al.
Published: (2024)
EmoTalkingGaussian: Continuous Emotion-conditioned Talking Head Synthesis
by: Cha, Junuk, et al.
Published: (2025)
FreeTalk: Emotional Topology-Free 3D Talking Heads
by: Nocentini, Federico, et al.
Published: (2026)
ScanTalk: 3D Talking Heads from Unregistered Scans
by: Nocentini, Federico, et al.
Published: (2024)
Real-Time Generation of Streamable Talking Portrait Video with Reference-Guided Deep Compression VAEs
by: Xu, Sicheng, et al.
Published: (2026)