| Main Authors: | Hogue, Steven, Zhang, Chenxu, Tian, Yapeng, Guo, Xiaohu |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2412.14333 |
Similar Items
DiffTED: One-shot Audio-driven TED Talk Video Generation with Diffusion-based Co-speech Gestures
by: Hogue, Steven, et al.
Published: (2024)
Robust Active Speaker Detection in Noisy Environments
by: Vasireddy, Siva Sai Nagender, et al.
Published: (2024)
EMAGE: Towards Unified Holistic Co-Speech Gesture Generation via Expressive Masked Audio Gesture Modeling
by: Liu, Haiyang, et al.
Published: (2023)
MEMO: Memory-Guided Diffusion for Expressive Talking Video Generation
by: Zheng, Longtao, et al.
Published: (2024)
TalkingPose: Efficient Face and Gesture Animation with Feedback-guided Diffusion Model
by: Javanmardi, Alireza, et al.
Published: (2025)
IP-Adapter Is All You Need: Towards Fine-Tuning-Free Diffusion-Based Talking Face Generation
by: Wu, Hao, et al.
Published: (2026)
AVI-Talking: Learning Audio-Visual Instructions for Expressive 3D Talking Face Generation
by: Sun, Yasheng, et al.
Published: (2024)
HoloGest: Decoupled Diffusion and Motion Priors for Generating Holisticly Expressive Co-speech Gestures
by: Cheng, Yongkang, et al.
Published: (2025)
LiveGesture Streamable Co-Speech Gesture Generation Model
by: Saleem, Muhammad Usama, et al.
Published: (2026)
Co$^{3}$Gesture: Towards Coherent Concurrent Co-speech 3D Gesture Generation with Interactive Diffusion
by: Qi, Xingqun, et al.
Published: (2025)
DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder
by: Du, Chenpeng, et al.
Published: (2023)
AV-DiT: Efficient Audio-Visual Diffusion Transformer for Joint Audio and Video Generation
by: Wang, Kai, et al.
Published: (2024)
Dimitra: Audio-driven Diffusion model for Expressive Talking Head Generation
by: Chopin, Baptiste, et al.
Published: (2025)
MDT-A2G: Exploring Masked Diffusion Transformers for Co-Speech Gesture Generation
by: Mao, Xiaofeng, et al.
Published: (2024)
Contextual Gesture: Co-Speech Gesture Video Generation through Context-aware Gesture Representation
by: Liu, Pinxin, et al.
Published: (2025)
PersonaGesture: Single-Reference Co-Speech Gesture Personalization for Unseen Speakers
by: Zhang, Xiangyue, et al.
Published: (2026)
EmotiveTalk: Expressive Talking Head Generation through Audio Information Decoupling and Emotional Video Diffusion
by: Wang, Haotian, et al.
Published: (2024)
Co-Speech Gesture Video Generation via Motion-Decoupled Diffusion Model
by: He, Xu, et al.
Published: (2024)
Recognizing Co-Speech Gestures in-the-Wild
by: Hegde, Sindhu B, et al.
Published: (2026)
Context-aware Talking Face Video Generation
by: Xuanyuan, Meidai, et al.
Published: (2024)
CoCoGesture: Toward Coherent Co-speech 3D Gesture Generation in the Wild
by: Qi, Xingqun, et al.
Published: (2024)
DisentTalk: Cross-lingual Talking Face Generation via Semantic Disentangled Diffusion Model
by: Liu, Kangwei, et al.
Published: (2025)
CoordSpeaker: Exploiting Gesture Captioning for Coordinated Caption-Empowered Co-Speech Gesture Generation
by: Fang, Fengyi, et al.
Published: (2025)
Conveying Meaning through Gestures: An Investigation into Semantic Co-Speech Gesture Generation
by: Voss, Hendric, et al.
Published: (2025)
Streaming Generation of Co-Speech Gestures via Accelerated Rolling Diffusion
by: Vu, Evgeniia, et al.
Published: (2025)
ConvoFusion: Multi-Modal Conversational Diffusion for Co-Speech Gesture Synthesis
by: Mughal, Muhammad Hamza, et al.
Published: (2024)
TalkCLIP: Talking Head Generation with Text-Guided Expressive Speaking Styles
by: Ma, Yifeng, et al.
Published: (2023)
Long-Term TalkingFace Generation via Motion-Prior Conditional Diffusion Model
by: Shen, Fei, et al.
Published: (2025)
GestureLSM: Latent Shortcut based Co-Speech Gesture Generation with Spatial-Temporal Modeling
by: Liu, Pinxin, et al.
Published: (2025)
DuoGesture: Neuro-Inspired and Biomechanically Informed Dual-Stream Co-Speech Gesture Generation
by: Paar, Ferdinand, et al.
Published: (2026)
JEAN: Joint Expression and Audio-guided NeRF-based Talking Face Generation
by: Chakkera, Sai Tanmay Reddy, et al.
Published: (2024)
Audio-Visual Speech Representation Expert for Enhanced Talking Face Video Generation and Evaluation
by: Yaman, Dogucan, et al.
Published: (2024)
Audio-Driven Talking Face Video Generation with Joint Uncertainty Learning
by: Xie, Yifan, et al.
Published: (2025)
Faces that Speak: Jointly Synthesising Talking Face and Speech from Text
by: Jang, Youngjoon, et al.
Published: (2024)
SemGes: Semantics-aware Co-Speech Gesture Generation using Semantic Coherence and Relevance Learning
by: Liu, Lanmiao, et al.
Published: (2025)
Taming Transformer for Emotion-Controllable Talking Face Generation
by: Zhang, Ziqi, et al.
Published: (2025)
A Unit-based System and Dataset for Expressive Direct Speech-to-Speech Translation
by: Min, Anna, et al.
Published: (2025)
Face Reconstruction from Face Embeddings using Adapter to a Face Foundation Model
by: Shahreza, Hatef Otroshi, et al.
Published: (2024)
Democratizing High-Fidelity Co-Speech Gesture Video Generation
by: Yang, Xu, et al.
Published: (2025)
EmoFace: Emotion-Content Disentangled Speech-Driven 3D Talking Face Animation
by: Lin, Yihong, et al.
Published: (2024)