| Main Authors: | Song, Yafei, Zhang, Peng, Zhang, Bang |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2512.02576 |
Similar Items
CoCoGesture: Toward Coherent Co-speech 3D Gesture Generation in the Wild
by: Qi, Xingqun, et al.
Published: (2024)
Co$^{3}$Gesture: Towards Coherent Concurrent Co-speech 3D Gesture Generation with Interactive Diffusion
by: Qi, Xingqun, et al.
Published: (2025)
Motion-example-controlled Co-speech Gesture Generation Leveraging Large Language Models
by: Chen, Bohong, et al.
Published: (2025)
HoloGest: Decoupled Diffusion and Motion Priors for Generating Holisticly Expressive Co-speech Gestures
by: Cheng, Yongkang, et al.
Published: (2025)
Understanding Co-speech Gestures in-the-wild
by: Hegde, Sindhu B, et al.
Published: (2025)
Self-Supervised Learning of Deviation in Latent Representation for Co-speech Gesture Video Generation
by: Yang, Huan, et al.
Published: (2024)
Co-Speech Gesture Video Generation via Motion-Decoupled Diffusion Model
by: He, Xu, et al.
Published: (2024)
DiffTED: One-shot Audio-driven TED Talk Video Generation with Diffusion-based Co-speech Gestures
by: Hogue, Steven, et al.
Published: (2024)
Weakly-Supervised Emotion Transition Learning for Diverse 3D Co-speech Gesture Generation
by: Qi, Xingqun, et al.
Published: (2023)
Contextual Gesture: Co-Speech Gesture Video Generation through Context-aware Gesture Representation
by: Liu, Pinxin, et al.
Published: (2025)
MMGT: Motion Mask Guided Two-Stage Network for Co-Speech Gesture Video Generation
by: Wang, Siyuan, et al.
Published: (2025)
SemTalk: Holistic Co-speech Motion Generation with Frame-level Semantic Emphasis
by: Zhang, Xiangyue, et al.
Published: (2024)
ChatAnyone: Stylized Real-time Portrait Video Generation with Hierarchical Motion Diffusion Model
by: Qi, Jinwei, et al.
Published: (2025)
MotionRAG-Diff: A Retrieval-Augmented Diffusion Framework for Long-Term Music-to-Dance Generation
by: Huang, Mingyang, et al.
Published: (2025)
TANGO: Co-Speech Gesture Video Reenactment with Hierarchical Audio Motion Embedding and Diffusion Interpolation
by: Liu, Haiyang, et al.
Published: (2024)
EMAGE: Towards Unified Holistic Co-Speech Gesture Generation via Expressive Masked Audio Gesture Modeling
by: Liu, Haiyang, et al.
Published: (2023)
Exploring Timeline Control for Facial Motion Generation
by: Ma, Yifeng, et al.
Published: (2025)
LiveGesture Streamable Co-Speech Gesture Generation Model
by: Saleem, Muhammad Usama, et al.
Published: (2026)
GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning
by: Lv, Jiaxi, et al.
Published: (2023)
GestureLSM: Latent Shortcut based Co-Speech Gesture Generation with Spatial-Temporal Modeling
by: Liu, Pinxin, et al.
Published: (2025)
MotionRAG: Motion Retrieval-Augmented Image-to-Video Generation
by: Zhu, Chenhui, et al.
Published: (2025)
PersonaGesture: Single-Reference Co-Speech Gesture Personalization for Unseen Speakers
by: Zhang, Xiangyue, et al.
Published: (2026)
Democratizing High-Fidelity Co-Speech Gesture Video Generation
by: Yang, Xu, et al.
Published: (2025)
Prompt-to-Gesture: Measuring the Capabilities of Image-to-Video Deictic Gesture Generation
by: Ali, Hassan, et al.
Published: (2026)
Dual-task Mutual Reinforcing Embedded Joint Video Paragraph Retrieval and Grounding
by: Wang, Mengzhao, et al.
Published: (2024)
T2VIndexer: A Generative Video Indexer for Efficient Text-Video Retrieval
by: Li, Yili, et al.
Published: (2024)
ExGes: Expressive Human Motion Retrieval and Modulation for Audio-Driven Gesture Synthesis
by: Zhou, Xukun, et al.
Published: (2025)
EasyGenNet: An Efficient Framework for Audio-Driven Gesture Video Generation Based on Diffusion Model
by: Li, Renda, et al.
Published: (2025)
GeCo: Evaluating Geometric Consistency for Video Generation via Motion and Structure
by: Gu, Leslie, et al.
Published: (2025)
CoordSpeaker: Exploiting Gesture Captioning for Coordinated Caption-Empowered Co-Speech Gesture Generation
by: Fang, Fengyi, et al.
Published: (2025)
Joint Co-Speech Gesture and Expressive Talking Face Generation using Diffusion with Adapters
by: Hogue, Steven, et al.
Published: (2024)
RAGME: Retrieval Augmented Video Generation for Enhanced Motion Realism
by: Peruzzo, Elia, et al.
Published: (2025)
PersonaGest: Personalized Co-Speech Gesture Generation with Semantic-Guided Hierarchical Motion Representation
by: Zhao, Junchuan, et al.
Published: (2026)
Controllable and Expressive One-Shot Video Head Swapping
by: Ji, Chaonan, et al.
Published: (2025)
Video Motion Graphs
by: Liu, Haiyang, et al.
Published: (2025)
Cosh-DiT: Co-Speech Gesture Video Synthesis via Hybrid Audio-Visual Diffusion Transformers
by: Sun, Yasheng, et al.
Published: (2025)
Conveying Meaning through Gestures: An Investigation into Semantic Co-Speech Gesture Generation
by: Voss, Hendric, et al.
Published: (2025)
Vgent: Graph-based Retrieval-Reasoning-Augmented Generation For Long Video Understanding
by: Shen, Xiaoqian, et al.
Published: (2025)
InteracTalker: Prompt-Based Human-Object Interaction with Co-Speech Gesture Generation
by: Rajan, Sreehari, et al.
Published: (2025)
Wan-S2V: Audio-Driven Cinematic Video Generation
by: Gao, Xin, et al.
Published: (2025)