| Main authors: | Yu, Haojie; Wang, Zhaonian; Pan, Yihan; Cheng, Meng; Yang, Hao; Wang, Chao; Xie, Tao; Xu, Xiaoming; Wei, Xiaoming; Cai, Xunliang |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online access: | https://arxiv.org/abs/2506.05806 |
Similar items
Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation
by: Kong, Zhe, et al.
Published: (2025)
Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency
by: Jiang, Jianwen, et al.
Published: (2024)
InfiniteTalk: Audio-driven Video Generation for Sparse-Frame Video Dubbing
by: Yang, Shaoshu, et al.
Published: (2025)
LongCat-Video-Avatar 1.5 Technical Report
by: Meituan LongCat Team, et al.
Published: (2026)
FantasyTalking2: Timestep-Layer Adaptive Preference Optimization for Audio-Driven Portrait Animation
by: Wang, MengChao, et al.
Published: (2025)
StableAvatar: Infinite-Length Audio-Driven Avatar Video Generation
by: Tu, Shuyuan, et al.
Published: (2025)
Teller: Real-Time Streaming Audio-Driven Portrait Animation with Autoregressive Motion Generation
by: Zhen, Dingcheng, et al.
Published: (2025)
FlowPortrait: Reinforcement Learning for Audio-Driven Portrait Video Generation
by: Tan, Weiting, et al.
Published: (2026)
TurboTalk: Progressive Distillation for One-Step Audio-Driven Talking Avatar Generation
by: Liu, Xiangyu, et al.
Published: (2026)
Knot Forcing: Taming Autoregressive Video Diffusion Models for Real-time Infinite Interactive Portrait Animation
by: Xiao, Steven, et al.
Published: (2025)
MobilePortrait: Real-Time One-Shot Neural Head Avatars on Mobile Devices
by: Jiang, Jianwen, et al.
Published: (2024)
AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
by: Wei, Huawei, et al.
Published: (2024)
RAP: Real-time Audio-driven Portrait Animation with Video Diffusion Transformer
by: Du, Fangyu, et al.
Published: (2025)
GMTalker: Gaussian Mixture-based Audio-Driven Emotional Talking Video Portraits
by: Xia, Yibo, et al.
Published: (2023)
StreamAvatar: Streaming Diffusion Models for Real-Time Interactive Human Avatars
by: Sun, Zhiyao, et al.
Published: (2025)
Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length
by: Huang, Yubo, et al.
Published: (2025)
EMO: Emote Portrait Alive -- Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions
by: Tian, Linrui, et al.
Published: (2024)
JoyStreamer-Flash: Real-time and Infinite Audio-Driven Avatar Generation with Autoregressive Diffusion
by: Li, Chaochao, et al.
Published: (2025)
SkyReels-Audio: Omni Audio-Conditioned Talking Portraits in Video Diffusion Transformers
by: Fei, Zhengcong, et al.
Published: (2025)
ALIVE: An Avatar-Lecture Interactive Video Engine with Content-Aware Retrieval for Real-Time Interaction
by: Islam, Md Zabirul, et al.
Published: (2025)
TalkingMachines: Real-Time Audio-Driven FaceTime-Style Video via Autoregressive Diffusion Models
by: Low, Chetwin, et al.
Published: (2025)
ECHO: Towards Emotionally Appropriate and Contextually Aware Interactive Head Generation
by: Kong, Xiangyu, et al.
Published: (2026)
OmniAvatar: Efficient Audio-Driven Avatar Video Generation with Adaptive Body Animation
by: Gan, Qijun, et al.
Published: (2025)
EMO2: End-Effector Guided Audio-Driven Avatar Video Generation
by: Tian, Linrui, et al.
Published: (2025)
SoulX-FlashTalk: Real-Time Infinite Streaming of Audio-Driven Avatars via Self-Correcting Bidirectional Distillation
by: Shen, Le, et al.
Published: (2025)
U-Mind: A Unified Framework for Real-Time Multimodal Interaction with Audiovisual Generation
by: Deng, Xiang, et al.
Published: (2026)
A Unit Enhancement and Guidance Framework for Audio-Driven Avatar Video Generation
by: Zhou, S. Z., et al.
Published: (2025)
MVP4D: Multi-View Portrait Video Diffusion for Animatable 4D Avatars
by: Taubner, Felix, et al.
Published: (2025)
Active Intelligence in Video Avatars via Closed-loop World Modeling
by: He, Xuanhua, et al.
Published: (2025)
JoyVASA: Portrait and Animal Image Animation with Diffusion-Based Audio-Driven Facial Dynamics and Head Motion Generation
by: Cao, Xuyang, et al.
Published: (2024)
Few-shot Semantic Encoding and Decoding for Video Surveillance
by: Cheng, Baoping, et al.
Published: (2025)
Refined Geometry-guided Head Avatar Reconstruction from Monocular RGB Video
by: Park, Pilseo, et al.
Published: (2025)
Nods of Agreement: Webcam-Driven Avatars Improve Meeting Outcomes and Avatar Satisfaction Over Audio-Driven or Static Avatars in All-Avatar Work Videoconferencing
by: Ma, Fang, et al.
Published: (2024)
The Latency Wall: Benchmarking Off-the-Shelf Emotion Recognition for Real-Time Virtual Avatars
by: Benyamin, Yarin
Published: (2026)
Towards Practical Real-Time Low-Latency Music Source Separation
by: Wu, Junyu, et al.
Published: (2025)
Comparative Analysis of Audio Feature Extraction for Real-Time Talking Portrait Synthesis
by: Salehi, Pegah, et al.
Published: (2024)
Stable Video-Driven Portraits
by: R., Mallikarjun B., et al.
Published: (2025)
Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation
by: Ki, Taekyung, et al.
Published: (2026)
FPGA‐Based Low‐Latency Semantic Feedback System for Real‐Time Instrument Localization in Telesurgery
by: Zhikang Ma, et al.
Published: (2026)
Combining Generative and Geometry Priors for Wide-Angle Portrait Correction
by: Yao, Lan, et al.
Published: (2024)