:: Library Catalog

Bibliographic Details
Main Author:	Redondo, Rafael
Format:	Preprint
Published:	2024
Subjects:	Sound; Graphics; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2406.16155

Similar Items

Combining Genre Classification and Harmonic-Percussive Features with Diffusion Models for Music-Video Generation
by: Pina, Leonardo, et al.
Published: (2024)

ELGAR: Expressive Cello Performance Motion Generation for Audio Rendition
by: Qiu, Zhiping, et al.
Published: (2025)

Beat-It: Beat-Synchronized Multi-Condition 3D Dance Generation
by: Huang, Zikai, et al.
Published: (2024)

DGFM: Full Body Dance Generation Driven by Music Foundation Models
by: Liu, Xinran, et al.
Published: (2025)

SyncViolinist: Music-Oriented Violin Motion Generation Based on Bowing and Fingering
by: Nishizawa, Hiroki, et al.
Published: (2024)

Inter-Diffusion Generation Model of Speakers and Listeners for Effective Communication
by: Huang, Jinhe, et al.
Published: (2025)

Content and Style Aware Audio-Driven Facial Animation
by: Liu, Qingju, et al.
Published: (2024)

Text-Driven Voice Conversion via Latent State-Space Modeling
by: Li, Wen, et al.
Published: (2025)

NAT: Neural Acoustic Transfer for Interactive Scenes in Real Time
by: Jin, Xutong, et al.
Published: (2025)

EnchantDance: Unveiling the Potential of Music-Driven Dance Movement
by: Han, Bo, et al.
Published: (2023)

MusicScore: A Dataset for Music Score Modeling and Generation
by: Lin, Yuheng, et al.
Published: (2024)

Seeing Sound: Assembling Sounds from Visuals for Audio-to-Image Generation
by: Petermann, Darius, et al.
Published: (2025)

MATHDance: Mamba-Transformer Architecture with Uniform Tokenization for High-Quality 3D Dance Generation
by: Yang, Kaixing, et al.
Published: (2025)

DanceAnyWay: Synthesizing Beat-Guided 3D Dances with Randomized Temporal Contrastive Learning
by: Bhattacharya, Aneesh, et al.
Published: (2023)

Gaunt coefficients for complex and real spherical harmonics with applications to spherical array processing and Ambisonics
by: Politis, Archontis
Published: (2024)

Meta-Learning Empowered Meta-Face: Personalized Speaking Style Adaptation for Audio-Driven 3D Talking Face Animation
by: Zhou, Xukun, et al.
Published: (2024)

FürElise: Capturing and Physically Synthesizing Hand Motions of Piano Performance
by: Wang, Ruocheng, et al.
Published: (2024)

LAV: Audio-Driven Dynamic Visual Generation with Neural Compression and StyleGAN2
by: Jung, Jongmin, et al.
Published: (2025)

RAP: Real-time Audio-driven Portrait Animation with Video Diffusion Transformer
by: Du, Fangyu, et al.
Published: (2025)

MIDGET: Music Conditioned 3D Dance Generation
by: Wang, Jinwu, et al.
Published: (2024)

Listen through the Sound: Generative Speech Restoration Leveraging Acoustic Context Representation
by: Chung, Soo-Whan, et al.
Published: (2025)

PointTalk: Audio-Driven Dynamic Lip Point Cloud for 3D Gaussian-based Talking Head Synthesis
by: Xie, Yifan, et al.
Published: (2024)

Open Your Ears and Take a Look: A State-of-the-Art Report on the Integration of Sonification and Visualization
by: Enge, Kajetan, et al.
Published: (2024)

Audio is all in one: speech-driven gesture synthetics using WavLM pre-trained model
by: Zhang, Fan, et al.
Published: (2023)

Tiny is not small enough: High-quality, low-resource facial animation models through hybrid knowledge distillation
by: Han, Zhen, et al.
Published: (2025)

MACS: Multi-source Audio-to-image Generation with Contextual Significance and Semantic Alignment
by: Zhou, Hao, et al.
Published: (2025)

READ: Real-time and Efficient Asynchronous Diffusion for Audio-driven Talking Head Generation
by: Wang, Haotian, et al.
Published: (2025)

GCDance: Genre-Controlled Music-Driven 3D Full Body Dance Generation
by: Liu, Xinran, et al.
Published: (2025)

DuetGen: Music Driven Two-Person Dance Generation via Hierarchical Masked Modeling
by: Ghosh, Anindita, et al.
Published: (2025)

Lodge: A Coarse to Fine Diffusion Network for Long Dance Generation Guided by the Characteristic Dance Primitives
by: Li, Ronghui, et al.
Published: (2024)

DIDiffGes: Decoupled Semi-Implicit Diffusion Models for Real-time Gesture Generation from Speech
by: Cheng, Yongkang, et al.
Published: (2025)

MotionRAG-Diff: A Retrieval-Augmented Diffusion Framework for Long-Term Music-to-Dance Generation
by: Huang, Mingyang, et al.
Published: (2025)

Fast Algorithm for Moving Sound Source
by: Yang, Dong
Published: (2025)

SounDiT: Geo-Contextual Soundscape-to-Landscape Generation
by: Wang, Junbo, et al.
Published: (2025)

Semantic Gesticulator: Semantics-Aware Co-Speech Gesture Synthesis
by: Zhang, Zeyi, et al.
Published: (2024)

Flowers Revisited: A Preliminary Replication of Flowers et al. 1997
by: Enge, Kajetan, et al.
Published: (2024)

Duolando: Follower GPT with Off-Policy Reinforcement Learning for Dance Accompaniment
by: Siyao, Li, et al.
Published: (2024)

Text-driven Talking Face Synthesis by Reprogramming Audio-driven Models
by: Choi, Jeongsoo, et al.
Published: (2023)

Audio-Plane: Audio Factorization Plane Gaussian Splatting for Real-Time Talking Head Synthesis
by: Shen, Shuai, et al.
Published: (2025)

TCDiff++: An End-to-end Trajectory-Controllable Diffusion Model for Harmonious Music-Driven Group Choreography
by: Dai, Yuqin, et al.
Published: (2025)