| Main Authors: | Wang, Xuanchen, Wang, Heng, Cai, Weidong |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2510.13244 |
Similar Items
ChoreoMuse: Robust Music-to-Dance Video Generation with Style Transfer and Beat-Adherent Motion
by: Wang, Xuanchen, et al.
Published: (2025)
Dance Any Beat: Blending Beats with Visuals in Dance Video Generation
by: Wang, Xuanchen, et al.
Published: (2024)
MusicWeaver: Composer-Style Structural Editing and Minute-Scale Coherent Music Generation
by: Wang, Xuanchen, et al.
Published: (2025)
Music Arena: Live Evaluation for Text-to-Music
by: Kim, Yonghyun, et al.
Published: (2025)
MusicSwarm: Biologically Inspired Intelligence for Music Composition
by: Buehler, Markus J.
Published: (2025)
A Survey on Music Generation from Single-Modal, Cross-Modal, and Multi-Modal Perspectives
by: Li, Shuyu, et al.
Published: (2025)
Let the Model Learn to Feel: Mode-Guided Tonality Injection for Symbolic Music Emotion Recognition
by: Xia, Haiying, et al.
Published: (2025)
Music-Aligned Holistic 3D Dance Generation via Hierarchical Motion Modeling
by: Li, Xiaojie, et al.
Published: (2025)
Amadeus: Autoregressive Model with Bidirectional Attribute Modelling for Symbolic Music
by: Su, Hongju, et al.
Published: (2025)
AHA: Aligning Large Audio-Language Models for Reasoning Hallucinations via Counterfactual Hard Negatives
by: Chen, Yanxi, et al.
Published: (2025)
EXPOTION: Facial Expression and Motion Control for Multimodal Music Generation
by: Izzati, Fathinah, et al.
Published: (2025)
GACA-DiT: Diffusion-based Dance-to-Music Generation with Genre-Adaptive Rhythm and Context-Aware Alignment
by: Wang, Jinting, et al.
Published: (2025)
Memo2496: Expert-Annotated Dataset and Dual-View Adaptive Framework for Music Emotion Recognition
by: Li, Qilin, et al.
Published: (2025)
Music for All: Representational Bias and Cross-Cultural Adaptability of Music Generation Models
by: Mehta, Atharva, et al.
Published: (2025)
MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models
by: Zhang, Yixiao, et al.
Published: (2024)
MusicAIR: A Multimodal AI Music Generation Framework Powered by an Algorithm-Driven Core
by: Liao, Callie C., et al.
Published: (2025)
PianoMotion10M: Dataset and Benchmark for Hand Motion Generation in Piano Performance
by: Gan, Qijun, et al.
Published: (2024)
Live Music Diffusion Models: Efficient Fine-Tuning and Post-Training of Interactive Diffusion Music Generators
by: Novack, Zachary, et al.
Published: (2026)
Exploring Acoustic Similarity in Emotional Speech and Music via Self-Supervised Representations
by: Sun, Yujia, et al.
Published: (2024)
Emotion-Aligned Contrastive Learning Between Images and Music
by: Stewart, Shanti, et al.
Published: (2023)
A Survey of Foundation Models for Music Understanding
by: Li, Wenjun, et al.
Published: (2024)
AUREXA-SE: Audio-Visual Unified Representation Exchange Architecture with Cross-Attention and Squeezeformer for Speech Enhancement
by: Sajid, M., et al.
Published: (2025)
Uncertainty-Aware 3D Emotional Talking Face Synthesis with Emotion Prior Distillation
by: Shen, Nanhan, et al.
Published: (2026)
Frechet Music Distance: A Metric For Generative Symbolic Music Evaluation
by: Retkowski, Jan, et al.
Published: (2024)
Leveraging Pre-Trained Autoencoders for Interpretable Prototype Learning of Music Audio
by: Alonso-Jiménez, Pablo, et al.
Published: (2024)
MusER: Musical Element-Based Regularization for Generating Symbolic Music with Emotion
by: Ji, Shulei, et al.
Published: (2023)
MMVA: Multimodal Matching Based on Valence and Arousal across Images, Music, and Musical Captions
by: Choi, Suhwan, et al.
Published: (2025)
Towards Assessing Data Replication in Music Generation with Music Similarity Metrics on Raw Audio
by: Batlle-Roca, Roser, et al.
Published: (2024)
ONOTE: Benchmarking Omnimodal Notation Processing for Expert-level Music Intelligence
by: Ma, Menghe, et al.
Published: (2026)
Align Your Rhythm: Generating Highly Aligned Dance Poses with Gating-Enhanced Rhythm-Aware Feature Representation
by: Fan, Congyi, et al.
Published: (2025)
Cross-Modal Learning for Music-to-Music-Video Description Generation
by: Mao, Zhuoyuan, et al.
Published: (2025)
UniSRCodec: Unified and Low-Bitrate Single Codebook Codec with Sub-Band Reconstruction
by: Zhang, Zhisheng, et al.
Published: (2026)
Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning
by: Zhang, Yixiao, et al.
Published: (2024)
ReactMotion: Generating Reactive Listener Motions from Speaker Utterance
by: Luo, Cheng, et al.
Published: (2026)
Hierarchical Semantic Correlation-Aware Masked Autoencoder for Unsupervised Audio-Visual Representation Learning
by: Zeng, Donghuo, et al.
Published: (2026)
YuE: Scaling Open Foundation Models for Long-Form Music Generation
by: Yuan, Ruibin, et al.
Published: (2025)
MoMu-Diffusion: On Learning Long-Term Motion-Music Synchronization and Correspondence
by: You, Fuming, et al.
Published: (2024)
SynthGuard: An Open Platform for Detecting AI-Generated Multimedia with Multimodal LLMs
by: Desai, Shail, et al.
Published: (2025)
Iterative Residual Cross-Attention Mechanism: An Integrated Approach for Audio-Visual Navigation Tasks
by: Zhang, Hailong, et al.
Published: (2025)
The Name-Free Gap: Policy-Aware Stylistic Control in Music Generation
by: Nagarajan, Ashwin, et al.
Published: (2025)