| Main Authors: | Cheung, Hei Shing, Zhang, Boya, Chan, Jonathan H. |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2507.19991 |
Similar Items
Musical Attention Transformer: Music Generation Using a Music-Specific Attention Model
by: Taksuka, Shinnosuke, et al.
Published: (2026)
Extract and Diffuse: Latent Integration for Improved Diffusion-based Speech and Vocal Enhancement
by: Yang, Yudong, et al.
Published: (2024)
Multi-Source Music Generation with Latent Diffusion
by: Xu, Zhongweiyang, et al.
Published: (2024)
MGE-LDM: Joint Latent Diffusion for Simultaneous Music Generation and Source Extraction
by: Chae, Yunkee, et al.
Published: (2025)
Naturalistic Music Decoding from EEG Data via Latent Diffusion Models
by: Postolache, Emilian, et al.
Published: (2024)
Fast Timing-Conditioned Latent Audio Diffusion
by: Evans, Zach, et al.
Published: (2024)
Live Music Diffusion Models: Efficient Fine-Tuning and Post-Training of Interactive Diffusion Music Generators
by: Novack, Zachary, et al.
Published: (2026)
Generalized Multi-Source Inference for Text Conditioned Music Diffusion Models
by: Postolache, Emilian, et al.
Published: (2024)
Hierarchical Generative Modeling of Melodic Vocal Contours in Hindustani Classical Music
by: Shikarpur, Nithya, et al.
Published: (2024)
ProGress: Structured Music Generation via Graph Diffusion and Hierarchical Music Analysis
by: Ni-Hahn, Stephen, et al.
Published: (2025)
Machine Learning Approaches to Vocal Register Classification in Contemporary Male Pop Music
by: Kim, Alexander, et al.
Published: (2025)
Subtractive Training for Music Stem Insertion using Latent Diffusion Models
by: Villa-Renteria, Ivan, et al.
Published: (2024)
Bass Accompaniment Generation via Latent Diffusion
by: Pasini, Marco, et al.
Published: (2024)
Recognizing Ornaments in Vocal Indian Art Music with Active Annotation
by: Kumar, Sumit, et al.
Published: (2025)
Count The Notes: Histogram-Based Supervision for Automatic Music Transcription
by: Yaffe, Jonathan, et al.
Published: (2025)
Generating Music with Structure Using Self-Similarity as Attention
by: Hager, Sophia, et al.
Published: (2024)
Music2Latent: Consistency Autoencoders for Latent Audio Compression
by: Pasini, Marco, et al.
Published: (2024)
A Dataset for Automatic Vocal Mode Classification
by: Hinrichs, Reemt, et al.
Published: (2026)
Benchmarking Music Generation Models and Metrics via Human Preference Studies
by: Grötschla, Florian, et al.
Published: (2025)
Text Conditioned Symbolic Drumbeat Generation using Latent Diffusion Models
by: Jajoria, Pushkar, et al.
Published: (2024)
VocalBridge: Latent Diffusion-Bridge Purification for Defeating Perturbation-Based Voiceprint Defenses
by: Abbasihafshejani, Maryam, et al.
Published: (2026)
Symbolic Music Generation with Non-Differentiable Rule Guided Diffusion
by: Huang, Yujia, et al.
Published: (2024)
Latent Space Disentanglement via Activation Steering for Interpretable Attribute Control in Symbolic Music Generation
by: Prokopiou, Ioannis, et al.
Published: (2026)
LiLAC: A Lightweight Latent ControlNet for Musical Audio Generation
by: Baker, Tom, et al.
Published: (2025)
MegaTTS 3: Sparse Alignment Enhanced Latent Diffusion Transformer for Zero-Shot Speech Synthesis
by: Jiang, Ziyue, et al.
Published: (2025)
DITTO-2: Distilled Diffusion Inference-Time T-Optimization for Music Generation
by: Novack, Zachary, et al.
Published: (2024)
Depth-Structured Music Recurrence: Budgeted Recurrent Attention for Full-Piece Symbolic Music Modeling
by: Yi, Yungang, et al.
Published: (2026)
Multi-Source Diffusion Models for Simultaneous Music Generation and Separation
by: Mariani, Giorgio, et al.
Published: (2023)
Fusing Memory and Attention: A study on LSTM, Transformer and Hybrid Architectures for Symbolic Music Generation
by: Ghoshal, Soudeep, et al.
Published: (2026)
WhAM: Towards A Translative Model of Sperm Whale Vocalization
by: Paradise, Orr, et al.
Published: (2025)
A Real-Time Lyrics Alignment System Using Chroma And Phonetic Features For Classical Vocal Performance
by: Park, Jiyun, et al.
Published: (2024)
Rethinking Music Captioning with Music Metadata LLMs
by: Bukey, Irmak, et al.
Published: (2026)
Online Symbolic Music Alignment with Offline Reinforcement Learning
by: Peter, Silvan David
Published: (2023)
Improving Real-Time Music Accompaniment Separation with MMDenseNet
by: Wang, Chun-Hsiang, et al.
Published: (2024)
Efficient Fine-Grained Guidance for Diffusion Model Based Symbolic Music Generation
by: Zhu, Tingyu, et al.
Published: (2024)
Bias beyond Borders: Global Inequalities in AI-Generated Music
by: Solak, Ahmet, et al.
Published: (2025)
Generating Separated Singing Vocals Using a Diffusion Model Conditioned on Music Mixtures
by: Plaja-Roglans, Genís, et al.
Published: (2025)
MambaVoiceCloning: Efficient and Expressive Text-to-Speech via State-Space Modeling and Diffusion Control
by: Kumar, Sahil, et al.
Published: (2026)
SYMPLEX: Controllable Symbolic Music Generation using Simplex Diffusion with Vocabulary Priors
by: Jonason, Nicolas, et al.
Published: (2024)
Of All StrIPEs: Investigating Structure-informed Positional Encoding for Efficient Music Generation
by: Agarwal, Manvi, et al.
Published: (2025)