| Main Authors: | Gong, Junmin, Song, Yulin, Zhao, Wenxiao, Wang, Sen, Xu, Shengyuan, Guo, Jing, Yang, Xuerui |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.00744 |
Similar Items
ACE-Step: A Step Towards Music Generation Foundation Model
by: Gong, Junmin, et al.
Published: (2025)
ConSinger: Efficient High-Fidelity Singing Voice Generation with Minimal Steps
by: Song, Yulin, et al.
Published: (2024)
Amphion: An Open-Source Audio, Music and Speech Generation Toolkit
by: Zhang, Xueyao, et al.
Published: (2023)
HeartMuLa: A Family of Open Sourced Music Foundation Models
by: Yang, Dongchao, et al.
Published: (2026)
CtrSVDD: A Benchmark Dataset and Baseline Analysis for Controlled Singing Voice Deepfake Detection
by: Zang, Yongyi, et al.
Published: (2024)
OpenACE: An Open Benchmark for Evaluating Audio Coding Performance
by: Coldenhoff, Jozef, et al.
Published: (2024)
U3-xi: Pushing the Boundaries of Speaker Recognition by Incorporating Uncertainty
by: Li, Junjie, et al.
Published: (2026)
Multi-Source Music Generation with Latent Diffusion
by: Xu, Zhongweiyang, et al.
Published: (2024)
DSFlow: Dual Supervision and Step-Aware Architecture for One-Step Flow Matching Speech Synthesis
by: Lin, Bin, et al.
Published: (2026)
ACMID: Automatic Curation of Musical Instrument Dataset for 7-Stem Music Source Separation
by: Yu, Ji, et al.
Published: (2025)
MusicEval: A Generative Music Dataset with Expert Ratings for Automatic Text-to-Music Evaluation
by: Liu, Cheng, et al.
Published: (2025)
CompLex: Music Theory Lexicon Constructed by Autonomous Agents for Automatic Music Generation
by: Hu, Zhejing, et al.
Published: (2025)
BEAT: Tokenizing and Generating Symbolic Music by Uniform Temporal Steps
by: Qian, Lekai, et al.
Published: (2026)
Video Echoed in Music: Semantic, Temporal, and Rhythmic Alignment for Video-to-Music Generation
by: Tong, Xinyi, et al.
Published: (2025)
MSRBench: A Benchmarking Dataset for Music Source Restoration
by: Zang, Yongyi, et al.
Published: (2025)
SLAM-LLM: A Modular, Open-Source Multimodal Large Language Model Framework and Best Practice for Speech, Language, Audio and Music Processing
by: Ma, Ziyang, et al.
Published: (2026)
Efficient Long-Sequence Diffusion Modeling for Symbolic Music Generation
by: Xu, Jinhan, et al.
Published: (2026)
Source Separation for A Cappella Music
by: Lanzendörfer, Luca A., et al.
Published: (2025)
Presto! Distilling Steps and Layers for Accelerating Music Generation
by: Novack, Zachary, et al.
Published: (2024)
Score-informed Music Source Separation: Improving Synthetic-to-real Generalization in Classical Music
by: Tunturi, Eetu, et al.
Published: (2025)
Diffusion-based Symbolic Music Generation with Structured State Space Models
by: Yuan, Shenghua, et al.
Published: (2025)
Singing Voice Data Scaling-up: An Introduction to ACE-Opencpop and ACE-KiSing
by: Shi, Jiatong, et al.
Published: (2024)
MAGE: Modality-Agnostic Music Generation and Editing
by: Saleem, Muhammad Usama, et al.
Published: (2026)
MusicDET: Zero-Shot AI-Generated Music Detection
by: Han, Chaolei, et al.
Published: (2026)
WeaveMuse: An Open Agentic System for Multimodal Music Understanding and Generation
by: Karystinaios, Emmanouil
Published: (2025)
Streaming Generation for Music Accompaniment
by: Wu, Yusong, et al.
Published: (2025)
CCMusic: An Open and Diverse Database for Chinese Music Information Retrieval Research
by: Zhou, Monan, et al.
Published: (2025)
Multi-Source Diffusion Models for Simultaneous Music Generation and Separation
by: Mariani, Giorgio, et al.
Published: (2023)
Intelligent Text-Conditioned Music Generation
by: Xie, Zhouyao, et al.
Published: (2024)
YuE: Scaling Open Foundation Models for Long-Form Music Generation
by: Yuan, Ruibin, et al.
Published: (2025)
Seed-Music: A Unified Framework for High Quality and Controlled Music Generation
by: Bai, Ye, et al.
Published: (2024)
Pushing the Limits of End-to-End Diarization
by: Broughton, Samuel J., et al.
Published: (2025)
Music Source Restoration
by: Zang, Yongyi, et al.
Published: (2025)
Towards Practical Real-Time Low-Latency Music Source Separation
by: Wu, Junyu, et al.
Published: (2025)
MusicMamba: A Dual-Feature Modeling Approach for Generating Chinese Traditional Music with Modal Precision
by: Chen, Jiatao, et al.
Published: (2024)
Procedural Music Generation Systems in Games
by: Luo, Shangxuan, et al.
Published: (2025)
OpenBEATs: A Fully Open-Source General-Purpose Audio Encoder
by: Bharadwaj, Shikhar, et al.
Published: (2025)
MeanFlow-Accelerated Multimodal Video-to-Audio Synthesis via One-Step Generation
by: Yang, Xiaoran, et al.
Published: (2025)
Generalized Multi-Source Inference for Text Conditioned Music Diffusion Models
by: Postolache, Emilian, et al.
Published: (2024)
Improving Music Source Separation with Diffusion and Consistency Refinement
by: Karchkhadze, Tornike, et al.
Published: (2024)