:: Library Catalog

Bibliographic Details
Main Authors:	Gong, Junmin; Song, Yulin; Zhao, Wenxiao; Wang, Sen; Xu, Shengyuan; Guo, Jing; Yang, Xuerui
Format:	Preprint
Published:	2026
Subjects:	Sound
Online Access:	https://arxiv.org/abs/2602.00744

Similar Items

ACE-Step: A Step Towards Music Generation Foundation Model
by: Gong, Junmin, et al.
Published: (2025)

ConSinger: Efficient High-Fidelity Singing Voice Generation with Minimal Steps
by: Song, Yulin, et al.
Published: (2024)

Amphion: An Open-Source Audio, Music and Speech Generation Toolkit
by: Zhang, Xueyao, et al.
Published: (2023)

HeartMuLa: A Family of Open Sourced Music Foundation Models
by: Yang, Dongchao, et al.
Published: (2026)

CtrSVDD: A Benchmark Dataset and Baseline Analysis for Controlled Singing Voice Deepfake Detection
by: Zang, Yongyi, et al.
Published: (2024)

OpenACE: An Open Benchmark for Evaluating Audio Coding Performance
by: Coldenhoff, Jozef, et al.
Published: (2024)

U3-xi: Pushing the Boundaries of Speaker Recognition by Incorporating Uncertainty
by: Li, Junjie, et al.
Published: (2026)

Multi-Source Music Generation with Latent Diffusion
by: Xu, Zhongweiyang, et al.
Published: (2024)

DSFlow: Dual Supervision and Step-Aware Architecture for One-Step Flow Matching Speech Synthesis
by: Lin, Bin, et al.
Published: (2026)

ACMID: Automatic Curation of Musical Instrument Dataset for 7-Stem Music Source Separation
by: Yu, Ji, et al.
Published: (2025)

MusicEval: A Generative Music Dataset with Expert Ratings for Automatic Text-to-Music Evaluation
by: Liu, Cheng, et al.
Published: (2025)

CompLex: Music Theory Lexicon Constructed by Autonomous Agents for Automatic Music Generation
by: Hu, Zhejing, et al.
Published: (2025)

BEAT: Tokenizing and Generating Symbolic Music by Uniform Temporal Steps
by: Qian, Lekai, et al.
Published: (2026)

Video Echoed in Music: Semantic, Temporal, and Rhythmic Alignment for Video-to-Music Generation
by: Tong, Xinyi, et al.
Published: (2025)

MSRBench: A Benchmarking Dataset for Music Source Restoration
by: Zang, Yongyi, et al.
Published: (2025)

SLAM-LLM: A Modular, Open-Source Multimodal Large Language Model Framework and Best Practice for Speech, Language, Audio and Music Processing
by: Ma, Ziyang, et al.
Published: (2026)

Efficient Long-Sequence Diffusion Modeling for Symbolic Music Generation
by: Xu, Jinhan, et al.
Published: (2026)

Source Separation for A Cappella Music
by: Lanzendörfer, Luca A., et al.
Published: (2025)

Presto! Distilling Steps and Layers for Accelerating Music Generation
by: Novack, Zachary, et al.
Published: (2024)

Score-informed Music Source Separation: Improving Synthetic-to-real Generalization in Classical Music
by: Tunturi, Eetu, et al.
Published: (2025)

Diffusion-based Symbolic Music Generation with Structured State Space Models
by: Yuan, Shenghua, et al.
Published: (2025)

Singing Voice Data Scaling-up: An Introduction to ACE-Opencpop and ACE-KiSing
by: Shi, Jiatong, et al.
Published: (2024)

MAGE: Modality-Agnostic Music Generation and Editing
by: Saleem, Muhammad Usama, et al.
Published: (2026)

MusicDET: Zero-Shot AI-Generated Music Detection
by: Han, Chaolei, et al.
Published: (2026)

WeaveMuse: An Open Agentic System for Multimodal Music Understanding and Generation
by: Karystinaios, Emmanouil
Published: (2025)

Streaming Generation for Music Accompaniment
by: Wu, Yusong, et al.
Published: (2025)

CCMusic: An Open and Diverse Database for Chinese Music Information Retrieval Research
by: Zhou, Monan, et al.
Published: (2025)

Multi-Source Diffusion Models for Simultaneous Music Generation and Separation
by: Mariani, Giorgio, et al.
Published: (2023)

Intelligent Text-Conditioned Music Generation
by: Xie, Zhouyao, et al.
Published: (2024)

YuE: Scaling Open Foundation Models for Long-Form Music Generation
by: Yuan, Ruibin, et al.
Published: (2025)

Seed-Music: A Unified Framework for High Quality and Controlled Music Generation
by: Bai, Ye, et al.
Published: (2024)

Pushing the Limits of End-to-End Diarization
by: Broughton, Samuel J., et al.
Published: (2025)

Music Source Restoration
by: Zang, Yongyi, et al.
Published: (2025)

Towards Practical Real-Time Low-Latency Music Source Separation
by: Wu, Junyu, et al.
Published: (2025)

MusicMamba: A Dual-Feature Modeling Approach for Generating Chinese Traditional Music with Modal Precision
by: Chen, Jiatao, et al.
Published: (2024)

Procedural Music Generation Systems in Games
by: Luo, Shangxuan, et al.
Published: (2025)

OpenBEATs: A Fully Open-Source General-Purpose Audio Encoder
by: Bharadwaj, Shikhar, et al.
Published: (2025)

MeanFlow-Accelerated Multimodal Video-to-Audio Synthesis via One-Step Generation
by: Yang, Xiaoran, et al.
Published: (2025)

Generalized Multi-Source Inference for Text Conditioned Music Diffusion Models
by: Postolache, Emilian, et al.
Published: (2024)

Improving Music Source Separation with Diffusion and Consistency Refinement
by: Karchkhadze, Tornike, et al.
Published: (2024)