:: Library Catalog

Bibliographic Details
Main Authors:	Parker, Julian D.; Evans, Zach; Carr, CJ; Zukowski, Zachary; Taylor, Josiah; Rice, Matthew; Pons, Jordi
Format:	Preprint
Published:	2026
Subjects:	Sound; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2605.18613

Similar Items

Stable Audio 3
by: Evans, Zach, et al.
Published: (2026)

Stable Audio Open
by: Evans, Zach, et al.
Published: (2024)

Music and Artificial Intelligence: Artistic Trends
by: Pons, Jordi, et al.
Published: (2025)

Low-Resource Guidance for Controllable Latent Audio Diffusion
by: Novack, Zachary, et al.
Published: (2026)

Long-form music generation with latent diffusion
by: Evans, Zach, et al.
Published: (2024)

Scaling Transformers for Low-Bitrate High-Quality Speech Coding
by: Parker, Julian D, et al.
Published: (2024)

Fast Text-to-Audio Generation with Adversarial Post-Training
by: Novack, Zachary, et al.
Published: (2025)

Fast Timing-Conditioned Latent Audio Diffusion
by: Evans, Zach, et al.
Published: (2024)

Perceptually Aligning Representations of Music via Noise-Augmented Autoencoders
by: Bjare, Mathias Rose, et al.
Published: (2025)

MuseTok: Symbolic Music Tokenization for Generation and Semantic Understanding
by: Huang, Jingyue, et al.
Published: (2025)

DITTO-2: Distilled Diffusion Inference-Time T-Optimization for Music Generation
by: Novack, Zachary, et al.
Published: (2024)

PDMX: A Large-Scale Public Domain MusicXML Dataset for Symbolic Music Processing
by: Long, Phillip, et al.
Published: (2024)

MuseCPBench: an Empirical Study of Music Editing Methods through Music Context Preservation
by: Vishe, Yash, et al.
Published: (2025)

Aligning Text-to-Music Evaluation with Human Preferences
by: Huang, Yichen, et al.
Published: (2025)

DITTO: Diffusion Inference-Time T-Optimization for Music Generation
by: Novack, Zachary, et al.
Published: (2024)

Steering Autoregressive Music Generation with Recursive Feature Machines
by: Zhao, Daniel, et al.
Published: (2025)

WildFX: A DAW-Powered Pipeline for In-the-Wild Audio FX Graph Modeling
by: Yang, Qihui, et al.
Published: (2025)

Presto! Distilling Steps and Layers for Accelerating Music Generation
by: Novack, Zachary, et al.
Published: (2024)

Live Music Diffusion Models: Efficient Fine-Tuning and Post-Training of Interactive Diffusion Music Generators
by: Novack, Zachary, et al.
Published: (2026)

CoLLAP: Contrastive Long-form Language-Audio Pretraining with Musical Temporal Structure Augmentation
by: Wu, Junda, et al.
Published: (2024)

Story2MIDI: Emotionally Aligned Music Generation from Text
by: Shokri, Mohammad, et al.
Published: (2025)

DAIRHuM: A Platform for Directly Aligning AI Representations with Human Musical Judgments applied to Carnatic Music
by: Ravikumar, Prashanth Thattai
Published: (2024)

Composer Vector: Style-steering Symbolic Music Generation in a Latent Space
by: Jiang, Xunyi, et al.
Published: (2026)

Aligning Generative Music AI with Human Preferences: Methods and Challenges
by: Herremans, Dorien, et al.
Published: (2025)

Bob's Confetti: Phonetic Memorization Attacks in Music and Video Generation
by: Roh, Jaechul, et al.
Published: (2025)

MOSA: Music Motion with Semantic Annotation Dataset for Cross-Modal Music Processing
by: Huang, Yu-Fen, et al.
Published: (2024)

Leveraging Pre-Trained Autoencoders for Interpretable Prototype Learning of Music Audio
by: Alonso-Jiménez, Pablo, et al.
Published: (2024)

CSyMR: Benchmarking Compositional Music Information Retrieval in Symbolic Music Reasoning
by: Wang, Boyang, et al.
Published: (2025)

MotionBeat: Motion-Aligned Music Representation via Embodied Contrastive Learning and Bar-Equivariant Contact-Aware Encoding
by: Wang, Xuanchen, et al.
Published: (2025)

Depth-Structured Music Recurrence: Budgeted Recurrent Attention for Full-Piece Symbolic Music Modeling
by: Yi, Yungang, et al.
Published: (2026)

Modeling Music as a Time-Frequency Image: A 2D Tokenizer for Music Generation
by: Cheng, Yuqing, et al.
Published: (2026)

MusicSynth: An Automated Pipeline for Generating Violin Fingerboard Animations from Sheet Music Using Optical Music Recognition
by: Kaushik, Abhimanyu
Published: (2026)

Foley Control: Aligning a Frozen Latent Text-to-Audio Model to Video
by: Rowles, Ciara, et al.
Published: (2025)

HAIM: Human-AI Music Datasets for AI Music Production Tracking Benchmark
by: Go, Seonghyeon, et al.
Published: (2026)

Music Arena: Live Evaluation for Text-to-Music
by: Kim, Yonghyun, et al.
Published: (2025)

CompLex: Music Theory Lexicon Constructed by Autonomous Agents for Automatic Music Generation
by: Hu, Zhejing, et al.
Published: (2025)

Device-Guided Music Transfer
by: Hung, Manh Pham, et al.
Published: (2025)

MusicSwarm: Biologically Inspired Intelligence for Music Composition
by: Buehler, Markus J.
Published: (2025)

Musical Score Understanding Benchmark: Evaluating Large Language Models' Comprehension of Complete Musical Scores
by: Dai, Congren, et al.
Published: (2025)

Music Style Transfer With Diffusion Model
by: Huang, Hong, et al.
Published: (2024)