:: Library Catalog

Bibliographic Details
Main Authors:	Baker, Tom; Nistal, Javier
Format:	Preprint
Published:	2025
Subjects:	Sound; Machine Learning; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2506.11476

Similar Items

Latent Granular Resynthesis using Neural Audio Codecs
by: Tokui, Nao, et al.
Published: (2025)

Music2Latent: Consistency Autoencoders for Latent Audio Compression
by: Pasini, Marco, et al.
Published: (2024)

Accompaniment Prompt Adherence: A Measure for Evaluating Music Accompaniment Systems
by: Grachten, Maarten, et al.
Published: (2025)

Editing Music with Melody and Text: Using ControlNet for Diffusion Transformer
by: Hou, Siyuan, et al.
Published: (2024)

Improving Musical Accompaniment Co-creation via Diffusion Transformers
by: Nistal, Javier, et al.
Published: (2024)

The Rarity of Musical Audio Signals Within the Space of Possible Audio Generation
by: Collins, Nick
Published: (2024)

Multi-Source Music Generation with Latent Diffusion
by: Xu, Zhongweiyang, et al.
Published: (2024)

HAAQI-Net: A Non-intrusive Neural Music Audio Quality Assessment Model for Hearing Aids
by: Wisnu, Dyah A. M. G., et al.
Published: (2024)

Learning Music Audio Representations With Limited Data
by: Plachouras, Christos, et al.
Published: (2025)

Diff-A-Riff: Musical Accompaniment Co-creation via Latent Diffusion Models
by: Nistal, Javier, et al.
Published: (2024)

MGE-LDM: Joint Latent Diffusion for Simultaneous Music Generation and Source Extraction
by: Chae, Yunkee, et al.
Published: (2025)

Continuous Autoregressive Models with Noise Augmentation Avoid Error Accumulation
by: Pasini, Marco, et al.
Published: (2024)

Generating Sample-Based Musical Instruments Using Neural Audio Codec Language Models
by: Nercessian, Shahan, et al.
Published: (2024)

Do Foundational Audio Encoders Understand Music Structure?
by: Toyama, Keisuke, et al.
Published: (2025)

Evaluating Disentangled Representations for Controllable Music Generation
by: Ibáñez-Martínez, Laura, et al.
Published: (2026)

Learning to Upsample and Upmix Audio in the Latent Domain
by: Bralios, Dimitrios, et al.
Published: (2025)

Fast Timing-Conditioned Latent Audio Diffusion
by: Evans, Zach, et al.
Published: (2024)

COCOLA: Coherence-Oriented Contrastive Learning of Musical Audio Representations
by: Ciranni, Ruben, et al.
Published: (2024)

Audio Processing using Pattern Recognition for Music Genre Classification
by: Chatterjee, Sivangi, et al.
Published: (2024)

Music2Latent2: Audio Compression with Summary Embeddings and Autoregressive Decoding
by: Pasini, Marco, et al.
Published: (2025)

Music Boomerang: Reusing Diffusion Models for Data Augmentation and Audio Manipulation
by: Fichtinger, Alexander, et al.
Published: (2025)

FIGARO: Generating Symbolic Music with Fine-Grained Artistic Control
by: von Rütte, Dimitri, et al.
Published: (2022)

Bridging the Perception Gap: A Lightweight Coarse-to-Fine Architecture for Edge Audio Systems
by: Zhang, Hengfan, et al.
Published: (2026)

Improving Real-Time Music Accompaniment Separation with MMDenseNet
by: Wang, Chun-Hsiang, et al.
Published: (2024)

TTS-CtrlNet: Time varying emotion aligned text-to-speech generation with ControlNet
by: Jeong, Jaeseok, et al.
Published: (2025)

Subtractive Training for Music Stem Insertion using Latent Diffusion Models
by: Villa-Renteria, Ivan, et al.
Published: (2024)

Re-Bottleneck: Latent Re-Structuring for Neural Audio Autoencoders
by: Bralios, Dimitrios, et al.
Published: (2025)

Generative AI for Music and Audio
by: Dong, Hao-Wen
Published: (2024)

MusicRL: Aligning Music Generation to Human Preferences
by: Cideron, Geoffrey, et al.
Published: (2024)

Naturalistic Music Decoding from EEG Data via Latent Diffusion Models
by: Postolache, Emilian, et al.
Published: (2024)

LC-Protonets: Multi-Label Few-Shot Learning for World Music Audio Tagging
by: Papaioannou, Charilaos, et al.
Published: (2024)

SYMPLEX: Controllable Symbolic Music Generation using Simplex Diffusion with Vocabulary Priors
by: Jonason, Nicolas, et al.
Published: (2024)

Nested Music Transformer: Sequentially Decoding Compound Tokens in Symbolic Music and Audio Generation
by: Yoo, HaeJun, et al.
Published: (2024)

SpecMaskFoley: Steering Pretrained Spectral Masked Generative Transformer Toward Synchronized Video-to-audio Synthesis via ControlNet
by: Zhong, Zhi, et al.
Published: (2025)

Active Restoration of Lost Audio Signals Using Machine Learning and Latent Information
by: Cheddad, Zohra Adila, et al.
Published: (2021)

ProGress: Structured Music Generation via Graph Diffusion and Hierarchical Music Analysis
by: Ni-Hahn, Stephen, et al.
Published: (2025)

Score-informed Music Source Separation: Improving Synthetic-to-real Generalization in Classical Music
by: Tunturi, Eetu, et al.
Published: (2025)

Simple and Controllable Music Generation
by: Copet, Jade, et al.
Published: (2023)

Watermarking Training Data of Music Generation Models
by: Epple, Pascal, et al.
Published: (2024)

LMFCA-Net: A Lightweight Model for Multi-Channel Speech Enhancement with Efficient Narrow-Band and Cross-Band Attention
by: Zhang, Yaokai, et al.
Published: (2025)