Library Catalog

Bibliographic Details
Main Authors:	Liu, Yuxuan; Zhang, Peihong; Sang, Rui; Li, Zhixin; Li, Shengchen
Format:	Preprint
Published:	2025
Subjects:	Sound; Machine Learning; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2509.04980

Similar Items

Training a Perceptual Model for Evaluating Auditory Similarity in Music Adversarial Attack
by: Liu, Yuxuan, et al.
Published: (2025)

TF-SepNet: An Efficient 1D Kernel Design in CNNs for Low-Complexity Acoustic Scene Classification
by: Cai, Yiqiang, et al.
Published: (2023)

Similarity-Guided Diffusion for Long-Gap Music Inpainting
by: Turland, Sean, et al.
Published: (2025)

Geo-ATBench: A Benchmark for Geospatial Audio Tagging with Geospatial Semantic Context
by: Hou, Yuanbo, et al.
Published: (2026)

Unrolled Creative Adversarial Network For Generating Novel Musical Pieces
by: Nag, Pratik
Published: (2024)

Are Modern Speech Enhancement Systems Vulnerable to Adversarial Attacks?
by: Makarov, Rostislav, et al.
Published: (2025)

Revisiting Meter Tracking in Carnatic Music using Deep Learning Approaches
by: Prabhu, Satyajeet
Published: (2025)

Machine Learning Approaches to Vocal Register Classification in Contemporary Male Pop Music
by: Kim, Alexander, et al.
Published: (2025)

MusicRL: Aligning Music Generation to Human Preferences
by: Cideron, Geoffrey, et al.
Published: (2024)

Exploring Transformer-Based Music Overpainting for Jazz Piano Variations
by: Row, Eleanor, et al.
Published: (2024)

Music Emotion Prediction Using Recurrent Neural Networks
by: Chang, Xinyu, et al.
Published: (2024)

Improving the Adversarial Robustness for Speaker Verification by Self-Supervised Learning
by: Wu, Haibin, et al.
Published: (2021)

An Ensemble Approach to Music Source Separation: A Comparative Analysis of Conventional and Hierarchical Stem Separation
by: Vardhan, Saarth, et al.
Published: (2024)

Symbolic Music Generation with Non-Differentiable Rule Guided Diffusion
by: Huang, Yujia, et al.
Published: (2024)

Anticipatory Music Transformer
by: Thickstun, John, et al.
Published: (2023)

Leveraging Self-supervised Audio Representations for Data-Efficient Acoustic Scene Classification
by: Cai, Yiqiang, et al.
Published: (2024)

Spectrotemporal Modulation: Efficient and Interpretable Feature Representation for Classifying Speech, Music, and Environmental Sounds
by: Chang, Andrew, et al.
Published: (2025)

Adversarial Data Augmentation for Robust Speaker Verification
by: Zhou, Zhenyu, et al.
Published: (2024)

Subtractive Training for Music Stem Insertion using Latent Diffusion Models
by: Villa-Renteria, Ivan, et al.
Published: (2024)

ProGress: Structured Music Generation via Graph Diffusion and Hierarchical Music Analysis
by: Ni-Hahn, Stephen, et al.
Published: (2025)

Score-informed Music Source Separation: Improving Synthetic-to-real Generalization in Classical Music
by: Tunturi, Eetu, et al.
Published: (2025)

Integrating Text-to-Music Models with Language Models: Composing Long Structured Music Pieces
by: Atassi, Lilac
Published: (2024)

Generating Sample-Based Musical Instruments Using Neural Audio Codec Language Models
by: Nercessian, Shahan, et al.
Published: (2024)

On the Generation and Removal of Speaker Adversarial Perturbation for Voice-Privacy Protection
by: Guo, Chenyang, et al.
Published: (2024)

Tune It Up: Music Genre Transfer and Prediction
by: Samet, Fidan, et al.
Published: (2025)

Learning Music Audio Representations With Limited Data
by: Plachouras, Christos, et al.
Published: (2025)

Evaluating Disentangled Representations for Controllable Music Generation
by: Ibáñez-Martínez, Laura, et al.
Published: (2026)

Watermarking Training Data of Music Generation Models
by: Epple, Pascal, et al.
Published: (2024)

Multi-Source Music Generation with Latent Diffusion
by: Xu, Zhongweiyang, et al.
Published: (2024)

Benchmarking Representations for Speech, Music, and Acoustic Events
by: La Quatra, Moreno, et al.
Published: (2024)

Music Genre Classification: Training an AI model
by: Mogonediwa, Keoikantse
Published: (2024)

Are Deep Speech Denoising Models Robust to Adversarial Noise?
by: Schwarzer, Will, et al.
Published: (2025)

Behind the Scenes: Mechanistic Interpretability of LoRA-adapted Whisper for Speech Emotion Recognition
by: Ma, Yujian, et al.
Published: (2025)

Do Foundational Audio Encoders Understand Music Structure?
by: Toyama, Keisuke, et al.
Published: (2025)

Semantic-Aware Interpretable Multimodal Music Auto-Tagging
by: Patakis, Andreas, et al.
Published: (2025)

Transcribing Rhythmic Patterns of the Guitar Track in Polyphonic Music
by: Lukoianov, Aleksandr, et al.
Published: (2025)

Online Symbolic Music Alignment with Offline Reinforcement Learning
by: Peter, Silvan David
Published: (2023)

Generating Music with Structure Using Self-Similarity as Attention
by: Hager, Sophia, et al.
Published: (2024)

Parameter-Efficient Transfer Learning for Music Foundation Models
by: Ding, Yiwei, et al.
Published: (2024)

Improving BERT for Symbolic Music Understanding Using Token Denoising and Pianoroll Prediction
by: Wang, Jun-You, et al.
Published: (2025)