:: Library Catalog

Bibliographic Details
Main Authors:	Go, Gyehun; Han, Satbyul; Choi, Ahyeon; Choi, Eunjin; Nam, Juhan; Park, Jeong Mi
Format:	Preprint
Published:	2025
Subjects:	Sound; Artificial Intelligence; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2509.00813

Similar Items

On the de-duplication of the Lakh MIDI dataset
by: Choi, Eunjin, et al.
Published: (2025)

PIAST: A Multimodal Piano Dataset with Audio, Symbolic and Text
by: Bang, Hayeon, et al.
Published: (2024)

TalkPlayData 2: An Agentic Synthetic Data Pipeline for Multimodal Conversational Music Recommendation
by: Choi, Keunwoo, et al.
Published: (2025)

TALKPLAY: Multimodal Music Recommendation with Large Language Models
by: Doh, Seungheon, et al.
Published: (2025)

Musical Word Embedding for Music Tagging and Retrieval
by: Doh, SeungHeon, et al.
Published: (2024)

Dialogue in Resonance: An Interactive Music Piece for Piano and Real-Time Automatic Transcription System
by: Bang, Hayeon, et al.
Published: (2025)

TalkPlay-Tools: Conversational Music Recommendation with LLM Tool Calling
by: Doh, Seungheon, et al.
Published: (2025)

Enriching Music Descriptions with a Finetuned-LLM and Metadata for Text-to-Music Retrieval
by: Doh, SeungHeon, et al.
Published: (2024)

Expressive Acoustic Guitar Sound Synthesis with an Instrument-Specific Input Representation and Diffusion Outpainting
by: Kim, Hounsu, et al.
Published: (2024)

Music Discovery Dialogue Generation Using Human Intent Analysis and Large Language Models
by: Doh, SeungHeon, et al.
Published: (2024)

KAD: No More FAD! An Effective and Efficient Evaluation Metric for Audio Generation
by: Chung, Yoonjin, et al.
Published: (2025)

Segment Transformer: AI-Generated Music Detection via Music Structural Analysis
by: Kim, Yumin, et al.
Published: (2025)

Real-world Music Plagiarism Detection With Music Segment Transcription System
by: Go, Seonghyeon
Published: (2025)

FlashSR: One-step Versatile Audio Super-resolution via Diffusion Distillation
by: Im, Jaekwon, et al.
Published: (2025)

DIFFRENT: A Diffusion Model for Recording Environment Transfer of Speech
by: Im, Jaekwon, et al.
Published: (2024)

CONMOD: Controllable Neural Frame-based Modulation Effects
by: Lee, Gyubin, et al.
Published: (2024)

Towards Efficient and Real-Time Piano Transcription Using Neural Autoregressive Models
by: Kwon, Taegyun, et al.
Published: (2024)

Joint Learning of Emotions in Music and Generalized Sounds
by: Simonetta, Federico, et al.
Published: (2024)

Aligning Text-to-Music Evaluation with Human Preferences
by: Huang, Yichen, et al.
Published: (2025)

The Music Maestro or The Musically Challenged, A Massive Music Evaluation Benchmark for Large Language Models
by: Li, Jiajia, et al.
Published: (2024)

Wearable Music2Emotion: Assessing Emotions Induced by AI-Generated Music through Portable EEG-fNIRS Fusion
by: Zhao, Sha, et al.
Published: (2025)

Predicting User Intents and Musical Attributes from Music Discovery Conversations
by: Kwon, Daeyong, et al.
Published: (2024)

MusicFlow: Cascaded Flow Matching for Text Guided Music Generation
by: Prajwal, K R, et al.
Published: (2024)

D3RM: A Discrete Denoising Diffusion Refinement Model for Piano Transcription
by: Kim, Hounsu, et al.
Published: (2025)

Self Training and Ensembling Frequency Dependent Networks with Coarse Prediction Pooling and Sound Event Bounding Boxes
by: Nam, Hyeonuk, et al.
Published: (2024)

The Interpretation Gap in Text-to-Music Generation Models
by: Zang, Yongyi, et al.
Published: (2024)

Six Dragons Fly Again: Reviving 15th-Century Korean Court Music with Transformers and Novel Encoding
by: Han, Danbinaerin, et al.
Published: (2024)

MMVA: Multimodal Matching Based on Valence and Arousal across Images, Music, and Musical Captions
by: Choi, Suhwan, et al.
Published: (2025)

MusiConGen: Rhythm and Chord Control for Transformer-Based Text-to-Music Generation
by: Lan, Yun-Han, et al.
Published: (2024)

MusicEval: A Generative Music Dataset with Expert Ratings for Automatic Text-to-Music Evaluation
by: Liu, Cheng, et al.
Published: (2025)

MuSpike: A Benchmark and Evaluation Framework for Symbolic Music Generation with Spiking Neural Networks
by: Liang, Qian, et al.
Published: (2025)

MusER: Musical Element-Based Regularization for Generating Symbolic Music with Emotion
by: Ji, Shulei, et al.
Published: (2023)

AudioGenX: Explainability on Text-to-Audio Generative Models
by: Kang, Hyunju, et al.
Published: (2025)

BERT-APC: A Reference-free Framework for Automatic Pitch Correction via Musical Context Inference
by: Kim, Sungjae, et al.
Published: (2025)

T-FOLEY: A Controllable Waveform-Domain Diffusion Model for Temporal-Event-Guided Foley Sound Synthesis
by: Chung, Yoonjin, et al.
Published: (2024)

Emotion-driven Piano Music Generation via Two-stage Disentanglement and Functional Representation
by: Huang, Jingyue, et al.
Published: (2024)

Enhancing Speech Emotion Recognition through Segmental Average Pooling of Self-Supervised Learning Features
by: Hyeon, Jonghwan, et al.
Published: (2024)

A Real-Time Lyrics Alignment System Using Chroma And Phonetic Features For Classical Vocal Performance
by: Park, Jiyun, et al.
Published: (2024)

CLaMP 3: Universal Music Information Retrieval Across Unaligned Modalities and Unseen Languages
by: Wu, Shangda, et al.
Published: (2025)

Semi-Supervised Self-Learning Enhanced Music Emotion Recognition
by: Sun, Yifu, et al.
Published: (2024)