:: Library Catalog

Bibliographic Details
Main Author:	Malandro, Martin E.
Format:	Preprint
Published:	2024
Subjects:	Sound; Machine Learning; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2407.14700

Similar Items

Barwise Section Boundary Detection in Symbolic Music Using Convolutional Neural Networks
by: Eldeeb, Omar, et al.
Published: (2025)

End-to-end Piano Performance-MIDI to Score Conversion with Transformers
by: Beyer, Tim, et al.
Published: (2024)

Annotation-Free MIDI-to-Audio Synthesis via Concatenative Synthesis and Generative Refinement
by: Take, Osamu, et al.
Published: (2024)

MIDI-GPT: A Controllable Generative Model for Computer-Assisted Multitrack Music Composition
by: Pasquier, Philippe, et al.
Published: (2025)

The Florence Price Art Song Dataset and Piano Accompaniment Generator
by: He, Tao-Tao, et al.
Published: (2025)

FIGARO: Generating Symbolic Music with Fine-Grained Artistic Control
by: von Rütte, Dimitri, et al.
Published: (2022)

MidiCaps: A large-scale MIDI dataset with text captions
by: Melechovsky, Jan, et al.
Published: (2024)

Unsupervised Composable Representations for Audio
by: Bindi, Giovanni, et al.
Published: (2024)

On the de-duplication of the Lakh MIDI dataset
by: Choi, Eunjin, et al.
Published: (2025)

Adaptable Symbolic Music Infilling with MIDI-RWKV
by: Zhou-Zheng, Christian, et al.
Published: (2025)

PBSCR: The Piano Bootleg Score Composer Recognition Dataset
by: Jain, Arhan, et al.
Published: (2024)

Composer's Assistant 2: Interactive Multi-Track MIDI Infilling With Fine-Grained User Control
by: Malandro, Martin E.
Published: (2024)

Fine-grained Soundscape Control for Augmented Hearing
by: Oh, Seunghyun, et al.
Published: (2026)

Integrating Text-to-Music Models with Language Models: Composing Long Structured Music Pieces
by: Atassi, Lilac
Published: (2024)

Developing an AI-Guided Assistant Device for the Deaf and Hearing Impaired
by: Jiayu, et al.
Published: (2025)

SELMA: A Speech-Enabled Language Model for Virtual Assistant Interactions
by: Wagner, Dominik, et al.
Published: (2025)

Transcription-Free Fine-Tuning of Speech Separation Models for Noisy and Reverberant Multi-Speaker Automatic Speech Recognition
by: Ravenscroft, William, et al.
Published: (2024)

How to Infer Repeat Structures in MIDI Performances
by: Peter, Silvan, et al.
Published: (2025)

Transcribing Rhythmic Patterns of the Guitar Track in Polyphonic Music
by: Lukoianov, Aleksandr, et al.
Published: (2025)

Evaluating Disentangled Representations for Controllable Music Generation
by: Ibáñez-Martínez, Laura, et al.
Published: (2026)

Fine-Tuning Whisper for Inclusive Prosodic Stress Analysis
by: Sohn, Samuel S., et al.
Published: (2025)

Aligning Spoken Dialogue Models from User Interactions
by: Wu, Anne, et al.
Published: (2025)

From Coarse to Fine: Efficient Training for Audio Spectrogram Transformers
by: Feng, Jiu, et al.
Published: (2024)

Automatic Equalization for Individual Instrument Tracks Using Convolutional Neural Networks
by: Mockenhaupt, Florian, et al.
Published: (2024)

Revisiting Meter Tracking in Carnatic Music using Deep Learning Approaches
by: Prabhu, Satyajeet
Published: (2025)

Sonos Voice Control Bias Assessment Dataset: A Methodology for Demographic Bias Assessment in Voice Assistants
by: Sekkat, Chloé, et al.
Published: (2024)

Gradient Norm-based Fine-Tuning for Backdoor Defense in Automatic Speech Recognition
by: Zhou, Nanjun, et al.
Published: (2025)

Filling MIDI Velocity using U-Net Image Colorizer
by: He, Zhanhong, et al.
Published: (2025)

AMT-APC: Automatic Piano Cover by Fine-Tuning an Automatic Music Transcription Model
by: Komiya, Kazuma, et al.
Published: (2024)

Bridging the Perception Gap: A Lightweight Coarse-to-Fine Architecture for Edge Audio Systems
by: Zhang, Hengfan, et al.
Published: (2026)

Autoregressive Guidance of Deep Spatially Selective Filters using Bayesian Tracking for Efficient Extraction of Moving Speakers
by: Kienegger, Jakob, et al.
Published: (2026)

Windowed SummaryMixing: An Efficient Fine-Tuning of Self-Supervised Learning Models for Low-resource Speech Recognition
by: Menon, Aditya Srinivas, et al.
Published: (2026)

MIDI-Informed Singing Accompaniment Generation in a Compositional Song Pipeline
by: Tsai, Fang-Duo, et al.
Published: (2026)

Score-Informed Transformer for Refining MIDI Velocity in Automatic Music Transcription
by: He, Zhanhong, et al.
Published: (2025)

Compose Yourself: Average-Velocity Flow Matching for One-Step Speech Enhancement
by: Yang, Gang, et al.
Published: (2025)

Multi-Microphone and Multi-Modal Emotion Recognition in Reverberant Environment
by: Cohen, Ohad, et al.
Published: (2024)

Multi-Task Pseudo-Label Learning for Non-Intrusive Speech Quality Assessment Model
by: Zezario, Ryandhimas E., et al.
Published: (2023)

ControlSpeech: Towards Simultaneous and Independent Zero-shot Speaker Cloning and Zero-shot Language Style Control
by: Ji, Shengpeng, et al.
Published: (2024)

Automatic Music Sample Identification with Multi-Track Contrastive Learning
by: Riou, Alain, et al.
Published: (2025)

Deep Learning-based Non-Intrusive Multi-Objective Speech Assessment Model with Cross-Domain Features
by: Zezario, Ryandhimas E., et al.
Published: (2021)