Library Catalog

Bibliographic Details
Main Authors:	Edwards, Drew; Riley, Xavier; Sarmento, Pedro; Dixon, Simon
Format:	Preprint
Published:	2024
Subjects:	Sound; Computation and Language; Information Retrieval
Online Access:	https://arxiv.org/abs/2408.05024

Similar Items

SynthTab: Leveraging Synthesized Data for Guitar Tablature Transcription
by: Zang, Yongyi, et al.
Published: (2023)

High Resolution Guitar Transcription via Domain Adaptation
by: Riley, Xavier, et al.
Published: (2024)

GAPS: A Large and Diverse Classical Guitar Dataset and Benchmark Transcription Model
by: Riley, Xavier, et al.
Published: (2024)

The GigaMIDI Dataset with Features for Expressive Music Performance Detection
by: Lee, Keon Ju Maverick, et al.
Published: (2025)

A Machine Learning Approach for MIDI to Guitar Tablature Conversion
by: Kaliakatsos-Papakostas, Maximos, et al.
Published: (2025)

Fretting-Transformer: Encoder-Decoder Model for MIDI to Tablature Transcription
by: Hamberger, Anna, et al.
Published: (2025)

GuitarFlow: Realistic Electric Guitar Synthesis From Tablatures via Flow Matching and Style Transfer
by: Loth, Jackson, et al.
Published: (2025)

Music Information Retrieval on Representative Mexican Folk Vocal Melodies Through MIDI Feature Extraction
by: Reyes, Mario Alberto Vallejo
Published: (2025)

Guitar Chord Diagram Suggestion for Western Popular Music
by: d'Hooge, Alexandre, et al.
Published: (2024)

GOAT: A Large Dataset of Paired Guitar Audio Recordings and Tablatures
by: Loth, Jackson, et al.
Published: (2025)

CLASP: Contrastive Language-Speech Pretraining for Multilingual Multimodal Information Retrieval
by: Abootorabi, Mohammad Mahdi, et al.
Published: (2024)

WikiMuTe: A web-sourced dataset of semantic descriptions for music audio
by: Weck, Benno, et al.
Published: (2023)

Multi-Modal Retrieval For Large Language Model Based Speech Recognition
by: Kolehmainen, Jari, et al.
Published: (2024)

Transforming LLMs into Cross-modal and Cross-lingual Retrieval Systems
by: Gomez, Frank Palma, et al.
Published: (2024)

SpeechDPR: End-to-End Spoken Passage Retrieval for Open-Domain Spoken Question Answering
by: Lin, Chyi-Jiunn, et al.
Published: (2024)

Analyzing Byte-Pair Encoding on Monophonic and Polyphonic Symbolic Music: A Focus on Musical Phrase Segmentation
by: Le, Dinh-Viet-Toan, et al.
Published: (2024)

MIDI-LLM: Adapting Large Language Models for Text-to-MIDI Music Generation
by: Wu, Shih-Lun, et al.
Published: (2025)

I can listen but cannot read: An evaluation of two-tower multimodal systems for instrument recognition
by: Vasilakis, Yannis, et al.
Published: (2024)

A GEN AI Framework for Medical Note Generation
by: Leong, Hui Yi, et al.
Published: (2024)

More than words: Advancements and challenges in speech recognition for singing
by: Kruspe, Anna
Published: (2024)

Beyond Musical Descriptors: Extracting Preference-Bearing Intent in Music Queries
by: Baranes, Marion, et al.
Published: (2026)

Technical Report on classification of literature related to children speech disorder
by: Wang, Ziang, et al.
Published: (2025)

Navigating Speech Recording Collections with AI-Generated Illustrations
by: Håland, Sirina, et al.
Published: (2025)

The Kolmogorov Complexity of Irish traditional dance music
by: McGettrick, Michael, et al.
Published: (2024)

Can Impressions of Music be Extracted from Thumbnail Images?
by: Harada, Takashi, et al.
Published: (2025)

Separating the "Chirp" from the "Chat": Self-supervised Visual Grounding of Sound and Language
by: Hamilton, Mark, et al.
Published: (2024)

Leveraging Real Electric Guitar Tones and Effects to Improve Robustness in Guitar Tablature Transcription Modeling
by: Pedroza, Hegel, et al.
Published: (2024)

A Lightweight Two-Branch Architecture for Multi-Instrument Transcription via Note-Level Contrastive Clustering
by: Li, Ruigang, et al.
Published: (2025)

Efficient Inference for Large Language Model-based Generative Recommendation
by: Lin, Xinyu, et al.
Published: (2024)

The language of sound search: Examining User Queries in Audio Search Engines
by: Weck, Benno, et al.
Published: (2024)

wav2graph: A Framework for Supervised Learning Knowledge Graph from Speech
by: Le-Duc, Khai, et al.
Published: (2024)

TabEmbed: Benchmarking and Learning Generalist Embeddings for Tabular Understanding
by: Qiang, Minjie, et al.
Published: (2026)

LoopLens: Supporting Search as Creation in Loop-Based Music Composition
by: Long, Sheng, et al.
Published: (2026)

Quality Over Quantity? LLM-Based Curation for a Data-Efficient Audio-Video Foundation Model
by: Vosoughi, Ali, et al.
Published: (2025)

TALKPLAY: Multimodal Music Recommendation with Large Language Models
by: Doh, Seungheon, et al.
Published: (2025)

Towards Explainable and Interpretable Musical Difficulty Estimation: A Parameter-efficient Approach
by: Ramoneda, Pedro, et al.
Published: (2024)

PianoBind: A Multimodal Joint Embedding Model for Pop-piano Music
by: Bang, Hayeon, et al.
Published: (2025)

Cosmodoit: A Python Package for Adaptive, Efficient Pipelining of Feature Extraction from Performed Music
by: Guichaoua, Corentin, et al.
Published: (2026)

Understanding Human Perception of Music Plagiarism Through a Computational Approach
by: Hwang, Daeun, et al.
Published: (2026)

Advancing Multi-Instrument Music Transcription: Results from the 2025 AMT Challenge
by: Chaturvedi, Ojas, et al.
Published: (2026)