Staff View :: Library Catalog

Bibliographic Details
Main Authors:	Gupta, Shubham; Gomez-Sarmiento, Isaac Neri; Mezdari, Faez Amjed; Ravanelli, Mirco; Subakan, Cem
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Artificial Intelligence; Sound; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2410.05455

_version_	1866913535926206464
author	Gupta, Shubham; Gomez-Sarmiento, Isaac Neri; Mezdari, Faez Amjed; Ravanelli, Mirco; Subakan, Cem
author_facet	Gupta, Shubham; Gomez-Sarmiento, Isaac Neri; Mezdari, Faez Amjed; Ravanelli, Mirco; Subakan, Cem
contents	We propose a novel approach for humming transcription that combines a CNN-based architecture with a dynamic programming-based post-processing algorithm, utilizing the recently introduced HumTrans dataset. We identify and address inherent problems with the offset and onset ground truth provided by the dataset, offering heuristics to improve these annotations, resulting in a dataset with precise annotations that will aid future research. Additionally, we compare the transcription accuracy of our method against several others, demonstrating state-of-the-art (SOTA) results. All our code and the corrected dataset are available at https://github.com/shubham-gupta-30/humming_transcription
format	Preprint
id	arxiv_https___arxiv_org_abs_2410_05455
institution	arXiv
publishDate	2024
record_format	arxiv
title	Dynamic HumTrans: Humming Transcription Using CNNs and Dynamic Programming
topic	Machine Learning; Artificial Intelligence; Sound; Audio and Speech Processing
url	https://arxiv.org/abs/2410.05455
