Staff View: STraDa: A Singer Traits Dataset :: Library Catalog

Bibliographic Details
Main Authors:	Kong, Yuexuan; Tran, Viet-Anh; Hennequin, Romain
Format:	Preprint
Published:	2024
Subjects:	Sound; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2406.04140

_version_	1866916277960835072
author	Kong, Yuexuan; Tran, Viet-Anh; Hennequin, Romain
author_facet	Kong, Yuexuan; Tran, Viet-Anh; Hennequin, Romain
contents	There are few large-scale public datasets that contain downloadable music audio files and rich lead singer metadata. To provide such a dataset to benefit research in singing voices, we created the Singer Traits Dataset (STraDa) with two subsets: automatic-strada and annotated-strada. The automatic-strada subset contains twenty-five thousand tracks across numerous genres and languages by more than five thousand unique lead singers, and includes cross-validated lead singer metadata as well as other track metadata. The annotated-strada subset consists of two hundred tracks that are balanced in terms of 2 genders, 5 languages, and 4 age groups. To show its use for model training and bias analysis, made possible by its rich metadata and downloadable audio files, we benchmarked singer sex classification (SSC) and conducted bias analysis.
format	Preprint
id	arxiv_https___arxiv_org_abs_2406_04140
institution	arXiv
publishDate	2024
record_format	arxiv
title	STraDa: A Singer Traits Dataset
topic	Sound; Audio and Speech Processing
url	https://arxiv.org/abs/2406.04140

Similar Items