Bibliographic Details
Main Authors:	Labrador, Beltrán; Otero-Gonzalez, Manuel; Lozano-Diez, Alicia; Ramos, Daniel; Toledano, Doroteo T.; Gonzalez-Rodriguez, Joaquin
Format:	Preprint
Published:	2023
Subjects:	Sound; Machine Learning; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2401.09441

Table of Contents:

This paper presents VoxCeleb-ESP, a collection of pointers and timestamps to YouTube videos facilitating the creation of a novel speaker recognition dataset. VoxCeleb-ESP captures real-world scenarios, incorporating diverse speaking styles, noises, and channel distortions. It includes 160 Spanish celebrities spanning various categories, ensuring a representative distribution across age groups and geographic regions in Spain. We provide two speaker trial lists for speaker identification tasks, each of them with same-video or different-video target trials respectively, accompanied by a cross-lingual evaluation of ResNet pretrained models. Preliminary speaker identification results suggest that the complexity of the detection task in VoxCeleb-ESP is equivalent to that of the original and much larger VoxCeleb in English. VoxCeleb-ESP contributes to the expansion of speaker recognition benchmarks with a comprehensive and diverse dataset for the Spanish language.
