Staff View: Zero-Shot Multi-Lingual Speaker Verification in Clinical Trials :: Library Catalog

Bibliographic Details
Main Authors:	Akram, Ali; Stanojevic, Marija; Ehghaghi, Malikeh; Novikova, Jekaterina
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Sound; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2404.01981

_version_	1866913302133604352
author	Akram, Ali; Stanojevic, Marija; Ehghaghi, Malikeh; Novikova, Jekaterina
author_facet	Akram, Ali; Stanojevic, Marija; Ehghaghi, Malikeh; Novikova, Jekaterina
contents	Due to the substantial number of clinicians, patients, and data collection environments involved in clinical trials, gathering data of superior quality poses a significant challenge. In clinical trials, patients are assessed based on their speech data to detect and monitor cognitive and mental health disorders. We propose using these speech recordings to verify the identities of enrolled patients and identify and exclude the individuals who try to enroll multiple times in the same trial. Since clinical studies are often conducted across different countries, creating a system that can perform speaker verification in diverse languages without additional development effort is imperative. We evaluate pre-trained TitaNet, ECAPA-TDNN, and SpeakerNet models by enrolling and testing with speech-impaired patients speaking English, German, Danish, Spanish, and Arabic languages. Our results demonstrate that tested models can effectively generalize to clinical speakers, with less than 2.7% EER for European Languages and 8.26% EER for Arabic. This represents a significant step in developing more versatile and efficient speaker verification systems for cognitive and mental health clinical trials that can be used across a wide range of languages and dialects, substantially reducing the effort required to develop speaker verification systems for multiple languages. We also evaluate how speech tasks and number of speakers involved in the trial influence the performance and show that the type of speech tasks impacts the model performance.
format	Preprint
id	arxiv_https___arxiv_org_abs_2404_01981
institution	arXiv
publishDate	2024
record_format	arxiv
title	Zero-Shot Multi-Lingual Speaker Verification in Clinical Trials
topic	Machine Learning; Sound; Audio and Speech Processing
url	https://arxiv.org/abs/2404.01981
