Bibliographic Details
Main Authors:	Serre, Thomas; Fontaine, Mathieu; Benhaim, Éric; Dutour, Geoffroy; Essid, Slim
Format:	Preprint
Published:	2024
Subjects:	Sound; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2404.08022

Table of Contents:

Isolating the desired speaker's voice amidst multiple speakers in a noisy acoustic context is a challenging task. Personalized speech enhancement (PSE) endeavours to achieve this by leveraging prior knowledge of the speaker's voice. Recent research efforts have yielded promising PSE models, albeit often accompanied by computationally intensive architectures, unsuitable for resource-constrained embedded devices. In this paper, we introduce a novel method to personalize a lightweight dual-stage Speech Enhancement (SE) model and implement it within DeepFilterNet2, a SE model renowned for its state-of-the-art performance. We seek an optimal integration of speaker information within the model, exploring different positions for the integration of the speaker embeddings within the dual-stage enhancement architecture. We also investigate a tailored training strategy when adapting DeepFilterNet2 to a PSE task. We show that our personalization method greatly improves the performances of DeepFilterNet2 while preserving minimal computational overhead.

Similar Items