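Staff View: Enhancing Anti-spoofing Countermeasures Robustness through Joint Optimization and Transfer Learning :: Library Catalog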

Bibliographic Details
Main Authors:	Wang, Yikang; Wang, Xingming; Nishizaki, Hiromitsu; Li, Ming
Format:	Preprint
Published:	2024
Subjects:	Sound; Audio and Speech Processing; Signal Processing
Online Access:	https://arxiv.org/abs/2407.20111

_version_	1866910545713561600
author	Wang, Yikang; Wang, Xingming; Nishizaki, Hiromitsu; Li, Ming
author_facet	Wang, Yikang; Wang, Xingming; Nishizaki, Hiromitsu; Li, Ming
contents	Current research in synthesized speech detection primarily focuses on the generalization of detection systems to unknown spoofing methods of noise-free speech. However, the performance of anti-spoofing countermeasure (CM) systems often degrades in more challenging scenarios, such as those involving noise and reverberation. To address the problem of enhancing the robustness of CM systems, we propose a transfer learning-based speech enhancement front-end joint optimization (TL-SEJ) method, investigating its effectiveness in improving robustness against noise and reverberation. We evaluated the proposed method's performance through a series of comparative and ablation experiments. The experimental results show that, across different signal-to-noise ratio test conditions, the proposed TL-SEJ method improves recognition accuracy by 2.7% to 15.8% compared to the baseline. Compared to conventional data augmentation methods, our system achieves an accuracy improvement ranging from 0.7% to 5.8% in various noisy conditions and from 1.7% to 2.8% under different RT60 reverberation scenarios. These experiments demonstrate that the proposed method effectively enhances system robustness in noisy and reverberant conditions.
format	Preprint
id	arxiv_https___arxiv_org_abs_2407_20111
institution	arXiv
publishDate	2024
record_format	arxiv
title	Enhancing Anti-spoofing Countermeasures Robustness through Joint Optimization and Transfer Learning
topic	Sound; Audio and Speech Processing; Signal Processing
url	https://arxiv.org/abs/2407.20111

Similar Items