Bibliographic Details
Main Authors: Wang, Yikang, Wang, Xingming, Nishizaki, Hiromitsu, Li, Ming
Format: Preprint
Published: 2024
Subjects:
Online Access: https://arxiv.org/abs/2407.20111
_version_ 1866910545713561600
author Wang, Yikang
Wang, Xingming
Nishizaki, Hiromitsu
Li, Ming
author_facet Wang, Yikang
Wang, Xingming
Nishizaki, Hiromitsu
Li, Ming
contents Current research in synthesized speech detection primarily focuses on the generalization of detection systems to unknown spoofing methods applied to noise-free speech. However, anti-spoofing countermeasure (CM) systems often perform worse in more challenging scenarios, such as those involving noise and reverberation. To enhance the robustness of CM systems, we propose a transfer learning-based speech enhancement front-end joint optimization (TL-SEJ) method and investigate its effectiveness in improving robustness against noise and reverberation. We evaluated the proposed method's performance through a series of comparative and ablation experiments. The experimental results show that, across different signal-to-noise ratio test conditions, the proposed TL-SEJ method improves recognition accuracy by 2.7% to 15.8% compared to the baseline. Compared to conventional data augmentation methods, our system achieves an accuracy improvement ranging from 0.7% to 5.8% in various noisy conditions and from 1.7% to 2.8% under different RT60 reverberation scenarios. These experiments demonstrate that the proposed method effectively enhances system robustness in noisy and reverberant conditions.
format Preprint
id arxiv_https___arxiv_org_abs_2407_20111
institution arXiv
publishDate 2024
record_format arxiv
spellingShingle Enhancing Anti-spoofing Countermeasures Robustness through Joint Optimization and Transfer Learning
Wang, Yikang
Wang, Xingming
Nishizaki, Hiromitsu
Li, Ming
Sound
Audio and Speech Processing
Signal Processing
Current research in synthesized speech detection primarily focuses on the generalization of detection systems to unknown spoofing methods applied to noise-free speech. However, anti-spoofing countermeasure (CM) systems often perform worse in more challenging scenarios, such as those involving noise and reverberation. To enhance the robustness of CM systems, we propose a transfer learning-based speech enhancement front-end joint optimization (TL-SEJ) method and investigate its effectiveness in improving robustness against noise and reverberation. We evaluated the proposed method's performance through a series of comparative and ablation experiments. The experimental results show that, across different signal-to-noise ratio test conditions, the proposed TL-SEJ method improves recognition accuracy by 2.7% to 15.8% compared to the baseline. Compared to conventional data augmentation methods, our system achieves an accuracy improvement ranging from 0.7% to 5.8% in various noisy conditions and from 1.7% to 2.8% under different RT60 reverberation scenarios. These experiments demonstrate that the proposed method effectively enhances system robustness in noisy and reverberant conditions.
title Enhancing Anti-spoofing Countermeasures Robustness through Joint Optimization and Transfer Learning
topic Sound
Audio and Speech Processing
Signal Processing
url https://arxiv.org/abs/2407.20111