Staff View: :: Library Catalog

Bibliographic Details
Main Authors:	Liu, Xuechen; Wang, Xin; Yamagishi, Junichi
Format:	Preprint
Published:	2025
Subjects:	Sound
Online Access:	https://arxiv.org/abs/2509.21728

_version_	1866911361719599104
author	Liu, Xuechen; Wang, Xin; Yamagishi, Junichi
author_facet	Liu, Xuechen; Wang, Xin; Yamagishi, Junichi
contents	Modern audio deepfake detectors built on foundation models and large training datasets achieve promising detection performance. However, they struggle with zero-day attacks, in which the audio samples are generated by novel synthesis methods that the models have not seen in their existing training data. Conventional approaches fine-tune the detector, which can be problematic when a prompt response is needed. This paper proposes a training-free, retrieval-augmented framework for zero-day audio deepfake detection that leverages knowledge representations and voice profile matching. Within this framework, we propose simple yet effective retrieval and ensemble methods that reach performance comparable to supervised baselines and their fine-tuned counterparts on the DeepFake-Eval-2024 benchmark, without any additional model training. We also conduct an ablation study on voice profile attributes and demonstrate the cross-database generalizability of the framework by introducing simple, training-free fusion strategies.
format	Preprint
id	arxiv_https___arxiv_org_abs_2509_21728
institution	arXiv
publishDate	2025
record_format	arxiv
title	Zero-Day Audio DeepFake Detection via Retrieval Augmentation and Profile Matching
topic	Sound
url	https://arxiv.org/abs/2509.21728
