Library Catalog

Bibliographic Details
Main Authors:	Liu, Xuechen, Wang, Xin, Yamagishi, Junichi
Format:	Preprint
Published:	2025
Subjects:	Sound
Online Access:	https://arxiv.org/abs/2509.21728

Similar Items

A Preliminary Case Study on Long-Form In-the-Wild Audio Spoofing Detection
by: Liu, Xuechen, et al.
Published: (2024)

LENS-DF: Deepfake Detection and Temporal Localization for Long-Form Noisy Speech
by: Liu, Xuechen, et al.
Published: (2025)

From Sharpness to Better Generalization for Speech Deepfake Detection
by: Huang, Wen, et al.
Published: (2025)

Improving curriculum learning for target speaker extraction with synthetic speakers
by: Liu, Yun, et al.
Published: (2024)

DeepFake Doctor: Diagnosing and Treating Audio-Video Fake Detection
by: Klemt, Marcel, et al.
Published: (2025)

Training Dynamics-Aware Multi-Factor Curriculum Learning for Target Speaker Extraction
by: Liu, Yun, et al.
Published: (2026)

Explaining Speaker and Spoof Embeddings via Probing
by: Liu, Xuechen, et al.
Published: (2024)

Quantifying Source Speaker Leakage in One-to-One Voice Conversion
by: Wellington, Scott, et al.
Published: (2025)

Can DeepFake Speech be Reliably Detected?
by: Liu, Hongbin, et al.
Published: (2024)

Target Speaker Extraction with Curriculum Learning
by: Liu, Yun, et al.
Published: (2024)

Libri2Vox Dataset: Target Speaker Extraction with Diverse Speaker Conditions and Synthetic Data
by: Liu, Yun, et al.
Published: (2024)

Disentangling the Prosody and Semantic Information with Pre-trained Model for In-Context Learning based Zero-Shot Voice Conversion
by: Chen, Zhengyang, et al.
Published: (2024)

Are audio DeepFake detection models polyglots?
by: Marek, Bartłomiej, et al.
Published: (2024)

AfriHuBERT: A self-supervised speech representation model for African languages
by: Alabi, Jesujoba O., et al.
Published: (2024)

Generating Speakers by Prompting Listener Impressions for Pre-trained Multi-Speaker Text-to-Speech Systems
by: Chen, Zhengyang, et al.
Published: (2024)

Self Voice Conversion as an Attack against Neural Audio Watermarking
by: Özer, Yigitcan, et al.
Published: (2026)

Generalized Fake Audio Detection via Deep Stable Learning
by: Wang, Zhiyong, et al.
Published: (2024)

Speech DF Arena: A Leaderboard for Speech DeepFake Detection Models
by: Dowerah, Sandipana, et al.
Published: (2025)

Zero-Shot Fake Video Detection by Audio-Visual Consistency
by: Li, Xiaolou, et al.
Published: (2024)

SCDF: A Speaker Characteristics DeepFake Speech Dataset for Bias Analysis
by: Staněk, Vojtěch, et al.
Published: (2025)

PolyGlotFake: A Novel Multilingual and Multimodal DeepFake Dataset
by: Hou, Yang, et al.
Published: (2024)

ASVspoof 5: Evaluation of Spoofing, Deepfake, and Adversarial Attack Detection Using Crowdsourced Speech
by: Wang, Xin, et al.
Published: (2026)

A Comparative Study on Proactive and Passive Detection of Deepfake Speech
by: Wu, Chia-Hua, et al.
Published: (2025)

A Noval Feature via Color Quantisation for Fake Audio Detection
by: Wang, Zhiyong, et al.
Published: (2024)

Spoof Diarization: "What Spoofed When" in Partially Spoofed Audio
by: Zhang, Lin, et al.
Published: (2024)

EmoFake: An Initial Dataset for Emotion Fake Audio Detection
by: Zhao, Yan, et al.
Published: (2022)

ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations
by: Gong, Cheng, et al.
Published: (2023)

Towards An Integrated Approach for Expressive Piano Performance Synthesis from Music Scores
by: Tang, Jingjing, et al.
Published: (2025)

Spoofing-Aware Speaker Verification Robust Against Domain and Channel Mismatches
by: Zeng, Chang, et al.
Published: (2024)

Retrieval-Augmented Audio Deepfake Detection
by: Kang, Zuheng, et al.
Published: (2024)

Evaluating Fake Music Detection Performance Under Audio Augmentations
by: Sroka, Tomasz, et al.
Published: (2025)

Mitigating Language Mismatch in SSL-Based Speaker Anonymization
by: Zhang, Zhe, et al.
Published: (2025)

Spoofing attack augmentation: can differently-trained attack models improve generalisation?
by: Ge, Wanying, et al.
Published: (2023)

Self-Attention and Hybrid Features for Replay and Deep-Fake Audio Detection
by: Huang, Lian, et al.
Published: (2024)

Genuine-Focused Learning using Mask AutoEncoder for Generalized Fake Audio Detection
by: Wang, Xiaopeng, et al.
Published: (2024)

SceneFake: An Initial Dataset and Benchmarks for Scene Fake Audio Detection
by: Yi, Jiangyan, et al.
Published: (2022)

Revisiting and Improving Scoring Fusion for Spoofing-aware Speaker Verification Using Compositional Data Analysis
by: Wang, Xin, et al.
Published: (2024)

Trusted Fake Audio Detection Based on Dirichlet Distribution
by: Ding, Chi, et al.
Published: (2025)

SAVe: Self-Supervised Audio-visual Deepfake Detection Exploiting Visual Artifacts and Audio-visual Misalignment
by: Shahzad, Sahibzada Adil, et al.
Published: (2026)

Retrieval-Augmented Text-to-Audio Generation
by: Yuan, Yi, et al.
Published: (2023)