:: Library Catalog

Bibliographic Details
Main Authors:	Zhang, Lin; Rohdin, Johan; Wang, Xin; Peng, Junyi; Liu, Tianchi; Zhang, You; Luong, Hieu-Thi; Wang, Shuai; Liang, Chengdong; Silnova, Anna; Evans, Nicholas
Format:	Preprint
Published:	2026
Subjects:	Sound; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2601.15240

Similar Items

Challenging margin-based speaker embedding extractors by using the variational information bottleneck
by: Stafylakis, Themos, et al.
Published: (2024)

Leveraging Self-Supervised Learning for Speaker Diarization
by: Han, Jiangyu, et al.
Published: (2024)

Fine-tune Before Structured Pruning: Towards Compact and Accurate Self-Supervised Models for Speaker Diarization
by: Han, Jiangyu, et al.
Published: (2025)

Hybrid Pruning: In-Situ Compression of Self-Supervised Speech Models for Speaker Verification and Anti-Spoofing
by: Peng, Junyi, et al.
Published: (2025)

BUT Systems and Analyses for the ASVspoof 5 Challenge
by: Rohdin, Johan, et al.
Published: (2024)

LlamaPartialSpoof: An LLM-Driven Fake Speech Dataset Simulating Disinformation Generation
by: Luong, Hieu-Thi, et al.
Published: (2024)

Room Impulse Responses help attackers to evade Deep Fake Detection
by: Luong, Hieu-Thi, et al.
Published: (2024)

BUT Systems for WildSpoof Challenge: SASV in the Wild
by: Peng, Junyi, et al.
Published: (2025)

Analysis of ABC Frontend Audio Systems for the NIST-SRE24
by: Barahona, Sara, et al.
Published: (2025)

Robust Localization of Partially Fake Speech: Metrics and Out-of-Domain Evaluation
by: Luong, Hieu-Thi, et al.
Published: (2025)

RADAR Challenge 2026: Robust Audio Deepfake Recognition under Media Transformations
by: Luong, Hieu-Thi, et al.
Published: (2026)

EmoFake: An Initial Dataset for Emotion Fake Audio Detection
by: Zhao, Yan, et al.
Published: (2022)

WeSep: A Scalable and Flexible Toolkit Towards Generalizable Target Speaker Extraction
by: Wang, Shuai, et al.
Published: (2024)

Spatially Aware Self-Supervised Models for Multi-Channel Neural Speaker Diarization
by: Han, Jiangyu, et al.
Published: (2025)

Audio Deepfake Verification
by: Wang, Li, et al.
Published: (2025)

Bayesian Learning for Domain-Invariant Speaker Verification and Anti-Spoofing
by: Li, Jin, et al.
Published: (2025)

Amphion: An Open-Source Audio, Music and Speech Generation Toolkit
by: Zhang, Xueyao, et al.
Published: (2023)

Integrated Spoofing-Robust Automatic Speaker Verification via a Three-Class Formulation and LLR
by: Tan, Kai, et al.
Published: (2026)

Efficient and Generalizable Speaker Diarization via Structured Pruning of Self-Supervised Models
by: Han, Jiangyu, et al.
Published: (2025)

PhiNet: Speaker Verification with Phonetic Interpretability
by: Ma, Yi, et al.
Published: (2026)

SceneFake: An Initial Dataset and Benchmarks for Scene Fake Audio Detection
by: Yi, Jiangyan, et al.
Published: (2022)

VERSA: A Versatile Evaluation Toolkit for Speech, Audio, and Music
by: Shi, Jiatong, et al.
Published: (2024)

Generalized Fake Audio Detection via Deep Stable Learning
by: Wang, Zhiyong, et al.
Published: (2024)

Do End-to-End Neural Diarization Attractors Need to Encode Speaker Characteristic Information?
by: Zhang, Lin, et al.
Published: (2024)

Overview of the Amphion Toolkit (v0.2)
by: Li, Jiaqi, et al.
Published: (2025)

Trusted Fake Audio Detection Based on Dirichlet Distribution
by: Ding, Chi, et al.
Published: (2025)

A Noval Feature via Color Quantisation for Fake Audio Detection
by: Wang, Zhiyong, et al.
Published: (2024)

CodecFake: Enhancing Anti-Spoofing Models Against Deepfake Audios from Codec-Based Speech Synthesis Systems
by: Wu, Haibin, et al.
Published: (2024)

NTU-NPU System for Voice Privacy 2024 Challenge
by: Kuzmin, Nikita, et al.
Published: (2024)

ExPO: Explainable Phonetic Trait-Oriented Network for Speaker Verification
by: Ma, Yi, et al.
Published: (2025)

Can LLMs Help Localize Fake Words in Partially Fake Speech?
by: Zhang, Lin, et al.
Published: (2026)

Zero-Shot Fake Video Detection by Audio-Visual Consistency
by: Li, Xiaolou, et al.
Published: (2024)

Detecting and Defending Against Adversarial Attacks on Automatic Speech Recognition via Diffusion Models
by: Kühne, Nikolai L., et al.
Published: (2024)

Genuine-Focused Learning using Mask AutoEncoder for Generalized Fake Audio Detection
by: Wang, Xiaopeng, et al.
Published: (2024)

Classical Machine Learning Baselines for Deepfake Audio Detection on the Fake-or-Real Dataset
by: Ahmad, Faheem, et al.
Published: (2026)

Can Audio Large Language Models Verify Speaker Identity?
by: Ren, Yiming, et al.
Published: (2025)

Spoof Diarization: "What Spoofed When" in Partially Spoofed Audio
by: Zhang, Lin, et al.
Published: (2024)

Audio-Mind: An Auditable Agentic Framework for Audio Understanding
by: Wang, Yucheng, et al.
Published: (2026)

Mixture of Experts Fusion for Fake Audio Detection Using Frozen wav2vec 2.0
by: Wang, Zhiyong, et al.
Published: (2024)

SingFake: Singing Voice Deepfake Detection
by: Zang, Yongyi, et al.
Published: (2023)