:: Library Catalog

Bibliographic Details
Main Authors:	Cooper, Erica; Maguer, Sébastien Le; Klabbers, Esther; Yamagishi, Junichi
Format:	Preprint
Published:	2025
Subjects:	Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2503.03250

Similar Items

Towards An Integrated Approach for Expressive Piano Performance Synthesis from Music Scores
by: Tang, Jingjing, et al.
Published: (2025)

Generating Speakers by Prompting Listener Impressions for Pre-trained Multi-Speaker Text-to-Speech Systems
by: Chen, Zhengyang, et al.
Published: (2024)

Spoofing-Aware Speaker Verification Robust Against Domain and Channel Mismatches
by: Zeng, Chang, et al.
Published: (2024)

AfriHuBERT: A self-supervised speech representation model for African languages
by: Alabi, Jesujoba O., et al.
Published: (2024)

Towards Data Drift Monitoring for Speech Deepfake Detection in the context of MLOps
by: Wang, Xin, et al.
Published: (2025)

FakeMark: Deepfake Speech Attribution With Watermarked Artifacts
by: Ge, Wanying, et al.
Published: (2025)

Does Fine-tuning by Reinforcement Learning Improve Generalization in Binary Speech Deepfake Detection?
by: Wang, Xin, et al.
Published: (2026)

VoxEffects: A Speech-Oriented Audio Effects Dataset and Benchmark
by: Zhang, Zhe, et al.
Published: (2026)

ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations
by: Gong, Cheng, et al.
Published: (2023)

Spoof Diarization: "What Spoofed When" in Partially Spoofed Audio
by: Zhang, Lin, et al.
Published: (2024)

Improving curriculum learning for target speaker extraction with synthetic speakers
by: Liu, Yun, et al.
Published: (2024)

Post-training for Deepfake Speech Detection
by: Ge, Wanying, et al.
Published: (2025)

The VoiceMOS Challenge 2024: Beyond Speech Quality Prediction
by: Huang, Wen-Chin, et al.
Published: (2024)

Target Speaker Extraction with Curriculum Learning
by: Liu, Yun, et al.
Published: (2024)

Libri2Vox Dataset: Target Speaker Extraction with Diverse Speaker Conditions and Synthetic Data
by: Liu, Yun, et al.
Published: (2024)

Explaining Speaker and Spoof Embeddings via Probing
by: Liu, Xuechen, et al.
Published: (2024)

Quantifying Source Speaker Leakage in One-to-One Voice Conversion
by: Wellington, Scott, et al.
Published: (2025)

A Preliminary Case Study on Long-Form In-the-Wild Audio Spoofing Detection
by: Liu, Xuechen, et al.
Published: (2024)

Human perception of audio deepfakes: the role of language and speaking style
by: Segundo, Eugenia San, et al.
Published: (2025)

From Sharpness to Better Generalization for Speech Deepfake Detection
by: Huang, Wen, et al.
Published: (2025)

Mitigating Language Mismatch in SSL-Based Speaker Anonymization
by: Zhang, Zhe, et al.
Published: (2025)

Assessing speech quality metrics for evaluation of neural audio codecs under clean speech conditions
by: Mack, Wolfgang, et al.
Published: (2025)

An Initial Investigation of Language Adaptation for TTS Systems under Low-resource Scenarios
by: Gong, Cheng, et al.
Published: (2024)

LENS-DF: Deepfake Detection and Temporal Localization for Long-Form Noisy Speech
by: Liu, Xuechen, et al.
Published: (2025)

Revisiting and Improving Scoring Fusion for Spoofing-aware Speaker Verification Using Compositional Data Analysis
by: Wang, Xin, et al.
Published: (2024)

Disentangling the Prosody and Semantic Information with Pre-trained Model for In-Context Learning based Zero-Shot Voice Conversion
by: Chen, Zhengyang, et al.
Published: (2024)

Deepfake Word Detection by Next-token Prediction using Fine-tuned Whisper
by: Tran, Hoan My, et al.
Published: (2026)

The First VoicePrivacy Attacker Challenge
by: Tomashenko, Natalia, et al.
Published: (2025)

The First VoicePrivacy Attacker Challenge Evaluation Plan
by: Tomashenko, Natalia, et al.
Published: (2024)

Spoofing attack augmentation: can differently-trained attack models improve generalisation?
by: Ge, Wanying, et al.
Published: (2023)

ASVspoof 5: Design, Collection and Validation of Resources for Spoofing, Deepfake, and Adversarial Attack Detection Using Crowdsourced Speech
by: Wang, Xin, et al.
Published: (2025)

The VoicePrivacy 2022 Challenge: Progress and Perspectives in Voice Anonymisation
by: Panariello, Michele, et al.
Published: (2024)

Ensemble of classifiers for speech evaluation
by: Belokrylov, G., et al.
Published: (2024)

SHEET: A Multi-purpose Open-source Speech Human Evaluation Estimation Toolkit
by: Huang, Wen-Chin, et al.
Published: (2025)

MOS-Bench: Benchmarking Generalization Abilities of Subjective Speech Quality Assessment Models
by: Huang, Wen-Chin, et al.
Published: (2024)

CodecMOS-Accent: A MOS Benchmark of Resynthesized and TTS Speech from Neural Codecs Across English Accents
by: Huang, Wen-Chin, et al.
Published: (2026)

MUSHRA-1S: A scalable and sensitive test approach for evaluating top-tier speech processing systems
by: Lechler, Laura, et al.
Published: (2025)

Target speaker anonymization in multi-speaker recordings
by: Tomashenko, Natalia, et al.
Published: (2025)

MIDI-VALLE: Improving Expressive Piano Performance Synthesis Through Neural Codec Language Modelling
by: Tang, Jingjing, et al.
Published: (2025)

Lightweight speech enhancement guided target speech extraction in noisy multi-speaker scenarios
by: Huang, Ziling, et al.
Published: (2025)