| Main Authors: | Liu, Yun, Liu, Xuechen, Miao, Xiaoxiao, Yamagishi, Junichi |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.04943 |
Similar Items
Target Speaker Extraction with Curriculum Learning
by: Liu, Yun, et al.
Published: (2024)
Libri2Vox Dataset: Target Speaker Extraction with Diverse Speaker Conditions and Synthetic Data
by: Liu, Yun, et al.
Published: (2024)
Quantifying Source Speaker Leakage in One-to-One Voice Conversion
by: Wellington, Scott, et al.
Published: (2025)
Improving curriculum learning for target speaker extraction with synthetic speakers
by: Liu, Yun, et al.
Published: (2024)
Generating Speakers by Prompting Listener Impressions for Pre-trained Multi-Speaker Text-to-Speech Systems
by: Chen, Zhengyang, et al.
Published: (2024)
Explaining Speaker and Spoof Embeddings via Probing
by: Liu, Xuechen, et al.
Published: (2024)
Spoofing-Aware Speaker Verification Robust Against Domain and Channel Mismatches
by: Zeng, Chang, et al.
Published: (2024)
Zero-Day Audio DeepFake Detection via Retrieval Augmentation and Profile Matching
by: Liu, Xuechen, et al.
Published: (2025)
Mitigating Language Mismatch in SSL-Based Speaker Anonymization
by: Zhang, Zhe, et al.
Published: (2025)
A Preliminary Case Study on Long-Form In-the-Wild Audio Spoofing Detection
by: Liu, Xuechen, et al.
Published: (2024)
LENS-DF: Deepfake Detection and Temporal Localization for Long-Form Noisy Speech
by: Liu, Xuechen, et al.
Published: (2025)
AfriHuBERT: A self-supervised speech representation model for African languages
by: Alabi, Jesujoba O., et al.
Published: (2024)
From Sharpness to Better Generalization for Speech Deepfake Detection
by: Huang, Wen, et al.
Published: (2025)
Disentangling the Prosody and Semantic Information with Pre-trained Model for In-Context Learning based Zero-Shot Voice Conversion
by: Chen, Zhengyang, et al.
Published: (2024)
Training-Free Multi-Step Inference for Target Speaker Extraction
by: You, Zhenghai, et al.
Published: (2026)
Multi-Level Speaker Representation for Target Speaker Extraction
by: Zhang, Ke, et al.
Published: (2024)
Generalizing Speaker Verification for Spoof Awareness in the Embedding Space
by: Liu, Xuechen, et al.
Published: (2024)
Revisiting and Improving Scoring Fusion for Spoofing-aware Speaker Verification Using Compositional Data Analysis
by: Wang, Xin, et al.
Published: (2024)
Robust Audio-Visual Target Speaker Extraction with Emotion-Aware Multiple Enrollment Fusion
by: Jin, Zhan, et al.
Published: (2025)
Brainprint-Modulated Target Speaker Extraction
by: Han, Qiushi, et al.
Published: (2025)
Enhancing Target Speaker Extraction with Explicit Speaker Consistency Modeling
by: Wu, Shu, et al.
Published: (2025)
Beyond Speaker Identity: Text Guided Target Speech Extraction
by: Huo, Mingyue, et al.
Published: (2025)
SecureSpeech: Prompt-based Speaker and Content Protection
by: Hui, Belinda Soh Hui, et al.
Published: (2025)
USEF-TSE: Universal Speaker Embedding Free Target Speaker Extraction
by: Zeng, Bang, et al.
Published: (2024)
SEF-MK: Speaker-Embedding-Free Voice Anonymization through Multi-k-means Quantization
by: Tang, Beilong, et al.
Published: (2025)
Enhancing Intelligibility for Generative Target Speech Extraction via Joint Optimization with Target Speaker ASR
by: Ma, Hao, et al.
Published: (2025)
Improved Feature Extraction Network for Neuro-Oriented Target Speaker Extraction
by: Fan, Cunhang, et al.
Published: (2025)
ASVspoof 5: Evaluation of Spoofing, Deepfake, and Adversarial Attack Detection Using Crowdsourced Speech
by: Wang, Xin, et al.
Published: (2026)
On the effectiveness of enrollment speech augmentation for Target Speaker Extraction
by: Li, Junjie, et al.
Published: (2024)
Listen to Extract: Onset-Prompted Target Speaker Extraction
by: Shen, Pengjie, et al.
Published: (2025)
Binaural Target Speaker Extraction using Individualized HRTF
by: Ellinson, Yoav, et al.
Published: (2025)
Universal Speaker Embedding Free Target Speaker Extraction and Personal Voice Activity Detection
by: Zeng, Bang, et al.
Published: (2025)
Two-stage Audio-Visual Target Speaker Extraction System for Real-Time Processing On Edge Device
by: Li, Zixuan, et al.
Published: (2025)
Memory-Efficient Training for Deep Speaker Embedding Learning in Speaker Verification
by: Liu, Bei, et al.
Published: (2024)
The Third VoicePrivacy Challenge: Preserving Emotional Expressiveness and Linguistic Content in Voice Anonymization
by: Tomashenko, Natalia, et al.
Published: (2026)
Binaural Selective Attention Model for Target Speaker Extraction
by: Meng, Hanyu, et al.
Published: (2024)
FlowTSE: Target Speaker Extraction with Flow Matching
by: Navon, Aviv, et al.
Published: (2025)
Self Voice Conversion as an Attack against Neural Audio Watermarking
by: Özer, Yigitcan, et al.
Published: (2026)
M3ANet: Multi-scale and Multi-Modal Alignment Network for Brain-Assisted Target Speaker Extraction
by: Fan, Cunhang, et al.
Published: (2025)
Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC
by: Kang, Jiawen, et al.
Published: (2024)