| Main Authors: | Li, Junjie; Lee, Kong Aik |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.15719 |
Similar Items
Cosine Scoring with Uncertainty for Neural Speaker Embedding
by: Wang, Qiongqiong, et al.
Published: (2024)
Xi+: Uncertainty Supervision for Robust Speaker Embedding
by: Li, Junjie, et al.
Published: (2025)
On the effectiveness of enrollment speech augmentation for Target Speaker Extraction
by: Li, Junjie, et al.
Published: (2024)
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning
by: Wang, Shuai, et al.
Published: (2024)
MeMo: Attentional Momentum for Real-time Audio-visual Speaker Extraction under Impaired Visual Conditions
by: Li, Junjie, et al.
Published: (2025)
Golden Gemini is All You Need: Finding the Sweet Spots for Speaker Verification
by: Liu, Tianchi, et al.
Published: (2023)
Any-to-any Speaker Attribute Perturbation for Asynchronous Voice Anonymization
by: Chen, Liping, et al.
Published: (2025)
Emphasized Non-Target Speaker Knowledge in Knowledge Distillation for Automatic Speaker Verification
by: Truong, Duc-Tuan, et al.
Published: (2023)
Generalizing Speaker Verification for Spoof Awareness in the Embedding Space
by: Liu, Xuechen, et al.
Published: (2024)
Text-dependent Speaker Verification (TdSV) Challenge 2024: Challenge Evaluation Plan
by: Hossein, Zeinali, et al.
Published: (2024)
MoMuSE: Momentum Multi-modal Target Speaker Extraction for Real-time Scenarios with Impaired Visual Cues
by: Li, Junjie, et al.
Published: (2024)
Revisiting and Improving Scoring Fusion for Spoofing-aware Speaker Verification Using Compositional Data Analysis
by: Wang, Xin, et al.
Published: (2024)
On the Generation and Removal of Speaker Adversarial Perturbation for Voice-Privacy Protection
by: Guo, Chenyang, et al.
Published: (2024)
VoxGenesis: Unsupervised Discovery of Latent Speaker Manifold for Speech Synthesis
by: Lin, Weiwei, et al.
Published: (2024)
QAMO: Quality-aware Multi-centroid One-class Learning For Speech Deepfake Detection
by: Truong, Duc-Tuan, et al.
Published: (2025)
Addressing Gradient Misalignment in Data-Augmented Training for Robust Speech Deepfake Detection
by: Truong, Duc-Tuan, et al.
Published: (2025)
Speaking Clearly: A Simplified Whisper-Based Codec for Low-Bitrate Speech Coding
by: Zhang, Xin, et al.
Published: (2025)
Gradient weighting for speaker verification in extremely low Signal-to-Noise Ratio
by: Ma, Yi, et al.
Published: (2024)
A Comprehensive Investigation on Speaker Augmentation for Speaker Recognition
by: Zhou, Zhenyu, et al.
Published: (2024)
The First Voice Timbre Attribute Detection Challenge
by: Chen, Liping, et al.
Published: (2025)
Multi-Level Speaker Representation for Target Speaker Extraction
by: Zhang, Ke, et al.
Published: (2024)
LlamaPartialSpoof: An LLM-Driven Fake Speech Dataset Simulating Disinformation Generation
by: Luong, Hieu-Thi, et al.
Published: (2024)
Room Impulse Responses help attackers to evade Deep Fake Detection
by: Luong, Hieu-Thi, et al.
Published: (2024)
ACE-Step 1.5: Pushing the Boundaries of Open-Source Music Generation
by: Gong, Junmin, et al.
Published: (2026)
Joint Learning Global-Local Speaker Classification to Enhance End-to-End Speaker Diarization and Recognition
by: Dai, Yuhang, et al.
Published: (2026)
Robust Localization of Partially Fake Speech: Metrics and Out-of-Domain Evaluation
by: Luong, Hieu-Thi, et al.
Published: (2025)
Adversarial speech for voice privacy protection from Personalized Speech generation
by: Chen, Shihao, et al.
Published: (2024)
Nes2Net: A Lightweight Nested Architecture for Foundation Model Driven Speech Anti-spoofing
by: Liu, Tianchi, et al.
Published: (2025)
SpeakerLM: End-to-End Versatile Speaker Diarization and Recognition with Multimodal Large Language Models
by: Yin, Han, et al.
Published: (2025)
Emotion Recognition in Multi-Speaker Conversations through Speaker Identification, Knowledge Distillation, and Hierarchical Fusion
by: Li, Xiao, et al.
Published: (2025)
The Voice Timbre Attribute Detection 2025 Challenge Evaluation Plan
by: Sheng, Zhengyan, et al.
Published: (2025)
Introducing voice timbre attribute detection
by: He, Jinghao, et al.
Published: (2025)
Enhancing Dysarthric Speech Recognition for Unseen Speakers via Prototype-Based Adaptation
by: Wang, Shiyao, et al.
Published: (2024)
Speaker Recognition -- Wavelet Packet Based Multiresolution Feature Extraction Approach
by: Bhardwaj, Saurabh, et al.
Published: (2025)
ELF: Encoding Speaker-Specific Latent Speech Feature for Speech Synthesis
by: Kong, Jungil, et al.
Published: (2023)
Multi-Target Backdoor Attacks Against Speaker Recognition
by: Fortier, Alexandrine, et al.
Published: (2025)
Audio-Visual Target Speaker Extraction with Reverse Selective Auditory Attention
by: Tao, Ruijie, et al.
Published: (2024)
Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC
by: Kang, Jiawen, et al.
Published: (2024)
Breaking Speaker Recognition with PaddingBack
by: Ye, Zhe, et al.
Published: (2023)
SE/BN Adapter: Parametric Efficient Domain Adaptation for Speaker Recognition
by: Wang, Tianhao, et al.
Published: (2024)