Saved in:
| Main Authors: | Chen, Zhiyong, Wu, Shuhang, Duan, Yingjie, Xu, Xinkang, Hu, Xinhui |
|---|---|
| Format: | Preprint |
| Published: |
2026
|
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.13605 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Similar Items
Learning Emotion-Invariant Speaker Representations for Speaker Verification
by: Tian, Jingguang, et al.
Published: (2025)
by: Tian, Jingguang, et al.
Published: (2025)
Enhancing Open-Set Speaker Identification through Rapid Tuning with Speaker Reciprocal Points and Negative Sample
by: Chen, Zhiyong, et al.
Published: (2024)
by: Chen, Zhiyong, et al.
Published: (2024)
openFEAT: Improving Speaker Identification by Open-set Few-shot Embedding Adaptation with Transformer
by: C, Kishan K, et al.
Published: (2022)
by: C, Kishan K, et al.
Published: (2022)
Emotion Recognition in Multi-Speaker Conversations through Speaker Identification, Knowledge Distillation, and Hierarchical Fusion
by: Li, Xiao, et al.
Published: (2025)
by: Li, Xiao, et al.
Published: (2025)
VoxBlink2: A 100K+ Speaker Recognition Corpus and the Open-Set Speaker-Identification Benchmark
by: Lin, Yuke, et al.
Published: (2024)
by: Lin, Yuke, et al.
Published: (2024)
Pretraining Multi-Speaker Identification for Neural Speaker Diarization
by: Horiguchi, Shota, et al.
Published: (2025)
by: Horiguchi, Shota, et al.
Published: (2025)
Rhythm Features for Speaker Identification
by: Mehlman, Nick, et al.
Published: (2025)
by: Mehlman, Nick, et al.
Published: (2025)
Discrete Audio Representations for Automated Audio Captioning
by: Tian, Jingguang, et al.
Published: (2025)
by: Tian, Jingguang, et al.
Published: (2025)
The THU-HCSI Multi-Speaker Multi-Lingual Few-Shot Voice Cloning System for LIMMITS'24 Challenge
by: Zhou, Yixuan, et al.
Published: (2024)
by: Zhou, Yixuan, et al.
Published: (2024)
Enhancing Target Speaker Extraction with Explicit Speaker Consistency Modeling
by: Wu, Shu, et al.
Published: (2025)
by: Wu, Shu, et al.
Published: (2025)
A Probabilistic Fusion Framework for Spoofing Aware Speaker Verification
by: Zhang, You, et al.
Published: (2022)
by: Zhang, You, et al.
Published: (2022)
Joint Optimization of Speaker and Spoof Detectors for Spoofing-Robust Automatic Speaker Verification
by: Kurnaz, Oğuzhan, et al.
Published: (2025)
by: Kurnaz, Oğuzhan, et al.
Published: (2025)
Improving the Adversarial Robustness for Speaker Verification by Self-Supervised Learning
by: Wu, Haibin, et al.
Published: (2021)
by: Wu, Haibin, et al.
Published: (2021)
A Toolkit for Joint Speaker Diarization and Identification with Application to Speaker-Attributed ASR
by: Morrone, Giovanni, et al.
Published: (2024)
by: Morrone, Giovanni, et al.
Published: (2024)
SC-MoE: Switch Conformer Mixture of Experts for Unified Streaming and Non-streaming Code-Switching ASR
by: Ye, Shuaishuai, et al.
Published: (2024)
by: Ye, Shuaishuai, et al.
Published: (2024)
Enhancing Age-Related Robustness in Children Speaker Verification
by: Shetty, Vishwas M., et al.
Published: (2025)
by: Shetty, Vishwas M., et al.
Published: (2025)
Emotional Styles Hide in Deep Speaker Embeddings: Disentangle Deep Speaker Embeddings for Speaker Clustering
by: Lin, Chaohao, et al.
Published: (2025)
by: Lin, Chaohao, et al.
Published: (2025)
Enhancing Zero-Shot Multi-Speaker TTS with Negated Speaker Representations
by: Jeon, Yejin, et al.
Published: (2024)
by: Jeon, Yejin, et al.
Published: (2024)
UNet-Based Fusion and Exponential Moving Average Adaptation for Noise-Robust Speaker Recognition
by: Gan, Chong-Xin, et al.
Published: (2026)
by: Gan, Chong-Xin, et al.
Published: (2026)
In This Environment, As That Speaker: A Text-Driven Framework for Multi-Attribute Speech Conversion
by: Jin, Jiawei, et al.
Published: (2025)
by: Jin, Jiawei, et al.
Published: (2025)
Study on Inter and Intra Speaker Variability in Speaker Recognition
by: Okhotnikov, Anton, et al.
Published: (2024)
by: Okhotnikov, Anton, et al.
Published: (2024)
Improving the Speaker Anonymization Evaluation's Robustness to Target Speakers with Adversarial Learning
by: Franzreb, Carlos, et al.
Published: (2025)
by: Franzreb, Carlos, et al.
Published: (2025)
ASRRL-TTS: Agile Speaker Representation Reinforcement Learning for Text-to-Speech Speaker Adaptation
by: Fu, Ruibo, et al.
Published: (2024)
by: Fu, Ruibo, et al.
Published: (2024)
An Age-Agnostic System for Robust Speaker Verification
by: Zheng, Jiusi, et al.
Published: (2025)
by: Zheng, Jiusi, et al.
Published: (2025)
Multi-Speaker DOA Estimation in Binaural Hearing Aids using Deep Learning and Speaker Count Fusion
by: Jazaeri, Farnaz, et al.
Published: (2025)
by: Jazaeri, Farnaz, et al.
Published: (2025)
Whisper Speaker Identification: Leveraging Pre-Trained Multilingual Transformers for Robust Speaker Embeddings
by: Emon, Jakaria Islam, et al.
Published: (2025)
by: Emon, Jakaria Islam, et al.
Published: (2025)
Noro: Noise-Robust One-shot Voice Conversion with Hidden Speaker Representation Learning
by: He, Haorui, et al.
Published: (2024)
by: He, Haorui, et al.
Published: (2024)
Speaker Contrastive Learning for Source Speaker Tracing
by: Wang, Qing, et al.
Published: (2024)
by: Wang, Qing, et al.
Published: (2024)
On the Role of Spatial Features in Foundation-Model-Based Speaker Diarization
by: Deegen, Marc, et al.
Published: (2026)
by: Deegen, Marc, et al.
Published: (2026)
Exploring Speech Foundation Models for Speaker Diarization Across Lifespan
by: Xu, Anfeng, et al.
Published: (2026)
by: Xu, Anfeng, et al.
Published: (2026)
On-the-fly Routing for Zero-shot MoE Speaker Adaptation of Speech Foundation Models for Dysarthric Speech Recognition
by: HU, Shujie, et al.
Published: (2025)
by: HU, Shujie, et al.
Published: (2025)
3D-Speaker-Toolkit: An Open-Source Toolkit for Multimodal Speaker Verification and Diarization
by: Chen, Yafeng, et al.
Published: (2024)
by: Chen, Yafeng, et al.
Published: (2024)
Robust Audio-Visual Target Speaker Extraction with Emotion-Aware Multiple Enrollment Fusion
by: Jin, Zhan, et al.
Published: (2025)
by: Jin, Zhan, et al.
Published: (2025)
Generating Novel and Realistic Speakers for Voice Conversion
by: Chen, Meiying Melissa, et al.
Published: (2025)
by: Chen, Meiying Melissa, et al.
Published: (2025)
Optimizing a-DCF for Spoofing-Robust Speaker Verification
by: Kurnaz, Oğuzhan, et al.
Published: (2024)
by: Kurnaz, Oğuzhan, et al.
Published: (2024)
Multi-Label Training for Text-Independent Speaker Identification
by: Xue, Yuqi
Published: (2022)
by: Xue, Yuqi
Published: (2022)
Speaker-Smoothed kNN Speaker Adaptation for End-to-End ASR
by: Li, Shaojun, et al.
Published: (2024)
by: Li, Shaojun, et al.
Published: (2024)
A Comprehensive Investigation on Speaker Augmentation for Speaker Recognition
by: Zhou, Zhenyu, et al.
Published: (2024)
by: Zhou, Zhenyu, et al.
Published: (2024)
Multi-Level Speaker Representation for Target Speaker Extraction
by: Zhang, Ke, et al.
Published: (2024)
by: Zhang, Ke, et al.
Published: (2024)
An Investigation on Speaker Augmentation for End-to-End Speaker Extraction
by: You, Zhenghai, et al.
Published: (2025)
by: You, Zhenghai, et al.
Published: (2025)
Similar Items
-
Learning Emotion-Invariant Speaker Representations for Speaker Verification
by: Tian, Jingguang, et al.
Published: (2025) -
Enhancing Open-Set Speaker Identification through Rapid Tuning with Speaker Reciprocal Points and Negative Sample
by: Chen, Zhiyong, et al.
Published: (2024) -
openFEAT: Improving Speaker Identification by Open-set Few-shot Embedding Adaptation with Transformer
by: C, Kishan K, et al.
Published: (2022) -
Emotion Recognition in Multi-Speaker Conversations through Speaker Identification, Knowledge Distillation, and Hierarchical Fusion
by: Li, Xiao, et al.
Published: (2025) -
VoxBlink2: A 100K+ Speaker Recognition Corpus and the Open-Set Speaker-Identification Benchmark
by: Lin, Yuke, et al.
Published: (2024)