:: Library Catalog

Bibliographic Details
Main Authors:	Chen, Zhiyong; Wu, Shuhang; Duan, Yingjie; Xu, Xinkang; Hu, Xinhui
Format:	Preprint
Published:	2026
Subjects:	Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2604.13605

Similar Items

Learning Emotion-Invariant Speaker Representations for Speaker Verification
by: Tian, Jingguang, et al.
Published: (2025)

Enhancing Open-Set Speaker Identification through Rapid Tuning with Speaker Reciprocal Points and Negative Sample
by: Chen, Zhiyong, et al.
Published: (2024)

openFEAT: Improving Speaker Identification by Open-set Few-shot Embedding Adaptation with Transformer
by: C, Kishan K, et al.
Published: (2022)

Emotion Recognition in Multi-Speaker Conversations through Speaker Identification, Knowledge Distillation, and Hierarchical Fusion
by: Li, Xiao, et al.
Published: (2025)

VoxBlink2: A 100K+ Speaker Recognition Corpus and the Open-Set Speaker-Identification Benchmark
by: Lin, Yuke, et al.
Published: (2024)

Pretraining Multi-Speaker Identification for Neural Speaker Diarization
by: Horiguchi, Shota, et al.
Published: (2025)

Rhythm Features for Speaker Identification
by: Mehlman, Nick, et al.
Published: (2025)

Discrete Audio Representations for Automated Audio Captioning
by: Tian, Jingguang, et al.
Published: (2025)

The THU-HCSI Multi-Speaker Multi-Lingual Few-Shot Voice Cloning System for LIMMITS'24 Challenge
by: Zhou, Yixuan, et al.
Published: (2024)

Enhancing Target Speaker Extraction with Explicit Speaker Consistency Modeling
by: Wu, Shu, et al.
Published: (2025)

A Probabilistic Fusion Framework for Spoofing Aware Speaker Verification
by: Zhang, You, et al.
Published: (2022)

Joint Optimization of Speaker and Spoof Detectors for Spoofing-Robust Automatic Speaker Verification
by: Kurnaz, Oğuzhan, et al.
Published: (2025)

Improving the Adversarial Robustness for Speaker Verification by Self-Supervised Learning
by: Wu, Haibin, et al.
Published: (2021)

A Toolkit for Joint Speaker Diarization and Identification with Application to Speaker-Attributed ASR
by: Morrone, Giovanni, et al.
Published: (2024)

SC-MoE: Switch Conformer Mixture of Experts for Unified Streaming and Non-streaming Code-Switching ASR
by: Ye, Shuaishuai, et al.
Published: (2024)

Enhancing Age-Related Robustness in Children Speaker Verification
by: Shetty, Vishwas M., et al.
Published: (2025)

Emotional Styles Hide in Deep Speaker Embeddings: Disentangle Deep Speaker Embeddings for Speaker Clustering
by: Lin, Chaohao, et al.
Published: (2025)

Enhancing Zero-Shot Multi-Speaker TTS with Negated Speaker Representations
by: Jeon, Yejin, et al.
Published: (2024)

UNet-Based Fusion and Exponential Moving Average Adaptation for Noise-Robust Speaker Recognition
by: Gan, Chong-Xin, et al.
Published: (2026)

In This Environment, As That Speaker: A Text-Driven Framework for Multi-Attribute Speech Conversion
by: Jin, Jiawei, et al.
Published: (2025)

Study on Inter and Intra Speaker Variability in Speaker Recognition
by: Okhotnikov, Anton, et al.
Published: (2024)

Improving the Speaker Anonymization Evaluation's Robustness to Target Speakers with Adversarial Learning
by: Franzreb, Carlos, et al.
Published: (2025)

ASRRL-TTS: Agile Speaker Representation Reinforcement Learning for Text-to-Speech Speaker Adaptation
by: Fu, Ruibo, et al.
Published: (2024)

An Age-Agnostic System for Robust Speaker Verification
by: Zheng, Jiusi, et al.
Published: (2025)

Multi-Speaker DOA Estimation in Binaural Hearing Aids using Deep Learning and Speaker Count Fusion
by: Jazaeri, Farnaz, et al.
Published: (2025)

Whisper Speaker Identification: Leveraging Pre-Trained Multilingual Transformers for Robust Speaker Embeddings
by: Emon, Jakaria Islam, et al.
Published: (2025)

Noro: Noise-Robust One-shot Voice Conversion with Hidden Speaker Representation Learning
by: He, Haorui, et al.
Published: (2024)

Speaker Contrastive Learning for Source Speaker Tracing
by: Wang, Qing, et al.
Published: (2024)

On the Role of Spatial Features in Foundation-Model-Based Speaker Diarization
by: Deegen, Marc, et al.
Published: (2026)

Exploring Speech Foundation Models for Speaker Diarization Across Lifespan
by: Xu, Anfeng, et al.
Published: (2026)

On-the-fly Routing for Zero-shot MoE Speaker Adaptation of Speech Foundation Models for Dysarthric Speech Recognition
by: Hu, Shujie, et al.
Published: (2025)

3D-Speaker-Toolkit: An Open-Source Toolkit for Multimodal Speaker Verification and Diarization
by: Chen, Yafeng, et al.
Published: (2024)

Robust Audio-Visual Target Speaker Extraction with Emotion-Aware Multiple Enrollment Fusion
by: Jin, Zhan, et al.
Published: (2025)

Generating Novel and Realistic Speakers for Voice Conversion
by: Chen, Meiying Melissa, et al.
Published: (2025)

Optimizing a-DCF for Spoofing-Robust Speaker Verification
by: Kurnaz, Oğuzhan, et al.
Published: (2024)

Multi-Label Training for Text-Independent Speaker Identification
by: Xue, Yuqi
Published: (2022)

Speaker-Smoothed kNN Speaker Adaptation for End-to-End ASR
by: Li, Shaojun, et al.
Published: (2024)

A Comprehensive Investigation on Speaker Augmentation for Speaker Recognition
by: Zhou, Zhenyu, et al.
Published: (2024)

Multi-Level Speaker Representation for Target Speaker Extraction
by: Zhang, Ke, et al.
Published: (2024)

An Investigation on Speaker Augmentation for End-to-End Speaker Extraction
by: You, Zhenghai, et al.
Published: (2025)