Library Catalog

Bibliographic Details
Main Authors:	You, Zhenghai; Shi, Ying; Li, Lantian; Wang, Dong
Format:	Preprint
Published:	2026
Subjects:	Sound
Online Access:	https://arxiv.org/abs/2603.10921

Similar Items

An Investigation on Speaker Augmentation for End-to-End Speaker Extraction
by: You, Zhenghai, et al.
Published: (2025)

A Comprehensive Investigation on Speaker Augmentation for Speaker Recognition
by: Zhou, Zhenyu, et al.
Published: (2024)

SE/BN Adapter: Parametric Efficient Domain Adaptation for Speaker Recognition
by: Wang, Tianhao, et al.
Published: (2024)

Multi-Level Speaker Representation for Target Speaker Extraction
by: Zhang, Ke, et al.
Published: (2024)

MT-HuBERT: Self-Supervised Mix-Training for Few-Shot Keyword Spotting in Mixed Speech
by: Yuan, Junming, et al.
Published: (2025)

Serialized Output Training by Learned Dominance
by: Shi, Ying, et al.
Published: (2024)

USEF-TSE: Universal Speaker Embedding Free Target Speaker Extraction
by: Zeng, Bang, et al.
Published: (2024)

Training Dynamics-Aware Multi-Factor Curriculum Learning for Target Speaker Extraction
by: Liu, Yun, et al.
Published: (2026)

Adversarial Data Augmentation for Robust Speaker Verification
by: Zhou, Zhenyu, et al.
Published: (2024)

Universal Speaker Embedding Free Target Speaker Extraction and Personal Voice Activity Detection
by: Zeng, Bang, et al.
Published: (2025)

AlphaFlowTSE: One-Step Generative Target Speaker Extraction via Conditional AlphaFlow
by: Li, Duojia, et al.
Published: (2026)

Neural Scoring: A Refreshed End-to-End Approach for Speaker Recognition in Complex Conditions
by: Lin, Wan, et al.
Published: (2024)

Unmixing the Crowd: Learning Mixture-to-Set Speaker Embeddings for Enrollment-Free Target Speech Extraction
by: Sidharth, FNU, et al.
Published: (2026)

Brainprint-Modulated Target Speaker Extraction
by: Han, Qiushi, et al.
Published: (2025)

On the effectiveness of enrollment speech augmentation for Target Speaker Extraction
by: Li, Junjie, et al.
Published: (2024)

Target Speaker Extraction with Curriculum Learning
by: Liu, Yun, et al.
Published: (2024)

Enhancing Target Speaker Extraction with Explicit Speaker Consistency Modeling
by: Wu, Shu, et al.
Published: (2025)

Audio-Visual Target Speaker Extraction with Reverse Selective Auditory Attention
by: Tao, Ruijie, et al.
Published: (2024)

M3ANet: Multi-scale and Multi-Modal Alignment Network for Brain-Assisted Target Speaker Extraction
by: Fan, Cunhang, et al.
Published: (2025)

Discriminative-Generative Target Speaker Extraction with Decoder-Only Language Models
by: Zeng, Bang, et al.
Published: (2026)

Listen to Extract: Onset-Prompted Target Speaker Extraction
by: Shen, Pengjie, et al.
Published: (2025)

Improved Feature Extraction Network for Neuro-Oriented Target Speaker Extraction
by: Fan, Cunhang, et al.
Published: (2025)

Beyond Speaker Identity: Text Guided Target Speech Extraction
by: Huo, Mingyue, et al.
Published: (2025)

Binaural Target Speaker Extraction using Individualized HRTF
by: Ellinson, Yoav, et al.
Published: (2025)

Two-stage Audio-Visual Target Speaker Extraction System for Real-Time Processing On Edge Device
by: Li, Zixuan, et al.
Published: (2025)

Inter-Speaker Relative Cues for Text-Guided Target Speech Extraction
by: Dai, Wang, et al.
Published: (2025)

WeSep: A Scalable and Flexible Toolkit Towards Generalizable Target Speaker Extraction
by: Wang, Shuai, et al.
Published: (2024)

Speaker Targeting via Self-Speaker Adaptation for Multi-talker ASR
by: Wang, Weiqing, et al.
Published: (2025)

Libri2Vox Dataset: Target Speaker Extraction with Diverse Speaker Conditions and Synthetic Data
by: Liu, Yun, et al.
Published: (2024)

Enhancing Intelligibility for Generative Target Speech Extraction via Joint Optimization with Target Speaker ASR
by: Ma, Hao, et al.
Published: (2025)

Binaural Selective Attention Model for Target Speaker Extraction
by: Meng, Hanyu, et al.
Published: (2024)

FlowTSE: Target Speaker Extraction with Flow Matching
by: Navon, Aviv, et al.
Published: (2025)

TSELM: Target Speaker Extraction using Discrete Tokens and Language Models
by: Tang, Beilong, et al.
Published: (2024)

Towards Streaming Target Speaker Extraction via Chunk-wise Interleaved Splicing of Autoregressive Language Model
by: Peng, Shuhai, et al.
Published: (2026)

HRTF-guided Binaural Target Speaker Extraction with Real-World Validation
by: Ellinson, Yoav, et al.
Published: (2026)

USED: Universal Speaker Extraction and Diarization
by: Ao, Junyi, et al.
Published: (2023)

Robust Audio-Visual Target Speaker Extraction with Emotion-Aware Multiple Enrollment Fusion
by: Jin, Zhan, et al.
Published: (2025)

How phonemes contribute to deep speaker models?
by: Li, Pengqi, et al.
Published: (2024)

Training-Free Multi-Step Audio Source Separation
by: Zang, Yongyi, et al.
Published: (2025)

Multi-Target Backdoor Attacks Against Speaker Recognition
by: Fortier, Alexandrine, et al.
Published: (2025)