Saved in:
| Main Authors: | Olijslager, Mariëtte, Ziabari, Seyed Sahand Mohammadi, Alsahag, Ali Mohammed Mansoor |
|---|---|
| Format: | Preprint |
| Published: |
2026
|
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.01363 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Similar Items
Probing the Feasibility of Multilingual Speaker Anonymization
by: Meyer, Sarina, et al.
Published: (2024)
by: Meyer, Sarina, et al.
Published: (2024)
Disentangling Speaker Traits for Deepfake Source Verification via Chebyshev Polynomial and Riemannian Metric Learning
by: Xuan, Xi, et al.
Published: (2026)
by: Xuan, Xi, et al.
Published: (2026)
Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT
by: Komatsu, Ryota, et al.
Published: (2024)
by: Komatsu, Ryota, et al.
Published: (2024)
MSA-ASR: Efficient Multilingual Speaker Attribution with frozen ASR Models
by: Nguyen, Thai-Binh, et al.
Published: (2024)
by: Nguyen, Thai-Binh, et al.
Published: (2024)
System Description for the Displace Speaker Diarization Challenge 2023
by: Aliyev, Ali
Published: (2024)
by: Aliyev, Ali
Published: (2024)
Emotional Styles Hide in Deep Speaker Embeddings: Disentangle Deep Speaker Embeddings for Speaker Clustering
by: Lin, Chaohao, et al.
Published: (2025)
by: Lin, Chaohao, et al.
Published: (2025)
Advancing Topic Segmentation of Broadcasted Speech with Multilingual Semantic Embeddings
by: Shukla, Sakshi Deo, et al.
Published: (2024)
by: Shukla, Sakshi Deo, et al.
Published: (2024)
SEF-VC: Speaker Embedding Free Zero-Shot Voice Conversion with Cross Attention
by: Li, Junjie, et al.
Published: (2023)
by: Li, Junjie, et al.
Published: (2023)
Speaker-Distinguishable CTC: Learning Speaker Distinction Using CTC for Multi-Talker Speech Recognition
by: Sakuma, Asahi, et al.
Published: (2025)
by: Sakuma, Asahi, et al.
Published: (2025)
Investigation of Speaker Representation for Target-Speaker Speech Processing
by: Ashihara, Takanori, et al.
Published: (2024)
by: Ashihara, Takanori, et al.
Published: (2024)
Cross-lingual Embedding Clustering for Hierarchical Softmax in Low-Resource Multilingual Speech Recognition
by: Yang, Zhengdong, et al.
Published: (2025)
by: Yang, Zhengdong, et al.
Published: (2025)
Multilingual Dysarthric Speech Assessment Using Universal Phone Recognition and Language-Specific Phonemic Contrast Modeling
by: Yeo, Eunjung, et al.
Published: (2026)
by: Yeo, Eunjung, et al.
Published: (2026)
Exploring Multilingual Unseen Speaker Emotion Recognition: Leveraging Co-Attention Cues in Multitask Learning
by: Goel, Arnav, et al.
Published: (2024)
by: Goel, Arnav, et al.
Published: (2024)
Speaker-Reasoner: Scaling Interaction Turns and Reasoning Patterns for Timestamped Speaker-Attributed ASR
by: Lin, Zhennan, et al.
Published: (2026)
by: Lin, Zhennan, et al.
Published: (2026)
R-Spin: Efficient Speaker and Noise-invariant Representation Learning with Acoustic Pieces
by: Chang, Heng-Jui, et al.
Published: (2023)
by: Chang, Heng-Jui, et al.
Published: (2023)
Multilingual Prosody Transfer: Comparing Supervised & Transfer Learning
by: Goel, Arnav, et al.
Published: (2024)
by: Goel, Arnav, et al.
Published: (2024)
Noro: Noise-Robust One-shot Voice Conversion with Hidden Speaker Representation Learning
by: He, Haorui, et al.
Published: (2024)
by: He, Haorui, et al.
Published: (2024)
Speaker Contrastive Learning for Source Speaker Tracing
by: Wang, Qing, et al.
Published: (2024)
by: Wang, Qing, et al.
Published: (2024)
Stepback: Enhanced Disentanglement for Voice Conversion via Multi-Task Learning
by: Yang, Qian, et al.
Published: (2025)
by: Yang, Qian, et al.
Published: (2025)
Speech Rhythm-Based Speaker Embeddings Extraction from Phonemes and Phoneme Duration for Multi-Speaker Speech Synthesis
by: Fujita, Kenichi, et al.
Published: (2024)
by: Fujita, Kenichi, et al.
Published: (2024)
In-Context Learning Boosts Speech Recognition via Human-like Adaptation to Speakers and Language Varieties
by: Roll, Nathan, et al.
Published: (2025)
by: Roll, Nathan, et al.
Published: (2025)
Speaker-Aware Simulation Improves Conversational Speech Recognition
by: Gedeon, Máté, et al.
Published: (2026)
by: Gedeon, Máté, et al.
Published: (2026)
CoLMbo: Speaker Language Model for Descriptive Profiling
by: Baali, Massa, et al.
Published: (2025)
by: Baali, Massa, et al.
Published: (2025)
A Review of Common Online Speaker Diarization Methods
by: Aperdannier, Roman, et al.
Published: (2024)
by: Aperdannier, Roman, et al.
Published: (2024)
Unifying Diarization, Separation, and ASR with Multi-Speaker Encoder
by: Shakeel, Muhammad, et al.
Published: (2025)
by: Shakeel, Muhammad, et al.
Published: (2025)
DiariST: Streaming Speech Translation with Speaker Diarization
by: Yang, Mu, et al.
Published: (2023)
by: Yang, Mu, et al.
Published: (2023)
Continual Learning Optimizations for Auto-regressive Decoder of Multilingual ASR systems
by: Kwok, Chin Yuen, et al.
Published: (2024)
by: Kwok, Chin Yuen, et al.
Published: (2024)
kNN For Whisper And Its Effect On Bias And Speaker Adaptation
by: Nachesa, Maya K., et al.
Published: (2024)
by: Nachesa, Maya K., et al.
Published: (2024)
Systematic Evaluation of Online Speaker Diarization Systems Regarding their Latency
by: Aperdannier, Roman, et al.
Published: (2024)
by: Aperdannier, Roman, et al.
Published: (2024)
Romanization Encoding For Multilingual ASR
by: Ding, Wen, et al.
Published: (2024)
by: Ding, Wen, et al.
Published: (2024)
Speech Separation based on Contrastive Learning and Deep Modularization
by: Ochieng, Peter
Published: (2023)
by: Ochieng, Peter
Published: (2023)
Factorized RVQ-GAN For Disentangled Speech Tokenization
by: Khurana, Sameer, et al.
Published: (2025)
by: Khurana, Sameer, et al.
Published: (2025)
Disentanglement in a GAN for Unconditional Speech Synthesis
by: Baas, Matthew, et al.
Published: (2023)
by: Baas, Matthew, et al.
Published: (2023)
Teaching a Multilingual Large Language Model to Understand Multilingual Speech via Multi-Instructional Training
by: Denisov, Pavel, et al.
Published: (2024)
by: Denisov, Pavel, et al.
Published: (2024)
Analysis of Speech Temporal Dynamics in the Context of Speaker Verification and Voice Anonymization
by: Tomashenko, Natalia, et al.
Published: (2024)
by: Tomashenko, Natalia, et al.
Published: (2024)
ELF: Encoding Speaker-Specific Latent Speech Feature for Speech Synthesis
by: Kong, Jungil, et al.
Published: (2023)
by: Kong, Jungil, et al.
Published: (2023)
Emotion-Anchored Contrastive Learning Framework for Emotion Recognition in Conversation
by: Yu, Fangxu, et al.
Published: (2024)
by: Yu, Fangxu, et al.
Published: (2024)
LASE: Language-Adversarial Speaker Encoding for Indic Cross-Script Identity Preservation
by: Menta, Venkata Pushpak Teja
Published: (2026)
by: Menta, Venkata Pushpak Teja
Published: (2026)
CALM: Joint Contextual Acoustic-Linguistic Modeling for Personalization of Multi-Speaker ASR
by: Shakeel, Muhammad, et al.
Published: (2026)
by: Shakeel, Muhammad, et al.
Published: (2026)
Improving Speaker Diarization using Semantic Information: Joint Pairwise Constraints Propagation
by: Cheng, Luyao, et al.
Published: (2023)
by: Cheng, Luyao, et al.
Published: (2023)
Similar Items
-
Probing the Feasibility of Multilingual Speaker Anonymization
by: Meyer, Sarina, et al.
Published: (2024) -
Disentangling Speaker Traits for Deepfake Source Verification via Chebyshev Polynomial and Riemannian Metric Learning
by: Xuan, Xi, et al.
Published: (2026) -
Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT
by: Komatsu, Ryota, et al.
Published: (2024) -
MSA-ASR: Efficient Multilingual Speaker Attribution with frozen ASR Models
by: Nguyen, Thai-Binh, et al.
Published: (2024) -
System Description for the Displace Speaker Diarization Challenge 2023
by: Aliyev, Ali
Published: (2024)