| Main Author: | Bartolo, Matthias |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2408.06804 |
Similar Items
Memory-Efficient Training for Deep Speaker Embedding Learning in Speaker Verification
by: Liu, Bei, et al.
Published: (2024)
Evaluating Speaker Identity Coding in Self-supervised Models and Humans
by: Elbanna, Gasser
Published: (2024)
Speaker Embeddings to Improve Tracking of Intermittent and Moving Speakers
by: Iatariene, Taous, et al.
Published: (2025)
Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC
by: Kang, Jiawen, et al.
Published: (2024)
Developing an Effective Training Dataset to Enhance the Performance of AI-based Speaker Separation Systems
by: Melhem, Rawad, et al.
Published: (2024)
Leveraging Speaker Embeddings in End-to-End Neural Diarization for Two-Speaker Scenarios
by: Alvarez-Trejos, Juan Ignacio, et al.
Published: (2024)
Text-dependent Speaker Verification (TdSV) Challenge 2024: Challenge Evaluation Plan
by: Hossein, Zeinali, et al.
Published: (2024)
Whisper Speaker Identification: Leveraging Pre-Trained Multilingual Transformers for Robust Speaker Embeddings
by: Emon, Jakaria Islam, et al.
Published: (2025)
Explainable Attribute-Based Speaker Verification
by: Wu, Xiaoliang, et al.
Published: (2024)
From Modular to End-to-End Speaker Diarization
by: Landini, Federico
Published: (2024)
Certification of Speaker Recognition Models to Additive Perturbations
by: Korzh, Dmitrii, et al.
Published: (2024)
The VoxCeleb Speaker Recognition Challenge: A Retrospective
by: Huh, Jaesung, et al.
Published: (2024)
SDBench: A Comprehensive Benchmark Suite for Speaker Diarization
by: Pacheco, Eduardo, et al.
Published: (2025)
Quranic Audio Dataset: Crowdsourced and Labeled Recitation from Non-Arabic Speakers
by: Salameh, Raghad, et al.
Published: (2024)
End-to-End Supervised Hierarchical Graph Clustering for Speaker Diarization
by: Singh, Prachi, et al.
Published: (2024)
LSCodec: Low-Bitrate and Speaker-Decoupled Discrete Speech Codec
by: Guo, Yiwei, et al.
Published: (2024)
DART: Disentanglement of Accent and Speaker Representation in Multispeaker Text-to-Speech
by: Melechovsky, Jan, et al.
Published: (2024)
Asynchronous Voice Anonymization Using Adversarial Perturbation On Speaker Embedding
by: Wang, Rui, et al.
Published: (2024)
Unispeaker: A Unified Approach for Multimodality-driven Speaker Generation
by: Sheng, Zhengyan, et al.
Published: (2025)
Automatic Speech Recognition in the Modern Era: Architectures, Training, and Evaluation
by: Nayeem, Md., et al.
Published: (2025)
Removing Speaker Information from Speech Representation using Variable-Length Soft Pooling
by: Hwang, Injune, et al.
Published: (2024)
Towards Low-Latency Tracking of Multiple Speakers With Short-Context Speaker Embeddings
by: Iatariene, Taous, et al.
Published: (2025)
Who is Authentic Speaker
by: Huang, Qiang
Published: (2024)
Music Genre Classification: A Comparative Analysis of Classical Machine Learning and Deep Learning Approaches
by: Prajuli, Sachin, et al.
Published: (2026)
NeuroSpex: Neuro-Guided Speaker Extraction with Cross-Modal Attention
by: De Silva, Dashanka, et al.
Published: (2024)
ASoBO: Attentive Beamformer Selection for Distant Speaker Diarization in Meetings
by: Mariotte, Theo, et al.
Published: (2024)
ExPO: Explainable Phonetic Trait-Oriented Network for Speaker Verification
by: Ma, Yi, et al.
Published: (2025)
Multi-Speaker Conversational Audio Deepfake: Taxonomy, Dataset and Pilot Study
by: Ahmed, Alabi, et al.
Published: (2026)
EmoSpeech: A Corpus of Emotionally Rich and Contextually Detailed Speech Annotations
by: Bian, Weizhen, et al.
Published: (2024)
Towards Speaker Identification with Minimal Dataset and Constrained Resources using 1D-Convolution Neural Network
by: Shahan, Irfan Nafiz, et al.
Published: (2024)
Perceiver-Prompt: Flexible Speaker Adaptation in Whisper for Chinese Disordered Speech Recognition
by: Jiang, Yicong, et al.
Published: (2024)
ML-SAN: Multi-Level Speaker-Adaptive Network for Emotion Recognition in Conversations
by: Wang, Kexue, et al.
Published: (2026)
Improving Neural Diarization through Speaker Attribute Attractors and Local Dependency Modeling
by: Palzer, David, et al.
Published: (2025)
Target Speaker Extraction through Comparing Noisy Positive and Negative Audio Enrollments
by: Xu, Shitong, et al.
Published: (2025)
Stable-TTS: Stable Speaker-Adaptive Text-to-Speech Synthesis via Prosody Prompting
by: Han, Wooseok, et al.
Published: (2024)
Egocentric Speaker Classification in Child-Adult Dyadic Interactions: From Sensing to Computational Modeling
by: Feng, Tiantian, et al.
Published: (2024)
The SVASR System for Text-dependent Speaker Verification (TdSV) AAIC Challenge 2024
by: Molavi, Mohammadreza, et al.
Published: (2024)
MultiActor-Audiobook: Zero-Shot Audiobook Generation with Faces and Voices of Multiple Speakers
by: Park, Kyeongman, et al.
Published: (2025)
Do Not Mimic My Voice: Speaker Identity Unlearning for Zero-Shot Text-to-Speech
by: Kim, Taesoo, et al.
Published: (2025)
Plug-and-Play Co-Occurring Face Attention for Robust Audio-Visual Speaker Extraction
by: Pan, Zexu, et al.
Published: (2025)