Library Catalog

Bibliographic Details
Main Authors:	Wittek, Nils; Melnikov, Anton; Kaiser, Bert; Zimmermann, André
Format:	Preprint
Published:	2025
Subjects:	Audio and Speech Processing; Sound
Online Access:	https://arxiv.org/abs/2512.17708

Similar Items

Online Audio-Visual Autoregressive Speaker Extraction
by: Pan, Zexu, et al.
Published: (2025)

Utilizing Speaker Profiles for Impersonation Audio Detection
by: Gu, Hao, et al.
Published: (2024)

Can Audio Large Language Models Verify Speaker Identity?
by: Ren, Yiming, et al.
Published: (2025)

Speaker Distance Estimation in Enclosures from Single-Channel Audio
by: Neri, Michael, et al.
Published: (2024)

Audio-Visual Target Speaker Extraction with Reverse Selective Auditory Attention
by: Tao, Ruijie, et al.
Published: (2024)

Analysis of Speaker Verification Performance Trade-offs with Neural Audio Codec Transmission
by: Thakur, Nirmalya Mallick, et al.
Published: (2025)

HearFit+: Personalized Fitness Monitoring via Audio Signals on Smart Speakers
by: Xie, Yadong, et al.
Published: (2025)

Joint Speaker Features Learning for Audio-visual Multichannel Speech Separation and Recognition
by: Li, Guinan, et al.
Published: (2024)

Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning
by: Wang, Shuai, et al.
Published: (2024)

Robust Audio-Visual Target Speaker Extraction with Emotion-Aware Multiple Enrollment Fusion
by: Jin, Zhan, et al.
Published: (2025)

Descriptor: Extended-Length Audio Dataset for Synthetic Voice Detection and Speaker Recognition (ELAD-SVDSR)
by: Vijaykumar, Rahul, et al.
Published: (2025)

MT2KD: Towards A General-Purpose Encoder for Speech, Speaker, and Audio Events
by: Yang, Xiaoyu, et al.
Published: (2024)

HiddenSpeaker: Generate Imperceptible Unlearnable Audios for Speaker Verification System
by: Zhang, Zhisheng, et al.
Published: (2024)

Audio-Visual Speaker Tracking: Progress, Challenges, and Future Directions
by: Zhao, Jinzheng, et al.
Published: (2023)

Zero Shot Audio to Audio Emotion Transfer With Speaker Disentanglement
by: Dutta, Soumya, et al.
Published: (2024)

Two-stage Audio-Visual Target Speaker Extraction System for Real-Time Processing On Edge Device
by: Li, Zixuan, et al.
Published: (2025)

AVFSNet: Audio-Visual Speech Separation for Flexible Number of Speakers with Multi-Scale and Multi-Task Learning
by: Zhang, Daning, et al.
Published: (2025)

Vibration Sensitivity of one-port and two-port MEMS microphones
by: Doyon-D'Amour, Francis, et al.
Published: (2024)

Interfacing PDM MEMS microphones with PFM spiking systems: Application for Neuromorphic Auditory Sensors
by: Jimenez-Fernandez, Angel, et al.
Published: (2019)

Quality-Aware End-to-End Audio-Visual Neural Speaker Diarization
by: He, Mao-Kui, et al.
Published: (2024)

Speaker Contrastive Learning for Source Speaker Tracing
by: Wang, Qing, et al.
Published: (2024)

BANC: Towards Efficient Binaural Audio Neural Codec for Overlapping Speech
by: Ratnarajah, Anton, et al.
Published: (2023)

Emotional Styles Hide in Deep Speaker Embeddings: Disentangle Deep Speaker Embeddings for Speaker Clustering
by: Lin, Chaohao, et al.
Published: (2025)

Pretraining Multi-Speaker Identification for Neural Speaker Diarization
by: Horiguchi, Shota, et al.
Published: (2025)

Learning Emotion-Invariant Speaker Representations for Speaker Verification
by: Tian, Jingguang, et al.
Published: (2025)

An Investigation on Speaker Augmentation for End-to-End Speaker Extraction
by: You, Zhenghai, et al.
Published: (2025)

A Comprehensive Investigation on Speaker Augmentation for Speaker Recognition
by: Zhou, Zhenyu, et al.
Published: (2024)

Multi-Level Speaker Representation for Target Speaker Extraction
by: Zhang, Ke, et al.
Published: (2024)

Plug-and-Steer: Decoupling Separation and Selection in Audio-Visual Target Speaker Extraction
by: Kwak, Doyeop, et al.
Published: (2026)

Enhancing Target Speaker Extraction with Explicit Speaker Consistency Modeling
by: Wu, Shu, et al.
Published: (2025)

Mitigating Non-Target Speaker Bias in Guided Speaker Embedding
by: Horiguchi, Shota, et al.
Published: (2025)

Implementation and Applications of WakeWords Integrated with Speaker Recognition: A Case Study
by: Filho, Alexandre Costa Ferro, et al.
Published: (2024)

Speaker Targeting via Self-Speaker Adaptation for Multi-talker ASR
by: Wang, Weiqing, et al.
Published: (2025)

USEF-TSE: Universal Speaker Embedding Free Target Speaker Extraction
by: Zeng, Bang, et al.
Published: (2024)

Enhancing Zero-Shot Multi-Speaker TTS with Negated Speaker Representations
by: Jeon, Yejin, et al.
Published: (2024)

Speaker-Smoothed kNN Speaker Adaptation for End-to-End ASR
by: Li, Shaojun, et al.
Published: (2024)

NanoVoice: Efficient Speaker-Adaptive Text-to-Speech for Multiple Speakers
by: Park, Nohil, et al.
Published: (2024)

On Speaker Attribution with SURT
by: Raj, Desh, et al.
Published: (2024)

Guided Speaker Embedding
by: Horiguchi, Shota, et al.
Published: (2024)

Can We Really Repurpose Multi-Speaker ASR Corpus for Speaker Diarization?
by: Horiguchi, Shota, et al.
Published: (2025)