| Main Authors: | Gu, Bin, Guo, Wu, Dai, Lirong, Du, Jun |
|---|---|
| Format: | Preprint |
| Published: | 2020 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2002.06049 |
Similar Items
3D-Speaker-Toolkit: An Open-Source Toolkit for Multimodal Speaker Verification and Diarization
by: Chen, Yafeng, et al.
Published: (2024)
Advanced Signal Analysis in Detecting Replay Attacks for Automatic Speaker Verification Systems
by: Kuang, Lee Shih
Published: (2024)
ERes2NetV2: Boosting Short-Duration Speaker Verification Performance with Computational Efficiency
by: Chen, Yafeng, et al.
Published: (2024)
Target Speaker Selection for Neural Network Beamforming in Multi-Speaker Scenarios
by: Fiorio, Luan Vinícius, et al.
Published: (2025)
SpeakerBeam-SS: Real-time Target Speaker Extraction with Lightweight Conv-TasNet and State Space Modeling
by: Sato, Hiroshi, et al.
Published: (2024)
Binaural Selective Attention Model for Target Speaker Extraction
by: Meng, Hanyu, et al.
Published: (2024)
Speakers Localization Using Batch EM In Unfolding Neural Network
by: Veler, Rina, et al.
Published: (2026)
Speaker and Style Disentanglement of Speech Based on Contrastive Predictive Coding Supported Factorized Variational Autoencoder
by: Xie, Yuying, et al.
Published: (2024)
Tracking of Intermittent and Moving Speakers : Dataset and Metrics
by: Iatariene, Taous, et al.
Published: (2025)
A Stage-Wise Learning Strategy with Fixed Anchors for Robust Speaker Verification
by: Gu, Bin, et al.
Published: (2025)
Robustness of Speech Separation Models for Similar-pitch Speakers
by: Lay, Bunlong, et al.
Published: (2024)
Comparison of Frequency-Fusion Mechanisms for Binaural Direction-of-Arrival Estimation for Multiple Speakers
by: Fejgin, Daniel, et al.
Published: (2024)
Solid State Bus-Comp: A Large-Scale and Diverse Dataset for Dynamic Range Compressor Virtual Analog Modeling
by: Gu, Yicheng, et al.
Published: (2025)
Exploiting an External Microphone for Binaural RTF-Vector-Based Direction of Arrival Estimation for Multiple Speakers
by: Fejgin, Daniel, et al.
Published: (2023)
Completing Sets of Prototype Transfer Functions for Subspace-based Direction of Arrival Estimation of Multiple Speakers
by: Fejgin, Daniel, et al.
Published: (2025)
TTSlow: Slow Down Text-to-Speech with Efficiency Robustness Evaluations
by: Gao, Xiaoxue, et al.
Published: (2024)
Towards Low-Latency Tracking of Multiple Speakers With Short-Context Speaker Embeddings
by: Iatariene, Taous, et al.
Published: (2025)
Explainable AI in Speaker Recognition -- Making Latent Representations Understandable
by: Xu, Yanze, et al.
Published: (2026)
Multi-channel Replay Speech Detection using an Adaptive Learnable Beamformer
by: Neri, Michael, et al.
Published: (2025)
SELM: Speech Enhancement Using Discrete Tokens and Language Models
by: Wang, Ziqian, et al.
Published: (2023)
Mitigating Intra-Speaker Variability in Diarization with Style-Controllable Speech Augmentation
by: Kim, Miseul, et al.
Published: (2025)
Zero-Bit Transmission of Adaptive Pre- and De-emphasis Filters for Speech and Audio Coding
by: Piralideh, Niloofar Omidi, et al.
Published: (2024)
Optimizing Domain-Adaptive Self-Supervised Learning for Clinical Voice-Based Disease Classification
by: Liu, Weixin, et al.
Published: (2026)
Exploring Audio-Visual Information Fusion for Sound Event Localization and Detection In Low-Resource Realistic Scenarios
by: Jiang, Ya, et al.
Published: (2024)
Speech-preserving active noise control: a deep learning approach in reverberant environments
by: Dai, Shuning
Published: (2026)
Breaking Speaker Recognition with PaddingBack
by: Ye, Zhe, et al.
Published: (2023)
An Investigation of Time-Frequency Representation Discriminators for High-Fidelity Vocoder
by: Gu, Yicheng, et al.
Published: (2024)
A Neural Denoising Vocoder for Clean Waveform Generation from Noisy Mel-Spectrogram based on Amplitude and Phase Predictions
by: Du, Hui-Peng, et al.
Published: (2024)
SIRUP: A diffusion-based virtual upmixer of steering vectors for highly-directive spatialization with first-order ambisonics
by: Picard, Emilio, et al.
Published: (2026)
Self-Tuning Spectral Clustering for Speaker Diarization
by: Raghav, Nikhil, et al.
Published: (2024)
ParaS2S: Benchmarking and Aligning Spoken Language Models for Paralinguistic-aware Speech-to-Speech Interaction
by: Yang, Shu-wen, et al.
Published: (2025)
Aliasing-Free Neural Audio Synthesis
by: Gu, Yicheng, et al.
Published: (2025)
FUN-SSL: Full-band Layer Followed by U-Net with Narrow-band Layers for Multiple Moving Sound Source Localization
by: Choi, Yuseon, et al.
Published: (2025)
StreamVoiceAnon+: Emotion-Preserving Streaming Speaker Anonymization via Frame-Level Acoustic Distillation
by: Kuzmin, Nikita, et al.
Published: (2026)
SoundSpring: Loss-Resilient Audio Transceiver with Dual-Functional Masked Language Modeling
by: Yao, Shengshi, et al.
Published: (2025)
Automotive sound field reproduction using deep optimization with spatial domain constraint
by: Qian, Yufan, et al.
Published: (2025)
Binaural Localization Model for Speech in Noise
by: Tokala, Vikas, et al.
Published: (2025)
Adaptive Diagonal Loading using Krylov Subspaces for Robust Beamforming
by: Mittal, Manan, et al.
Published: (2026)
FlowSE: Efficient and High-Quality Speech Enhancement via Flow Matching
by: Wang, Ziqian, et al.
Published: (2025)
Dependence on Early and Late Reverberation of Single-Channel Speaker Distance Estimation
by: Neri, Michael, et al.
Published: (2026)