| Main Authors: | Rozenfeld, Vadim; Goldshtein, Bracha Laufer |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2409.11804 |
Similar Items
Uncertainty Quantification and Risk Control for Multi-Speaker Sound Source Localization
by: Rozenfeld, Vadim, et al.
Published: (2026)
MAPSS: Manifold-based Assessment of Perceptual Source Separation
by: Ivry, Amir, et al.
Published: (2025)
LipVoicer: Generating Speech from Silent Videos Guided by Lip Reading
by: Yemini, Yochai, et al.
Published: (2023)
CNN-based Robust Sound Source Localization with SRP-PHAT for the Extreme Edge
by: Yin, Jun, et al.
Published: (2025)
NEST: Self-supervised Fast Conformer as All-purpose Seasoning to Speech Processing Tasks
by: Huang, He, et al.
Published: (2024)
Where's That Voice Coming? Continual Learning for Sound Source Localization
by: Xiao, Yang, et al.
Published: (2024)
An Efficient GPU-based Implementation for Noise Robust Sound Source Localization
by: Lin, Zirui, et al.
Published: (2025)
Analytic Class Incremental Learning for Sound Source Localization with Privacy Protection
by: Qian, Xinyuan, et al.
Published: (2024)
TF-Mamba: A Time-Frequency Network for Sound Source Localization
by: Xiao, Yang, et al.
Published: (2024)
Steered Response Power for Sound Source Localization: A Tutorial Review
by: Grinstein, Eric, et al.
Published: (2024)
GRAFX: An Open-Source Library for Audio Processing Graphs in PyTorch
by: Lee, Sungho, et al.
Published: (2024)
Determined Blind Source Separation with Sinkhorn Divergence-based Optimal Allocation of the Source Power
by: Wang, Jianyu, et al.
Published: (2025)
Conformer-based Ultrasound-to-Speech Conversion
by: Ibrahimov, Ibrahim, et al.
Published: (2025)
IPDnet: A Universal Direct-Path IPD Estimation Network for Sound Source Localization
by: Wang, Yabo, et al.
Published: (2024)
A Steered Response Power Method for Sound Source Localization With Generic Acoustic Models
by: Müller, Kaspar, et al.
Published: (2025)
Combining Audio and Non-Audio Inputs in Evolved Neural Networks for Ovenbird
by: Hernandez, Sergio Poo, et al.
Published: (2025)
Efficient Area-based and Speaker-Agnostic Source Separation
by: Strauss, Martin, et al.
Published: (2024)
SingMOS: An extensive Open-Source Singing Voice Dataset for MOS Prediction
by: Tang, Yuxun, et al.
Published: (2024)
Unrestricted Global Phase Bias-Aware Single-channel Speech Enhancement with Conformer-based Metric GAN
by: Zhang, Shiqi, et al.
Published: (2024)
SYKI-SVC: Advancing Singing Voice Conversion with Post-Processing Innovations and an Open-Source Professional Testset
by: Zhou, Yiquan, et al.
Published: (2025)
Interpreting End-to-End Deep Learning Models for Speech Source Localization Using Layer-wise Relevance Propagation
by: Comanducci, Luca, et al.
Published: (2024)
Diffusion based Text-to-Music Generation with Global and Local Text based Conditioning
by: Zhang, Jisi, et al.
Published: (2025)
Binaural Sound Event Localization and Detection Neural Network based on HRTF Localization Cues for Humanoid Robots
by: Lee, Gyeong-Tae
Published: (2025)
Reference Microphone Selection for Guided Source Separation based on the Normalized L-p Norm
by: Lohmann, Anselm, et al.
Published: (2025)
Determined Multichannel Blind Source Separation with Clustered Source Model
by: Wang, Jianyu, et al.
Published: (2024)
DTT-BSR: GAN-based DTTNet with RoPE Transformer Enhancement for Music Source Restoration
by: Tan, Shihong, et al.
Published: (2026)
ChunkFormer: Masked Chunking Conformer For Long-Form Speech Transcription
by: Le, Khanh, et al.
Published: (2025)
Single-Microphone-Based Sound Source Localization for Mobile Robots in Reverberant Environments
by: Wang, Jiang, et al.
Published: (2025)
Efficient and Robust Long-Form Speech Recognition with Hybrid H3-Conformer
by: Honda, Tomoki, et al.
Published: (2024)
Binaural Sound Event Localization and Detection based on HRTF Cues for Humanoid Robots
by: Lee, Gyeong-Tae, et al.
Published: (2025)
Noise-Robust Contrastive Learning with an MFCC-Conformer For Coronary Artery Disease Detection
by: Marocchi, Milan, et al.
Published: (2026)
Source Separation by Flow Matching
by: Scheibler, Robin, et al.
Published: (2025)
Source Verification for Speech Deepfakes
by: Negroni, Viola, et al.
Published: (2025)
Frame-Aligned Fusion of Canary and WavLM for Non-Intrusive Intelligibility Prediction of Hearing-Aid-Processed Speech
by: Nakazawa, Kazushi
Published: (2026)
A Comparative Study on Positional Encoding for Time-frequency Domain Dual-path Transformer-based Source Separation Models
by: Saijo, Kohei, et al.
Published: (2025)
AuralNet: Hierarchical Attention-based 3D Binaural Localization of Overlapping Speakers
by: Fu, Linya, et al.
Published: (2025)
WTFormer: A Wavelet Conformer Network for MIMO Speech Enhancement with Spatial Cues Peservation
by: Han, Lu, et al.
Published: (2025)
Source Tracing of Audio Deepfake Systems
by: Klein, Nicholas, et al.
Published: (2024)
Task-Aware Unified Source Separation
by: Saijo, Kohei, et al.
Published: (2024)
Fast Algorithm for Moving Sound Source
by: Yang, Dong
Published: (2025)