| Main Authors: | Liu, Yang; Wan, Li; Huang, Yiteng; Xu, Yong; Shi, Yangyang; Adya, Saurabh; Sun, Ming; Metze, Florian |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2507.05609 |
Similar Items
Multi-Channel Differential ASR for Robust Wearer Speech Recognition on Smart Glasses
by: Yang, Yufeng, et al.
Published: (2025)
MASV: Speaker Verification with Global and Local Context Mamba
by: Liu, Yang, et al.
Published: (2024)
M-BEST-RQ: A Multi-Channel Speech Foundation Model for Smart Glasses
by: Yang, Yufeng, et al.
Published: (2024)
Directional Source Separation for Robust Speech Recognition on Smart Glasses
by: Feng, Tiantian, et al.
Published: (2023)
Thinking in Directivity: Speech Large Language Model for Multi-Talker Directional Speech Recognition
by: Xie, Jiamin, et al.
Published: (2025)
M2R-Whisper: Multi-stage and Multi-scale Retrieval Augmentation for Enhancing Whisper
by: Zhou, Jiaming, et al.
Published: (2024)
GAN-Based Multi-Microphone Spatial Target Speaker Extraction
by: Shetu, Shrishti Saha, et al.
Published: (2025)
WhisperMask: A Noise Suppressive Mask-Type Microphone for Whisper Speech
by: Hiraki, Hirotaka, et al.
Published: (2024)
Whisper-PMFA: Partial Multi-Scale Feature Aggregation for Speaker Verification using Whisper Models
by: Zhao, Yiyang, et al.
Published: (2024)
Advances in Microphone Array Processing and Multichannel Speech Enhancement
by: Huang, Gongping, et al.
Published: (2025)
TellWhisper: Tell Whisper Who Speaks When
by: Hu, Yifan, et al.
Published: (2026)
Hierarchical Sparse Sound Field Reconstruction with Spherical and Linear Microphone Arrays
by: Xu, Shunxi, et al.
Published: (2025)
Microphone Occlusion Mitigation for Own-Voice Enhancement in Head-Worn Microphone Arrays Using Switching-Adaptive Beamforming
by: Middelberg, Wiebke, et al.
Published: (2025)
Neural Directional Filtering Using a Compact Microphone Array
by: Huang, Weilong, et al.
Published: (2025)
Enhancing Conversational TTS with Cascaded Prompting and ICL-Based Online Reinforcement Learning
by: Ouyang, Zhicheng, et al.
Published: (2026)
Neural Ambisonic Encoding For Multi-Speaker Scenarios Using A Circular Microphone Array
by: Qiao, Yue, et al.
Published: (2024)
SpatialEmb: Extract and Encode Spatial Information for 1-Stage Multi-channel Multi-speaker ASR on Arbitrary Microphone Arrays
by: Shao, Yiwen, et al.
Published: (2026)
Blind Identification of Binaural Room Impulse Responses from Smart Glasses
by: Deppisch, Thomas, et al.
Published: (2024)
Multi-Microphone Noise Data Augmentation for DNN-based Own Voice Reconstruction for Hearables in Noisy Environments
by: Ohlenbusch, Mattes, et al.
Published: (2023)
Low-Complexity Own Voice Reconstruction for Hearables with an In-Ear Microphone
by: Ohlenbusch, Mattes, et al.
Published: (2024)
WhisperVC: Decoupled Cross-Domain Alignment and Speech Generation for Low-Resource Whisper-to-Normal Conversion
by: Liu, Dong, et al.
Published: (2025)
FADI-AEC: Fast Score Based Diffusion Model Guided by Far-end Signal for Acoustic Echo Cancellation
by: Liu, Yang, et al.
Published: (2024)
A Unified SVD-Modal Solution for Sparse Sound Field Reconstruction with Hybrid Spherical-Linear Microphone Arrays
by: Xu, Shunxi, et al.
Published: (2026)
Whisper-SV: Adapting Whisper for Low-data-resource Speaker Verification
by: Zhang, Li, et al.
Published: (2024)
SQ-Whisper: Speaker-Querying based Whisper Model for Target-Speaker ASR
by: Guo, Pengcheng, et al.
Published: (2024)
Enhanced Deep Speech Separation in Clustered Ad Hoc Distributed Microphone Environments
by: Kim, Jihyun, et al.
Published: (2024)
Ambisonics Encoding For Arbitrary Microphone Arrays Incorporating Residual Channels For Binaural Reproduction
by: Gayer, Yhonatan, et al.
Published: (2024)
A Self-Training Approach for Whisper to Enhance Long Dysarthric Speech Recognition
by: Wang, Shiyao, et al.
Published: (2025)
Effective Integration of KAN for Keyword Spotting
by: Xu, Anfeng, et al.
Published: (2024)
Multi-Microphone and Multi-Modal Emotion Recognition in Reverberant Environment
by: Cohen, Ohad, et al.
Published: (2024)
WhisperFlow: speech foundation models in real time
by: Wang, Rongxiang, et al.
Published: (2024)
HOMULA-RIR: A Room Impulse Response Dataset for Teleconferencing and Spatial Audio Applications Acquired Through Higher-Order Microphones and Uniform Linear Microphone Arrays
by: Miotello, Federico, et al.
Published: (2024)
State-Space Models in Efficient Whispered and Multi-dialect Speech Recognition
by: Farhadipour, Aref, et al.
Published: (2025)
Speech-dependent Data Augmentation for Own Voice Reconstruction with Hearable Microphones in Noisy Environments
by: Ohlenbusch, Mattes, et al.
Published: (2024)
The trajectoRIR Database: Room Acoustic Recordings Along a Trajectory of Moving Microphones
by: Damiano, Stefano, et al.
Published: (2025)
Design and Analysis of Binaural Signal Matching with Arbitrary Microphone Arrays and Listener Head Rotations
by: Madmoni, Lior, et al.
Published: (2024)
Target Speaker ASR with Whisper
by: Polok, Alexander, et al.
Published: (2024)
ASAP: An Azimuth-Priority Strip-Based Search Approach to Planar Microphone Array DOA Estimation in 3D
by: Huang, Ming, et al.
Published: (2026)
Impact of Microphone Array Mismatches to Learning-based Replay Speech Detection
by: Neri, Michael, et al.
Published: (2025)
Applying Automatic Differentiation to Optimize Differential Microphone Array Designs
by: Galougah, Siminfar Samakoush, et al.
Published: (2024)