:: Library Catalog

Bibliographic Details
Main Authors:	Liu, Yang; Wan, Li; Huang, Yiteng; Xu, Yong; Shi, Yangyang; Adya, Saurabh; Sun, Ming; Metze, Florian
Format:	Preprint
Published:	2025
Subjects:	Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2507.05609

Similar Items

Multi-Channel Differential ASR for Robust Wearer Speech Recognition on Smart Glasses
by: Yang, Yufeng, et al.
Published: (2025)

MASV: Speaker Verification with Global and Local Context Mamba
by: Liu, Yang, et al.
Published: (2024)

M-BEST-RQ: A Multi-Channel Speech Foundation Model for Smart Glasses
by: Yang, Yufeng, et al.
Published: (2024)

Directional Source Separation for Robust Speech Recognition on Smart Glasses
by: Feng, Tiantian, et al.
Published: (2023)

Thinking in Directivity: Speech Large Language Model for Multi-Talker Directional Speech Recognition
by: Xie, Jiamin, et al.
Published: (2025)

M2R-Whisper: Multi-stage and Multi-scale Retrieval Augmentation for Enhancing Whisper
by: Zhou, Jiaming, et al.
Published: (2024)

GAN-Based Multi-Microphone Spatial Target Speaker Extraction
by: Shetu, Shrishti Saha, et al.
Published: (2025)

WhisperMask: A Noise Suppressive Mask-Type Microphone for Whisper Speech
by: Hiraki, Hirotaka, et al.
Published: (2024)

Whisper-PMFA: Partial Multi-Scale Feature Aggregation for Speaker Verification using Whisper Models
by: Zhao, Yiyang, et al.
Published: (2024)

Advances in Microphone Array Processing and Multichannel Speech Enhancement
by: Huang, Gongping, et al.
Published: (2025)

TellWhisper: Tell Whisper Who Speaks When
by: Hu, Yifan, et al.
Published: (2026)

Hierarchical Sparse Sound Field Reconstruction with Spherical and Linear Microphone Arrays
by: Xu, Shunxi, et al.
Published: (2025)

Microphone Occlusion Mitigation for Own-Voice Enhancement in Head-Worn Microphone Arrays Using Switching-Adaptive Beamforming
by: Middelberg, Wiebke, et al.
Published: (2025)

Neural Directional Filtering Using a Compact Microphone Array
by: Huang, Weilong, et al.
Published: (2025)

Enhancing Conversational TTS with Cascaded Prompting and ICL-Based Online Reinforcement Learning
by: Ouyang, Zhicheng, et al.
Published: (2026)

Neural Ambisonic Encoding For Multi-Speaker Scenarios Using A Circular Microphone Array
by: Qiao, Yue, et al.
Published: (2024)

SpatialEmb: Extract and Encode Spatial Information for 1-Stage Multi-channel Multi-speaker ASR on Arbitrary Microphone Arrays
by: Shao, Yiwen, et al.
Published: (2026)

Blind Identification of Binaural Room Impulse Responses from Smart Glasses
by: Deppisch, Thomas, et al.
Published: (2024)

Multi-Microphone Noise Data Augmentation for DNN-based Own Voice Reconstruction for Hearables in Noisy Environments
by: Ohlenbusch, Mattes, et al.
Published: (2023)

Low-Complexity Own Voice Reconstruction for Hearables with an In-Ear Microphone
by: Ohlenbusch, Mattes, et al.
Published: (2024)

WhisperVC: Decoupled Cross-Domain Alignment and Speech Generation for Low-Resource Whisper-to-Normal Conversion
by: Liu, Dong, et al.
Published: (2025)

FADI-AEC: Fast Score Based Diffusion Model Guided by Far-end Signal for Acoustic Echo Cancellation
by: Liu, Yang, et al.
Published: (2024)

A Unified SVD-Modal Solution for Sparse Sound Field Reconstruction with Hybrid Spherical-Linear Microphone Arrays
by: Xu, Shunxi, et al.
Published: (2026)

Whisper-SV: Adapting Whisper for Low-data-resource Speaker Verification
by: Zhang, Li, et al.
Published: (2024)

SQ-Whisper: Speaker-Querying based Whisper Model for Target-Speaker ASR
by: Guo, Pengcheng, et al.
Published: (2024)

Enhanced Deep Speech Separation in Clustered Ad Hoc Distributed Microphone Environments
by: Kim, Jihyun, et al.
Published: (2024)

Ambisonics Encoding For Arbitrary Microphone Arrays Incorporating Residual Channels For Binaural Reproduction
by: Gayer, Yhonatan, et al.
Published: (2024)

A Self-Training Approach for Whisper to Enhance Long Dysarthric Speech Recognition
by: Wang, Shiyao, et al.
Published: (2025)

Effective Integration of KAN for Keyword Spotting
by: Xu, Anfeng, et al.
Published: (2024)

Multi-Microphone and Multi-Modal Emotion Recognition in Reverberant Environment
by: Cohen, Ohad, et al.
Published: (2024)

WhisperFlow: speech foundation models in real time
by: Wang, Rongxiang, et al.
Published: (2024)

HOMULA-RIR: A Room Impulse Response Dataset for Teleconferencing and Spatial Audio Applications Acquired Through Higher-Order Microphones and Uniform Linear Microphone Arrays
by: Miotello, Federico, et al.
Published: (2024)

State-Space Models in Efficient Whispered and Multi-dialect Speech Recognition
by: Farhadipour, Aref, et al.
Published: (2025)

Speech-dependent Data Augmentation for Own Voice Reconstruction with Hearable Microphones in Noisy Environments
by: Ohlenbusch, Mattes, et al.
Published: (2024)

The trajectoRIR Database: Room Acoustic Recordings Along a Trajectory of Moving Microphones
by: Damiano, Stefano, et al.
Published: (2025)

Design and Analysis of Binaural Signal Matching with Arbitrary Microphone Arrays and Listener Head Rotations
by: Madmoni, Lior, et al.
Published: (2024)

Target Speaker ASR with Whisper
by: Polok, Alexander, et al.
Published: (2024)

ASAP: An Azimuth-Priority Strip-Based Search Approach to Planar Microphone Array DOA Estimation in 3D
by: Huang, Ming, et al.
Published: (2026)

Impact of Microphone Array Mismatches to Learning-based Replay Speech Detection
by: Neri, Michael, et al.
Published: (2025)

Applying Automatic Differentiation to Optimize Differential Microphone Array Designs
by: Galougah, Siminfar Samakoush, et al.
Published: (2024)