:: Library Catalog

Bibliographic Details
Main Author:	Yang, Qiaoyu
Format:	Preprint
Published:	2025
Subjects:	Audio and Speech Processing; Sound
Online Access:	https://arxiv.org/abs/2502.00295

Similar Items

The taste of IPA: Towards open-vocabulary keyword spotting and forced alignment in any language
by: Zhu, Jian, et al.
Published: (2023)

Open vocabulary keyword spotting through transfer learning from speech synthesis
by: V, Kesavaraj, et al.
Published: (2024)

A circular microphone array with virtual microphones based on acoustics-informed neural networks
by: Zhao, Sipei, et al.
Published: (2024)

Boosting keyword spotting through on-device learnable user speech characteristics
by: Cioflan, Cristian, et al.
Published: (2024)

Towards noise-robust speech inversion through multi-task learning with speech enhancement
by: Tabatabaee, Saba, et al.
Published: (2026)

Effects of automotive microphone frequency response characteristics and noise conditions on speech and ASR quality -- an experimental evaluation
by: Buccoli, Michele, et al.
Published: (2025)

EBEN: Extreme bandwidth extension network applied to speech signals captured with noise-resilient body-conduction microphones
by: Hauret, Julien, et al.
Published: (2022)

Neural Ambisonics encoding for compact irregular microphone arrays
by: Heikkinen, Mikko, et al.
Published: (2024)

Binaural rendering from microphone array signals of arbitrary geometry
by: Iijima, Naoto, et al.
Published: (2021)

Vibration Sensitivity of one-port and two-port MEMS microphones
by: Doyon-D'Amour, Francis, et al.
Published: (2024)

Listening broadband physical model for microphones: a first step
by: Millot, Laurent, et al.
Published: (2024)

Modal smoothing for analysis of room reflections measured with spherical microphone and loudspeaker arrays
by: Morgenstern, Hai, et al.
Published: (2024)

Improving curriculum learning for target speaker extraction with synthetic speakers
by: Liu, Yun, et al.
Published: (2024)

Design framework for spherical microphone and loudspeaker arrays in a multiple-input multiple-output system
by: Morgenstern, Hai, et al.
Published: (2024)

Improved direction of arrival estimations with a wearable microphone array for dynamic environments by reliability weighting
by: Mitchell, Daniel A., et al.
Published: (2024)

Improving Whispered Speech Recognition Performance using Pseudo-whispered based Data Augmentation
by: Lin, Zhaofeng, et al.
Published: (2023)

Paraformer-v2: An improved non-autoregressive transformer for noise-robust speech recognition
by: An, Keyu, et al.
Published: (2024)

Interfacing PDM MEMS microphones with PFM spiking systems: Application for Neuromorphic Auditory Sensors
by: Jimenez-Fernandez, Angel, et al.
Published: (2019)

From the perspective of perceptual speech quality: The robustness of frequency bands to noise
by: Fan, Junyi, et al.
Published: (2025)

A data-driven two-microphone method for in-situ sound absorption measurements
by: Emmerich, Leon, et al.
Published: (2025)

WavJEPA: Semantic learning unlocks robust audio foundation models for raw waveforms
by: Yuksel, Goksenin, et al.
Published: (2025)

Hardware-accelerated graph neural networks: an alternative approach for neuromorphic event-based audio classification and keyword spotting on SoC FPGA
by: Jeziorek, Kamil, et al.
Published: (2026)

Improving vision-inspired keyword spotting using dynamic module skipping in streaming conformer encoder
by: Bittar, Alexandre, et al.
Published: (2023)

A noise-robust acoustic method for recognizing foraging activities of grazing cattle
by: Martinez-Rau, Luciano S., et al.
Published: (2023)

Cascaded noise reduction and acoustic echo cancellation based on an extended noise reduction
by: Roebben, Arnout, et al.
Published: (2024)

Towards robust paralinguistic assessment for real-world mobile health (mHealth) monitoring: an initial study of reverberation effects on speech
by: Dineley, Judith, et al.
Published: (2023)

A robust audio deepfake detection system via multi-view feature
by: Yang, Yujie, et al.
Published: (2024)

Discrete Tokens Exhibit Interlanguage Speech Intelligibility Benefit: an Analytical Study Towards Accent-robust ASR Only with Native Speech Data
by: Onda, Kentaro, et al.
Published: (2025)

Speech-preserving active noise control: a deep learning approach in reverberant environments
by: Dai, Shuning
Published: (2026)

Towards interpretable emotion recognition: Identifying key features with machine learning
by: Kaloga, Yacouba, et al.
Published: (2025)

A lightweight and robust method for blind wideband-to-fullband extension of speech
by: Büthe, Jan, et al.
Published: (2024)

GDiffuSE: Diffusion-based speech enhancement with noise model guidance
by: Yanir, Efrayim, et al.
Published: (2025)

Hidden bawls, whispers, and yelps: can text be made to sound more than just its words?
by: Pataca, Caluã de Lacerda, et al.
Published: (2022)

A microscopic investigation of the effect of random envelope fluctuations on phoneme-in-noise perception
by: Osses, Alejandro, et al.
Published: (2024)

ComplexDec: A Domain-robust High-fidelity Neural Audio Codec with Complex Spectrum Modeling
by: Wu, Yi-Chiao, et al.
Published: (2025)

Controllable joint noise reduction and hearing loss compensation using a differentiable auditory model
by: Gonzalez, Philippe, et al.
Published: (2025)

Localizing broadband noise sources using the Loève spectrum and a 2.5D approach
by: Kasess, Christian H., et al.
Published: (2026)

Towards Decoupling Frontend Enhancement and Backend Recognition in Monaural Robust ASR
by: Yang, Yufeng, et al.
Published: (2024)

AdaKWS: Towards Robust Keyword Spotting with Test-Time Adaptation
by: Xiao, Yang, et al.
Published: (2025)

Towards Expressive Zero-Shot Speech Synthesis with Hierarchical Prosody Modeling
by: Jiang, Yuepeng, et al.
Published: (2024)