| Main Author: | Yang, Qiaoyu |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2502.00295 |
Similar Items
The taste of IPA: Towards open-vocabulary keyword spotting and forced alignment in any language
by: Zhu, Jian, et al.
Published: (2023)
Open vocabulary keyword spotting through transfer learning from speech synthesis
by: V, Kesavaraj, et al.
Published: (2024)
A circular microphone array with virtual microphones based on acoustics-informed neural networks
by: Zhao, Sipei, et al.
Published: (2024)
Boosting keyword spotting through on-device learnable user speech characteristics
by: Cioflan, Cristian, et al.
Published: (2024)
Towards noise-robust speech inversion through multi-task learning with speech enhancement
by: Tabatabaee, Saba, et al.
Published: (2026)
Effects of automotive microphone frequency response characteristics and noise conditions on speech and ASR quality -- an experimental evaluation
by: Buccoli, Michele, et al.
Published: (2025)
EBEN: Extreme bandwidth extension network applied to speech signals captured with noise-resilient body-conduction microphones
by: Hauret, Julien, et al.
Published: (2022)
Neural Ambisonics encoding for compact irregular microphone arrays
by: Heikkinen, Mikko, et al.
Published: (2024)
Binaural rendering from microphone array signals of arbitrary geometry
by: Iijima, Naoto, et al.
Published: (2021)
Vibration Sensitivity of one-port and two-port MEMS microphones
by: Doyon-D'Amour, Francis, et al.
Published: (2024)
Listening broadband physical model for microphones: a first step
by: Millot, Laurent, et al.
Published: (2024)
Modal smoothing for analysis of room reflections measured with spherical microphone and loudspeaker arrays
by: Morgenstern, Hai, et al.
Published: (2024)
Improving curriculum learning for target speaker extraction with synthetic speakers
by: Liu, Yun, et al.
Published: (2024)
Design framework for spherical microphone and loudspeaker arrays in a multiple-input multiple-output system
by: Morgenstern, Hai, et al.
Published: (2024)
Improved direction of arrival estimations with a wearable microphone array for dynamic environments by reliability weighting
by: Mitchell, Daniel A., et al.
Published: (2024)
Improving Whispered Speech Recognition Performance using Pseudo-whispered based Data Augmentation
by: Lin, Zhaofeng, et al.
Published: (2023)
Paraformer-v2: An improved non-autoregressive transformer for noise-robust speech recognition
by: An, Keyu, et al.
Published: (2024)
Interfacing PDM MEMS microphones with PFM spiking systems: Application for Neuromorphic Auditory Sensors
by: Jimenez-Fernandez, Angel, et al.
Published: (2019)
From the perspective of perceptual speech quality: The robustness of frequency bands to noise
by: Fan, Junyi, et al.
Published: (2025)
A data-driven two-microphone method for in-situ sound absorption measurements
by: Emmerich, Leon, et al.
Published: (2025)
WavJEPA: Semantic learning unlocks robust audio foundation models for raw waveforms
by: Yuksel, Goksenin, et al.
Published: (2025)
Hardware-accelerated graph neural networks: an alternative approach for neuromorphic event-based audio classification and keyword spotting on SoC FPGA
by: Jeziorek, Kamil, et al.
Published: (2026)
Improving vision-inspired keyword spotting using dynamic module skipping in streaming conformer encoder
by: Bittar, Alexandre, et al.
Published: (2023)
A noise-robust acoustic method for recognizing foraging activities of grazing cattle
by: Martinez-Rau, Luciano S., et al.
Published: (2023)
Cascaded noise reduction and acoustic echo cancellation based on an extended noise reduction
by: Roebben, Arnout, et al.
Published: (2024)
Towards robust paralinguistic assessment for real-world mobile health (mHealth) monitoring: an initial study of reverberation effects on speech
by: Dineley, Judith, et al.
Published: (2023)
A robust audio deepfake detection system via multi-view feature
by: Yang, Yujie, et al.
Published: (2024)
Discrete Tokens Exhibit Interlanguage Speech Intelligibility Benefit: an Analytical Study Towards Accent-robust ASR Only with Native Speech Data
by: Onda, Kentaro, et al.
Published: (2025)
Speech-preserving active noise control: a deep learning approach in reverberant environments
by: Dai, Shuning
Published: (2026)
Towards interpretable emotion recognition: Identifying key features with machine learning
by: Kaloga, Yacouba, et al.
Published: (2025)
A lightweight and robust method for blind wideband-to-fullband extension of speech
by: Büthe, Jan, et al.
Published: (2024)
GDiffuSE: Diffusion-based speech enhancement with noise model guidance
by: Yanir, Efrayim, et al.
Published: (2025)
Hidden bawls, whispers, and yelps: can text be made to sound more than just its words?
by: Pataca, Caluã de Lacerda, et al.
Published: (2022)
A microscopic investigation of the effect of random envelope fluctuations on phoneme-in-noise perception
by: Osses, Alejandro, et al.
Published: (2024)
ComplexDec: A Domain-robust High-fidelity Neural Audio Codec with Complex Spectrum Modeling
by: Wu, Yi-Chiao, et al.
Published: (2025)
Controllable joint noise reduction and hearing loss compensation using a differentiable auditory model
by: Gonzalez, Philippe, et al.
Published: (2025)
Localizing broadband noise sources using the Loève spectrum and a 2.5D approach
by: Kasess, Christian H., et al.
Published: (2026)
Towards Decoupling Frontend Enhancement and Backend Recognition in Monaural Robust ASR
by: Yang, Yufeng, et al.
Published: (2024)
AdaKWS: Towards Robust Keyword Spotting with Test-Time Adaptation
by: Xiao, Yang, et al.
Published: (2025)
Towards Expressive Zero-Shot Speech Synthesis with Hierarchical Prosody Modeling
by: Jiang, Yuepeng, et al.
Published: (2024)