| Main Authors: | Osses, Alejandro, Varnet, Léo |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2409.13765 |
Similar Items
How phonemes contribute to deep speaker models?
by: Li, Pengqi, et al.
Published: (2024)
LLM-based phoneme-to-grapheme for phoneme-based speech recognition
by: Ma, Te, et al.
Published: (2025)
Zero-Shot Sing Voice Conversion: built upon clustering-based phoneme representations
by: Zhou, Wangjin, et al.
Published: (2024)
PRODIS -- a speech database and a phoneme-based language model for the study of predictability effects in Polish
by: Malisz, Zofia, et al.
Published: (2024)
Advancing LLM-based phoneme-to-grapheme for multilingual speech recognition
by: Dong, Lukuang, et al.
Published: (2026)
Contrastive prediction strategies for unsupervised segmentation and categorization of phonemes and words
by: Cuervo, Santiago, et al.
Published: (2021)
Data-driven grapheme-to-phoneme representations for a lexicon-free text-to-speech
by: Garg, Abhinav, et al.
Published: (2024)
Cascaded noise reduction and acoustic echo cancellation based on an extended noise reduction
by: Roebben, Arnout, et al.
Published: (2024)
Combined assessment of auditory distance perception and externalization
by: Hoppe, Henning, et al.
Published: (2024)
Detecting gamma-band responses to the speech envelope for the ICASSP 2024 Auditory EEG Decoding Signal Processing Grand Challenge
by: Thornton, Mike, et al.
Published: (2024)
Effect of laboratory conditions on the perception of virtual stages for music
by: Accolti, Ernesto
Published: (2025)
Effects of auditory distance cues and reverberation on spatial perception and listening strategies
by: Missoni, Fulvio, et al.
Published: (2025)
Human-CLAP: Human-perception-based contrastive language-audio pretraining
by: Takano, Taisei, et al.
Published: (2025)
Loss functions incorporating auditory spatial perception in deep learning -- a review
by: Rafaely, Boaz, et al.
Published: (2025)
GDiffuSE: Diffusion-based speech enhancement with noise model guidance
by: Yanir, Efrayim, et al.
Published: (2025)
Paraformer-v2: An improved non-autoregressive transformer for noise-robust speech recognition
by: An, Keyu, et al.
Published: (2024)
Toward noise-robust whisper keyword spotting on headphones with in-earcup microphone and curriculum learning
by: Yang, Qiaoyu
Published: (2025)
Towards noise-robust speech inversion through multi-task learning with speech enhancement
by: Tabatabaee, Saba, et al.
Published: (2026)
Controllable joint noise reduction and hearing loss compensation using a differentiable auditory model
by: Gonzalez, Philippe, et al.
Published: (2025)
Localizing broadband noise sources using the Loève spectrum and a 2.5D approach
by: Kasess, Christian H., et al.
Published: (2026)
Effects of automotive microphone frequency response characteristics and noise conditions on speech and ASR quality -- an experimental evaluation
by: Buccoli, Michele, et al.
Published: (2025)
EBEN: Extreme bandwidth extension network applied to speech signals captured with noise-resilient body-conduction microphones
by: Hauret, Julien, et al.
Published: (2022)
A Dataset for Automatic Assessment of TTS Quality in Spanish
by: Welford, Alejandro Sosa, et al.
Published: (2025)
Complexity of frequency fluctuations and the interpretive style in the bass viola da gamba
by: Lugo, Igor, et al.
Published: (2025)
Theory and investigation of acoustic multiple-input multiple-output systems based on spherical arrays in a room
by: Morgenstern, Hai, et al.
Published: (2024)
From the perspective of perceptual speech quality: The robustness of frequency bands to noise
by: Fan, Junyi, et al.
Published: (2025)
Improved symbolic drum style classification with grammar-based hierarchical representations
by: Géré, Léo, et al.
Published: (2024)
Speech-preserving active noise control: a deep learning approach in reverberant environments
by: Dai, Shuning
Published: (2026)
Simi-SFX: A similarity-based conditioning method for controllable sound effect synthesis
by: Liu, Yunyi, et al.
Published: (2024)
Selective Invocation for Multilingual ASR: A Cost-effective Approach Adapting to Speech Recognition Difficulty
by: Xue, Hongfei, et al.
Published: (2025)
On the effectiveness of enrollment speech augmentation for Target Speaker Extraction
by: Li, Junjie, et al.
Published: (2024)
Real-time implementation of vibrato transfer as an audio effect
by: Hyrkas, Jeremy
Published: (2025)
Mmm whatcha say? Uncovering distal and proximal context effects in first and second-language word perception using psychophysical reverse correlation
by: Tuttösí, Paige, et al.
Published: (2024)
Signal processing algorithm effective for sound quality of hearing loss simulators
by: Irino, Toshio, et al.
Published: (2024)
Disentangling peripheral hearing loss from central and cognitive effects on speech intelligibility in older adults
by: Irino, Toshio, et al.
Published: (2025)
Blind estimation of audio effects using an auto-encoder approach and differentiable digital signal processing
by: Peladeau, Côme, et al.
Published: (2023)
Identifying Hearing Difficulty Moments in Conversational Audio
by: Collins, Jack, et al.
Published: (2025)
A noise-robust acoustic method for recognizing foraging activities of grazing cattle
by: Martinez-Rau, Luciano S., et al.
Published: (2023)
Towards robust paralinguistic assessment for real-world mobile health (mHealth) monitoring: an initial study of reverberation effects on speech
by: Dineley, Judith, et al.
Published: (2023)
Adaptive ship-radiated noise recognition with learnable fine-grained wavelet transform
by: Xie, Yuan, et al.
Published: (2023)