:: Library Catalog

Bibliographic Details
Main Authors:	Osses, Alejandro; Varnet, Léo
Format:	Preprint
Published:	2024
Subjects:	Sound Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2409.13765

Similar Items

How phonemes contribute to deep speaker models?
by: Li, Pengqi, et al.
Published: (2024)

LLM-based phoneme-to-grapheme for phoneme-based speech recognition
by: Ma, Te, et al.
Published: (2025)

Zero-Shot Sing Voice Conversion: built upon clustering-based phoneme representations
by: Zhou, Wangjin, et al.
Published: (2024)

PRODIS -- a speech database and a phoneme-based language model for the study of predictability effects in Polish
by: Malisz, Zofia, et al.
Published: (2024)

Advancing LLM-based phoneme-to-grapheme for multilingual speech recognition
by: Dong, Lukuang, et al.
Published: (2026)

Contrastive prediction strategies for unsupervised segmentation and categorization of phonemes and words
by: Cuervo, Santiago, et al.
Published: (2021)

Data-driven grapheme-to-phoneme representations for a lexicon-free text-to-speech
by: Garg, Abhinav, et al.
Published: (2024)

Cascaded noise reduction and acoustic echo cancellation based on an extended noise reduction
by: Roebben, Arnout, et al.
Published: (2024)

Combined assessment of auditory distance perception and externalization
by: Hoppe, Henning, et al.
Published: (2024)

Detecting gamma-band responses to the speech envelope for the ICASSP 2024 Auditory EEG Decoding Signal Processing Grand Challenge
by: Thornton, Mike, et al.
Published: (2024)

Effect of laboratory conditions on the perception of virtual stages for music
by: Accolti, Ernesto
Published: (2025)

Effects of auditory distance cues and reverberation on spatial perception and listening strategies
by: Missoni, Fulvio, et al.
Published: (2025)

Human-CLAP: Human-perception-based contrastive language-audio pretraining
by: Takano, Taisei, et al.
Published: (2025)

Loss functions incorporating auditory spatial perception in deep learning -- a review
by: Rafaely, Boaz, et al.
Published: (2025)

GDiffuSE: Diffusion-based speech enhancement with noise model guidance
by: Yanir, Efrayim, et al.
Published: (2025)

Paraformer-v2: An improved non-autoregressive transformer for noise-robust speech recognition
by: An, Keyu, et al.
Published: (2024)

Toward noise-robust whisper keyword spotting on headphones with in-earcup microphone and curriculum learning
by: Yang, Qiaoyu
Published: (2025)

Towards noise-robust speech inversion through multi-task learning with speech enhancement
by: Tabatabaee, Saba, et al.
Published: (2026)

Controllable joint noise reduction and hearing loss compensation using a differentiable auditory model
by: Gonzalez, Philippe, et al.
Published: (2025)

Localizing broadband noise sources using the Loève spectrum and a 2.5D approach
by: Kasess, Christian H., et al.
Published: (2026)

Effects of automotive microphone frequency response characteristics and noise conditions on speech and ASR quality -- an experimental evaluation
by: Buccoli, Michele, et al.
Published: (2025)

EBEN: Extreme bandwidth extension network applied to speech signals captured with noise-resilient body-conduction microphones
by: Hauret, Julien, et al.
Published: (2022)

A Dataset for Automatic Assessment of TTS Quality in Spanish
by: Welford, Alejandro Sosa, et al.
Published: (2025)

Complexity of frequency fluctuations and the interpretive style in the bass viola da gamba
by: Lugo, Igor, et al.
Published: (2025)

Theory and investigation of acoustic multiple-input multiple-output systems based on spherical arrays in a room
by: Morgenstern, Hai, et al.
Published: (2024)

From the perspective of perceptual speech quality: The robustness of frequency bands to noise
by: Fan, Junyi, et al.
Published: (2025)

Improved symbolic drum style classification with grammar-based hierarchical representations
by: Géré, Léo, et al.
Published: (2024)

Speech-preserving active noise control: a deep learning approach in reverberant environments
by: Dai, Shuning
Published: (2026)

Simi-SFX: A similarity-based conditioning method for controllable sound effect synthesis
by: Liu, Yunyi, et al.
Published: (2024)

Selective Invocation for Multilingual ASR: A Cost-effective Approach Adapting to Speech Recognition Difficulty
by: Xue, Hongfei, et al.
Published: (2025)

On the effectiveness of enrollment speech augmentation for Target Speaker Extraction
by: Li, Junjie, et al.
Published: (2024)

Real-time implementation of vibrato transfer as an audio effect
by: Hyrkas, Jeremy
Published: (2025)

Mmm whatcha say? Uncovering distal and proximal context effects in first and second-language word perception using psychophysical reverse correlation
by: Tuttösí, Paige, et al.
Published: (2024)

Signal processing algorithm effective for sound quality of hearing loss simulators
by: Irino, Toshio, et al.
Published: (2024)

Disentangling peripheral hearing loss from central and cognitive effects on speech intelligibility in older adults
by: Irino, Toshio, et al.
Published: (2025)

Blind estimation of audio effects using an auto-encoder approach and differentiable digital signal processing
by: Peladeau, Côme, et al.
Published: (2023)

Identifying Hearing Difficulty Moments in Conversational Audio
by: Collins, Jack, et al.
Published: (2025)

A noise-robust acoustic method for recognizing foraging activities of grazing cattle
by: Martinez-Rau, Luciano S., et al.
Published: (2023)

Towards robust paralinguistic assessment for real-world mobile health (mHealth) monitoring: an initial study of reverberation effects on speech
by: Dineley, Judith, et al.
Published: (2023)

Adaptive ship-radiated noise recognition with learnable fine-grained wavelet transform
by: Xie, Yuan, et al.
Published: (2023)