| Main Authors: | Khurana, Sameer, Dawalatabad, Nauman, Laurent, Antoine, Vicente, Luis, Gimeno, Pablo, Mingote, Victoria, Glass, James |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2306.00789 |
Similar Items
Automatic Prediction of Amyotrophic Lateral Sclerosis Progression using Longitudinal Speech Transformer
by: Wang, Liming, et al.
Published: (2024)
Speak in the Scene: Diffusion-based Acoustic Scene Transfer toward Immersive Speech Generation
by: Kim, Miseul, et al.
Published: (2024)
Textless Unit-to-Unit training for Many-to-Many Multilingual Speech-to-Speech Translation
by: Kim, Minsu, et al.
Published: (2023)
SpeechMLC: Speech Multi-label Classification
by: Kim, Miseul, et al.
Published: (2025)
On Improving Error Resilience of Neural End-to-End Speech Coders
by: Gupta, Kishan, et al.
Published: (2024)
Late Fusion and Multi-Level Fission Amplify Cross-Modal Transfer in Text-Speech LMs
by: Cuervo, Santiago, et al.
Published: (2025)
TouchASP: Elastic Automatic Speech Perception that Everyone Can Touch
by: Song, Xingchen, et al.
Published: (2024)
A Speech Production Model for Radar: Connecting Speech Acoustics with Radar-Measured Vibrations
by: Lenz, Isabella, et al.
Published: (2025)
ParaS2S: Benchmarking and Aligning Spoken Language Models for Paralinguistic-aware Speech-to-Speech Interaction
by: Yang, Shu-wen, et al.
Published: (2025)
Automatic Speech Recognition of Non-Native Child Speech for Language Learning Applications
by: Wills, Simone, et al.
Published: (2023)
Binaural Localization Model for Speech in Noise
by: Tokala, Vikas, et al.
Published: (2025)
Speech-Based Prioritization for Schizophrenia Intervention
by: Premananth, Gowtham, et al.
Published: (2025)
Prompt-driven Target Speech Diarization
by: Jiang, Yidi, et al.
Published: (2023)
Brain-Informed Speech Separation for Cochlear Implants
by: Gajecki, Tom, et al.
Published: (2026)
Speech Enhancement based on cascaded two flows
by: Lee, Seonggyu, et al.
Published: (2025)
CrossSpeech++: Cross-lingual Speech Synthesis with Decoupled Language and Speaker Generation
by: Kim, Ji-Hoon, et al.
Published: (2024)
Advanced Signal Analysis in Detecting Replay Attacks for Automatic Speaker Verification Systems
by: Kuang, Lee Shih
Published: (2024)
FlowSE: Flow Matching-based Speech Enhancement
by: Lee, Seonggyu, et al.
Published: (2025)
Bottleneck Transformer-Based Approach for Improved Automatic STOI Score Prediction
by: Amartyaveer, et al.
Published: (2026)
Towards Improved Objective Perceptual Audio Quality Assessment -- Part 1: A Novel Data-Driven Cognitive Model
by: Delgado, Pablo M., et al.
Published: (2024)
Harmonics to the Rescue: Why Voiced Speech is Not a WSS Process
by: Bologni, Giovanni, et al.
Published: (2025)
SELM: Speech Enhancement Using Discrete Tokens and Language Models
by: Wang, Ziqian, et al.
Published: (2023)
Binaural Speech Enhancement Using Complex Convolutional Recurrent Networks
by: Tokala, Vikas, et al.
Published: (2025)
Analyzing the Impact of Accent on English Speech: Acoustic and Articulatory Perspectives
by: Premananth, Gowtham, et al.
Published: (2025)
The Overview of Segmental Durations Modification Algorithms on Speech Signal Characteristics
by: Jang, Kyeomeun, et al.
Published: (2025)
USDnet: Unsupervised Speech Dereverberation via Neural Forward Filtering
by: Wang, Zhong-Qiu
Published: (2024)
TTSlow: Slow Down Text-to-Speech with Efficiency Robustness Evaluations
by: Gao, Xiaoxue, et al.
Published: (2024)
Parameter-Efficient Fine-Tuning of Foundation Models for CLP Speech Classification
by: Bhattacharjee, Susmita, et al.
Published: (2025)
Exploring Disentangled Neural Speech Codecs from Self-Supervised Representations
by: Aihara, Ryo, et al.
Published: (2025)
Impact of Microphone Array Mismatches to Learning-based Replay Speech Detection
by: Neri, Michael, et al.
Published: (2025)
DeFTAN-II: Efficient Multichannel Speech Enhancement with Subgroup Processing
by: Lee, Dongheon, et al.
Published: (2023)
Multi-channel Replay Speech Detection using an Adaptive Learnable Beamformer
by: Neri, Michael, et al.
Published: (2025)
Entropy-Guided GRVQ for Ultra-Low Bitrate Neural Speech Codec
by: Ren, Yanzhou, et al.
Published: (2026)
BanglaNum -- A Public Dataset for Bengali Digit Recognition from Speech
by: Mohammad, Mir Sayeed, et al.
Published: (2024)
Mixture to Mixture: Leveraging Close-talk Mixtures as Weak-supervision for Speech Separation
by: Wang, Zhong-Qiu
Published: (2024)
FlowSE: Efficient and High-Quality Speech Enhancement via Flow Matching
by: Wang, Ziqian, et al.
Published: (2025)
HyBeam: Hybrid Microphone-Beamforming Array-Agnostic Speech Enhancement for Wearables
by: Ilan, Yuval Bar, et al.
Published: (2025)
Automatic Voice Classification Of Autistic Subjects
by: Vacca, Jessica, et al.
Published: (2024)
Zero-Bit Transmission of Adaptive Pre- and De-emphasis Filters for Speech and Audio Coding
by: Piralideh, Niloofar Omidi, et al.
Published: (2024)
AffectSpeech: A Large-Scale Emotional Speech Dataset with Fine-Grained Textual Descriptions for Speech Emotion Captioning and Synthesis
by: Qi, Tianhua, et al.
Published: (2026)