| Main Authors: | Berriche, Lamia, Driss, Maha, Almuntashri, Areej Ahmed, Lghabi, Asma Mufreh, Almudhi, Heba Saleh, Almansour, Munerah Abdul-Aziz |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2411.13592 |
Similar Items
Towards Zero-Shot Text-To-Speech for Arabic Dialects
by: Doan, Khai Duy, et al.
Published: (2024)
Arabic Little STT: Arabic Children Speech Recognition Dataset
by: Alkadri, Mouhand, et al.
Published: (2025)
Abjad-Kids: An Arabic Speech Classification Dataset for Primary Education
by: Snoubara, Abdul Aziz, et al.
Published: (2026)
Assessing the Impact of Speaker Identity in Speech Spoofing Detection
by: Dao, Anh-Tuan, et al.
Published: (2026)
Audio2Tool: Speak, Call, Act -- A Dataset for Benchmarking Speech Tool Use
by: Pahwa, Ramit, et al.
Published: (2026)
NADI 2025: The First Multidialectal Arabic Speech Processing Shared Task
by: Talafha, Bashar, et al.
Published: (2025)
Speaking Clearly: A Simplified Whisper-Based Codec for Low-Bitrate Speech Coding
by: Zhang, Xin, et al.
Published: (2025)
Speaking Without Sound: Multi-speaker Silent Speech Voicing with Facial Inputs Only
by: Lee, Jaejun, et al.
Published: (2026)
Dialectal Coverage And Generalization in Arabic Speech Recognition
by: Djanibekov, Amirbek, et al.
Published: (2024)
Hybrid CNN-Transformer Architecture for Arabic Speech Emotion Recognition
by: Gheffari, Youcef Soufiane, et al.
Published: (2026)
Building Tailored Speech Recognizers for Japanese Speaking Assessment
by: Kubo, Yotaro, et al.
Published: (2025)
JIS: A Speech Corpus of Japanese Idol Speakers with Various Speaking Styles
by: Kondo, Yuto, et al.
Published: (2025)
ARCADE: A City-Scale Corpus for Fine-Grained Arabic Dialect Tagging
by: Nacar, Omer, et al.
Published: (2026)
SpeakStream: Streaming Text-to-Speech with Interleaved Data
by: Bai, Richard He, et al.
Published: (2025)
Expressive Speech Retrieval using Natural Language Descriptions of Speaking Style
by: Kang, Wonjune, et al.
Published: (2025)
Harf-Speech: A Clinically Aligned Framework for Arabic Phoneme-Level Speech Assessment
by: Azad, Asif, et al.
Published: (2026)
Speech Separation for Hearing-Impaired Children in the Classroom
by: Olalere, Feyisayo, et al.
Published: (2025)
Habibi: Laying the Open-Source Foundation of Unified-Dialectal Arabic Speech Synthesis
by: Chen, Yushen, et al.
Published: (2026)
Voice Quality Dimensions as Interpretable Primitives for Speaking Style for Atypical Speech and Affect
by: Narain, Jaya, et al.
Published: (2025)
Analysis of Self-Supervised Speech Models on Children's Speech and Infant Vocalizations
by: Li, Jialu, et al.
Published: (2024)
RA-CLAP: Relation-Augmented Emotional Speaking Style Contrastive Language-Audio Pretraining For Speech Retrieval
by: Sun, Haoqin, et al.
Published: (2025)
ArVoice: A Multi-Speaker Dataset for Arabic Speech Synthesis
by: Toyin, Hawau Olamide, et al.
Published: (2025)
Code-Switching in End-to-End Automatic Speech Recognition: A Systematic Literature Review
by: Agro, Maha Tufail, et al.
Published: (2025)
ParaMETA: Towards Learning Disentangled Paralinguistic Speaking Styles Representations from Speech
by: Lou, Haowei, et al.
Published: (2026)
What Do Speech Foundation Models Not Learn About Speech?
by: Waheed, Abdul, et al.
Published: (2024)
LinTO Audio and Textual Datasets to Train and Evaluate Automatic Speech Recognition in Tunisian Arabic Dialect
by: Naouara, Hedi, et al.
Published: (2025)
Contextualized Token Discrimination for Speech Search Query Correction
by: Lu, Junyu, et al.
Published: (2025)
A Novel Data Augmentation Approach for Automatic Speaking Assessment on Opinion Expressions
by: Wang, Chung-Chun, et al.
Published: (2025)
Kid-Whisper: Towards Bridging the Performance Gap in Automatic Speech Recognition for Children VS. Adults
by: Attia, Ahmed Adel, et al.
Published: (2023)
Denoising GER: A Noise-Robust Generative Error Correction with LLM for Speech Recognition
by: Liu, Yanyan, et al.
Published: (2025)
Speaking from Coarse to Fine: Improving Neural Codec Language Model via Multi-Scale Speech Coding and Generation
by: Guo, Haohan, et al.
Published: (2024)
FaceSpeak: Expressive and High-Quality Speech Synthesis from Human Portraits of Different Styles
by: Zhang, Tian-Hao, et al.
Published: (2025)
Speak Your Mind: The Speech Continuation Task as a Probe of Voice-Based Model Bias
by: Satish, Shree Harsha Bokkahalli, et al.
Published: (2025)
Efficient VoIP Communications through LLM-based Real-Time Speech Reconstruction and Call Prioritization for Emergency Services
by: Venkateshperumal, Danush, et al.
Published: (2024)
A multilingual training strategy for low resource Text to Speech
by: Amalas, Asma, et al.
Published: (2024)
IQRA 2026: Interspeech Challenge on Automatic Pronunciation Assessment for Modern Standard Arabic (MSA)
by: Kheir, Yassine El, et al.
Published: (2026)
YMIR: A new Benchmark Dataset and Model for Arabic Yemeni Music Genre Classification Using Convolutional Neural Networks
by: AL-Makhlafi, Moeen, et al.
Published: (2026)
KidSpeak: A General Multi-purpose LLM for Kids' Speech Recognition and Screening
by: Sharma, Rohan, et al.
Published: (2025)
Comparative Evaluation of Acoustic Feature Extraction Tools for Clinical Speech Analysis
by: Choi, Anna Seo Gyeong, et al.
Published: (2025)
When Vision Speaks for Sound
by: Wen, Xiaofei, et al.
Published: (2026)