:: Library Catalog

Bibliographic Details
Main Authors:	Kheir, Yassine El; Mubarak, Hamdy; Ali, Ahmed; Chowdhury, Shammur Absar
Format:	Preprint
Published:	2024
Subjects:	Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2408.02430

Similar Items

Speech Representation Analysis based on Inter- and Intra-Model Similarities
by: Kheir, Yassine El, et al.
Published: (2024)

IQRA 2026: Interspeech Challenge on Automatic Pronunciation Assessment for Modern Standard Arabic (MSA)
by: Kheir, Yassine El, et al.
Published: (2026)

CAFE A Novel Code switching Dataset for Algerian Dialect French and English
by: Lachemat, Houssam Eddine-Othman, et al.
Published: (2024)

HARNESS: Lightweight Distilled Arabic Speech Foundation Models
by: Sukhadia, Vrunda N., et al.
Published: (2026)

Children's Speech Recognition through Discrete Token Enhancement
by: Sukhadia, Vrunda N., et al.
Published: (2024)

Multi-Task Instruction Tuning via Data Scheduling for Low-Resource Arabic AudioLLMs
by: Bhatti, Hunzalah Hassan, et al.
Published: (2026)

BiCrossMamba-ST: Speech Deepfake Detection with Bidirectional Mamba Spectro-Temporal Cross-Attention
by: Kheir, Yassine El, et al.
Published: (2025)

Towards a Unified Benchmark for Arabic Pronunciation Assessment: Quranic Recitation as Case Study
by: Kheir, Yassine El, et al.
Published: (2025)

Comprehensive Layer-wise Analysis of SSL Models for Audio Deepfake Detection
by: Kheir, Yassine El, et al.
Published: (2025)

Automatic Assessment of Dysarthria Using Audio-visual Vowel Graph Attention Network
by: Liu, Xiaokang, et al.
Published: (2024)

LinTO Audio and Textual Datasets to Train and Evaluate Automatic Speech Recognition in Tunisian Arabic Dialect
by: Naouara, Hedi, et al.
Published: (2025)

Dialectal Coverage And Generalization in Arabic Speech Recognition
by: Djanibekov, Amirbek, et al.
Published: (2024)

Generalizable Audio Spoofing Detection using Non-Semantic Representations
by: Das, Arnab, et al.
Published: (2025)

Towards Zero-Shot Text-To-Speech for Arabic Dialects
by: Doan, Khai Duy, et al.
Published: (2024)

Two Views, One Truth: Spectral and Self-Supervised Features Fusion for Robust Speech Deepfake Detection
by: Kheir, Yassine El, et al.
Published: (2025)

DeepFense: A Unified, Modular, and Extensible Framework for Robust Deepfake Audio Detection
by: Kheir, Yassine El, et al.
Published: (2026)

Automatic Speech Recognition with BERT and CTC Transformers: A Review
by: Djeffal, Noussaiba, et al.
Published: (2024)

MENASpeechBank: A Reference Voice Bank with Persona-Conditioned Multi-Turn Conversations for AudioLLMs
by: Ali, Zien Sheikh, et al.
Published: (2026)

Voice Conversion Improves Cross-Domain Robustness for Spoken Arabic Dialect Identification
by: Abdullah, Badr M., et al.
Published: (2025)

Habibi: Laying the Open-Source Foundation of Unified-Dialectal Arabic Speech Synthesis
by: Chen, Yushen, et al.
Published: (2026)

You Sound a Little Tense: L2 Tailored Clear TTS Using Durational Vowel Properties
by: Tuttösí, Paige, et al.
Published: (2025)

Overcoming Data Scarcity in Multi-Dialectal Arabic ASR via Whisper Fine-Tuning
by: Özyilmaz, Ömer Tarik, et al.
Published: (2025)

Hybrid Deep Learning and Signal Processing for Arabic Dialect Recognition in Low-Resource Settings
by: Al-Shwayyat, Ghazal, et al.
Published: (2025)

Cross-Dialect Bird Species Recognition with Dialect-Calibrated Augmentation
by: Ding, Jiani, et al.
Published: (2025)

A Multi-Dialectal Dataset for German Dialect ASR and Dialect-to-Standard Speech Translation
by: Blaschke, Verena, et al.
Published: (2025)

Dolphin-CN-Dialect: Where Chinese Dialects Matter
by: Meng, Yangyang, et al.
Published: (2026)

Automatic Speech Recognition using Advanced Deep Learning Approaches: A survey
by: Kheddar, Hamza, et al.
Published: (2024)

From Words to Waves: Analyzing Concept Formation in Speech and Text-Based Foundation Models
by: Ersoy, Asım, et al.
Published: (2025)

LAMA-UT: Language Agnostic Multilingual ASR through Orthography Unification and Language-Specific Transliteration
by: Lee, Sangmin, et al.
Published: (2024)

DiveSound: LLM-Assisted Automatic Taxonomy Construction for Diverse Audio Generation
by: Li, Baihan, et al.
Published: (2024)

Automatic Sound Event Detection and Classification of Great Ape Calls Using Neural Networks
by: Jiang, Zifan, et al.
Published: (2023)

Improving the Robustness and Clinical Applicability of Automatic Respiratory Sound Classification Using Deep Learning-Based Audio Enhancement: Algorithm Development and Validation
by: Tzeng, Jing-Tong, et al.
Published: (2024)

Arabic TTS with FastPitch: Reproducible Baselines, Adversarial Training, and Oversmoothing Analysis
by: Nippert, Lars
Published: (2025)

Adaptive Representations of Sound for Automatic Insect Recognition
by: Faiß, Marius, et al.
Published: (2023)

Towards Naturalistic Voice Conversion: NaturalVoices Dataset with an Automatic Processing Pipeline
by: Salman, Ali N., et al.
Published: (2024)

Arabic ASR on the SADA Large-Scale Arabic Speech Corpus with Transformer-Based Models
by: Gerazov, Branislav, et al.
Published: (2025)

Classification of Short Segment Pediatric Heart Sounds Based on a Transformer-Based Convolutional Neural Network
by: Hassanuzzaman, Md, et al.
Published: (2024)

A Unified Denoising and Adaptation Framework for Self-Supervised Bengali Dialectal ASR
by: Biswas, Swadhin, et al.
Published: (2025)

SoulX-Podcast: Towards Realistic Long-form Podcasts with Dialectal and Paralinguistic Diversity
by: Xie, Hanke, et al.
Published: (2025)

Automatic Inspection Based on Switch Sounds of Electric Point Machines
by: Shibata, Ayano, et al.
Published: (2025)