| Main Authors: | Amooie, Reihaneh; de Vries, Wietse; Hao, Yun; Dijkstra, Jelske; Coler, Matt; Wieling, Martijn |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2502.04883 |
Similar Items
Overcoming Data Scarcity in Multi-Dialectal Arabic ASR via Whisper Fine-Tuning
by: Özyilmaz, Ömer Tarik, et al.
Published: (2025)
Breaking the Transcription Bottleneck: Fine-tuning ASR Models for Extremely Low-Resource Fieldwork Languages
by: Liang, Siyu, et al.
Published: (2025)
A Unified Denoising and Adaptation Framework for Self-Supervised Bengali Dialectal ASR
by: Biswas, Swadhin, et al.
Published: (2025)
Performance Analysis of Speech Encoders for Low-Resource SLU and ASR in Tunisian Dialect
by: Mdhaffar, Salima, et al.
Published: (2024)
Enhancing Pre-trained ASR System Fine-tuning for Dysarthric Speech Recognition using Adversarial Data Augmentation
by: Wang, Huimeng, et al.
Published: (2024)
Relationship between objective and subjective perceptual measures of speech in individuals with head and neck cancer
by: Halpern, Bence Mark, et al.
Published: (2025)
Leveraging Large Language Models for Sarcastic Speech Annotation in Sarcasm Detection
by: Li, Zhu, et al.
Published: (2025)
Modeling Sarcastic Speech: Semantic and Prosodic Cues in a Speech Synthesis Framework
by: Li, Zhu, et al.
Published: (2025)
A Functional Trade-off between Prosodic and Semantic Cues in Conveying Sarcasm
by: Li, Zhu, et al.
Published: (2024)
LUPET: Incorporating Hierarchical Information Path into Multilingual ASR
by: Liu, Wei, et al.
Published: (2024)
EFFUSE: Efficient Self-Supervised Feature Fusion for E2E ASR in Low Resource and Multilingual Scenarios
by: Srivastava, Tejes, et al.
Published: (2023)
Romanization Encoding For Multilingual ASR
by: Ding, Wen, et al.
Published: (2024)
Exploring the Impact of Data Quantity on ASR in Extremely Low-resource Languages
by: Cheng, Yao-Fei, et al.
Published: (2024)
Complexity boosted adaptive training for better low resource ASR performance
by: Lu, Hongxuan, et al.
Published: (2024)
OLMoASR: Open Models and Data for Training Robust Speech Recognition Models
by: Ngo, Huong, et al.
Published: (2025)
VoxCog: Towards End-to-End Multilingual Cognitive Impairment Classification through Dialectal Knowledge
by: Feng, Tiantian, et al.
Published: (2026)
Improving Multilingual ASR in the Wild Using Simple N-best Re-ranking
by: Yan, Brian, et al.
Published: (2024)
MSA-ASR: Efficient Multilingual Speaker Attribution with frozen ASR Models
by: Nguyen, Thai-Binh, et al.
Published: (2024)
Voice Conversion Improves Cross-Domain Robustness for Spoken Arabic Dialect Identification
by: Abdullah, Badr M., et al.
Published: (2025)
Improving Multilingual Speech Models on ML-SUPERB 2.0: Fine-tuning with Data Augmentation and LID-Aware CTC
by: Wang, Qingzheng, et al.
Published: (2025)
Multistage Fine-tuning Strategies for Automatic Speech Recognition in Low-resource Languages
by: Pillai, Leena G, et al.
Published: (2024)
Exploring SSL Discrete Tokens for Multilingual ASR
by: Cui, Mingyu, et al.
Published: (2024)
Configurable Multilingual ASR with Speech Summary Representations
by: Zhu, Harrison, et al.
Published: (2024)
Selective Invocation for Multilingual ASR: A Cost-effective Approach Adapting to Speech Recognition Difficulty
by: Xue, Hongfei, et al.
Published: (2025)
Fine-Tuning ASR for Stuttered Speech: Personalized vs. Generalized Approaches
by: Mujtaba, Dena, et al.
Published: (2025)
Cross-Dialect Bird Species Recognition with Dialect-Calibrated Augmentation
by: Ding, Jiani, et al.
Published: (2025)
Extending Whisper with prompt tuning to target-speaker ASR
by: Ma, Hao, et al.
Published: (2023)
Advocating Character Error Rate for Multilingual ASR Evaluation
by: K, Thennal D, et al.
Published: (2024)
Fine-tune the pretrained ATST model for sound event detection
by: Shao, Nian, et al.
Published: (2023)
SCORE: Self-supervised Correspondence Fine-tuning for Improved Content Representations
by: Meghanani, Amit, et al.
Published: (2024)
Delayed-KD: Delayed Knowledge Distillation based CTC for Low-Latency Streaming ASR
by: Li, Longhao, et al.
Published: (2025)
Continual Learning Optimizations for Auto-regressive Decoder of Multilingual ASR systems
by: Kwok, Chin Yuen, et al.
Published: (2024)
Efficient Multilingual ASR Finetuning via LoRA Language Experts
by: Li, Jiahong, et al.
Published: (2025)
Improving Acoustic Scene Classification in Low-Resource Conditions
by: Chen, Zhi, et al.
Published: (2024)
Anatomy of Industrial Scale Multilingual ASR
by: Ramirez, Francis McCann, et al.
Published: (2024)
SOA: Reducing Domain Mismatch in SSL Pipeline by Speech Only Adaptation for Low Resource ASR
by: Shankar, Natarajan Balaji, et al.
Published: (2024)
Target Speaker ASR with Whisper
by: Polok, Alexander, et al.
Published: (2024)
Index-ASR Technical Report
by: Song, Zheshu, et al.
Published: (2025)
Improving noisy student training for low-resource languages in End-to-End ASR using CycleGAN and inter-domain losses
by: Li, Chia-Yu, et al.
Published: (2024)
Improving Spoken Language Modeling with Phoneme Classification: A Simple Fine-tuning Approach
by: Poli, Maxime, et al.
Published: (2024)