| Main Authors: | Best, Paul, Cuervo, Santiago, Marxer, Ricard |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2404.01737 |
Similar Items
Speech foundation models on intelligibility prediction for hearing-impaired listeners
by: Cuervo, Santiago, et al.
Published: (2024)
Scaling Properties of Speech Language Models
by: Cuervo, Santiago, et al.
Published: (2024)
Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection
by: Wang, Haoyu, et al.
Published: (2024)
Contrastive prediction strategies for unsupervised segmentation and categorization of phonemes and words
by: Cuervo, Santiago, et al.
Published: (2021)
Improving the Inclusivity of Dutch Speech Recognition by Fine-tuning Whisper on the JASMIN-CGN Corpus
by: Shekoufandeh, Golshid, et al.
Published: (2025)
Can Whisper perform speech-based in-context learning?
by: Wang, Siyin, et al.
Published: (2023)
Extending Whisper with prompt tuning to target-speaker ASR
by: Ma, Hao, et al.
Published: (2023)
kNN For Whisper And Its Effect On Bias And Speaker Adaptation
by: Nachesa, Maya K., et al.
Published: (2024)
Fast Streaming Transducer ASR Prototyping via Knowledge Distillation with Whisper
by: Thorbecke, Iuliia, et al.
Published: (2024)
Do Prompts Really Prompt? Exploring the Prompt Understanding Capability of Whisper
by: Yang, Chih-Kai, et al.
Published: (2024)
Transcript-Prompted Whisper with Dictionary-Enhanced Decoding for Japanese Speech Annotation
by: Hu, Rui, et al.
Published: (2025)
DQ-Whisper: Joint Distillation and Quantization for Efficient Multilingual Speech Recognition
by: Shao, Hang, et al.
Published: (2023)
Quantizing Whisper-small: How design choices affect ASR performance
by: Söhler, Arthur, et al.
Published: (2025)
WhisperKit: On-device Real-time ASR with Billion-Scale Transformers
by: Orhon, Atila, et al.
Published: (2025)
Factorized RVQ-GAN For Disentangled Speech Tokenization
by: Khurana, Sameer, et al.
Published: (2025)
Muting Whisper: A Universal Acoustic Adversarial Attack on Speech Foundation Models
by: Raina, Vyas, et al.
Published: (2024)
Adapting Whisper for Code-Switching through Encoding Refining and Language-Aware Decoding
by: Zhao, Jiahui, et al.
Published: (2024)
Controlling Whisper: Universal Acoustic Adversarial Attacks to Control Speech Foundation Models
by: Raina, Vyas, et al.
Published: (2024)
Swedish Whispers; Leveraging a Massive Speech Corpus for Swedish Speech Recognition
by: Vesterbacka, Leonora, et al.
Published: (2025)
Adapting Diarization-Conditioned Whisper for End-to-End Multi-Talker Speech Recognition
by: Kocour, Martin, et al.
Published: (2025)
WhisperRT -- Turning Whisper into a Causal Streaming Model
by: Krichli, Tomer, et al.
Published: (2025)
BaldWhisper: Faster Whisper with Head Shearing and Layer Merging
by: Sy, Yaya, et al.
Published: (2025)
Spontaneous Speech-Based Suicide Risk Detection Using Whisper and Large Language Models
by: Cui, Ziyun, et al.
Published: (2024)
Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System
by: Meng, Lingwei, et al.
Published: (2024)
PI-Whisper: Designing an Adaptive and Incremental Automatic Speech Recognition System for Edge Devices
by: Nassereldine, Amir, et al.
Published: (2024)
LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT
by: Zhuo, Le, et al.
Published: (2023)
Overcoming Data Scarcity in Multi-Dialectal Arabic ASR via Whisper Fine-Tuning
by: Özyilmaz, Ömer Tarik, et al.
Published: (2025)
Towards Rehearsal-Free Multilingual ASR: A LoRA-based Case Study on Whisper
by: Xu, Tianyi, et al.
Published: (2024)
Improving Whisper's Recognition Performance for Under-Represented Language Kazakh Leveraging Unpaired Speech and Text
by: Li, Jinpeng, et al.
Published: (2024)
uDistil-Whisper: Label-Free Data Filtering for Knowledge Distillation in Low-Data Regimes
by: Waheed, Abdul, et al.
Published: (2024)
Qwen vs. Gemma Integration with Whisper: A Comparative Study in Multilingual SpeechLLM Systems
by: Nguyen, Tuan, et al.
Published: (2025)
Kid-Whisper: Towards Bridging the Performance Gap in Automatic Speech Recognition for Children VS. Adults
by: Attia, Ahmed Adel, et al.
Published: (2023)
Multilingual DistilWhisper: Efficient Distillation of Multi-task Speech Models via Language-Specific Experts
by: Ferraz, Thomas Palmeira, et al.
Published: (2023)
OWSM-Biasing: Contextualizing Open Whisper-Style Speech Models for Automatic Speech Recognition with Dynamic Vocabulary
by: Sudo, Yui, et al.
Published: (2025)
OWSM v4: Improving Open Whisper-Style Speech Models via Data Scaling and Cleaning
by: Peng, Yifan, et al.
Published: (2025)
Late Fusion and Multi-Level Fission Amplify Cross-Modal Transfer in Text-Speech LMs
by: Cuervo, Santiago, et al.
Published: (2025)
Whisper Turns Stronger: Augmenting Wav2Vec 2.0 for Superior ASR in Low-Resource Languages
by: Anidjar, Or Haim, et al.
Published: (2024)
Multilingual Prosody Transfer: Comparing Supervised & Transfer Learning
by: Goel, Arnav, et al.
Published: (2024)
Audio-to-Score Conversion Model Based on Whisper methodology
by: Zhang, Hongyao, et al.
Published: (2024)
WhiSPA: Semantically and Psychologically Aligned Whisper with Self-Supervised Contrastive and Student-Teacher Learning
by: Rao, Rajath, et al.
Published: (2025)