:: Library Catalog

Bibliographic Details
Main Authors:	Yeh, Sung-Lin, Meng, Yen, Tang, Hao
Format:	Preprint
Published:	2025
Subjects:	Audio and Speech Processing; Computation and Language
Online Access:	https://arxiv.org/abs/2509.09987

Similar Items

Estimating the Completeness of Discrete Speech Units
by: Yeh, Sung-Lin, et al.
Published: (2024)

Learning Speech Representations with Variational Predictive Coding
by: Yeh, Sung-Lin, et al.
Published: (2025)

Deepfake Word Detection by Next-token Prediction using Fine-tuned Whisper
by: Tran, Hoan My, et al.
Published: (2026)

MELD: Mel-Spectrogram-Based Speech Language Modeling with Discrete Latent Variables
by: Yeh, Sung-Lin, et al.
Published: (2026)

Effective Context in Neural Speech Models
by: Meng, Yen, et al.
Published: (2025)

Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection
by: Wang, Haoyu, et al.
Published: (2024)

Revisiting Self-supervised Learning of Speech Representation from a Mutual Information Perspective
by: Liu, Alexander H., et al.
Published: (2024)

One Whisper to Grade Them All
by: Phan, Nhan, et al.
Published: (2025)

Extending Whisper with prompt tuning to target-speaker ASR
by: Ma, Hao, et al.
Published: (2023)

PhoWhisper: Automatic Speech Recognition for Vietnamese
by: Le, Thanh-Thien, et al.
Published: (2024)

Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System
by: Meng, Lingwei, et al.
Published: (2024)

Transcript-Prompted Whisper with Dictionary-Enhanced Decoding for Japanese Speech Annotation
by: Hu, Rui, et al.
Published: (2025)

POWSM: A Phonetic Open Whisper-Style Speech Foundation Model
by: Li, Chin-Jou, et al.
Published: (2025)

Fine-tuning Whisper on Low-Resource Languages for Real-World Applications
by: Timmel, Vincenzo, et al.
Published: (2024)

Adapting Whisper for Code-Switching through Encoding Refining and Language-Aware Decoding
by: Zhao, Jiahui, et al.
Published: (2024)

Enhancing Whisper's Accuracy and Speed for Indian Languages through Prompt-Tuning and Tokenization
by: Tripathi, Kumud, et al.
Published: (2024)

Transfer Learning from Whisper for Microscopic Intelligibility Prediction
by: Best, Paul, et al.
Published: (2024)

Can Whisper perform speech-based in-context learning?
by: Wang, Siyin, et al.
Published: (2023)

FASA: a Flexible and Automatic Speech Aligner for Extracting High-quality Aligned Children Speech Data
by: Liu, Dancheng, et al.
Published: (2024)

LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT
by: Zhuo, Le, et al.
Published: (2023)

kNN For Whisper And Its Effect On Bias And Speaker Adaptation
by: Nachesa, Maya K., et al.
Published: (2024)

WhisperRT -- Turning Whisper into a Causal Streaming Model
by: Krichli, Tomer, et al.
Published: (2025)

BaldWhisper: Faster Whisper with Head Shearing and Layer Merging
by: Sy, Yaya, et al.
Published: (2025)

Anatomy of the Modality Gap: Dissecting the Internal States of End-to-End Speech LLMs
by: Hsu, Ming-Hao, et al.
Published: (2026)

OWSM v4: Improving Open Whisper-Style Speech Models via Data Scaling and Cleaning
by: Peng, Yifan, et al.
Published: (2025)

OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer
by: Peng, Yifan, et al.
Published: (2024)

Quantizing Whisper-small: How design choices affect ASR performance
by: Söhler, Arthur, et al.
Published: (2025)

WhisperKit: On-device Real-time ASR with Billion-Scale Transformers
by: Orhon, Atila, et al.
Published: (2025)

Fast Streaming Transducer ASR Prototyping via Knowledge Distillation with Whisper
by: Thorbecke, Iuliia, et al.
Published: (2024)

DQ-Whisper: Joint Distillation and Quantization for Efficient Multilingual Speech Recognition
by: Shao, Hang, et al.
Published: (2023)

Do Prompts Really Prompt? Exploring the Prompt Understanding Capability of Whisper
by: Yang, Chih-Kai, et al.
Published: (2024)

Swedish Whispers; Leveraging a Massive Speech Corpus for Swedish Speech Recognition
by: Vesterbacka, Leonora, et al.
Published: (2025)

Adapting Diarization-Conditioned Whisper for End-to-End Multi-Talker Speech Recognition
by: Kocour, Martin, et al.
Published: (2025)

Muting Whisper: A Universal Acoustic Adversarial Attack on Speech Foundation Models
by: Raina, Vyas, et al.
Published: (2024)

Controlling Whisper: Universal Acoustic Adversarial Attacks to Control Speech Foundation Models
by: Raina, Vyas, et al.
Published: (2024)

Improving the Inclusivity of Dutch Speech Recognition by Fine-tuning Whisper on the JASMIN-CGN Corpus
by: Shekoufandeh, Golshid, et al.
Published: (2025)

Overcoming Data Scarcity in Multi-Dialectal Arabic ASR via Whisper Fine-Tuning
by: Özyilmaz, Ömer Tarik, et al.
Published: (2025)

Spontaneous Speech-Based Suicide Risk Detection Using Whisper and Large Language Models
by: Cui, Ziyun, et al.
Published: (2024)

PI-Whisper: Designing an Adaptive and Incremental Automatic Speech Recognition System for Edge Devices
by: Nassereldine, Amir, et al.
Published: (2024)

Investigating the Impact of Word Informativeness on Speech Emotion Recognition
by: Kakouros, Sofoklis
Published: (2025)