:: Library Catalog

Bibliographic Details
Main Authors:	Fang, Zihao; Shen, Yingda; Guan, Zifan; Song, Tongtong; Liu, Zhenyi; Wu, Zhizheng
Format:	Preprint
Published:	2026
Subjects:	Sound; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2603.08046

Similar Items

Vocoder-Free Non-Parallel Conversion of Whispered Speech With Masked Cycle-Consistent Generative Adversarial Networks
by: Wagner, Dominik, et al.
Published: (2023)

Whisper-PMFA: Partial Multi-Scale Feature Aggregation for Speaker Verification using Whisper Models
by: Zhao, Yiyang, et al.
Published: (2024)

Improving Whispered Speech Recognition Performance using Pseudo-whispered based Data Augmentation
by: Lin, Zhaofeng, et al.
Published: (2023)

Prompting Whisper for Joint Speech Transcription and Diarization
by: Zamyrova, Mariia, et al.
Published: (2026)

Probing Whisper for Dysarthric Speech in Detection and Assessment
by: Yue, Zhengjun, et al.
Published: (2025)

FlowW2N: Whispered-to-Normal Speech Conversion via Flow-Matching
by: Ritter-Gutierrez, Fabian, et al.
Published: (2026)

BrainWhisperer: Leveraging Large-Scale ASR Models for Neural Speech Decoding
by: Boccato, Tommaso, et al.
Published: (2026)

Adapting Whisper for Streaming Speech Recognition via Two-Pass Decoding
by: Zhou, Haoran, et al.
Published: (2025)

A Study on Incorporating Whisper for Robust Speech Assessment
by: Zezario, Ryandhimas E., et al.
Published: (2023)

Whisper-SV: Adapting Whisper for Low-data-resource Speaker Verification
by: Zhang, Li, et al.
Published: (2024)

Improvement Speaker Similarity for Zero-Shot Any-to-Any Voice Conversion of Whispered and Regular Speech
by: Avdeeva, Anastasia, et al.
Published: (2024)

SQ-Whisper: Speaker-Querying based Whisper Model for Target-Speaker ASR
by: Guo, Pengcheng, et al.
Published: (2024)

Sparsely Shared LoRA on Whisper for Child Speech Recognition
by: Liu, Wei, et al.
Published: (2023)

Leveraging Self-Supervised Models for Automatic Whispered Speech Recognition
by: Farhadipour, Aref, et al.
Published: (2024)

WhisperD: Dementia Speech Recognition and Filler Word Detection with Whisper
by: Akinrintoyo, Emmanuel, et al.
Published: (2025)

Target Speaker ASR with Whisper
by: Polok, Alexander, et al.
Published: (2024)

Speech Intelligibility Assessment with Uncertainty-Aware Whisper Embeddings and sLSTM
by: Zezario, Ryandhimas E., et al.
Published: (2025)

State-Space Models in Efficient Whispered and Multi-dialect Speech Recognition
by: Farhadipour, Aref, et al.
Published: (2025)

M2R-Whisper: Multi-stage and Multi-scale Retrieval Augmentation for Enhancing Whisper
by: Zhou, Jiaming, et al.
Published: (2024)

Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection
by: Wang, Haoyu, et al.
Published: (2024)

A Self-Training Approach for Whisper to Enhance Long Dysarthric Speech Recognition
by: Wang, Shiyao, et al.
Published: (2025)

Bangla-WhisperDiar: Fine-Tuning Whisper and PyAnnote for Bangla Long-Form Speech Recognition and Speaker Diarization
by: Bhuiyan, Mohammed Aman, et al.
Published: (2026)

Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation
by: Rouditchenko, Andrew, et al.
Published: (2024)

DiCoW: Diarization-Conditioned Whisper for Target Speaker Automatic Speech Recognition
by: Polok, Alexander, et al.
Published: (2024)

Investigation of Whisper ASR Hallucinations Induced by Non-Speech Audio
by: Barański, Mateusz, et al.
Published: (2025)

Application of Whisper in Clinical Practice: the Post-Stroke Speech Assessment during a Naming Task
by: Davudova, Milena, et al.
Published: (2025)

Swedish Whispers; Leveraging a Massive Speech Corpus for Swedish Speech Recognition
by: Vesterbacka, Leonora, et al.
Published: (2025)

WhisperFlow: speech foundation models in real time
by: Wang, Rongxiang, et al.
Published: (2024)

Transcript-Prompted Whisper with Dictionary-Enhanced Decoding for Japanese Speech Annotation
by: Hu, Rui, et al.
Published: (2025)

DQ-Whisper: Joint Distillation and Quantization for Efficient Multilingual Speech Recognition
by: Shao, Hang, et al.
Published: (2023)

WhisperMask: A Noise Suppressive Mask-Type Microphone for Whisper Speech
by: Hiraki, Hirotaka, et al.
Published: (2024)

OWSM v4: Improving Open Whisper-Style Speech Models via Data Scaling and Cleaning
by: Peng, Yifan, et al.
Published: (2025)

WhisperRT -- Turning Whisper into a Causal Streaming Model
by: Krichli, Tomer, et al.
Published: (2025)

BaldWhisper: Faster Whisper with Head Shearing and Layer Merging
by: Sy, Yaya, et al.
Published: (2025)

Improving Rare-Word Recognition of Whisper in Zero-Shot Settings
by: Jogi, Yash, et al.
Published: (2025)

Exploiting Music Source Separation for Automatic Lyrics Transcription with Whisper
by: Syed, Jaza, et al.
Published: (2025)

Audio-to-Score Conversion Model Based on Whisper methodology
by: Zhang, Hongyao, et al.
Published: (2024)

Perceiver-Prompt: Flexible Speaker Adaptation in Whisper for Chinese Disordered Speech Recognition
by: Jiang, Yicong, et al.
Published: (2024)

Muting Whisper: A Universal Acoustic Adversarial Attack on Speech Foundation Models
by: Raina, Vyas, et al.
Published: (2024)

Adapting Diarization-Conditioned Whisper for End-to-End Multi-Talker Speech Recognition
by: Kocour, Martin, et al.
Published: (2025)