:: Library Catalog

Bibliographic Details
Main Authors:	Alderete, John; Hui, Macarious Kin Fung; Mohan, Aanchan
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2508.13060

Similar Items

Enhancing AAC Software for Dysarthric Speakers in e-Health Settings: An Evaluation Using TORGO
by: Hui, Macarious, et al.
Published: (2024)

WhisperAlign: Word-Boundary-Aware ASR and WhisperX-Anchored Pyannote Diarization for Long-Form Bengali Speech
by: Chowdhury, Aurchi, et al.
Published: (2026)

A new approach for fine-tuning sentence transformers for intent classification and out-of-scope detection tasks
by: Zhang, Tianyi, et al.
Published: (2024)

Human Latency Conversational Turns for Spoken Avatar Systems
by: Jacoby, Derek, et al.
Published: (2024)

Configurable Multilingual ASR with Speech Summary Representations
by: Zhu, Harrison, et al.
Published: (2024)

Fine-tuning Whisper for Pashto ASR: strategies and scale
by: Rahman, Hanif
Published: (2026)

Crossmodal ASR Error Correction with Discrete Speech Units
by: Li, Yuanchao, et al.
Published: (2024)

Extending Whisper with prompt tuning to target-speaker ASR
by: Ma, Hao, et al.
Published: (2023)

Can Whisper perform speech-based in-context learning?
by: Wang, Siyin, et al.
Published: (2023)

Calm-Whisper: Reduce Whisper Hallucination On Non-Speech By Calming Crazy Heads Down
by: Wang, Yingzhi, et al.
Published: (2025)

Towards Rehearsal-Free Multilingual ASR: A LoRA-based Case Study on Whisper
by: Xu, Tianyi, et al.
Published: (2024)

Careless Whisper: Speech-to-Text Hallucination Harms
by: Koenecke, Allison, et al.
Published: (2024)

Whisper: Courtside Edition Enhancing ASR Performance Through LLM-Driven Context Generation
by: Ron, Yonathan, et al.
Published: (2026)

LoRA-Whisper: Parameter-Efficient and Extensible Multilingual ASR
by: Song, Zheshu, et al.
Published: (2024)

Fast Streaming Transducer ASR Prototyping via Knowledge Distillation with Whisper
by: Thorbecke, Iuliia, et al.
Published: (2024)

Quantizing Whisper-small: How design choices affect ASR performance
by: Söhler, Arthur, et al.
Published: (2025)

WhisperKit: On-device Real-time ASR with Billion-Scale Transformers
by: Orhon, Atila, et al.
Published: (2025)

On the Role of Encoder Depth: Pruning Whisper and LoRA Fine-Tuning in SLAM-ASR
by: Kolluri, Ganesh Pavan Kartikeya Bharadwaj, et al.
Published: (2026)

ASR-EC Benchmark: Evaluating Large Language Models on Chinese ASR Error Correction
by: Wei, Victor Junqiu, et al.
Published: (2024)

Advocating Character Error Rate for Multilingual ASR Evaluation
by: K, Thennal D, et al.
Published: (2024)

How much speech data is necessary for ASR in African languages? An evaluation of data scaling in Kinyarwanda and Kikuyu
by: Akera, Benjamin, et al.
Published: (2025)

CantoASR: Prosody-Aware ASR-LALM Collaboration for Low-Resource Cantonese
by: Chen, Dazhong, et al.
Published: (2025)

Classification is a RAG problem: A case study on hate speech detection
by: Willats, Richard, et al.
Published: (2025)

ASR Error Correction using Large Language Models
by: Ma, Rao, et al.
Published: (2024)

Better Pseudo-labeling with Multi-ASR Fusion and Error Correction by SpeechLLM
by: Prakash, Jeena, et al.
Published: (2025)

PhoWhisper: Automatic Speech Recognition for Vietnamese
by: Le, Thanh-Thien, et al.
Published: (2024)

Impact of automatic speech recognition quality on Alzheimer's disease detection from spontaneous speech: a reproducible benchmark study with lexical modeling and statistical validation
by: Samanta, Himadri S
Published: (2026)

Whisper-UT: A Unified Translation Framework for Speech and Text
by: Xiao, Cihan, et al.
Published: (2025)

From Speech to Subtitles: Evaluating ASR Models in Subtitling Italian Television Programs
by: Lucca, Alessandro, et al.
Published: (2025)

Overcoming Data Scarcity in Multi-Dialectal Arabic ASR via Whisper Fine-Tuning
by: Özyilmaz, Ömer Tarik, et al.
Published: (2025)

Whispering LLaMA: A Cross-Modal Generative Error Correction Framework for Speech Recognition
by: Radhakrishnan, Srijith, et al.
Published: (2023)

Swedish Whispers; Leveraging a Massive Speech Corpus for Swedish Speech Recognition
by: Vesterbacka, Leonora, et al.
Published: (2025)

Prosody in Cascade and Direct Speech-to-Text Translation: a case study on Korean Wh-Phrases
by: Zhou, Giulio, et al.
Published: (2024)

Noise-Robust AV-ASR Using Visual Features Both in the Whisper Encoder and Decoder
by: Li, Zhengyang, et al.
Published: (2026)

Languages in Whisper-Style Speech Encoders Align Both Phonetically and Semantically
by: Shim, Ryan Soh-Eun, et al.
Published: (2025)

Hypernetworks for Personalizing ASR to Atypical Speech
by: Müller-Eberstein, Max, et al.
Published: (2024)

Whispering Context: Distilling Syntax and Semantics for Long Speech Transcripts
by: Altinok, Duygu
Published: (2025)

WhisperNER: Unified Open Named Entity and Speech Recognition
by: Ayache, Gil, et al.
Published: (2024)

Internalizing ASR with Implicit Chain of Thought for Efficient Speech-to-Speech Conversational LLM
by: Yuen, Robin Shing-Hei, et al.
Published: (2024)

Non-verbal information in spontaneous speech -- towards a new framework of analysis
by: Biron, Tirza, et al.
Published: (2024)