| Main Authors: | Lee, Jung-Sun, Jo, Ha-Na, Ko, Eunyeong |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2511.07918 |
Similar Items
Lightweight Diffusion-based Framework for Online Imagined Speech Decoding in Aphasia
by: Ko, Eunyeong, et al.
Published: (2025)
Towards Unified Neural Decoding of Perceived, Spoken and Imagined Speech from EEG Signals
by: Lee, Jung-Sun, et al.
Published: (2024)
Toward Robust EEG-based Intention Decoding during Misarticulated Speech in Dysarthria
by: Jo, Ha-Na, et al.
Published: (2025)
EEG-Based Speech Decoding: A Novel Approach Using Multi-Kernel Ensemble Diffusion Models
by: Kim, Soowon, et al.
Published: (2024)
SpokenUS: A Spoken User Simulator for Task-Oriented Dialogue
by: Lee, Jonggeun, et al.
Published: (2026)
Interventional Speech Noise Injection for ASR Generalizable Spoken Language Understanding
by: Jung, Yeonjoon, et al.
Published: (2024)
Calm-Whisper: Reduce Whisper Hallucination On Non-Speech By Calming Crazy Heads Down
by: Wang, Yingzhi, et al.
Published: (2025)
On the Evaluation of Speech Foundation Models for Spoken Language Understanding
by: Arora, Siddhant, et al.
Published: (2024)
Careless Whisper: Speech-to-Text Hallucination Harms
by: Koenecke, Allison, et al.
Published: (2024)
Scaling Spoken Language Models with Syllabic Speech Tokenization
by: Lee, Nicholas, et al.
Published: (2025)
Towards Spoken Mathematical Reasoning: Benchmarking Speech-based Models over Multi-faceted Math Problems
by: Wei, Chengwei, et al.
Published: (2025)
SpeechDPR: End-to-End Spoken Passage Retrieval for Open-Domain Spoken Question Answering
by: Lin, Chyi-Jiunn, et al.
Published: (2024)
PhoWhisper: Automatic Speech Recognition for Vietnamese
by: Le, Thanh-Thien, et al.
Published: (2024)
Whisper-UT: A Unified Translation Framework for Speech and Text
by: Xiao, Cihan, et al.
Published: (2025)
TASTE: Text-Aligned Speech Tokenization and Embedding for Spoken Language Modeling
by: Tseng, Liang-Hsuan, et al.
Published: (2025)
ASKD-Whisper: Adaptive Self-knowledge Distillation for Efficient and Low-Latency Automatic Speech Recognition
by: Lee, Junseok, et al.
Published: (2026)
TASTE-Streaming: Towards Streamable Text-Aligned Speech Tokenization and Embedding for Spoken Language Modeling
by: Tseng, Liang-Hsuan, et al.
Published: (2026)
Languages in Whisper-Style Speech Encoders Align Both Phonetically and Semantically
by: Shim, Ryan Soh-Eun, et al.
Published: (2025)
SpokenWOZ: A Large-Scale Speech-Text Benchmark for Spoken Task-Oriented Dialogue Agents
by: Si, Shuzheng, et al.
Published: (2023)
Whispering Context: Distilling Syntax and Semantics for Long Speech Transcripts
by: Altinok, Duygu
Published: (2025)
WhisperNER: Unified Open Named Entity and Speech Recognition
by: Ayache, Gil, et al.
Published: (2024)
Swedish Whispers; Leveraging a Massive Speech Corpus for Swedish Speech Recognition
by: Vesterbacka, Leonora, et al.
Published: (2025)
Indigenous Languages Spoken in Argentina: A Survey of NLP and Speech Resources
by: Ticona, Belu, et al.
Published: (2025)
ZAEBUC-Spoken: A Multilingual Multidialectal Arabic-English Speech Corpus
by: Hamed, Injy, et al.
Published: (2024)
Spoken Stereoset: On Evaluating Social Bias Toward Speaker in Speech Large Language Models
by: Lin, Yi-Cheng, et al.
Published: (2024)
Optimal Multi-Task Learning at Regularization Horizon for Speech Translation Task
by: Jung, JungHo, et al.
Published: (2025)
Zero Resource Code-switched Speech Benchmark Using Speech Utterance Pairs For Multiple Spoken Languages
by: Huang, Kuan-Po, et al.
Published: (2023)
Improving Whisper's Recognition Performance for Under-Represented Language Kazakh Leveraging Unpaired Speech and Text
by: Li, Jinpeng, et al.
Published: (2024)
Long-Form Speech Generation with Spoken Language Models
by: Park, Se Jin, et al.
Published: (2024)
Language-Universal Speech Attributes Modeling for Zero-Shot Multilingual Spoken Keyword Recognition
by: Yen, Hao, et al.
Published: (2024)
POWSM: A Phonetic Open Whisper-Style Speech Foundation Model
by: Li, Chin-Jou, et al.
Published: (2025)
Speech Discrete Tokens or Continuous Features? A Comparative Analysis for Spoken Language Understanding in SpeechLLMs
by: Wang, Dingdong, et al.
Published: (2025)
A Character-Centric Creative Story Generation via Imagination
by: Park, Kyeongman, et al.
Published: (2024)
Theta Theory: operads and coloring
by: Marcolli, Matilde, et al.
Published: (2025)
OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer
by: Peng, Yifan, et al.
Published: (2024)
Optimal Transport Regularization for Speech Text Alignment in Spoken Language Models
by: Xu, Wenze, et al.
Published: (2025)
Spoken Word2Vec: Learning Skipgram Embeddings from Speech
by: Sayeed, Mohammad Amaan, et al.
Published: (2023)
Transcript-Prompted Whisper with Dictionary-Enhanced Decoding for Japanese Speech Annotation
by: Hu, Rui, et al.
Published: (2025)
DQ-Whisper: Joint Distillation and Quantization for Efficient Multilingual Speech Recognition
by: Shao, Hang, et al.
Published: (2023)
Neural Synchrony Between Socially Interacting Language Models
by: Zhang, Zhining, et al.
Published: (2026)