| Main Authors: | Yoo, Jaekwon, Chandiramani, Kunal, Tadimeti, Divya, Girma, Abenezer, Dhir, Chandra |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2509.04473 |
Similar Items
Streaming Speech-to-Text Translation with a SpeechLLM
by: Parcollet, Titouan, et al.
Published: (2026)
Conversational Speech Reveals Structural Robustness Failures in SpeechLLM Backbones
by: Teleki, Maria, et al.
Published: (2025)
Measuring the Redundancy of Decoder Layers in SpeechLLMs
by: Moumen, Adel, et al.
Published: (2026)
Better Pseudo-labeling with Multi-ASR Fusion and Error Correction by SpeechLLM
by: Prakash, Jeena, et al.
Published: (2025)
Task-Lens: Cross-Task Utility Based Speech Dataset Profiling for Low-Resource Indian Languages
by: Sharma, Swati, et al.
Published: (2026)
Contrastive Learning for Task-Independent SpeechLLM-Pretraining
by: Züfle, Maike, et al.
Published: (2024)
Rubric-Guided Fine-tuning of SpeechLLMs for Multi-Aspect, Multi-Rater L2 Reading-Speech Assessment
by: Parikh, Aditya Kamlesh, et al.
Published: (2026)
Slot Filling as a Reasoning Task for SpeechLLMs
by: Hacioglu, Kadri, et al.
Published: (2025)
Detecting Hallucinations in SpeechLLMs at Inference Time Using Attention Maps
by: Waldendorf, Jonas, et al.
Published: (2026)
Speech Discrete Tokens or Continuous Features? A Comparative Analysis for Spoken Language Understanding in SpeechLLMs
by: Wang, Dingdong, et al.
Published: (2025)
Short-form Text Rewriting with Phi Silica
by: Tadimeti, Divya, et al.
Published: (2026)
WildSpeech-Bench: Benchmarking End-to-End SpeechLLMs in the Wild
by: Zhang, Linhao, et al.
Published: (2025)
DOA: Training-Free Decoder-Only Attention Policy for Long-Form Simultaneous Translation with SpeechLLMs
by: Papi, Sara, et al.
Published: (2026)
LoASR-Bench: Evaluating Large Speech Language Models on Low-Resource Automatic Speech Recognition Across Language Families
by: Chen, Jianan, et al.
Published: (2026)
SpeechPrompt: Prompting Speech Language Models for Speech Processing Tasks
by: Chang, Kai-Wei, et al.
Published: (2024)
SpeechComposer: Unifying Multiple Speech Tasks with Prompt Composition
by: Wu, Yihan, et al.
Published: (2024)
Speech-Worthy Alignment for Japanese SpeechLLMs via Direct Preference Optimization
by: Zhao, Mengjie, et al.
Published: (2026)
Do Bias Benchmarks Generalise? Evidence from Voice-based Evaluation of Gender Bias in SpeechLLMs
by: Satish, Shree Harsha Bokkahalli, et al.
Published: (2025)
Leveraging the Potential of Prompt Engineering for Hate Speech Detection in Low-Resource Languages
by: Prome, Ruhina Tabasshum, et al.
Published: (2025)
Exploring In-Context Learning of Textless Speech Language Model for Speech Classification Tasks
by: Hsu, Ming-Hao, et al.
Published: (2023)
StableToken: A Noise-Robust Semantic Speech Tokenizer for Resilient SpeechLLMs
by: Song, Yuhan, et al.
Published: (2025)
Advancing Speech Understanding in Speech-Aware Language Models with GRPO
by: Elmakies, Avishai, et al.
Published: (2025)
A Unified Speech LLM for Diarization and Speech Recognition in Multilingual Conversations
by: Saengthong, Phurich, et al.
Published: (2025)
Aligning Paralinguistic Understanding and Generation in Speech LLMs via Multi-Task Reinforcement Learning
by: Chen, Jingxiang, et al.
Published: (2026)
Northeastern Uni at Multilingual Counterspeech Generation: Enhancing Counter Speech Generation with LLM Alignment through Direct Preference Optimization
by: Wadhwa, Sahil, et al.
Published: (2024)
Understanding the Modality Gap: An Empirical Study on the Speech-Text Alignment Mechanism of Large Speech Language Models
by: Xiang, Bajian, et al.
Published: (2025)
Task Arithmetic for Language Expansion in Speech Translation
by: Cheng, Yao-Fei, et al.
Published: (2024)
Preservation of Language Understanding Capabilities in Speech-aware Large Language Models
by: Kubis, Marek, et al.
Published: (2025)
Enhancing Speech Instruction Understanding and Disambiguation in Robotics via Speech Prosody
by: Sasu, David, et al.
Published: (2025)
Ming-UniAudio: Speech LLM for Joint Understanding, Generation and Editing with Unified Representation
by: Yan, Canxiang, et al.
Published: (2025)
The Voice Behind the Words: Quantifying Intersectional Bias in SpeechLLMs
by: Satish, Shree Harsha Bokkahalli, et al.
Published: (2026)
SEAHateCheck: Functional Tests for Detecting Hate Speech in Low-Resource Languages of Southeast Asia
by: Ng, Ri Chi, et al.
Published: (2026)
Can Linguistically Related Languages Guide LLM Translation in Low-Resource Settings?
by: Ramasethu, Aishwarya, et al.
Published: (2026)
Improving Semantic Understanding in Speech Language Models via Brain-tuning
by: Moussa, Omer, et al.
Published: (2024)
STTATTS: Unified Speech-To-Text And Text-To-Speech Model
by: Toyin, Hawau Olamide, et al.
Published: (2024)
GHaLIB: A Multilingual Framework for Hope Speech Detection in Low-Resource Languages
by: Abdullah, Ahmed, et al.
Published: (2025)
SpeechLLMs for Large-scale Contextualized Zero-shot Slot Filling
by: Hacioglu, Kadri, et al.
Published: (2025)
Speech LLMs in Low-Resource Scenarios: Data Volume Requirements and the Impact of Pretraining on High-Resource Languages
by: Fong, Seraphina, et al.
Published: (2025)
Locate-and-Focus: Enhancing Terminology Translation in Speech Language Models
by: Wu, Suhang, et al.
Published: (2025)
Qwen vs. Gemma Integration with Whisper: A Comparative Study in Multilingual SpeechLLM Systems
by: Nguyen, Tuan, et al.
Published: (2025)