:: Library Catalog

Bibliographic Details
Main Authors:	Yoo, Jaekwon, Chandiramani, Kunal, Tadimeti, Divya, Girma, Abenezer, Dhir, Chandra
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2509.04473

Similar Items

Streaming Speech-to-Text Translation with a SpeechLLM
by: Parcollet, Titouan, et al.
Published: (2026)

Conversational Speech Reveals Structural Robustness Failures in SpeechLLM Backbones
by: Teleki, Maria, et al.
Published: (2025)

Measuring the Redundancy of Decoder Layers in SpeechLLMs
by: Moumen, Adel, et al.
Published: (2026)

Better Pseudo-labeling with Multi-ASR Fusion and Error Correction by SpeechLLM
by: Prakash, Jeena, et al.
Published: (2025)

Task-Lens: Cross-Task Utility Based Speech Dataset Profiling for Low-Resource Indian Languages
by: Sharma, Swati, et al.
Published: (2026)

Contrastive Learning for Task-Independent SpeechLLM-Pretraining
by: Züfle, Maike, et al.
Published: (2024)

Rubric-Guided Fine-tuning of SpeechLLMs for Multi-Aspect, Multi-Rater L2 Reading-Speech Assessment
by: Parikh, Aditya Kamlesh, et al.
Published: (2026)

Slot Filling as a Reasoning Task for SpeechLLMs
by: Hacioglu, Kadri, et al.
Published: (2025)

Detecting Hallucinations in SpeechLLMs at Inference Time Using Attention Maps
by: Waldendorf, Jonas, et al.
Published: (2026)

Speech Discrete Tokens or Continuous Features? A Comparative Analysis for Spoken Language Understanding in SpeechLLMs
by: Wang, Dingdong, et al.
Published: (2025)

Short-form Text Rewriting with Phi Silica
by: Tadimeti, Divya, et al.
Published: (2026)

WildSpeech-Bench: Benchmarking End-to-End SpeechLLMs in the Wild
by: Zhang, Linhao, et al.
Published: (2025)

DOA: Training-Free Decoder-Only Attention Policy for Long-Form Simultaneous Translation with SpeechLLMs
by: Papi, Sara, et al.
Published: (2026)

LoASR-Bench: Evaluating Large Speech Language Models on Low-Resource Automatic Speech Recognition Across Language Families
by: Chen, Jianan, et al.
Published: (2026)

SpeechPrompt: Prompting Speech Language Models for Speech Processing Tasks
by: Chang, Kai-Wei, et al.
Published: (2024)

SpeechComposer: Unifying Multiple Speech Tasks with Prompt Composition
by: Wu, Yihan, et al.
Published: (2024)

Speech-Worthy Alignment for Japanese SpeechLLMs via Direct Preference Optimization
by: Zhao, Mengjie, et al.
Published: (2026)

Do Bias Benchmarks Generalise? Evidence from Voice-based Evaluation of Gender Bias in SpeechLLMs
by: Satish, Shree Harsha Bokkahalli, et al.
Published: (2025)

Leveraging the Potential of Prompt Engineering for Hate Speech Detection in Low-Resource Languages
by: Prome, Ruhina Tabasshum, et al.
Published: (2025)

Exploring In-Context Learning of Textless Speech Language Model for Speech Classification Tasks
by: Hsu, Ming-Hao, et al.
Published: (2023)

StableToken: A Noise-Robust Semantic Speech Tokenizer for Resilient SpeechLLMs
by: Song, Yuhan, et al.
Published: (2025)

Advancing Speech Understanding in Speech-Aware Language Models with GRPO
by: Elmakies, Avishai, et al.
Published: (2025)

A Unified Speech LLM for Diarization and Speech Recognition in Multilingual Conversations
by: Saengthong, Phurich, et al.
Published: (2025)

Aligning Paralinguistic Understanding and Generation in Speech LLMs via Multi-Task Reinforcement Learning
by: Chen, Jingxiang, et al.
Published: (2026)

Northeastern Uni at Multilingual Counterspeech Generation: Enhancing Counter Speech Generation with LLM Alignment through Direct Preference Optimization
by: Wadhwa, Sahil, et al.
Published: (2024)

Understanding the Modality Gap: An Empirical Study on the Speech-Text Alignment Mechanism of Large Speech Language Models
by: Xiang, Bajian, et al.
Published: (2025)

Task Arithmetic for Language Expansion in Speech Translation
by: Cheng, Yao-Fei, et al.
Published: (2024)

Preservation of Language Understanding Capabilities in Speech-aware Large Language Models
by: Kubis, Marek, et al.
Published: (2025)

Enhancing Speech Instruction Understanding and Disambiguation in Robotics via Speech Prosody
by: Sasu, David, et al.
Published: (2025)

Ming-UniAudio: Speech LLM for Joint Understanding, Generation and Editing with Unified Representation
by: Yan, Canxiang, et al.
Published: (2025)

The Voice Behind the Words: Quantifying Intersectional Bias in SpeechLLMs
by: Satish, Shree Harsha Bokkahalli, et al.
Published: (2026)

SEAHateCheck: Functional Tests for Detecting Hate Speech in Low-Resource Languages of Southeast Asia
by: Ng, Ri Chi, et al.
Published: (2026)

Can Linguistically Related Languages Guide LLM Translation in Low-Resource Settings?
by: Ramasethu, Aishwarya, et al.
Published: (2026)

Improving Semantic Understanding in Speech Language Models via Brain-tuning
by: Moussa, Omer, et al.
Published: (2024)

STTATTS: Unified Speech-To-Text And Text-To-Speech Model
by: Toyin, Hawau Olamide, et al.
Published: (2024)

GHaLIB: A Multilingual Framework for Hope Speech Detection in Low-Resource Languages
by: Abdullah, Ahmed, et al.
Published: (2025)

SpeechLLMs for Large-scale Contextualized Zero-shot Slot Filling
by: Hacioglu, Kadri, et al.
Published: (2025)

Speech LLMs in Low-Resource Scenarios: Data Volume Requirements and the Impact of Pretraining on High-Resource Languages
by: Fong, Seraphina, et al.
Published: (2025)

Locate-and-Focus: Enhancing Terminology Translation in Speech Language Models
by: Wu, Suhang, et al.
Published: (2025)

Qwen vs. Gemma Integration with Whisper: A Comparative Study in Multilingual SpeechLLM Systems
by: Nguyen, Tuan, et al.
Published: (2025)