:: Library Catalog

Bibliographic Details
Main Authors:	Cunningham, Jay L.; Adjagbodjou, Adinawa; Basoah, Jeffrey; Jawara, Jainaba; Kadoma, Kowe; Lewis, Aaleyah
Format:	Preprint
Published:	2025
Subjects:	Audio and Speech Processing; Artificial Intelligence; Computation and Language; Sound
Online Access:	https://arxiv.org/abs/2508.18288

Similar Items

Resource-Efficient Adaptation of Speech Foundation Models for Multi-Speaker ASR
by: Wang, Weiqing, et al.
Published: (2024)

Elevating Robust Multi-Talker ASR by Decoupling Speaker Separation and Speech Recognition
by: Yang, Yufeng, et al.
Published: (2025)

Target Speaker ASR with Whisper
by: Polok, Alexander, et al.
Published: (2024)

Multi-Channel Multi-Speaker ASR Using Target Speaker's Solo Segment
by: Shao, Yiwen, et al.
Published: (2024)

ASR-FAIRBENCH: Measuring and Benchmarking Equity Across Speech Recognition Systems
by: Rai, Anand, et al.
Published: (2025)

SOT Triggered Neural Clustering for Speaker Attributed ASR
by: Zheng, Xianrui, et al.
Published: (2024)

DNCASR: End-to-End Training for Speaker-Attributed ASR
by: Zheng, Xianrui, et al.
Published: (2025)

Enhancing Intelligibility for Generative Target Speech Extraction via Joint Optimization with Target Speaker ASR
by: Ma, Hao, et al.
Published: (2025)

Speaker-Smoothed kNN Speaker Adaptation for End-to-End ASR
by: Li, Shaojun, et al.
Published: (2024)

Speaker Targeting via Self-Speaker Adaptation for Multi-talker ASR
by: Wang, Weiqing, et al.
Published: (2025)

Self-supervised Speech Representations Still Struggle with African American Vernacular English
by: Chang, Kalvin, et al.
Published: (2024)

Contextual Biasing for ASR in Speech LLM with Common Word Cues and Bias Word Position Prediction
by: Novitasari, Sashi, et al.
Published: (2026)

META-CAT: Speaker-Informed Speech Embeddings via Meta Information Concatenation for Multi-talker ASR
by: Wang, Jinhan, et al.
Published: (2024)

Can We Really Repurpose Multi-Speaker ASR Corpus for Speaker Diarization?
by: Horiguchi, Shota, et al.
Published: (2025)

SQ-Whisper: Speaker-Querying based Whisper Model for Target-Speaker ASR
by: Guo, Pengcheng, et al.
Published: (2024)

A Toolkit for Joint Speaker Diarization and Identification with Application to Speaker-Attributed ASR
by: Morrone, Giovanni, et al.
Published: (2024)

TagSpeech: End-to-End Multi-Speaker ASR and Diarization with Fine-Grained Temporal Grounding
by: Huo, Mingyue, et al.
Published: (2026)

Speaker Adaptation for Quantised End-to-End ASR Models
by: Zhao, Qiuming, et al.
Published: (2024)

BR-ASR: Efficient and Scalable Bias Retrieval Framework for Contextual Biasing ASR in Speech LLM
by: Gong, Xun, et al.
Published: (2025)

Joint ASR and Speaker Role Tagging with Serialized Output Training
by: Xu, Anfeng, et al.
Published: (2025)

Scaling Multi-Talker ASR with Speaker-Agnostic Activity Streams
by: He, Xiluo, et al.
Published: (2025)

MSA-ASR: Efficient Multilingual Speaker Attribution with frozen ASR Models
by: Nguyen, Thai-Binh, et al.
Published: (2024)

Mitigating Non-Target Speaker Bias in Guided Speaker Embedding
by: Horiguchi, Shota, et al.
Published: (2025)

ASR for Affective Speech: Investigating Impact of Emotion and Speech Generative Strategy
by: Wu, Ya-Tse, et al.
Published: (2026)

Leveraging ASR Pretrained Conformers for Speaker Verification through Transfer Learning and Knowledge Distillation
by: Cai, Danwei, et al.
Published: (2023)

Mind the Gap: Impact of Synthetic Conversational Data on Multi-Talker ASR and Speaker Diarization
by: Polok, Alexander, et al.
Published: (2026)

NTT Multi-Speaker ASR System for the DASR Task of CHiME-8 Challenge
by: Kamo, Naoyuki, et al.
Published: (2024)

Lightweight Target-Speaker-Based Overlap Transcription for Practical Streaming ASR
by: Pražák, Aleš, et al.
Published: (2025)

Speech Emotion Recognition with ASR Integration
by: Li, Yuanchao
Published: (2026)

Just ASR + LLM? A Study on Speech Large Language Models' Ability to Identify and Understand Speaker in Spoken Dialogue
by: Wu, Junkai, et al.
Published: (2024)

Spoken Stereoset: On Evaluating Social Bias Toward Speaker in Speech Large Language Models
by: Lin, Yi-Cheng, et al.
Published: (2024)

Speaker Attributed Automatic Speech Recognition Using Speech Aware LLMS
by: Aronowitz, Hagai, et al.
Published: (2026)

ASR-Synchronized Speaker-Role Diarization
by: Ghosh, Arindam, et al.
Published: (2025)

Toward Fair Speech Technologies: A Comprehensive Survey of Bias and Fairness in Speech AI
by: Lin, Yi-Cheng, et al.
Published: (2026)

SA-SOT: Speaker-Aware Serialized Output Training for Multi-Talker ASR
by: Fan, Zhiyun, et al.
Published: (2024)

End-to-End Joint ASR and Speaker Role Diarization with Child-Adult Interactions
by: Xu, Anfeng, et al.
Published: (2026)

Bengali-Loop: Community Benchmarks for Long-Form Bangla ASR and Speaker Diarization
by: Tabib, H. M. Shadman, et al.
Published: (2026)

SAML: Speaker Adaptive Mixture of LoRA Experts for End-to-End ASR
by: Zhao, Qiuming, et al.
Published: (2024)

GLOBE: A High-quality English Corpus with Global Accents for Zero-shot Speaker Adaptive Text-to-Speech
by: Wang, Wenbin, et al.
Published: (2024)

Towards a Single ASR Model That Generalizes to Disordered Speech
by: Tobin, Jimmy, et al.
Published: (2024)