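Staff View: STEAMROLLER: A Multi-Agent System for Inclusive Automatic Speech Recognition for People who Stutter :: Library Catalog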

Bibliographic Details
Main Authors:	Xu, Ziqi; Liu, Yi; Li, Yuekang; Shi, Ling; Wang, Kailong; Zhao, Yongxin
Format:	Preprint
Published:	2026
Subjects:	Computers and Society
Online Access:	https://arxiv.org/abs/2601.10223

_version_	1866914256309452800
author	Xu, Ziqi; Liu, Yi; Li, Yuekang; Shi, Ling; Wang, Kailong; Zhao, Yongxin
author_facet	Xu, Ziqi; Liu, Yi; Li, Yuekang; Shi, Ling; Wang, Kailong; Zhao, Yongxin
contents	People who stutter (PWS) face systemic exclusion in today's voice-driven society, where access to voice assistants, authentication systems, and remote work tools increasingly depends on fluent speech. Current automatic speech recognition (ASR) systems, trained predominantly on fluent speech, fail to serve millions of PWS worldwide. We present STEAMROLLER, a real-time system that transforms stuttered speech into fluent output through a novel multi-stage, multi-agent AI pipeline. Our approach addresses three critical technical challenges: (1) the difficulty of direct speech-to-speech conversion for disfluent input, (2) semantic distortions introduced during ASR transcription of stuttered speech, and (3) latency constraints for real-time communication. STEAMROLLER employs a three-stage architecture comprising ASR transcription, multi-agent text repair, and speech synthesis, where our core innovation lies in a collaborative multi-agent framework that iteratively refines transcripts while preserving semantic intent. Experiments on the FluencyBank dataset and a user study demonstrate a clear reduction in word error rate (WER) and strong user satisfaction. Beyond immediate accessibility benefits, fine-tuning ASR on STEAMROLLER-repaired speech yields additional WER improvements, creating a pathway toward inclusive AI ecosystems.
format	Preprint
id	arxiv_https___arxiv_org_abs_2601_10223
institution	arXiv
publishDate	2026
record_format	arxiv
title	STEAMROLLER: A Multi-Agent System for Inclusive Automatic Speech Recognition for People who Stutter
topic	Computers and Society
url	https://arxiv.org/abs/2601.10223

Similar Items