:: Library Catalog

Bibliographic Details
Main Authors:	Yang, Chao-Han Huck; Gu, Yile; Liu, Yi-Chieh; Ghosh, Shalini; Bulyko, Ivan; Stolcke, Andreas
Format:	Preprint
Published:	2023
Subjects:	Computation and Language; Artificial Intelligence; Machine Learning; Sound; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2309.15649

Similar Items

Low-rank Adaptation of Large Language Model Rescoring for Parameter-Efficient Speech Recognition
by: Yu, Yu, et al.
Published: (2023)

Speech Recognition Rescoring with Large Speech-Text Foundation Models
by: Shivakumar, Prashanth Gurunath, et al.
Published: (2024)

Paralinguistics-Enhanced Large Language Modeling of Spoken Dialogue
by: Lin, Guan-Ting, et al.
Published: (2023)

Multi-Modal Retrieval For Large Language Model Based Speech Recognition
by: Kolehmainen, Jari, et al.
Published: (2024)

Benchmarking Japanese Speech Recognition on ASR-LLM Setups with Multi-Pass Augmented Generative Error Correction
by: Ko, Yuka, et al.
Published: (2024)

Multimodal Attention Merging for Improved Speech Recognition and Audio Event Classification
by: Sundar, Anirudh S., et al.
Published: (2023)

Spoken Conversational Agents with Large Language Models
by: Yang, Chao-Han Huck, et al.
Published: (2025)

Evolutionary Prompt Design for LLM-Based Post-ASR Error Correction
by: Sachdev, Rithik, et al.
Published: (2024)

Group Relative Policy Optimization for Speech Recognition
by: Shivakumar, Prashanth Gurunath, et al.
Published: (2025)

Investigating Training Strategies and Model Robustness of Low-Rank Adaptation for Language Modeling in Speech Recognition
by: Yu, Yu, et al.
Published: (2024)

Large Language Model Based Generative Error Correction: A Challenge and Baselines for Speech Recognition, Speaker Tagging, and Emotion Recognition
by: Yang, Chao-Han Huck, et al.
Published: (2024)

PROCTER: PROnunciation-aware ConTextual adaptER for personalized speech recognition in neural transducers
by: Pandey, Rahul, et al.
Published: (2023)

Reducing Geographic Disparities in Automatic Speech Recognition via Elastic Weight Consolidation
by: Trinh, Viet Anh, et al.
Published: (2022)

Revise, Reason, and Recognize: LLM-Based Emotion Recognition via Emotion-Specific Prompts and ASR Error Correction
by: Li, Yuanchao, et al.
Published: (2024)

Retrieval Augmented Correction of Named Entity Speech Recognition Errors
by: Pusateri, Ernest, et al.
Published: (2024)

Pinyin Regularization in Error Correction for Chinese Speech Recognition with Large Language Models
by: Tang, Zhiyuan, et al.
Published: (2024)

Adversarial Reweighting for Speaker Verification Fairness
by: Jin, Minho, et al.
Published: (2022)

Whispering LLaMA: A Cross-Modal Generative Error Correction Framework for Speech Recognition
by: Radhakrishnan, Srijith, et al.
Published: (2023)

Task Oriented Dialogue as a Catalyst for Self-Supervised Automatic Speech Recognition
by: Chan, David M., et al.
Published: (2024)

Audio Large Language Models Can Be Descriptive Speech Quality Evaluators
by: Chen, Chen, et al.
Published: (2025)

A Transcription Prompt-based Efficient Audio Large Language Model for Robust Speech Recognition
by: Li, Yangze, et al.
Published: (2024)

Large Language Models are Efficient Learners of Noise-Robust Speech Recognition
by: Hu, Yuchen, et al.
Published: (2024)

Toward Fairness in Speech Recognition: Discovery and mitigation of performance disparities
by: Dheram, Pranav, et al.
Published: (2022)

Speech Recognition for Analysis of Police Radio Communication
by: Srivastava, Tejes, et al.
Published: (2024)

MMGER: Multi-modal and Multi-granularity Generative Error Correction with LLM for Joint Accent and Speech Recognition
by: Mu, Bingshen, et al.
Published: (2024)

UCorrect: An Unsupervised Framework for Automatic Speech Recognition Error Correction
by: Guo, Jiaxin, et al.
Published: (2024)

Towards ASR Robust Spoken Language Understanding Through In-Context Learning With Word Confusion Networks
by: Everson, Kevin, et al.
Published: (2024)

An approach to measuring the performance of Automatic Speech Recognition (ASR) models in the context of Large Language Model (LLM) powered applications
by: Pulikodan, Sujith, et al.
Published: (2025)

Zero-Shot Recognition of Dysarthric Speech Using Commercial Automatic Speech Recognition and Multimodal Large Language Models
by: Alsayegh, Ali, et al.
Published: (2025)

It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition
by: Chen, Chen, et al.
Published: (2024)

Mixture of LoRA Experts with Multi-Modal and Multi-Granularity LLM Generative Error Correction for Accented Speech Recognition
by: Mu, Bingshen, et al.
Published: (2025)

Rethinking Processing Distortions: Disentangling the Impact of Speech Enhancement Errors on Speech Recognition Performance
by: Ochiai, Tsubasa, et al.
Published: (2024)

Post-Training Embedding Alignment for Decoupling Enrollment and Runtime Speaker Recognition Models
by: Gao, Chenyang, et al.
Published: (2024)

Improving fairness in speaker verification via Group-adapted Fusion Network
by: Shen, Hua, et al.
Published: (2022)

Enhancing Speech Large Language Models with Prompt-Aware Mixture of Audio Encoders
by: Shan, Weiqiao, et al.
Published: (2025)

Testing Correctness, Fairness, and Robustness of Speech Emotion Recognition Models
by: Derington, Anna, et al.
Published: (2023)

EmoQ: Speech Emotion Recognition via Speech-Aware Q-Former and Large Language Model
by: Yang, Yiqing, et al.
Published: (2025)

DeSTA2: Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data
by: Lu, Ke-Han, et al.
Published: (2024)

Error Correction by Paying Attention to Both Acoustic and Confidence References for Automatic Speech Recognition
by: Shu, Yuchun, et al.
Published: (2024)

Streaming Speech Recognition with Decoder-Only Large Language Models and Latency Optimization
by: Wan, Genshun, et al.
Published: (2026)