| Main Authors: | Yang, Chao-Han Huck, Gu, Yile, Liu, Yi-Chieh, Ghosh, Shalini, Bulyko, Ivan, Stolcke, Andreas |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2309.15649 |
Similar Items
Low-rank Adaptation of Large Language Model Rescoring for Parameter-Efficient Speech Recognition
by: Yu, Yu, et al.
Published: (2023)
Speech Recognition Rescoring with Large Speech-Text Foundation Models
by: Shivakumar, Prashanth Gurunath, et al.
Published: (2024)
Paralinguistics-Enhanced Large Language Modeling of Spoken Dialogue
by: Lin, Guan-Ting, et al.
Published: (2023)
Multi-Modal Retrieval For Large Language Model Based Speech Recognition
by: Kolehmainen, Jari, et al.
Published: (2024)
Benchmarking Japanese Speech Recognition on ASR-LLM Setups with Multi-Pass Augmented Generative Error Correction
by: Ko, Yuka, et al.
Published: (2024)
Multimodal Attention Merging for Improved Speech Recognition and Audio Event Classification
by: Sundar, Anirudh S., et al.
Published: (2023)
Spoken Conversational Agents with Large Language Models
by: Yang, Chao-Han Huck, et al.
Published: (2025)
Evolutionary Prompt Design for LLM-Based Post-ASR Error Correction
by: Sachdev, Rithik, et al.
Published: (2024)
Group Relative Policy Optimization for Speech Recognition
by: Shivakumar, Prashanth Gurunath, et al.
Published: (2025)
Investigating Training Strategies and Model Robustness of Low-Rank Adaptation for Language Modeling in Speech Recognition
by: Yu, Yu, et al.
Published: (2024)
Large Language Model Based Generative Error Correction: A Challenge and Baselines for Speech Recognition, Speaker Tagging, and Emotion Recognition
by: Yang, Chao-Han Huck, et al.
Published: (2024)
PROCTER: PROnunciation-aware ConTextual adaptER for personalized speech recognition in neural transducers
by: Pandey, Rahul, et al.
Published: (2023)
Reducing Geographic Disparities in Automatic Speech Recognition via Elastic Weight Consolidation
by: Trinh, Viet Anh, et al.
Published: (2022)
Revise, Reason, and Recognize: LLM-Based Emotion Recognition via Emotion-Specific Prompts and ASR Error Correction
by: Li, Yuanchao, et al.
Published: (2024)
Retrieval Augmented Correction of Named Entity Speech Recognition Errors
by: Pusateri, Ernest, et al.
Published: (2024)
Pinyin Regularization in Error Correction for Chinese Speech Recognition with Large Language Models
by: Tang, Zhiyuan, et al.
Published: (2024)
Adversarial Reweighting for Speaker Verification Fairness
by: Jin, Minho, et al.
Published: (2022)
Whispering LLaMA: A Cross-Modal Generative Error Correction Framework for Speech Recognition
by: Radhakrishnan, Srijith, et al.
Published: (2023)
Task Oriented Dialogue as a Catalyst for Self-Supervised Automatic Speech Recognition
by: Chan, David M., et al.
Published: (2024)
Audio Large Language Models Can Be Descriptive Speech Quality Evaluators
by: Chen, Chen, et al.
Published: (2025)
A Transcription Prompt-based Efficient Audio Large Language Model for Robust Speech Recognition
by: Li, Yangze, et al.
Published: (2024)
Large Language Models are Efficient Learners of Noise-Robust Speech Recognition
by: Hu, Yuchen, et al.
Published: (2024)
Toward Fairness in Speech Recognition: Discovery and mitigation of performance disparities
by: Dheram, Pranav, et al.
Published: (2022)
Speech Recognition for Analysis of Police Radio Communication
by: Srivastava, Tejes, et al.
Published: (2024)
MMGER: Multi-modal and Multi-granularity Generative Error Correction with LLM for Joint Accent and Speech Recognition
by: Mu, Bingshen, et al.
Published: (2024)
UCorrect: An Unsupervised Framework for Automatic Speech Recognition Error Correction
by: Guo, Jiaxin, et al.
Published: (2024)
Towards ASR Robust Spoken Language Understanding Through In-Context Learning With Word Confusion Networks
by: Everson, Kevin, et al.
Published: (2024)
An approach to measuring the performance of Automatic Speech Recognition (ASR) models in the context of Large Language Model (LLM) powered applications
by: Pulikodan, Sujith, et al.
Published: (2025)
Zero-Shot Recognition of Dysarthric Speech Using Commercial Automatic Speech Recognition and Multimodal Large Language Models
by: Alsayegh, Ali, et al.
Published: (2025)
It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition
by: Chen, Chen, et al.
Published: (2024)
Mixture of LoRA Experts with Multi-Modal and Multi-Granularity LLM Generative Error Correction for Accented Speech Recognition
by: Mu, Bingshen, et al.
Published: (2025)
Rethinking Processing Distortions: Disentangling the Impact of Speech Enhancement Errors on Speech Recognition Performance
by: Ochiai, Tsubasa, et al.
Published: (2024)
Post-Training Embedding Alignment for Decoupling Enrollment and Runtime Speaker Recognition Models
by: Gao, Chenyang, et al.
Published: (2024)
Improving fairness in speaker verification via Group-adapted Fusion Network
by: Shen, Hua, et al.
Published: (2022)
Enhancing Speech Large Language Models with Prompt-Aware Mixture of Audio Encoders
by: Shan, Weiqiao, et al.
Published: (2025)
Testing Correctness, Fairness, and Robustness of Speech Emotion Recognition Models
by: Derington, Anna, et al.
Published: (2023)
EmoQ: Speech Emotion Recognition via Speech-Aware Q-Former and Large Language Model
by: Yang, Yiqing, et al.
Published: (2025)
DeSTA2: Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data
by: Lu, Ke-Han, et al.
Published: (2024)
Error Correction by Paying Attention to Both Acoustic and Confidence References for Automatic Speech Recognition
by: Shu, Yuchun, et al.
Published: (2024)
Streaming Speech Recognition with Decoder-Only Large Language Models and Latency Optimization
by: Wan, Genshun, et al.
Published: (2026)