:: Library Catalog

Bibliographic Details
Main Authors:	Xi, Yu; Ding, Wen; Yu, Kai; Lai, Junjie
Format:	Preprint
Published:	2024
Subjects:	Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2407.04219

Similar Items

Romanization Encoding For Multilingual ASR
by: Ding, Wen, et al.
Published: (2024)

Better Semi-supervised Learning for Multi-domain ASR Through Incremental Retraining and Data Filtering
by: Carofilis, Andres, et al.
Published: (2025)

Align-Consistency: Improving Non-autoregressive and Semi-supervised ASR with Consistency Regularization
by: Huang, Wanting, et al.
Published: (2026)

MOSA: Mixtures of Simple Adapters Outperform Monolithic Approaches in LLM-based Multilingual ASR
by: Li, Junjie, et al.
Published: (2025)

TC-BiMamba: Trans-Chunk bidirectionally within BiMamba for unified streaming and non-streaming ASR
by: She, Qingshun, et al.
Published: (2026)

DM-ASR: Diarization-aware Multi-speaker ASR with Large Language Models
by: Li, Li, et al.
Published: (2026)

Doctor or Patient? Synergizing Diarization and ASR for Code-Switched Hinglish Medical Conditions Extraction
by: Baroudi, Séverin, et al.
Published: (2026)

AsyncSwitch: Asynchronous Text-Speech Adaptation for Code-Switched ASR
by: Nguyen, Tuan, et al.
Published: (2025)

LESS: Large Language Model Enhanced Semi-Supervised Learning for Speech Foundational Models Using in-the-wild Data
by: Ding, Wen, et al.
Published: (2025)

Advancing Multi-talker ASR Performance with Large Language Models
by: Shi, Mohan, et al.
Published: (2024)

A Survey on Speech Large Language Models for Understanding
by: Peng, Jing, et al.
Published: (2024)

Optimizing ASR for Catalan-Spanish Code-Switching: A Comparative Analysis of Methodologies
by: Mena, Carlos, et al.
Published: (2025)

Masked Self-distilled Transducer-based Keyword Spotting with Semi-autoregressive Decoding
by: Xi, Yu, et al.
Published: (2025)

FairASR: Fair Audio Contrastive Learning for Automatic Speech Recognition
by: Kim, Jongsuk, et al.
Published: (2025)

Investigating Zero-Shot Generalizability on Mandarin-English Code-Switched ASR and Speech-to-text Translation of Recent Foundation Models with Self-Supervision and Weak Supervision
by: Yang, Chih-Kai, et al.
Published: (2023)

Boosting Code-Switching ASR with Mixture of Experts Enhanced Speech-Conditioned LLM
by: Zhang, Fengrun, et al.
Published: (2024)

TASU: Text-Only Alignment for Speech Understanding
by: Peng, Jing, et al.
Published: (2025)

NTC-KWS: Noise-aware CTC for Robust Keyword Spotting
by: Xi, Yu, et al.
Published: (2024)

Contextual Biasing for LLM-Based ASR with Hotword Retrieval and Reinforcement Learning
by: Kong, YuXiang, et al.
Published: (2025)

SC-MoE: Switch Conformer Mixture of Experts for Unified Streaming and Non-streaming Code-Switching ASR
by: Ye, Shuaishuai, et al.
Published: (2024)

Neural Directed Speech Enhancement with Dual Microphone Array in High Noise Scenario
by: Wen, Wen, et al.
Published: (2024)

ASR-EC Benchmark: Evaluating Large Language Models on Chinese ASR Error Correction
by: Wei, Victor Junqiu, et al.
Published: (2024)

CAMEL: Cross-Attention Enhanced Mixture-of-Experts and Language Bias for Code-Switching Speech Recognition
by: Wang, He, et al.
Published: (2024)

Large Language Models based ASR Error Correction for Child Conversations
by: Xu, Anfeng, et al.
Published: (2025)

AdaCS: Adaptive Normalization for Enhanced Code-Switching ASR
by: Chu, The Chuong, et al.
Published: (2025)

Robust ASR Error Correction with Conservative Data Filtering
by: Udagawa, Takuma, et al.
Published: (2024)

Multi-Channel Multi-Speaker ASR Using Target Speaker's Solo Segment
by: Shao, Yiwen, et al.
Published: (2024)

UniVoice: Unifying Autoregressive ASR and Flow-Matching based TTS with Large Language Models
by: Guan, Wenhao, et al.
Published: (2025)

ASR Error Correction using Large Language Models
by: Ma, Rao, et al.
Published: (2024)

Boosting Multi-Speaker Expressive Speech Synthesis with Semi-supervised Contrastive Learning
by: Zhu, Xinfa, et al.
Published: (2023)

Self-supervised ASR Models and Features For Dysarthric and Elderly Speech Recognition
by: Hu, Shujie, et al.
Published: (2024)

Benchmarking Children's ASR with Supervised and Self-supervised Speech Foundation Models
by: Fan, Ruchao, et al.
Published: (2024)

Federated Learning of Large ASR Models in the Real World
by: Xiao, Yonghui, et al.
Published: (2024)

Contrastive Learning With Audio Discrimination For Customizable Keyword Spotting In Continuous Speech
by: Xi, Yu, et al.
Published: (2024)

Inverse-Hessian Regularization for Continual Learning in ASR
by: Eeckt, Steven Vander, et al.
Published: (2026)

Efficient Scaling for LLM-based ASR
by: Mu, Bingshen, et al.
Published: (2025)

Context and System Fusion in Post-ASR Emotion Recognition with Large Language Models
by: Stepachev, Pavel, et al.
Published: (2024)

All-in-One ASR: Unifying Encoder-Decoder Models of CTC, Attention, and Transducer in Dual-Mode ASR
by: Moriya, Takafumi, et al.
Published: (2025)

End-to-end Acoustic-linguistic Emotion and Intent Recognition Enhanced by Semi-supervised Learning
by: Ren, Zhao, et al.
Published: (2025)

MaLa-ASR: Multimedia-Assisted LLM-Based ASR
by: Yang, Guanrou, et al.
Published: (2024)