| Main Authors: | Fang, Ying, Li, Xiaofei |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2509.14653 |
Similar Items
An efficient text augmentation approach for contextualized Mandarin speech recognition
by: Zheng, Naijun, et al.
Published: (2024)
A layer-wise analysis of Mandarin and English suprasegmentals in SSL speech models
by: de la Fuente, Antón, et al.
Published: (2024)
Introducing MELI: the Mandarin-English Language Interview Corpus
by: Liu, Suyuan, et al.
Published: (2026)
Prominence-aware automatic speech recognition for conversational speech
by: Linke, Julian, et al.
Published: (2025)
The evaluation of a code-switched Sepedi-English automatic speech recognition system
by: Phaladi, Amanda, et al.
Published: (2024)
Introduction to speech recognition
by: Dauphin, Gabriel
Published: (2024)
Form and meaning co-determine the realization of tone in Taiwan Mandarin spontaneous speech: the case of T2-T3 and T3-T3 tone sandhi
by: Lu, Yuxin, et al.
Published: (2024)
DiffNorm: Self-Supervised Normalization for Non-autoregressive Speech-to-speech Translation
by: Tan, Weiting, et al.
Published: (2024)
Punctuation Restoration for Singaporean Spoken Languages: English, Malay, and Mandarin
by: Rao, Abhinav, et al.
Published: (2022)
Improving child speech recognition with augmented child-like speech
by: Zhang, Yuanyuan, et al.
Published: (2024)
Unimodal Aggregation for CTC-based Speech Recognition
by: Fang, Ying, et al.
Published: (2023)
Mamba for Streaming ASR Combined with Unimodal Aggregation
by: Fang, Ying, et al.
Published: (2024)
Word-specific tonal realizations in Mandarin
by: Chuang, Yu-Ying, et al.
Published: (2024)
Automated evaluation of LLMs for effective machine translation of Mandarin Chinese to English
by: Zhang, Yue, et al.
Published: (2026)
Multilingual Stutter Event Detection for English, German, and Mandarin Speech
by: Haas, Felix, et al.
Published: (2026)
Advancing LLM-based phoneme-to-grapheme for multilingual speech recognition
by: Dong, Lukuang, et al.
Published: (2026)
Direct Preference Optimization for English-Mandarin Code-Switching Speech Recognition in Audio LLMs
by: Quang, Trung Nguyen, et al.
Published: (2026)
CS3-Bench: Evaluating and Enhancing Speech-to-Speech LLMs for Mandarin-English Code-Switching
by: Liu, Heyang, et al.
Published: (2025)
Enrolment-based personalisation for improving individual-level fairness in speech emotion recognition
by: Triantafyllopoulos, Andreas, et al.
Published: (2024)
XLM: A Python package for non-autoregressive language models
by: Patel, Dhruvesh, et al.
Published: (2025)
LLM-based phoneme-to-grapheme for phoneme-based speech recognition
by: Ma, Te, et al.
Published: (2025)
Self-consistent context aware conformer transducer for speech recognition
by: Kolokolov, Konstantin, et al.
Published: (2024)
Advancing Speech Translation: A Corpus of Mandarin-English Conversational Telephone Speech
by: Wotherspoon, Shannon, et al.
Published: (2024)
A unified front-end framework for English text-to-speech synthesis
by: Ying, Zelin, et al.
Published: (2023)
Identifying and typifying demographic unfairness in phoneme-level embeddings of self-supervised speech recognition models
by: Herron, Felix, et al.
Published: (2026)
Convoifilter: A case study of doing cocktail party speech recognition
by: Nguyen, Thai-Binh, et al.
Published: (2023)
Automated speech audiometry: Can it work using open-source pre-trained Kaldi-NL automatic speech recognition?
by: Araiza-Illan, Gloria, et al.
Published: (2023)
Tag and correct: high precision post-editing approach to correction of speech recognition errors
by: Ziętkiewicz, Tomasz
Published: (2024)
Measuring Taiwanese Mandarin Language Understanding
by: Chen, Po-Heng, et al.
Published: (2024)
Exploring the limits of decoder-only models trained on public speech recognition corpora
by: Gupta, Ankit, et al.
Published: (2024)
asr_eval: Algorithms and tools for multi-reference and streaming speech recognition evaluation
by: Sedukhin, Oleg, et al.
Published: (2026)
More than words: Advancements and challenges in speech recognition for singing
by: Kruspe, Anna
Published: (2024)
Integrating automatic speech recognition into remote healthcare interpreting: A pilot study of its impact on interpreting quality
by: Tan, Shiyi, et al.
Published: (2025)
Paraformer-v2: An improved non-autoregressive transformer for noise-robust speech recognition
by: An, Keyu, et al.
Published: (2024)
LLMs for automatic annotation of Mandarin narrative transcripts
by: Zhao, Qingwen, et al.
Published: (2026)
Acquisition of Recursive Possessives and Recursive Locatives in Mandarin
by: Fu, Chenxi, et al.
Published: (2024)
Automatic speech recognition for the Nepali language using CNN, bidirectional LSTM and ResNet
by: Dhakal, Manish, et al.
Published: (2024)
The realization of tones in spontaneous spoken Taiwan Mandarin: a corpus-based survey and theory-driven computational modeling
by: Lu, Yuxin, et al.
Published: (2025)
Predicting positive transfer for improved low-resource speech recognition using acoustic pseudo-tokens
by: San, Nay, et al.
Published: (2024)
What Have We Achieved on Non-autoregressive Translation?
by: Li, Yafu, et al.
Published: (2024)