| Main Authors: | Parikh, Aditya Kamlesh; Tejedor-Garcia, Cristian; Cucchiarini, Catia; Strik, Helmer |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2506.02080 |
Similar Items
Evaluating Logit-Based GOP Scores for Mispronunciation Detection
by: Parikh, Aditya Kamlesh, et al.
Published: (2025)
Rubric-Guided Fine-tuning of SpeechLLMs for Multi-Aspect, Multi-Rater L2 Reading-Speech Assessment
by: Parikh, Aditya Kamlesh, et al.
Published: (2026)
Zero-Shot Speech LLMs for Multi-Aspect Evaluation of L2 Speech: Challenges and Opportunities
by: Parikh, Aditya Kamlesh, et al.
Published: (2026)
Improving Child Speech Recognition and Reading Mistake Detection by Using Prompts
by: Gao, Lingyun, et al.
Published: (2025)
Reading Miscue Detection in Primary School through Automatic Speech Recognition
by: Gao, Lingyun, et al.
Published: (2024)
Utterance-Level Methods for Identifying Reliable ASR-Output for Child Speech
by: Lathouwers, Gus, et al.
Published: (2026)
Automatic Assessment of Oral Reading Accuracy for Reading Diagnostics
by: Molenaar, Bo, et al.
Published: (2023)
Automatic Speech Recognition of Non-Native Child Speech for Language Learning Applications
by: Wills, Simone, et al.
Published: (2023)
An ASR-Based Tutor for Learning to Read: How to Optimize Feedback to First Graders
by: Bai, Yu, et al.
Published: (2023)
Evaluating the Effectiveness of Pre-Trained Audio Embeddings for Classification of Parkinson's Disease Speech Data
by: Postma, Emmy, et al.
Published: (2025)
Counterfactual Activation Editing for Post-hoc Prosody and Mispronunciation Correction in TTS Models
by: Lee, Kyowoon, et al.
Published: (2025)
Alzheimer Disease Classification through ASR-based Transcriptions: Exploring the Impact of Punctuation and Pauses
by: Gómez-Zaragozá, Lucía, et al.
Published: (2023)
CTC-TTS: LLM-based dual-streaming text-to-speech with CTC alignment
by: Liu, Hanwen, et al.
Published: (2026)
Enhancing CTC-based speech recognition with diverse modeling units
by: Han, Shiyi, et al.
Published: (2024)
CTC-Assisted LLM-Based Contextual ASR
by: Yang, Guanrou, et al.
Published: (2024)
CTC-GMM: CTC guided modality matching for fast and accurate streaming speech translation
by: Zhao, Rui, et al.
Published: (2024)
Innovative Speech-Based Deep Learning Approaches for Parkinson's Disease Classification: A Systematic Review
by: van Gelderen, Lisanne, et al.
Published: (2024)
TASU2: Controllable CTC Simulation for Alignment and Low-Resource Adaptation of Speech LLMs
by: Peng, Jing, et al.
Published: (2026)
Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC
by: Kang, Jiawen, et al.
Published: (2024)
CTC-aligned Audio-Text Embedding for Streaming Open-vocabulary Keyword Spotting
by: Jin, Sichen, et al.
Published: (2024)
Phonology-Guided Speech-to-Speech Translation for African Languages
by: Ochieng, Peter, et al.
Published: (2024)
RECA-PD: A Robust Explainable Cross-Attention Method for Speech-based Parkinson's Disease Classification
by: Zhong, Terry Yi, et al.
Published: (2025)
Scenario of Use Scheme: Threat Model Specification for Speaker Privacy Protection in the Medical Domain
by: Rahman, Mehtab Ur, et al.
Published: (2024)
A Benchmark for Early-stage Parkinson's Disease Detection from Speech
by: Zhong, Terry Yi, et al.
Published: (2026)
PhonologyBench: Evaluating Phonological Skills of Large Language Models
by: Suvarna, Ashima, et al.
Published: (2024)
Evaluating the Usefulness of Non-Diagnostic Speech Data for Developing Parkinson's Disease Classifiers
by: Zhong, Terry Yi, et al.
Published: (2025)
FlexCTC: GPU-powered CTC Beam Decoding With Advanced Contextual Abilities
by: Grigoryan, Lilit, et al.
Published: (2025)
Fast Context-Biasing for CTC and Transducer ASR models with CTC-based Word Spotter
by: Andrusenko, Andrei, et al.
Published: (2024)
Cross-modal Knowledge Transfer Learning as Graph Matching Based on Optimal Transport for ASR
by: Lu, Xugang, et al.
Published: (2025)
Less Peaky and More Accurate CTC Forced Alignment by Label Priors
by: Huang, Ruizhe, et al.
Published: (2024)
Audio Deepfake Detection in the Age of Advanced Text-to-Speech models
by: Singh, Robin, et al.
Published: (2026)
Adaptive Knowledge Distillation for Device-Directed Speech Detection
by: Chi, Hyung Gun, et al.
Published: (2025)
kNN-CTC: Enhancing ASR via Retrieval of CTC Pseudo Labels
by: Zhou, Jiaming, et al.
Published: (2023)
Mispronunciation Detection and Diagnosis Without Model Training: A Retrieval-Based Approach
by: Tu, Huu Tuong, et al.
Published: (2025)
Replay Attacks Against Audio Deepfake Detection
by: Müller, Nicolas, et al.
Published: (2025)
Studying the Effect of Audio Filters in Pre-Trained Models for Environmental Sound Classification
by: Dawn, Aditya, et al.
Published: (2024)
Pitch-Aware RNN-T for Mandarin Chinese Mispronunciation Detection and Diagnosis
by: Wang, Xintong, et al.
Published: (2024)
Beyond Acoustic Sparsity and Linguistic Bias: A Prompt-Free Paradigm for Mispronunciation Detection and Diagnosis
by: Geng, Haopeng, et al.
Published: (2026)
LV-CTC: Non-autoregressive ASR with CTC and latent variable models
by: Fujita, Yuya, et al.
Published: (2024)
Improving Pretrained YAMNet for Enhanced Speech Command Detection via Transfer Learning
by: Lachenani, Sidahmed, et al.
Published: (2025)