:: Library Catalog

Bibliographic Details
Main Authors:	Gómez-Zaragozá, Lucía; Wills, Simone; Tejedor-Garcia, Cristian; Marín-Morales, Javier; Alcañiz, Mariano; Strik, Helmer
Format:	Preprint
Published:	2023
Subjects:	Computation and Language; Sound; Audio and Speech Processing; Signal Processing
Online Access:	https://arxiv.org/abs/2306.03443

Similar Items

Automatic Speech Recognition of Non-Native Child Speech for Language Learning Applications
by: Wills, Simone, et al.
Published: (2023)

An ASR-Based Tutor for Learning to Read: How to Optimize Feedback to First Graders
by: Bai, Yu, et al.
Published: (2023)

Automatic Assessment of Oral Reading Accuracy for Reading Diagnostics
by: Molenaar, Bo, et al.
Published: (2023)

Evaluating Logit-Based GOP Scores for Mispronunciation Detection
by: Parikh, Aditya Kamlesh, et al.
Published: (2025)

Improving Child Speech Recognition and Reading Mistake Detection by Using Prompts
by: Gao, Lingyun, et al.
Published: (2025)

Rubric-Guided Fine-tuning of SpeechLLMs for Multi-Aspect, Multi-Rater L2 Reading-Speech Assessment
by: Parikh, Aditya Kamlesh, et al.
Published: (2026)

Zero-Shot Speech LLMs for Multi-Aspect Evaluation of L2 Speech: Challenges and Opportunities
by: Parikh, Aditya Kamlesh, et al.
Published: (2026)

Leveraging Prompt Learning and Pause Encoding for Alzheimer's Disease Detection
by: Liu, Yin-Long, et al.
Published: (2024)

End-to-end Joint Punctuated and Normalized ASR with a Limited Amount of Punctuated Training Data
by: Cui, Can, et al.
Published: (2023)

Integrating Pause Information with Word Embeddings in Language Models for Alzheimer's Disease Detection from Spontaneous Speech
by: Pu, Yu, et al.
Published: (2025)

Lightweight Target-Speaker-Based Overlap Transcription for Practical Streaming ASR
by: Pražák, Aleš, et al.
Published: (2025)

Enhancing GOP in CTC-Based Mispronunciation Detection with Phonological Knowledge
by: Parikh, Aditya Kamlesh, et al.
Published: (2025)

Reading Miscue Detection in Primary School through Automatic Speech Recognition
by: Gao, Lingyun, et al.
Published: (2024)

Effects of automotive microphone frequency response characteristics and noise conditions on speech and ASR quality -- an experimental evaluation
by: Buccoli, Michele, et al.
Published: (2025)

Mind the Gap: Entity-Preserved Context-Aware ASR Structured Transcriptions
by: Altinok, Duygu
Published: (2025)

The Sound of Healthcare: Improving Medical Transcription ASR Accuracy with Large Language Models
by: Adedeji, Ayo, et al.
Published: (2024)

RECA-PD: A Robust Explainable Cross-Attention Method for Speech-based Parkinson's Disease Classification
by: Zhong, Terry Yi, et al.
Published: (2025)

Innovative Speech-Based Deep Learning Approaches for Parkinson's Disease Classification: A Systematic Review
by: van Gelderen, Lisanne, et al.
Published: (2024)

Target Speaker ASR with Whisper
by: Polok, Alexander, et al.
Published: (2024)

Index-ASR Technical Report
by: Song, Zheshu, et al.
Published: (2025)

Beyond Transcription: Mechanistic Interpretability in ASR
by: Glazer, Neta, et al.
Published: (2025)

BR-ASR: Efficient and Scalable Bias Retrieval Framework for Contextual Biasing ASR in Speech LLM
by: Gong, Xun, et al.
Published: (2025)

Speech Emotion Recognition with ASR Integration
by: Li, Yuanchao
Published: (2026)

Efficient Scaling for LLM-based ASR
by: Mu, Bingshen, et al.
Published: (2025)

EMOVOME: A Dataset for Emotion Recognition in Spontaneous Real-Life Speech
by: Gómez-Zaragozá, Lucía, et al.
Published: (2024)

Breaking the Transcription Bottleneck: Fine-tuning ASR Models for Extremely Low-Resource Fieldwork Languages
by: Liang, Siyu, et al.
Published: (2025)

The USTC-NERCSLIP Systems for The ICMC-ASR Challenge
by: Wu, Minghui, et al.
Published: (2024)

Exploring SSL Discrete Tokens for Multilingual ASR
by: Cui, Mingyu, et al.
Published: (2024)

Infusing Acoustic Pause Context into Text-Based Dementia Assessment
by: Braun, Franziska, et al.
Published: (2024)

Is Transfer Learning Necessary for Violin Transcription?
by: Peng, Yueh-Po, et al.
Published: (2025)

LUPET: Incorporating Hierarchical Information Path into Multilingual ASR
by: Liu, Wei, et al.
Published: (2024)

persoDA: Personalized Data Augmentation for Personalized ASR
by: Parada, Pablo Peso, et al.
Published: (2025)

Speaker Adaptation for Quantised End-to-End ASR Models
by: Zhao, Qiuming, et al.
Published: (2024)

Comparative Analysis of ASR Methods for Speech Deepfake Detection
by: Salvi, Davide, et al.
Published: (2024)

Consistency Based Unsupervised Self-training For ASR Personalisation
by: Zhang, Jisi, et al.
Published: (2024)

Leveraging Multimodal Methods and Spontaneous Speech for Alzheimer's Disease Identification
by: Gao, Yifan, et al.
Published: (2024)

Exploring Dynamic Parameters for Vietnamese Gender-Independent ASR
by: Leang, Sotheara, et al.
Published: (2025)

Prompting Whisper for Joint Speech Transcription and Diarization
by: Zamyrova, Mariia, et al.
Published: (2026)

Robust Singing Voice Transcription Serves Synthesis
by: Li, Ruiqi, et al.
Published: (2024)

Joint ASR and Speaker Role Tagging with Serialized Output Training
by: Xu, Anfeng, et al.
Published: (2025)