| Main Authors: | Postma, Emmy, Tejedor-Garcia, Cristian |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2506.02078 |
Similar Items
Evaluating the Usefulness of Non-Diagnostic Speech Data for Developing Parkinson's Disease Classifiers
by: Zhong, Terry Yi, et al.
Published: (2025)
Innovative Speech-Based Deep Learning Approaches for Parkinson's Disease Classification: A Systematic Review
by: van Gelderen, Lisanne, et al.
Published: (2024)
RECA-PD: A Robust Explainable Cross-Attention Method for Speech-based Parkinson's Disease Classification
by: Zhong, Terry Yi, et al.
Published: (2025)
A Benchmark for Early-stage Parkinson's Disease Detection from Speech
by: Zhong, Terry Yi, et al.
Published: (2026)
Zero-Shot Speech LLMs for Multi-Aspect Evaluation of L2 Speech: Challenges and Opportunities
by: Parikh, Aditya Kamlesh, et al.
Published: (2026)
Zero-Shot Parkinson's Disease Detection from Speech: Comparing Large Audio and Language Models
by: Kabir, Muhammad Ashad, et al.
Published: (2026)
Evaluating Logit-Based GOP Scores for Mispronunciation Detection
by: Parikh, Aditya Kamlesh, et al.
Published: (2025)
Studying the Effect of Audio Filters in Pre-Trained Models for Environmental Sound Classification
by: Dawn, Aditya, et al.
Published: (2024)
Rubric-Guided Fine-tuning of SpeechLLMs for Multi-Aspect, Multi-Rater L2 Reading-Speech Assessment
by: Parikh, Aditya Kamlesh, et al.
Published: (2026)
Improving Child Speech Recognition and Reading Mistake Detection by Using Prompts
by: Gao, Lingyun, et al.
Published: (2025)
Enhancing GOP in CTC-Based Mispronunciation Detection with Phonological Knowledge
by: Parikh, Aditya Kamlesh, et al.
Published: (2025)
Bilingual Dual-Head Deep Model for Parkinson's Disease Detection from Speech
by: La Quatra, Moreno, et al.
Published: (2025)
Reading Miscue Detection in Primary School through Automatic Speech Recognition
by: Gao, Lingyun, et al.
Published: (2024)
4,500 Seconds: Small Data Training Approaches for Deep UAV Audio Classification
by: Berg, Andrew P., et al.
Published: (2025)
Unlocking Strong Supervision: A Data-Centric Study of General-Purpose Audio Pre-Training Methods
by: Zhou, Xuanru, et al.
Published: (2026)
Sommelier: Scalable Open Multi-turn Audio Pre-processing for Full-duplex Speech Language Models
by: Jung, Kyudan, et al.
Published: (2026)
Unify Variables in Neural Scaling Laws for General Audio Representations via Embedding Effective Rank
by: Deng, Xuyao, et al.
Published: (2025)
Investigating the Effectiveness of Explainability Methods in Parkinson's Detection from Speech
by: Mancini, Eleonora, et al.
Published: (2024)
Leveraging Pre-Trained Autoencoders for Interpretable Prototype Learning of Music Audio
by: Alonso-Jiménez, Pablo, et al.
Published: (2024)
DASB - Discrete Audio and Speech Benchmark
by: Mousavi, Pooneh, et al.
Published: (2024)
Speech Enhancement Using Continuous Embeddings of Neural Audio Codec
by: Li, Haoyang, et al.
Published: (2025)
MATE: Matryoshka Audio-Text Embeddings for Open-Vocabulary Keyword Spotting
by: Jung, Youngmoon, et al.
Published: (2026)
Improving Anomalous Sound Detection via Low-Rank Adaptation Fine-Tuning of Pre-Trained Audio Models
by: Zheng, Xinhu, et al.
Published: (2024)
EnCLAP: Combining Neural Audio Codec and Audio-Text Joint Embedding for Automated Audio Captioning
by: Kim, Jaeyeon, et al.
Published: (2024)
Automatic Speech Recognition in the Modern Era: Architectures, Training, and Evaluation
by: Nayeem, Md., et al.
Published: (2025)
JASTIN: Aligning LLMs for Zero-Shot Audio and Speech Evaluation via Natural Language Instructions
by: Zhang, Leying, et al.
Published: (2026)
SAM Audio Judge: A Unified Multimodal Framework for Perceptual Evaluation of Audio Separation
by: Wang, Helin, et al.
Published: (2026)
NEST-RQ: Next Token Prediction for Speech Self-Supervised Pre-Training
by: Han, Minglun, et al.
Published: (2024)
Neural Speech Embeddings for Speech Synthesis Based on Deep Generative Networks
by: Lee, Seo-Hyun, et al.
Published: (2023)
Exploring Musical Roots: Applying Audio Embeddings to Empower Influence Attribution for a Generative Music Model
by: Barnett, Julia, et al.
Published: (2024)
Scenario of Use Scheme: Threat Model Specification for Speaker Privacy Protection in the Medical Domain
by: Rahman, Mehtab Ur, et al.
Published: (2024)
Fundamental Survey on Neuromorphic Based Audio Classification
by: Basu, Amlan, et al.
Published: (2025)
Investigation of Whisper ASR Hallucinations Induced by Non-Speech Audio
by: Barański, Mateusz, et al.
Published: (2025)
LLM-Guided Reinforcement Learning for Audio-Visual Speech Enhancement
by: Chen, Chih-Ning, et al.
Published: (2026)
Audio Deepfake Detection in the Age of Advanced Text-to-Speech models
by: Singh, Robin, et al.
Published: (2026)
Audio Codec Augmentation for Robust Collaborative Watermarking of Speech Synthesis
by: Juvela, Lauri, et al.
Published: (2024)
Does Current Deepfake Audio Detection Model Effectively Detect ALM-based Deepfake Audio?
by: Xie, Yuankun, et al.
Published: (2024)
Analysis and Evaluation of Synthetic Data Generation in Speech Dysfluency Detection
by: Zhang, Jinming, et al.
Published: (2025)
SLAP: Scalable Language-Audio Pretraining with Variable-Duration Audio and Multi-Objective Training
by: Mei, Xinhao, et al.
Published: (2026)
Embedding Alignment in Code Generation for Audio
by: Kouteili, Sam, et al.
Published: (2025)