:: Library Catalog

Bibliographic Details
Main Authors:	Wahida, Farah; Chamikara, M. A. P.; Shanmugarasa, Yashothara; Chhetri, Mohan Baruwal; Ranbaduge, Thilina; Khalil, Ibrahim
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Sound; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2508.05409

Similar Items

Hidden in the Noise: Unveiling Backdoors in Audio LLMs Alignment through Latent Acoustic Pattern Triggers
by: Lin, Liang, et al.
Published: (2025)

Multichannel Voice Trigger Detection Based on Transform-average-concatenate
by: Higuchi, Takuya, et al.
Published: (2023)

Two-pass Endpoint Detection for Speech Recognition
by: Raju, Anirudh, et al.
Published: (2024)

Dual-Branch Knowledge Distillation for Noise-Robust Synthetic Speech Detection
by: Fan, Cunhang, et al.
Published: (2023)

Suppressing Noise Disparity in Training Data for Automatic Pathological Speech Detection
by: Amiri, Mahdi, et al.
Published: (2024)

CEC: A Noisy Label Detection Method for Speaker Recognition
by: Shen, Yao, et al.
Published: (2024)

Leveraging LLM and Text-Queried Separation for Noise-Robust Sound Event Detection
by: Yin, Han, et al.
Published: (2024)

Noise-Robust Contrastive Learning with an MFCC-Conformer For Coronary Artery Disease Detection
by: Marocchi, Milan, et al.
Published: (2026)

CTC Blank Triggered Dynamic Layer-Skipping for Efficient CTC-based Speech Recognition
by: Hou, Junfeng, et al.
Published: (2024)

Noise-Robust Sound Event Detection and Counting via Language-Queried Sound Separation
by: Chen, Yuanjian, et al.
Published: (2025)

Noisy Disentanglement with Tri-stage Training for Noise-Robust Speech Recognition
by: Chen, Shuangyuan, et al.
Published: (2025)

Leveraging Mamba with Full-Face Vision for Audio-Visual Speech Enhancement
by: Chao, Rong, et al.
Published: (2025)

Face-Voice Association for Audiovisual Active Speaker Detection in Egocentric Recordings
by: Clarke, Jason, et al.
Published: (2025)

Findings of the 2024 Mandarin Stuttering Event Detection and Automatic Speech Recognition Challenge
by: Xue, Hongfei, et al.
Published: (2024)

Leveraging LLM for Stuttering Speech: A Unified Architecture Bridging Recognition and Event Detection
by: Huang, Shangkun, et al.
Published: (2025)

Testing Correctness, Fairness, and Robustness of Speech Emotion Recognition Models
by: Derington, Anna, et al.
Published: (2023)

Retrieval Augmented Correction of Named Entity Speech Recognition Errors
by: Pusateri, Ernest, et al.
Published: (2024)

Descriptor: Extended-Length Audio Dataset for Synthetic Voice Detection and Speaker Recognition (ELAD-SVDSR)
by: Vijaykumar, Rahul, et al.
Published: (2025)

Enhancing Automatic Speech Recognition Through Integrated Noise Detection Architecture
by: Singh, Karamvir
Published: (2025)

Misophonia Trigger Sound Detection on Synthetic Soundscapes Using a Hybrid Model with a Frozen Pre-Trained CNN and a Time-Series Module
by: Sashida, Kurumi, et al.
Published: (2026)

Description and Discussion on DCASE 2026 Challenge Task 2: Noise-aware Unsupervised Anomalous Sound Detection for Machine Condition Monitoring
by: Nishida, Tomoya, et al.
Published: (2026)

End-to-End Integration of Speech Emotion Recognition with Voice Activity Detection using Self-Supervised Learning Features
by: Yamashita, Natsuo, et al.
Published: (2024)

Gradient Norm-based Fine-Tuning for Backdoor Defense in Automatic Speech Recognition
by: Zhou, Nanjun, et al.
Published: (2025)

Towards Robust Dysarthric Speech Recognition: LLM-Agent Post-ASR Correction Beyond WER
by: Zheng, Xiuwen, et al.
Published: (2026)

MMGER: Multi-modal and Multi-granularity Generative Error Correction with LLM for Joint Accent and Speech Recognition
by: Mu, Bingshen, et al.
Published: (2024)

Adaptive Noise Resilient Keyword Spotting Using One-Shot Learning
by: Martinez-Rau, Luciano Sebastian, et al.
Published: (2025)

Imperceptible Rhythm Backdoor Attacks: Exploring Rhythm Transformation for Embedding Undetectable Vulnerabilities on Speech Recognition
by: Yao, Wenhan, et al.
Published: (2024)

Exploration of Adapter for Noise Robust Automatic Speech Recognition
by: Shi, Hao, et al.
Published: (2024)

Mixture of LoRA Experts with Multi-Modal and Multi-Granularity LLM Generative Error Correction for Accented Speech Recognition
by: Mu, Bingshen, et al.
Published: (2025)

From Audio Deepfake Detection to AI-Generated Music Detection -- A Pathway and Overview
by: Li, Yupei, et al.
Published: (2024)

Pitch Accent Detection improves Pretrained Automatic Speech Recognition
by: Sasu, David, et al.
Published: (2025)

Unified Audio Event Detection
by: Jiang, Yidi, et al.
Published: (2024)

Generalizable Detection of Audio Deepfakes
by: Lopez, Jose A., et al.
Published: (2025)

Detection of Deepfake Environmental Audio
by: Ouajdi, Hafsa, et al.
Published: (2024)

Automatic Pronunciation Error Detection and Correction of the Holy Quran's Learners Using Deep Learning
by: Abdelfattah, Abdullah, et al.
Published: (2025)

Personalized Speech Emotion Recognition in Human-Robot Interaction using Vision Transformers
by: Mishra, Ruchik, et al.
Published: (2024)

Water Flow Detection Device Based on Sound Data Analysis and Machine Learning to Detect Water Leakage
by: Pourmehrani, Hossein, et al.
Published: (2025)

Speech as a Biomarker for Disease Detection
by: Botelho, Catarina, et al.
Published: (2024)

ASPED: An Audio Dataset for Detecting Pedestrians
by: Seshadri, Pavan, et al.
Published: (2023)

Mind the Gap: Detecting Cluster Exits for Robust Local Density-Based Score Normalization in Anomalous Sound Detection
by: Wilkinghoff, Kevin, et al.
Published: (2026)