Library Catalog

Bibliographic Details
Main Authors:	Piao, Ran; Lu, Yuan; Kemps, Hareld; Xia, Tong; Saeed, Aaqib
Format:	Preprint
Published:	2025
Subjects:	Sound; Machine Learning
Online Access:	https://arxiv.org/abs/2508.20717

Similar Items

RespLLM: Unifying Audio and Text with Multimodal LLMs for Generalized Respiratory Health Prediction
by: Zhang, Yuwei, et al.
Published: (2024)

Towards Open Respiratory Acoustic Foundation Models: Pretraining and Benchmarking
by: Zhang, Yuwei, et al.
Published: (2024)

StethoLM: Audio Language Model for Cardiopulmonary Analysis Across Clinical Tasks
by: Wang, Yishan, et al.
Published: (2026)

Towards Robust Assessment of Pathological Voices via Combined Low-Level Descriptors and Foundation Model Representations
by: Ariyanti, Whenty, et al.
Published: (2025)

Lightweight and Generalizable Acoustic Scene Representations via Contrastive Fine-Tuning and Distillation
by: Yuan, Kuang, et al.
Published: (2025)

AI-Driven Acoustic Voice Biomarker-Based Hierarchical Classification of Benign Laryngeal Voice Disorders from Sustained Vowels
by: Annabestani, Mohsen, et al.
Published: (2025)

RA-QA: A Benchmarking System for Respiratory Audio Question Answering Under Real-World Heterogeneity
by: Bertolino, Gaia A., et al.
Published: (2026)

A Multimodal Framework for Dementia Detection via Linguistic and Acoustic Representation Learning
by: Ilias, Loukas, et al.
Published: (2026)

UniPACT: A Multimodal Framework for Prognostic Question Answering on Raw ECG and Structured EHR
by: Tang, Jialu, et al.
Published: (2026)

Benchmarking Representations for Speech, Music, and Acoustic Events
by: La Quatra, Moreno, et al.
Published: (2024)

Underwater Acoustic Target Recognition based on Smoothness-inducing Regularization and Spectrogram-based Data Augmentation
by: Xu, Ji, et al.
Published: (2023)

Automated Dysphagia Screening Using Noninvasive Neck Acoustic Sensing
by: Chng, Jade, et al.
Published: (2026)

Reproducible Machine Learning-based Voice Pathology Detection: Introducing the Pitch Difference Feature
by: Vrba, Jan, et al.
Published: (2024)

An AI-enabled Bias-Free Respiratory Disease Diagnosis Model using Cough Audio: A Case Study for COVID-19
by: Saeed, Tabish, et al.
Published: (2024)

SVSNet+: Enhancing Speaker Voice Similarity Assessment Models with Representations from Speech Foundation Models
by: Yin, Chun, et al.
Published: (2024)

Improving Deep Learning-based Respiratory Sound Analysis with Frequency Selection and Attention Mechanism
by: Fraihi, Nouhaila, et al.
Published: (2025)

Generative Multi-modal Feedback for Singing Voice Synthesis Evaluation
by: Li, Xueyan, et al.
Published: (2025)

Singing Voice Conversion with Accompaniment Using Self-Supervised Representation-Based Melody Features
by: Chen, Wei, et al.
Published: (2025)

Electrocardiogram Report Generation and Question Answering via Retrieval-Augmented Self-Supervised Modeling
by: Tang, Jialu, et al.
Published: (2024)

Electrocardiogram-Language Model for Few-Shot Question Answering with Meta Learning
by: Tang, Jialu, et al.
Published: (2024)

OpenVoice: Versatile Instant Voice Cloning
by: Qin, Zengyi, et al.
Published: (2023)

Disentangling Textual and Acoustic Features of Neural Speech Representations
by: Mohebbi, Hosein, et al.
Published: (2024)

Adaptive Test-Time Scaling for Zero-Shot Respiratory Audio Classification
by: Wang, Tsai-Ning, et al.
Published: (2026)

Acoustic evaluation of a neural network dedicated to the detection of animal vocalisations
by: Rouch, Jérémy, et al.
Published: (2025)

Voice Biomarkers for Depression and Anxiety
by: Abramenko, Oleksii, et al.
Published: (2026)

Low-Resource Cross-Domain Singing Voice Synthesis via Reduced Self-Supervised Speech Representations
by: Kakoulidis, Panos, et al.
Published: (2024)

Single Microphone Own Voice Detection based on Simulated Transfer Functions for Hearing Aids
by: Mayuravaani, Mathuranathan, et al.
Published: (2026)

Boosting the Transferability of Audio Adversarial Examples with Acoustic Representation Optimization
by: Jin, Weifei, et al.
Published: (2025)

Respiratory Disease Classification and Biometric Analysis Using Biosignals from Digital Stethoscopes
by: Casado, Constantino Álvarez, et al.
Published: (2023)

LibriVAD: A Scalable Open Dataset with Deep Learning Benchmarks for Voice Activity Detection
by: Stylianou, Ioannis, et al.
Published: (2025)

Temporal Convolution-based Hybrid Model Approach with Representation Learning for Real-Time Acoustic Anomaly Detection
by: Dissanayaka, Sahan, et al.
Published: (2024)

Investigation for Relative Voice Impression Estimation
by: Fujita, Kenichi, et al.
Published: (2026)

Underwater-Art: Expanding Information Perspectives With Text Templates For Underwater Acoustic Target Recognition
by: Xie, Yuan, et al.
Published: (2023)

MambaVoiceCloning: Efficient and Expressive Text-to-Speech via State-Space Modeling and Diffusion Control
by: Kumar, Sahil, et al.
Published: (2026)

Hankel-FNO: Fast Underwater Acoustic Charting Via Physics-Encoded Fourier Neural Operator
by: Sun, Yifan, et al.
Published: (2025)

Parameter-efficient Dual-encoder Architecture with Differentiable Choquet Integral Fusion for Underwater Acoustic Classification
by: Mohammadi, Amirmohammad, et al.
Published: (2026)

Distributed Acoustic Sensing for Urban Traffic Monitoring: Spatio-Temporal Attention in Recurrent Neural Networks
by: Fakhruzi, Izhan, et al.
Published: (2026)

NaturalVoices: A Large-Scale, Spontaneous and Emotional Podcast Dataset for Voice Conversion
by: Du, Zongyang, et al.
Published: (2025)

Unsupervised Acoustic Scene Mapping Based on Acoustic Features and Dimensionality Reduction
by: Cohen, Idan, et al.
Published: (2023)

VANPY: Voice Analysis Framework
by: Koushnir, Gregory, et al.
Published: (2025)