| Main Authors: | Piao, Ran, Lu, Yuan, Kemps, Hareld, Xia, Tong, Saeed, Aaqib |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2508.20717 |
Similar Items
RespLLM: Unifying Audio and Text with Multimodal LLMs for Generalized Respiratory Health Prediction
by: Zhang, Yuwei, et al.
Published: (2024)
Towards Open Respiratory Acoustic Foundation Models: Pretraining and Benchmarking
by: Zhang, Yuwei, et al.
Published: (2024)
StethoLM: Audio Language Model for Cardiopulmonary Analysis Across Clinical Tasks
by: Wang, Yishan, et al.
Published: (2026)
Towards Robust Assessment of Pathological Voices via Combined Low-Level Descriptors and Foundation Model Representations
by: Ariyanti, Whenty, et al.
Published: (2025)
Lightweight and Generalizable Acoustic Scene Representations via Contrastive Fine-Tuning and Distillation
by: Yuan, Kuang, et al.
Published: (2025)
AI-Driven Acoustic Voice Biomarker-Based Hierarchical Classification of Benign Laryngeal Voice Disorders from Sustained Vowels
by: Annabestani, Mohsen, et al.
Published: (2025)
RA-QA: A Benchmarking System for Respiratory Audio Question Answering Under Real-World Heterogeneity
by: Bertolino, Gaia A., et al.
Published: (2026)
A Multimodal Framework for Dementia Detection via Linguistic and Acoustic Representation Learning
by: Ilias, Loukas, et al.
Published: (2026)
UniPACT: A Multimodal Framework for Prognostic Question Answering on Raw ECG and Structured EHR
by: Tang, Jialu, et al.
Published: (2026)
Benchmarking Representations for Speech, Music, and Acoustic Events
by: La Quatra, Moreno, et al.
Published: (2024)
Underwater Acoustic Target Recognition based on Smoothness-inducing Regularization and Spectrogram-based Data Augmentation
by: Xu, Ji, et al.
Published: (2023)
Automated Dysphagia Screening Using Noninvasive Neck Acoustic Sensing
by: Chng, Jade, et al.
Published: (2026)
Reproducible Machine Learning-based Voice Pathology Detection: Introducing the Pitch Difference Feature
by: Vrba, Jan, et al.
Published: (2024)
An AI-enabled Bias-Free Respiratory Disease Diagnosis Model using Cough Audio: A Case Study for COVID-19
by: Saeed, Tabish, et al.
Published: (2024)
SVSNet+: Enhancing Speaker Voice Similarity Assessment Models with Representations from Speech Foundation Models
by: Yin, Chun, et al.
Published: (2024)
Improving Deep Learning-based Respiratory Sound Analysis with Frequency Selection and Attention Mechanism
by: Fraihi, Nouhaila, et al.
Published: (2025)
Generative Multi-modal Feedback for Singing Voice Synthesis Evaluation
by: Li, Xueyan, et al.
Published: (2025)
Singing Voice Conversion with Accompaniment Using Self-Supervised Representation-Based Melody Features
by: Chen, Wei, et al.
Published: (2025)
Electrocardiogram Report Generation and Question Answering via Retrieval-Augmented Self-Supervised Modeling
by: Tang, Jialu, et al.
Published: (2024)
Electrocardiogram-Language Model for Few-Shot Question Answering with Meta Learning
by: Tang, Jialu, et al.
Published: (2024)
OpenVoice: Versatile Instant Voice Cloning
by: Qin, Zengyi, et al.
Published: (2023)
Disentangling Textual and Acoustic Features of Neural Speech Representations
by: Mohebbi, Hosein, et al.
Published: (2024)
Adaptive Test-Time Scaling for Zero-Shot Respiratory Audio Classification
by: Wang, Tsai-Ning, et al.
Published: (2026)
Acoustic evaluation of a neural network dedicated to the detection of animal vocalisations
by: Rouch, Jérémy, et al.
Published: (2025)
Voice Biomarkers for Depression and Anxiety
by: Abramenko, Oleksii, et al.
Published: (2026)
Low-Resource Cross-Domain Singing Voice Synthesis via Reduced Self-Supervised Speech Representations
by: Kakoulidis, Panos, et al.
Published: (2024)
Single Microphone Own Voice Detection based on Simulated Transfer Functions for Hearing Aids
by: Mayuravaani, Mathuranathan, et al.
Published: (2026)
Boosting the Transferability of Audio Adversarial Examples with Acoustic Representation Optimization
by: Jin, Weifei, et al.
Published: (2025)
Respiratory Disease Classification and Biometric Analysis Using Biosignals from Digital Stethoscopes
by: Casado, Constantino Álvarez, et al.
Published: (2023)
LibriVAD: A Scalable Open Dataset with Deep Learning Benchmarks for Voice Activity Detection
by: Stylianou, Ioannis, et al.
Published: (2025)
Temporal Convolution-based Hybrid Model Approach with Representation Learning for Real-Time Acoustic Anomaly Detection
by: Dissanayaka, Sahan, et al.
Published: (2024)
Investigation for Relative Voice Impression Estimation
by: Fujita, Kenichi, et al.
Published: (2026)
Underwater-Art: Expanding Information Perspectives With Text Templates For Underwater Acoustic Target Recognition
by: Xie, Yuan, et al.
Published: (2023)
MambaVoiceCloning: Efficient and Expressive Text-to-Speech via State-Space Modeling and Diffusion Control
by: Kumar, Sahil, et al.
Published: (2026)
Hankel-FNO: Fast Underwater Acoustic Charting Via Physics-Encoded Fourier Neural Operator
by: Sun, Yifan, et al.
Published: (2025)
Parameter-efficient Dual-encoder Architecture with Differentiable Choquet Integral Fusion for Underwater Acoustic Classification
by: Mohammadi, Amirmohammad, et al.
Published: (2026)
Distributed Acoustic Sensing for Urban Traffic Monitoring: Spatio-Temporal Attention in Recurrent Neural Networks
by: Fakhruzi, Izhan, et al.
Published: (2026)
NaturalVoices: A Large-Scale, Spontaneous and Emotional Podcast Dataset for Voice Conversion
by: Du, Zongyang, et al.
Published: (2025)
Unsupervised Acoustic Scene Mapping Based on Acoustic Features and Dimensionality Reduction
by: Cohen, Idan, et al.
Published: (2023)
VANPY: Voice Analysis Framework
by: Koushnir, Gregory, et al.
Published: (2025)