| Main Authors: | Zhao, Aite, Liu, Yongcan, Yu, Xinglin, Xing, Xinyue |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.10703 |
Similar Items
SleepGMUformer: A gated multimodal temporal neural network for sleep staging
by: Zhao, Chenjun, et al.
Published: (2025)
Synthetic data enables context-aware bioacoustic sound event detection
by: Hoffman, Benjamin, et al.
Published: (2025)
Determining the severity of Parkinson's disease in patients using a multi task neural network
by: García-Ordás, María Teresa, et al.
Published: (2024)
Optimising MFCC parameters for the automatic detection of respiratory diseases
by: Yan, Yuyang, et al.
Published: (2024)
Evaluating Echo State Network for Parkinson's Disease Prediction using Voice Features
by: Hosseininian, Seyedeh Zahra Seyedi, et al.
Published: (2024)
A multimodal Bayesian Network for symptom-level depression and anxiety prediction from voice and speech data
by: Norbury, Agnes, et al.
Published: (2025)
Acoustic evaluation of a neural network dedicated to the detection of animal vocalisations
by: Rouch, Jérémy, et al.
Published: (2025)
An AI-enabled Bias-Free Respiratory Disease Diagnosis Model using Cough Audio: A Case Study for COVID-19
by: Saeed, Tabish, et al.
Published: (2024)
A multimodal dynamical variational autoencoder for audiovisual speech representation learning
by: Sadok, Samir, et al.
Published: (2023)
Robust detection of overlapping bioacoustic sound events
by: Mahon, Louis, et al.
Published: (2025)
Speech foundation models on intelligibility prediction for hearing-impaired listeners
by: Cuervo, Santiago, et al.
Published: (2024)
Adaptive vector steering: A training-free, layer-wise intervention for hallucination mitigation in large audio and multimodal models
by: Lin, Tsung-En, et al.
Published: (2025)
Echo: Towards Advanced Audio Comprehension via Audio-Interleaved Reasoning
by: Wu, Daiqing, et al.
Published: (2026)
Sparse deepfake detection promotes better disentanglement
by: Teissier, Antoine, et al.
Published: (2025)
SpikCommander: A High-performance Spiking Transformer with Multi-view Learning for Efficient Speech Command Recognition
by: Wang, Jiaqi, et al.
Published: (2025)
A Semi-Supervised Framework for Speech Confidence Detection using Whisper
by: Wynn, Adam, et al.
Published: (2026)
An Attention Long Short-Term Memory based system for automatic classification of speech intelligibility
by: Fernández-Díaz, Miguel, et al.
Published: (2024)
A contrastive-learning approach for auditory attention detection
by: Bajestan, Seyed Ali Alavi, et al.
Published: (2024)
Decodable but not structured: linear probing enables Underwater Acoustic Target Recognition with pretrained audio embeddings
by: Hummel, Hilde I., et al.
Published: (2026)
BenSParX: A Robust Explainable Machine Learning Framework for Parkinson's Disease Detection from Bengali Conversational Speech
by: Hossain, Riad, et al.
Published: (2025)
ADNAC: Audio Denoiser using Neural Audio Codec
by: Jimon, Daniel, et al.
Published: (2025)
Selfsupervised learning for pathological speech detection
by: Sheikh, Shakeel Ahmad
Published: (2024)
High-Fidelity Music Vocoder using Neural Audio Codecs
by: Lanzendörfer, Luca A., et al.
Published: (2025)
Efficient Continual Learning in Keyword Spotting using Binary Neural Networks
by: Vu, Quynh Nguyen-Phuong, et al.
Published: (2025)
Denoising by neural network for muzzle blast detection
by: Pujol, Hadrien, et al.
Published: (2025)
Cough activity detection for automatic tuberculosis screening
by: van Vüren, Joshua Jansen, et al.
Published: (2026)
SAO-Instruct: Free-form Audio Editing using Natural Language Instructions
by: Ungersböck, Michael, et al.
Published: (2025)
Multi-Task Learning for Lung sound & Lung disease classification
by: K V, Suma, et al.
Published: (2024)
Investigating the Effectiveness of Explainability Methods in Parkinson's Detection from Speech
by: Mancini, Eleonora, et al.
Published: (2024)
voice2mode: Phonation Mode Classification in Singing using Self-Supervised Speech Models
by: Justus, Aju Ani, et al.
Published: (2026)
Towards generalizing deep-audio fake detection networks
by: Gasenzer, Konstantin, et al.
Published: (2023)
A benchmark of state-of-the-art sound event detection systems evaluated on synthetic soundscapes
by: Ronchini, Francesca, et al.
Published: (2022)
Surface impedance inference via neural fields and sparse acoustic data obtained by a compact array
by: Xia, Yuanxin, et al.
Published: (2026)
Unsupervised outlier detection to improve bird audio dataset labels
by: Collins, Bruce
Published: (2025)
Learning to rumble: Automated elephant call classification, detection and endpointing using deep architectures
by: Geldenhuys, Christiaan M., et al.
Published: (2024)
A Novel Fusion Architecture for PD Detection Using Semi-Supervised Speech Embeddings
by: Adnan, Tariq, et al.
Published: (2024)
Unleashing the Power of Natural Audio Featuring Multiple Sound Sources
by: Cheng, Xize, et al.
Published: (2025)
Generalizable speech deepfake detection via meta-learned LoRA
by: Laakkonen, Janne, et al.
Published: (2025)
The impact of non-target events in synthetic soundscapes for sound event detection
by: Ronchini, Francesca, et al.
Published: (2021)
Voxceleb-ESP: preliminary experiments detecting Spanish celebrities from their voices
by: Labrador, Beltrán, et al.
Published: (2023)