| Main Authors: | Gołębiowska, Magdalena, Syga, Piotr |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.20229 |
Similar Items
Distilled HuBERT for Mobile Speech Emotion Recognition: A Cross-Corpus Validation Study
by: Ismail, Saifelden M.
Published: (2025)
An End-to-End Approach for Korean Wakeword Systems with Speaker Authentication
by: Seo, Geonwoo
Published: (2025)
ParaNoise-SV: Integrated Approach for Noise-Robust Speaker Verification with Parallel Joint Learning of Speech Enhancement and Noise Extraction
by: Kim, Minu, et al.
Published: (2025)
EMOVOME: A Dataset for Emotion Recognition in Spontaneous Real-Life Speech
by: Gómez-Zaragozá, Lucía, et al.
Published: (2024)
Audio-based Kinship Verification Using Age Domain Conversion
by: Sun, Qiyang, et al.
Published: (2024)
Impact of Phonetics on Speaker Identity in Adversarial Voice Attack
by: Dar, Daniyal Kabir, et al.
Published: (2025)
VocSim: A Training-free Benchmark for Zero-shot Content Identity in Single-source Audio
by: Basha, Maris, et al.
Published: (2025)
Crossing the Species Divide: Transfer Learning from Speech to Animal Sounds
by: Cauzinille, Jules, et al.
Published: (2025)
Hidden Echoes Survive Training in Audio To Audio Generative Instrument Models
by: Tralie, Christopher J., et al.
Published: (2024)
Revisiting SSL for sound event detection: complementary fusion and adaptive post-processing
by: Cui, Hanfang, et al.
Published: (2025)
Emotional Voice Messages (EMOVOME) database: emotion recognition in spontaneous voice messages
by: Zaragozá, Lucía Gómez, et al.
Published: (2024)
RARR : Robust Real-World Activity Recognition with Vibration by Scavenging Near-Surface Audio Online
by: Lee, Dong Yoon, et al.
Published: (2025)
How much to Dereverberate? Low-Latency Single-Channel Speech Enhancement in Distant Microphone Scenarios
by: Venkatesh, Satvik, et al.
Published: (2025)
Measuring Robustness of Speech Recognition from MEG Signals Under Distribution Shift
by: Chien, Sheng-You, et al.
Published: (2026)
Cepstral Smoothing of Binary Masks for Convolutive Blind Separation of Speech Mixtures
by: Missaoui, Ibrahim, et al.
Published: (2026)
Passive Underwater Acoustic Signal Separation based on Feature Decoupling Dual-path Network
by: Liu, Yucheng, et al.
Published: (2025)
Quantum-Enhanced Analysis and Grading of Vocal Performance
by: Agarwal, Rohan
Published: (2025)
SoundPlot: An Open-Source Framework for Birdsong Acoustic Analysis and Neural Synthesis with Interactive 3D Visualization
by: Mehdi, Naqcho Ali, et al.
Published: (2026)
Thaka at KSAA-2026 Task 2: Regularized Fine-Tuning for Arabic Speech Diacritization
by: Alamr, Meshal, et al.
Published: (2026)
A Bird Song Detector for improving bird identification through Deep Learning: a case study from Doñana
by: Márquez-Rodríguez, Alba, et al.
Published: (2025)
Hallucination Level of Artificial Intelligence Whisperer: Case Speech Recognizing Pantterinousut Rap Song
by: Horppu, Ismo, et al.
Published: (2025)
Proficiency-Aware Adaptation and Data Augmentation for Robust L2 ASR
by: Sun, Ling, et al.
Published: (2025)
Real-time Low-latency Music Source Separation using Hybrid Spectrogram-TasNet
by: Venkatesh, Satvik, et al.
Published: (2024)
Intracoronary Optical Coherence Tomography Image Processing and Vessel Classification Using Machine Learning
by: Lahchim, Amal, et al.
Published: (2026)
Prevailing Research Areas for Music AI in the Era of Foundation Models
by: Wei, Megan, et al.
Published: (2024)
Learning Alternative Ways of Performing a Task
by: Nieves, David, et al.
Published: (2024)
Quantization for OpenAI's Whisper Models: A Comparative Analysis
by: Andreyev, Allison
Published: (2025)
AG-REPA: Causal Layer Selection for Representation Alignment in Audio Flow Matching
by: Zhang, Pengfei, et al.
Published: (2026)
Physics Augmented Tuple Transformer for Autism Severity Level Detection
by: Ranasingha, Chinthaka, et al.
Published: (2024)
Event Detection via Probability Density Function Regression
by: Peng, Clark, et al.
Published: (2024)
Delayed Fusion: Integrating Large Language Models into First-Pass Decoding in End-to-end Speech Recognition
by: Hori, Takaaki, et al.
Published: (2025)
A Novel Global Context-aware Deep Neural Network for Enhanced Brain Tumor Segmentation using Magnetic Resonance Images
by: Mukherjee, Sourjya, et al.
Published: (2026)
Predicting Upcoming Stuttering Events from Three-Second Audio: Stratified Evaluation Reveals Severity-Selective Precursors, and the Model Deploys Fully On-Device
by: Kozak, Nazar
Published: (2026)
Machine Learning Framework for Audio-Based Content Evaluation using MFCC, Chroma, Spectral Contrast, and Temporal Feature Engineering
by: Aristorenas, Aris J.
Published: (2024)
Joint Estimation of Piano Dynamics and Metrical Structure with a Multi-task Multi-Scale Network
by: He, Zhanhong, et al.
Published: (2025)
Connected Speech-Based Cognitive Assessment in Chinese and English
by: Luz, Saturnino, et al.
Published: (2024)
HAELT: A Hybrid Attentive Ensemble Learning Transformer Framework for High-Frequency Stock Price Forecasting
by: Bui, Thanh Dan
Published: (2025)
PI-TTA: Physics-Informed Source-Free Test-Time Adaptation for Robust Human Activity Recognition on Mobile Devices
by: Li, Changyu, et al.
Published: (2026)
Uncertainty-Aware Transfer Learning for Cross-Building Energy Forecasting: Toward Robust and Scalable District-Level Energy Management
by: Zaregarizi, Shadmehr, et al.
Published: (2026)
EEG Sleep Stage Classification with Continuous Wavelet Transform and Deep Learning
by: Gashti, Mehdi Zekriyapanah, et al.
Published: (2025)