:: Library Catalog

Bibliographic Details
Main Authors:	Gołębiowska, Magdalena; Syga, Piotr
Format:	Preprint
Published:	2026
Subjects:	Sound; Artificial Intelligence; I.5.4
Online Access:	https://arxiv.org/abs/2604.20229

Similar Items

Distilled HuBERT for Mobile Speech Emotion Recognition: A Cross-Corpus Validation Study
by: Ismail, Saifelden M.
Published: (2025)

An End-to-End Approach for Korean Wakeword Systems with Speaker Authentication
by: Seo, Geonwoo
Published: (2025)

ParaNoise-SV: Integrated Approach for Noise-Robust Speaker Verification with Parallel Joint Learning of Speech Enhancement and Noise Extraction
by: Kim, Minu, et al.
Published: (2025)

EMOVOME: A Dataset for Emotion Recognition in Spontaneous Real-Life Speech
by: Gómez-Zaragozá, Lucía, et al.
Published: (2024)

Audio-based Kinship Verification Using Age Domain Conversion
by: Sun, Qiyang, et al.
Published: (2024)

Impact of Phonetics on Speaker Identity in Adversarial Voice Attack
by: Dar, Daniyal Kabir, et al.
Published: (2025)

VocSim: A Training-free Benchmark for Zero-shot Content Identity in Single-source Audio
by: Basha, Maris, et al.
Published: (2025)

Crossing the Species Divide: Transfer Learning from Speech to Animal Sounds
by: Cauzinille, Jules, et al.
Published: (2025)

Hidden Echoes Survive Training in Audio To Audio Generative Instrument Models
by: Tralie, Christopher J., et al.
Published: (2024)

Revisiting SSL for sound event detection: complementary fusion and adaptive post-processing
by: Cui, Hanfang, et al.
Published: (2025)

Emotional Voice Messages (EMOVOME) database: emotion recognition in spontaneous voice messages
by: Zaragozá, Lucía Gómez, et al.
Published: (2024)

RARR: Robust Real-World Activity Recognition with Vibration by Scavenging Near-Surface Audio Online
by: Lee, Dong Yoon, et al.
Published: (2025)

How much to Dereverberate? Low-Latency Single-Channel Speech Enhancement in Distant Microphone Scenarios
by: Venkatesh, Satvik, et al.
Published: (2025)

Measuring Robustness of Speech Recognition from MEG Signals Under Distribution Shift
by: Chien, Sheng-You, et al.
Published: (2026)

Cepstral Smoothing of Binary Masks for Convolutive Blind Separation of Speech Mixtures
by: Missaoui, Ibrahim, et al.
Published: (2026)

Passive Underwater Acoustic Signal Separation based on Feature Decoupling Dual-path Network
by: Liu, Yucheng, et al.
Published: (2025)

Quantum-Enhanced Analysis and Grading of Vocal Performance
by: Agarwal, Rohan
Published: (2025)

SoundPlot: An Open-Source Framework for Birdsong Acoustic Analysis and Neural Synthesis with Interactive 3D Visualization
by: Mehdi, Naqcho Ali, et al.
Published: (2026)

Thaka at KSAA-2026 Task 2: Regularized Fine-Tuning for Arabic Speech Diacritization
by: Alamr, Meshal, et al.
Published: (2026)

A Bird Song Detector for improving bird identification through Deep Learning: a case study from Doñana
by: Márquez-Rodríguez, Alba, et al.
Published: (2025)

Hallucination Level of Artificial Intelligence Whisperer: Case Speech Recognizing Pantterinousut Rap Song
by: Horppu, Ismo, et al.
Published: (2025)

Proficiency-Aware Adaptation and Data Augmentation for Robust L2 ASR
by: Sun, Ling, et al.
Published: (2025)

Real-time Low-latency Music Source Separation using Hybrid Spectrogram-TasNet
by: Venkatesh, Satvik, et al.
Published: (2024)

Intracoronary Optical Coherence Tomography Image Processing and Vessel Classification Using Machine Learning
by: Lahchim, Amal, et al.
Published: (2026)

Prevailing Research Areas for Music AI in the Era of Foundation Models
by: Wei, Megan, et al.
Published: (2024)

Learning Alternative Ways of Performing a Task
by: Nieves, David, et al.
Published: (2024)

Quantization for OpenAI's Whisper Models: A Comparative Analysis
by: Andreyev, Allison
Published: (2025)

AG-REPA: Causal Layer Selection for Representation Alignment in Audio Flow Matching
by: Zhang, Pengfei, et al.
Published: (2026)

Physics Augmented Tuple Transformer for Autism Severity Level Detection
by: Ranasingha, Chinthaka, et al.
Published: (2024)

Event Detection via Probability Density Function Regression
by: Peng, Clark, et al.
Published: (2024)

Delayed Fusion: Integrating Large Language Models into First-Pass Decoding in End-to-end Speech Recognition
by: Hori, Takaaki, et al.
Published: (2025)

A Novel Global Context-aware Deep Neural Network for Enhanced Brain Tumor Segmentation using Magnetic Resonance Images
by: Mukherjee, Sourjya, et al.
Published: (2026)

Predicting Upcoming Stuttering Events from Three-Second Audio: Stratified Evaluation Reveals Severity-Selective Precursors, and the Model Deploys Fully On-Device
by: Kozak, Nazar
Published: (2026)

Machine Learning Framework for Audio-Based Content Evaluation using MFCC, Chroma, Spectral Contrast, and Temporal Feature Engineering
by: Aristorenas, Aris J.
Published: (2024)

Joint Estimation of Piano Dynamics and Metrical Structure with a Multi-task Multi-Scale Network
by: He, Zhanhong, et al.
Published: (2025)

Connected Speech-Based Cognitive Assessment in Chinese and English
by: Luz, Saturnino, et al.
Published: (2024)

HAELT: A Hybrid Attentive Ensemble Learning Transformer Framework for High-Frequency Stock Price Forecasting
by: Bui, Thanh Dan
Published: (2025)

PI-TTA: Physics-Informed Source-Free Test-Time Adaptation for Robust Human Activity Recognition on Mobile Devices
by: Li, Changyu, et al.
Published: (2026)

Uncertainty-Aware Transfer Learning for Cross-Building Energy Forecasting: Toward Robust and Scalable District-Level Energy Management
by: Zaregarizi, Shadmehr, et al.
Published: (2026)

EEG Sleep Stage Classification with Continuous Wavelet Transform and Deep Learning
by: Gashti, Mehdi Zekriyapanah, et al.
Published: (2025)