:: Library Catalog

Bibliographic Details
Main Authors:	Deshpande, Gauri; Battula, Harish; Panda, Ashish; Kopparapu, Sunil Kumar
Format:	Preprint
Published:	2025
Subjects:	Sound; Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2512.02669

Similar Items

Emotion-Disentangled Embedding Alignment for Noise-Robust and Cross-Corpus Speech Emotion Recognition
by: Tiwari, Upasana, et al.
Published: (2025)

A cost minimization approach to fix the vocabulary size in a tokenizer for an End-to-End ASR system
by: Kopparapu, Sunil Kumar, et al.
Published: (2024)

Unifying EEG and Speech for Emotion Recognition: A Two-Step Joint Learning Framework for Handling Missing EEG Data During Inference
by: Tiwari, Upasana, et al.
Published: (2025)

Spoken Grammar Assessment Using LLM
by: Kopparapu, Sunil Kumar, et al.
Published: (2024)

TSPE: Task-Specific Prompt Ensemble for Improved Zero-Shot Audio Classification
by: Anand, Nishit, et al.
Published: (2024)

Privacy-Enhancing Infant Cry Classification with Federated Transformers and Denoising Regularization
by: Owino, Geofrey, et al.
Published: (2025)

Real-Time Voicemail Detection in Telephony Audio Using Temporal Speech Activity Features
by: Saurav, Kumar
Published: (2026)

Improving Underwater Acoustic Classification Through Learnable Gabor Filter Convolution and Attention Mechanisms
by: Domingos, Lucas Cesar Ferreira, et al.
Published: (2025)

DFKI-Speech System for WildSpoof Challenge: A robust framework for SASV In-the-Wild
by: Das, Arnab, et al.
Published: (2026)

AI-Driven Acoustic Voice Biomarker-Based Hierarchical Classification of Benign Laryngeal Voice Disorders from Sustained Vowels
by: Annabestani, Mohsen, et al.
Published: (2025)

Linguistic and Audio Embedding-Based Machine Learning for Alzheimer's Dementia and Mild Cognitive Impairment Detection: Insights from the PROCESS Challenge
by: Devahi, Adharsha Sam Edwin Sam, et al.
Published: (2025)

Tri-MTL: A Triple Multitask Learning Approach for Respiratory Disease Diagnosis
by: Kim, June-Woo, et al.
Published: (2025)

Logit Distillation on Manifolds: Mapping by Learning
by: Yang, Yiru, et al.
Published: (2026)

SALF-MOS: Speaker Agnostic Latent Features Downsampled for MOS Prediction
by: Agrawal, Saurabh, et al.
Published: (2025)

Multi-Task Learning for Lung sound & Lung disease classification
by: K V, Suma, et al.
Published: (2024)

A Calculus-Based Framework for Determining Vocabulary Size in End-to-End ASR
by: Kopparapu, Sunil Kumar
Published: (2026)

EH-MAM: Easy-to-Hard Masked Acoustic Modeling for Self-Supervised Speech Representation Learning
by: Seth, Ashish, et al.
Published: (2024)

Progressive Rock Music Classification
by: Nagar, Arpan, et al.
Published: (2025)

H-Infinity Filter Enhanced CNN-LSTM for Arrhythmia Detection from Heart Sound Recordings
by: Kumar, Rohith Shinoj, et al.
Published: (2025)

Explainability of CNN Based Classification Models for Acoustic Signal
by: Faruqui, Zubair, et al.
Published: (2025)

AFEN: Respiratory Disease Classification using Ensemble Learning
by: Nadkarni, Rahul, et al.
Published: (2024)

Imagined Speech State Classification for Robust Brain-Computer Interface
by: Ko, Byung-Kwan, et al.
Published: (2024)

Innovative Speech-Based Deep Learning Approaches for Parkinson's Disease Classification: A Systematic Review
by: van Gelderen, Lisanne, et al.
Published: (2024)

Thinking While Listening: Simple Test Time Scaling For Audio Classification
by: Verma, Prateek, et al.
Published: (2025)

Synthetic Data Augmentation for Medical Audio Classification: A Preliminary Evaluation
by: McShannon, David, et al.
Published: (2026)

Device-Robust Acoustic Scene Classification via Impulse Response Augmentation
by: Morocutti, Tobias, et al.
Published: (2023)

Beyond saliency: enhancing explanation of speech emotion recognition with expert-referenced acoustic cues
by: Nasr, Seham, et al.
Published: (2025)

Preference-Based Learning in Audio Applications: A Systematic Analysis
by: Broukhim, Aaron, et al.
Published: (2025)

Model Merging Improves Zero-Shot Generalization in Bioacoustic Foundation Models
by: Marincione, Davide, et al.
Published: (2025)

Aria-MIDI: A Dataset of Piano MIDI Files for Symbolic Music Modeling
by: Bradshaw, Louis, et al.
Published: (2025)

Flowing Straighter with Conditional Flow Matching for Accurate Speech Enhancement
by: Cross, Mattias, et al.
Published: (2025)

AudioCodecBench: A Comprehensive Benchmark for Audio Codec Evaluation
by: Wang, Lu, et al.
Published: (2025)

AUDETER: A Large-scale Dataset for Deepfake Audio Detection in Open Worlds
by: Wang, Qizhou, et al.
Published: (2025)

Who Will Top the Charts? Multimodal Music Popularity Prediction via Adaptive Fusion of Modality Experts and Temporal Engagement Modeling
by: Choudhary, Yash, et al.
Published: (2025)

Hookpad Aria: A Copilot for Songwriters
by: Donahue, Chris, et al.
Published: (2025)

DAFMSVC: One-Shot Singing Voice Conversion with Dual Attention Mechanism and Flow Matching
by: Chen, Wei, et al.
Published: (2025)

QAMRO: Quality-aware Adaptive Margin Ranking Optimization for Human-aligned Assessment of Audio Generation Systems
by: Wang, Chien-Chun, et al.
Published: (2025)

Survey on the Evaluation of Generative Models in Music
by: Lerch, Alexander, et al.
Published: (2025)

Evaluation of Deep Audio Representations for Hearables
by: Gröger, Fabian, et al.
Published: (2025)

Explicit Context-Driven Neural Acoustic Modeling for High-Fidelity RIR Generation
by: Si, Chen, et al.
Published: (2025)