| Main Authors: | Deshpande, Gauri, Battula, Harish, Panda, Ashish, Kopparapu, Sunil Kumar |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2512.02669 |
Similar Items
Emotion-Disentangled Embedding Alignment for Noise-Robust and Cross-Corpus Speech Emotion Recognition
by: Tiwari, Upasana, et al.
Published: (2025)
A cost minimization approach to fix the vocabulary size in a tokenizer for an End-to-End ASR system
by: Kopparapu, Sunil Kumar, et al.
Published: (2024)
Unifying EEG and Speech for Emotion Recognition: A Two-Step Joint Learning Framework for Handling Missing EEG Data During Inference
by: Tiwari, Upasana, et al.
Published: (2025)
Spoken Grammar Assessment Using LLM
by: Kopparapu, Sunil Kumar, et al.
Published: (2024)
TSPE: Task-Specific Prompt Ensemble for Improved Zero-Shot Audio Classification
by: Anand, Nishit, et al.
Published: (2024)
Privacy-Enhancing Infant Cry Classification with Federated Transformers and Denoising Regularization
by: Owino, Geofrey, et al.
Published: (2025)
Real-Time Voicemail Detection in Telephony Audio Using Temporal Speech Activity Features
by: Saurav, Kumar
Published: (2026)
Improving Underwater Acoustic Classification Through Learnable Gabor Filter Convolution and Attention Mechanisms
by: Domingos, Lucas Cesar Ferreira, et al.
Published: (2025)
DFKI-Speech System for WildSpoof Challenge: A robust framework for SASV In-the-Wild
by: Das, Arnab, et al.
Published: (2026)
AI-Driven Acoustic Voice Biomarker-Based Hierarchical Classification of Benign Laryngeal Voice Disorders from Sustained Vowels
by: Annabestani, Mohsen, et al.
Published: (2025)
Linguistic and Audio Embedding-Based Machine Learning for Alzheimer's Dementia and Mild Cognitive Impairment Detection: Insights from the PROCESS Challenge
by: Devahi, Adharsha Sam Edwin Sam, et al.
Published: (2025)
Tri-MTL: A Triple Multitask Learning Approach for Respiratory Disease Diagnosis
by: Kim, June-Woo, et al.
Published: (2025)
Logit Distillation on Manifolds: Mapping by Learning
by: Yang, Yiru, et al.
Published: (2026)
SALF-MOS: Speaker Agnostic Latent Features Downsampled for MOS Prediction
by: Agrawal, Saurabh, et al.
Published: (2025)
Multi-Task Learning for Lung sound & Lung disease classification
by: K V, Suma, et al.
Published: (2024)
A Calculus-Based Framework for Determining Vocabulary Size in End-to-End ASR
by: Kopparapu, Sunil Kumar
Published: (2026)
EH-MAM: Easy-to-Hard Masked Acoustic Modeling for Self-Supervised Speech Representation Learning
by: Seth, Ashish, et al.
Published: (2024)
Progressive Rock Music Classification
by: Nagar, Arpan, et al.
Published: (2025)
H-Infinity Filter Enhanced CNN-LSTM for Arrhythmia Detection from Heart Sound Recordings
by: Kumar, Rohith Shinoj, et al.
Published: (2025)
Explainability of CNN Based Classification Models for Acoustic Signal
by: Faruqui, Zubair, et al.
Published: (2025)
AFEN: Respiratory Disease Classification using Ensemble Learning
by: Nadkarni, Rahul, et al.
Published: (2024)
Imagined Speech State Classification for Robust Brain-Computer Interface
by: Ko, Byung-Kwan, et al.
Published: (2024)
Innovative Speech-Based Deep Learning Approaches for Parkinson's Disease Classification: A Systematic Review
by: van Gelderen, Lisanne, et al.
Published: (2024)
Thinking While Listening: Simple Test Time Scaling For Audio Classification
by: Verma, Prateek, et al.
Published: (2025)
Synthetic Data Augmentation for Medical Audio Classification: A Preliminary Evaluation
by: McShannon, David, et al.
Published: (2026)
Device-Robust Acoustic Scene Classification via Impulse Response Augmentation
by: Morocutti, Tobias, et al.
Published: (2023)
Beyond saliency: enhancing explanation of speech emotion recognition with expert-referenced acoustic cues
by: Nasr, Seham, et al.
Published: (2025)
Preference-Based Learning in Audio Applications: A Systematic Analysis
by: Broukhim, Aaron, et al.
Published: (2025)
Model Merging Improves Zero-Shot Generalization in Bioacoustic Foundation Models
by: Marincione, Davide, et al.
Published: (2025)
Aria-MIDI: A Dataset of Piano MIDI Files for Symbolic Music Modeling
by: Bradshaw, Louis, et al.
Published: (2025)
Flowing Straighter with Conditional Flow Matching for Accurate Speech Enhancement
by: Cross, Mattias, et al.
Published: (2025)
AudioCodecBench: A Comprehensive Benchmark for Audio Codec Evaluation
by: Wang, Lu, et al.
Published: (2025)
AUDETER: A Large-scale Dataset for Deepfake Audio Detection in Open Worlds
by: Wang, Qizhou, et al.
Published: (2025)
Who Will Top the Charts? Multimodal Music Popularity Prediction via Adaptive Fusion of Modality Experts and Temporal Engagement Modeling
by: Choudhary, Yash, et al.
Published: (2025)
Hookpad Aria: A Copilot for Songwriters
by: Donahue, Chris, et al.
Published: (2025)
DAFMSVC: One-Shot Singing Voice Conversion with Dual Attention Mechanism and Flow Matching
by: Chen, Wei, et al.
Published: (2025)
QAMRO: Quality-aware Adaptive Margin Ranking Optimization for Human-aligned Assessment of Audio Generation Systems
by: Wang, Chien-Chun, et al.
Published: (2025)
Survey on the Evaluation of Generative Models in Music
by: Lerch, Alexander, et al.
Published: (2025)
Evaluation of Deep Audio Representations for Hearables
by: Gröger, Fabian, et al.
Published: (2025)
Explicit Context-Driven Neural Acoustic Modeling for High-Fidelity RIR Generation
by: Si, Chen, et al.
Published: (2025)