| Main Authors: | Dixit, Satvik, Low, Daniel M., Elbanna, Gasser, Catania, Fabio, Ghosh, Satrajit S. |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2409.09511 |
Similar Items
Evaluating Speaker Identity Coding in Self-supervised Models and Humans
by: Elbanna, Gasser
Published: (2024)
Incorporating Talker Identity Aids With Improving Speech Recognition in Adversarial Environments
by: Alavilli, Sagarika, et al.
Published: (2024)
Predicting Heart Activity from Speech using Data-driven and Knowledge-based features
by: Elbanna, Gasser, et al.
Published: (2024)
Improving Speaker Representations Using Contrastive Losses on Multi-scale Features
by: Dixit, Satvik, et al.
Published: (2024)
MACE: Leveraging Audio for Evaluating Audio Captioning Systems
by: Dixit, Satvik, et al.
Published: (2024)
Vision Language Models Are Few-Shot Audio Spectrogram Classifiers
by: Dixit, Satvik, et al.
Published: (2024)
Learning Perceptually Relevant Temporal Envelope Morphing
by: Dixit, Satvik, et al.
Published: (2025)
Semantic-Emotional Resonance Embedding: A Semi-Supervised Paradigm for Cross-Lingual Speech Emotion Recognition
by: Zhao, Ya, et al.
Published: (2026)
Cross-Corpus Validation of Speech Emotion Recognition in Urdu using Domain-Knowledge Acoustic Features
by: Talpur, Unzela, et al.
Published: (2025)
Speech Emotion Recognition Using MFCC Features and LSTM-Based Deep Learning Model
by: Oluwademilade, Adelekun, et al.
Published: (2026)
Layer-Wise Analysis of Self-Supervised Acoustic Word Embeddings: A Study on Speech Emotion Recognition
by: Saliba, Alexandra, et al.
Published: (2024)
Exploring Local Interpretable Model-Agnostic Explanations for Speech Emotion Recognition with Distribution-Shift
by: Hjuler, Maja J., et al.
Published: (2025)
MATER: Multi-level Acoustic and Textual Emotion Representation for Interpretable Speech Emotion Recognition
by: Jon, Hyo Jin, et al.
Published: (2025)
Speech Emotion Recognition with ASR Integration
by: Li, Yuanchao
Published: (2026)
On the Contribution of Lexical Features to Speech Emotion Recognition
by: Combei, David
Published: (2025)
End-to-End Integration of Speech Emotion Recognition with Voice Activity Detection using Self-Supervised Learning Features
by: Yamashita, Natsuo, et al.
Published: (2024)
Emotion Neural Transducer for Fine-Grained Speech Emotion Recognition
by: Shen, Siyuan, et al.
Published: (2024)
Interpreting End-to-End Deep Learning Models for Speech Source Localization Using Layer-wise Relevance Propagation
by: Comanducci, Luca, et al.
Published: (2024)
Revealing Emotional Clusters in Speaker Embeddings: A Contrastive Learning Strategy for Speech Emotion Recognition
by: Ulgen, Ismail Rasim, et al.
Published: (2024)
End-to-end Acoustic-linguistic Emotion and Intent Recognition Enhanced by Semi-supervised Learning
by: Ren, Zhao, et al.
Published: (2025)
Interpretable Embeddings of Speech Enhance and Explain Brain Encoding Performance of Audio Models
by: Shimizu, Riki, et al.
Published: (2025)
Investigation of Deep Neural Network Acoustic Modelling Approaches for Low Resource Accented Mandarin Speech Recognition
by: Xie, Xurong, et al.
Published: (2022)
Mellow: a small audio language model for reasoning
by: Deshmukh, Soham, et al.
Published: (2025)
EMO-SUPERB: An In-depth Look at Speech Emotion Recognition
by: Wu, Haibin, et al.
Published: (2024)
Dataset-Distillation Generative Model for Speech Emotion Recognition
by: Ritter-Gutierrez, Fabian, et al.
Published: (2024)
THAI Speech Emotion Recognition (THAI-SER) corpus
by: Wongpithayadisai, Jilamika, et al.
Published: (2025)
Iterative Prototype Refinement for Ambiguous Speech Emotion Recognition
by: Sun, Haoqin, et al.
Published: (2024)
Abusive Speech Detection in Indic Languages Using Acoustic Features
by: Spiesberger, Anika A., et al.
Published: (2024)
Unifying Listener Scoring Scales: Comparison Learning Framework for Speech Quality Assessment and Continuous Speech Emotion Recognition
by: Hu, Cheng-Hung, et al.
Published: (2025)
Investigating Effective Speaker Property Privacy Protection in Federated Learning for Speech Emotion Recognition
by: Tan, Chao, et al.
Published: (2024)
PCQ: Emotion Recognition in Speech via Progressive Channel Querying
by: Wang, Xincheng, et al.
Published: (2024)
Testing Correctness, Fairness, and Robustness of Speech Emotion Recognition Models
by: Derington, Anna, et al.
Published: (2023)
SELM: Enhancing Speech Emotion Recognition for Out-of-Domain Scenarios
by: Bukhari, Hazim, et al.
Published: (2024)
EMO-RL: Emotion-Rule-Based Reinforcement Learning Enhanced Audio-Language Model for Generalized Speech Emotion Recognition
by: Li, Pengcheng, et al.
Published: (2025)
Comparative Evaluation of Acoustic Feature Extraction Tools for Clinical Speech Analysis
by: Choi, Anna Seo Gyeong, et al.
Published: (2025)
Domain Adapting Deep Reinforcement Learning for Real-world Speech Emotion Recognition
by: Rajapakshe, Thejan, et al.
Published: (2022)
From Human Speech to Ocean Signals: Transferring Speech Large Models for Underwater Acoustic Target Recognition
by: Huang, Mengcheng, et al.
Published: (2026)
Emotion-Aware Contrastive Adaptation Network for Source-Free Cross-Corpus Speech Emotion Recognition
by: Zhao, Yan, et al.
Published: (2024)
Emotional Styles Hide in Deep Speaker Embeddings: Disentangle Deep Speaker Embeddings for Speaker Clustering
by: Lin, Chaohao, et al.
Published: (2025)
Joint Speaker Features Learning for Audio-visual Multichannel Speech Separation and Recognition
by: Li, Guinan, et al.
Published: (2024)