:: Library Catalog

Bibliographic Details
Main Authors:	Li, Ming; Liu, Yong-Jin; Liu, Fang; Sheng, Huankun; Fan, Yeying; Wei, Yixiang; Luo, Minnan; Zhang, Weizhan; Wang, Wenping
Format:	Preprint
Published:	2026
Subjects:	Machine Learning; Sound; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2602.20530

Similar Items

Iterative Prototype Refinement for Ambiguous Speech Emotion Recognition
by: Sun, Haoqin, et al.
Published: (2024)

Emotion Neural Transducer for Fine-Grained Speech Emotion Recognition
by: Shen, Siyuan, et al.
Published: (2024)

Metadata-Enhanced Speech Emotion Recognition: Augmented Residual Integration and Co-Attention in Two-Stage Fine-Tuning
by: Wan, Zixiang, et al.
Published: (2024)

Investigating Effective Speaker Property Privacy Protection in Federated Learning for Speech Emotion Recognition
by: Tan, Chao, et al.
Published: (2024)

End-to-end Acoustic-linguistic Emotion and Intent Recognition Enhanced by Semi-supervised Learning
by: Ren, Zhao, et al.
Published: (2025)

Diff-MSTC: A Mixing Style Transfer Prototype for Cubase
by: Vanka, Soumya Sai, et al.
Published: (2024)

Multi-Channel Differential ASR for Robust Wearer Speech Recognition on Smart Glasses
by: Yang, Yufeng, et al.
Published: (2025)

PERSONA: An Application for Emotion Recognition, Gender Recognition and Age Estimation
by: Koshal, Devyani, et al.
Published: (2024)

Speech Emotion Recognition with ASR Integration
by: Li, Yuanchao
Published: (2026)

Robust Audio-Visual Target Speaker Extraction with Emotion-Aware Multiple Enrollment Fusion
by: Jin, Zhan, et al.
Published: (2025)

EMO-SUPERB: An In-depth Look at Speech Emotion Recognition
by: Wu, Haibin, et al.
Published: (2024)

Dataset-Distillation Generative Model for Speech Emotion Recognition
by: Ritter-Gutierrez, Fabian, et al.
Published: (2024)

THAI Speech Emotion Recognition (THAI-SER) corpus
by: Wongpithayadisai, Jilamika, et al.
Published: (2025)

Emotion-Aware Contrastive Adaptation Network for Source-Free Cross-Corpus Speech Emotion Recognition
by: Zhao, Yan, et al.
Published: (2024)

Enhancing Dysarthric Speech Recognition for Unseen Speakers via Prototype-Based Adaptation
by: Wang, Shiyao, et al.
Published: (2024)

Semantic-Emotional Resonance Embedding: A Semi-Supervised Paradigm for Cross-Lingual Speech Emotion Recognition
by: Zhao, Ya, et al.
Published: (2026)

PCQ: Emotion Recognition in Speech via Progressive Channel Querying
by: Wang, Xincheng, et al.
Published: (2024)

Testing Correctness, Fairness, and Robustness of Speech Emotion Recognition Models
by: Derington, Anna, et al.
Published: (2023)

LLM supervised Pre-training for Multimodal Emotion Recognition in Conversations
by: Dutta, Soumya, et al.
Published: (2025)

SELM: Enhancing Speech Emotion Recognition for Out-of-Domain Scenarios
by: Bukhari, Hazim, et al.
Published: (2024)

Feature Selection via Graph Topology Inference for Soundscape Emotion Recognition
by: Rey, Samuel, et al.
Published: (2025)

Machine Unlearning in Speech Emotion Recognition via Forget Set Alone
by: Ren, Zhao, et al.
Published: (2025)

MSP-Conversation: A Corpus for Naturalistic, Time-Continuous Emotion Recognition
by: Martinez-Lucas, Luz, et al.
Published: (2026)

EMO-RL: Emotion-Rule-Based Reinforcement Learning Enhanced Audio-Language Model for Generalized Speech Emotion Recognition
by: Li, Pengcheng, et al.
Published: (2025)

Attention-weighted Centered Kernel Alignment for Knowledge Distillation in Large Audio-Language Models Applied to Speech Emotion Recognition
by: Yang, Qingran, et al.
Published: (2026)

1st Place Solution to Odyssey Emotion Recognition Challenge Task1: Tackling Class Imbalance Problem
by: Chen, Mingjie, et al.
Published: (2024)

Enhancing Speaker Verification with w2v-BERT 2.0 and Knowledge Distillation guided Structured Pruning
by: Li, Ze, et al.
Published: (2025)

Bridging Speech Emotion Recognition and Personality: Dataset and Temporal Interaction Condition Network
by: Gao, Yuan, et al.
Published: (2025)

Temporal-Frequency State Space Duality: An Efficient Paradigm for Speech Emotion Recognition
by: Zhao, Jiaqi, et al.
Published: (2024)

TBDM-Net: Bidirectional Dense Networks with Gender Information for Speech Emotion Recognition
by: Striletchi, Vlad, et al.
Published: (2024)

A Survey on Multimodal Music Emotion Recognition
by: Liyanarachchi, Rashini, et al.
Published: (2025)

AmbER$^2$: Dual Ambiguity-Aware Emotion Recognition Applied to Speech and Text
by: Wu, Jingyao, et al.
Published: (2026)

Exploring Local Interpretable Model-Agnostic Explanations for Speech Emotion Recognition with Distribution-Shift
by: Hjuler, Maja J., et al.
Published: (2025)

LPGNet: A Lightweight Network with Parallel Attention and Gated Fusion for Multimodal Emotion Recognition
by: He, Zhining, et al.
Published: (2025)

Are Mamba-based Audio Foundation Models the Best Fit for Non-Verbal Emotion Recognition?
by: Akhtar, Mohd Mujtaba, et al.
Published: (2025)

Color-based Emotion Representation for Speech Emotion Recognition
by: Nagase, Ryotaro, et al.
Published: (2026)

EchoVoices: Preserving Generational Voices and Memories for Seniors and Children
by: Xu, Haiying, et al.
Published: (2025)

SwitchCodec: A High-Fidelity Nerual Audio Codec With Sparse Quantization
by: Wang, Jin, et al.
Published: (2025)

The DKU System for Multi-Speaker Automatic Speech Recognition in MLC-SLM Challenge
by: Lin, Yuke, et al.
Published: (2025)

Double Multi-Head Attention Multimodal System for Odyssey 2024 Speech Emotion Recognition Challenge
by: Costa, Federico, et al.
Published: (2024)