:: Library Catalog

Bibliographic Details
Main Authors:	Fan, Qi; Li, Yutong; Xin, Yi; Cheng, Xinyu; Gao, Guanglai; Ma, Miao
Format:	Preprint
Published:	2024
Subjects:	Sound; Artificial Intelligence; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2409.04447

Similar Items

Leveraging Label Potential for Enhanced Multimodal Emotion Recognition
by: Shao, Xuechun, et al.
Published: (2025)

Emotion-Anchored Contrastive Learning Framework for Emotion Recognition in Conversation
by: Yu, Fangxu, et al.
Published: (2024)

Mitigating Subgroup Disparities in Multi-Label Speech Emotion Recognition: A Pseudo-Labeling and Unsupervised Learning Approach
by: Lin, Yi-Cheng, et al.
Published: (2025)

Emotion-Aware Contrastive Adaptation Network for Source-Free Cross-Corpus Speech Emotion Recognition
by: Zhao, Yan, et al.
Published: (2024)

Exploring Self-Supervised Multi-view Contrastive Learning for Speech Emotion Recognition with Limited Annotations
by: Khaertdinov, Bulat, et al.
Published: (2024)

Emotion Neural Transducer for Fine-Grained Speech Emotion Recognition
by: Shen, Siyuan, et al.
Published: (2024)

LLM supervised Pre-training for Multimodal Emotion Recognition in Conversations
by: Dutta, Soumya, et al.
Published: (2025)

Emotion-Coherent Speech Data Augmentation and Self-Supervised Contrastive Style Training for Enhancing Kids's Story Speech Synthesis
by: Chung, Raymond
Published: (2026)

CLEP-DG: Contrastive Learning for Speech Emotion Domain Generalization via Soft Prompt Tuning
by: Shi, Jiacheng, et al.
Published: (2025)

Improving Multimodal Emotion Recognition by Leveraging Acoustic Adaptation and Visual Alignment
by: Zhao, Zhixian, et al.
Published: (2024)

ConPCO: Preserving Phoneme Characteristics for Automatic Pronunciation Assessment Leveraging Contrastive Ordinal Regularization
by: Yan, Bi-Cheng, et al.
Published: (2024)

A Survey on Multimodal Music Emotion Recognition
by: Liyanarachchi, Rashini, et al.
Published: (2025)

EMO-SUPERB: An In-depth Look at Speech Emotion Recognition
by: Wu, Haibin, et al.
Published: (2024)

Emotion-Aware Speech Self-Supervised Representation Learning with Intensity Knowledge
by: Liu, Rui, et al.
Published: (2024)

Leveraging Self-Supervised Models for Automatic Whispered Speech Recognition
by: Farhadipour, Aref, et al.
Published: (2024)

Revealing Emotional Clusters in Speaker Embeddings: A Contrastive Learning Strategy for Speech Emotion Recognition
by: Ulgen, Ismail Rasim, et al.
Published: (2024)

Enhancing Emotional Text-to-Speech Controllability with Natural Language Guidance through Contrastive Learning and Diffusion Models
by: Jing, Xin, et al.
Published: (2024)

Adapting General Disentanglement-Based Speaker Anonymization for Enhanced Emotion Preservation
by: Miao, Xiaoxiao, et al.
Published: (2024)

LPGNet: A Lightweight Network with Parallel Attention and Gated Fusion for Multimodal Emotion Recognition
by: He, Zhining, et al.
Published: (2025)

End-to-End Integration of Speech Emotion Recognition with Voice Activity Detection using Self-Supervised Learning Features
by: Yamashita, Natsuo, et al.
Published: (2024)

Multimodal Emotion Recognition from Raw Audio with Sinc-convolution
by: Zhang, Xiaohui, et al.
Published: (2024)

Leveraging Multimodal Methods and Spontaneous Speech for Alzheimer's Disease Identification
by: Gao, Yifan, et al.
Published: (2024)

MELT: Towards Automated Multimodal Emotion Data Annotation by Leveraging LLM Embedded Knowledge
by: Jing, Xin, et al.
Published: (2025)

Double Multi-Head Attention Multimodal System for Odyssey 2024 Speech Emotion Recognition Challenge
by: Costa, Federico, et al.
Published: (2024)

Semi-Supervised Self-Learning Enhanced Music Emotion Recognition
by: Sun, Yifu, et al.
Published: (2024)

Unifying Listener Scoring Scales: Comparison Learning Framework for Speech Quality Assessment and Continuous Speech Emotion Recognition
by: Hu, Cheng-Hung, et al.
Published: (2025)

CEC: A Noisy Label Detection Method for Speaker Recognition
by: Shen, Yao, et al.
Published: (2024)

A Cross-Corpus Speech Emotion Recognition Method Based on Supervised Contrastive Learning
by: Minjie, Xiang
Published: (2024)

Crab: Multi Layer Contrastive Supervision to Improve Speech Emotion Recognition Under Both Acted and Natural Speech Condition
by: Ueda, Lucas H., et al.
Published: (2026)

SIGNL: A Label-Efficient Audio Deepfake Detection System via Spectral-Temporal Graph Non-Contrastive Learning
by: Febrinanto, Falih Gozi, et al.
Published: (2025)

Robust Training for Speaker Verification against Noisy Labels
by: Fang, Zhihua, et al.
Published: (2022)

Leveraging Self-Supervised Learning for Speaker Diarization
by: Han, Jiangyu, et al.
Published: (2024)

Bridging Speech Emotion Recognition and Personality: Dataset and Temporal Interaction Condition Network
by: Gao, Yuan, et al.
Published: (2025)

PERSONA: An Application for Emotion Recognition, Gender Recognition and Age Estimation
by: Koshal, Devyani, et al.
Published: (2024)

Speech Emotion Recognition with ASR Integration
by: Li, Yuanchao
Published: (2026)

Emotion-Aligned Contrastive Learning Between Images and Music
by: Stewart, Shanti, et al.
Published: (2023)

Leveraging Speech PTM, Text LLM, and Emotional TTS for Speech Emotion Recognition
by: Ma, Ziyang, et al.
Published: (2023)

$\text{M}^3\text{PDB}$: A Multimodal, Multi-Label, Multilingual Prompt Database for Speech Generation
by: Zhu, Boyu, et al.
Published: (2025)

Label-Efficient Self-Supervised Speaker Verification With Information Maximization and Contrastive Learning
by: Lepage, Théo, et al.
Published: (2022)

MSP-Podcast SER Challenge 2024: L'antenne du Ventoux Multimodal Self-Supervised Learning for Speech Emotion Recognition
by: Duret, Jarod, et al.
Published: (2024)