| Main Authors: | He, Zhining, Xiao, Yang |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2508.08925 |
Similar Items
Double Multi-Head Attention Multimodal System for Odyssey 2024 Speech Emotion Recognition Challenge
by: Costa, Federico, et al.
Published: (2024)
Emotion Recognition in Multi-Speaker Conversations through Speaker Identification, Knowledge Distillation, and Hierarchical Fusion
by: Li, Xiao, et al.
Published: (2025)
Sync-TVA: A Graph-Attention Framework for Multimodal Emotion Recognition with Cross-Modal Fusion
by: Deng, Zeyu, et al.
Published: (2025)
PARROT: Synergizing Mamba and Attention-based SSL Pre-Trained Models via Parallel Branch Hadamard Optimal Transport for Speech Emotion Recognition
by: Phukan, Orchid Chetia, et al.
Published: (2025)
Recursive Joint Cross-Modal Attention for Multimodal Fusion in Dimensional Emotion Recognition
by: Praveen, R. Gnana, et al.
Published: (2024)
A Survey on Multimodal Music Emotion Recognition
by: Liyanarachchi, Rashini, et al.
Published: (2025)
LLM supervised Pre-training for Multimodal Emotion Recognition in Conversations
by: Dutta, Soumya, et al.
Published: (2025)
Low-Complexity Acoustic Scene Classification Using Parallel Attention-Convolution Network
by: Li, Yanxiong, et al.
Published: (2024)
Attention-weighted Centered Kernel Alignment for Knowledge Distillation in Large Audio-Language Models Applied to Speech Emotion Recognition
by: Yang, Qingran, et al.
Published: (2026)
Leveraging Cross-Attention Transformer and Multi-Feature Fusion for Cross-Linguistic Speech Emotion Recognition
by: Zhao, Ruoyu, et al.
Published: (2025)
MFHCA: Enhancing Speech Emotion Recognition Via Multi-Spatial Fusion and Hierarchical Cooperative Attention
by: Jiao, Xinxin, et al.
Published: (2024)
Emotion-Aware Contrastive Adaptation Network for Source-Free Cross-Corpus Speech Emotion Recognition
by: Zhao, Yan, et al.
Published: (2024)
Improving Speech Emotion Recognition Through Cross Modal Attention Alignment and Balanced Stacking Model
by: Ueda, Lucas, et al.
Published: (2025)
TelME: Teacher-leading Multimodal Fusion Network for Emotion Recognition in Conversation
by: Yun, Taeyang, et al.
Published: (2024)
Re-Parameterization of Lightweight Transformer for On-Device Speech Emotion Recognition
by: Zhang, Zixing, et al.
Published: (2024)
Multimodal Emotion Recognition from Raw Audio with Sinc-convolution
by: Zhang, Xiaohui, et al.
Published: (2024)
EfficientASR: Speech Recognition Network Compression via Attention Redundancy and Chunk-Level FFN Optimization
by: Wang, Jianzong, et al.
Published: (2024)
Emotion Neural Transducer for Fine-Grained Speech Emotion Recognition
by: Shen, Siyuan, et al.
Published: (2024)
Bridging Speech Emotion Recognition and Personality: Dataset and Temporal Interaction Condition Network
by: Gao, Yuan, et al.
Published: (2025)
TBDM-Net: Bidirectional Dense Networks with Gender Information for Speech Emotion Recognition
by: Striletchi, Vlad, et al.
Published: (2024)
Metadata-Enhanced Speech Emotion Recognition: Augmented Residual Integration and Co-Attention in Two-Stage Fine-Tuning
by: Wan, Zixiang, et al.
Published: (2024)
PERSONA: An Application for Emotion Recognition, Gender Recognition and Age Estimation
by: Koshal, Devyani, et al.
Published: (2024)
EMO-RL: Emotion-Rule-Based Reinforcement Learning Enhanced Audio-Language Model for Generalized Speech Emotion Recognition
by: Li, Pengcheng, et al.
Published: (2025)
Speech Emotion Recognition with ASR Integration
by: Li, Yuanchao
Published: (2026)
Speaker Recognition Using Isomorphic Graph Attention Network Based Pooling on Self-Supervised Representation
by: Ge, Zirui, et al.
Published: (2023)
MF-AED-AEC: Speech Emotion Recognition by Leveraging Multimodal Fusion, ASR Error Detection, and ASR Error Correction
by: He, Jiajun, et al.
Published: (2024)
Textless and Non-Parallel Speech-to-Speech Emotion Style Transfer
by: Dutta, Soumya, et al.
Published: (2025)
ArabEmoNet: A Lightweight Hybrid 2D CNN-BiLSTM Model with Attention for Robust Arabic Speech Emotion Recognition
by: Abouzeid, Ali, et al.
Published: (2025)
Temporal-Frequency State Space Duality: An Efficient Paradigm for Speech Emotion Recognition
by: Zhao, Jiaqi, et al.
Published: (2024)
Reverse Attention for Lightweight Speech Enhancement on Edge Devices
by: Ojha, Shuubham, et al.
Published: (2025)
Bimodal Connection Attention Fusion for Speech Emotion Recognition
by: Luo, Jiachen, et al.
Published: (2025)
RawTFNet: A Lightweight CNN Architecture for Speech Anti-spoofing
by: Xiao, Yang, et al.
Published: (2025)
MFSN: Multi-perspective Fusion Search Network For Pre-training Knowledge in Speech Emotion Recognition
by: Sun, Haiyang, et al.
Published: (2023)
Audio-Guided Fusion Techniques for Multimodal Emotion Analysis
by: Shi, Pujin, et al.
Published: (2024)
Semantic-Emotional Resonance Embedding: A Semi-Supervised Paradigm for Cross-Lingual Speech Emotion Recognition
by: Zhao, Ya, et al.
Published: (2026)
Leveraging Label Potential for Enhanced Multimodal Emotion Recognition
by: Shao, Xuechun, et al.
Published: (2025)
A Lightweight Slot-Attention Framework for Multi-Instrument Multi-Pitch Estimation
by: Taenzer, Michael
Published: (2026)
MLCA-AVSR: Multi-Layer Cross Attention Fusion based Audio-Visual Speech Recognition
by: Wang, He, et al.
Published: (2024)
THAI Speech Emotion Recognition (THAI-SER) corpus
by: Wongpithayadisai, Jilamika, et al.
Published: (2025)
TG-ASR: Translation-Guided Learning with Parallel Gated Cross Attention for Low-Resource Automatic Speech Recognition
by: Yang, Cheng-Yeh, et al.
Published: (2026)