:: Library Catalog

Bibliographic Details
Main Authors:	Kim, Byunggun, Kwon, Younghun
Format:	Preprint
Published:	2024
Subjects:	Sound; Artificial Intelligence; Audio and Speech Processing
Online Access:	https://arxiv.org/abs/2409.04007

Similar Items

Enhanced Speech Emotion Recognition with Efficient Channel Attention Guided Deep CNN-BiLSTM Framework
by: Kundu, Niloy Kumar, et al.
Published: (2024)

Toward Efficient Speech Emotion Recognition via Spectral Learning and Attention
by: Lee, HyeYoung, et al.
Published: (2025)

Color-based Emotion Representation for Speech Emotion Recognition
by: Nagase, Ryotaro, et al.
Published: (2026)

Efficient Finetuning for Dimensional Speech Emotion Recognition in the Age of Transformers
by: Sampath, Aneesha, et al.
Published: (2025)

MFHCA: Enhancing Speech Emotion Recognition Via Multi-Spatial Fusion and Hierarchical Cooperative Attention
by: Jiao, Xinxin, et al.
Published: (2024)

Multi-Loss Learning for Speech Emotion Recognition with Energy-Adaptive Mixup and Frame-Level Attention
by: Wang, Cong, et al.
Published: (2025)

MSAC: Multiple Speech Attribute Control Method for Reliable Speech Emotion Recognition
by: Pan, Yu, et al.
Published: (2023)

Speech Emotion Recognition Using CNN and Its Use Case in Digital Healthcare
by: Nigar, Nishargo
Published: (2024)

Persian Speech Emotion Recognition by Fine-Tuning Transformers
by: Shayaninasab, Minoo, et al.
Published: (2024)

EmoSphere-SER: Enhancing Speech Emotion Recognition Through Spherical Representation with Auxiliary Classification
by: Cho, Deok-Hyeon, et al.
Published: (2025)

Breaking Resource Barriers in Speech Emotion Recognition via Data Distillation
by: Chang, Yi, et al.
Published: (2024)

Active Learning with Task Adaptation Pre-training for Speech Emotion Recognition
by: Li, Dongyuan, et al.
Published: (2024)

Learning Physiology-Informed Vocal Spectrotemporal Representations for Speech Emotion Recognition
by: Zhang, Xu, et al.
Published: (2026)

ABHINAYA -- A System for Speech Emotion Recognition In Naturalistic Conditions Challenge
by: Dutta, Soumya, et al.
Published: (2025)

MambAttention: Mamba with Multi-Head Attention for Generalizable Single-Channel Speech Enhancement
by: Kühne, Nikolai Lund, et al.
Published: (2025)

Automatic Speech Recognition in the Modern Era: Architectures, Training, and Evaluation
by: Nayeem, Md., et al.
Published: (2025)

Bimodal Connection Attention Fusion for Speech Emotion Recognition
by: Luo, Jiachen, et al.
Published: (2025)

MLCA-AVSR: Multi-Layer Cross Attention Fusion based Audio-Visual Speech Recognition
by: Wang, He, et al.
Published: (2024)

MATER: Multi-level Acoustic and Textual Emotion Representation for Interpretable Speech Emotion Recognition
by: Jon, Hyo Jin, et al.
Published: (2025)

Amplifying Emotional Signals: Data-Efficient Deep Learning for Robust Speech Emotion Recognition
by: Vu, Tai
Published: (2025)

Improvement and Implementation of a Speech Emotion Recognition Model Based on Dual-Layer LSTM
by: Yang, Xiaoran, et al.
Published: (2024)

Are you sure? Analysing Uncertainty Quantification Approaches for Real-world Speech Emotion Recognition
by: Schrüfer, Oliver, et al.
Published: (2024)

Explaining Deep Learning Embeddings for Speech Emotion Recognition by Predicting Interpretable Acoustic Features
by: Dixit, Satvik, et al.
Published: (2024)

Speech Emotion Recognition Using MFCC Features and LSTM-Based Deep Learning Model
by: Oluwademilade, Adelekun, et al.
Published: (2026)

Temporal-Channel Modeling in Multi-head Self-Attention for Synthetic Speech Detection
by: Truong, Duc-Tuan, et al.
Published: (2024)

Clustering and Mining Accented Speech for Inclusive and Fair Speech Recognition
by: Kim, Jaeyoung, et al.
Published: (2024)

Leveraging Speech PTM, Text LLM, and Emotional TTS for Speech Emotion Recognition
by: Ma, Ziyang, et al.
Published: (2023)

Do we really need Self-Attention for Streaming Automatic Speech Recognition?
by: Dkhissi, Youness, et al.
Published: (2026)

Enhancing Speech Emotion Recognition through Segmental Average Pooling of Self-Supervised Learning Features
by: Hyeon, Jonghwan, et al.
Published: (2024)

Cross-Corpus Validation of Speech Emotion Recognition in Urdu using Domain-Knowledge Acoustic Features
by: Talpur, Unzela, et al.
Published: (2025)

ICMC-ASR: The ICASSP 2024 In-Car Multi-Channel Automatic Speech Recognition Challenge
by: Wang, He, et al.
Published: (2024)

EmoSphere-TTS: Emotional Style and Intensity Modeling via Spherical Emotion Vector for Controllable Emotional Text-to-Speech
by: Cho, Deok-Hyeon, et al.
Published: (2024)

Towards Effective and Efficient Non-autoregressive Decoding Using Block-based Attention Mask
by: Wang, Tianzi, et al.
Published: (2024)

EmoSphere++: Emotion-Controllable Zero-Shot Text-to-Speech via Emotion-Adaptive Spherical Vector
by: Cho, Deok-Hyeon, et al.
Published: (2024)

Effective and Efficient Mixed Precision Quantization of Speech Foundation Models
by: Xu, Haoning, et al.
Published: (2025)

Speech Recognition-based Feature Extraction for Enhanced Automatic Severity Classification in Dysarthric Speech
by: Choi, Yerin, et al.
Published: (2024)

GMP-TL: Gender-augmented Multi-scale Pseudo-label Enhanced Transfer Learning for Speech Emotion Recognition
by: Pan, Yu, et al.
Published: (2024)

DiEmo-TTS: Disentangled Emotion Representations via Self-Supervised Distillation for Cross-Speaker Emotion Transfer in Text-to-Speech
by: Cho, Deok-Hyeon, et al.
Published: (2025)

Focal Loss based Residual Convolutional Neural Network for Speech Emotion Recognition
by: Tripathi, Suraj, et al.
Published: (2019)

TinyML for Speech Recognition
by: Barovic, Andrew, et al.
Published: (2025)