:: Library Catalog

Bibliographic Details
Main Authors:	Chen, Mingjie; Zhang, Hezhao; Li, Yuanchao; Luo, Jiachen; Wu, Wen; Ma, Ziyang; Bell, Peter; Lai, Catherine; Reiss, Joshua; Wang, Lin; Woodland, Philip C.; Chen, Xie; Phan, Huy; Hain, Thomas
Format:	Preprint
Published:	2024
Subjects:	Audio and Speech Processing; Sound
Online Access:	https://arxiv.org/abs/2405.20064

Similar Items

Bimodal Connection Attention Fusion for Speech Emotion Recognition
by: Luo, Jiachen, et al.
Published: (2025)

EmoBox: Multilingual Multi-corpus Speech Emotion Recognition Toolkit and Benchmark
by: Ma, Ziyang, et al.
Published: (2024)

Speech Emotion Recognition with ASR Transcripts: A Comprehensive Study on Word Error Rate and Fusion Techniques
by: Li, Yuanchao, et al.
Published: (2024)

VoxEmo: Benchmarking Speech Emotion Recognition with Speech LLMs
by: Zhang, Hezhao, et al.
Published: (2026)

Heterogeneous bimodal attention fusion for speech emotion recognition
by: Luo, Jiachen, et al.
Published: (2025)

Revise, Reason, and Recognize: LLM-Based Emotion Recognition via Emotion-Specific Prompts and ASR Error Correction
by: Li, Yuanchao, et al.
Published: (2024)

Automatic Speech Recognition System-Independent Word Error Rate Estimation
by: Park, Chanho, et al.
Published: (2024)

Crossmodal ASR Error Correction with Discrete Speech Units
by: Li, Yuanchao, et al.
Published: (2024)

Layer-Wise Analysis of Self-Supervised Acoustic Word Embeddings: A Study on Speech Emotion Recognition
by: Saliba, Alexandra, et al.
Published: (2024)

Speech Emotion Recognition with ASR Integration
by: Li, Yuanchao
Published: (2026)

1st Place Solution to the 1st SkatingVerse Challenge
by: Sun, Tao, et al.
Published: (2024)

Distribution-based Emotion Recognition in Conversation
by: Wu, Wen, et al.
Published: (2022)

Semi-Supervised Cognitive State Classification from Speech with Multi-View Pseudo-Labeling
by: Li, Yuanchao, et al.
Published: (2024)

Estimating the Uncertainty in Emotion Class Labels with Utterance-Specific Dirichlet Priors
by: Wu, Wen, et al.
Published: (2022)

First Nations Australian Theatre for Health Equity
by: Woodland, Sarah, et al.
Published: (2024)

Parameter Efficient Finetuning for Speech Emotion Recognition and Domain Adaptation
by: Lashkarashvili, Nineli, et al.
Published: (2024)

Fast Word Error Rate Estimation Using Self-Supervised Representations for Speech and Text
by: Park, Chanho, et al.
Published: (2023)

Adaptive Ensemble Framework With Synthetic Sampling for Tackling Class Imbalance Problem
by: R. Sasirekha, et al.
Published: (2025)

Mapping Class Groups of Simply Connected Kähler Manifolds
by: Hain, Richard
Published: (2023)

Addressing Emotion Bias in Music Emotion Recognition and Generation with Frechet Audio Distance
by: Li, Yuanchao, et al.
Published: (2024)

Discriminative Spatial-Semantic VOS Solution: 1st Place Solution for 6th LSVOS
by: Miao, Deshui, et al.
Published: (2024)

Balancing the Scales: A Comprehensive Study on Tackling Class Imbalance in Binary Classification
by: Abdelhamid, Mohamed, et al.
Published: (2024)

Correction to “Adaptive Ensemble Framework With Synthetic Sampling for Tackling Class Imbalance Problem”
Published: (2025)

Federated Learning and Class Imbalances
by: Zhu, Siqi, et al.
Published: (2026)

Large Language Model Based Generative Error Correction: A Challenge and Baselines for Speech Recognition, Speaker Tagging, and Emotion Recognition
by: Yang, Chao-Han Huck, et al.
Published: (2024)

1st Place Solution of Multiview Egocentric Hand Tracking Challenge ECCV2024
by: Zou, Minqiang, et al.
Published: (2024)

Double Multi-Head Attention Multimodal System for Odyssey 2024 Speech Emotion Recognition Challenge
by: Costa, Federico, et al.
Published: (2024)

An Introduction to Preservation Challenges and Potential Solutions for Scrapbooks in Archival Collections
by: Teper, Jennifer Hain
Published: (2007)

Lectures on the Hodge-de Rham Theory of the Fundamental Group of P^1-{0,1,∞}
by: Hain, Richard
Published: (2005)

1st Place Solution for 5th LSVOS Challenge: Referring Video Object Segmentation
by: Luo, Zhuoyan, et al.
Published: (2024)

Leveraging Speech PTM, Text LLM, and Emotional TTS for Speech Emotion Recognition
by: Ma, Ziyang, et al.
Published: (2023)

Towards Fine-grained Large Object Segmentation 1st Place Solution to 3D AI Challenge 2020 -- Instance Segmentation Track
by: Chen, Zehui, et al.
Published: (2020)

Label-Synchronous Neural Transducer for Adaptable Online E2E Speech Recognition
by: Deng, Keqi, et al.
Published: (2023)

Task adaptation of Vision-Language-Action model: 1st Place Solution for the 2025 BEHAVIOR Challenge
by: Larchenko, Ilia, et al.
Published: (2025)

Estimating the Uncertainty in Emotion Attributes using Deep Evidential Regression
by: Wu, Wen, et al.
Published: (2023)

Augment to Segment: Tackling Pixel-Level Imbalance in Wheat Disease and Pest Segmentation
by: Wei, Tianqi, et al.
Published: (2025)

Chapter 1 A “C Odyssey”
by: Levine, Mark, et al.
Published: (2023)

GraphFedMIG: Tackling Class Imbalance in Federated Graph Learning via Mutual Information-Guided Generation
by: Li, Xinrui, et al.
Published: (2025)

1st Place Solution of WWW 2025 EReL@MIR Workshop Multimodal CTR Prediction Challenge
by: Xu, Junwei, et al.
Published: (2025)

Multimodal Reasoning for Science: Technical Report and 1st Place Solution to the ICML 2025 SeePhys Challenge
by: Liang, Hao, et al.
Published: (2025)