Bibliographic Details
Main Authors: Li, Ao, Xu, Longwei, Ling, Chen, Zhang, Jinghui, Wang, Pengwei
Format: Preprint
Published: 2024
Subjects:
Online Access: https://arxiv.org/abs/2412.08049
author Li, Ao
Xu, Longwei
Ling, Chen
Zhang, Jinghui
Wang, Pengwei
contents Sentiment and emotion understanding are essential to applications such as human-computer interaction and depression detection. While Multimodal Large Language Models (MLLMs) demonstrate robust general capabilities, they face considerable challenges in the field of affective computing, particularly in detecting subtle facial expressions and handling complex emotion-related tasks, such as emotion reason inference and understanding emotions in long-context scenarios. Furthermore, there is a lack of a unified MLLM that can effectively handle both sentiment and emotion-related tasks. To address these challenges, we explore multi-task training strategies for MLLMs in affective computing and introduce Emotion Universe (EmoVerse), an MLLM designed to handle a broad spectrum of sentiment and emotion-related tasks. In addition, EmoVerse is capable of deeply analyzing the underlying causes of emotional states. We also introduce the Affective Multitask (AMT) Dataset, which supports multimodal sentiment analysis, multimodal emotion recognition, facial expression recognition, emotion reason inference, and emotion cause-pair extraction tasks. Extensive experiments demonstrate that EmoVerse outperforms existing methods, achieving state-of-the-art results in sentiment and emotion-related tasks. The code is available at https://github.com/liaolea/EmoVerse.
format Preprint
id arxiv_https___arxiv_org_abs_2412_08049
institution arXiv
publishDate 2024
record_format arxiv
title EmoVerse: Exploring Multimodal Large Language Models for Sentiment and Emotion Understanding
topic Computation and Language