MARC21: :: Library Catalog

Salvato in:

Dettagli Bibliografici
Autori principali:	Wang, Yusong, Mao, Keyu, Obi, Takao, Shao, Minghao, Funakoshi, Kotaro
Natura:	Preprint
Pubblicazione:	2026
Soggetti:	Computation and Language
Accesso online:	https://arxiv.org/abs/2605.23328
Tags:	Aggiungi Tag Nessun Tag, puoi essere il primo ad aggiungerne!!

_version_	1866913155133734912
author	Wang, Yusong Mao, Keyu Obi, Takao Shao, Minghao Funakoshi, Kotaro
author_facet	Wang, Yusong Mao, Keyu Obi, Takao Shao, Minghao Funakoshi, Kotaro
contents	Emotion Recognition in Conversation is a core component of affective computing, while current resources of sign language emotion datasets primarily focus on isolated sentences and lack conversational context. Models trained exclusively on these isolated utterances demonstrate degraded performance in real world scenarios because they cannot utilize historical dialogue flow. To address this structural limitation, we introduce the ERC task to sign language video analysis and propose the eJSL Dialog dataset. Constructed using the scripts from the STUDIES corpus, the dataset contains 1,920 video samples organized into 480 unique dialogues. We conduct systematic benchmarking on this dataset using models ranging from isolated visual networks to multimodal conversational architectures. The results reveal a domain gap when applying generic multimodal conversational emotion recognition models to sign language. These findings demonstrate the explicit need for context aware visual extractors specific to sign language and indicate that expanding the scale of conversational datasets to support large scale pre-training is a necessary next step for future research.
format	Preprint
id	arxiv_https___arxiv_org_abs_2605_23328
institution	arXiv
publishDate	2026
record_format	arxiv
spellingShingle	Emotion Recognition in Sign Language Conversation Wang, Yusong Mao, Keyu Obi, Takao Shao, Minghao Funakoshi, Kotaro Computation and Language Emotion Recognition in Conversation is a core component of affective computing, while current resources of sign language emotion datasets primarily focus on isolated sentences and lack conversational context. Models trained exclusively on these isolated utterances demonstrate degraded performance in real world scenarios because they cannot utilize historical dialogue flow. To address this structural limitation, we introduce the ERC task to sign language video analysis and propose the eJSL Dialog dataset. Constructed using the scripts from the STUDIES corpus, the dataset contains 1,920 video samples organized into 480 unique dialogues. We conduct systematic benchmarking on this dataset using models ranging from isolated visual networks to multimodal conversational architectures. The results reveal a domain gap when applying generic multimodal conversational emotion recognition models to sign language. These findings demonstrate the explicit need for context aware visual extractors specific to sign language and indicate that expanding the scale of conversational datasets to support large scale pre-training is a necessary next step for future research.
title	Emotion Recognition in Sign Language Conversation
topic	Computation and Language
url	https://arxiv.org/abs/2605.23328

Documenti analoghi