Staff View: Estimating the Uncertainty in Emotion Class Labels with Utterance-Specific Dirichlet Priors :: Library Catalog

Bibliographic Details
Main Authors:	Wu, Wen; Zhang, Chao; Wu, Xixin; Woodland, Philip C.
Format:	Preprint
Published:	2022
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2203.04443

_version_	1866929296629563392
author	Wu, Wen; Zhang, Chao; Wu, Xixin; Woodland, Philip C.
author_facet	Wu, Wen; Zhang, Chao; Wu, Xixin; Woodland, Philip C.
contents	Emotion recognition is a key attribute for artificial intelligence systems that need to naturally interact with humans. However, the task definition is still an open problem due to the inherent ambiguity of emotions. In this paper, a novel Bayesian training loss based on per-utterance Dirichlet prior distributions is proposed for verbal emotion recognition, which models the uncertainty in one-hot labels created when human annotators assign the same utterance to different emotion classes. An additional metric is used to evaluate the performance by detecting test utterances with high labelling uncertainty. This removes a major limitation that emotion classification systems only consider utterances with labels where the majority of annotators agree on the emotion class. Furthermore, a frequentist approach is studied to leverage the continuous-valued "soft" labels obtained by averaging the one-hot labels. We propose a two-branch model structure for emotion classification on a per-utterance basis, which achieves state-of-the-art classification results on the widely used IEMOCAP dataset. Based on this, uncertainty estimation experiments were performed. The best performance in terms of the area under the precision-recall curve when detecting utterances with high uncertainty was achieved by interpolating the Bayesian training loss with the Kullback-Leibler divergence training loss for the soft labels. The generality of the proposed approach was verified using the MSP-Podcast dataset, which yielded the same pattern of results.
format	Preprint
id	arxiv_https___arxiv_org_abs_2203_04443
institution	arXiv
publishDate	2022
record_format	arxiv
title	Estimating the Uncertainty in Emotion Class Labels with Utterance-Specific Dirichlet Priors
topic	Computation and Language
url	https://arxiv.org/abs/2203.04443

Similar Items