Bibliographic Details
Main Authors: Wang, Yu; Türk, Olcay; Grimminger, Angela; Buschmeier, Hendrik
Format: Preprint
Published: 2026
Subjects: Computation and Language
Online Access: https://arxiv.org/abs/2603.20079
author Wang, Yu
Türk, Olcay
Grimminger, Angela
Buschmeier, Hendrik
contents We investigate how verbal and nonverbal linguistic features, exhibited by speakers and listeners in dialogue, can contribute to predicting the listener's state of understanding in explanatory interactions on a moment-by-moment basis. Specifically, we examine three linguistic cues related to cognitive load and hypothesised to correlate with listener understanding: the information value (operationalised with surprisal) and syntactic complexity of the speaker's utterances, and the variation in the listener's interactive gaze behaviour. Based on statistical analyses of the MUNDEX corpus of face-to-face dialogic board game explanations, we find that individual cues vary with the listener's level of understanding. Listener states ('Understanding', 'Partial Understanding', 'Non-Understanding' and 'Misunderstanding') were self-annotated by the listeners using a retrospective video-recall method. The results of a subsequent classification experiment, involving two off-the-shelf classifiers and a fine-tuned German BERT-based multimodal classifier, demonstrate that prediction of these four states of understanding is generally possible and improves when the three linguistic cues are considered alongside textual features.
format Preprint
id arxiv_https___arxiv_org_abs_2603_20079
institution arXiv
publishDate 2026
record_format arxiv
title Predicting States of Understanding in Explanatory Interactions Using Cognitive Load-Related Linguistic Cues
topic Computation and Language
url https://arxiv.org/abs/2603.20079
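note The abstract operationalises information value as surprisal, the negative log probability of a token. The following is a minimal sketch of that quantity only, assuming a toy add-one-smoothed unigram model. The function name unigram_surprisal and the example corpus are hypothetical, and the record does not say which language model the authors used to estimate probabilities.

```python
import math
from collections import Counter


def unigram_surprisal(tokens, corpus_tokens):
    """Return the surprisal in bits of each token under a unigram model.

    Assumption: probabilities come from add-one-smoothed counts over
    corpus_tokens, with one extra vocabulary slot reserved for unseen
    tokens. This is an illustrative stand-in, not the paper's estimator.
    """
    counts = Counter(corpus_tokens)
    vocab_size = len(counts) + 1  # observed types plus one unseen slot
    total = len(corpus_tokens)
    return [
        -math.log2((counts[token] + 1) / (total + vocab_size))
        for token in tokens
    ]


# Hypothetical mini-corpus of explanation-like tokens.
corpus = "you move the piece then you draw a card".split()
surprisals = unigram_surprisal(["you", "card", "dice"], corpus)
```

Under this toy model, a frequent token such as "you" receives lower surprisal than a rarer one such as "card", and a token absent from the corpus receives the highest value.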