Staff View: Cracking the Code of Juxtaposition: Can AI Models Understand the Humorous Contradictions :: Library Catalog

Bibliographic Details
Main Authors:	Hu, Zhe; Liang, Tuo; Li, Jing; Lu, Yiren; Zhou, Yunlai; Qiao, Yiran; Ma, Jing; Yin, Yu
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2405.19088

_version_	1866915937337212928
author	Hu, Zhe; Liang, Tuo; Li, Jing; Lu, Yiren; Zhou, Yunlai; Qiao, Yiran; Ma, Jing; Yin, Yu
author_facet	Hu, Zhe; Liang, Tuo; Li, Jing; Lu, Yiren; Zhou, Yunlai; Qiao, Yiran; Ma, Jing; Yin, Yu
contents	Recent advancements in large multimodal language models have demonstrated remarkable proficiency across a wide range of tasks. Yet, these models still struggle with understanding the nuances of human humor through juxtaposition, particularly when it involves nonlinear narratives that underpin many jokes and humor cues. This paper investigates this challenge by focusing on comics with contradictory narratives, where each comic consists of two panels that create a humorous contradiction. We introduce the YesBut benchmark, which comprises tasks of varying difficulty aimed at assessing AI's capabilities in recognizing and interpreting these comics, ranging from literal content comprehension to deep narrative reasoning. Through extensive experimentation and analysis of recent commercial or open-sourced large (vision) language models, we assess their capability to comprehend the complex interplay of the narrative humor inherent in these comics. Our results show that even state-of-the-art models still lag behind human performance on this task. Our findings offer insights into the current limitations and potential improvements for AI in understanding human creative expressions.
format	Preprint
id	arxiv_https___arxiv_org_abs_2405_19088
institution	arXiv
publishDate	2024
record_format	arxiv
title	Cracking the Code of Juxtaposition: Can AI Models Understand the Humorous Contradictions
topic	Computation and Language; Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2405.19088
