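Staff View: Memorization $\neq$ Understanding: Do Large Language Models Have the Ability of Scenario Cognition? :: Library Catalog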

Bibliographic Details
Main Authors:	Ma, Boxiang; Li, Ru; Wang, Yuanlong; Tan, Hongye; Li, Xiaoli
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2509.04866

_version_	1866918136131878912
author	Ma, Boxiang; Li, Ru; Wang, Yuanlong; Tan, Hongye; Li, Xiaoli
author_facet	Ma, Boxiang; Li, Ru; Wang, Yuanlong; Tan, Hongye; Li, Xiaoli
contents	Driven by vast and diverse textual data, large language models (LLMs) have demonstrated impressive performance across numerous natural language processing (NLP) tasks. Yet, a critical question persists: does their generalization arise from mere memorization of training data or from deep semantic understanding? To investigate this, we propose a bi-perspective evaluation framework to assess LLMs' scenario cognition - the ability to link semantic scenario elements with their arguments in context. Specifically, we introduce a novel scenario-based dataset comprising diverse textual descriptions of fictional facts, annotated with scenario elements. LLMs are evaluated through their capacity to answer scenario-related questions (model output perspective) and via probing their internal representations for encoded scenario elements-argument associations (internal representation perspective). Our experiments reveal that current LLMs predominantly rely on superficial memorization, failing to achieve robust semantic scenario cognition, even in simple cases. These findings expose critical limitations in LLMs' semantic understanding and offer cognitive insights for advancing their capabilities.
format	Preprint
id	arxiv_https___arxiv_org_abs_2509_04866
institution	arXiv
publishDate	2025
record_format	arxiv
title	Memorization $\neq$ Understanding: Do Large Language Models Have the Ability of Scenario Cognition?
topic	Computation and Language
url	https://arxiv.org/abs/2509.04866

Similar Items