Staff View: Towards Human Cognition: Visual Context Guides Syntactic Priming in Fusion-Encoded Models :: Library Catalog

Bibliographic Details
Main Authors:	Xiao, Bushi; Bennie, Michael; Bardhan, Jayetri; Wang, Daisy Zhe
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2502.17669

_version_	1866912653210812416
author	Xiao, Bushi; Bennie, Michael; Bardhan, Jayetri; Wang, Daisy Zhe
author_facet	Xiao, Bushi; Bennie, Michael; Bardhan, Jayetri; Wang, Daisy Zhe
contents	Structural priming is a cognitive phenomenon where exposure to a particular syntactic structure increases the likelihood of producing the same structure in subsequent utterances. While humans consistently demonstrate structural priming effects across various linguistic contexts, it remains unclear whether multimodal large language models (MLLMs) exhibit similar syntactic preservation behaviors. We introduce PRISMATIC, the first multimodal structural priming dataset, which advances computational linguistics by providing a standardized benchmark for investigating syntax-vision interactions. We propose the Syntactic Preservation Index (SPI), a novel reference-free evaluation metric designed specifically to assess structural priming effects at the sentence level. Using this metric, we constructed and tested models with two different multimodal encoding architectures to investigate their structural preservation capabilities. Our experimental results demonstrate that models with both encoding methods show comparable syntactic priming effects. However, only fusion-encoded models exhibit robust positive correlations between priming effects and visual similarity, suggesting a cognitive process more aligned with human psycholinguistic patterns. This work provides new insights into evaluating and understanding how syntactic information is processed in multimodal language models.
format	Preprint
id	arxiv_https___arxiv_org_abs_2502_17669
institution	arXiv
publishDate	2025
record_format	arxiv
title	Towards Human Cognition: Visual Context Guides Syntactic Priming in Fusion-Encoded Models
topic	Computation and Language
url	https://arxiv.org/abs/2502.17669

Similar Items