Bibliographic Details
Main Authors: Li, Zehan, Pan, Ruhua, Pi, Xinyu
Format: Preprint
Published: 2025
Subjects: Computation and Language
Online Access: https://arxiv.org/abs/2504.07459
author Li, Zehan
Pan, Ruhua
Pi, Xinyu
contents We propose a novel framework for generating causal graphs from narrative texts, bridging high-level causality and detailed event-specific relationships. Our method first extracts concise, agent-centered vertices using large language model (LLM)-based summarization. We introduce an "Expert Index," comprising seven linguistically informed features, integrated into a Situation-Task-Action-Consequence (STAC) classification model. This hybrid system, combining RoBERTa embeddings with the Expert Index, achieves superior precision in causal link identification compared to pure LLM-based approaches. Finally, a structured five-iteration prompting process refines and constructs connected causal graphs. Experiments on 100 narrative chapters and short stories demonstrate that our approach consistently outperforms GPT-4o and Claude 3.5 in causal graph quality, while maintaining readability. The open-source tool provides an interpretable, efficient solution for capturing nuanced causal chains in narratives.
format Preprint
id arxiv_https___arxiv_org_abs_2504_07459
institution arXiv
publishDate 2025
record_format arxiv
title Beyond LLMs: A Linguistic Approach to Causal Graph Generation from Narrative Texts
topic Computation and Language
url https://arxiv.org/abs/2504.07459