Staff View: LexTime: A Benchmark for Temporal Ordering of Legal Events :: Library Catalog

Bibliographic Details
Main Authors:	Barale, Claire; Barrett, Leslie; Bajaj, Vikram Sunil; Rovatsos, Michael
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2506.04041

_version_	1866909887240339456
author	Barale, Claire; Barrett, Leslie; Bajaj, Vikram Sunil; Rovatsos, Michael
author_facet	Barale, Claire; Barrett, Leslie; Bajaj, Vikram Sunil; Rovatsos, Michael
contents	Understanding temporal relationships and accurately reconstructing the event timeline is important for case law analysis, compliance monitoring, and legal summarization. However, existing benchmarks lack specialized language evaluation, leaving a gap in understanding how LLMs handle event ordering in legal contexts. We introduce LexTime, a dataset designed to evaluate LLMs' event ordering capabilities in legal language, consisting of 512 instances from U.S. Federal Complaints with annotated event pairs and their temporal relations. Our findings show that (1) LLMs are more accurate on legal event ordering than on narrative texts (up to +10.5%); (2) longer input contexts and implicit events boost accuracy, reaching 80.8% for implicit-explicit event pairs; (3) legal linguistic complexities and nested clauses remain a challenge. While performance is promising, specific features of legal texts remain a bottleneck for legal temporal event reasoning, and we propose concrete modeling directions to better address them.
format	Preprint
id	arxiv_https___arxiv_org_abs_2506_04041
institution	arXiv
publishDate	2025
record_format	arxiv
title	LexTime: A Benchmark for Temporal Ordering of Legal Events
topic	Computation and Language
url	https://arxiv.org/abs/2506.04041

Similar Items