Staff View: On the Temporal Question-Answering Capabilities of Large Language Models Over Anonymized Data :: Library Catalog

Bibliographic Details
Main Authors:	Ruiz, Alfredo Garrachón; de la Rosa, Tomás; Borrajo, Daniel
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2504.07646

_version_	1866912741913001984
author	Ruiz, Alfredo Garrachón; de la Rosa, Tomás; Borrajo, Daniel
author_facet	Ruiz, Alfredo Garrachón; de la Rosa, Tomás; Borrajo, Daniel
contents	The applicability of Large Language Models (LLMs) in temporal reasoning tasks over data that is not present during training is still a field that remains to be explored. In this paper we work on this topic, focusing on structured and semi-structured anonymized data. We not only develop a direct LLM pipeline, but also compare various methodologies and conduct an in-depth analysis. We identified and examined seventeen common temporal reasoning tasks in natural language, focusing on their algorithmic components. To assess LLM performance, we created the Reasoning and Answering Temporal Ability dataset (RATA), featuring semi-structured anonymized data to ensure reliance on reasoning rather than on prior knowledge. We compared several methodologies, involving SoTA techniques such as Tree-of-Thought, self-reflexion and code execution, tuned specifically for this scenario. Our results suggest that achieving scalable and reliable solutions requires more than just standalone LLMs, highlighting the need for integrated approaches.
format	Preprint
id	arxiv_https___arxiv_org_abs_2504_07646
institution	arXiv
publishDate	2025
record_format	arxiv
title	On the Temporal Question-Answering Capabilities of Large Language Models Over Anonymized Data
topic	Computation and Language; Artificial Intelligence
url	https://arxiv.org/abs/2504.07646

Similar Items