| Main Authors: | Schmoll, Jonathan; Jatowt, Adam |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2512.24289 |
Similar Items
Navigating Tomorrow: Reliably Assessing Large Language Models Performance on Future Event Prediction
by: Nako, Petraq, et al.
Published: (2025)
Analyzing the Role of Context in Forecasting with Large Language Models
by: Mutschlechner, Gerrit, et al.
Published: (2025)
Temporal Blind Spots in Large Language Models
by: Wallat, Jonas, et al.
Published: (2024)
Pretraining Exposure Explains Popularity Judgments in Large Language Models
by: Mozafari, Jamshid, et al.
Published: (2026)
Statements: Universal Information Extraction from Tables with Large Language Models for ESG KPIs
by: Mishra, Lokesh, et al.
Published: (2024)
Question Difficulty Estimation for Large Language Models via Answer Plausibility Scoring
by: Mozafari, Jamshid, et al.
Published: (2026)
SustainableQA: A Comprehensive Question Answering Dataset for Corporate Sustainability and EU Taxonomy Reporting
by: Ali, Mohammed, et al.
Published: (2025)
Towards Effective Time-Aware Language Representation: Exploring Enhanced Temporal Understanding in Language Models
by: Wang, Jiexin, et al.
Published: (2024)
Evaluating List Construction and Temporal Understanding capabilities of Large Language Models
by: Dumitru, Alexandru, et al.
Published: (2025)
Transformers and Language Models in Form Understanding: A Comprehensive Review of Scanned Document Analysis
by: Abdallah, Abdelrahman, et al.
Published: (2024)
Wisdom of the Crowds in Forecasting: Forecast Summarization for Supporting Future Event Prediction
by: Saha, Anisha, et al.
Published: (2025)
Temporal Validity Change Prediction
by: Wenzel, Georg, et al.
Published: (2024)
ChroniclingAmericaQA: A Large-scale Question Answering Dataset based on Historical American Newspaper Pages
by: Piryani, Bhawna, et al.
Published: (2024)
Exploring NLP Benchmarks in an Extremely Low-Resource Setting
by: Nuha, Ulin, et al.
Published: (2025)
Generator-Retriever-Generator Approach for Open-Domain Question Answering
by: Abdallah, Abdelrahman, et al.
Published: (2023)
AMuRD: Annotated Arabic-English Receipt Dataset for Key Information Extraction and Classification
by: Abdallah, Abdelrahman, et al.
Published: (2023)
Enriching Taxonomies Using Large Language Models
by: Ghamlouch, Zeinab, et al.
Published: (2025)
TriviaHG: A Dataset for Automatic Hint Generation from Factoid Questions
by: Mozafari, Jamshid, et al.
Published: (2024)
How often do Answers Change? Estimating Recency Requirements in Question Answering
by: Piryani, Bhawna, et al.
Published: (2026)
How Good are LLM-based Rerankers? An Empirical Analysis of State-of-the-Art Reranking Models
by: Abdallah, Abdelrahman, et al.
Published: (2025)
Wrong Answers Can Also Be Useful: PlausibleQA -- A Large-Scale QA Dataset with Answer Plausibility Scores
by: Mozafari, Jamshid, et al.
Published: (2025)
Evaluating Answer Reranking Strategies in Time-sensitive Question Answering
by: Kardan, Mehmet, et al.
Published: (2025)
Context Convergence Improves Answering Inferential Questions
by: Mozafari, Jamshid, et al.
Published: (2026)
WikiHint: A Human-Annotated Dataset for Hint Ranking and Generation
by: Mozafari, Jamshid, et al.
Published: (2024)
ASRank: Zero-Shot Re-Ranking with Answer Scent for Document Retrieval
by: Abdallah, Abdelrahman, et al.
Published: (2025)
Detecting Temporal Ambiguity in Questions
by: Piryani, Bhawna, et al.
Published: (2024)
ComplexTempQA: A 100m Dataset for Complex Temporal Question Answering
by: Gruber, Raphael, et al.
Published: (2024)
Multi-hop Question Answering
by: Mavi, Vaibhav, et al.
Published: (2022)
Is Your AI-Generated Code Really Safe? Evaluating Large Language Models on Secure Code Generation with CodeSecEval
by: Wang, Jiexin, et al.
Published: (2024)
Detecting Future-related Contexts of Entity Mentions
by: Prashar, Puneet, et al.
Published: (2025)
PARSE: An Open-Domain Reasoning Question Answering Benchmark for Persian
by: Mozafari, Jamshid, et al.
Published: (2026)
Taxonomy Inference for Tabular Data Using Large Language Models
by: Wu, Zhenyu, et al.
Published: (2025)
A Taxonomy for Data Contamination in Large Language Models
by: Palavalli, Medha, et al.
Published: (2024)
Navigating the Landscape of Hint Generation Research: From the Past to the Future
by: Jangra, Anubhav, et al.
Published: (2024)
Taxonomy-based CheckList for Large Language Model Evaluation
by: Zhang, Damin
Published: (2023)
TaxoAlign: Scholarly Taxonomy Generation Using Language Models
by: Lahiri, Avishek, et al.
Published: (2025)
DeAR: Dual-Stage Document Reranking with Reasoning Agents via LLM Distillation
by: Abdallah, Abdelrahman, et al.
Published: (2025)
HintEval: A Comprehensive Framework for Hint Generation and Evaluation for Questions
by: Mozafari, Jamshid, et al.
Published: (2025)
Inferential Question Answering
by: Mozafari, Jamshid, et al.
Published: (2026)
Exploring Hint Generation Approaches in Open-Domain Question Answering
by: Mozafari, Jamshid, et al.
Published: (2024)