Library Catalog

Bibliographic Details
Main Authors:	Piryani, Bhawna; Mert, Zehra; Jatowt, Adam
Format:	Preprint
Published:	2026
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2603.16544

Similar Items

Evaluating Answer Reranking Strategies in Time-sensitive Question Answering
by: Kardan, Mehmet, et al.
Published: (2025)

Question Difficulty Estimation for Large Language Models via Answer Plausibility Scoring
by: Mozafari, Jamshid, et al.
Published: (2026)

Context Convergence Improves Answering Inferential Questions
by: Mozafari, Jamshid, et al.
Published: (2026)

ChroniclingAmericaQA: A Large-scale Question Answering Dataset based on Historical American Newspaper Pages
by: Piryani, Bhawna, et al.
Published: (2024)

Exploring Hint Generation Approaches in Open-Domain Question Answering
by: Mozafari, Jamshid, et al.
Published: (2024)

Wrong Answers Can Also Be Useful: PlausibleQA -- A Large-Scale QA Dataset with Answer Plausibility Scores
by: Mozafari, Jamshid, et al.
Published: (2025)

It's High Time: A Survey of Temporal Question Answering
by: Piryani, Bhawna, et al.
Published: (2025)

Evaluating Robustness of LLMs in Question Answering on Multilingual Noisy OCR Data
by: Piryani, Bhawna, et al.
Published: (2025)

ASRank: Zero-Shot Re-Ranking with Answer Scent for Document Retrieval
by: Abdallah, Abdelrahman, et al.
Published: (2025)

Detecting Temporal Ambiguity in Questions
by: Piryani, Bhawna, et al.
Published: (2024)

Pretraining Exposure Explains Popularity Judgments in Large Language Models
by: Mozafari, Jamshid, et al.
Published: (2026)

HintEval: A Comprehensive Framework for Hint Generation and Evaluation for Questions
by: Mozafari, Jamshid, et al.
Published: (2025)

TempRetriever: Fusion-based Temporal Dense Passage Retrieval for Time-Sensitive Questions
by: Abdallah, Abdelrahman, et al.
Published: (2025)

DynRank: Improving Passage Retrieval with Dynamic Zero-Shot Prompting Based on Question Classification
by: Abdallah, Abdelrahman, et al.
Published: (2024)

DeAR: Dual-Stage Document Reranking with Reasoning Agents via LLM Distillation
by: Abdallah, Abdelrahman, et al.
Published: (2025)

How Good are LLM-based Rerankers? An Empirical Analysis of State-of-the-Art Reranking Models
by: Abdallah, Abdelrahman, et al.
Published: (2025)

From Retrieval to Generation: Comparing Different Approaches
by: Abdallah, Abdelrahman, et al.
Published: (2025)

Rankify: A Comprehensive Python Toolkit for Retrieval, Re-Ranking, and Retrieval-Augmented Generation
by: Abdallah, Abdelrahman, et al.
Published: (2025)

Generator-Retriever-Generator Approach for Open-Domain Question Answering
by: Abdallah, Abdelrahman, et al.
Published: (2023)

Multi-hop Question Answering
by: Mavi, Vaibhav, et al.
Published: (2022)

Inferential Question Answering
by: Mozafari, Jamshid, et al.
Published: (2026)

PARSE: An Open-Domain Reasoning Question Answering Benchmark for Persian
by: Mozafari, Jamshid, et al.
Published: (2026)

ComplexTempQA: A 100m Dataset for Complex Temporal Question Answering
by: Gruber, Raphael, et al.
Published: (2024)

BracketRank: Large Language Model Document Ranking via Reasoning-based Competitive Elimination
by: Abdallah, Abdelrahman, et al.
Published: (2026)

Question: How do Large Language Models perform on the Question Answering tasks? Answer:
by: Fischer, Kevin, et al.
Published: (2024)

ArabicaQA: A Comprehensive Dataset for Arabic Question Answering
by: Abdallah, Abdelrahman, et al.
Published: (2024)

TriviaHG: A Dataset for Automatic Hint Generation from Factoid Questions
by: Mozafari, Jamshid, et al.
Published: (2024)

Temporal Validity Change Prediction
by: Wenzel, Georg, et al.
Published: (2024)

Question Answering with LLMs and Learning from Answer Sets
by: Borroto, Manuel, et al.
Published: (2025)

Graph Guided Question Answer Generation for Procedural Question-Answering
by: Pham, Hai X., et al.
Published: (2024)

Automated Analysis of Sustainability Reports: Using Large Language Models for the Extraction and Prediction of EU Taxonomy-Compliant KPIs
by: Schmoll, Jonathan, et al.
Published: (2025)

Exploring NLP Benchmarks in an Extremely Low-Resource Setting
by: Nuha, Ulin, et al.
Published: (2025)

Consensus or Conflict? Fine-Grained Evaluation of Conflicting Answers in Question-Answering
by: Nachshoni, Eviatar, et al.
Published: (2025)

RankArena: A Unified Platform for Evaluating Retrieval, Reranking and RAG with Human and LLM Feedback
by: Abdallah, Abdelrahman, et al.
Published: (2025)

Is That Your Final Answer? Test-Time Scaling Improves Selective Question Answering
by: Jurayj, William, et al.
Published: (2025)

Reverse Question Answering: Can an LLM Write a Question so Hard (or Bad) that it Can't Answer?
by: Balepur, Nishant, et al.
Published: (2024)

Analyzing the Role of Context in Forecasting with Large Language Models
by: Mutschlechner, Gerrit, et al.
Published: (2025)

Navigating Tomorrow: Reliably Assessing Large Language Models Performance on Future Event Prediction
by: Nako, Petraq, et al.
Published: (2025)

A Dataset of Open-Domain Question Answering with Multiple-Span Answers
by: Luo, Zhiyi, et al.
Published: (2024)

General Table Question Answering via Answer-Formula Joint Generation
by: Wang, Zhongyuan, et al.
Published: (2025)