:: Library Catalog

Bibliographic Details
Main Authors:	Abdallah, Abdelrahman; Holdcroft, Jamie; Ali, Mohammed; Jatowt, Adam
Format:	Preprint
Published:	2026
Subjects:	Information Retrieval
Online Access:	https://arxiv.org/abs/2604.03676

Similar Items

TEMPO: A Realistic Multi-Domain Benchmark for Temporal Reasoning-Intensive Retrieval
by: Abdallah, Abdelrahman, et al.
Published: (2026)

SustainableQA: A Comprehensive Question Answering Dataset for Corporate Sustainability and EU Taxonomy Reporting
by: Ali, Mohammed, et al.
Published: (2025)

RECOR: Reasoning-focused Multi-turn Conversational Retrieval Benchmark
by: Ali, Mohammed, et al.
Published: (2026)

BracketRank: Large Language Model Document Ranking via Reasoning-based Competitive Elimination
by: Abdallah, Abdelrahman, et al.
Published: (2026)

How Good are LLM-based Rerankers? An Empirical Analysis of State-of-the-Art Reranking Models
by: Abdallah, Abdelrahman, et al.
Published: (2025)

Negative Sampling Techniques in Information Retrieval: A Survey
by: Wischounig, Laurin, et al.
Published: (2026)

RankArena: A Unified Platform for Evaluating Retrieval, Reranking and RAG with Human and LLM Feedback
by: Abdallah, Abdelrahman, et al.
Published: (2025)

Rankify: A Comprehensive Python Toolkit for Retrieval, Re-Ranking, and Retrieval-Augmented Generation
by: Abdallah, Abdelrahman, et al.
Published: (2025)

DeAR: Dual-Stage Document Reranking with Reasoning Agents via LLM Distillation
by: Abdallah, Abdelrahman, et al.
Published: (2025)

A Study into Investigating Temporal Robustness of LLMs
by: Wallat, Jonas, et al.
Published: (2025)

TempRetriever: Fusion-based Temporal Dense Passage Retrieval for Time-Sensitive Questions
by: Abdallah, Abdelrahman, et al.
Published: (2025)

MM-BRIGHT: A Multi-Task Multimodal Benchmark for Reasoning-Intensive Retrieval
by: Abdallah, Abdelrahman, et al.
Published: (2026)

Wrong Answers Can Also Be Useful: PlausibleQA -- A Large-Scale QA Dataset with Answer Plausibility Scores
by: Mozafari, Jamshid, et al.
Published: (2025)

HintEval: A Comprehensive Framework for Hint Generation and Evaluation for Questions
by: Mozafari, Jamshid, et al.
Published: (2025)

Exploring Hint Generation Approaches in Open-Domain Question Answering
by: Mozafari, Jamshid, et al.
Published: (2024)

It's High Time: A Survey of Temporal Question Answering
by: Piryani, Bhawna, et al.
Published: (2025)

HIVE: Query, Hypothesize, Verify An LLM Framework for Multimodal Reasoning-Intensive Retrieval
by: Abdalla, Mahmoud, et al.
Published: (2026)

ArabicaQA: A Comprehensive Dataset for Arabic Question Answering
by: Abdallah, Abdelrahman, et al.
Published: (2024)

PARSE: An Open-Domain Reasoning Question Answering Benchmark for Persian
by: Mozafari, Jamshid, et al.
Published: (2026)

LLMTemporalComparator: A Tool for Analysing Differences in Temporal Adaptations of Large Language Models
by: Fritsch, Reinhard Friedrich, et al.
Published: (2024)

Analyzing the Role of Context in Forecasting with Large Language Models
by: Mutschlechner, Gerrit, et al.
Published: (2025)

Navigating Tomorrow: Reliably Assessing Large Language Models Performance on Future Event Prediction
by: Nako, Petraq, et al.
Published: (2025)

The Impact of International Collaborations with Highly Publishing Countries in Computer Science
by: Espes, Alberto Gomez, et al.
Published: (2025)

Enhancing Knowledge Retrieval with In-Context Learning and Semantic Search through Generative AI
by: Ghali, Mohammed-Khalil, et al.
Published: (2024)

Is Semantic Chunking Worth the Computational Cost?
by: Qu, Renyi, et al.
Published: (2024)

Wisdom of the Crowds in Forecasting: Forecast Summarization for Supporting Future Event Prediction
by: Saha, Anisha, et al.
Published: (2025)

Context Convergence Improves Answering Inferential Questions
by: Mozafari, Jamshid, et al.
Published: (2026)

Question Difficulty Estimation for Large Language Models via Answer Plausibility Scoring
by: Mozafari, Jamshid, et al.
Published: (2026)

WikiHint: A Human-Annotated Dataset for Hint Ranking and Generation
by: Mozafari, Jamshid, et al.
Published: (2024)

Evaluating Answer Reranking Strategies in Time-sensitive Question Answering
by: Kardan, Mehmet, et al.
Published: (2025)

MARVEL: Multimodal Adaptive Reasoning-intensiVe Expand-rerank and retrievaL
by: Kasem, Mahmoud SalahEldin, et al.
Published: (2026)

Detecting Future-related Contexts of Entity Mentions
by: Prashar, Puneet, et al.
Published: (2025)

Multi-hop Question Answering
by: Mavi, Vaibhav, et al.
Published: (2022)

Inferential Question Answering
by: Mozafari, Jamshid, et al.
Published: (2026)

A Picture is Worth a Thousand Words? An Empirical Study of Aggregation Strategies for Visual Financial Document Retrieval
by: Lim, Ho Hung, et al.
Published: (2026)

An Empirical Study of Position Bias in Modern Information Retrieval
by: Zeng, Ziyang, et al.
Published: (2025)

Captions Are Worth a Thousand Words: Enhancing Product Retrieval with Pretrained Image-to-Text Models
by: Tang, Jason, et al.
Published: (2024)

Cost-Aware Retrieval-Augmentation Reasoning Models with Adaptive Retrieval Depth
by: Hashemi, Helia, et al.
Published: (2025)

AdversarialCoT: Single-Document Retrieval Poisoning for LLM Reasoning
by: Song, Hongru, et al.
Published: (2026)

Evaluating LLM-Based Mobile App Recommendations: An Empirical Study
by: Motger, Quim, et al.
Published: (2025)