:: Library Catalog

Bibliographic Details
Main Authors:	Staudinger, Moritz; Kusa, Wojciech; Hanbury, Allan
Format:	Preprint
Published:	2026
Subjects:	Information Retrieval
Online Access:	https://arxiv.org/abs/2604.05766

Similar Items

A Reproducibility and Generalizability Study of Large Language Models for Query Generation
by: Staudinger, Moritz, et al.
Published: (2024)

Compare: A Framework for Scientific Comparisons
by: Staudinger, Moritz, et al.
Published: (2025)

ASPIRE: Assistive System for Performance Evaluation in IR
by: Peikos, Georgios, et al.
Published: (2024)

Reproducible Hybrid Time-Travel Retrieval in Evolving Corpora
by: Staudinger, Moritz, et al.
Published: (2024)

LLM-Driven Usefulness Labeling for IR Evaluation
by: Dewan, Mouly, et al.
Published: (2025)

Lighting the Way for BRIGHT: Reproducible Baselines with Anserini, Pyserini, and RankLLM
by: Sharifymoghaddam, Sahel, et al.
Published: (2025)

SimEval-IR: A Unified Toolkit and Benchmark Suite for Evaluating User Simulators and Search Sessions
by: Zerhoudi, Saber
Published: (2026)

FactIR: A Real-World Zero-shot Open-Domain Retrieval Benchmark for Fact-Checking
by: V, Venktesh, et al.
Published: (2025)

From Questions to Trust Reports: A LLM-IR Framework for the TREC 2025 DRAGUN Track
by: Alwasiak, Ignacy, et al.
Published: (2026)

CoIR: A Comprehensive Benchmark for Code Information Retrieval Models
by: Li, Xiangyang, et al.
Published: (2024)

DisastIR: A Comprehensive Information Retrieval Benchmark for Disaster Management
by: Yin, Kai, et al.
Published: (2025)

PosIR: Position-Aware Heterogeneous Information Retrieval Benchmark
by: Zeng, Ziyang, et al.
Published: (2026)

Training on the Test Model: Contamination in Ranking Distillation
by: Kalal, Vishakha Suresh, et al.
Published: (2024)

A Comparison of Methods for Evaluating Generative IR
by: Arabzadeh, Negar, et al.
Published: (2024)

Meta Lattice: Model Space Redesign for Cost-Effective Industry-Scale Ads Recommendations
by: Luo, Liang, et al.
Published: (2025)

Probing Ranking LLMs: A Mechanistic Analysis for Information Retrieval
by: Chowdhury, Tanya, et al.
Published: (2024)

Indexing Depth and Retrieval Effectiveness
by: Seely, Barbara J.
Published: (1972)

MiLQ: Benchmarking IR Models for Bilingual Web Search with Mixed Language Queries
by: Kim, Jonghwi, et al.
Published: (2025)

Revisiting BPR: A Replicability Study of a Common Recommender System Baseline
by: Milogradskii, Aleksandr, et al.
Published: (2024)

Recommending Missed Citations Identified by Reviewers: A New Task, Dataset and Baselines
by: Long, Kehan, et al.
Published: (2024)

MechIR: A Mechanistic Interpretability Framework for Information Retrieval
by: Parry, Andrew, et al.
Published: (2025)

Scientometric Analysis of the German IR Community within TREC & CLEF
by: Kruff, A. K., et al.
Published: (2025)

Reproducing Adaptive Reranking for Reasoning-Intensive IR
by: Rathee, Mandeep, et al.
Published: (2026)

ExcluIR: Exclusionary Neural Information Retrieval
by: Zhang, Wenhao, et al.
Published: (2024)

Evaluation of Temporal Change in IR Test Collections
by: Keller, Jüri, et al.
Published: (2024)

The Effects of Demographic Instructions on LLM Personas
by: de Paula, Angel Felipe Magnossão, et al.
Published: (2025)

ViLLA-MMBench: A Unified Benchmark Suite for LLM-Augmented Multimodal Movie Recommendation
by: Nazary, Fatemeh, et al.
Published: (2025)

Reproducing NevIR: Negation in Neural Information Retrieval
by: Elsen, Coen van den, et al.
Published: (2025)

Lost in Transliteration: Bridging the Script Gap in Neural IR
by: Chari, Andreas, et al.
Published: (2025)

LifeIR at the NTCIR-18 Lifelog-6 Task
by: Chen, Jiahan, et al.
Published: (2025)

Improving GenIR Systems Based on User Feedback
by: Ai, Qingyao, et al.
Published: (2025)

Establishing Performance Baselines in Fine-Tuning, Retrieval-Augmented Generation and Soft-Prompting for Non-Specialist LLM Users
by: Dodgson, Jennifer, et al.
Published: (2023)

Accuracy Assessment of OpenAlex and Clarivate Scholar ID with an LLM-Assisted Benchmark
by: Zhao, Renyu, et al.
Published: (2025)

mFollowIR: a Multilingual Benchmark for Instruction Following in Retrieval
by: Weller, Orion, et al.
Published: (2025)

Cloud-Based Benchmarking of Medical Image Analysis
by: Hanbury, Allan

MultiConIR: Towards multi-condition Information Retrieval
by: Lu, Xuan, et al.
Published: (2025)

ir_explain: a Python Library of Explainable IR Methods
by: Saha, Sourav, et al.
Published: (2024)

Foundations of GenIR
by: Ai, Qingyao, et al.
Published: (2025)

Same Ranking, Different Winner: How Scoring Targets Shape LLM Memory Benchmarks
by: Panthi, Sugam, et al.
Published: (2026)

Discovering Biases in Information Retrieval Models Using Relevance Thesaurus as Global Explanation
by: Kim, Youngwoo, et al.
Published: (2024)