Staff View: OBLIQ-Bench: Exposing Overlooked Bottlenecks in Modern Retrievers with Latent and Implicit Queries :: Library Catalog

Bibliographic Details
Main Authors:	Tchuindjo, Diane; Shah, Devavrat; Khattab, Omar
Format:	Preprint
Published:	2026
Subjects:	Information Retrieval; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2605.06235

_version_	1866910270623842304
author	Tchuindjo, Diane; Shah, Devavrat; Khattab, Omar
author_facet	Tchuindjo, Diane; Shah, Devavrat; Khattab, Omar
contents	Retrieval benchmarks are increasingly saturating, but we argue that efficient search is far from a solved problem. We identify a class of queries we call oblique, which seek documents that instantiate a latent pattern, like finding all tweets that express an implicit stance, chat logs that demonstrate a particular failure mode, or transcripts that match an abstract scenario. We study three mechanisms through which obliqueness may arise and introduce OBLIQ-Bench, a suite of five oblique search problems over real long-tail corpora. OBLIQ-Bench exposes an overlooked asymmetry between retrieval and verification, where reasoning LLMs reliably recognize latent relevance whenever relevant documents are surfaced, but even sophisticated retrieval pipelines fail to surface most relevant documents in the first place. We hope that OBLIQ-Bench will drive research into retrieval architectures that efficiently capture latent patterns and implicit signals in large corpora.
format	Preprint
id	arxiv_https___arxiv_org_abs_2605_06235
institution	arXiv
publishDate	2026
record_format	arxiv
title	OBLIQ-Bench: Exposing Overlooked Bottlenecks in Modern Retrievers with Latent and Implicit Queries
topic	Information Retrieval; Artificial Intelligence
url	https://arxiv.org/abs/2605.06235