:: Library Catalog

Bibliographic Details
Main Authors:	Ayaou, Iliass; Cavallucci, Denis; Chibane, Hicham
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Information Retrieval
Online Access:	https://arxiv.org/abs/2506.22141

Similar Items

PatenTEB: A Comprehensive Benchmark and Model Family for Patent Text Embedding
by: Ayaou, Iliass, et al.
Published: (2025)

Multi-task retriever fine-tuning for domain-specific and efficient RAG
by: Béchard, Patrice, et al.
Published: (2025)

The Coverage Illusion: From Pre-retrieval Routing Failure to Post-retrieval Cascades in a Production RAG System
by: Hussain, Zafar, et al.
Published: (2026)

Enhancing patent retrieval using automated patent summarization
by: Kamateri, Eleni, et al.
Published: (2025)

A comparative analysis of embedding models for patent similarity
by: Ascione, Grazia Sveva, et al.
Published: (2024)

ViFactCheck: A New Benchmark Dataset and Methods for Multi-domain News Fact-Checking in Vietnamese
by: Hoa, Tran Thai, et al.
Published: (2024)

Provence: efficient and robust context pruning for retrieval-augmented generation
by: Chirkova, Nadezhda, et al.
Published: (2025)

CXMArena: Unified Dataset to benchmark performance in realistic CXM Scenarios
by: Garg, Raghav, et al.
Published: (2025)

REAR: A Relevance-Aware Retrieval-Augmented Framework for Open-Domain Question Answering
by: Wang, Yuhao, et al.
Published: (2024)

Breaking It Down: Domain-Aware Semantic Segmentation for Retrieval Augmented Generation
by: Allamraju, Aparajitha, et al.
Published: (2025)

Multi-Reranker: Maximizing performance of retrieval-augmented generation in the FinanceRAG challenge
by: Lee, Joohyun, et al.
Published: (2024)

Domain-Aware RAG: MoL-Enhanced RL for Efficient Training and Scalable Retrieval
by: Lin, Hao, et al.
Published: (2025)

DragonVerseQA: Open-Domain Long-Form Context-Aware Question-Answering
by: Lahiri, Aritra Kumar, et al.
Published: (2024)

Qilin: A Multimodal Information Retrieval Dataset with APP-level User Sessions
by: Chen, Jia, et al.
Published: (2025)

Advancing continual lifelong learning in neural information retrieval: definition, dataset, framework, and empirical evaluation
by: Hou, Jingrui, et al.
Published: (2023)

Team LA at SCIDOCA shared task 2025: Citation Discovery via relation-based zero-shot retrieval
by: An, Trieu, et al.
Published: (2025)

Evaluation of retrieval-based QA on QUEST-LOFT
by: Scales, Nathan, et al.
Published: (2025)

On the impact of retrieved content representations in RAG Pipelines
by: Ross, Jonathan J, et al.
Published: (2026)

Navigating Through Paper Flood: Advancing LLM-based Paper Evaluation through Domain-Aware Retrieval and Latent Reasoning
by: Zheng, Wuqiang, et al.
Published: (2025)

Dr Web: a modern, query-based web data retrieval engine
by: Prifti, Ylli, et al.
Published: (2025)

Had enough of experts? Quantitative knowledge retrieval from large language models
by: Selby, David, et al.
Published: (2024)

DomainRAG: A Chinese Benchmark for Evaluating Domain-specific Retrieval-Augmented Generation
by: Wang, Shuting, et al.
Published: (2024)

C3PA: An Open Dataset of Expert-Annotated and Regulation-Aware Privacy Policies to Enable Scalable Regulatory Compliance Audits
by: Musa, Maaz Bin, et al.
Published: (2024)

Level-Navi Agent: A Framework and benchmark for Chinese Web Search Agents
by: Hu, Chuanrui, et al.
Published: (2024)

MURAD: A Large-Scale Multi-Domain Unified Reverse Arabic Dictionary Dataset
by: Sibaee, Serry, et al.
Published: (2026)

DEXTER: A Benchmark for open-domain Complex Question Answering using LLMs
by: Prabhu, Venktesh V. Deepali, et al.
Published: (2024)

Presenting Terrorizer: an algorithm for consolidating company names in patent assignees
by: Ascione, Grazia Sveva, et al.
Published: (2024)

Navigating the Knowledge Sea: Planet-scale answer retrieval using LLMs
by: Sarkar, Dipankar
Published: (2024)

CRAWLDoc: A Dataset for Robust Ranking of Bibliographic Documents
by: Karl, Fabian, et al.
Published: (2025)

ArabicaQA: A Comprehensive Dataset for Arabic Question Answering
by: Abdallah, Abdelrahman, et al.
Published: (2024)

Fine-tune the Entire RAG Architecture (including DPR retriever) for Question-Answering
by: Siriwardhana, Shamane, et al.
Published: (2021)

MIRACL-VISION: A Large, multilingual, visual document retrieval benchmark
by: Osmulski, Radek, et al.
Published: (2025)

Large Language Model Empowered Recommendation Meets All-domain Continual Pre-Training
by: Ma, Haokai, et al.
Published: (2025)

WikiHint: A Human-Annotated Dataset for Hint Ranking and Generation
by: Mozafari, Jamshid, et al.
Published: (2024)

On The Persona-based Summarization of Domain-Specific Documents
by: Mullick, Ankan, et al.
Published: (2024)

Trust or Abstain? A Self-Aware RAG Approach
by: Zhu, Xi, et al.
Published: (2026)

DAPR: A Benchmark on Document-Aware Passage Retrieval
by: Wang, Kexin, et al.
Published: (2023)

A Benchmark for Open-Domain Numerical Fact-Checking Enhanced by Claim Decomposition
by: Venktesh, V, et al.
Published: (2025)

Horizon Scans can be accelerated using novel information retrieval and artificial intelligence tools
by: Schmidt, Lena, et al.
Published: (2025)

An Analysis of Datasets, Metrics and Models in Keyphrase Generation
by: Boudin, Florian, et al.
Published: (2025)