Bibliographic Details
Main Authors: Kim, Youngwoo; Rahimi, Razieh; Allan, James
Format: Preprint
Published: 2024
Subjects: Information Retrieval
Online Access: https://arxiv.org/abs/2410.03584
contents Most efforts in interpreting neural relevance models have focused on local explanations, which explain the relevance of a document to a query but are not useful in predicting the model's behavior on unseen query-document pairs. We propose a novel method to globally explain neural relevance models by constructing a "relevance thesaurus" containing semantically relevant query and document term pairs. This thesaurus is used to augment lexical matching models such as BM25 to approximate the neural model's predictions. Our method involves training a neural relevance model to score the relevance of partial query and document segments, which is then used to identify relevant terms across the vocabulary space. We evaluate the obtained thesaurus explanation based on ranking effectiveness and fidelity to the target neural ranking model. Notably, our thesaurus reveals the existence of brand name bias in ranking models, demonstrating one advantage of our explanation method.
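The idea of augmenting BM25 with a relevance thesaurus can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `bm25_with_thesaurus`, the "soft term frequency" formula, the toy corpus, and the thesaurus weight are all assumptions made for illustration, and the paper's actual formulation may differ. The sketch only shows how a (query term, document term) pair with a weight could let a lexical scorer credit a document that lacks the literal query term.

```python
import math
from collections import Counter


def bm25_with_thesaurus(query, doc, corpus, thesaurus, k1=1.2, b=0.75):
    """Score a tokenized doc against a tokenized query with BM25.

    `thesaurus` maps (query_term, doc_term) -> weight in [0, 1].
    Hypothetical scheme: a related document term contributes its
    weight to the query term's frequency (an assumption, not the
    formulation from the paper).
    """
    n_docs = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n_docs
    tf = Counter(doc)
    score = 0.0
    for q in query:
        # Exact matches count fully; thesaurus matches count by weight.
        soft_tf = float(tf[q])
        for d_term, count in tf.items():
            if d_term != q:
                soft_tf += thesaurus.get((q, d_term), 0.0) * count
        if soft_tf == 0.0:
            continue
        # Document frequency and IDF are computed for the query term itself.
        df = sum(1 for d in corpus if q in d)
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        length_norm = k1 * (1 - b + b * len(doc) / avg_len)
        score += idf * soft_tf * (k1 + 1) / (soft_tf + length_norm)
    return score


# Toy data (invented for this example).
corpus = [
    ["cheap", "car", "rental"],
    ["automobile", "hire", "deals"],
    ["weather", "forecast", "today"],
]
thesaurus = {("car", "automobile"): 0.8}  # hypothetical weight

exact = bm25_with_thesaurus(["car"], corpus[0], corpus, thesaurus)
related = bm25_with_thesaurus(["car"], corpus[1], corpus, thesaurus)
plain = bm25_with_thesaurus(["car"], corpus[1], corpus, {})
unrelated = bm25_with_thesaurus(["car"], corpus[2], corpus, thesaurus)
```

With an empty thesaurus the second document scores zero for the query "car"; with the thesaurus entry it receives a positive score that stays below the exact match. Because every contribution comes from an explicit term pair, such a table can be inspected directly, which is what makes it usable as a global explanation.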
format Preprint
id arxiv_https___arxiv_org_abs_2410_03584
institution arXiv
publishDate 2024
record_format arxiv
title Discovering Biases in Information Retrieval Models Using Relevance Thesaurus as Global Explanation
topic Information Retrieval
url https://arxiv.org/abs/2410.03584