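Staff View: Axiomatic Causal Interventions for Reverse Engineering Relevance Computation in Neural Retrieval Models :: Library Catalog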

Bibliographic Details
Main Authors:	Chen, Catherine; Merullo, Jack; Eickhoff, Carsten
Format:	Preprint
Published:	2024
Subjects:	Information Retrieval
Online Access:	https://arxiv.org/abs/2405.02503

_version_	1866910809286770688
author	Chen, Catherine; Merullo, Jack; Eickhoff, Carsten
author_facet	Chen, Catherine; Merullo, Jack; Eickhoff, Carsten
contents	Neural models have demonstrated remarkable performance across diverse ranking tasks. However, the processes and internal mechanisms along which they determine relevance are still largely unknown. Existing approaches for analyzing neural ranker behavior with respect to IR properties rely either on assessing overall model behavior or employing probing methods that may offer an incomplete understanding of causal mechanisms. To provide a more granular understanding of internal model decision-making processes, we propose the use of causal interventions to reverse engineer neural rankers, and demonstrate how mechanistic interpretability methods can be used to isolate components satisfying term-frequency axioms within a ranking model. We identify a group of attention heads that detect duplicate tokens in earlier layers of the model, then communicate with downstream heads to compute overall document relevance. More generally, we propose that this style of mechanistic analysis opens up avenues for reverse engineering the processes neural retrieval models use to compute relevance. This work aims to initiate granular interpretability efforts that will not only benefit retrieval model development and training, but ultimately ensure safer deployment of these models.
format	Preprint
id	arxiv_https___arxiv_org_abs_2405_02503
institution	arXiv
publishDate	2024
record_format	arxiv
title	Axiomatic Causal Interventions for Reverse Engineering Relevance Computation in Neural Retrieval Models
topic	Information Retrieval
url	https://arxiv.org/abs/2405.02503
