Bibliographic Details
Main Author: Diaz, Fernando
Format: Preprint
Published: 2024
Online Access: https://arxiv.org/abs/2410.13680
author Diaz, Fernando
author_facet Diaz, Fernando
contents Traditional evaluation of information access systems has focused primarily on average utility across a set of information needs (information retrieval) or users (recommender systems). In this work, we argue that evaluating only with average metric measurements assumes utilitarian values not aligned with traditions of information access based on equal access. We advocate for pessimistic evaluation of information access systems focusing on worst-case utility. These methods are (a) grounded in ethical and pragmatic concepts, (b) theoretically complementary to existing robustness and fairness methods, and (c) empirically validated across a set of retrieval and recommendation tasks. These results suggest that pessimistic evaluation should be included in existing experimentation processes to better understand the behavior of systems, especially when concerned with principles of social good.
format Preprint
id arxiv_https___arxiv_org_abs_2410_13680
institution arXiv
publishDate 2024
record_format arxiv
title Pessimistic Evaluation
topic Information Retrieval
url https://arxiv.org/abs/2410.13680