Staff View: :: Library Catalog

Saved in:

Bibliographic Details
Main Authors:	Kolling, Camila, Speicher, Till, Nanda, Vedant, Toneva, Mariya, Gummadi, Krishna P.
Format:	Preprint
Published:	2023
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2305.19294
Tags:	Add Tag No Tags, Be the first to tag this record!

_version_	1866918029382647808
author	Kolling, Camila Speicher, Till Nanda, Vedant Toneva, Mariya Gummadi, Krishna P.
author_facet	Kolling, Camila Speicher, Till Nanda, Vedant Toneva, Mariya Gummadi, Krishna P.
contents	Machine learning (ML) algorithms can often exhibit discriminatory behavior, negatively affecting certain populations across protected groups. To address this, numerous debiasing methods, and consequently evaluation measures, have been proposed. Current evaluation measures for debiasing methods suffer from two main limitations: (1) they primarily provide a global estimate of unfairness, failing to provide a more fine-grained analysis, and (2) they predominantly analyze the model output on a specific task, failing to generalize the findings to other tasks. In this work, we introduce Pointwise Normalized Kernel Alignment (PNKA), a pointwise representational similarity measure that addresses these limitations by measuring how debiasing measures affect the intermediate representations of individuals. On tabular data, the use of PNKA reveals previously unknown insights: while group fairness predominantly influences a small subset of the population, maintaining high representational similarity for the majority, individual fairness constraints uniformly impact representations across the entire population, altering nearly every data point. We show that by evaluating representations using PNKA, we can reliably predict the behavior of ML models trained on these representations. Moreover, applying PNKA to language embeddings shows that existing debiasing methods may not perform as intended, failing to remove biases from stereotypical words and sentences. Our findings suggest that current evaluation measures for debiasing methods are insufficient, highlighting the need for a deeper understanding of the effects of debiasing methods, and show how pointwise representational similarity metrics can help with fairness audits.
format	Preprint
id	arxiv_https___arxiv_org_abs_2305_19294
institution	arXiv
publishDate	2023
record_format	arxiv
spellingShingle	Investigating the Effects of Fairness Interventions Using Pointwise Representational Similarity Kolling, Camila Speicher, Till Nanda, Vedant Toneva, Mariya Gummadi, Krishna P. Machine Learning Machine learning (ML) algorithms can often exhibit discriminatory behavior, negatively affecting certain populations across protected groups. To address this, numerous debiasing methods, and consequently evaluation measures, have been proposed. Current evaluation measures for debiasing methods suffer from two main limitations: (1) they primarily provide a global estimate of unfairness, failing to provide a more fine-grained analysis, and (2) they predominantly analyze the model output on a specific task, failing to generalize the findings to other tasks. In this work, we introduce Pointwise Normalized Kernel Alignment (PNKA), a pointwise representational similarity measure that addresses these limitations by measuring how debiasing measures affect the intermediate representations of individuals. On tabular data, the use of PNKA reveals previously unknown insights: while group fairness predominantly influences a small subset of the population, maintaining high representational similarity for the majority, individual fairness constraints uniformly impact representations across the entire population, altering nearly every data point. We show that by evaluating representations using PNKA, we can reliably predict the behavior of ML models trained on these representations. Moreover, applying PNKA to language embeddings shows that existing debiasing methods may not perform as intended, failing to remove biases from stereotypical words and sentences. Our findings suggest that current evaluation measures for debiasing methods are insufficient, highlighting the need for a deeper understanding of the effects of debiasing methods, and show how pointwise representational similarity metrics can help with fairness audits.
title	Investigating the Effects of Fairness Interventions Using Pointwise Representational Similarity
topic	Machine Learning
url	https://arxiv.org/abs/2305.19294

Similar Items