MARC View :: Library Catalog

Bibliographic Details
Main Authors:	Kuzmin, Gleb; Yadav, Neemesh; Smirnov, Ivan; Baldwin, Timothy; Shelmanov, Artem
Format:	Preprint
Published:	2024
Subjects:	Computation and Language; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2407.19345

_version_	1866917950548606976
author	Kuzmin, Gleb; Yadav, Neemesh; Smirnov, Ivan; Baldwin, Timothy; Shelmanov, Artem
author_facet	Kuzmin, Gleb; Yadav, Neemesh; Smirnov, Ivan; Baldwin, Timothy; Shelmanov, Artem
contents	We propose selective debiasing -- an inference-time safety mechanism designed to enhance the overall model quality in terms of prediction performance and fairness, especially in scenarios where retraining the model is impractical. The method draws inspiration from selective classification, where at inference time, predictions with low quality, as indicated by their uncertainty scores, are discarded. In our approach, we identify the potentially biased model predictions and, instead of discarding them, we remove bias from these predictions using LEACE -- a post-processing debiasing method. To select problematic predictions, we propose a bias quantification approach based on KL divergence, which achieves better results than standard uncertainty quantification methods. Experiments on text classification datasets with encoder-based classification models demonstrate that selective debiasing helps to reduce the performance gap between post-processing methods and debiasing techniques from the at-training and pre-processing categories.
format	Preprint
id	arxiv_https___arxiv_org_abs_2407_19345
institution	arXiv
publishDate	2024
record_format	arxiv
title	Inference-Time Selective Debiasing to Enhance Fairness in Text Classification Models
topic	Computation and Language; Artificial Intelligence
url	https://arxiv.org/abs/2407.19345