Staff View: Downstream bias mitigation is all you need :: Library Catalog

Bibliographic Details
Main Authors:	Baksi, Arkadeep; Singh, Rahul; Joshi, Tarun
Format:	Preprint
Published:	2024
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2408.00612

_version_	1866909298867568640
author	Baksi, Arkadeep; Singh, Rahul; Joshi, Tarun
author_facet	Baksi, Arkadeep; Singh, Rahul; Joshi, Tarun
contents	The advent of transformer-based architectures and large language models (LLMs) has significantly advanced the performance of natural language processing (NLP) models. Since these LLMs are trained on huge corpora of data from the web and other sources, there has been a major concern about harmful prejudices that may potentially be transferred from the data. In many applications, these pre-trained LLMs are fine-tuned on task-specific datasets, which can further contribute to biases. This paper studies the extent of biases absorbed by LLMs during pre-training as well as task-specific behaviour after fine-tuning. We found that controlled interventions on pre-trained LLMs, prior to fine-tuning, have minimal effect on lowering biases in classifiers. However, the biases present in domain-specific datasets play a much bigger role, and hence mitigating them at this stage has a bigger impact. While pre-training does matter, once the model has been pre-trained, even slight changes to co-occurrence rates in the fine-tuning dataset have a significant effect on the bias of the model.
format	Preprint
id	arxiv_https___arxiv_org_abs_2408_00612
institution	arXiv
publishDate	2024
record_format	arxiv
spellingShingle	Downstream bias mitigation is all you need Baksi, Arkadeep Singh, Rahul Joshi, Tarun Computation and Language The advent of transformer-based architectures and large language models (LLMs) has significantly advanced the performance of natural language processing (NLP) models. Since these LLMs are trained on huge corpora of data from the web and other sources, there has been a major concern about harmful prejudices that may potentially be transferred from the data. In many applications, these pre-trained LLMs are fine-tuned on task-specific datasets, which can further contribute to biases. This paper studies the extent of biases absorbed by LLMs during pre-training as well as task-specific behaviour after fine-tuning. We found that controlled interventions on pre-trained LLMs, prior to fine-tuning, have minimal effect on lowering biases in classifiers. However, the biases present in domain-specific datasets play a much bigger role, and hence mitigating them at this stage has a bigger impact. While pre-training does matter, once the model has been pre-trained, even slight changes to co-occurrence rates in the fine-tuning dataset have a significant effect on the bias of the model.
title	Downstream bias mitigation is all you need
topic	Computation and Language
url	https://arxiv.org/abs/2408.00612

Similar Items