Bibliographic Details
Main Authors: Arnold, Stefan, Gröbner, Rene, Schreiner, Annika
Format: Preprint
Published: 2024
Subjects: Computation and Language, Artificial Intelligence
Online Access: https://arxiv.org/abs/2407.00764
author Arnold, Stefan
Gröbner, Rene
Schreiner, Annika
contents Differential Privacy (DP) can be applied to raw text by exploiting the spatial arrangement of words in an embedding space. We investigate the implications of such text privatization on Language Models (LMs) and their tendency towards stereotypical associations. Since previous studies documented that linguistic proficiency correlates with stereotypical bias, one could assume that techniques for text privatization, which are known to degrade language modeling capabilities, would cancel out undesirable biases. By testing BERT models trained on texts containing biased statements primed with varying degrees of privacy, our study reveals that while stereotypical bias generally diminishes when privacy is tightened, text privatization does not uniformly equate to diminishing bias across all social domains. This highlights the need for careful diagnosis of bias in LMs that undergo text privatization.
format Preprint
id arxiv_https___arxiv_org_abs_2407_00764
institution arXiv
publishDate 2024
record_format arxiv
title Characterizing Stereotypical Bias from Privacy-preserving Pre-Training
topic Computation and Language
Artificial Intelligence
url https://arxiv.org/abs/2407.00764