Internformat: :: Library Catalog

Gespeichert in:

Bibliographische Detailangaben
Hauptverfasser:	Zhou, Ej, Lu, Weiming
Format:	Preprint
Veröffentlicht:	2025
Schlagworte:	Computation and Language
Online-Zugang:	https://arxiv.org/abs/2504.11183
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

_version_	1866909687692132352
author	Zhou, Ej Lu, Weiming
author_facet	Zhou, Ej Lu, Weiming
contents	Social bias in language models can potentially exacerbate social inequalities. Despite it having garnered wide attention, most research focuses on English data. In a low-resource scenario, the models often perform worse due to insufficient training data. This study aims to leverage high-resource language corpora to evaluate bias and experiment with debiasing methods in low-resource languages. We evaluated the performance of recent multilingual models in five languages: English, Chinese, Russian, Indonesian and Thai, and analyzed four bias dimensions: gender, religion, nationality, and race-color. By constructing multilingual bias evaluation datasets, this study allows fair comparisons between models across languages. We have further investigated three debiasing methods-CDA, Dropout, SenDeb-and demonstrated that debiasing methods from high-resource languages can be effectively transferred to low-resource ones, providing actionable insights for fairness research in multilingual NLP.
format	Preprint
id	arxiv_https___arxiv_org_abs_2504_11183
institution	arXiv
publishDate	2025
record_format	arxiv
spellingShingle	Bias Beyond English: Evaluating Social Bias and Debiasing Methods in a Low-Resource Setting Zhou, Ej Lu, Weiming Computation and Language Social bias in language models can potentially exacerbate social inequalities. Despite it having garnered wide attention, most research focuses on English data. In a low-resource scenario, the models often perform worse due to insufficient training data. This study aims to leverage high-resource language corpora to evaluate bias and experiment with debiasing methods in low-resource languages. We evaluated the performance of recent multilingual models in five languages: English, Chinese, Russian, Indonesian and Thai, and analyzed four bias dimensions: gender, religion, nationality, and race-color. By constructing multilingual bias evaluation datasets, this study allows fair comparisons between models across languages. We have further investigated three debiasing methods-CDA, Dropout, SenDeb-and demonstrated that debiasing methods from high-resource languages can be effectively transferred to low-resource ones, providing actionable insights for fairness research in multilingual NLP.
title	Bias Beyond English: Evaluating Social Bias and Debiasing Methods in a Low-Resource Setting
topic	Computation and Language
url	https://arxiv.org/abs/2504.11183

Ähnliche Einträge