Staff View: A database to support the evaluation of gender biases in GPT-4o output :: Library Catalog

Bibliographic Details
Main Authors:	Mehner, Luise; Fiedler, Lena Alicija Philine; Ammon, Sabine; Kolossa, Dorothea
Format:	Preprint
Published:	2025
Subjects:	Computation and Language
Online Access:	https://arxiv.org/abs/2502.20898

_version_	1866913712667885568
author	Mehner, Luise; Fiedler, Lena Alicija Philine; Ammon, Sabine; Kolossa, Dorothea
author_facet	Mehner, Luise; Fiedler, Lena Alicija Philine; Ammon, Sabine; Kolossa, Dorothea
contents	The widespread application of Large Language Models (LLMs) involves ethical risks for users and societies. A prominent ethical risk of LLMs is the generation of unfair language output that reinforces or exacerbates harm for members of disadvantaged social groups through gender biases (Weidinger et al., 2022; Bender et al., 2021; Kotek et al., 2023). Hence, the evaluation of the fairness of LLM outputs with respect to such biases is a topic of rising interest. To advance research in this field, promote discourse on suitable normative bases and evaluation methodologies, and enhance the reproducibility of related studies, we propose a novel approach to database construction. This approach enables the assessment of gender-related biases in LLM-generated language beyond merely evaluating their degree of neutralization.
format	Preprint
id	arxiv_https___arxiv_org_abs_2502_20898
institution	arXiv
publishDate	2025
record_format	arxiv
title	A database to support the evaluation of gender biases in GPT-4o output
topic	Computation and Language
url	https://arxiv.org/abs/2502.20898
