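Staff View: Responsible Intelligence in Practice: A Fairness Audit of Open Large Language Models for Library Reference Services :: Library Catalog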

Bibliographic Details
Main Authors:	Wang, Haining; Clark, Jason; Peña, Angelica
Format:	Preprint
Published:	2026
Subjects:	Digital Libraries; Software Engineering
Online Access:	https://arxiv.org/abs/2602.18935

_version_	1866915811107536896
author	Wang, Haining; Clark, Jason; Peña, Angelica
author_facet	Wang, Haining; Clark, Jason; Peña, Angelica
contents	As libraries explore large language models (LLMs) as a scalable layer for reference services, a core fairness question follows: can LLM-based services support all patrons fairly, regardless of demographic identity? While LLMs offer great potential for broadening access to information assistance, they may also reproduce societal biases embedded in their training data, potentially undermining libraries' commitments to impartial service. In this chapter, we apply a systematic evaluation approach that combines diagnostic classification to detect systematic differences with linguistic analysis to interpret their sources. Across three widely used open models (Llama-3.1 8B, Gemma-2 9B, and Ministral 8B), we find no compelling evidence of systematic differentiation by race/ethnicity, and only minor evidence of sex-linked differentiation in one model. We discuss implications for responsible AI adoption in libraries and the importance of ongoing monitoring in aligning LLM-based services with core professional values.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_18935
institution	arXiv
publishDate	2026
record_format	arxiv
title	Responsible Intelligence in Practice: A Fairness Audit of Open Large Language Models for Library Reference Services
topic	Digital Libraries; Software Engineering
url	https://arxiv.org/abs/2602.18935

Similar Items