| Main Authors: | Wang, Haining; Clark, Jason; Peña, Angelica |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | Digital Libraries; Software Engineering |
| Online Access: | https://arxiv.org/abs/2602.18935 |
| _version_ | 1866915811107536896 |
|---|---|
| author | Wang, Haining; Clark, Jason; Peña, Angelica |
| author_facet | Wang, Haining; Clark, Jason; Peña, Angelica |
| contents | As libraries explore large language models (LLMs) as a scalable layer for reference services, a core fairness question follows: can LLM-based services support all patrons fairly, regardless of demographic identity? While LLMs offer great potential for broadening access to information assistance, they may also reproduce societal biases embedded in their training data, potentially undermining libraries' commitments to impartial service. In this chapter, we apply a systematic evaluation approach that combines diagnostic classification to detect systematic differences with linguistic analysis to interpret their sources. Across three widely used open models (Llama-3.1 8B, Gemma-2 9B, and Ministral 8B), we find no compelling evidence of systematic differentiation by race/ethnicity, and only minor evidence of sex-linked differentiation in one model. We discuss implications for responsible AI adoption in libraries and the importance of ongoing monitoring in aligning LLM-based services with core professional values. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2602_18935 |
| institution | arXiv |
| publishDate | 2026 |
| record_format | arxiv |
| title | Responsible Intelligence in Practice: A Fairness Audit of Open Large Language Models for Library Reference Services |
| topic | Digital Libraries; Software Engineering |
| url | https://arxiv.org/abs/2602.18935 |