Staff View: :: Library Catalog


Bibliographic Details
Main Authors:	Guo, Wenqi Marshall; Du, Yiyang; Tworek, Heidi J. S.; Du, Shan
Format:	Preprint
Published:	2025
Subjects:	Computers and Society
Online Access:	https://arxiv.org/abs/2509.08833

_version_	1866912632318984192
author	Guo, Wenqi Marshall; Du, Yiyang; Tworek, Heidi J. S.; Du, Shan
author_facet	Guo, Wenqi Marshall; Du, Yiyang; Tworek, Heidi J. S.; Du, Shan
contents	Large Language Models (LLMs) are usually aligned with "human values/preferences" to prevent harmful output. Discussions around the alignment of Large Language Models (LLMs) generally focus on preventing harmful outputs. However, in this paper, we argue that in health-related queries, over-alignment, leading to overly cautious responses, can itself be harmful, especially for people with anxiety and obsessive-compulsive disorder (OCD). This is not only unethical but also dangerous to the user, both mentally and physically. We also showed qualitative results that some LLMs exhibit varying degrees of alignment. Finally, we call for the development of LLMs with stronger reasoning capabilities that provide more tailored and nuanced responses to health queries. Warning: This paper contains materials that could trigger health anxiety or OCD. Dataset and full results can be found in https://github.com/weathon/over-alignment.
format	Preprint
id	arxiv_https___arxiv_org_abs_2509_08833
institution	arXiv
publishDate	2025
record_format	arxiv
spellingShingle	Position: The Pitfalls of Over-Alignment: Overly Caution Health-Related Responses From LLMs are Unethical and Dangerous Guo, Wenqi Marshall Du, Yiyang Tworek, Heidi J. S. Du, Shan Computers and Society Large Language Models (LLMs) are usually aligned with "human values/preferences" to prevent harmful output. Discussions around the alignment of Large Language Models (LLMs) generally focus on preventing harmful outputs. However, in this paper, we argue that in health-related queries, over-alignment, leading to overly cautious responses, can itself be harmful, especially for people with anxiety and obsessive-compulsive disorder (OCD). This is not only unethical but also dangerous to the user, both mentally and physically. We also showed qualitative results that some LLMs exhibit varying degrees of alignment. Finally, we call for the development of LLMs with stronger reasoning capabilities that provide more tailored and nuanced responses to health queries. Warning: This paper contains materials that could trigger health anxiety or OCD. Dataset and full results can be found in https://github.com/weathon/over-alignment.
title	Position: The Pitfalls of Over-Alignment: Overly Caution Health-Related Responses From LLMs are Unethical and Dangerous
topic	Computers and Society
url	https://arxiv.org/abs/2509.08833

Similar Items