Staff View: A Systematic Analysis of Declining Medical Safety Messaging in Generative AI Models :: Library Catalog

Bibliographic Details
Main Authors:	Sharma, Sonali; Alaa, Ahmed M.; Daneshjou, Roxana
Format:	Preprint
Published:	2025
Subjects:	Computation and Language; Computational Engineering, Finance, and Science; Human-Computer Interaction
Online Access:	https://arxiv.org/abs/2507.08030

_version_	1866912475699478528
author	Sharma, Sonali; Alaa, Ahmed M.; Daneshjou, Roxana
author_facet	Sharma, Sonali; Alaa, Ahmed M.; Daneshjou, Roxana
contents	Generative AI models, including large language models (LLMs) and vision-language models (VLMs), are increasingly used to interpret medical images and answer clinical questions. Their responses often include inaccuracies; therefore, safety measures like medical disclaimers are critical to remind users that AI outputs are not professionally vetted or a substitute for medical advice. This study evaluated the presence of disclaimers in LLM and VLM outputs across model generations from 2022 to 2025. Using 500 mammograms, 500 chest X-rays, 500 dermatology images, and 500 medical questions, outputs were screened for disclaimer phrases. Medical disclaimer presence in LLM and VLM outputs dropped from 26.3% in 2022 to 0.97% in 2025, and from 19.6% in 2023 to 1.05% in 2025, respectively. By 2025, the majority of models displayed no disclaimers. As public models become more capable and authoritative, disclaimers must be implemented as a safeguard adapting to the clinical context of each output.
format	Preprint
id	arxiv_https___arxiv_org_abs_2507_08030
institution	arXiv
publishDate	2025
record_format	arxiv
title	A Systematic Analysis of Declining Medical Safety Messaging in Generative AI Models
topic	Computation and Language; Computational Engineering, Finance, and Science; Human-Computer Interaction
url	https://arxiv.org/abs/2507.08030
