Bibliographic Details
Main Authors: Sharma, Sonali, Alaa, Ahmed M., Daneshjou, Roxana
Format: Preprint
Published: 2025
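Subjects: Computation and Language; Computational Engineering, Finance, and Science; Human-Computer Interaction
Online Access: https://arxiv.org/abs/2507.08030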
_version_ 1866912475699478528
author Sharma, Sonali
Alaa, Ahmed M.
Daneshjou, Roxana
contents Generative AI models, including large language models (LLMs) and vision-language models (VLMs), are increasingly used to interpret medical images and answer clinical questions. Their responses often include inaccuracies; therefore, safety measures like medical disclaimers are critical to remind users that AI outputs are not professionally vetted or a substitute for medical advice. This study evaluated the presence of disclaimers in LLM and VLM outputs across model generations from 2022 to 2025. Using 500 mammograms, 500 chest X-rays, 500 dermatology images, and 500 medical questions, outputs were screened for disclaimer phrases. Medical disclaimer presence in LLM and VLM outputs dropped from 26.3% in 2022 to 0.97% in 2025, and from 19.6% in 2023 to 1.05% in 2025, respectively. By 2025, the majority of models displayed no disclaimers. As public models become more capable and authoritative, disclaimers must be implemented as a safeguard adapting to the clinical context of each output.
format Preprint
id arxiv_https___arxiv_org_abs_2507_08030
institution arXiv
publishDate 2025
record_format arxiv
title A Systematic Analysis of Declining Medical Safety Messaging in Generative AI Models
topic Computation and Language
Computational Engineering, Finance, and Science
Human-Computer Interaction
url https://arxiv.org/abs/2507.08030