Staff View: :: Library Catalog

Saved in:

Bibliographic Details
Main Authors:	Jung, Daniel, Butler, Alex, Park, Joongheum, Saperstein, Yair
Format:	Preprint
Published:	2024
Subjects:	Human-Computer Interaction Artificial Intelligence
Online Access:	https://arxiv.org/abs/2409.15326
Tags:	Add Tag No Tags, Be the first to tag this record!

_version_	1866909323811094528
author	Jung, Daniel Butler, Alex Park, Joongheum Saperstein, Yair
author_facet	Jung, Daniel Butler, Alex Park, Joongheum Saperstein, Yair
contents	The use of Large language models (LLMs) to augment clinical decision support systems is a topic with rapidly growing interest, but current shortcomings such as hallucinations and lack of clear source citations make them unreliable for use in the clinical environment. This study evaluates Ask Avo, an LLM-derived software by AvoMD that incorporates a proprietary Language Model Augmented Retrieval (LMAR) system, in-built visual citation cues, and prompt engineering designed for interactions with physicians, against ChatGPT-4 in end-user experience for physicians in a simulated clinical scenario environment. Eight clinical questions derived from medical guideline documents in various specialties were prompted to both models by 62 study participants, with each response rated on trustworthiness, actionability, relevancy, comprehensiveness, and friendly format from 1 to 5. Ask Avo significantly outperformed ChatGPT-4 in all criteria: trustworthiness (4.52 vs. 3.34, p<0.001), actionability (4.41 vs. 3.19, p<0.001), relevancy (4.55 vs. 3.49, p<0.001), comprehensiveness (4.50 vs. 3.37, p<0.001), and friendly format (4.52 vs. 3.60, p<0.001). Our findings suggest that specialized LLMs designed with the needs of clinicians in mind can offer substantial improvements in user experience over general-purpose LLMs. Ask Avo's evidence-based approach tailored to clinician needs shows promise in the adoption of LLM-augmented clinical decision support software.
format	Preprint
id	arxiv_https___arxiv_org_abs_2409_15326
institution	arXiv
publishDate	2024
record_format	arxiv
spellingShingle	Evaluating the Impact of a Specialized LLM on Physician Experience in Clinical Decision Support: A Comparison of Ask Avo and ChatGPT-4 Jung, Daniel Butler, Alex Park, Joongheum Saperstein, Yair Human-Computer Interaction Artificial Intelligence The use of Large language models (LLMs) to augment clinical decision support systems is a topic with rapidly growing interest, but current shortcomings such as hallucinations and lack of clear source citations make them unreliable for use in the clinical environment. This study evaluates Ask Avo, an LLM-derived software by AvoMD that incorporates a proprietary Language Model Augmented Retrieval (LMAR) system, in-built visual citation cues, and prompt engineering designed for interactions with physicians, against ChatGPT-4 in end-user experience for physicians in a simulated clinical scenario environment. Eight clinical questions derived from medical guideline documents in various specialties were prompted to both models by 62 study participants, with each response rated on trustworthiness, actionability, relevancy, comprehensiveness, and friendly format from 1 to 5. Ask Avo significantly outperformed ChatGPT-4 in all criteria: trustworthiness (4.52 vs. 3.34, p<0.001), actionability (4.41 vs. 3.19, p<0.001), relevancy (4.55 vs. 3.49, p<0.001), comprehensiveness (4.50 vs. 3.37, p<0.001), and friendly format (4.52 vs. 3.60, p<0.001). Our findings suggest that specialized LLMs designed with the needs of clinicians in mind can offer substantial improvements in user experience over general-purpose LLMs. Ask Avo's evidence-based approach tailored to clinician needs shows promise in the adoption of LLM-augmented clinical decision support software.
title	Evaluating the Impact of a Specialized LLM on Physician Experience in Clinical Decision Support: A Comparison of Ask Avo and ChatGPT-4
topic	Human-Computer Interaction Artificial Intelligence
url	https://arxiv.org/abs/2409.15326

Similar Items