Staff View: MSDiagnosis: A Benchmark for Evaluating Large Language Models in Multi-Step Clinical Diagnosis :: Library Catalog

Bibliographic Details
Main Authors:	Hou, Ruihui; Chen, Shencheng; Fan, Yongqi; Yu, Guangya; Zhu, Lifeng; Sun, Jing; Liu, Jingping; Ruan, Tong
Format:	Preprint
Published:	2024
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2408.10039

_version_	1866912157632823296
author	Hou, Ruihui; Chen, Shencheng; Fan, Yongqi; Yu, Guangya; Zhu, Lifeng; Sun, Jing; Liu, Jingping; Ruan, Tong
author_facet	Hou, Ruihui; Chen, Shencheng; Fan, Yongqi; Yu, Guangya; Zhu, Lifeng; Sun, Jing; Liu, Jingping; Ruan, Tong
contents	Clinical diagnosis is critical in medical practice, typically requiring a continuous and evolving process that includes primary diagnosis, differential diagnosis, and final diagnosis. However, most existing clinical diagnostic tasks are single-step processes, which does not align with the complex multi-step diagnostic procedures found in real-world clinical settings. In this paper, we propose a Chinese clinical diagnostic benchmark, called MSDiagnosis. This benchmark consists of 2,225 cases from 12 departments, covering tasks such as primary diagnosis, differential diagnosis, and final diagnosis. Additionally, we propose a novel and effective framework. This framework combines forward inference, backward inference, reflection, and refinement, enabling the large language model to self-evaluate and adjust its diagnostic results. To this end, we test open-source models, closed-source models, and our proposed framework. The experimental results demonstrate the effectiveness of the proposed method. We also provide a comprehensive experimental analysis and suggest future research directions for this task.
format	Preprint
id	arxiv_https___arxiv_org_abs_2408_10039
institution	arXiv
publishDate	2024
record_format	arxiv
title	MSDiagnosis: A Benchmark for Evaluating Large Language Models in Multi-Step Clinical Diagnosis
topic	Artificial Intelligence
url	https://arxiv.org/abs/2408.10039

Similar Items