Bibliographic Details
Main Authors: Wang, Xinyuan, Li, Haozhou, Zheng, Dingfang, Peng, Qinke
Format: Preprint
Published: 2024
Subjects:
Online Access: https://arxiv.org/abs/2410.03521
author Wang, Xinyuan
Li, Haozhou
Zheng, Dingfang
Peng, Qinke
contents The global COVID-19 pandemic underscored major deficiencies in traditional healthcare systems, hastening the advancement of online medical services, especially in medical triage and consultation. However, existing studies face two main challenges. First, large-scale, publicly available, domain-specific medical datasets are scarce due to privacy concerns; current datasets are small and limited to a few diseases, which limits the effectiveness of triage methods based on Pre-trained Language Models (PLMs). Second, existing methods lack medical knowledge and struggle to accurately understand professional terms and expressions in patient-doctor consultations. To overcome these obstacles, we construct the Large-scale Chinese Medical Dialogue Corpora (LCMDC), thereby addressing the data shortage in this field. We further propose a novel triage system that combines BERT-based supervised learning with prompt learning, as well as a GPT-based medical consultation model. To enhance domain knowledge acquisition, we pre-trained PLMs using our self-constructed background corpus. Experimental results on the LCMDC demonstrate the efficacy of our proposed systems.
format Preprint
id arxiv_https___arxiv_org_abs_2410_03521
institution arXiv
publishDate 2024
record_format arxiv
title Building a Chinese Medical Dialogue System: Integrating Large-scale Corpora and Novel Models
topic Computation and Language
Artificial Intelligence
Machine Learning
url https://arxiv.org/abs/2410.03521