Bibliographic Details
Main Authors: Yang, Shuqi, Jing, Mingrui, Wang, Shuai, Kou, Jiaxin, Shi, Manfei, Xing, Weijie, Hu, Yan, Zhu, Zheng
Format: Preprint
Published: 2025
Online Access: https://arxiv.org/abs/2502.11861
Table of Contents:
  • This study reviewed the use of Large Language Models (LLMs) in healthcare, focusing on their training corpora, customization techniques, and evaluation metrics. A systematic search of studies from 2021 to 2024 identified 61 articles. Four types of corpora were used: clinical resources, literature, open-source datasets, and web-crawled data. Common construction techniques included pre-training, prompt engineering, and retrieval-augmented generation, with 44 studies combining multiple methods. Evaluation metrics were categorized into process, usability, and outcome metrics, with outcome metrics divided into model-based and expert-assessed outcomes. The study identified critical gaps in corpus fairness, which contributed to biases from geographic, cultural, and socio-economic factors. The reliance on unverified or unstructured data highlighted the need for better integration of evidence-based clinical guidelines. Future research should focus on developing a tiered corpus architecture with vetted sources and dynamic weighting, while ensuring model transparency. Additionally, the lack of standardized evaluation frameworks for domain-specific models called for comprehensive validation of LLMs in real-world healthcare settings.