Bibliographic Details
Main Authors: He, Jinglin, Guo, Yunqi, Lam, Lai Kwan, Leung, Waikei, He, Lixing, Jiang, Yuanan, Wang, Chi Chiu, Xing, Guoliang, Chen, Hongkai
Format: Preprint
Published: 2025
Subjects: Information Retrieval, Artificial Intelligence
Online Access: https://arxiv.org/abs/2504.20118
author He, Jinglin
Guo, Yunqi
Lam, Lai Kwan
Leung, Waikei
He, Lixing
Jiang, Yuanan
Wang, Chi Chiu
Xing, Guoliang
Chen, Hongkai
contents Traditional Chinese Medicine (TCM) represents a rich repository of ancient medical knowledge that continues to play an important role in modern healthcare. Due to the complexity and breadth of the TCM literature, the integration of AI technologies is critical for its modernization and broader accessibility. However, this integration poses considerable challenges, including the interpretation of obscure classical Chinese texts and the modeling of intricate semantic relationships among TCM concepts. In this paper, we develop OpenTCM, an LLM-based system that combines a domain-specific TCM knowledge graph and Graph-based Retrieval-Augmented Generation (GraphRAG). First, we extract more than 3.73 million classical Chinese characters from 68 gynecological books in the Chinese Medical Classics Database, with the help of TCM and gynecology experts. Second, we construct a comprehensive multi-relational knowledge graph comprising more than 48,000 entities and 152,000 interrelationships, using customized prompts and Chinese-oriented LLMs such as DeepSeek and Kimi to ensure high-fidelity semantic understanding. Last, we empower OpenTCM with GraphRAG, enabling high-fidelity ingredient knowledge retrieval and diagnostic question-answering without model fine-tuning. Experimental evaluations demonstrate that OpenTCM achieves mean expert scores (MES) of 4.378 in ingredient information retrieval and 4.045 in diagnostic question-answering tasks, outperforming state-of-the-art solutions in real-world TCM use cases.
format Preprint
id arxiv_https___arxiv_org_abs_2504_20118
institution arXiv
publishDate 2025
record_format arxiv
title OpenTCM: A GraphRAG-Empowered LLM-based System for Traditional Chinese Medicine Knowledge Retrieval and Diagnosis
topic Information Retrieval
Artificial Intelligence
url https://arxiv.org/abs/2504.20118