Bibliographic Details
Main Authors: Wu, Junde, Zhu, Jiayuan, Qi, Yunli, Chen, Jingkun, Xu, Min, Menolascina, Filippo, Grau, Vicente
Format: Preprint
Published: 2024
Subjects:
Online Access: https://arxiv.org/abs/2408.04187
_version_ 1866914974095376384
author Wu, Junde
Zhu, Jiayuan
Qi, Yunli
Chen, Jingkun
Xu, Min
Menolascina, Filippo
Grau, Vicente
contents We introduce a novel graph-based Retrieval-Augmented Generation (RAG) framework specifically designed for the medical domain, called MedGraphRAG, aimed at enhancing Large Language Model (LLM) capabilities for generating evidence-based medical responses, thereby improving safety and reliability when handling private medical data. Graph-based RAG (GraphRAG) leverages LLMs to organize RAG data into graphs, showing strong potential for gaining holistic insights from long-form documents. However, its standard implementation is overly complex for general use and lacks the ability to generate evidence-based responses, limiting its effectiveness in the medical field. To extend the capabilities of GraphRAG to the medical domain, we propose two techniques built on top of it: Triple Graph Construction and U-Retrieval. In our graph construction, we create a triple-linked structure that connects user documents to credible medical sources and controlled vocabularies. In the retrieval process, we propose U-Retrieval, which combines Top-down Precise Retrieval with Bottom-up Response Refinement to balance global context awareness with precise indexing. These efforts enable both source information retrieval and comprehensive response generation. Our approach is validated on 9 medical Q&A benchmarks, 2 health fact-checking benchmarks, and 1 collected dataset testing long-form generation. The results show that MedGraphRAG consistently outperforms state-of-the-art models across all benchmarks, while also ensuring that responses include credible source documentation and definitions. Our code is released at: https://github.com/MedicineToken/Medical-Graph-RAG.
format Preprint
id arxiv_https___arxiv_org_abs_2408_04187
institution arXiv
publishDate 2024
record_format arxiv
title Medical Graph RAG: Towards Safe Medical Large Language Model via Graph Retrieval-Augmented Generation
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2408.04187