Saved in:
| Main Authors: | , , , , , , |
|---|---|
| Format: | Preprint |
| Published: |
2024
|
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2408.04187 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| _version_ | 1866914974095376384 |
|---|---|
| author | Wu, Junde Zhu, Jiayuan Qi, Yunli Chen, Jingkun Xu, Min Menolascina, Filippo Grau, Vicente |
| author_facet | Wu, Junde Zhu, Jiayuan Qi, Yunli Chen, Jingkun Xu, Min Menolascina, Filippo Grau, Vicente |
| contents | We introduce a novel graph-based Retrieval-Augmented Generation (RAG) framework specifically designed for the medical domain, called \textbf{MedGraphRAG}, aimed at enhancing Large Language Model (LLM) capabilities for generating evidence-based medical responses, thereby improving safety and reliability when handling private medical data. Graph-based RAG (GraphRAG) leverages LLMs to organize RAG data into graphs, showing strong potential for gaining holistic insights from long-form documents. However, its standard implementation is overly complex for general use and lacks the ability to generate evidence-based responses, limiting its effectiveness in the medical field. To extend the capabilities of GraphRAG to the medical domain, we propose unique Triple Graph Construction and U-Retrieval techniques over it. In our graph construction, we create a triple-linked structure that connects user documents to credible medical sources and controlled vocabularies. In the retrieval process, we propose U-Retrieval which combines Top-down Precise Retrieval with Bottom-up Response Refinement to balance global context awareness with precise indexing. These effort enable both source information retrieval and comprehensive response generation. Our approach is validated on 9 medical Q\&A benchmarks, 2 health fact-checking benchmarks, and one collected dataset testing long-form generation. The results show that MedGraphRAG consistently outperforms state-of-the-art models across all benchmarks, while also ensuring that responses include credible source documentation and definitions. Our code is released at: https://github.com/MedicineToken/Medical-Graph-RAG. |
| format | Preprint |
| id |
arxiv_https___arxiv_org_abs_2408_04187 |
| institution | arXiv |
| publishDate | 2024 |
| record_format | arxiv |
| spellingShingle | Medical Graph RAG: Towards Safe Medical Large Language Model via Graph Retrieval-Augmented Generation Wu, Junde Zhu, Jiayuan Qi, Yunli Chen, Jingkun Xu, Min Menolascina, Filippo Grau, Vicente Computer Vision and Pattern Recognition We introduce a novel graph-based Retrieval-Augmented Generation (RAG) framework specifically designed for the medical domain, called \textbf{MedGraphRAG}, aimed at enhancing Large Language Model (LLM) capabilities for generating evidence-based medical responses, thereby improving safety and reliability when handling private medical data. Graph-based RAG (GraphRAG) leverages LLMs to organize RAG data into graphs, showing strong potential for gaining holistic insights from long-form documents. However, its standard implementation is overly complex for general use and lacks the ability to generate evidence-based responses, limiting its effectiveness in the medical field. To extend the capabilities of GraphRAG to the medical domain, we propose unique Triple Graph Construction and U-Retrieval techniques over it. In our graph construction, we create a triple-linked structure that connects user documents to credible medical sources and controlled vocabularies. In the retrieval process, we propose U-Retrieval which combines Top-down Precise Retrieval with Bottom-up Response Refinement to balance global context awareness with precise indexing. These effort enable both source information retrieval and comprehensive response generation. Our approach is validated on 9 medical Q\&A benchmarks, 2 health fact-checking benchmarks, and one collected dataset testing long-form generation. The results show that MedGraphRAG consistently outperforms state-of-the-art models across all benchmarks, while also ensuring that responses include credible source documentation and definitions. Our code is released at: https://github.com/MedicineToken/Medical-Graph-RAG. |
| title | Medical Graph RAG: Towards Safe Medical Large Language Model via Graph Retrieval-Augmented Generation |
| topic | Computer Vision and Pattern Recognition |
| url | https://arxiv.org/abs/2408.04187 |