Bibliographic Details
Main Authors: Zhang, Tiantian; Lin, Manxi; Guo, Hongda; Zhang, Xiaofan; Chiu, Ka Fung Peter; Feragen, Aasa; Dou, Qi
Format: Preprint
Published: 2024
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2405.08786
_version_ 1866910521075171328
author Zhang, Tiantian
Lin, Manxi
Guo, Hongda
Zhang, Xiaofan
Chiu, Ka Fung Peter
Feragen, Aasa
Dou, Qi
author_facet Zhang, Tiantian
Lin, Manxi
Guo, Hongda
Zhang, Xiaofan
Chiu, Ka Fung Peter
Feragen, Aasa
Dou, Qi
contents The Prostate Imaging Reporting and Data System (PI-RADS) is pivotal in diagnosing clinically significant prostate cancer from MRI. Current deep learning-based PI-RADS scoring methods often do not incorporate the PI-RADS clinical guideline (PICG) used by radiologists, potentially compromising scoring accuracy. This paper introduces a novel approach that adapts a multi-modal large language model (MLLM) to incorporate PICG into a PI-RADS scoring model without additional annotations or network parameters. We present a two-stage fine-tuning process that adapts an MLLM originally trained on natural images to MRI images while integrating the PICG. In the first stage, we develop a domain adapter layer tailored to 3D MRI inputs and instruct the MLLM to differentiate MRI sequences. In the second stage, we translate PICG into guiding instructions for the model to generate PICG-guided image features. Through a feature distillation step, we then align the scoring network's features with the PICG-guided image features, which enables the model to incorporate the PICG information. We develop our model on a public dataset and evaluate it on an in-house dataset. Experimental results demonstrate that our approach improves the performance of current scoring networks. Code is available at: https://github.com/med-air/PICG2scoring
format Preprint
id arxiv_https___arxiv_org_abs_2405_08786
institution arXiv
publishDate 2024
record_format arxiv
title Incorporating Clinical Guidelines through Adapting Multi-modal Large Language Model for Prostate Cancer PI-RADS Scoring
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2405.08786