Staff View: BIMCV-R: A Landmark Dataset for 3D CT Text-Image Retrieval :: Library Catalog

Bibliographic Details
Main Authors:	Chen, Yinda; Liu, Che; Liu, Xiaoyu; Arcucci, Rossella; Xiong, Zhiwei
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Computation and Language
Online Access:	https://arxiv.org/abs/2403.15992

_version_	1866910532881088512
author	Chen, Yinda; Liu, Che; Liu, Xiaoyu; Arcucci, Rossella; Xiong, Zhiwei
author_facet	Chen, Yinda; Liu, Che; Liu, Xiaoyu; Arcucci, Rossella; Xiong, Zhiwei
contents	The burgeoning integration of 3D medical imaging into healthcare has led to a substantial increase in the workload of medical professionals. To assist clinicians in their diagnostic processes and alleviate their workload, the development of a robust system for retrieving similar case studies presents a viable solution. While the concept holds great promise, the field of 3D medical text-image retrieval is currently limited by the absence of robust evaluation benchmarks and curated datasets. To remedy this, our study presents a groundbreaking dataset, BIMCV-R, which includes an extensive collection of 8,069 3D CT volumes, encompassing over 2 million slices, paired with their respective radiological reports. Expanding upon the foundational work of our dataset, we craft a retrieval strategy, MedFinder. This approach employs a dual-stream network architecture, harnessing the potential of large language models to advance the field of medical image retrieval beyond existing text-image retrieval solutions. It marks our preliminary step towards developing a system capable of facilitating text-to-image, image-to-text, and keyword-based retrieval tasks. Our project is available at https://huggingface.co/datasets/cyd0806/BIMCV-R.
format	Preprint
id	arxiv_https___arxiv_org_abs_2403_15992
institution	arXiv
publishDate	2024
record_format	arxiv
title	BIMCV-R: A Landmark Dataset for 3D CT Text-Image Retrieval
topic	Computer Vision and Pattern Recognition; Computation and Language
url	https://arxiv.org/abs/2403.15992

Similar Items