Staff View: Analysis of Blood Report Images Using General Purpose Vision-Language Models :: Library Catalog

Bibliographic Details
Main Authors:	Bakhsheshi, Nadia; Beigy, Hamid
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2509.06033

_version_	1866915969028325376
author	Bakhsheshi, Nadia; Beigy, Hamid
author_facet	Bakhsheshi, Nadia; Beigy, Hamid
contents	The reliable analysis of blood reports is important for health knowledge, but individuals often struggle with interpretation, leading to anxiety and overlooked issues. We explore the potential of general-purpose Vision-Language Models (VLMs) to address this challenge by automatically analyzing blood report images. We conduct a comparative evaluation of three VLMs, Qwen-VL-Max, Gemini 2.5 Pro, and Llama 4 Maverick, assessing their performance on a dataset of 100 diverse blood report images. Each model was prompted with clinically relevant questions adapted to each blood report. The answers were then embedded with Sentence-BERT to measure how closely the models' responses agreed with one another. The findings suggest that general-purpose VLMs are a practical and promising technology for developing patient-facing tools for preliminary blood report analysis. Their ability to provide clear interpretations directly from images can improve health literacy and lower the barriers to understanding complex medical information. This work establishes a foundation for the future development of reliable and accessible AI-assisted healthcare applications. While the results are encouraging, they should be interpreted cautiously given the limited dataset size.
format	Preprint
id	arxiv_https___arxiv_org_abs_2509_06033
institution	arXiv
publishDate	2025
record_format	arxiv
title	Analysis of Blood Report Images Using General Purpose Vision-Language Models
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2509.06033

Similar Items