Bibliographic Details
Main Authors: Hu, Jiacheng, Xiang, Yanlin, Lin, Yang, Du, Junliang, Zhang, Hanchao, Liu, Houze
Format: Preprint
Published: 2025
Subjects: Computer Vision and Pattern Recognition, Machine Learning
Online Access: https://arxiv.org/abs/2502.06243
_version_ 1866912226270511104
author Hu, Jiacheng
Xiang, Yanlin
Lin, Yang
Du, Junliang
Zhang, Hanchao
Liu, Houze
contents This study introduces an AI-driven skin lesion classification algorithm built on an enhanced Transformer architecture, addressing the challenges of accuracy and robustness in medical image analysis. By integrating a multi-scale feature fusion mechanism and refining the self-attention process, the model effectively extracts both global and local features, enhancing its ability to detect lesions with ambiguous boundaries and intricate structures. Performance evaluation on the ISIC 2017 dataset demonstrates that the improved Transformer surpasses established AI models, including ResNet50, VGG19, ResNeXt, and Vision Transformer, across key metrics such as accuracy, AUC, F1-Score, and Precision. Grad-CAM visualizations further highlight the interpretability of the model, showcasing strong alignment between the algorithm's focus areas and actual lesion sites. This research underscores the transformative potential of advanced AI models in medical imaging, paving the way for more accurate and reliable diagnostic tools. Future work will explore the scalability of this approach to broader medical imaging tasks and investigate the integration of multimodal data to enhance AI-driven diagnostic frameworks for intelligent healthcare.
format Preprint
id arxiv_https___arxiv_org_abs_2502_06243
institution arXiv
publishDate 2025
record_format arxiv
title Multi-Scale Transformer Architecture for Accurate Medical Image Classification
topic Computer Vision and Pattern Recognition
Machine Learning
url https://arxiv.org/abs/2502.06243