Bibliographic Details
Main Authors: Nasiri-Sarvi, Ali; Hosseini, Mahdi S.; Rivaz, Hassan
Format: Preprint
Published: 2024
Subjects:
Online Access: https://arxiv.org/abs/2407.03552
author Nasiri-Sarvi, Ali
Hosseini, Mahdi S.
Rivaz, Hassan
contents Mamba-based models, VMamba and Vim, are a recent family of vision encoders that offer promising performance improvements in many computer vision tasks. This paper compares Mamba-based models with traditional Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) on two breast ultrasound datasets, BUSI and Breast Ultrasound B. Our evaluation, which includes multiple experimental runs and statistical significance analysis, shows that some Mamba-based architectures often outperform CNN and ViT models by statistically significant margins. For example, on the B dataset, the best Mamba-based models improve average AUC by 1.98% and average accuracy by 5.0% over the best non-Mamba-based model in this study. These Mamba-based models effectively capture long-range dependencies while maintaining some inductive biases, making them suitable for applications with limited data. The code is available at https://github.com/anasiri/BU-Mamba
format Preprint
id arxiv_https___arxiv_org_abs_2407_03552
institution arXiv
publishDate 2024
record_format arxiv
title Vision Mamba for Classification of Breast Ultrasound Images
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2407.03552