| Main Authors: | Nasiri-Sarvi, Ali; Hosseini, Mahdi S.; Rivaz, Hassan |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | Computer Vision and Pattern Recognition |
| Online Access: | https://arxiv.org/abs/2407.03552 |
| _version_ | 1866914950389170176 |
|---|---|
| author | Nasiri-Sarvi, Ali; Hosseini, Mahdi S.; Rivaz, Hassan |
| author_facet | Nasiri-Sarvi, Ali; Hosseini, Mahdi S.; Rivaz, Hassan |
| contents | Mamba-based models, VMamba and Vim, are a recent family of vision encoders that offer promising performance improvements in many computer vision tasks. This paper compares Mamba-based models with traditional Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) using the breast ultrasound BUSI dataset and Breast Ultrasound B dataset. Our evaluation, which includes multiple runs of experiments and statistical significance analysis, demonstrates that some of the Mamba-based architectures often outperform CNN and ViT models with statistically significant results. For example, in the B dataset, the best Mamba-based models have a 1.98% average AUC and a 5.0% average Accuracy improvement compared to the best non-Mamba-based model in this study. These Mamba-based models effectively capture long-range dependencies while maintaining some inductive biases, making them suitable for applications with limited data. The code is available at https://github.com/anasiri/BU-Mamba. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2407_03552 |
| institution | arXiv |
| publishDate | 2024 |
| record_format | arxiv |
| title | Vision Mamba for Classification of Breast Ultrasound Images |
| topic | Computer Vision and Pattern Recognition |
| url | https://arxiv.org/abs/2407.03552 |