Staff View :: Library Catalog

Bibliographic Details
Main Authors:	Jiang, Jue; Rangnekar, Aneesh; Choi, Chloe Min Seo; Veeraraghavan, Harini
Format:	Preprint
Published:	2023
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2310.01209

_version_	1866916310031532032
author	Jiang, Jue; Rangnekar, Aneesh; Choi, Chloe Min Seo; Veeraraghavan, Harini
author_facet	Jiang, Jue; Rangnekar, Aneesh; Choi, Chloe Min Seo; Veeraraghavan, Harini
contents	Pretraining vision transformers (ViT) with attention-guided masked image modeling (MIM) has been shown to increase downstream accuracy for natural image analysis. The hierarchical shifted window (Swin) transformer, often used in medical image analysis, cannot use attention-guided masking because it lacks an explicit [CLS] token, which is needed to compute attention maps for selective masking. We thus enhanced Swin with semantic class attention. We developed a co-distilled Swin transformer that uses a noisy momentum-updated teacher to guide selective masking for MIM. Our approach, called semantic attention guided co-distillation with noisy teacher regularized Swin TransFormer (SMARTFormer), was applied to analyzing 3D computed tomography datasets with lung nodules and malignant lung cancers (LC). We also analyzed the impact of semantic attention and the noisy teacher on pretraining and downstream accuracy. SMARTFormer classified lesions (malignant from benign) with a high accuracy of 0.895 on 1000 nodules, predicted LC treatment response with an accuracy of 0.74, and achieved high accuracies even in limited data regimes. Pretraining with semantic attention and a noisy teacher improved the ability to distinguish semantically meaningful structures such as organs in an unsupervised clustering task and to localize abnormal structures like tumors. Code and models will be made available through GitHub upon paper acceptance.
format	Preprint
id	arxiv_https___arxiv_org_abs_2310_01209
institution	arXiv
publishDate	2023
record_format	arxiv
title	Self-distilled Masked Attention guided masked image modeling with noise Regularized Teacher (SMART) for medical image analysis
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2310.01209