Bibliographic Details
Main Authors: Tomar, Nikhil Kumar, Jha, Debesh, Biswas, Koushik, Berzin, Tyler M., Keswani, Rajesh, Wallace, Michael, Bagci, Ulas
Format: Preprint
Published: 2024
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2409.05875
_version_ 1866909309737107456
author Tomar, Nikhil Kumar
Jha, Debesh
Biswas, Koushik
Berzin, Tyler M.
Keswani, Rajesh
Wallace, Michael
Bagci, Ulas
author_facet Tomar, Nikhil Kumar
Jha, Debesh
Biswas, Koushik
Berzin, Tyler M.
Keswani, Rajesh
Wallace, Michael
Bagci, Ulas
contents Colorectal cancer (CRC) is the third most commonly diagnosed cancer in the United States and the second leading cause of cancer-related death among both genders. Notably, CRC is the leading cause of cancer in men younger than 50. Colonoscopy is considered the gold standard for the early diagnosis of CRC. However, skill varies significantly among endoscopists, and a high miss rate is reported. Automated polyp segmentation can reduce the miss rate and make timely treatment possible at an early stage. To address this challenge, we introduce FANetv2, an advanced encoder-decoder network designed to accurately segment polyps from colonoscopy images. Starting from an initial input mask generated by Otsu thresholding, FANetv2 iteratively refines its binary segmentation masks through a novel feedback attention mechanism informed by the mask predictions of previous epochs. Additionally, it employs a text-guided approach that integrates essential information about the number (one or many) and size (small, medium, large) of polyps to further enhance its feature representation capabilities. This dual-task approach facilitates accurate polyp segmentation and aids in the auxiliary classification of polyp attributes, significantly boosting the model's performance. Our comprehensive evaluations on the publicly available BKAI-IGH and CVC-ClinicDB datasets demonstrate the superior performance of FANetv2, evidenced by high Dice similarity coefficients (DSC) of 0.9186 and 0.9481, along with low Hausdorff distances of 2.83 and 3.19, respectively. The source code for FANetv2 is available at https://github.com/xxxxx/FANetv2.
format Preprint
id arxiv_https___arxiv_org_abs_2409_05875
institution arXiv
publishDate 2024
record_format arxiv
title Transformer-Enhanced Iterative Feedback Mechanism for Polyp Segmentation
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2409.05875
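
Note on the abstract: it names two standard building blocks, an initial mask from Otsu thresholding and the Dice similarity coefficient (DSC) used for evaluation. The sketch below illustrates both in plain NumPy. It is not taken from the FANetv2 code. The function names (`otsu_threshold`, `initial_mask`, `dice`) are hypothetical, and treating pixels brighter than the threshold as foreground is an assumption made only for this example.

```python
import numpy as np


def otsu_threshold(gray: np.ndarray) -> int:
    """Return the Otsu threshold of an 8-bit grayscale image.

    Chooses the gray level that maximizes the between-class variance
    of the two classes "pixel <= t" and "pixel > t".
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)

    w0 = np.cumsum(prob)                 # weight of the class "<= t"
    w1 = 1.0 - w0                        # weight of the class "> t"
    cum_mean = np.cumsum(prob * levels)  # cumulative first moment
    total_mean = cum_mean[-1]

    denom = w0 * w1
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (total_mean * w0 - cum_mean) ** 2 / denom
    sigma_b[denom == 0] = 0.0            # one class is empty: no split
    return int(np.argmax(sigma_b))


def initial_mask(gray: np.ndarray) -> np.ndarray:
    """Binary mask from Otsu thresholding (assumption: bright = foreground)."""
    return (gray > otsu_threshold(gray)).astype(np.uint8)


def dice(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return float((2.0 * intersection + eps) / (pred.sum() + target.sum() + eps))
```

A typical use would be to call `initial_mask` once on a grayscale frame to seed the first iteration, then score each refined prediction against the ground truth with `dice`. How FANetv2 feeds the mask back through its attention mechanism is not described in this record and is not modeled here.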