Bibliographic Details
Main Authors: Wahd, Assefa Seyoum, Felfeliyan, Banafshe, Zhou, Yuyue, Ghosh, Shrimanti, McArthur, Adam, Zhang, Jiechen, Jaremko, Jacob L., Hareendranathan, Abhilash
Format: Preprint
Published: 2024
Online Access: https://arxiv.org/abs/2409.06821
Table of Contents:
  • Foundation models such as the Segment Anything Model (SAM) require high-quality manual prompts for medical image segmentation, and producing those prompts is time-consuming and requires expertise. SAM and its variants often fail to segment structures in ultrasound (US) images because of domain shift. We propose Sam2Rad, a prompt learning approach that adapts SAM and its variants for US bone segmentation without human prompts. It introduces a prompt predictor network (PPN) with a cross-attention module that predicts prompt embeddings from image encoder features. The PPN outputs bounding box and mask prompts, as well as 256-dimensional embeddings for regions of interest. The framework still allows optional manual prompting and can be trained end-to-end using parameter-efficient fine-tuning (PEFT). Sam2Rad was tested on three musculoskeletal US datasets: wrist (3822 images), rotator cuff (1605 images), and hip (4849 images). It improved performance across all datasets without manual prompts, increasing Dice scores by 2-7% for hip/wrist and up to 33% for shoulder (rotator cuff) data. Sam2Rad can be trained with as few as 10 labeled images and is compatible with any SAM architecture for automatic segmentation.
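The cross-attention step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single attention head, omits the learned query/key/value projections, and uses tiny 4-dimensional vectors in place of the 256-dimensional embeddings. The names `cross_attention`, `queries`, and `features` are hypothetical. It only shows how a set of query vectors can pool image features into prompt-like embeddings.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, features):
    """Single-head scaled dot-product cross-attention, no learned projections.

    queries:  Q vectors of length d (hypothetical stand-ins for learnable
              prompt queries; an assumption, not taken from the paper)
    features: N vectors of length d (stand-ins for image encoder features)
    Returns (outputs, weights): Q vectors of length d and a Q x N weight matrix.
    """
    d = len(features[0])
    scale = 1.0 / math.sqrt(d)
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every feature vector.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in features]
        w = softmax(scores)
        # Weighted average of the feature vectors.
        out = [sum(w[n] * features[n][j] for n in range(len(features)))
               for j in range(d)]
        weights.append(w)
        outputs.append(out)
    return outputs, weights


# Toy data: three "image feature" vectors and two "prompt queries".
features = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
queries = [
    [4.0, 0.0, 0.0, 0.0],  # most similar to the first feature
    [0.0, 0.0, 4.0, 0.0],  # most similar to the third feature
]
prompt_embeddings, attn = cross_attention(queries, features)
```

Each row of `attn` sums to 1, and each query puts most of its weight on the feature it is most similar to, so each output embedding is dominated by that feature. In a trained model the queries and projections would be learned parameters rather than fixed values.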