Bibliographic Details
Main Authors: Lee, Minjae; Hur, Sungwoo; Hwang, Soojin; Kim, Won Hwa
Format: Preprint
Published: 2026
Subjects: Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access: https://arxiv.org/abs/2604.12113
_version_ 1866910127613804544
author Lee, Minjae
Hur, Sungwoo
Hwang, Soojin
Kim, Won Hwa
contents Visual Foundation Models (VFMs) such as the Segment Anything Model (SAM) have significantly advanced the broad use of image segmentation. However, SAM and its variants necessitate substantial manual effort for prompt generation and additional training for specific applications. Recent approaches address these limitations by integrating SAM into in-context (one-/few-shot) segmentation, enabling auto-prompting through semantic alignment between query and support images. Despite these efforts, they still generate sub-optimal prompts that degrade segmentation quality due to visual inconsistencies between support and query images. To tackle this limitation, we introduce PR-MaGIC (Prompt Refinement via Mask Decoder Gradient Flow for In-Context Segmentation), a training-free test-time framework that refines prompts via gradient flow derived from SAM's mask decoder. PR-MaGIC integrates seamlessly into in-context segmentation frameworks; it is theoretically grounded yet practically stabilized through a simple top-1 selection strategy that ensures robust performance across samples. Extensive evaluations demonstrate that PR-MaGIC consistently improves segmentation quality across various benchmarks, effectively mitigating inadequate prompts without requiring additional training or architectural modifications.
format Preprint
id arxiv_https___arxiv_org_abs_2604_12113
institution arXiv
publishDate 2026
record_format arxiv
title PR-MaGIC: Prompt Refinement Via Mask Decoder Gradient Flow For In-Context Segmentation
topic Computer Vision and Pattern Recognition
Artificial Intelligence
url https://arxiv.org/abs/2604.12113