Bibliographic Details
Main Authors: Yang, Jianhao, Yu, Wenshuo, Lv, Yuanchao, Sun, Jiance, Sun, Bokang, Liu, Mingyang
Format: Preprint
Published: 2025
Subjects: Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2503.12404
_version_ 1866915504245964800
author Yang, Jianhao
Yu, Wenshuo
Lv, Yuanchao
Sun, Jiance
Sun, Bokang
Liu, Mingyang
contents Remote sensing image segmentation is crucial for environmental monitoring, disaster assessment, and resource management, but its performance depends largely on dataset quality. Although several high-quality datasets are broadly accessible, data remain scarce for specialized tasks such as marine oil spill segmentation. Such tasks still rely on manual annotation, which is both time-consuming and subject to annotator bias. The Segment Anything Model 2 (SAM2) has strong potential as an automatic annotation framework but performs poorly on heterogeneous, low-contrast remote sensing imagery. To address these challenges, we introduce a novel label enhancement and automatic annotation framework, termed SAM2-ELNet (Enhancement and Labeling Network). Specifically, we use the frozen Hiera backbone from the pretrained SAM2 as the encoder and fine-tune only the adapter and decoder for each remote sensing task. In addition, the framework includes a label quality evaluator that filters the generated labels to ensure their reliability. We design a series of experiments targeting resource-limited remote sensing tasks and evaluate our method on two datasets: the Deep-SAR Oil Spill (SOS) dataset with Synthetic Aperture Radar (SAR) imagery, and the CHN6-CUG Road dataset with Very High Resolution (VHR) optical imagery. The proposed framework can enhance coarse annotations and generate reliable training data under resource-limited conditions. Fine-tuned on only 30% of the training data, the framework automatically labels data; a model trained solely on these automatic labels performs only slightly worse than one trained on the full original annotations, while labeling costs are greatly reduced, offering a practical solution for large-scale remote sensing interpretation.
format Preprint
id arxiv_https___arxiv_org_abs_2503_12404
institution arXiv
publishDate 2025
record_format arxiv
title SAM2-ELNet: Label Enhancement and Automatic Annotation for Remote Sensing Segmentation
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2503.12404