Bibliographic Details
Main Authors: Prexl, Jonathan; Recla, Michael; Schmitt, Michael
Format: Preprint
Published: 2025
Subjects:
Online Access: https://arxiv.org/abs/2504.08441
author Prexl, Jonathan
Recla, Michael
Schmitt, Michael
contents This manuscript introduces SARFormer, a modified Vision Transformer (ViT) architecture designed for processing one or multiple synthetic aperture radar (SAR) images. Given the complex image geometry of SAR data, we propose an acquisition parameter encoding module that significantly guides the learning process, especially in the case of multiple images, leading to improved performance on downstream tasks. We further explore self-supervised pre-training, conduct experiments with limited labeled data, and benchmark our contribution and adaptations thoroughly in ablation experiments against a baseline, where the model is tested on tasks such as height reconstruction and segmentation. Our approach achieves up to a 17% improvement in RMSE over baseline models.
format Preprint
id arxiv_https___arxiv_org_abs_2504_08441
institution arXiv
publishDate 2025
record_format arxiv
title SARFormer -- An Acquisition Parameter Aware Vision Transformer for Synthetic Aperture Radar Data
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2504.08441