Staff View: SARFormer -- An Acquisition Parameter Aware Vision Transformer for Synthetic Aperture Radar Data :: Library Catalog

Bibliographic Details
Main Authors:	Prexl, Jonathan; Recla, Michael; Schmitt, Michael
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2504.08441

_version_	1866915236945068032
author	Prexl, Jonathan; Recla, Michael; Schmitt, Michael
author_facet	Prexl, Jonathan; Recla, Michael; Schmitt, Michael
contents	This manuscript introduces SARFormer, a modified Vision Transformer (ViT) architecture designed for processing one or multiple synthetic aperture radar (SAR) images. Given the complex image geometry of SAR data, we propose an acquisition parameter encoding module that significantly guides the learning process, especially in the case of multiple images, leading to improved performance on downstream tasks. We further explore self-supervised pre-training, conduct experiments with limited labeled data, and benchmark our contribution and adaptations thoroughly in ablation experiments against a baseline, where the model is tested on tasks such as height reconstruction and segmentation. Our approach achieves up to 17% improvement in terms of RMSE over baseline models.
format	Preprint
id	arxiv_https___arxiv_org_abs_2504_08441
institution	arXiv
publishDate	2025
record_format	arxiv
title	SARFormer -- An Acquisition Parameter Aware Vision Transformer for Synthetic Aperture Radar Data
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2504.08441
