Bibliographic Details
Main Authors: Rossi, Leonardo, Bernuzzi, Vittorio, Fontanini, Tomaso, Bertozzi, Massimo, Prati, Andrea
Format: Preprint
Published: 2024
Subjects: Computer Vision and Pattern Recognition; Image and Video Processing
Online Access: https://arxiv.org/abs/2404.18924
author Rossi, Leonardo
Bernuzzi, Vittorio
Fontanini, Tomaso
Bertozzi, Massimo
Prati, Andrea
contents Due to the limitations of current optical and sensor technologies and the high cost of updating them, the spectral and spatial resolution of satellites may not always meet desired requirements. For these reasons, Remote-Sensing Single-Image Super-Resolution (RS-SISR) techniques have gained significant interest. In this paper, we propose the Swin2-MoSE model, an enhanced version of Swin2SR. Our model introduces MoE-SM, an enhanced Mixture-of-Experts (MoE) that replaces the Feed-Forward layer inside every Transformer block. MoE-SM is designed with Smart-Merger, a new layer for merging the outputs of individual experts, and with a new way of splitting the work between experts: a per-example strategy instead of the commonly used per-token one. Furthermore, we analyze how positional encodings interact with each other, demonstrating that per-channel bias and per-head bias can cooperate positively. We also propose a combination of Normalized-Cross-Correlation (NCC) and Structural Similarity Index Measure (SSIM) losses to avoid the typical limitations of the MSE loss. Experimental results demonstrate that Swin2-MoSE outperforms all Swin-derived models by up to 0.377 to 0.958 dB (PSNR) on the tasks of 2x, 3x and 4x resolution upscaling (Sen2Venus and OLI2MSI datasets). It also outperforms SOTA models by a good margin, proving to be competitive and to have excellent potential, especially for complex tasks. Additionally, an analysis of computational costs is performed. Finally, we show the efficacy of Swin2-MoSE by applying it to a semantic segmentation task (SeasoNet dataset). Code and pretrained models are available at https://github.com/IMPLabUniPr/swin2-mose/tree/official_code
format Preprint
id arxiv_https___arxiv_org_abs_2404_18924
institution arXiv
publishDate 2024
record_format arxiv
title Swin2-MoSE: A New Single Image Super-Resolution Model for Remote Sensing
topic Computer Vision and Pattern Recognition
Image and Video Processing
url https://arxiv.org/abs/2404.18924