Staff View: :: Library Catalog

Saved in:

Bibliographic Details
Main Authors:	Sun, Enzhe, Cui, Yongchuan, Liu, Peng, Yan, Jining
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2504.00901
Tags:	Add Tag No Tags, Be the first to tag this record!

_version_	1866913937532911616
author	Sun, Enzhe Cui, Yongchuan Liu, Peng Yan, Jining
author_facet	Sun, Enzhe Cui, Yongchuan Liu, Peng Yan, Jining
contents	Remote sensing spatiotemporal fusion (STF) addresses the fundamental trade-off between temporal and spatial resolution by combining high temporal-low spatial and high spatial-low temporal imagery. This paper presents the first comprehensive survey of deep learning advances in remote sensing STF over the past decade. We establish a systematic taxonomy of deep learning architectures including Convolutional Neural Networks (CNNs), Transformers, Generative Adversarial Networks (GANs), diffusion models, and sequence models, revealing significant growth in deep learning adoption for STF tasks. Our analysis reveals that CNN-based methods dominate spatial feature extraction, while Transformer architectures show superior performance in capturing long-range temporal dependencies. GAN and diffusion models demonstrate exceptional capability in detail reconstruction, substantially outperforming traditional methods in structural similarity and spectral fidelity. Through comprehensive experiments on seven benchmark datasets comparing ten representative methods, we validate these findings and quantify the performance trade-offs between different approaches. We identify five critical challenges: time-space conflicts, limited generalization across datasets, computational efficiency for large-scale processing, multi-source heterogeneous fusion, and insufficient benchmark diversity. The survey highlights promising opportunities in foundation models, hybrid architectures, and self-supervised learning approaches that could address current limitations and enable multimodal applications. The specific models, datasets, and other information mentioned in this article have been collected in: https://github.com/yc-cui/Deep-Learning-Spatiotemporal-Fusion-Survey.
format	Preprint
id	arxiv_https___arxiv_org_abs_2504_00901
institution	arXiv
publishDate	2025
record_format	arxiv
spellingShingle	A Decade of Deep Learning for Remote Sensing Spatiotemporal Fusion: Advances, Challenges, and Opportunities Sun, Enzhe Cui, Yongchuan Liu, Peng Yan, Jining Computer Vision and Pattern Recognition Remote sensing spatiotemporal fusion (STF) addresses the fundamental trade-off between temporal and spatial resolution by combining high temporal-low spatial and high spatial-low temporal imagery. This paper presents the first comprehensive survey of deep learning advances in remote sensing STF over the past decade. We establish a systematic taxonomy of deep learning architectures including Convolutional Neural Networks (CNNs), Transformers, Generative Adversarial Networks (GANs), diffusion models, and sequence models, revealing significant growth in deep learning adoption for STF tasks. Our analysis reveals that CNN-based methods dominate spatial feature extraction, while Transformer architectures show superior performance in capturing long-range temporal dependencies. GAN and diffusion models demonstrate exceptional capability in detail reconstruction, substantially outperforming traditional methods in structural similarity and spectral fidelity. Through comprehensive experiments on seven benchmark datasets comparing ten representative methods, we validate these findings and quantify the performance trade-offs between different approaches. We identify five critical challenges: time-space conflicts, limited generalization across datasets, computational efficiency for large-scale processing, multi-source heterogeneous fusion, and insufficient benchmark diversity. The survey highlights promising opportunities in foundation models, hybrid architectures, and self-supervised learning approaches that could address current limitations and enable multimodal applications. The specific models, datasets, and other information mentioned in this article have been collected in: https://github.com/yc-cui/Deep-Learning-Spatiotemporal-Fusion-Survey.
title	A Decade of Deep Learning for Remote Sensing Spatiotemporal Fusion: Advances, Challenges, and Opportunities
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2504.00901

Similar Items