:: Library Catalog

Bibliographic Details
Main Authors:	Li, Chunshan, Wang, Rong, Yang, Xiaofei, Chu, Dianhui
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2503.20382

Similar Items

An Efficient and Effective Encoder Model for Vision and Language Tasks in the Remote Sensing Domain
by: Silva, João Daniel, et al.
Published: (2025)

Falcon: A Remote Sensing Vision-Language Foundation Model (Technical Report)
by: Yao, Kelu, et al.
Published: (2025)

UniRS: Unifying Multi-temporal Remote Sensing Tasks through Vision Language Models
by: Li, Yujie, et al.
Published: (2024)

SatelliteCalculator: A Multi-Task Vision Foundation Model for Quantitative Remote Sensing Inversion
by: Yu, Zhenyu, et al.
Published: (2025)

Diffusion-RSCC: Diffusion Probabilistic Model for Change Captioning in Remote Sensing Images
by: Yu, Xiaofei, et al.
Published: (2024)

LWGANet: Addressing Spatial and Channel Redundancy in Remote Sensing Visual Tasks with Light-Weight Grouped Attention
by: Lu, Wei, et al.
Published: (2025)

Contributions to Label-Efficient Learning in Computer Vision and Remote Sensing
by: Pham, Minh-Tan
Published: (2025)

Remote Sensing ChatGPT: Solving Remote Sensing Tasks with ChatGPT and Visual Models
by: Guo, Haonan, et al.
Published: (2024)

Multi-Task Domain Adaptation for Language Grounding with 3D Objects
by: Sun, Penglei, et al.
Published: (2024)

NeXt2Former-CD: Efficient Remote Sensing Change Detection with Modern Vision Architectures
by: Wang, Yufan, et al.
Published: (2026)

RemoteCLIP: A Vision Language Foundation Model for Remote Sensing
by: Liu, Fan, et al.
Published: (2023)

Spiking Meets Attention: Efficient Remote Sensing Image Super-Resolution with Attention Spiking Neural Networks
by: Xiao, Yi, et al.
Published: (2025)

AMMUNet: Multi-Scale Attention Map Merging for Remote Sensing Image Segmentation
by: Yang, Yang, et al.
Published: (2024)

Deformable Attention Mechanisms Applied to Object Detection, case of Remote Sensing
by: Boutayeb, Anasse, et al.
Published: (2025)

VisionGRU: A Linear-Complexity RNN Model for Efficient Image Analysis
by: Yin, Shicheng, et al.
Published: (2024)

SkyEyeGPT: Unifying Remote Sensing Vision-Language Tasks via Instruction Tuning with Large Language Model
by: Zhan, Yang, et al.
Published: (2024)

A Survey on Remote Sensing Foundation Models: From Vision to Multimodality
by: Huang, Ziyue, et al.
Published: (2025)

A Survey of Sample-Efficient Deep Learning for Change Detection in Remote Sensing: Tasks, Strategies, and Challenges
by: Ding, Lei, et al.
Published: (2025)

RS-DFM: A Remote Sensing Distributed Foundation Model for Diverse Downstream Tasks
by: Wang, Zhechao, et al.
Published: (2024)

Co-Training Vision Language Models for Remote Sensing Multi-task Learning
by: Li, Qingyun, et al.
Published: (2025)

PeftCD: Leveraging Vision Foundation Models with Parameter-Efficient Fine-Tuning for Remote Sensing Change Detection
by: Dong, Sijun, et al.
Published: (2025)

RS-Agent: Automating Remote Sensing Tasks through Intelligent Agent
by: Xu, Wenjia, et al.
Published: (2024)

FUSE-RSVLM: Feature Fusion Vision-Language Model for Remote Sensing
by: Dang, Yunkai, et al.
Published: (2025)

A Benchmark for Multi-Lingual Vision-Language Learning in Remote Sensing Image Captioning
by: Zhou, Qing, et al.
Published: (2025)

LeMeViT: Efficient Vision Transformer with Learnable Meta Tokens for Remote Sensing Image Interpretation
by: Jiang, Wentao, et al.
Published: (2024)

GeoGround: A Unified Large Vision-Language Model for Remote Sensing Visual Grounding
by: Zhou, Yue, et al.
Published: (2024)

Task Specific Pretraining with Noisy Labels for Remote Sensing Image Segmentation
by: Liu, Chenying, et al.
Published: (2024)

GeoRSMLLM: A Multimodal Large Language Model for Vision-Language Tasks in Geoscience and Remote Sensing
by: Zhang, Zilun, et al.
Published: (2025)

Continual Vision-Language Learning for Remote Sensing: Benchmarking and Analysis
by: Weng, Xingxing, et al.
Published: (2026)

Interactive Multi-Head Self-Attention with Linear Complexity
by: Kang, Hankyul, et al.
Published: (2024)

RS2-SAM2: Customized SAM2 for Referring Remote Sensing Image Segmentation
by: Rong, Fu, et al.
Published: (2025)

Threshold Attention Network for Semantic Segmentation of Remote Sensing Images
by: Long, Wei, et al.
Published: (2025)

VHM: Versatile and Honest Vision Language Model for Remote Sensing Image Analysis
by: Pang, Chao, et al.
Published: (2024)

Demystify Mamba in Vision: A Linear Attention Perspective
by: Han, Dongchen, et al.
Published: (2024)

Remote Sensing SpatioTemporal Vision-Language Models: A Comprehensive Survey
by: Liu, Chenyang, et al.
Published: (2024)

Adapting Vision Foundation Models for Robust Cloud Segmentation in Remote Sensing Images
by: Zou, Xuechao, et al.
Published: (2024)

Multimodal Remote Sensing Scene Classification Using VLMs and Dual-Cross Attention Networks
by: Cai, Jinjin, et al.
Published: (2024)

LinearSR: Unlocking Linear Attention for Stable and Efficient Image Super-Resolution
by: Li, Xiaohui, et al.
Published: (2025)

Vision Mamba in Remote Sensing: A Comprehensive Survey of Techniques, Applications and Outlook
by: Bao, Muyi, et al.
Published: (2025)

Vision Foundation Models in Remote Sensing: A Survey
by: Lu, Siqi, et al.
Published: (2024)