:: Library Catalog

Bibliographic Details
Main Authors:	Jiang, Hongxiang; Yin, Jihao; Wang, Qixiong; Feng, Jiaqi; Chen, Guo
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2503.23330

Similar Items

EagleVision: A Multi-Task Benchmark for Cross-Domain Perception in High-Speed Autonomous Racing
by: Yagudin, Zakhar, et al.
Published: (2026)

EagleVision: A Dual-Stage Framework with BEV-grounding-based Chain-of-Thought for Spatial Intelligence
by: Wan, Jiaxu, et al.
Published: (2025)

Interactive Masked Image Modeling for Multimodal Object Detection in Remote Sensing
by: Vu, Minh-Duc, et al.
Published: (2024)

Beyond Open Vocabulary: Multimodal Prompting for Object Detection in Remote Sensing Images
by: Yang, Shuai, et al.
Published: (2026)

Multimodal Transformer Using Cross-Channel attention for Object Detection in Remote Sensing Images
by: Bahaduri, Bissmella, et al.
Published: (2023)

Frequency-Aware Vision-Language Multimodality Generalization Network for Remote Sensing Image Classification
by: Zhang, Junjie, et al.
Published: (2025)

LMFNet: An Efficient Multimodal Fusion Approach for Semantic Segmentation in High-Resolution Remote Sensing
by: Wang, Tong, et al.
Published: (2024)

A Survey on Remote Sensing Foundation Models: From Vision to Multimodality
by: Huang, Ziyue, et al.
Published: (2025)

LLM-Assisted Semantic Guidance for Sparsely Annotated Remote Sensing Object Detection
by: Liao, Wei, et al.
Published: (2025)

Real-Time Oriented Object Detection Transformer in Remote Sensing Images
by: Ding, Zeyu, et al.
Published: (2026)

EViT: An Eagle Vision Transformer with Bi-Fovea Self-Attention
by: Shi, Yulong, et al.
Published: (2023)

Remote Sensing Object Counting with Online Knowledge Learning
by: Jiang, Shengqin, et al.
Published: (2023)

MO R-CNN: Multispectral Oriented R-CNN for Object Detection in Remote Sensing Image
by: Wang, Leiyu, et al.
Published: (2025)

MutDet: Mutually Optimizing Pre-training for Remote Sensing Object Detection
by: Huang, Ziyue, et al.
Published: (2024)

LHRS-Bot-Nova: Improved Multimodal Large Language Model for Remote Sensing Vision-Language Interpretation
by: Li, Zhenshi, et al.
Published: (2024)

GeoRSMLLM: A Multimodal Large Language Model for Vision-Language Tasks in Geoscience and Remote Sensing
by: Zhang, Zilun, et al.
Published: (2025)

A Resource-Efficient Training Framework for Remote Sensing Text-Image Retrieval
by: Zhang, Weihang, et al.
Published: (2025)

MGIMM: Multi-Granularity Instruction Multimodal Model for Attribute-Guided Remote Sensing Image Detailed Description
by: Yang, Cong, et al.
Published: (2024)

AeroGen: Enhancing Remote Sensing Object Detection with Diffusion-Driven Data Generation
by: Tang, Datao, et al.
Published: (2024)

OpenRSD: Towards Open-prompts for Object Detection in Remote Sensing Images
by: Huang, Ziyue, et al.
Published: (2025)

CSFMamba: Cross State Fusion Mamba Operator for Multimodal Remote Sensing Image Classification
by: Wang, Qingyu, et al.
Published: (2025)

VHM: Versatile and Honest Vision Language Model for Remote Sensing Image Analysis
by: Pang, Chao, et al.
Published: (2024)

RemoteCLIP: A Vision Language Foundation Model for Remote Sensing
by: Liu, Fan, et al.
Published: (2023)

Generalization-Enhanced Few-Shot Object Detection in Remote Sensing
by: Lin, Hui, et al.
Published: (2025)

InstructAttribute: Fine-grained Object Attributes editing with Instruction
by: Yin, Xingxi, et al.
Published: (2025)

OrientedFormer: An End-to-End Transformer-Based Oriented Object Detector in Remote Sensing Images
by: Zhao, Jiaqi, et al.
Published: (2024)

STARS: Shared-specific Translation and Alignment for missing-modality Remote Sensing Semantic Segmentation
by: Wang, Tong, et al.
Published: (2026)

Object Fidelity Diffusion for Remote Sensing Image Generation
by: Ye, Ziqi, et al.
Published: (2025)

Seeing Clearly without Training: Mitigating Hallucinations in Multimodal LLMs for Remote Sensing
by: Liu, Yi, et al.
Published: (2026)

Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models
by: Chen, Guo, et al.
Published: (2025)

DO-Bench: An Attributable Benchmark for Diagnosing Object Hallucination in Vision-Language Models
by: Wang, JiYang, et al.
Published: (2026)

Fourier Angle Alignment for Oriented Object Detection in Remote Sensing
by: Gu, Changyu, et al.
Published: (2026)

Towards Faithful Reasoning in Remote Sensing: A Perceptually-Grounded GeoSpatial Chain-of-Thought for Vision-Language Models
by: Liu, Jiaqi, et al.
Published: (2025)

Bridging the Scale Gap: Balanced Tiny and General Object Detection in Remote Sensing Imagery
by: Zhao, Zhicheng, et al.
Published: (2025)

ChangeBridge: Spatiotemporal Image Generation with Multimodal Controls for Remote Sensing
by: Zhao, Zhenghui, et al.
Published: (2025)

Bring Remote Sensing Object Detect Into Nature Language Model: Using SFT Method
by: Wang, Fei, et al.
Published: (2025)

GLRT-Based Metric Learning for Remote Sensing Object Retrieval
by: Zhang, Linping, et al.
Published: (2024)

Efficient Meta-Learning Enabled Lightweight Multiscale Few-Shot Object Detection in Remote Sensing Images
by: Guan, Wenbin, et al.
Published: (2024)

MF2Summ: Multimodal Fusion for Video Summarization with Temporal Alignment
by: Wang, Shuo, et al.
Published: (2025)

GeoGround: A Unified Large Vision-Language Model for Remote Sensing Visual Grounding
by: Zhou, Yue, et al.
Published: (2024)