| Main Authors: | Jiang, Hongxiang, Yin, Jihao, Wang, Qixiong, Feng, Jiaqi, Chen, Guo |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2503.23330 |
Similar Items
EagleVision: A Multi-Task Benchmark for Cross-Domain Perception in High-Speed Autonomous Racing
by: Yagudin, Zakhar, et al.
Published: (2026)
EagleVision: A Dual-Stage Framework with BEV-grounding-based Chain-of-Thought for Spatial Intelligence
by: Wan, Jiaxu, et al.
Published: (2025)
Interactive Masked Image Modeling for Multimodal Object Detection in Remote Sensing
by: Vu, Minh-Duc, et al.
Published: (2024)
Beyond Open Vocabulary: Multimodal Prompting for Object Detection in Remote Sensing Images
by: Yang, Shuai, et al.
Published: (2026)
Multimodal Transformer Using Cross-Channel attention for Object Detection in Remote Sensing Images
by: Bahaduri, Bissmella, et al.
Published: (2023)
Frequency-Aware Vision-Language Multimodality Generalization Network for Remote Sensing Image Classification
by: Zhang, Junjie, et al.
Published: (2025)
LMFNet: An Efficient Multimodal Fusion Approach for Semantic Segmentation in High-Resolution Remote Sensing
by: Wang, Tong, et al.
Published: (2024)
A Survey on Remote Sensing Foundation Models: From Vision to Multimodality
by: Huang, Ziyue, et al.
Published: (2025)
LLM-Assisted Semantic Guidance for Sparsely Annotated Remote Sensing Object Detection
by: Liao, Wei, et al.
Published: (2025)
Real-Time Oriented Object Detection Transformer in Remote Sensing Images
by: Ding, Zeyu, et al.
Published: (2026)
EViT: An Eagle Vision Transformer with Bi-Fovea Self-Attention
by: Shi, Yulong, et al.
Published: (2023)
Remote Sensing Object Counting with Online Knowledge Learning
by: Jiang, Shengqin, et al.
Published: (2023)
MO R-CNN: Multispectral Oriented R-CNN for Object Detection in Remote Sensing Image
by: Wang, Leiyu, et al.
Published: (2025)
MutDet: Mutually Optimizing Pre-training for Remote Sensing Object Detection
by: Huang, Ziyue, et al.
Published: (2024)
LHRS-Bot-Nova: Improved Multimodal Large Language Model for Remote Sensing Vision-Language Interpretation
by: Li, Zhenshi, et al.
Published: (2024)
GeoRSMLLM: A Multimodal Large Language Model for Vision-Language Tasks in Geoscience and Remote Sensing
by: Zhang, Zilun, et al.
Published: (2025)
A Resource-Efficient Training Framework for Remote Sensing Text-Image Retrieval
by: Zhang, Weihang, et al.
Published: (2025)
MGIMM: Multi-Granularity Instruction Multimodal Model for Attribute-Guided Remote Sensing Image Detailed Description
by: Yang, Cong, et al.
Published: (2024)
AeroGen: Enhancing Remote Sensing Object Detection with Diffusion-Driven Data Generation
by: Tang, Datao, et al.
Published: (2024)
OpenRSD: Towards Open-prompts for Object Detection in Remote Sensing Images
by: Huang, Ziyue, et al.
Published: (2025)
CSFMamba: Cross State Fusion Mamba Operator for Multimodal Remote Sensing Image Classification
by: Wang, Qingyu, et al.
Published: (2025)
VHM: Versatile and Honest Vision Language Model for Remote Sensing Image Analysis
by: Pang, Chao, et al.
Published: (2024)
RemoteCLIP: A Vision Language Foundation Model for Remote Sensing
by: Liu, Fan, et al.
Published: (2023)
Generalization-Enhanced Few-Shot Object Detection in Remote Sensing
by: Lin, Hui, et al.
Published: (2025)
InstructAttribute: Fine-grained Object Attributes editing with Instruction
by: Yin, Xingxi, et al.
Published: (2025)
OrientedFormer: An End-to-End Transformer-Based Oriented Object Detector in Remote Sensing Images
by: Zhao, Jiaqi, et al.
Published: (2024)
STARS: Shared-specific Translation and Alignment for missing-modality Remote Sensing Semantic Segmentation
by: Wang, Tong, et al.
Published: (2026)
Object Fidelity Diffusion for Remote Sensing Image Generation
by: Ye, Ziqi, et al.
Published: (2025)
Seeing Clearly without Training: Mitigating Hallucinations in Multimodal LLMs for Remote Sensing
by: Liu, Yi, et al.
Published: (2026)
Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models
by: Chen, Guo, et al.
Published: (2025)
DO-Bench: An Attributable Benchmark for Diagnosing Object Hallucination in Vision-Language Models
by: Wang, JiYang, et al.
Published: (2026)
Fourier Angle Alignment for Oriented Object Detection in Remote Sensing
by: Gu, Changyu, et al.
Published: (2026)
Towards Faithful Reasoning in Remote Sensing: A Perceptually-Grounded GeoSpatial Chain-of-Thought for Vision-Language Models
by: Liu, Jiaqi, et al.
Published: (2025)
Bridging the Scale Gap: Balanced Tiny and General Object Detection in Remote Sensing Imagery
by: Zhao, Zhicheng, et al.
Published: (2025)
ChangeBridge: Spatiotemporal Image Generation with Multimodal Controls for Remote Sensing
by: Zhao, Zhenghui, et al.
Published: (2025)
Bring Remote Sensing Object Detect Into Nature Language Model: Using SFT Method
by: Wang, Fei, et al.
Published: (2025)
GLRT-Based Metric Learning for Remote Sensing Object Retrieval
by: Zhang, Linping, et al.
Published: (2024)
Efficient Meta-Learning Enabled Lightweight Multiscale Few-Shot Object Detection in Remote Sensing Images
by: Guan, Wenbin, et al.
Published: (2024)
MF2Summ: Multimodal Fusion for Video Summarization with Temporal Alignment
by: Wang, Shuo, et al.
Published: (2025)
GeoGround: A Unified Large Vision-Language Model for Remote Sensing Visual Grounding
by: Zhou, Yue, et al.
Published: (2024)