| Main Authors: | Wang, Luting, Xiang, Yinghao, Huang, Hongliang, Li, Dongjun, Gao, Chen, Liu, Si |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2510.26297 |
Similar Items
Towards Realistic UAV Vision-Language Navigation: Platform, Benchmark, and Methodology
by: Wang, Xiangyu, et al.
Published: (2024)
ManiSoft: Towards Vision-Language Manipulation for Soft Continuum Robotics
by: Wei, Ziyu, et al.
Published: (2026)
REOBench: Benchmarking Robustness of Earth Observation Foundation Models
by: Li, Xiang, et al.
Published: (2025)
ChronoEarth-492K: A Large Scale and Long Horizon Spatiotemporal Hyperspectral Earth Observation Dataset and Benchmark
by: Si, Haozhe, et al.
Published: (2026)
RemoteSAM: Towards Segment Anything for Earth Observation
by: Yao, Liang, et al.
Published: (2025)
Towards Realistic Open-Vocabulary Remote Sensing Segmentation: Benchmark and Baseline
by: Li, Bingyu, et al.
Published: (2026)
EarthNets: Empowering AI in Earth Observation
by: Xiong, Zhitong, et al.
Published: (2022)
Image Understanding Makes for A Good Tokenizer for Image Generation
by: Wang, Luting, et al.
Published: (2024)
Towards Unified Vision Language Models for Forest Ecological Analysis in Earth Observation
by: Xue, Xizhe, et al.
Published: (2025)
Toward Realistic Camouflaged Object Detection: Benchmarks and Method
by: Xin, Zhimeng, et al.
Published: (2025)
Transfer Learning for Onboard Cloud Segmentation in Thermal Earth Observation: From Landsat to a CubeSat Constellation
by: Wölki, Niklas, et al.
Published: (2025)
Knowledge Distillation via Query Selection for Detection Transformer
by: Liu, Yi, et al.
Published: (2024)
UniTTA: Unified Benchmark and Versatile Framework Towards Realistic Test-Time Adaptation
by: Du, Chaoqun, et al.
Published: (2024)
InterAct: Capture and Modelling of Realistic, Expressive and Interactive Activities between Two Persons in Daily Scenarios
by: Huang, Yinghao, et al.
Published: (2024)
Benchmarking Composed Image Retrieval for Applied Earth Observation
by: Psomas, Bill, et al.
Published: (2026)
OmniEarth-Bench: Towards Holistic Evaluation of Earth's Six Spheres and Cross-Spheres Interactions with Multimodal Observational Earth Data
by: Wang, Fengxiang, et al.
Published: (2025)
Multi-Label Guided Soft Contrastive Learning for Efficient Earth Observation Pretraining
by: Wang, Yi, et al.
Published: (2024)
Towards Autonomous UAV Visual Object Search in City Space: Benchmark and Agentic Methodology
by: Ji, Yatai, et al.
Published: (2025)
EO-VAE: Towards A Multi-sensor Tokenizer for Earth Observation Data
by: Lehmann, Nils, et al.
Published: (2026)
Earth-Agent: Unlocking the Full Landscape of Earth Observation with Agents
by: Feng, Peilin, et al.
Published: (2025)
EarthSynth: Generating Informative Earth Observation with Diffusion Models
by: Pan, Jiancheng, et al.
Published: (2025)
Toward a Realistic Benchmark for Out-of-Distribution Detection
by: Recalcati, Pietro, et al.
Published: (2024)
REO-VLM: Transforming VLM to Meet Regression Challenges in Earth Observation
by: Xue, Xizhe, et al.
Published: (2024)
LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction
by: Du, Penghui, et al.
Published: (2024)
Bridging the Gap Between End-to-End and Two-Step Text Spotting
by: Huang, Mingxin, et al.
Published: (2024)
SkySense: A Multi-Modal Remote Sensing Foundation Model Towards Universal Interpretation for Earth Observation Imagery
by: Guo, Xin, et al.
Published: (2023)
One for All: Toward Unified Foundation Models for Earth Vision
by: Xiong, Zhitong, et al.
Published: (2024)
SwinTextSpotter v2: Towards Better Synergy for Scene Text Spotting
by: Huang, Mingxin, et al.
Published: (2024)
BigEarthNet.txt: A Large-Scale Multi-Sensor Image-Text Dataset and Benchmark for Earth Observation
by: Herzog, Johann-Ludwig, et al.
Published: (2026)
Neural Plasticity-Inspired Multimodal Foundation Model for Earth Observation
by: Xiong, Zhitong, et al.
Published: (2024)
Constellation Dataset: Benchmarking High-Altitude Object Detection for an Urban Intersection
by: Turkcan, Mehmet Kerem, et al.
Published: (2024)
TerraScope: Pixel-Grounded Visual Reasoning for Earth Observation
by: Shu, Yan, et al.
Published: (2026)
Foundation Models for Remote Sensing and Earth Observation: A Survey
by: Xiao, Aoran, et al.
Published: (2024)
OpenEarth-Agent: From Tool Calling to Tool Creation for Open-Environment Earth Observation
by: Zhao, Sijie, et al.
Published: (2026)
Evaluating and Benchmarking Foundation Models for Earth Observation and Geospatial AI
by: Dionelis, Nikolaos, et al.
Published: (2024)
DOFA-CLIP: Multimodal Vision-Language Foundation Models for Earth Observation
by: Xiong, Zhitong, et al.
Published: (2025)
Learning Hazing to Dehazing: Towards Realistic Haze Generation for Real-World Image Dehazing
by: Wang, Ruiyi, et al.
Published: (2025)
Where on Earth? A Vision-Language Benchmark for Probing Model Geolocation Skills Across Scales
by: Qian, Zhaofang, et al.
Published: (2025)
READoc: A Unified Benchmark for Realistic Document Structured Extraction
by: Li, Zichao, et al.
Published: (2024)
Scientific Image Synthesis: Benchmarking, Methodologies, and Downstream Utility
by: Lin, Honglin, et al.
Published: (2026)