| Main Authors: | Liu, Kejia, Zhou, Haoyang, Xu, Ruoyu, Wang, Peicheng, Song, Mingli, Zhang, Haofei |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2603.22153 |
Similar Items
Angle Robustness Unmanned Aerial Vehicle Navigation in GNSS-Denied Scenarios
by: Wang, Yuxin, et al.
Published: (2024)
AeroDuo: Aerial Duo for UAV-based Vision and Language Navigation
by: Wu, Ruipu, et al.
Published: (2025)
ProtoPFormer: Concentrating on Prototypical Parts in Vision Transformers for Interpretable Image Recognition
by: Xue, Mengqi, et al.
Published: (2022)
Aerial Vision-and-Language Navigation with Grid-based View Selection and Map Construction
by: Zhao, Ganlong, et al.
Published: (2025)
Token-Level Inference-Time Alignment for Vision-Language Models
by: Chen, Kejia, et al.
Published: (2025)
On the Concept Trustworthiness in Concept Bottleneck Models
by: Huang, Qihan, et al.
Published: (2024)
NavAgent: Multi-scale Urban Street View Fusion For UAV Embodied Vision-and-Language Navigation
by: Liu, Youzhi, et al.
Published: (2024)
WorldVLN: Autoregressive World Action Model for Aerial Vision-Language Navigation
by: Zhao, Baining, et al.
Published: (2026)
LookasideVLN: Direction-Aware Aerial Vision-and-Language Navigation
by: Ning, Yuwei, et al.
Published: (2026)
Syn-GRPO: Self-Evolving Data Synthesis for MLLM Perception Reasoning
by: Huang, Qihan, et al.
Published: (2025)
RS3DBench: A Comprehensive Benchmark for 3D Spatial Perception in Remote Sensing
by: Wang, Jiayu, et al.
Published: (2025)
AerialVLA: A Vision-Language-Action Model for UAV Navigation via Minimalist End-to-End Control
by: Xu, Peng, et al.
Published: (2026)
UAV-ON: A Benchmark for Open-World Object Goal Navigation with Aerial Agents
by: Xiao, Jianqiang, et al.
Published: (2025)
Triple-View Knowledge Distillation for Semi-Supervised Semantic Segmentation
by: Li, Ping, et al.
Published: (2023)
ViewBridge:Revisiting Cross-View Localization from Image Matching
by: Xia, Panwang, et al.
Published: (2025)
YoNoSplat: You Only Need One Model for Feedforward 3D Gaussian Splatting
by: Ye, Botao, et al.
Published: (2025)
Rethinking Token Reduction for Large Vision-Language Models
by: Wang, Yi, et al.
Published: (2026)
View-Centric Multi-Object Tracking with Homographic Matching in Moving UAV
by: Ji, Deyi, et al.
Published: (2024)
Sparse Video Generation Propels Real-World Beyond-the-View Vision-Language Navigation
by: Zhang, Hai, et al.
Published: (2026)
Learning to Retrieve Navigable Candidates for Efficient Vision-and-Language Navigation
by: Gu, Shutian, et al.
Published: (2026)
Aerial Vision-Language Navigation with a Unified Framework for Spatial, Temporal and Embodied Reasoning
by: Xu, Huilin, et al.
Published: (2025)
Scale-Aware UAV-to-Satellite Cross-View Geo-Localization: A Semantic Geometric Approach
by: Ye, Yibin, et al.
Published: (2026)
UAV-Track VLA: Embodied Aerial Tracking via Vision-Language-Action Models
by: Zhang, Qiyao, et al.
Published: (2026)
On the Evaluation Consistency of Attribution-based Explanations
by: Duan, Jiarui, et al.
Published: (2024)
NOLO: Navigate Only Look Once
by: Zhou, Bohan, et al.
Published: (2024)
Knowledge Amalgamation for Object Detection with Transformers
by: Zhang, Haofei, et al.
Published: (2022)
Leveraging Geometric Priors for Unaligned Scene Change Detection
by: Liu, Ziling, et al.
Published: (2025)
Graph-based Semantic Calibration Network for Unaligned UAV RGBT Image Semantic Segmentation and A Large-scale Benchmark
by: Fan, Fangqiang, et al.
Published: (2026)
Beyond the Horizon: Decoupling Multi-View UAV Action Recognition via Partial Order Transfer
by: Liu, Wenxuan, et al.
Published: (2025)
Object Detection as an Optional Basis: A Graph Matching Network for Cross-View UAV Localization
by: Liu, Tao, et al.
Published: (2025)
LG-CAV: Train Any Concept Activation Vector with Language Guidance
by: Huang, Qihan, et al.
Published: (2024)
Satellite to GroundScape -- Large-scale Consistent Ground View Generation from Satellite Views
by: Xu, Ningli, et al.
Published: (2025)
Fan-Beam CT Reconstruction for Unaligned Sparse-View X-ray Baggage Dataset
by: Kim, Shin
Published: (2024)
ViSA-Enhanced Aerial VLN: A Visual-Spatial Reasoning Enhanced Framework for Aerial Vision-Language Navigation
by: Tong, Haoyu, et al.
Published: (2026)
SHAPE : Self-Improved Visual Preference Alignment by Iteratively Generating Holistic Winner
by: Chen, Kejia, et al.
Published: (2025)
ViSE: A Systematic Approach to Vision-Only Street-View Extrapolation
by: Tan, Kaiyuan, et al.
Published: (2025)
YOLC: You Only Look Clusters for Tiny Object Detection in Aerial Images
by: Liu, Chenguang, et al.
Published: (2024)
Unaligned RGB Guided Hyperspectral Image Super-Resolution with Spatial-Spectral Concordance
by: Zhang, Yingkai, et al.
Published: (2025)
$D^3$-RSMDE: 40$\times$ Faster and High-Fidelity Remote Sensing Monocular Depth Estimation
by: Wang, Ruizhi, et al.
Published: (2026)
Learning an Adaptive and View-Invariant Vision Transformer for Real-Time UAV Tracking
by: Wu, You, et al.
Published: (2024)