| Main Authors: | Zhou, Yue; Ding, Ran; Yang, Xue; Jiang, Xue; Liu, Xingzhao |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2601.01416 |
Similar Items
Multimodal Mathematical Reasoning Embedded in Aerial Vehicle Imagery: Benchmarking, Analysis, and Exploration
by: Zhou, Yue, et al.
Published: (2025)
Spatial Retrieval Augmented Autonomous Driving
by: Jia, Xiaosong, et al.
Published: (2025)
AeroRAG: Structured Multimodal Retrieval-Augmented LLM for Fine-Grained Aerial Visual Reasoning
by: Xue, Junxiao, et al.
Published: (2026)
SpatialReward: Verifiable Spatial Reward Modeling for Fine-Grained Spatial Consistency in Text-to-Image Generation
by: Zhou, Sashuai, et al.
Published: (2026)
SpatialEdit: Benchmarking Fine-Grained Image Spatial Editing
by: Xiao, Yicheng, et al.
Published: (2026)
SpatialBot: Precise Spatial Understanding with Vision Language Models
by: Cai, Wenxiao, et al.
Published: (2024)
UEMM-Air: Make Unmanned Aerial Vehicles Perform More Multi-modal Tasks
by: Yao, Liang, et al.
Published: (2024)
Learning Attribute-Aware Hash Codes for Fine-Grained Image Retrieval via Query Optimization
by: Wang, Peng, et al.
Published: (2025)
Attributes Grouping and Mining Hashing for Fine-Grained Image Retrieval
by: Lu, Xin, et al.
Published: (2023)
Explainable Part-Based Vehicle Classifier with Spatial Awareness
by: Caduff, Andreas, et al.
Published: (2026)
SpatialDreamer: Incentivizing Spatial Reasoning via Active Mental Imagery
by: Cao, Meng, et al.
Published: (2025)
SASP: Strip-Aware Spatial Perception for Fine-Grained Bird Image Classification
by: Wang, Zheng
Published: (2025)
SAPNet++: Evolving Point-Prompted Instance Segmentation with Semantic and Spatial Awareness
by: Wei, Zhaoyang, et al.
Published: (2026)
Dual Attribute-Spatial Relation Alignment for 3D Visual Grounding
by: Xu, Yue, et al.
Published: (2024)
Fine-Grained Spatially Varying Material Selection in Images
by: Guerrero-Viu, Julia, et al.
Published: (2025)
Aerial-NeRF: Adaptive Spatial Partitioning and Sampling for Large-Scale Aerial Rendering
by: Zhang, Xiaohan, et al.
Published: (2024)
GCP: Guarded Collaborative Perception with Spatial-Temporal Aware Malicious Agent Detection
by: Tao, Yihang, et al.
Published: (2025)
Fine-Grained Preference Optimization Improves Spatial Reasoning in VLMs
by: Shen, Yifan, et al.
Published: (2025)
Stepping VLMs onto the Court: Benchmarking Spatial Intelligence in Sports
by: Yang, Yuchen, et al.
Published: (2026)
Sketch and Text Synergy: Fusing Structural Contours and Descriptive Attributes for Fine-Grained Image Retrieval
by: Wang, Siyuan, et al.
Published: (2026)
Fine-Grained Spatial and Verbal Losses for 3D Visual Grounding
by: Dey, Sombit, et al.
Published: (2024)
SFFR: Spatial-Frequency Feature Reconstruction for Multispectral Aerial Object Detection
by: Zuo, Xin, et al.
Published: (2025)
Cross Modal Fine-Grained Alignment via Granularity-Aware and Region-Uncertain Modeling
by: Liu, Jiale, et al.
Published: (2025)
Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control
by: Han, Yue, et al.
Published: (2024)
Holi-Spatial: Evolving Video Streams into Holistic 3D Spatial Intelligence
by: Gao, Yuanyuan, et al.
Published: (2026)
Exploring Boundary-Aware Spatial-Frequency Fusion for Camouflaged Object Detection
by: Yu, Song, et al.
Published: (2026)
Accurate and Fast Pixel Retrieval with Spatial and Uncertainty Aware Hypergraph Diffusion
by: An, Guoyuan, et al.
Published: (2024)
Spatial Policy: Guiding Visuomotor Robotic Manipulation with Spatial-Aware Modeling and Reasoning
by: Liu, Yijun, et al.
Published: (2025)
UniFGVC: Universal Training-Free Few-Shot Fine-Grained Vision Classification via Attribute-Aware Multimodal Retrieval
by: Guo, Hongyu, et al.
Published: (2025)
Dual-Pathway Geometry-Aware MLLM for Spatial Intelligence
by: Zheng, Yufei, et al.
Published: (2026)
LWGANet: Addressing Spatial and Channel Redundancy in Remote Sensing Visual Tasks with Light-Weight Grouped Attention
by: Lu, Wei, et al.
Published: (2025)
Enhancing Fine-Grained Spatial Grounding in 3D CT Report Generation via Discriminative Guidance
by: Wang, Chenyu, et al.
Published: (2026)
Freeview Sketching: View-Aware Fine-Grained Sketch-Based Image Retrieval
by: Sain, Aneeshan, et al.
Published: (2024)
SweetTok: Semantic-Aware Spatial-Temporal Tokenizer for Compact Video Discretization
by: Tan, Zhentao, et al.
Published: (2024)
RoadBench: Benchmarking MLLMs on Fine-Grained Spatial Understanding and Reasoning under Urban Road Scenarios
by: Zhang, Jun, et al.
Published: (2025)
Beyond Frequency: Seeing Subtle Cues Through the Lens of Spatial Decomposition for Fine-Grained Visual Classification
by: Xu, Qin, et al.
Published: (2025)
Self-Creative Text-to-Object Generation using Semantic-Aware Spatial Weighting
by: Yu, Yue, et al.
Published: (2026)
LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding
by: Li, Hongyu, et al.
Published: (2025)
GeneQuery: A General QA-based Framework for Spatial Gene Expression Predictions from Histology Images
by: Xiong, Ying, et al.
Published: (2024)
Context-Aware Aerial Object Detection: Leveraging Inter-Object and Background Relationships
by: Ren, Botao, et al.
Published: (2024)