| Main Authors: | Raj, Hilton, AV, Vishnuram |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2606.02463 |
Similar Items
Embodied Spatial Intelligence: from Implicit Scene Modeling to Spatial Reasoning
by: Fang, Jiading
Published: (2025)
BIP3D: Bridging 2D Images and 3D Perception for Embodied Intelligence
by: Lin, Xuewu, et al.
Published: (2024)
AmaraSpatial-10K: A Spatially and Semantically Aligned 3D Dataset for Spatial Computing and Embodied AI
by: Salehi, Mohammad Sadegh, et al.
Published: (2026)
Towards Scalable Spatial Intelligence via 2D-to-3D Data Lifting
by: Miao, Xingyu, et al.
Published: (2025)
Reinforced Embodied Active Defense: Exploiting Adaptive Interaction for Robust Visual Perception in Adversarial 3D Environments
by: Yang, Xiao, et al.
Published: (2025)
Embodied-R: Collaborative Framework for Activating Embodied Spatial Reasoning in Foundation Models via Reinforcement Learning
by: Zhao, Baining, et al.
Published: (2025)
SpatialPoint: Spatial-aware Point Prediction for Embodied Localization
by: Zhu, Qiming, et al.
Published: (2026)
SPA: 3D Spatial-Awareness Enables Effective Embodied Representation
by: Zhu, Haoyi, et al.
Published: (2024)
Federated Cross-Modal Retrieval with Missing Modalities via Semantic Routing and Adapter Personalization
by: Zhou, Hefeng, et al.
Published: (2026)
Vision-Language Navigation with Embodied Intelligence: A Survey
by: Gao, Peng, et al.
Published: (2024)
Dejavu: Towards Experience Feedback Learning for Embodied Intelligence
by: Wu, Shaokai, et al.
Published: (2025)
ESI-Bench: Towards Embodied Spatial Intelligence that Closes the Perception-Action Loop
by: Hong, Yining, et al.
Published: (2026)
SpatialScore: Towards Comprehensive Evaluation for Spatial Intelligence
by: Wu, Haoning, et al.
Published: (2025)
SPATIOROUTE: Dynamic Prompt Routing for Zero-Shot Spatial Reasoning
by: Chunhachatrachai, Pawat, et al.
Published: (2026)
Aerial Vision-Language Navigation with a Unified Framework for Spatial, Temporal and Embodied Reasoning
by: Xu, Huilin, et al.
Published: (2025)
Spatial 3D-LLM: Exploring Spatial Awareness in 3D Vision-Language Models
by: Wang, Xiaoyan, et al.
Published: (2025)
3DLLM-Mem: Long-Term Spatial-Temporal Memory for Embodied 3D Large Language Model
by: Hu, Wenbo, et al.
Published: (2025)
SpatialCoT: Advancing Spatial Reasoning through Coordinate Alignment and Chain-of-Thought for Embodied Task Planning
by: Liu, Yuecheng, et al.
Published: (2025)
OmniEVA: Embodied Versatile Planner via Task-Adaptive 3D-Grounded and Embodiment-aware Reasoning
by: Liu, Yuecheng, et al.
Published: (2025)
g3D-LF: Generalizable 3D-Language Feature Fields for Embodied Tasks
by: Wang, Zihan, et al.
Published: (2024)
EmbodiedOcc: Embodied 3D Occupancy Prediction for Vision-based Online Scene Understanding
by: Wu, Yuqi, et al.
Published: (2024)
SpatialLLM: From Multi-modality Data to Urban Spatial Intelligence
by: Chen, Jiabin, et al.
Published: (2025)
MosaicThinker: On-Device Visual Spatial Reasoning for Embodied AI via Iterative Construction of Space Representation
by: Wang, Haoming, et al.
Published: (2026)
MetaSpatial: Reinforcing 3D Spatial Reasoning in VLMs for the Metaverse
by: Pan, Zhenyu, et al.
Published: (2025)
UrbanVideo-Bench: Benchmarking Vision-Language Models on Embodied Intelligence with Video Data in Urban Spaces
by: Zhao, Baining, et al.
Published: (2025)
SmartSpatial: Enhancing the 3D Spatial Arrangement Capabilities of Stable Diffusion Models and Introducing a Novel 3D Spatial Evaluation Framework
by: Huang, Mao Xun, et al.
Published: (2025)
M3D-BFS: a Multi-stage Dynamic Fusion Strategy for Sample-Adaptive Multi-Modal Brain Network Analysis
by: Dong, Rui, et al.
Published: (2026)
EmbSpatial-Bench: Benchmarking Spatial Understanding for Embodied Tasks with Large Vision-Language Models
by: Du, Mengfei, et al.
Published: (2024)
An Embodied Generalist Agent in 3D World
by: Huang, Jiangyong, et al.
Published: (2023)
Sensor-Adaptive Flood Mapping with Pre-trained Multi-Modal Transformers across SAR and Multispectral Modalities
by: Tanaka, Tomohiro, et al.
Published: (2025)
LAECIPS: Large Vision Model Assisted Adaptive Edge-Cloud Collaboration for IoT-based Embodied Intelligence System
by: Hu, Shijing, et al.
Published: (2024)
SpatialForge: Bootstrapping 3D-Aware Spatial Reasoning from Open-World 2D Images
by: Liu, Zishan, et al.
Published: (2026)
G3: An Effective and Adaptive Framework for Worldwide Geolocalization Using Large Multi-Modality Models
by: Jia, Pengyue, et al.
Published: (2024)
3D-Agent: Tri-Modal Multi-Agent Collaboration for Scalable 3D Object Annotation
by: Zhang, Jusheng, et al.
Published: (2026)
VideoRouter: Query-Adaptive Dual Routing for Efficient Long-Video Understanding
by: Lin, Kuanwei, et al.
Published: (2026)
SDA-PLANNER: State-Dependency Aware Adaptive Planner for Embodied Task Planning
by: Shen, Zichao, et al.
Published: (2025)
RoomTour3D: Geometry-Aware Video-Instruction Tuning for Embodied Navigation
by: Han, Mingfei, et al.
Published: (2024)
MM-Mixing: Multi-Modal Mixing Alignment for 3D Understanding
by: Wang, Jiaze, et al.
Published: (2024)
Look Inside for More: Internal Spatial Modality Perception for 3D Anomaly Detection
by: Liang, Hanzhe, et al.
Published: (2024)
EmbodiedVSR: Dynamic Scene Graph-Guided Chain-of-Thought Reasoning for Visual Spatial Tasks
by: Zhang, Yi, et al.
Published: (2025)