:: Library Catalog

Bibliographic Details
Main Authors:	Raj, Hilton; AV, Vishnuram
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2606.02463

Similar Items

Embodied Spatial Intelligence: from Implicit Scene Modeling to Spatial Reasoning
by: Fang, Jiading
Published: (2025)

BIP3D: Bridging 2D Images and 3D Perception for Embodied Intelligence
by: Lin, Xuewu, et al.
Published: (2024)

AmaraSpatial-10K: A Spatially and Semantically Aligned 3D Dataset for Spatial Computing and Embodied AI
by: Salehi, Mohammad Sadegh, et al.
Published: (2026)

Towards Scalable Spatial Intelligence via 2D-to-3D Data Lifting
by: Miao, Xingyu, et al.
Published: (2025)

Reinforced Embodied Active Defense: Exploiting Adaptive Interaction for Robust Visual Perception in Adversarial 3D Environments
by: Yang, Xiao, et al.
Published: (2025)

Embodied-R: Collaborative Framework for Activating Embodied Spatial Reasoning in Foundation Models via Reinforcement Learning
by: Zhao, Baining, et al.
Published: (2025)

SpatialPoint: Spatial-aware Point Prediction for Embodied Localization
by: Zhu, Qiming, et al.
Published: (2026)

SPA: 3D Spatial-Awareness Enables Effective Embodied Representation
by: Zhu, Haoyi, et al.
Published: (2024)

Federated Cross-Modal Retrieval with Missing Modalities via Semantic Routing and Adapter Personalization
by: Zhou, Hefeng, et al.
Published: (2026)

Vision-Language Navigation with Embodied Intelligence: A Survey
by: Gao, Peng, et al.
Published: (2024)

Dejavu: Towards Experience Feedback Learning for Embodied Intelligence
by: Wu, Shaokai, et al.
Published: (2025)

ESI-Bench: Towards Embodied Spatial Intelligence that Closes the Perception-Action Loop
by: Hong, Yining, et al.
Published: (2026)

SpatialScore: Towards Comprehensive Evaluation for Spatial Intelligence
by: Wu, Haoning, et al.
Published: (2025)

SPATIOROUTE: Dynamic Prompt Routing for Zero-Shot Spatial Reasoning
by: Chunhachatrachai, Pawat, et al.
Published: (2026)

Aerial Vision-Language Navigation with a Unified Framework for Spatial, Temporal and Embodied Reasoning
by: Xu, Huilin, et al.
Published: (2025)

Spatial 3D-LLM: Exploring Spatial Awareness in 3D Vision-Language Models
by: Wang, Xiaoyan, et al.
Published: (2025)

3DLLM-Mem: Long-Term Spatial-Temporal Memory for Embodied 3D Large Language Model
by: Hu, Wenbo, et al.
Published: (2025)

SpatialCoT: Advancing Spatial Reasoning through Coordinate Alignment and Chain-of-Thought for Embodied Task Planning
by: Liu, Yuecheng, et al.
Published: (2025)

OmniEVA: Embodied Versatile Planner via Task-Adaptive 3D-Grounded and Embodiment-aware Reasoning
by: Liu, Yuecheng, et al.
Published: (2025)

g3D-LF: Generalizable 3D-Language Feature Fields for Embodied Tasks
by: Wang, Zihan, et al.
Published: (2024)

EmbodiedOcc: Embodied 3D Occupancy Prediction for Vision-based Online Scene Understanding
by: Wu, Yuqi, et al.
Published: (2024)

SpatialLLM: From Multi-modality Data to Urban Spatial Intelligence
by: Chen, Jiabin, et al.
Published: (2025)

MosaicThinker: On-Device Visual Spatial Reasoning for Embodied AI via Iterative Construction of Space Representation
by: Wang, Haoming, et al.
Published: (2026)

MetaSpatial: Reinforcing 3D Spatial Reasoning in VLMs for the Metaverse
by: Pan, Zhenyu, et al.
Published: (2025)

UrbanVideo-Bench: Benchmarking Vision-Language Models on Embodied Intelligence with Video Data in Urban Spaces
by: Zhao, Baining, et al.
Published: (2025)

SmartSpatial: Enhancing the 3D Spatial Arrangement Capabilities of Stable Diffusion Models and Introducing a Novel 3D Spatial Evaluation Framework
by: Huang, Mao Xun, et al.
Published: (2025)

M3D-BFS: a Multi-stage Dynamic Fusion Strategy for Sample-Adaptive Multi-Modal Brain Network Analysis
by: Dong, Rui, et al.
Published: (2026)

EmbSpatial-Bench: Benchmarking Spatial Understanding for Embodied Tasks with Large Vision-Language Models
by: Du, Mengfei, et al.
Published: (2024)

An Embodied Generalist Agent in 3D World
by: Huang, Jiangyong, et al.
Published: (2023)

Sensor-Adaptive Flood Mapping with Pre-trained Multi-Modal Transformers across SAR and Multispectral Modalities
by: Tanaka, Tomohiro, et al.
Published: (2025)

LAECIPS: Large Vision Model Assisted Adaptive Edge-Cloud Collaboration for IoT-based Embodied Intelligence System
by: Hu, Shijing, et al.
Published: (2024)

SpatialForge: Bootstrapping 3D-Aware Spatial Reasoning from Open-World 2D Images
by: Liu, Zishan, et al.
Published: (2026)

G3: An Effective and Adaptive Framework for Worldwide Geolocalization Using Large Multi-Modality Models
by: Jia, Pengyue, et al.
Published: (2024)

3D-Agent: Tri-Modal Multi-Agent Collaboration for Scalable 3D Object Annotation
by: Zhang, Jusheng, et al.
Published: (2026)

VideoRouter: Query-Adaptive Dual Routing for Efficient Long-Video Understanding
by: Lin, Kuanwei, et al.
Published: (2026)

SDA-PLANNER: State-Dependency Aware Adaptive Planner for Embodied Task Planning
by: Shen, Zichao, et al.
Published: (2025)

RoomTour3D: Geometry-Aware Video-Instruction Tuning for Embodied Navigation
by: Han, Mingfei, et al.
Published: (2024)

MM-Mixing: Multi-Modal Mixing Alignment for 3D Understanding
by: Wang, Jiaze, et al.
Published: (2024)

Look Inside for More: Internal Spatial Modality Perception for 3D Anomaly Detection
by: Liang, Hanzhe, et al.
Published: (2024)

EmbodiedVSR: Dynamic Scene Graph-Guided Chain-of-Thought Reasoning for Visual Spatial Tasks
by: Zhang, Yi, et al.
Published: (2025)