Bibliographic Details
Main Authors:	Salehi, Mohammad Sadegh; Perkins, Alex; Maurell, Igor; Dabbagh, Ashkan; Wong, Raymond
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2604.23018

Similar Items

SpatialPoint: Spatial-aware Point Prediction for Embodied Localization
by: Zhu, Qiming, et al.
Published: (2026)

Lost in Volume: The CT-SpatialVQA Benchmark for Evaluating Semantic-Spatial Understanding of 3D Medical Vision-Language Models
by: Monon, Mashrafi, et al.
Published: (2026)

Embodied Spatial Intelligence: from Implicit Scene Modeling to Spatial Reasoning
by: Fang, Jiading
Published: (2025)

SVR-GS: Spatially Variant Regularization for Probabilistic Masks in 3D Gaussian Splatting
by: Taghipour, Ashkan, et al.
Published: (2025)

SPA: 3D Spatial-Awareness Enables Effective Embodied Representation
by: Zhu, Haoyi, et al.
Published: (2024)

Predicting Future States with Spatial Point Processes in Single Molecule Resolution Spatial Transcriptomics
by: Rout, Biraaj, et al.
Published: (2024)

MASER: Modality-Adaptive Specialist Routing for Embodied 3D Spatial Intelligence
by: Raj, Hilton, et al.
Published: (2026)

AI-driven 3D Spatial Transcriptomics
by: Almagro-Pérez, Cristina, et al.
Published: (2025)

Sentinel: Embodied Cooperative Spatial Reasoning and Planning
by: Lin, Xiangye, et al.
Published: (2026)

Improving Dental Diagnostics: Enhanced Convolution with Spatial Attention Mechanism
by: Rezaie, Shahriar, et al.
Published: (2024)

Vision to Geometry: 3D Spatial Memory for Sequential Embodied MLLM Reasoning and Exploration
by: Cai, Zhongyi, et al.
Published: (2025)

SpatialVID: A Large-Scale Video Dataset with Spatial Annotations
by: Wang, Jiahao, et al.
Published: (2025)

Embodied3DBench: Benchmarking Low-Level Embodied Spatial Intelligence of Vision Language Models
by: Zhang, Jiyao, et al.
Published: (2026)

SpatialReasoner: Towards Explicit and Generalizable 3D Spatial Reasoning
by: Ma, Wufei, et al.
Published: (2025)

SMILE: Infusing Spatial and Motion Semantics in Masked Video Learning
by: Thoker, Fida Mohammad, et al.
Published: (2025)

EmbSpatial-Bench: Benchmarking Spatial Understanding for Embodied Tasks with Large Vision-Language Models
by: Du, Mengfei, et al.
Published: (2024)

SpatialCoT: Advancing Spatial Reasoning through Coordinate Alignment and Chain-of-Thought for Embodied Task Planning
by: Liu, Yuecheng, et al.
Published: (2025)

Multi-Object 3D Grounding with Dynamic Modules and Language-Informed Spatial Attention
by: Zhang, Haomeng, et al.
Published: (2024)

InternSpatial: A Comprehensive Dataset for Spatial Reasoning in Vision-Language Models
by: Deng, Nianchen, et al.
Published: (2025)

MetaSpatial: Reinforcing 3D Spatial Reasoning in VLMs for the Metaverse
by: Pan, Zhenyu, et al.
Published: (2025)

Human-AI Divergence in Ego-centric Action Recognition under Spatial and Spatiotemporal Manipulations
by: Rahmaniboldaji, Sadegh, et al.
Published: (2026)

Holi-Spatial: Evolving Video Streams into Holistic 3D Spatial Intelligence
by: Gao, Yuanyuan, et al.
Published: (2026)

Spatial 3D-LLM: Exploring Spatial Awareness in 3D Vision-Language Models
by: Wang, Xiaoyan, et al.
Published: (2025)

MM-Spatial: Exploring 3D Spatial Understanding in Multimodal LLMs
by: Daxberger, Erik, et al.
Published: (2025)

SPATIALALIGN: Aligning Dynamic Spatial Relationships in Video Generation
by: Liu, Fengming, et al.
Published: (2026)

GSMem: 3D Gaussian Splatting as Persistent Spatial Memory for Zero-Shot Embodied Exploration and Reasoning
by: Lu, Yiren, et al.
Published: (2026)

SpatialGeo: Boosting Spatial Reasoning in Multimodal LLMs via Geometry-Semantics Fusion
by: Guo, Jiajie, et al.
Published: (2025)

SpatialSplat: Efficient Semantic 3D from Sparse Unposed Images
by: Sheng, Yu, et al.
Published: (2025)

Large Spatial Model: End-to-end Unposed Images to Semantic 3D
by: Fan, Zhiwen, et al.
Published: (2024)

Robust 3D Semantic Occupancy Prediction with Calibration-free Spatial Transformation
by: Zhuang, Zhuangwei, et al.
Published: (2024)

3D Reconstruction with Spatial Memory
by: Wang, Hengyi, et al.
Published: (2024)

ASIGN: An Anatomy-aware Spatial Imputation Graphic Network for 3D Spatial Transcriptomics
by: Zhu, Junchao, et al.
Published: (2024)

SpatialStack: Layered Geometry-Language Fusion for 3D VLM Spatial Reasoning
by: Zhang, Jian, et al.
Published: (2026)

pySpatial: Generating 3D Visual Programs for Zero-Shot Spatial Reasoning
by: Luo, Zhanpeng, et al.
Published: (2026)

HiSpatial: Taming Hierarchical 3D Spatial Understanding in Vision-Language Models
by: Liang, Huizhi, et al.
Published: (2026)

Semantic Foam: Unifying Spatial and Semantic Scene Decomposition
by: Sharafeldin, Amr, et al.
Published: (2026)

SmartSpatial: Enhancing the 3D Spatial Arrangement Capabilities of Stable Diffusion Models and Introducing a Novel 3D Spatial Evaluation Framework
by: Huang, Mao Xun, et al.
Published: (2025)

Spatial Frequency Modulation for Semantic Segmentation
by: Chen, Linwei, et al.
Published: (2025)

Towards Understanding Multimodal Fine-Tuning: Spatial Features
by: Naghashyar, Lachin, et al.
Published: (2026)

SURPRISE3D: A Dataset for Spatial Understanding and Reasoning in Complex 3D Scenes
by: Huang, Jiaxin, et al.
Published: (2025)