Saved in:
Bibliographic Details
Main Authors: Salehi, Mohammad Sadegh, Perkins, Alex, Maurell, Igor, Dabbagh, Ashkan, Wong, Raymond
Format: Preprint
Published: 2026
Subjects:
Online Access:https://arxiv.org/abs/2604.23018
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1866914562354184192
author Salehi, Mohammad Sadegh
Perkins, Alex
Maurell, Igor
Dabbagh, Ashkan
Wong, Raymond
author_facet Salehi, Mohammad Sadegh
Perkins, Alex
Maurell, Igor
Dabbagh, Ashkan
Wong, Raymond
contents Web-scale 3D asset collections are abundant but rarely deployment-ready, suffering from arbitrary metric scaling, incorrect pivots, brittle geometry, and incomplete textures, defects that limit their use in embodied AI, robotics, and spatial computing. We present AmaraSpatial-10K, a dataset of over 10,000 synthetic 3D assets optimised for zero-shot deployment. Each asset ships as a metric-scaled, deterministically anchored .glb with separated PBR maps, a convex collision hull, a paired reference image, and multi-sentence text metadata. Alongside the dataset we introduce a reusable evaluation suite for 3D asset banks, a continuous Scale Plausibility Score (SPS), an LLM Concept Density metric, anchor-error auditing, and a cross-modal CLIP coherence protocol, and apply it to AmaraSpatial-10K alongside matched subsets of Objaverse, HSSD, ABO, and GSO. AmaraSpatial-10K improves CLIP Recall@5 by $3.4\times$ over Objaverse ($0.612$ vs. $0.181$, median rank $267 \rightarrow 3$), achieves a $99.1\%$ physics-stability rate under Habitat-Sim with $\sim 20\times$ wall-time speed-up, and produces zero-overlap scenes when used as a drop-in asset bank for Holodeck. Controlled ablations on the same asset bank attribute the retrieval gain to description richness.
format Preprint
id arxiv_https___arxiv_org_abs_2604_23018
institution arXiv
publishDate 2026
record_format arxiv
spellingShingle AmaraSpatial-10K: A Spatially and Semantically Aligned 3D Dataset for Spatial Computing and Embodied AI
Salehi, Mohammad Sadegh
Perkins, Alex
Maurell, Igor
Dabbagh, Ashkan
Wong, Raymond
Computer Vision and Pattern Recognition
Artificial Intelligence
Machine Learning
Web-scale 3D asset collections are abundant but rarely deployment-ready, suffering from arbitrary metric scaling, incorrect pivots, brittle geometry, and incomplete textures, defects that limit their use in embodied AI, robotics, and spatial computing. We present AmaraSpatial-10K, a dataset of over 10,000 synthetic 3D assets optimised for zero-shot deployment. Each asset ships as a metric-scaled, deterministically anchored .glb with separated PBR maps, a convex collision hull, a paired reference image, and multi-sentence text metadata. Alongside the dataset we introduce a reusable evaluation suite for 3D asset banks, a continuous Scale Plausibility Score (SPS), an LLM Concept Density metric, anchor-error auditing, and a cross-modal CLIP coherence protocol, and apply it to AmaraSpatial-10K alongside matched subsets of Objaverse, HSSD, ABO, and GSO. AmaraSpatial-10K improves CLIP Recall@5 by $3.4\times$ over Objaverse ($0.612$ vs. $0.181$, median rank $267 \rightarrow 3$), achieves a $99.1\%$ physics-stability rate under Habitat-Sim with $\sim 20\times$ wall-time speed-up, and produces zero-overlap scenes when used as a drop-in asset bank for Holodeck. Controlled ablations on the same asset bank attribute the retrieval gain to description richness.
title AmaraSpatial-10K: A Spatially and Semantically Aligned 3D Dataset for Spatial Computing and Embodied AI
topic Computer Vision and Pattern Recognition
Artificial Intelligence
Machine Learning
url https://arxiv.org/abs/2604.23018