Bibliographic Details
Main Authors: Gromniak, Martin; Habekost, Jan-Gerrit; Kamp, Sebastian; Magg, Sven; Wermter, Stefan
Format: Preprint
Published: 2026
Subjects:
Online Access: https://arxiv.org/abs/2602.10943
_version_ 1866910019295903744
author Gromniak, Martin
Habekost, Jan-Gerrit
Kamp, Sebastian
Magg, Sven
Wermter, Stefan
author_facet Gromniak, Martin
Habekost, Jan-Gerrit
Kamp, Sebastian
Magg, Sven
Wermter, Stefan
contents We introduce a Generalizable Neural Radiance Field approach for predicting 3D workspace occupancy from egocentric robot observations. Unlike prior methods operating in camera-centric coordinates, our model constructs occupancy representations in a global workspace frame, making it directly applicable to robotic manipulation. The model integrates flexible source views and generalizes to unseen object arrangements without scene-specific fine-tuning. We demonstrate the approach on a humanoid robot and evaluate predicted geometry against 3D sensor ground truth. Trained on 40 real scenes, our model achieves 26 mm reconstruction error, including occluded regions, validating its ability to infer complete 3D occupancy beyond traditional stereo vision methods.
format Preprint
id arxiv_https___arxiv_org_abs_2602_10943
institution arXiv
publishDate 2026
record_format arxiv
title Towards Learning a Generalizable 3D Scene Representation from 2D Observations
topic Computer Vision and Pattern Recognition
Robotics
url https://arxiv.org/abs/2602.10943