:: Library Catalog

Bibliographic Details
Main Authors:	Wilcox, Albert; Ghanem, Mohamed; Moghani, Masoud; Barroso, Pierre; Joffe, Benjamin; Garg, Animesh
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence; Machine Learning; Robotics
Online Access:	https://arxiv.org/abs/2503.04877

Similar Items

AMPLIFY: Actionless Motion Priors for Robot Learning from Videos
by: Collins, Jeremy A., et al.
Published: (2025)

SuFIA-BC: Generating High Quality Demonstration Data for Visuomotor Policy Learning in Surgical Subtasks
by: Moghani, Masoud, et al.
Published: (2025)

SoftMimicGen: A Data Generation System for Scalable Robot Learning in Deformable Object Manipulation
by: Moghani, Masoud, et al.
Published: (2026)

QueST: Self-Supervised Skill Abstractions for Learning Continuous Control
by: Mete, Atharva, et al.
Published: (2024)

SPIRE: Synergistic Planning, Imitation, and Reinforcement Learning for Long-Horizon Manipulation
by: Zhou, Zihan, et al.
Published: (2024)

SuFIA: Language-Guided Augmented Dexterity for Robotic Surgical Assistants
by: Moghani, Masoud, et al.
Published: (2024)

COBALT: Crowdsourcing Robot Learning via Cloud-Based Teleoperation with Smartphones
by: Agarwal, Ayush, et al.
Published: (2026)

SAIL: Faster-than-Demonstration Execution of Imitation Learning Policies
by: Arachchige, Nadun Ranawaka, et al.
Published: (2025)

ORBIT-Surgical: An Open-Simulation Framework for Learning Surgical Augmented Dexterity
by: Yu, Qinxi, et al.
Published: (2024)

OG-VLA: Orthographic Image Generation for 3D-Aware Vision-Language Action Model
by: Singh, Ishika, et al.
Published: (2025)

Towards Learning a Generalizable 3D Scene Representation from 2D Observations
by: Gromniak, Martin, et al.
Published: (2026)

QueSTMaps: Queryable Semantic Topological Maps for 3D Scene Understanding
by: Mehan, Yash, et al.
Published: (2024)

OpenLex3D: A Tiered Evaluation Benchmark for Open-Vocabulary 3D Scene Representations
by: Kassab, Christina, et al.
Published: (2025)

Discovering Robotic Interaction Modes with Discrete Representation Learning
by: Wang, Liquan, et al.
Published: (2024)

Unifying Scene Representation and Hand-Eye Calibration with 3D Foundation Models
by: Zhi, Weiming, et al.
Published: (2024)

Adaptive Keyframe Selection for Scalable 3D Scene Reconstruction in Dynamic Environments
by: Jha, Raman, et al.
Published: (2025)

3D Diffuser Actor: Policy Diffusion with 3D Scene Representations
by: Ke, Tsung-Wei, et al.
Published: (2024)

Articulate3D: Holistic Understanding of 3D Scenes as Universal Scene Description
by: Halacheva, Anna-Maria, et al.
Published: (2024)

Towards Fusing Point Cloud and Visual Representations for Imitation Learning
by: Donat, Atalay, et al.
Published: (2025)

What Is The Best 3D Scene Representation for Robotics? From Geometric to Foundation Models
by: Deng, Tianchen, et al.
Published: (2025)

OpenOcc: Open Vocabulary 3D Scene Reconstruction via Occupancy Representation
by: Jiang, Haochen, et al.
Published: (2024)

Preference-Driven Active 3D Scene Representation for Robotic Inspection in Nuclear Decommissioning
by: Meng, Zhen, et al.
Published: (2025)

RoCoDA: Counterfactual Data Augmentation for Data-Efficient Robot Learning from Demonstrations
by: Ameperosa, Ezra, et al.
Published: (2024)

R3D: Revisiting 3D Policy Learning
by: Hong, Zhengdong, et al.
Published: (2026)

CL3R: 3D Reconstruction and Contrastive Learning for Enhanced Robotic Manipulation Representations
by: Cui, Wenbo, et al.
Published: (2025)

Clutt3R-Seg: Sparse-view 3D Instance Segmentation for Language-grounded Grasping in Cluttered Scenes
by: Noh, Jeongho, et al.
Published: (2026)

StereoAdapter: Adapting Stereo Depth Estimation to Underwater Scenes
by: Wu, Zhengri, et al.
Published: (2025)

OSN: Infinite Representations of Dynamic 3D Scenes from Monocular Videos
by: Song, Ziyang, et al.
Published: (2024)

3D-Mem: 3D Scene Memory for Embodied Exploration and Reasoning
by: Yang, Yuncong, et al.
Published: (2024)

SHOW3D: Capturing Scenes of 3D Hands and Objects in the Wild
by: Rim, Patrick, et al.
Published: (2026)

REACT3D: Recovering Articulations for Interactive Physical 3D Scenes
by: Huang, Zhao, et al.
Published: (2025)

Language-Assisted 3D Scene Understanding
by: Wu, Yanmin, et al.
Published: (2023)

WildScenes: A Benchmark for 2D and 3D Semantic Segmentation in Large-scale Natural Environments
by: Vidanapathirana, Kavisha, et al.
Published: (2023)

CO^3: Cooperative Unsupervised 3D Representation Learning for Autonomous Driving
by: Chen, Runjian, et al.
Published: (2022)

Text-Scene: A Scene-to-Language Parsing Framework for 3D Scene Understanding
by: Li, Haoyuan, et al.
Published: (2025)

SURPRISE3D: A Dataset for Spatial Understanding and Reasoning in Complex 3D Scenes
by: Huang, Jiaxin, et al.
Published: (2025)

EvObj: Learning Evolving Object-centric Representations for 3D Instance Segmentation without Scene Supervision
by: Chen, Jiahao, et al.
Published: (2026)

Generate, Transfer, Adapt: Learning Functional Dexterous Grasping from a Single Human Demonstration
by: He, Xingyi, et al.
Published: (2026)

Incremental Joint Learning of Depth, Pose and Implicit Scene Representation on Monocular Camera in Large-scale Scenes
by: Deng, Tianchen, et al.
Published: (2024)

DGSG-Mind: Dynamic 3D Gaussian Scene Graphs for Long-Term Scene Understanding and Grounding
by: Ge, Luzhou, et al.
Published: (2026)