:: Library Catalog

Bibliographic Details
Main Authors:	Li, Yanbang; Gong, Ziyang; Li, Haoyang; Huang, Xiaoqi; Kang, Haolan; Bai, Guangping; Ma, Xianzheng
Format:	Preprint
Published:	2025
Subjects:	Robotics; Artificial Intelligence; Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2505.00693

Similar Items

AIC MLLM: Autonomous Interactive Correction MLLM for Robust Robotic Manipulation
by: Xiong, Chuyan, et al.
Published: (2024)

Distracted Robot: How Visual Clutter Undermine Robotic Manipulation
by: Rasouli, Amir, et al.
Published: (2025)

Instruction-Guided Visual Masking
by: Zheng, Jinliang, et al.
Published: (2024)

Spatially Visual Perception for End-to-End Robotic Learning
by: Davies, Travis, et al.
Published: (2024)

Robots Pre-train Robots: Manipulation-Centric Robotic Representation from Large-Scale Robot Datasets
by: Jiang, Guangqi, et al.
Published: (2024)

Visual Foresight for Robotic Stow: A Diffusion-Based World Model from Sparse Snapshots
by: Zhang, Lijun, et al.
Published: (2026)

Visual IRL for Human-Like Robotic Manipulation
by: Asali, Ehsan, et al.
Published: (2024)

ReKep: Spatio-Temporal Reasoning of Relational Keypoint Constraints for Robotic Manipulation
by: Huang, Wenlong, et al.
Published: (2024)

SpatialAnt: Autonomous Zero-Shot Robot Navigation via Active Scene Reconstruction and Visual Anticipation
by: Zhang, Jiwen, et al.
Published: (2026)

Grounding 3D Object Affordance with Language Instructions, Visual Observations and Interactions
by: Zhu, He, et al.
Published: (2025)

UAD: Unsupervised Affordance Distillation for Generalization in Robotic Manipulation
by: Tang, Yihe, et al.
Published: (2025)

Mobile-Seed: Joint Semantic Segmentation and Boundary Detection for Mobile Robots
by: Liao, Youqi, et al.
Published: (2023)

Advances and Innovations in the Multi-Agent Robotic System (MARS) Challenge
by: Kang, Li, et al.
Published: (2026)

Robix: A Unified Model for Robot Interaction, Reasoning and Planning
by: Fang, Huang, et al.
Published: (2025)

Vega: Learning to Drive with Natural Language Instructions
by: Zuo, Sicheng, et al.
Published: (2026)

Manipulation as in Simulation: Enabling Accurate Geometry Perception in Robots
by: Liu, Minghuan, et al.
Published: (2025)

Robotic Ultrasound Makes CBCT Alive
by: Li, Feng, et al.
Published: (2026)

SEM: Enhancing Spatial Understanding for Robust Robot Manipulation
by: Lin, Xuewu, et al.
Published: (2025)

Shape Completion and Real-Time Visualization in Robotic Ultrasound Spine Acquisitions
by: Gafencu, Miruna-Alexandra, et al.
Published: (2025)

What Matters to You? Towards Visual Representation Alignment for Robot Learning
by: Tian, Ran, et al.
Published: (2023)

Cross-Modal Instructions for Robot Motion Generation
by: Barron, William, et al.
Published: (2025)

FloNa: Floor Plan Guided Embodied Visual Navigation
by: Li, Jiaxin, et al.
Published: (2024)

A Real-to-Sim-to-Real Approach to Robotic Manipulation with VLM-Generated Iterative Keypoint Rewards
by: Patel, Shivansh, et al.
Published: (2025)

SKT: Integrating State-Aware Keypoint Trajectories with Vision-Language Models for Robotic Garment Manipulation
by: Li, Xin, et al.
Published: (2024)

Enhanced Safety in Autonomous Driving: Integrating Latent State Diffusion Model for End-to-End Navigation
by: Chu, Detian, et al.
Published: (2024)

Visual Homing in Outdoor Robots Using Mushroom Body Circuits and Learning Walks
by: Gattaux, Gabriel G., et al.
Published: (2025)

ManipDreamer3D : Synthesizing Plausible Robotic Manipulation Video with Occupancy-aware 3D Trajectory
by: Li, Ying, et al.
Published: (2025)

Visual SLAMMOT Considering Multiple Motion Models
by: Tian, Peilin, et al.
Published: (2024)

Recognizing Actions from Robotic View for Natural Human-Robot Interaction
by: Wang, Ziyi, et al.
Published: (2025)

AgriVLN: Vision-and-Language Navigation for Agricultural Robots
by: Zhao, Xiaobei, et al.
Published: (2025)

PointWorld: Scaling 3D World Models for In-The-Wild Robotic Manipulation
by: Huang, Wenlong, et al.
Published: (2026)

SkiP: When to Skip and When to Refine for Efficient Robot Manipulation
by: Dai, Mingtong, et al.
Published: (2026)

A Multi-Modal Neuro-Symbolic Approach for Spatial Reasoning-Based Visual Grounding in Robotics
by: Jahangard, Simindokht, et al.
Published: (2025)

RoboVIP: Multi-View Video Generation with Visual Identity Prompting Augments Robot Manipulation
by: Wang, Boyang, et al.
Published: (2026)

Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness
by: Wang, Haochen, et al.
Published: (2025)

RoomTour3D: Geometry-Aware Video-Instruction Tuning for Embodied Navigation
by: Han, Mingfei, et al.
Published: (2024)

VIKI-R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning
by: Kang, Li, et al.
Published: (2025)

When Search Becomes Memory: Turning Robot Design Trials into Transferable Skills
by: Wang, Yunfei, et al.
Published: (2026)

Multi-Modal World Model for Physical Robot Interactions: Simultaneous Visual and Tactile Predictions for Enhanced Accuracy
by: Mandil, Willow, et al.
Published: (2023)

Adaptive Visual Imitation Learning for Robotic Assisted Feeding Across Varied Bowl Configurations and Food Types
by: Liu, Rui, et al.
Published: (2024)