| Main Authors: | Li, Yanbang, Gong, Ziyang, Li, Haoyang, Huang, Xiaoqi, Kang, Haolan, Bai, Guangping, Ma, Xianzheng |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2505.00693 |
Similar Items
AIC MLLM: Autonomous Interactive Correction MLLM for Robust Robotic Manipulation
by: Xiong, Chuyan, et al.
Published: (2024)
Distracted Robot: How Visual Clutter Undermine Robotic Manipulation
by: Rasouli, Amir, et al.
Published: (2025)
Instruction-Guided Visual Masking
by: Zheng, Jinliang, et al.
Published: (2024)
Spatially Visual Perception for End-to-End Robotic Learning
by: Davies, Travis, et al.
Published: (2024)
Robots Pre-train Robots: Manipulation-Centric Robotic Representation from Large-Scale Robot Datasets
by: Jiang, Guangqi, et al.
Published: (2024)
Visual Foresight for Robotic Stow: A Diffusion-Based World Model from Sparse Snapshots
by: Zhang, Lijun, et al.
Published: (2026)
Visual IRL for Human-Like Robotic Manipulation
by: Asali, Ehsan, et al.
Published: (2024)
ReKep: Spatio-Temporal Reasoning of Relational Keypoint Constraints for Robotic Manipulation
by: Huang, Wenlong, et al.
Published: (2024)
SpatialAnt: Autonomous Zero-Shot Robot Navigation via Active Scene Reconstruction and Visual Anticipation
by: Zhang, Jiwen, et al.
Published: (2026)
Grounding 3D Object Affordance with Language Instructions, Visual Observations and Interactions
by: Zhu, He, et al.
Published: (2025)
UAD: Unsupervised Affordance Distillation for Generalization in Robotic Manipulation
by: Tang, Yihe, et al.
Published: (2025)
Mobile-Seed: Joint Semantic Segmentation and Boundary Detection for Mobile Robots
by: Liao, Youqi, et al.
Published: (2023)
Advances and Innovations in the Multi-Agent Robotic System (MARS) Challenge
by: Kang, Li, et al.
Published: (2026)
Robix: A Unified Model for Robot Interaction, Reasoning and Planning
by: Fang, Huang, et al.
Published: (2025)
Vega: Learning to Drive with Natural Language Instructions
by: Zuo, Sicheng, et al.
Published: (2026)
Manipulation as in Simulation: Enabling Accurate Geometry Perception in Robots
by: Liu, Minghuan, et al.
Published: (2025)
Robotic Ultrasound Makes CBCT Alive
by: Li, Feng, et al.
Published: (2026)
SEM: Enhancing Spatial Understanding for Robust Robot Manipulation
by: Lin, Xuewu, et al.
Published: (2025)
Shape Completion and Real-Time Visualization in Robotic Ultrasound Spine Acquisitions
by: Gafencu, Miruna-Alexandra, et al.
Published: (2025)
What Matters to You? Towards Visual Representation Alignment for Robot Learning
by: Tian, Ran, et al.
Published: (2023)
Cross-Modal Instructions for Robot Motion Generation
by: Barron, William, et al.
Published: (2025)
FloNa: Floor Plan Guided Embodied Visual Navigation
by: Li, Jiaxin, et al.
Published: (2024)
A Real-to-Sim-to-Real Approach to Robotic Manipulation with VLM-Generated Iterative Keypoint Rewards
by: Patel, Shivansh, et al.
Published: (2025)
SKT: Integrating State-Aware Keypoint Trajectories with Vision-Language Models for Robotic Garment Manipulation
by: Li, Xin, et al.
Published: (2024)
Enhanced Safety in Autonomous Driving: Integrating Latent State Diffusion Model for End-to-End Navigation
by: Chu, Detian, et al.
Published: (2024)
Visual Homing in Outdoor Robots Using Mushroom Body Circuits and Learning Walks
by: Gattaux, Gabriel G., et al.
Published: (2025)
ManipDreamer3D : Synthesizing Plausible Robotic Manipulation Video with Occupancy-aware 3D Trajectory
by: Li, Ying, et al.
Published: (2025)
Visual SLAMMOT Considering Multiple Motion Models
by: Tian, Peilin, et al.
Published: (2024)
Recognizing Actions from Robotic View for Natural Human-Robot Interaction
by: Wang, Ziyi, et al.
Published: (2025)
AgriVLN: Vision-and-Language Navigation for Agricultural Robots
by: Zhao, Xiaobei, et al.
Published: (2025)
PointWorld: Scaling 3D World Models for In-The-Wild Robotic Manipulation
by: Huang, Wenlong, et al.
Published: (2026)
SkiP: When to Skip and When to Refine for Efficient Robot Manipulation
by: Dai, Mingtong, et al.
Published: (2026)
A Multi-Modal Neuro-Symbolic Approach for Spatial Reasoning-Based Visual Grounding in Robotics
by: Jahangard, Simindokht, et al.
Published: (2025)
RoboVIP: Multi-View Video Generation with Visual Identity Prompting Augments Robot Manipulation
by: Wang, Boyang, et al.
Published: (2026)
Ross3D: Reconstructive Visual Instruction Tuning with 3D-Awareness
by: Wang, Haochen, et al.
Published: (2025)
RoomTour3D: Geometry-Aware Video-Instruction Tuning for Embodied Navigation
by: Han, Mingfei, et al.
Published: (2024)
VIKI-R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning
by: Kang, Li, et al.
Published: (2025)
When Search Becomes Memory: Turning Robot Design Trials into Transferable Skills
by: Wang, Yunfei, et al.
Published: (2026)
Multi-Modal World Model for Physical Robot Interactions: Simultaneous Visual and Tactile Predictions for Enhanced Accuracy
by: Mandil, Willow, et al.
Published: (2023)
Adaptive Visual Imitation Learning for Robotic Assisted Feeding Across Varied Bowl Configurations and Food Types
by: Liu, Rui, et al.
Published: (2024)