:: Library Catalog

Bibliographic Details
Main Authors:	Yang, Libing; Li, Yang; Chen, Long
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence; Robotics
Online Access:	https://arxiv.org/abs/2405.04549

Similar Items

PEAfowl: Perception-Enhanced Multi-View Vision-Language-Action for Bimanual Manipulation
by: Fan, Qingyu, et al.
Published: (2026)

InternVLA-M1: A Spatially Guided Vision-Language-Action Framework for Generalist Robot Policy
by: Chen, Xinyi, et al.
Published: (2025)

On-Device Diffusion Transformer Policy for Efficient Robot Manipulation
by: Wu, Yiming, et al.
Published: (2025)

HAMSTER: Hierarchical Action Models For Open-World Robot Manipulation
by: Li, Yi, et al.
Published: (2025)

Toward Aligning Human and Robot Actions via Multi-Modal Demonstration Learning
by: Zahid, Azizul, et al.
Published: (2025)

Explainable Adversarial-Robust Vision-Language-Action Model for Robotic Manipulation
by: Kim, Ju-Young, et al.
Published: (2025)

SEM: Enhancing Spatial Understanding for Robust Robot Manipulation
by: Lin, Xuewu, et al.
Published: (2025)

Spatial Policy: Guiding Visuomotor Robotic Manipulation with Spatial-Aware Modeling and Reasoning
by: Liu, Yijun, et al.
Published: (2025)

Chameleon: Episodic Memory for Long-Horizon Robotic Manipulation
by: Guo, Xinying, et al.
Published: (2026)

Global Prior Meets Local Consistency: Dual-Memory Augmented Vision-Language-Action Model for Efficient Robotic Manipulation
by: Li, Zaijing, et al.
Published: (2026)

Manipulation as in Simulation: Enabling Accurate Geometry Perception in Robots
by: Liu, Minghuan, et al.
Published: (2025)

FUNCanon: Learning Pose-Aware Action Primitives via Functional Object Canonicalization for Generalizable Robotic Manipulation
by: Xu, Hongli, et al.
Published: (2025)

ClothHMR: 3D Mesh Recovery of Humans in Diverse Clothing from Single Image
by: Gao, Yunqi, et al.
Published: (2025)

Benchmarking the Sim-to-Real Gap in Cloth Manipulation
by: Blanco-Mulero, David, et al.
Published: (2023)

Efficient Robotic Policy Learning via Latent Space Backward Planning
by: Liu, Dongxiu, et al.
Published: (2025)

CL3R: 3D Reconstruction and Contrastive Learning for Enhanced Robotic Manipulation Representations
by: Cui, Wenbo, et al.
Published: (2025)

Scalable Vision-Language-Action Model Pretraining for Robotic Manipulation with Real-Life Human Activity Videos
by: Li, Qixiu, et al.
Published: (2025)

MolmoSpaces: A Large-Scale Open Ecosystem for Robot Navigation and Manipulation
by: Kim, Yejin, et al.
Published: (2026)

Demystifying Action Space Design for Robotic Manipulation Policies
by: Feng, Yuchun, et al.
Published: (2026)

Cloth-Splatting: 3D Cloth State Estimation from RGB Supervision
by: Longhini, Alberta, et al.
Published: (2025)

BOSS: Benchmark for Observation Space Shift in Long-Horizon Task
by: Yang, Yue, et al.
Published: (2025)

VideoVLA: Video Generators Can Be Generalizable Robot Manipulators
by: Shen, Yichao, et al.
Published: (2025)

Towards Long-horizon Embodied Agents with Tool-Aligned Vision-Language-Action Models
by: Lei, Zixing, et al.
Published: (2026)

Distracted Robot: How Visual Clutter Undermine Robotic Manipulation
by: Rasouli, Amir, et al.
Published: (2025)

Redundancy-aware Action Spaces for Robot Learning
by: Mazzaglia, Pietro, et al.
Published: (2024)

HomeRobot: Open-Vocabulary Mobile Manipulation
by: Yenamandra, Sriram, et al.
Published: (2023)

RoboEXP: Action-Conditioned Scene Graph via Interactive Exploration for Robotic Manipulation
by: Jiang, Hanxiao, et al.
Published: (2024)

CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation
by: Li, Qixiu, et al.
Published: (2024)

VOTE: Vision-Language-Action Optimization with Trajectory Ensemble Voting
by: Lin, Juyi, et al.
Published: (2025)

ManiSoft: Towards Vision-Language Manipulation for Soft Continuum Robotics
by: Wei, Ziyu, et al.
Published: (2026)

ReKep: Spatio-Temporal Reasoning of Relational Keypoint Constraints for Robotic Manipulation
by: Huang, Wenlong, et al.
Published: (2024)

Diffusion Dynamics Models with Generative State Estimation for Cloth Manipulation
by: Tian, Tongxuan, et al.
Published: (2025)

Robots Pre-train Robots: Manipulation-Centric Robotic Representation from Large-Scale Robot Datasets
by: Jiang, Guangqi, et al.
Published: (2024)

FlowHOI: Flow-based Semantics-Grounded Generation of Hand-Object Interactions for Dexterous Robot Manipulation
by: Zeng, Huajian, et al.
Published: (2026)

Render and Diffuse: Aligning Image and Action Spaces for Diffusion-based Behaviour Cloning
by: Vosylius, Vitalis, et al.
Published: (2024)

UAD: Unsupervised Affordance Distillation for Generalization in Robotic Manipulation
by: Tang, Yihe, et al.
Published: (2025)

DVGT-2: Vision-Geometry-Action Model for Autonomous Driving at Scale
by: Zuo, Sicheng, et al.
Published: (2026)

Recognizing Actions from Robotic View for Natural Human-Robot Interaction
by: Wang, Ziyi, et al.
Published: (2025)

Gondola: Grounded Vision Language Planning for Generalizable Robotic Manipulation
by: Chen, Shizhe, et al.
Published: (2025)

CHRIS: Clothed Human Reconstruction with Side View Consistency
by: Liu, Dong, et al.
Published: (2025)