:: Library Catalog

Bibliographic Details
Main Authors:	Lin, Wenjun; Zhang, Jensen; Cai, Kaitong; Wang, Keze
Format:	Preprint
Published:	2025
Subjects:	Robotics; Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2512.18477

Similar Items

SirenPose: Dynamic Scene Reconstruction via Geometric Supervision
by: Cai, Kaitong, et al.
Published: (2025)

WristWorld: Generating Wrist-Views via 4D World Models for Robotic Manipulation
by: Qian, Zezhong, et al.
Published: (2025)

A Step Toward World Models: A Survey on Robotic Manipulation
by: Zhang, Peng-Fei, et al.
Published: (2025)

ABot-PhysWorld: Interactive World Foundation Model for Robotic Manipulation with Physics Alignment
by: Chen, Yuzhi, et al.
Published: (2026)

From Motion to Behavior: Hierarchical Modeling of Humanoid Generative Behavior Control
by: Zhang, Jusheng, et al.
Published: (2025)

FlashVLM: Text-Guided Visual Token Selection for Large Multimodal Models
by: Cai, Kaitong, et al.
Published: (2025)

Device-Conditioned Neural Architecture Search for Efficient Robotic Manipulation
by: Wu, Yiming, et al.
Published: (2026)

Manipulate-Anything: Automating Real-World Robots using Vision-Language Models
by: Duan, Jiafei, et al.
Published: (2024)

GEM-4D: Geometry-Enhanced Video World Models for Robot Manipulation
by: Zhou, Kaichen, et al.
Published: (2026)

Genie Envisioner: A Unified World Foundation Platform for Robotic Manipulation
by: Liao, Yue, et al.
Published: (2025)

Beyond Dense Futures: World Models as Structured Planners for Robotic Manipulation
by: Jin, Minghao, et al.
Published: (2026)

Robotic Programmer: Video Instructed Policy Code Generation for Robotic Manipulation
by: Xie, Senwei, et al.
Published: (2025)

Dream to Manipulate: Compositional World Models Empowering Robot Imitation Learning with Imagination
by: Barcellona, Leonardo, et al.
Published: (2024)

GAF: Gaussian Action Field as a 4D Representation for Dynamic World Modeling in Robotic Manipulation
by: Chai, Ying, et al.
Published: (2025)

Causal World Modeling for Robot Control
by: Li, Lin, et al.
Published: (2026)

MIND-V: Hierarchical World Model for Long-Horizon Robotic Manipulation with RL-based Physical Alignment
by: Zhang, Ruicheng, et al.
Published: (2025)

RoboTrustBench: Benchmarking the Trustworthiness of Video World Models for Robotic Manipulation
by: Li, Huiqiong, et al.
Published: (2026)

Occupancy World Model for Robots
by: Zhang, Zhang, et al.
Published: (2025)

FlowDreamer: A RGB-D World Model with Flow-based Motion Representations for Robot Manipulation
by: Guo, Jun, et al.
Published: (2025)

HAMSTER: Hierarchical Action Models For Open-World Robot Manipulation
by: Li, Yi, et al.
Published: (2025)

Object-Centric World Model for Language-Guided Manipulation
by: Jeong, Youngjoon, et al.
Published: (2025)

Physically Grounded Vision-Language Models for Robotic Manipulation
by: Gao, Jensen, et al.
Published: (2023)

PTTA: A Pure Text-to-Animation Framework for High-Quality Creation
by: Chen, Ruiqi, et al.
Published: (2025)

ReWorld: Multi-Dimensional Reward Modeling for Embodied World Models
by: Peng, Baorui, et al.
Published: (2026)

Ensuring Force Safety in Vision-Guided Robotic Manipulation via Implicit Tactile Calibration
by: Wei, Lai, et al.
Published: (2024)

PointWorld: Scaling 3D World Models for In-The-Wild Robotic Manipulation
by: Huang, Wenlong, et al.
Published: (2026)

PIVOT-R: Primitive-Driven Waypoint-Aware World Model for Robotic Manipulation
by: Zhang, Kaidong, et al.
Published: (2024)

CoAgent: Collaborative Planning and Consistency Agent for Coherent Video Generation
by: Zeng, Qinglin, et al.
Published: (2025)

Spatial Policy: Guiding Visuomotor Robotic Manipulation with Spatial-Aware Modeling and Reasoning
by: Liu, Yijun, et al.
Published: (2025)

Language-Guided Grasp Detection with Coarse-to-Fine Learning for Robotic Manipulation
by: Jiang, Zebin, et al.
Published: (2025)

Improving Generalization of Language-Conditioned Robot Manipulation
by: Cui, Chenglin, et al.
Published: (2025)

IRASim: A Fine-Grained World Model for Robot Manipulation
by: Zhu, Fangqi, et al.
Published: (2024)

Mitigating the Human-Robot Domain Discrepancy in Visual Pre-training for Robotic Manipulation
by: Zhou, Jiaming, et al.
Published: (2024)

MemoryVLA: Perceptual-Cognitive Memory in Vision-Language-Action Models for Robotic Manipulation
by: Shi, Hao, et al.
Published: (2025)

RoboPearls: Editable Video Simulation for Robot Manipulation
by: Tang, Tao, et al.
Published: (2025)

TAG: Target-Agnostic Guidance for Stable Object-Centric Inference in Vision-Language-Action Models
by: Zhou, Jiaying, et al.
Published: (2026)

Observe Then Act: Asynchronous Active Vision-Action Model for Robotic Manipulation
by: Wang, Guokang, et al.
Published: (2024)

Play to the Score: Stage-Guided Dynamic Multi-Sensory Fusion for Robotic Manipulation
by: Feng, Ruoxuan, et al.
Published: (2024)

Image Generation as a Visual Planner for Robotic Manipulation
by: Pang, Ye
Published: (2025)

Surfer: Progressive Reasoning with World Models for Robotic Manipulation
by: Ren, Pengzhen, et al.
Published: (2023)