Library Catalog

Bibliographic Details
Main Authors:	Bai, Yu; Yu, MingMing; Li, Chaojie; Bai, Ziyi; Wang, Xinlong; Karlsson, Börje F.
Format:	Preprint
Published:	2026
Subjects:	Robotics; Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2602.04515

Similar Items

ExoActor: Exocentric Video Generation as Generalizable Interactive Humanoid Control
by: Zhou, Yanghao, et al.
Published: (2026)

Being-0: A Humanoid Robotic Agent with Vision-Language Models and Modular Skills
by: Yuan, Haoqi, et al.
Published: (2025)

RANGER: A Monocular Zero-Shot Semantic Navigation Framework through Visual Contextual Adaptation
by: Yu, Ming-Ming, et al.
Published: (2025)

Towards Proprioception-Aware Embodied Planning for Dual-Arm Humanoid Robots
by: Li, Boyu, et al.
Published: (2025)

EgoHumanoid: Unlocking In-the-Wild Loco-Manipulation with Robot-Free Egocentric Demonstration
by: Shi, Modi, et al.
Published: (2026)

VEGA: Visual Encoder Grounding Alignment for Spatially-Aware Vision-Language-Action Models
by: Wang, Hao, et al.
Published: (2026)

DualTHOR: A Dual-Arm Humanoid Simulation Platform for Contingency-Aware Planning
by: Li, Boyu, et al.
Published: (2025)

ARMOR: Egocentric Perception for Humanoid Robot Collision Avoidance and Motion Planning
by: Kim, Daehwa, et al.
Published: (2024)

SWITCH: Benchmarking Modeling and Handling of Tangible Interfaces in Long-horizon Embodied Scenarios
by: Lin, Jieru, et al.
Published: (2025)

X-DiffVLA: X-Embodied Diffusion Action Heads for Vision-Language-Action Models
by: Li, Boyu, et al.
Published: (2026)

Learning Soccer Skills for Humanoid Robots: A Progressive Perception-Action Framework
by: Kong, Jipeng, et al.
Published: (2026)

Ego-Grounding for Personalized Question-Answering in Egocentric Videos
by: Xiao, Junbin, et al.
Published: (2026)

EgoLive: A Large-Scale Egocentric Dataset from Real-World Human Tasks
by: Li, Yihang, et al.
Published: (2026)

LLM-GROP: Visually Grounded Robot Task and Motion Planning with Large Language Models
by: Zhang, Xiaohan, et al.
Published: (2025)

EgoVLA: Learning Vision-Language-Action Models from Egocentric Human Videos
by: Yang, Ruihan, et al.
Published: (2025)

Ego-Vision World Model for Humanoid Contact Planning
by: Liu, Hang, et al.
Published: (2025)

EgoEvGesture: Gesture Recognition Based on Egocentric Event Camera
by: Wang, Luming, et al.
Published: (2025)

EgoDemoGen: Egocentric Demonstration Generation for Viewpoint Generalization in Robotic Manipulation
by: Xu, Yuan, et al.
Published: (2025)

EgoPush: Learning End-to-End Egocentric Multi-Object Rearrangement for Mobile Robots
by: An, Boyuan, et al.
Published: (2026)

Visual-Language-Guided Task Planning for Horticultural Robots
by: Cuaran, Jose, et al.
Published: (2026)

iWalker: Imperative Visual Planning for Walking Humanoid Robot
by: Lin, Xiao, et al.
Published: (2024)

RobotDancing: Residual-Action Reinforcement Learning Enables Robust Long-Horizon Humanoid Motion Tracking
by: Sun, Zhenguo, et al.
Published: (2025)

EgoPAT3Dv2: Predicting 3D Action Target from 2D Egocentric Vision for Human-Robot Interaction
by: Fang, Irving, et al.
Published: (2024)

PANav: Toward Privacy-Aware Robot Navigation via Vision-Language Models
by: Yu, Bangguo, et al.
Published: (2024)

EgoAVFlow: Robot Policy Learning with Active Vision from Human Egocentric Videos via 3D Flow
by: Cho, Daesol, et al.
Published: (2026)

IndEgo: A Dataset of Industrial Scenarios and Collaborative Work for Egocentric Assistants
by: Chavan, Vivek, et al.
Published: (2025)

EgoVerse: An Egocentric Human Dataset for Robot Learning from Around the World
by: Punamiya, Ryan, et al.
Published: (2026)

OpenEgo: A Large-Scale Multimodal Egocentric Dataset for Dexterous Manipulation
by: Jawaid, Ahad, et al.
Published: (2025)

Task and Motion Planning for Humanoid Loco-manipulation
by: Ciebielski, Michal, et al.
Published: (2025)

EgoMimic: Scaling Imitation Learning via Egocentric Video
by: Kareer, Simar, et al.
Published: (2024)

EgoMI: Learning Active Vision and Whole-Body Manipulation from Egocentric Human Demonstrations
by: Yu, Justin, et al.
Published: (2025)

AssemLM: Spatial Reasoning Multimodal Large Language Models for Robotic Assembly
by: Jing, Zhi, et al.
Published: (2026)

Ego to World: Collaborative Spatial Reasoning in Embodied Systems via Reinforcement Learning
by: Zhou, Heng, et al.
Published: (2026)

Spatially Grounded Long-Horizon Task Planning in the Wild
by: Jung, Sehun, et al.
Published: (2026)

Afford-VLA: Action-Aligned Visual Planning via Internalized Affordance
by: Wang, Runze, et al.
Published: (2026)

HumanoidGen: Data Generation for Bimanual Dexterous Manipulation via LLM Reasoning
by: Jing, Zhi, et al.
Published: (2025)

EgoScale: Scaling Dexterous Manipulation with Diverse Egocentric Human Data
by: Zheng, Ruijie, et al.
Published: (2026)

PLanAR: Planning-Language-Grounded Agentic Reasoning for Robot Manipulation
by: Guo, Pengyuan, et al.
Published: (2026)

Language-Grounded Decoupled Action Representation for Robotic Manipulation
by: Weng, Wuding, et al.
Published: (2026)

From Experts to a Generalist: Toward General Whole-Body Control for Humanoid Robots
by: Wang, Yuxuan, et al.
Published: (2025)