| Main Authors: | Lin, Leo, Patel, Shivansh, Moon, Jay, Lazebnik, Svetlana, Jain, Unnat |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2603.12120 |
Similar Items
Robotic Manipulation by Imitating Generated Videos Without Physical Demonstrations
by: Patel, Shivansh, et al.
Published: (2025)
A Real-to-Sim-to-Real Approach to Robotic Manipulation with VLM-Generated Iterative Keypoint Rewards
by: Patel, Shivansh, et al.
Published: (2025)
ViPRA: Video Prediction for Robot Actions
by: Routray, Sandeep, et al.
Published: (2025)
CRAFT: Video Diffusion for Bimanual Robot Data Generation
by: Chen, Jason, et al.
Published: (2026)
Hand-Object Interaction Pretraining from Videos
by: Singh, Himanshu Gaurav, et al.
Published: (2024)
Bimanual Grasp Synthesis for Dexterous Robot Hands
by: Shao, Yanming, et al.
Published: (2024)
Rule-VLN: Bridging Perception and Compliance via Semantic Reasoning and Geometric Rectification
by: Wen, Jiawen, et al.
Published: (2026)
ManiSoft: Towards Vision-Language Manipulation for Soft Continuum Robotics
by: Wei, Ziyu, et al.
Published: (2026)
MOPA: Modular Object Navigation with PointGoal Agents
by: Raychaudhuri, Sonia, et al.
Published: (2023)
Twisting Lids Off with Two Hands
by: Lin, Toru, et al.
Published: (2024)
OPENTOUCH: Bringing Full-Hand Touch to Real-World Interaction
by: Song, Yuxin Ray, et al.
Published: (2025)
RealDex: Towards Human-like Grasping for Robotic Dexterous Hand
by: Liu, Yumeng, et al.
Published: (2024)
Learning Visuotactile Skills with Two Multifingered Hands
by: Lin, Toru, et al.
Published: (2024)
World Models for Learning Dexterous Hand-Object Interactions from Human Videos
by: Goswami, Raktim Gautam, et al.
Published: (2025)
PhysHanDI: Physics-Based Reconstruction of Hand-Deformable Object Interactions
by: Lee, Jihyun, et al.
Published: (2026)
Conditioning Latent-Space Clusters for Real-World Anomaly Classification
by: Bogdoll, Daniel, et al.
Published: (2023)
Gaze-Guided 3D Hand Motion Prediction for Detecting Intent in Egocentric Grasping Tasks
by: He, Yufei, et al.
Published: (2025)
HandDGP: Camera-Space Hand Mesh Prediction with Differentiable Global Positioning
by: Valassakis, Eugene, et al.
Published: (2024)
Object-conditioned Bag of Instances for Few-Shot Personalized Instance Recognition
by: Michieli, Umberto, et al.
Published: (2024)
ANAVI: Audio Noise Awareness using Visuals of Indoor environments for NAVIgation
by: Jain, Vidhi, et al.
Published: (2024)
SynHLMA: Synthesizing Hand Language Manipulation for Articulated Object with Discrete Human Object Interaction Representation
by: zhi, Wang, et al.
Published: (2025)
FlowHOI: Flow-based Semantics-Grounded Generation of Hand-Object Interactions for Dexterous Robot Manipulation
by: Zeng, Huajian, et al.
Published: (2026)
HOT3D: Hand and Object Tracking in 3D from Egocentric Multi-View Videos
by: Banerjee, Prithviraj, et al.
Published: (2024)
World-Ego Modeling for Long-Horizon Evolution in Hybrid Embodied Tasks
by: Lin, Zuyao, et al.
Published: (2026)
Flowing from Reasoning to Motion: Learning 3D Hand Trajectory Prediction from Egocentric Human Interaction Videos
by: Chen, Mingfei, et al.
Published: (2025)
ViTaS: Visual Tactile Soft Fusion Contrastive Learning for Visuomotor Learning
by: Tian, Yufeng, et al.
Published: (2026)
VISOR: VIsual Spatial Object Reasoning for Language-driven Object Navigation
by: Taioli, Francesco, et al.
Published: (2026)
Swiss DINO: Efficient and Versatile Vision Framework for On-device Personal Object Search
by: Paramonov, Kirill, et al.
Published: (2024)
X-VLA: Soft-Prompted Transformer as Scalable Cross-Embodiment Vision-Language-Action Model
by: Zheng, Jinliang, et al.
Published: (2025)
Autonomous Robot for Disaster Mapping and Victim Localization
by: Potter, Michael, et al.
Published: (2024)
What Matters to Enhance Traffic Rule Compliance of Imitation Learning for End-to-End Autonomous Driving
by: Zhou, Hongkuan, et al.
Published: (2023)
3D Hand Pose Estimation in Everyday Egocentric Images
by: Prakash, Aditya, et al.
Published: (2023)
Prune-Then-Plan: Step-Level Calibration for Stable Frontier Exploration in Embodied Question Answering
by: Frahm, Noah, et al.
Published: (2025)
A Vision-Enabled Prosthetic Hand for Children with Upper Limb Disabilities
by: Sarker, Md Abdul Baset, et al.
Published: (2025)
Tether: Autonomous Functional Play with Correspondence-Driven Trajectory Warping
by: Liang, William, et al.
Published: (2026)
Environment-Driven Online LiDAR-Camera Extrinsic Calibration
by: Huang, Zhiwei, et al.
Published: (2025)
Runtime Safety Monitoring of Deep Neural Networks for Perception: A Survey
by: Schotschneider, Albert, et al.
Published: (2025)
A Unified Perception-Language-Action Framework for Adaptive Autonomous Driving
by: Zhang, Yi, et al.
Published: (2025)
Unifying 2D and 3D Vision-Language Understanding
by: Jain, Ayush, et al.
Published: (2025)
Self-driving cars: Are we there yet?
by: Atasever, Merve, et al.
Published: (2025)