| Main Authors: | Sheikholeslami, Sahara, Bölöni, Ladislau |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2504.14634 |
Similar Items
Think Proprioceptively: Embodied Visual Reasoning for VLA Manipulation
by: Wang, Fangyuan, et al.
Published: (2026)
Scaling Proprioceptive-Visual Learning with Heterogeneous Pre-trained Transformers
by: Wang, Lirui, et al.
Published: (2024)
CARE: Multi-Task Pretraining for Latent Continuous Action Representation in Robot Control
by: Shi, Jiaqi, et al.
Published: (2026)
Control-oriented Clustering of Visual Latent Representation
by: Qi, Han, et al.
Published: (2024)
Bootstrap Dynamic-Aware 3D Visual Representation for Scalable Robot Learning
by: Liang, Qiwei, et al.
Published: (2025)
Video Prediction Policy: A Generalist Robot Policy with Predictive Visual Representations
by: Hu, Yucheng, et al.
Published: (2024)
ReViP: Mitigating False Completion in Vision-Language-Action Models with Vision-Proprioception Rebalance
by: Li, Zhuohao, et al.
Published: (2026)
LatentBKI: Open-Dictionary Continuous Mapping in Visual-Language Latent Spaces with Quantifiable Uncertainty
by: Wilson, Joey, et al.
Published: (2024)
Physically Grounded 3D Generative Reconstruction under Hand Occlusion using Proprioception and Multi-Contact Touch
by: Caddeo, Gabriele Mario, et al.
Published: (2026)
RoboUniView: Visual-Language Model with Unified View Representation for Robotic Manipulation
by: Liu, Fanfan, et al.
Published: (2024)
Learning Object Properties Using Robot Proprioception via Differentiable Robot-Object Interaction
by: Chen, Peter Yichen, et al.
Published: (2024)
What Matters to You? Towards Visual Representation Alignment for Robot Learning
by: Tian, Ran, et al.
Published: (2023)
Pixel Motion as Universal Representation for Robot Control
by: Ranasinghe, Kanchana, et al.
Published: (2025)
Visual Semantic Navigation with Real Robots
by: Gutiérrez-Álvarez, Carlos, et al.
Published: (2023)
MateRobot: Material Recognition in Wearable Robotics for People with Visual Impairments
by: Zheng, Junwei, et al.
Published: (2023)
From Imagined Futures to Executable Actions: Mixture of Latent Actions for Robot Manipulation
by: Li, Yajie, et al.
Published: (2026)
Disentangled Object-Centric Image Representation for Robotic Manipulation
by: Emukpere, David, et al.
Published: (2025)
Mitigating the Human-Robot Domain Discrepancy in Visual Pre-training for Robotic Manipulation
by: Zhou, Jiaming, et al.
Published: (2024)
LARY: A Latent Action Representation Yielding Benchmark for Generalizable Vision-to-Action Alignment
by: Nie, Dujun, et al.
Published: (2026)
DeFM: Learning Foundation Representations from Depth for Robotics
by: Patel, Manthan, et al.
Published: (2026)
Image Generation as a Visual Planner for Robotic Manipulation
by: Pang, Ye
Published: (2025)
CoMo: Learning Continuous Latent Motion from Internet Videos for Scalable Robot Learning
by: Yang, Jiange, et al.
Published: (2025)
LaST-R1: Reinforcing Robotic Manipulation via Adaptive Physical Latent Reasoning
by: Chen, Hao, et al.
Published: (2026)
How Robot Dogs See the Unseeable: Improving Visual Interpretability via Peering for Exploratory Robots
by: Bimber, Oliver, et al.
Published: (2025)
SlotVLA: Towards Modeling of Object-Relation Representations in Robotic Manipulation
by: Hanyu, Taisei, et al.
Published: (2025)
SpatialActor: Exploring Disentangled Spatial Representations for Robust Robotic Manipulation
by: Shi, Hao, et al.
Published: (2025)
Towards Fusing Point Cloud and Visual Representations for Imitation Learning
by: Donat, Atalay, et al.
Published: (2025)
Simultaneous Tactile-Visual Perception for Learning Multimodal Robot Manipulation
by: Li, Yuyang, et al.
Published: (2025)
Bio-Inspired Event-Based Visual Servoing for Ground Robots
by: Mordad, Maral, et al.
Published: (2026)
Object-Centric Action-Enhanced Representations for Robot Visuo-Motor Policy Learning
by: Giannakakis, Nikos, et al.
Published: (2025)
ConViTac: Aligning Visual-Tactile Fusion with Contrastive Representations
by: Wu, Zhiyuan, et al.
Published: (2025)
Visual Anomaly Detection for Reliable Robotic Implantation of Flexible Microelectrode Array
by: Chen, Yitong, et al.
Published: (2025)
Adversarial Attacks and Detection in Visual Place Recognition for Safer Robot Navigation
by: Malone, Connor, et al.
Published: (2025)
SEBVS: Synthetic Event-based Visual Servoing for Robot Navigation and Manipulation
by: Vinod, Krishna, et al.
Published: (2025)
SEMNAV: Enhancing Visual Semantic Navigation in Robotics through Semantic Segmentation
by: Flor-Rodríguez, Rafael, et al.
Published: (2025)
Robotic Visual Instruction
by: Li, Yanbang, et al.
Published: (2025)
ReVLA: Reverting Visual Domain Limitation of Robotic Foundation Models
by: Dey, Sombit, et al.
Published: (2024)
What Is The Best 3D Scene Representation for Robotics? From Geometric to Foundation Models
by: Deng, Tianchen, et al.
Published: (2025)
Preference-Driven Active 3D Scene Representation for Robotic Inspection in Nuclear Decommissioning
by: Meng, Zhen, et al.
Published: (2025)
StaMo: Unsupervised Learning of Generalizable Robot Motion from Compact State Representation
by: Liu, Mingyu, et al.
Published: (2025)