Bibliographic Details
Main Authors: Sheikholeslami, Sahara; Bölöni, Ladislau
Format: Preprint
Published: 2025
Subjects: Robotics; Computer Vision and Pattern Recognition
Online Access: https://arxiv.org/abs/2504.14634
author Sheikholeslami, Sahara
Bölöni, Ladislau
contents Robotic manipulation requires explicit or implicit knowledge of the robot's joint positions. Precise proprioception is standard in high-quality industrial robots but is often unavailable in inexpensive robots operating in unstructured environments. In this paper, we ask: to what extent can a fast, single-pass regression architecture perform visual proprioception from a single external camera image, available even in the simplest manipulation settings? We explore several latent representations, including CNNs, VAEs, ViTs, and bags of uncalibrated fiducial markers, using fine-tuning techniques adapted to the limited data available. We evaluate the achievable accuracy through experiments on an inexpensive 6-DoF robot.
format Preprint
id arxiv_https___arxiv_org_abs_2504_14634
institution arXiv
publishDate 2025
record_format arxiv
title Latent Representations for Visual Proprioception in Inexpensive Robots
topic Robotics
Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2504.14634