Bibliographic Details
Main Authors:	Tsagkas, Nikolaos; Sochopoulos, Andreas; Danier, Duolikun; Lu, Chris Xiaoxuan; Mac Aodha, Oisin
Format:	Preprint
Published:	2025
Subjects:	Robotics; Artificial Intelligence; Computer Vision and Pattern Recognition; Machine Learning
Online Access:	https://arxiv.org/abs/2502.03270

Table of Contents:

The integration of pre-trained visual representations (PVRs) has significantly advanced visuomotor policy learning. However, effectively leveraging these models remains a challenge. We identify temporal entanglement as a critical, inherent issue when using these time-invariant models in sequential decision-making tasks. This entanglement arises because PVRs, optimised for static image understanding, struggle to represent the temporal dependencies crucial for visuomotor control. In this work, we quantify the impact of temporal entanglement, demonstrating a strong correlation between a policy's success rate and the ability of its latent space to capture task-progression cues. Based on these insights, we propose a simple, yet effective disentanglement baseline designed to mitigate temporal entanglement. Our empirical results show that traditional methods aimed at enriching features with temporal components are insufficient on their own, highlighting the necessity of explicitly addressing temporal disentanglement for robust visuomotor policy learning.

Similar Items