:: Library Catalog

Bibliographic Details
Main Authors:	Tsagkas, Nikolaos; Sochopoulos, Andreas; Danier, Duolikun; Lu, Chris Xiaoxuan; Mac Aodha, Oisin
Format:	Preprint
Published:	2025
Subjects:	Robotics; Artificial Intelligence; Computer Vision and Pattern Recognition; Machine Learning
Online Access:	https://arxiv.org/abs/2502.03270

Similar Items

Attentive Feature Aggregation or: How Policies Learn to Stop Worrying about Robustness and Attend to Task-Relevant Visual Cues
by: Tsagkas, Nikolaos, et al.
Published: (2025)

Click to Grasp: Zero-Shot Precise Manipulation via Visual Diffusion Descriptors
by: Tsagkas, Nikolaos, et al.
Published: (2024)

DepthCues: Evaluating Monocular Depth Perception in Large Vision Models
by: Danier, Duolikun, et al.
Published: (2024)

Fast Flow-based Visuomotor Policies via Conditional Optimal Transport Couplings
by: Sochopoulos, Andreas, et al.
Published: (2025)

View-Consistent Diffusion Representations for 3D-Consistent Video Generation
by: Danier, Duolikun, et al.
Published: (2025)

Representational Similarity via Interpretable Visual Concepts
by: Kondapaneni, Neehar, et al.
Published: (2025)

Bridging the Sim2Real Gap: Vision Encoder Pre-Training for Visuomotor Policy Transfer
by: Yardi, Yash, et al.
Published: (2025)

ST-MFNet: A Spatio-Temporal Multi-Flow Network for Frame Interpolation
by: Danier, Duolikun, et al.
Published: (2021)

SAOR: Single-View Articulated Object Reconstruction
by: Aygün, Mehmet, et al.
Published: (2023)

Interpretable Text-Guided Image Clustering via Iterative Search
by: Zhao, Bingchen, et al.
Published: (2025)

DINOv3-Diffusion Policy: Self-Supervised Large Visual Model for Visuomotor Diffusion Policy Learning
by: Egbe, ThankGod, et al.
Published: (2025)

Representational Difference Explanations
by: Kondapaneni, Neehar, et al.
Published: (2025)

Efficient Training of Generalizable Visuomotor Policies via Control-Aware Augmentation
by: Zhao, Yinuo, et al.
Published: (2024)

Fast Visuomotor Policy for Robotic Manipulation
by: Jia, Jingkai, et al.
Published: (2025)

3D Diffusion Policy: Generalizable Visuomotor Policy Learning via Simple 3D Representations
by: Ze, Yanjie, et al.
Published: (2024)

Referring-Aware Visuomotor Policy Learning for Closed-Loop Manipulation
by: Ma, Jiahua, et al.
Published: (2026)

Learning Precise Affordances from Egocentric Videos for Robotic Manipulation
by: Li, Gen, et al.
Published: (2024)

H$^3$DP: Triply-Hierarchical Diffusion Policy for Visuomotor Learning
by: Lu, Yiyang, et al.
Published: (2025)

Dreamitate: Real-World Visuomotor Policy Learning via Video Generation
by: Liang, Junbang, et al.
Published: (2024)

CARP: Visuomotor Policy Learning via Coarse-to-Fine Autoregressive Prediction
by: Gong, Zhefei, et al.
Published: (2024)

Enhancing 2D Representation Learning with a 3D Prior
by: Aygün, Mehmet, et al.
Published: (2024)

BVI-VFI: A Video Quality Database for Video Frame Interpolation
by: Danier, Duolikun, et al.
Published: (2022)

Enhancing Deformable Convolution based Video Frame Interpolation with Coarse-to-fine 3D CNN
by: Danier, Duolikun, et al.
Published: (2022)

LDMVFI: Video Frame Interpolation with Latent Diffusion Models
by: Danier, Duolikun, et al.
Published: (2023)

A Subjective Quality Study for Video Frame Interpolation
by: Danier, Duolikun, et al.
Published: (2022)

FLASH: Efficient Visuomotor Policy via Sparse Sampling
by: Bai, Jiaqi, et al.
Published: (2026)

CDP: Towards Robust Autoregressive Visuomotor Policy Learning via Causal Diffusion
by: Ma, Jiahua, et al.
Published: (2025)

Improving Semantic Correspondence with Viewpoint-Guided Spherical Maps
by: Mariotti, Octave, et al.
Published: (2023)

Learning Predictive Visuomotor Coordination
by: Jia, Wenqi, et al.
Published: (2025)

FreqPolicy: Frequency Autoregressive Visuomotor Policy with Continuous Tokens
by: Zhong, Yiming, et al.
Published: (2025)

ViTaS: Visual Tactile Soft Fusion Contrastive Learning for Visuomotor Learning
by: Tian, Yufeng, et al.
Published: (2026)

WildSAT: Learning Satellite Image Representations from Wildlife Observations
by: Daroya, Rangel, et al.
Published: (2024)

Crossway Diffusion: Improving Diffusion-based Visuomotor Policy via Self-supervised Learning
by: Li, Xiang, et al.
Published: (2023)

Robust 3D Object Detection from LiDAR-Radar Point Clouds via Cross-Modal Feature Augmentation
by: Deng, Jianning, et al.
Published: (2023)

Video Prediction Policy: A Generalist Robot Policy with Predictive Visual Representations
by: Hu, Yucheng, et al.
Published: (2024)

MoonSeg3R: Monocular Online Zero-Shot Segment Anything in 3D with Reconstructive Foundation Priors
by: Du, Zhipeng, et al.
Published: (2025)

Adapting by Analogy: OOD Generalization of Visuomotor Policies via Functional Correspondence
by: Gupta, Pranay, et al.
Published: (2025)

Less is More: Discovering Concise Network Explanations
by: Kondapaneni, Neehar, et al.
Published: (2024)

Labeled Data Selection for Category Discovery
by: Zhao, Bingchen, et al.
Published: (2024)

VISC: mmWave Radar Scene Flow Estimation using Pervasive Visual-Inertial Supervision
by: Liu, Kezhong, et al.
Published: (2025)