| Main Authors: | Tsagkas, Nikolaos, Sochopoulos, Andreas, Danier, Duolikun, Lu, Chris Xiaoxuan, Mac Aodha, Oisin |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.03270 |
Similar Items
Attentive Feature Aggregation or: How Policies Learn to Stop Worrying about Robustness and Attend to Task-Relevant Visual Cues
by: Tsagkas, Nikolaos, et al.
Published: (2025)
Click to Grasp: Zero-Shot Precise Manipulation via Visual Diffusion Descriptors
by: Tsagkas, Nikolaos, et al.
Published: (2024)
DepthCues: Evaluating Monocular Depth Perception in Large Vision Models
by: Danier, Duolikun, et al.
Published: (2024)
Fast Flow-based Visuomotor Policies via Conditional Optimal Transport Couplings
by: Sochopoulos, Andreas, et al.
Published: (2025)
View-Consistent Diffusion Representations for 3D-Consistent Video Generation
by: Danier, Duolikun, et al.
Published: (2025)
Representational Similarity via Interpretable Visual Concepts
by: Kondapaneni, Neehar, et al.
Published: (2025)
Bridging the Sim2Real Gap: Vision Encoder Pre-Training for Visuomotor Policy Transfer
by: Yardi, Yash, et al.
Published: (2025)
ST-MFNet: A Spatio-Temporal Multi-Flow Network for Frame Interpolation
by: Danier, Duolikun, et al.
Published: (2021)
SAOR: Single-View Articulated Object Reconstruction
by: Aygün, Mehmet, et al.
Published: (2023)
Interpretable Text-Guided Image Clustering via Iterative Search
by: Zhao, Bingchen, et al.
Published: (2025)
DINOv3-Diffusion Policy: Self-Supervised Large Visual Model for Visuomotor Diffusion Policy Learning
by: Egbe, ThankGod, et al.
Published: (2025)
Representational Difference Explanations
by: Kondapaneni, Neehar, et al.
Published: (2025)
Efficient Training of Generalizable Visuomotor Policies via Control-Aware Augmentation
by: Zhao, Yinuo, et al.
Published: (2024)
Fast Visuomotor Policy for Robotic Manipulation
by: Jia, Jingkai, et al.
Published: (2025)
3D Diffusion Policy: Generalizable Visuomotor Policy Learning via Simple 3D Representations
by: Ze, Yanjie, et al.
Published: (2024)
Referring-Aware Visuomotor Policy Learning for Closed-Loop Manipulation
by: Ma, Jiahua, et al.
Published: (2026)
Learning Precise Affordances from Egocentric Videos for Robotic Manipulation
by: Li, Gen, et al.
Published: (2024)
H$^3$DP: Triply-Hierarchical Diffusion Policy for Visuomotor Learning
by: Lu, Yiyang, et al.
Published: (2025)
Dreamitate: Real-World Visuomotor Policy Learning via Video Generation
by: Liang, Junbang, et al.
Published: (2024)
CARP: Visuomotor Policy Learning via Coarse-to-Fine Autoregressive Prediction
by: Gong, Zhefei, et al.
Published: (2024)
Enhancing 2D Representation Learning with a 3D Prior
by: Aygün, Mehmet, et al.
Published: (2024)
BVI-VFI: A Video Quality Database for Video Frame Interpolation
by: Danier, Duolikun, et al.
Published: (2022)
Enhancing Deformable Convolution based Video Frame Interpolation with Coarse-to-fine 3D CNN
by: Danier, Duolikun, et al.
Published: (2022)
LDMVFI: Video Frame Interpolation with Latent Diffusion Models
by: Danier, Duolikun, et al.
Published: (2023)
A Subjective Quality Study for Video Frame Interpolation
by: Danier, Duolikun, et al.
Published: (2022)
FLASH: Efficient Visuomotor Policy via Sparse Sampling
by: Bai, Jiaqi, et al.
Published: (2026)
CDP: Towards Robust Autoregressive Visuomotor Policy Learning via Causal Diffusion
by: Ma, Jiahua, et al.
Published: (2025)
Improving Semantic Correspondence with Viewpoint-Guided Spherical Maps
by: Mariotti, Octave, et al.
Published: (2023)
Learning Predictive Visuomotor Coordination
by: Jia, Wenqi, et al.
Published: (2025)
FreqPolicy: Frequency Autoregressive Visuomotor Policy with Continuous Tokens
by: Zhong, Yiming, et al.
Published: (2025)
ViTaS: Visual Tactile Soft Fusion Contrastive Learning for Visuomotor Learning
by: Tian, Yufeng, et al.
Published: (2026)
WildSAT: Learning Satellite Image Representations from Wildlife Observations
by: Daroya, Rangel, et al.
Published: (2024)
Crossway Diffusion: Improving Diffusion-based Visuomotor Policy via Self-supervised Learning
by: Li, Xiang, et al.
Published: (2023)
Robust 3D Object Detection from LiDAR-Radar Point Clouds via Cross-Modal Feature Augmentation
by: Deng, Jianning, et al.
Published: (2023)
Video Prediction Policy: A Generalist Robot Policy with Predictive Visual Representations
by: Hu, Yucheng, et al.
Published: (2024)
MoonSeg3R: Monocular Online Zero-Shot Segment Anything in 3D with Reconstructive Foundation Priors
by: Du, Zhipeng, et al.
Published: (2025)
Adapting by Analogy: OOD Generalization of Visuomotor Policies via Functional Correspondence
by: Gupta, Pranay, et al.
Published: (2025)
Less is More: Discovering Concise Network Explanations
by: Kondapaneni, Neehar, et al.
Published: (2024)
Labeled Data Selection for Category Discovery
by: Zhao, Bingchen, et al.
Published: (2024)
VISC: mmWave Radar Scene Flow Estimation using Pervasive Visual-Inertial Supervision
by: Liu, Kezhong, et al.
Published: (2025)