Bibliographic Details
Main Authors: Wang, Yihao; Miao, Yang; Zhao, Wenshuai; Yang, Wenyan; Wang, Zihan; Pajarinen, Joni; Van Gool, Luc; Paudel, Danda Pani; Kannala, Juho; Wang, Xi; Solin, Arno
Format: Preprint
Published: 2026
Online Access: https://arxiv.org/abs/2603.25539
Table of Contents:
  • Articulation perception aims to recover the motion and structure of articulated objects (e.g., drawers and cupboards), and is fundamental to 3D scene understanding in robotics, simulation, and animation. Existing learning-based methods rely heavily on supervised training with high-quality 3D data and manual annotations, which limits their scalability and diversity. To address this limitation, we propose PAWS, a method that directly extracts object articulations from hand-object interactions in large-scale in-the-wild egocentric videos. We evaluate our method on public datasets, including HD-EPIC and Arti4D, achieving significant improvements over baselines. We further demonstrate that the extracted articulations benefit downstream tasks, including fine-tuning 3D articulation prediction models and enabling robot manipulation. See the project website at https://aaltoml.github.io/PAWS/.