:: Library Catalog

Bibliographic Details
Main Authors:	Jiang, Hanwen; Jiang, Zhenyu; Grauman, Kristen; Zhu, Yuke
Format:	Preprint
Published:	2022
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2212.04492

Similar Items

Audio-Visual Camera Pose Estimation with Passive Scene Sounds and In-the-Wild Video
by: Adebi, Daniel, et al.
Published: (2025)

Learning Object State Changes in Videos: An Open-World Perspective
by: Xue, Zihui, et al.
Published: (2023)

HieraMamba: Video Temporal Grounding via Hierarchical Anchor-Mamba Pooling
by: An, Joungbin, et al.
Published: (2025)

HOI-Swap: Swapping Objects in Videos with Hand-Object Interaction Awareness
by: Xue, Zihui, et al.
Published: (2024)

Switch-a-View: View Selection Learned from Unlabeled In-the-wild Videos
by: Majumder, Sagnik, et al.
Published: (2024)

Generic Objects as Pose Probes for Few-shot View Synthesis
by: Gao, Zhirui, et al.
Published: (2024)

Seeing without Pixels: Perception from Camera Trajectories
by: Xue, Zihui, et al.
Published: (2025)

ExpertEdit: Learning Skill-Aware Motion Editing from Expert Videos
by: Somayazulu, Arjun, et al.
Published: (2026)

Learning Skill-Attributes for Transferable Assessment in Video
by: Ashutosh, Kumar, et al.
Published: (2025)

ViewBridge: Curriculum Knowledge Distillation for Activity View-Invariance Under Extreme Viewpoint Changes
by: Somayazulu, Arjun, et al.
Published: (2025)

UniversalVTG: A Universal and Lightweight Foundation Model for Video Temporal Grounding
by: An, Joungbin, et al.
Published: (2026)

MZEN: Multi-Zoom Enhanced NeRF for 3-D Reconstruction with Unknown Camera Poses
by: Park, Jong-Ik, et al.
Published: (2025)

SPOC: Spatially-Progressing Object State Change Segmentation in Video
by: Mandikal, Priyanka, et al.
Published: (2025)

Category-level Object Detection, Pose Estimation and Reconstruction from Stereo Images
by: Zhang, Chuanrui, et al.
Published: (2024)

FIction: 4D Future Interaction Prediction from Video
by: Ashutosh, Kumar, et al.
Published: (2024)

Seeing the Arrow of Time in Large Multimodal Models
by: Xue, Zihui, et al.
Published: (2025)

Don't Let the Video Speak: Audio-Contrastive Preference Optimization for Audio-Visual Language Models
by: Baid, Ami, et al.
Published: (2026)

Which Viewpoint Shows it Best? Language for Weakly Supervising View Selection in Multi-view Instructional Videos
by: Majumder, Sagnik, et al.
Published: (2024)

Dense Dynamic Scene Reconstruction and Camera Pose Estimation from Multi-View Videos
by: Sun, Shuo, et al.
Published: (2026)

A Construct-Optimize Approach to Sparse View Synthesis without Camera Pose
by: Jiang, Kaiwen, et al.
Published: (2024)

Beyond 'Templates': Category-Agnostic Object Pose, Size, and Shape Estimation from a Single View
by: Zhang, Jinyu, et al.
Published: (2025)

SkillSight: Efficient First-Person Skill Assessment with Gaze
by: Wu, Chi Hsuan, et al.
Published: (2025)

EgoExo-WM: Unlocking Exo Video for Ego World Models
by: Tran, Danny, et al.
Published: (2026)

Progress-Aware Video Frame Captioning
by: Xue, Zihui, et al.
Published: (2024)

Stitch-a-Demo: Video Demonstrations from Multistep Descriptions
by: Wu, Chi Hsuan, et al.
Published: (2025)

SportSkills: Physical Skill Learning from Sports Instructional Videos
by: Ashutosh, Kumar, et al.
Published: (2026)

CleanPose: Category-Level Object Pose Estimation via Causal Learning and Knowledge Distillation
by: Lin, Xiao, et al.
Published: (2025)

Learning a Category-level Object Pose Estimator without Pose Annotations
by: Tian, Fengrui, et al.
Published: (2024)

GCE-Pose: Global Context Enhancement for Category-level Object Pose Estimation
by: Li, Weihang, et al.
Published: (2025)

Learning Unknowns from Unknowns: Diversified Negative Prototypes Generator for Few-Shot Open-Set Recognition
by: Zhang, Zhenyu, et al.
Published: (2024)

Free-Moving Object Reconstruction and Pose Estimation with Virtual Camera
by: Shi, Haixin, et al.
Published: (2024)

Put Myself in Your Shoes: Lifting the Egocentric Perspective from Exocentric Videos
by: Luo, Mi, et al.
Published: (2024)

Detours for Navigating Instructional Videos
by: Ashutosh, Kumar, et al.
Published: (2024)

Indoor 3D Reconstruction with an Unknown Camera-Projector Pair
by: Qi, Zhaoshuai, et al.
Published: (2024)

Real3D: Scaling Up Large Reconstruction Models with Real-World Images
by: Jiang, Hanwen, et al.
Published: (2024)

Marginalized Bundle Adjustment: Multi-View Camera Pose from Monocular Depth Estimates
by: Zhu, Shengjie, et al.
Published: (2026)

CLIPose: Category-Level Object Pose Estimation with Pre-trained Vision-Language Knowledge
by: Lin, Xiao, et al.
Published: (2024)

Mash, Spread, Slice! Learning to Manipulate Object States via Visual Spatial Progress
by: Mandikal, Priyanka, et al.
Published: (2025)

TSM-Pose: Topology-Aware Learning with Semantic Mamba for Category-Level Object Pose Estimation
by: Liu, Jinshuo, et al.
Published: (2026)

Exploring Category-level Articulated Object Pose Tracking on SE(3) Manifolds
by: Meng, Xianhui, et al.
Published: (2025)