:: Library Catalog

Bibliographic Details
Main Authors:	Chu, Wen-Hsuan; Harley, Adam W.; Tokmakov, Pavel; Dave, Achal; Guibas, Leonidas; Fragkiadaki, Katerina
Format:	Preprint
Published:	2023
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2310.06992

Similar Items

AllTracker: Efficient Dense Point Tracking at High Resolution
by: Harley, Adam W., et al.
Published: (2025)

GRIN: Zero-Shot Metric Depth with Pixel-Level Diffusion
by: Guizilini, Vitor, et al.
Published: (2024)

Refining Pre-Trained Motion Models
by: Sun, Xinglong, et al.
Published: (2024)

Generative 4D Scene Gaussian Splatting with Object View-Synthesis Priors
by: Chu, Wen-Hsuan, et al.
Published: (2025)

Zero-Shot Image Feature Consensus with Deep Functional Maps
by: Cheng, Xinle, et al.
Published: (2024)

TAPIP3D: Tracking Any Point in Persistent 3D Geometry
by: Zhang, Bowei, et al.
Published: (2025)

DreamScene4D: Dynamic Multi-Object Scene Generation from Monocular Videos
by: Chu, Wen-Hsuan, et al.
Published: (2024)

Support-Set Context Matters for Bongard Problems
by: Raghuraman, Nikhil, et al.
Published: (2023)

View-Consistent Hierarchical 3D Segmentation Using Ultrametric Feature Fields
by: He, Haodi, et al.
Published: (2024)

Animal Pose Labeling Using General-Purpose Point Trackers
by: Pan, Zhuoyang, et al.
Published: (2025)

LookOut: Real-World Humanoid Egocentric Navigation
by: Pan, Boxiao, et al.
Published: (2025)

MoSca: Dynamic Gaussian Fusion from Casual Videos via 4D Motion Scaffolds
by: Lei, Jiahui, et al.
Published: (2024)

pix2gestalt: Amodal Segmentation by Synthesizing Wholes
by: Ozguroglu, Ege, et al.
Published: (2024)

Understanding Video Transformers via Universal Concept Discovery
by: Kowal, Matthew, et al.
Published: (2024)

Dreamitate: Real-World Visuomotor Policy Learning via Video Generation
by: Liang, Junbang, et al.
Published: (2024)

Energy-based Models are Zero-Shot Planners for Compositional Scene Rearrangement
by: Gkanatsios, Nikolaos, et al.
Published: (2023)

Understanding Complexity in VideoQA via Visual Program Generation
by: Eyzaguirre, Cristobal, et al.
Published: (2025)

Dynamic Gaussian Marbles for Novel View Synthesis of Casual Monocular Videos
by: Stearns, Colton, et al.
Published: (2024)

Diffusion Self-Distillation for Zero-Shot Customized Image Generation
by: Cai, Shengqu, et al.
Published: (2024)

Monocular Dynamic Gaussian Splatting: Fast, Brittle, and Scene Complexity Rules
by: Liang, Yiqing, et al.
Published: (2024)

Zero-Shot Open-Vocabulary Human Motion Grounding with Test-Time Training
by: Zhou, Yunjiao, et al.
Published: (2025)

PACE: A Large-Scale Dataset with Pose Annotations in Cluttered Environments
by: You, Yang, et al.
Published: (2023)

Synthetic Captions for Open-Vocabulary Zero-Shot Segmentation
by: Lebailly, Tim, et al.
Published: (2025)

ODIN: A Single Model for 2D and 3D Segmentation
by: Jain, Ayush, et al.
Published: (2024)

RAM: Retrieval-Based Affordance Transfer for Generalizable Zero-Shot Robotic Manipulation
by: Kuang, Yuxuan, et al.
Published: (2024)

Dex4D: Task-Agnostic Point Track Policy for Sim-to-Real Dexterous Manipulation
by: Kuang, Yuxuan, et al.
Published: (2026)

BlenderAlchemy: Editing 3D Graphics with Vision-Language Models
by: Huang, Ian, et al.
Published: (2024)

Generative Camera Dolly: Extreme Monocular Dynamic Novel View Synthesis
by: Van Hoorick, Basile, et al.
Published: (2024)

A Simple Framework for Open-Vocabulary Zero-Shot Segmentation
by: Stegmüller, Thomas, et al.
Published: (2024)

Exploring Vision-Language Models for Open-Vocabulary Zero-Shot Action Segmentation
by: Unmesh, Asim, et al.
Published: (2026)

Zero-Shot Video Semantic Segmentation based on Pre-Trained Diffusion Models
by: Wang, Qian, et al.
Published: (2024)

Open-Vocabulary Panoptic Segmentation Using BERT Pre-Training of Vision-Language Multiway Transformer Model
by: Chen, Yi-Chia, et al.
Published: (2024)

Asymmetric Flow Models
by: Chen, Hansheng, et al.
Published: (2026)

RADSeg: Unleashing Parameter and Compute Efficient Zero-Shot Open-Vocabulary Segmentation Using Agglomerative Models
by: Alama, Omar, et al.
Published: (2025)

Aligning Text-to-Image Diffusion Models with Reward Backpropagation
by: Prabhudesai, Mihir, et al.
Published: (2023)

Audio-Visual Generalized Zero-Shot Learning using Pre-Trained Large Multi-Modal Models
by: Kurzendörfer, David, et al.
Published: (2024)

RSVG-ZeroOV: Exploring a Training-Free Framework for Zero-Shot Open-Vocabulary Visual Grounding in Remote Sensing Images
by: Li, Ke, et al.
Published: (2025)

OCH3R: Object-Centric Holistic 3D Reconstruction
by: Du, Yi, et al.
Published: (2026)

InfoGaussian: Structure-Aware Dynamic Gaussians through Lightweight Information Shaping
by: Zhang, Yunchao, et al.
Published: (2024)

SparseDFF: Sparse-View Feature Distillation for One-Shot Dexterous Manipulation
by: Wang, Qianxu, et al.
Published: (2023)