:: Library Catalog

Bibliographic Details
Main Authors:	Litman, Yehonathan; Liu, Shikun; Seyb, Dario; Milef, Nicholas; Zhou, Yang; Marshall, Carl; Tulsiani, Shubham; Leak, Caleb
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2602.15031

Similar Items

LightSwitch: Multi-view Relighting with Material-guided Diffusion
by: Litman, Yehonathan, et al.
Published: (2025)

MaterialFusion: Enhancing Inverse Rendering with Material Diffusion Priors
by: Litman, Yehonathan, et al.
Published: (2024)

Towards Unstructured Unlabeled Optical Mocap: A Video Helps!
by: Milef, Nicholas, et al.
Published: (2024)

Real‐Time Neural Materials on Mobile VR
by: Xu, Zilin, et al.
Published: (2026)

Sparse-view Pose Estimation and Reconstruction via Analysis by Generative Synthesis
by: Zhao, Qitao, et al.
Published: (2024)

SceneFactor: Factored Latent 3D Diffusion for Controllable 3D Scene Generation
by: Bokhovkin, Alexey, et al.
Published: (2024)

REST3D: Reconstructing Physically Stable 3D Scenes from a Single Image
by: Ma, Xiaoxuan, et al.
Published: (2026)

DriveCtrl: Conditioned Sim-to-Real Driving Video Generation
by: Zhao, Haonan, et al.
Published: (2026)

Dex4D: Task-Agnostic Point Track Policy for Sim-to-Real Dexterous Manipulation
by: Kuang, Yuxuan, et al.
Published: (2026)

Track2Act: Predicting Point Tracks from Internet Videos enables Generalizable Robot Manipulation
by: Bharadhwaj, Homanga, et al.
Published: (2024)

EgoEdit: Dataset, Real-Time Streaming Model, and Benchmark for Egocentric Video Editing
by: Li, Runjia, et al.
Published: (2025)

CameraCtrl: Enabling Camera Control for Text-to-Video Generation
by: He, Hao, et al.
Published: (2024)

PhysCtrl: Generative Physics for Controllable and Physics-Grounded Video Generation
by: Wang, Chen, et al.
Published: (2025)

DemoDiffusion: One-Shot Human Imitation using pre-trained Diffusion Policy
by: Park, Sungjae, et al.
Published: (2025)

G-HOP: Generative Hand-Object Prior for Interaction Reconstruction and Grasp Synthesis
by: Ye, Yufei, et al.
Published: (2024)

CRISP: Contact-Guided Real2Sim from Monocular Video with Planar Scene Primitives
by: Wang, Zihan, et al.
Published: (2025)

MVD-Fusion: Single-view 3D via Depth-consistent Multi-view Generation
by: Hu, Hanzhe, et al.
Published: (2024)

CtrlVDiff: Controllable Video Generation via Unified Multimodal Video Diffusion
by: Xi, Dianbing, et al.
Published: (2025)

DressRecon: Freeform 4D Human Reconstruction from Monocular Video
by: Tan, Jeff, et al.
Published: (2024)

BlobCtrl: Taming Controllable Blob for Element-level Image Editing
by: Li, Yaowei, et al.
Published: (2025)

MotionCtrl: A Unified and Flexible Motion Controller for Video Generation
by: Wang, Zhouxia, et al.
Published: (2023)

LightCtrl: Training-free Controllable Video Relighting
by: Peng, Yizuo, et al.
Published: (2026)

TextCtrl: Diffusion-based Scene Text Editing with Prior Guidance Control
by: Zeng, Weichao, et al.
Published: (2024)

Gen2Act: Human Video Generation in Novel Scenarios enables Generalizable Robot Manipulation
by: Bharadhwaj, Homanga, et al.
Published: (2024)

Ctrl-VI: Controllable Video Synthesis via Variational Inference
by: Duan, Haoyi, et al.
Published: (2025)

Flow3r: Factored Flow Prediction for Scalable Visual Geometry Learning
by: Cong, Zhongxiao, et al.
Published: (2026)

Diverse Score Distillation
by: Xu, Yanbo, et al.
Published: (2024)

Ctrl-V: Higher Fidelity Video Generation with Bounding-Box Controlled Object Motion
by: Luo, Ge Ya, et al.
Published: (2024)

EditVerse: Unifying Image and Video Editing and Generation with In-Context Learning
by: Ju, Xuan, et al.
Published: (2025)

DisProtEdit: Exploring Disentangled Representations for Multi-Attribute Protein Editing
by: Ku, Max, et al.
Published: (2025)

Predicting 4D Hand Trajectory from Monocular Videos
by: Ye, Yufei, et al.
Published: (2025)

ControlEdit: A MultiModal Local Clothing Image Editing Method
by: Cheng, Di, et al.
Published: (2024)

Edit As You Wish: Video Caption Editing with Multi-grained User Control
by: Yao, Linli, et al.
Published: (2023)

PPS-Ctrl: Controllable Sim-to-Real Translation for Colonoscopy Depth Estimation
by: Xiong, Xinqi, et al.
Published: (2025)

DexCtrl: Towards Sim-to-Real Dexterity with Adaptive Controller Learning
by: Zhao, Shuqi, et al.
Published: (2025)

EmoCtrl: Controllable Emotional Image Content Generation
by: Yang, Jingyuan, et al.
Published: (2025)

List Decoding Expander-Based Codes up to Capacity in Near-Linear Time
by: Srivastava, Shashank, et al.
Published: (2025)

ExpressEdit: Video Editing with Natural Language and Sketching
by: Tilekbay, Bekzat, et al.
Published: (2024)

UniPhy: Learning a Unified Constitutive Model for Inverse Physics Simulation
by: Mittal, Himangi, et al.
Published: (2025)

Edit-Your-Motion: Space-Time Diffusion Decoupling Learning for Video Motion Editing
by: Zuo, Yi, et al.
Published: (2024)