| Main Authors: | Litman, Yehonathan, Liu, Shikun, Seyb, Dario, Milef, Nicholas, Zhou, Yang, Marshall, Carl, Tulsiani, Shubham, Leak, Caleb |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2602.15031 |
Similar Items
LightSwitch: Multi-view Relighting with Material-guided Diffusion
by: Litman, Yehonathan, et al.
Published: (2025)
MaterialFusion: Enhancing Inverse Rendering with Material Diffusion Priors
by: Litman, Yehonathan, et al.
Published: (2024)
Towards Unstructured Unlabeled Optical Mocap: A Video Helps!
by: Milef, Nicholas, et al.
Published: (2024)
Real‐Time Neural Materials on Mobile VR
by: Zilin Xu, et al.
Published: (2026)
Sparse-view Pose Estimation and Reconstruction via Analysis by Generative Synthesis
by: Zhao, Qitao, et al.
Published: (2024)
SceneFactor: Factored Latent 3D Diffusion for Controllable 3D Scene Generation
by: Bokhovkin, Alexey, et al.
Published: (2024)
REST3D: Reconstructing Physically Stable 3D Scenes from a Single Image
by: Ma, Xiaoxuan, et al.
Published: (2026)
DriveCtrl: Conditioned Sim-to-Real Driving Video Generation
by: Zhao, Haonan, et al.
Published: (2026)
Dex4D: Task-Agnostic Point Track Policy for Sim-to-Real Dexterous Manipulation
by: Kuang, Yuxuan, et al.
Published: (2026)
Track2Act: Predicting Point Tracks from Internet Videos enables Generalizable Robot Manipulation
by: Bharadhwaj, Homanga, et al.
Published: (2024)
EgoEdit: Dataset, Real-Time Streaming Model, and Benchmark for Egocentric Video Editing
by: Li, Runjia, et al.
Published: (2025)
CameraCtrl: Enabling Camera Control for Text-to-Video Generation
by: He, Hao, et al.
Published: (2024)
PhysCtrl: Generative Physics for Controllable and Physics-Grounded Video Generation
by: Wang, Chen, et al.
Published: (2025)
DemoDiffusion: One-Shot Human Imitation using pre-trained Diffusion Policy
by: Park, Sungjae, et al.
Published: (2025)
G-HOP: Generative Hand-Object Prior for Interaction Reconstruction and Grasp Synthesis
by: Ye, Yufei, et al.
Published: (2024)
CRISP: Contact-Guided Real2Sim from Monocular Video with Planar Scene Primitives
by: Wang, Zihan, et al.
Published: (2025)
MVD-Fusion: Single-view 3D via Depth-consistent Multi-view Generation
by: Hu, Hanzhe, et al.
Published: (2024)
CtrlVDiff: Controllable Video Generation via Unified Multimodal Video Diffusion
by: Xi, Dianbing, et al.
Published: (2025)
DressRecon: Freeform 4D Human Reconstruction from Monocular Video
by: Tan, Jeff, et al.
Published: (2024)
BlobCtrl: Taming Controllable Blob for Element-level Image Editing
by: Li, Yaowei, et al.
Published: (2025)
MotionCtrl: A Unified and Flexible Motion Controller for Video Generation
by: Wang, Zhouxia, et al.
Published: (2023)
LightCtrl: Training-free Controllable Video Relighting
by: Peng, Yizuo, et al.
Published: (2026)
TextCtrl: Diffusion-based Scene Text Editing with Prior Guidance Control
by: Zeng, Weichao, et al.
Published: (2024)
Gen2Act: Human Video Generation in Novel Scenarios enables Generalizable Robot Manipulation
by: Bharadhwaj, Homanga, et al.
Published: (2024)
Ctrl-VI: Controllable Video Synthesis via Variational Inference
by: Duan, Haoyi, et al.
Published: (2025)
Flow3r: Factored Flow Prediction for Scalable Visual Geometry Learning
by: Cong, Zhongxiao, et al.
Published: (2026)
Diverse Score Distillation
by: Xu, Yanbo, et al.
Published: (2024)
Ctrl-V: Higher Fidelity Video Generation with Bounding-Box Controlled Object Motion
by: Luo, Ge Ya, et al.
Published: (2024)
EditVerse: Unifying Image and Video Editing and Generation with In-Context Learning
by: Ju, Xuan, et al.
Published: (2025)
DisProtEdit: Exploring Disentangled Representations for Multi-Attribute Protein Editing
by: Ku, Max, et al.
Published: (2025)
Predicting 4D Hand Trajectory from Monocular Videos
by: Ye, Yufei, et al.
Published: (2025)
ControlEdit: A MultiModal Local Clothing Image Editing Method
by: Cheng, Di, et al.
Published: (2024)
Edit As You Wish: Video Caption Editing with Multi-grained User Control
by: Yao, Linli, et al.
Published: (2023)
PPS-Ctrl: Controllable Sim-to-Real Translation for Colonoscopy Depth Estimation
by: Xiong, Xinqi, et al.
Published: (2025)
DexCtrl: Towards Sim-to-Real Dexterity with Adaptive Controller Learning
by: Zhao, Shuqi, et al.
Published: (2025)
EmoCtrl: Controllable Emotional Image Content Generation
by: Yang, Jingyuan, et al.
Published: (2025)
List Decoding Expander-Based Codes up to Capacity in Near-Linear Time
by: Srivastava, Shashank, et al.
Published: (2025)
ExpressEdit: Video Editing with Natural Language and Sketching
by: Tilekbay, Bekzat, et al.
Published: (2024)
UniPhy: Learning a Unified Constitutive Model for Inverse Physics Simulation
by: Mittal, Himangi, et al.
Published: (2025)
Edit-Your-Motion: Space-Time Diffusion Decoupling Learning for Video Motion Editing
by: Zuo, Yi, et al.
Published: (2024)