| Main Authors: | Gu, Yuming, Wang, Yizhi, Hong, Yining, Gao, Yipeng, Jiang, Hao, Wang, Angtian, Liu, Bo, Dennler, Nathaniel S., Kuang, Zhengfei, Li, Hao, Wetzstein, Gordon, Ma, Chongyang |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2512.22626 |
Similar Items
Collaborative Video Diffusion: Consistent Multi-video Generation with Camera Control
by: Kuang, Zhengfei, et al.
Published: (2024)
GeoFlow: Enforcing Implicit Geometric Consistency in Video Generation
by: Ackermann, Jan, et al.
Published: (2026)
VIVA: VLM-Guided Instruction-Based Video Editing with Reward Optimization
by: Cong, Xiaoyan, et al.
Published: (2025)
VULCAN: Tool-Augmented Multi Agents for Iterative 3D Object Arrangement
by: Kuang, Zhengfei, et al.
Published: (2025)
Buffer Anytime: Zero-Shot Video Depth and Normal from Image Priors
by: Kuang, Zhengfei, et al.
Published: (2024)
CINEMA: Coherent Multi-Subject Video Generation via MLLM-Based Guidance
by: Deng, Yufan, et al.
Published: (2025)
ATI: Any Trajectory Instruction for Controllable Video Generation
by: Wang, Angtian, et al.
Published: (2025)
BulletTime: Decoupled Control of Time and Camera Pose for Video Generation
by: Wang, Yiming, et al.
Published: (2025)
MAGREF: Masked Guidance for Any-Reference Video Generation with Subject Disentanglement
by: Deng, Yufan, et al.
Published: (2025)
Spectral Progressive Diffusion for Efficient Image and Video Generation
by: Xiao, Howard, et al.
Published: (2026)
Foveated Diffusion: Efficient Spatially Adaptive Image and Video Generation
by: Chao, Brian, et al.
Published: (2026)
HECTOR: Hybrid Editable Compositional Object References for Video Generation
by: Zhang, Guofeng, et al.
Published: (2026)
CameraCtrl: Enabling Camera Control for Text-to-Video Generation
by: He, Hao, et al.
Published: (2024)
Contrastive Learning from Exploratory Actions: Leveraging Natural Interactions for Preference Elicitation
by: Dennler, Nathaniel, et al.
Published: (2025)
GIFT: Generalizing Intent for Flexible Test-Time Rewards
by: Amin, Fin, et al.
Published: (2026)
Using Causal Trees to Estimate Personalized Task Difficulty in Post-Stroke Individuals
by: Dennler, Nathaniel, et al.
Published: (2024)
The Current State of AI Bias Bounties: An Overview of Existing Programmes and Research
by: Kucenko, Sergej, et al.
Published: (2025)
Singing the Body Electric: The Impact of Robot Embodiment on User Expectations
by: Dennler, Nathaniel, et al.
Published: (2024)
BAgger: Backwards Aggregation for Mitigating Drift in Autoregressive Video Diffusion Models
by: Po, Ryan, et al.
Published: (2025)
Dual Ascent Diffusion for Inverse Problems
by: Kim, Minseo, et al.
Published: (2025)
Generic 3D Diffusion Adapter Using Controlled Multi-View Editing
by: Chen, Hansheng, et al.
Published: (2024)
Frame Context Packing and Drift Prevention in Next-Frame-Prediction Video Diffusion Models
by: Zhang, Lvmin, et al.
Published: (2025)
X-Dyna: Expressive Dynamic Human Image Animation
by: Chang, Di, et al.
Published: (2025)
Infinite Gaze Generation for Videos with Autoregressive Diffusion
by: Kang, Jenna, et al.
Published: (2026)
Position: Olfaction Standardization is Essential for the Advancement of Embodied Artificial Intelligence
by: France, Kordel K., et al.
Published: (2025)
Designing Robot Identity: The Role of Voice, Clothing, and Task on Robot Gender Perception
by: Dennler, Nathaniel S., et al.
Published: (2024)
Improving User Experience in Preference-Based Optimization of Reward Functions for Assistive Robots
by: Dennler, Nathaniel, et al.
Published: (2024)
TGT: Text-Grounded Trajectories for Locally Controlled Video Generation
by: Zhang, Guofeng, et al.
Published: (2025)
Orthogonal Adaptation for Modular Customization of Diffusion Models
by: Po, Ryan, et al.
Published: (2023)
Streetscapes: Large-scale Consistent Street View Generation Using Autoregressive Video Diffusion
by: Deng, Boyang, et al.
Published: (2024)
CameraCtrl II: Dynamic Scene Exploration via Camera-controlled Video Diffusion Models
by: He, Hao, et al.
Published: (2025)
RelightVid: Temporal-Consistent Diffusion Model for Video Relighting
by: Fang, Ye, et al.
Published: (2025)
LiPUP-MA: A Residential Experience-centric Multi-Agent Framework for Living-in-the-loop Participatory Urban Planning
by: Ni, Hang, et al.
Published: (2024)
Turning Text and Imagery into Captivating Visual Video
by: Wang, Mingming, et al.
Published: (2024)
Masked IRL: LLM-Guided Reward Disambiguation from Demonstrations and Language
by: Hwang, Minyoung, et al.
Published: (2025)
QuickLAP: Quick Language-Action Preference Learning for Semi-Autonomous Agents
by: Nader, Jordan Abi, et al.
Published: (2025)
EmbodiedPlace: Learning Mixture-of-Features with Embodied Constraints for Visual Place Recognition
by: Liu, Bingxi, et al.
Published: (2025)
SplatPainter: Interactive Authoring of 3D Gaussians from 2D Edits via Test-Time Training
by: Zheng, Yang, et al.
Published: (2025)
Neural Ganglion Sensors: Learning Task-specific Event Cameras Inspired by the Neural Circuit of the Human Retina
by: So, Haley M., et al.
Published: (2025)
EnerVerse-AC: Envisioning Embodied Environments with Action Condition
by: Jiang, Yuxin, et al.
Published: (2025)