Library Catalog

Bibliographic Details
Main Authors:	Gu, Yuming; Wang, Yizhi; Hong, Yining; Gao, Yipeng; Jiang, Hao; Wang, Angtian; Liu, Bo; Dennler, Nathaniel S.; Kuang, Zhengfei; Li, Hao; Wetzstein, Gordon; Ma, Chongyang
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2512.22626

Similar Items

Collaborative Video Diffusion: Consistent Multi-video Generation with Camera Control
by: Kuang, Zhengfei, et al.
Published: (2024)

GeoFlow: Enforcing Implicit Geometric Consistency in Video Generation
by: Ackermann, Jan, et al.
Published: (2026)

VIVA: VLM-Guided Instruction-Based Video Editing with Reward Optimization
by: Cong, Xiaoyan, et al.
Published: (2025)

VULCAN: Tool-Augmented Multi Agents for Iterative 3D Object Arrangement
by: Kuang, Zhengfei, et al.
Published: (2025)

Buffer Anytime: Zero-Shot Video Depth and Normal from Image Priors
by: Kuang, Zhengfei, et al.
Published: (2024)

CINEMA: Coherent Multi-Subject Video Generation via MLLM-Based Guidance
by: Deng, Yufan, et al.
Published: (2025)

ATI: Any Trajectory Instruction for Controllable Video Generation
by: Wang, Angtian, et al.
Published: (2025)

BulletTime: Decoupled Control of Time and Camera Pose for Video Generation
by: Wang, Yiming, et al.
Published: (2025)

MAGREF: Masked Guidance for Any-Reference Video Generation with Subject Disentanglement
by: Deng, Yufan, et al.
Published: (2025)

Spectral Progressive Diffusion for Efficient Image and Video Generation
by: Xiao, Howard, et al.
Published: (2026)

Foveated Diffusion: Efficient Spatially Adaptive Image and Video Generation
by: Chao, Brian, et al.
Published: (2026)

HECTOR: Hybrid Editable Compositional Object References for Video Generation
by: Zhang, Guofeng, et al.
Published: (2026)

CameraCtrl: Enabling Camera Control for Text-to-Video Generation
by: He, Hao, et al.
Published: (2024)

Contrastive Learning from Exploratory Actions: Leveraging Natural Interactions for Preference Elicitation
by: Dennler, Nathaniel, et al.
Published: (2025)

GIFT: Generalizing Intent for Flexible Test-Time Rewards
by: Amin, Fin, et al.
Published: (2026)

Using Causal Trees to Estimate Personalized Task Difficulty in Post-Stroke Individuals
by: Dennler, Nathaniel, et al.
Published: (2024)

The Current State of AI Bias Bounties: An Overview of Existing Programmes and Research
by: Kucenko, Sergej, et al.
Published: (2025)

Singing the Body Electric: The Impact of Robot Embodiment on User Expectations
by: Dennler, Nathaniel, et al.
Published: (2024)

BAgger: Backwards Aggregation for Mitigating Drift in Autoregressive Video Diffusion Models
by: Po, Ryan, et al.
Published: (2025)

Dual Ascent Diffusion for Inverse Problems
by: Kim, Minseo, et al.
Published: (2025)

Generic 3D Diffusion Adapter Using Controlled Multi-View Editing
by: Chen, Hansheng, et al.
Published: (2024)

Frame Context Packing and Drift Prevention in Next-Frame-Prediction Video Diffusion Models
by: Zhang, Lvmin, et al.
Published: (2025)

X-Dyna: Expressive Dynamic Human Image Animation
by: Chang, Di, et al.
Published: (2025)

Infinite Gaze Generation for Videos with Autoregressive Diffusion
by: Kang, Jenna, et al.
Published: (2026)

Position: Olfaction Standardization is Essential for the Advancement of Embodied Artificial Intelligence
by: France, Kordel K., et al.
Published: (2025)

Designing Robot Identity: The Role of Voice, Clothing, and Task on Robot Gender Perception
by: Dennler, Nathaniel S., et al.
Published: (2024)

Improving User Experience in Preference-Based Optimization of Reward Functions for Assistive Robots
by: Dennler, Nathaniel, et al.
Published: (2024)

TGT: Text-Grounded Trajectories for Locally Controlled Video Generation
by: Zhang, Guofeng, et al.
Published: (2025)

Orthogonal Adaptation for Modular Customization of Diffusion Models
by: Po, Ryan, et al.
Published: (2023)

Streetscapes: Large-scale Consistent Street View Generation Using Autoregressive Video Diffusion
by: Deng, Boyang, et al.
Published: (2024)

CameraCtrl II: Dynamic Scene Exploration via Camera-controlled Video Diffusion Models
by: He, Hao, et al.
Published: (2025)

RelightVid: Temporal-Consistent Diffusion Model for Video Relighting
by: Fang, Ye, et al.
Published: (2025)

LiPUP-MA: A Residential Experience-centric Multi-Agent Framework for Living-in-the-loop Participatory Urban Planning
by: Ni, Hang, et al.
Published: (2024)

Turning Text and Imagery into Captivating Visual Video
by: Wang, Mingming, et al.
Published: (2024)

Masked IRL: LLM-Guided Reward Disambiguation from Demonstrations and Language
by: Hwang, Minyoung, et al.
Published: (2025)

QuickLAP: Quick Language-Action Preference Learning for Semi-Autonomous Agents
by: Nader, Jordan Abi, et al.
Published: (2025)

EmbodiedPlace: Learning Mixture-of-Features with Embodied Constraints for Visual Place Recognition
by: Liu, Bingxi, et al.
Published: (2025)

SplatPainter: Interactive Authoring of 3D Gaussians from 2D Edits via Test-Time Training
by: Zheng, Yang, et al.
Published: (2025)

Neural Ganglion Sensors: Learning Task-specific Event Cameras Inspired by the Neural Circuit of the Human Retina
by: So, Haley M., et al.
Published: (2025)

EnerVerse-AC: Envisioning Embodied Environments with Action Condition
by: Jiang, Yuxin, et al.
Published: (2025)