Library Catalog

Bibliographic Details
Main Authors:	Qiu, Haonan; Chen, Zhaoxi; Wang, Zhouxia; He, Yingqing; Xia, Menghan; Liu, Ziwei
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2406.16863

Similar Items

FreeNoise: Tuning-Free Longer Video Diffusion via Noise Rescheduling
by: Qiu, Haonan, et al.
Published: (2023)

FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion
by: Qiu, Haonan, et al.
Published: (2024)

FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model
by: Cao, Yukang, et al.
Published: (2025)

PoseTraj: Pose-Aware Trajectory Control in Video Diffusion
by: Ji, Longbin, et al.
Published: (2025)

MotionCtrl: A Unified and Flexible Motion Controller for Video Generation
by: Wang, Zhouxia, et al.
Published: (2023)

FlexTraj: Image-to-Video Generation with Flexible Point Trajectory Control
by: Zhang, Zhiyuan, et al.
Published: (2025)

Free4D: Tuning-free 4D Scene Generation with Spatial-Temporal Consistency
by: Liu, Tianqi, et al.
Published: (2025)

CineScale: Free Lunch in High-Resolution Cinematic Visual Generation
by: Qiu, Haonan, et al.
Published: (2025)

FreeInit: Bridging Initialization Gap in Video Diffusion Models
by: Wu, Tianxing, et al.
Published: (2023)

Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos
by: Ma, Yue, et al.
Published: (2023)

TrajLoom: Dense Future Trajectory Generation from Video
by: Zhang, Zewei, et al.
Published: (2026)

EmoVid: A Multimodal Emotion Video Dataset for Emotion-Centric Video Understanding and Generation
by: Qiu, Zongyang, et al.
Published: (2025)

MovieCharacter: A Tuning-Free Framework for Controllable Character Video Synthesis
by: Qiu, Di, et al.
Published: (2024)

FashionEngine: Interactive 3D Human Generation and Editing via Multimodal Controls
by: Hu, Tao, et al.
Published: (2024)

TrajDiffuse: A Conditional Diffusion Model for Environment-Aware Trajectory Prediction
by: Qingze, et al.
Published: (2024)

Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation
by: Guo, Lanqing, et al.
Published: (2024)

OnlineSI: Taming Large Language Model for Online 3D Understanding and Grounding
by: Liu, Zixian, et al.
Published: (2026)

Compositional Generative Model of Unbounded 4D Cities
by: Xie, Haozhe, et al.
Published: (2025)

HeadHunt-VAD: Hunting Robust Anomaly-Sensitive Heads in MLLM for Tuning-Free Video Anomaly Detection
by: Cai, Zhaolin, et al.
Published: (2025)

CityDreamer: Compositional Generative Model of Unbounded 3D Cities
by: Xie, Haozhe, et al.
Published: (2023)

FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality
by: Lv, Zhengyao, et al.
Published: (2024)

TrajSV: A Trajectory-based Model for Sports Video Representations and Applications
by: Wang, Zheng, et al.
Published: (2025)

VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
by: Chen, Haoxin, et al.
Published: (2024)

LongVie: Multimodal-Guided Controllable Ultra-Long Video Generation
by: Gao, Jianxiong, et al.
Published: (2025)

UniCtrl: Improving the Spatiotemporal Consistency of Text-to-Video Diffusion Models via Training-Free Unified Attention Control
by: Xia, Tian, et al.
Published: (2024)

TrajTok: Learning Trajectory Tokens enables better Video Understanding
by: Zheng, Chenhao, et al.
Published: (2026)

Diffusion-DRF: Free, Rich, and Differentiable Reward for Video Diffusion Fine-Tuning
by: Wang, Yifan, et al.
Published: (2026)

LongVie 2: Multimodal Controllable Ultra-Long Video World Model
by: Gao, Jianxiong, et al.
Published: (2025)

Tuning-Free Long Video Generation via Global-Local Collaborative Diffusion
by: Ma, Yongjia, et al.
Published: (2025)

DiffuTraj: A Stochastic Vessel Trajectory Prediction Approach via Guided Diffusion Process
by: Li, Changlin, et al.
Published: (2024)

ObjCtrl-2.5D: Training-free Object Control with Camera Poses
by: Wang, Zhouxia, et al.
Published: (2024)

PhysX-3D: Physical-Grounded 3D Asset Generation
by: Cao, Ziang, et al.
Published: (2025)

Generative Gaussian Splatting for Unbounded 3D City Generation
by: Xie, Haozhe, et al.
Published: (2024)

Collaborative Multi-Modal Coding for High-Quality 3D Generation
by: Cao, Ziang, et al.
Published: (2025)

FAIRT2V: Training-Free Debiasing for Text-to-Video Diffusion Models
by: Zhong, Haonan, et al.
Published: (2026)

FreeFix: Boosting 3D Gaussian Splatting via Fine-Tuning-Free Diffusion Models
by: Zhou, Hongyu, et al.
Published: (2026)

LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation
by: Tang, Jiaxiang, et al.
Published: (2024)

TrajShield: Trajectory-Level Safety Mediation for Defending Text-to-Video Models Against Jailbreak Attacks
by: Zou, Quanchen, et al.
Published: (2026)

Dual-Expert Consistency Model for Efficient and High-Quality Video Generation
by: Lv, Zhengyao, et al.
Published: (2025)

MagicStick: Controllable Video Editing via Control Handle Transformations
by: Ma, Yue, et al.
Published: (2023)