Library Catalog

Bibliographic Details
Main Authors:	Wang, Yuelei; Zhang, Jian; Jiang, Pengtao; Zhang, Hao; Chen, Jinwei; Li, Bo
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2412.01429

Similar Items

WATCH: World-aware Allied Trajectory and pose reconstruction for Camera and Human
by: Ying, Qijun, et al.
Published: (2025)

MagicTryOn: Harnessing Diffusion Transformer for Garment-Preserving Video Virtual Try-on
by: Li, Guangyuan, et al.
Published: (2025)

Advancing Comprehensive Aesthetic Insight with Multi-Scale Text-Guided Self-Supervised Learning
by: Liu, Yuti, et al.
Published: (2024)

ChatAnyone: Stylized Real-time Portrait Video Generation with Hierarchical Motion Diffusion Model
by: Qi, Jinwei, et al.
Published: (2025)

High-Precision Dichotomous Image Segmentation via Probing Diffusion Capacity
by: Yu, Qian, et al.
Published: (2024)

GenCompositor: Generative Video Compositing with Diffusion Transformer
by: Yang, Shuzhou, et al.
Published: (2025)

CameraCtrl: Enabling Camera Control for Text-to-Video Generation
by: He, Hao, et al.
Published: (2024)

Towards Photorealistic and Efficient Bokeh Rendering via Diffusion Framework
by: Shi, Linxiao, et al.
Published: (2026)

Improving Consistency in Diffusion Models for Image Super-Resolution
by: Gu, Junhao, et al.
Published: (2024)

Diffusion-APO: Trajectory-Aware Direct Preference Alignment for Video Diffusion Transformers
by: Zhu, Jingyuan, et al.
Published: (2026)

Diffusion-based Data Augmentation for Object Counting Problems
by: Wang, Zhen, et al.
Published: (2024)

IP-Adapter Is All You Need: Towards Fine-Tuning-Free Diffusion-Based Talking Face Generation
by: Wu, Hao, et al.
Published: (2026)

DiffCalib: Reformulating Monocular Camera Calibration as Diffusion-Based Dense Incident Map Generation
by: He, Xiankang, et al.
Published: (2024)

Boosting Camera Motion Control for Video Diffusion Transformers
by: Cheong, Soon Yau, et al.
Published: (2024)

ControlSR: Taming Diffusion Models for Consistent Real-World Image Super Resolution
by: Wan, Yuhao, et al.
Published: (2024)

SDMatte: Grafting Diffusion Models for Interactive Matting
by: Huang, Longfei, et al.
Published: (2025)

SymphoMotion: Joint Control of Camera Motion and Object Dynamics for Coherent Video Generation
by: Zhang, Guiyu, et al.
Published: (2026)

Cavia: Camera-controllable Multi-view Video Diffusion with View-Integrated Attention
by: Xu, Dejia, et al.
Published: (2024)

IDCNet: Guided Video Diffusion for Metric-Consistent RGBD Scene Generation with Precise Camera Control
by: Liu, Lijuan, et al.
Published: (2025)

CameraCtrl II: Dynamic Scene Exploration via Camera-controlled Video Diffusion Models
by: He, Hao, et al.
Published: (2025)

DualCamCtrl: Dual-Branch Diffusion Model for Geometry-Aware Camera-Controlled Video Generation
by: Zhang, Hongfei, et al.
Published: (2025)

Improving Adversarial Energy-Based Model via Diffusion Process
by: Geng, Cong, et al.
Published: (2024)

CamPilot: Improving Camera Control in Video Diffusion Model with Efficient Camera Reward Feedback
by: Ge, Wenhang, et al.
Published: (2026)

Collaborative Video Diffusion: Consistent Multi-video Generation with Camera Control
by: Kuang, Zhengfei, et al.
Published: (2024)

Scalable Visual State Space Model with Fractal Scanning
by: Tang, Lv, et al.
Published: (2024)

Controllable and Expressive One-Shot Video Head Swapping
by: Ji, Chaonan, et al.
Published: (2025)

Latte: Latent Diffusion Transformer for Video Generation
by: Ma, Xin, et al.
Published: (2024)

Multi-Task Dense Prediction via Mixture of Low-Rank Experts
by: Yang, Yuqi, et al.
Published: (2024)

Chain of Visual Perception: Harnessing Multimodal Large Language Models for Zero-shot Camouflaged Object Detection
by: Tang, Lv, et al.
Published: (2023)

BLO-Inst: Bi-Level Optimization Based Alignment of YOLO and SAM for Robust Instance Segmentation
by: Zhang, Li, et al.
Published: (2026)

Any-to-Bokeh: Arbitrary-Subject Video Refocusing with Video Diffusion Model
by: Yang, Yang, et al.
Published: (2025)

CameraNoise: Enabling Faithful Camera Control in Video Diffusion through Geometry-Flow-Guided Noise Warping
by: Zhao, Haoyu, et al.
Published: (2026)

Tora: Trajectory-oriented Diffusion Transformer for Video Generation
by: Zhang, Zhenghao, et al.
Published: (2024)

MagicWorld: Towards Long-Horizon Stability for Interactive Video World Exploration
by: Li, Guangyuan, et al.
Published: (2025)

360DVD: Controllable Panorama Video Generation with 360-Degree Video Diffusion Model
by: Wang, Qian, et al.
Published: (2024)

Empowering Segmentation Ability to Multi-modal Large Language Models
by: Yang, Yuqi, et al.
Published: (2024)

CinePreGen: Camera Controllable Video Previsualization via Engine-powered Diffusion
by: Chen, Yiran, et al.
Published: (2024)

OneTo3D: One Image to Re-editable Dynamic 3D Model and Video Generation
by: Lin, Jinwei
Published: (2024)

OmniTalker: One-shot Real-time Text-Driven Talking Audio-Video Generation With Multimodal Style Mimicking
by: Wang, Zhongjian, et al.
Published: (2025)

AutoDIR: Automatic All-in-One Image Restoration with Latent Diffusion
by: Jiang, Yitong, et al.
Published: (2023)