| Main Authors: | Wang, Yuelei, Zhang, Jian, Jiang, Pengtao, Zhang, Hao, Chen, Jinwei, Li, Bo |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2412.01429 |
Similar Items
WATCH: World-aware Allied Trajectory and pose reconstruction for Camera and Human
by: Ying, Qijun, et al.
Published: (2025)
MagicTryOn: Harnessing Diffusion Transformer for Garment-Preserving Video Virtual Try-on
by: Li, Guangyuan, et al.
Published: (2025)
Advancing Comprehensive Aesthetic Insight with Multi-Scale Text-Guided Self-Supervised Learning
by: Liu, Yuti, et al.
Published: (2024)
ChatAnyone: Stylized Real-time Portrait Video Generation with Hierarchical Motion Diffusion Model
by: Qi, Jinwei, et al.
Published: (2025)
High-Precision Dichotomous Image Segmentation via Probing Diffusion Capacity
by: Yu, Qian, et al.
Published: (2024)
GenCompositor: Generative Video Compositing with Diffusion Transformer
by: Yang, Shuzhou, et al.
Published: (2025)
CameraCtrl: Enabling Camera Control for Text-to-Video Generation
by: He, Hao, et al.
Published: (2024)
Towards Photorealistic and Efficient Bokeh Rendering via Diffusion Framework
by: Shi, Linxiao, et al.
Published: (2026)
Improving Consistency in Diffusion Models for Image Super-Resolution
by: Gu, Junhao, et al.
Published: (2024)
Diffusion-APO: Trajectory-Aware Direct Preference Alignment for Video Diffusion Transformers
by: Zhu, Jingyuan, et al.
Published: (2026)
Diffusion-based Data Augmentation for Object Counting Problems
by: Wang, Zhen, et al.
Published: (2024)
IP-Adapter Is All You Need: Towards Fine-Tuning-Free Diffusion-Based Talking Face Generation
by: Wu, Hao, et al.
Published: (2026)
DiffCalib: Reformulating Monocular Camera Calibration as Diffusion-Based Dense Incident Map Generation
by: He, Xiankang, et al.
Published: (2024)
Boosting Camera Motion Control for Video Diffusion Transformers
by: Cheong, Soon Yau, et al.
Published: (2024)
ControlSR: Taming Diffusion Models for Consistent Real-World Image Super Resolution
by: Wan, Yuhao, et al.
Published: (2024)
SDMatte: Grafting Diffusion Models for Interactive Matting
by: Huang, Longfei, et al.
Published: (2025)
SymphoMotion: Joint Control of Camera Motion and Object Dynamics for Coherent Video Generation
by: Zhang, Guiyu, et al.
Published: (2026)
Cavia: Camera-controllable Multi-view Video Diffusion with View-Integrated Attention
by: Xu, Dejia, et al.
Published: (2024)
IDCNet: Guided Video Diffusion for Metric-Consistent RGBD Scene Generation with Precise Camera Control
by: Liu, Lijuan, et al.
Published: (2025)
CameraCtrl II: Dynamic Scene Exploration via Camera-controlled Video Diffusion Models
by: He, Hao, et al.
Published: (2025)
DualCamCtrl: Dual-Branch Diffusion Model for Geometry-Aware Camera-Controlled Video Generation
by: Zhang, Hongfei, et al.
Published: (2025)
Improving Adversarial Energy-Based Model via Diffusion Process
by: Geng, Cong, et al.
Published: (2024)
CamPilot: Improving Camera Control in Video Diffusion Model with Efficient Camera Reward Feedback
by: Ge, Wenhang, et al.
Published: (2026)
Collaborative Video Diffusion: Consistent Multi-video Generation with Camera Control
by: Kuang, Zhengfei, et al.
Published: (2024)
Scalable Visual State Space Model with Fractal Scanning
by: Tang, Lv, et al.
Published: (2024)
Controllable and Expressive One-Shot Video Head Swapping
by: Ji, Chaonan, et al.
Published: (2025)
Latte: Latent Diffusion Transformer for Video Generation
by: Ma, Xin, et al.
Published: (2024)
Multi-Task Dense Prediction via Mixture of Low-Rank Experts
by: Yang, Yuqi, et al.
Published: (2024)
Chain of Visual Perception: Harnessing Multimodal Large Language Models for Zero-shot Camouflaged Object Detection
by: Tang, Lv, et al.
Published: (2023)
BLO-Inst: Bi-Level Optimization Based Alignment of YOLO and SAM for Robust Instance Segmentation
by: Zhang, Li, et al.
Published: (2026)
Any-to-Bokeh: Arbitrary-Subject Video Refocusing with Video Diffusion Model
by: Yang, Yang, et al.
Published: (2025)
CameraNoise: Enabling Faithful Camera Control in Video Diffusion through Geometry-Flow-Guided Noise Warping
by: Zhao, Haoyu, et al.
Published: (2026)
Tora: Trajectory-oriented Diffusion Transformer for Video Generation
by: Zhang, Zhenghao, et al.
Published: (2024)
MagicWorld: Towards Long-Horizon Stability for Interactive Video World Exploration
by: Li, Guangyuan, et al.
Published: (2025)
360DVD: Controllable Panorama Video Generation with 360-Degree Video Diffusion Model
by: Wang, Qian, et al.
Published: (2024)
Empowering Segmentation Ability to Multi-modal Large Language Models
by: Yang, Yuqi, et al.
Published: (2024)
CinePreGen: Camera Controllable Video Previsualization via Engine-powered Diffusion
by: Chen, Yiran, et al.
Published: (2024)
OneTo3D: One Image to Re-editable Dynamic 3D Model and Video Generation
by: Lin, Jinwei
Published: (2024)
OmniTalker: One-shot Real-time Text-Driven Talking Audio-Video Generation With Multimodal Style Mimicking
by: Wang, Zhongjian, et al.
Published: (2025)
AutoDIR: Automatic All-in-One Image Restoration with Latent Diffusion
by: Jiang, Yitong, et al.
Published: (2023)