:: Library Catalog

Bibliographic Details
Main Authors:	Duan, Zhongjie, Zhou, Wenmeng, Chen, Cen, Li, Yaliang, Qian, Weining
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2406.14130

Similar Items

Diffutoon: High-Resolution Editable Toon Shading via Diffusion Models
by: Duan, Zhongjie, et al.
Published: (2024)

ArtAug: Enhancing Text-to-Image Generation through Synthesis-Understanding Interaction
by: Duan, Zhongjie, et al.
Published: (2024)

VIRAL: Visual In-Context Reasoning via Analogy in Diffusion Transformers
by: Li, Zhiwen, et al.
Published: (2026)

Growth Inhibitors for Suppressing Inappropriate Image Concepts in Diffusion Models
by: Chen, Die, et al.
Published: (2024)

AttriCtrl: Fine-Grained Control of Aesthetic Attribute Intensity in Diffusion Models
by: Chen, Die, et al.
Published: (2025)

Transferability Bound Theory: Exploring Relationship between Adversarial Transferability and Flatness
by: Fan, Mingyuan, et al.
Published: (2023)

Spectral Evolution Search: Efficient Inference-Time Scaling for Reward-Aligned Image Generation
by: Ye, Jinyan, et al.
Published: (2026)

AutoLoRA: Automatic LoRA Retrieval and Fine-Grained Gated Fusion for Text-to-Image Generation
by: Li, Zhiwen, et al.
Published: (2025)

DAPE: Dual-Stage Parameter-Efficient Fine-Tuning for Consistent Video Editing with Diffusion Models
by: Xia, Junhao, et al.
Published: (2025)

COMUNI: Decomposing Common and Unique Video Signals for Diffusion-based Video Generation
by: Sun, Mingzhen, et al.
Published: (2024)

AR-Diffusion: Asynchronous Video Generation with Auto-Regressive Diffusion
by: Sun, Mingzhen, et al.
Published: (2025)

Comprehensive Evaluation and Analysis for NSFW Concept Erasure in Text-to-Image Diffusion Models
by: Chen, Die, et al.
Published: (2025)

Transferable Adversarial Examples with Bayes Approach
by: Fan, Mingyuan, et al.
Published: (2022)

Harvest Video Foundation Models via Efficient Post-Pretraining
by: Li, Yizhuo, et al.
Published: (2023)

Diffusion-DRF: Free, Rich, and Differentiable Reward for Video Diffusion Fine-Tuning
by: Wang, Yifan, et al.
Published: (2026)

StyleInject: Parameter Efficient Tuning of Text-to-Image Diffusion Models
by: Zhou, Mohan, et al.
Published: (2024)

MM-LDM: Multi-Modal Latent Diffusion Model for Sounding Video Generation
by: Sun, Mingzhen, et al.
Published: (2024)

PhysCorr: Dual-Reward DPO for Physics-Constrained Text-to-Video Generation with Automated Preference Selection
by: Wang, Peiyao, et al.
Published: (2025)

Frame-wise Conditioning Adaptation for Fine-Tuning Diffusion Models in Text-to-Video Prediction
by: Liu, Zheyuan, et al.
Published: (2025)

ReLumix: Extending Image Relighting to Video via Video Diffusion Models
by: Wang, Lezhong, et al.
Published: (2025)

Efficient Video Diffusion with Sparse Information Transmission for Video Compression
by: Zhou, Mingde, et al.
Published: (2026)

QVD: Post-training Quantization for Video Diffusion Models
by: Tian, Shilong, et al.
Published: (2024)

SmoothVideo: Smooth Video Synthesis with Noise Constraints on Diffusion Models for One-shot Video Tuning
by: Peng, Liang, et al.
Published: (2023)

HumanVBench: Probing Human-Centric Video Understanding in MLLMs with Automatically Synthesized Benchmarks
by: Zhou, Ting, et al.
Published: (2024)

Video-OPD: Efficient Post-Training of Multimodal Large Language Models for Temporal Video Grounding via On-Policy Distillation
by: Li, Jiaze, et al.
Published: (2026)

CoNo: Consistency Noise Injection for Tuning-free Long Video Diffusion
by: Wang, Xingrui, et al.
Published: (2024)

Tuning-Free Long Video Generation via Global-Local Collaborative Diffusion
by: Ma, Yongjia, et al.
Published: (2025)

FreeTraj: Tuning-Free Trajectory Control in Video Diffusion Models
by: Qiu, Haonan, et al.
Published: (2024)

$R^2$-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding
by: Liu, Ye, et al.
Published: (2024)

360DVD: Controllable Panorama Video Generation with 360-Degree Video Diffusion Model
by: Wang, Qian, et al.
Published: (2024)

Streaming Video Instruction Tuning
by: Xia, Jiaer, et al.
Published: (2025)

LDP: Parameter-Efficient Fine-Tuning of Multimodal LLM for Medical Report Generation
by: Zhou, Tianyu, et al.
Published: (2025)

SeedVR2: One-Step Video Restoration via Diffusion Adversarial Post-Training
by: Wang, Jianyi, et al.
Published: (2025)

Streaming Video Diffusion: Online Video Editing with Diffusion Models
by: Chen, Feng, et al.
Published: (2024)

VideoScene: Distilling Video Diffusion Model to Generate 3D Scenes in One Step
by: Wang, Hanyang, et al.
Published: (2025)

VideoShield: Regulating Diffusion-based Video Generation Models via Watermarking
by: Hu, Runyi, et al.
Published: (2025)

Temporal-Conditional Referring Video Object Segmentation with Noise-Free Text-to-Video Diffusion Model
by: Zhang, Ruixin, et al.
Published: (2025)

SALAD: Achieve High-Sparsity Attention via Efficient Linear Attention Tuning for Video Diffusion Transformer
by: Fang, Tongcheng, et al.
Published: (2026)

Flash-GRPO: Efficient Alignment for Video Diffusion via One-Step Policy Optimization
by: He, Xiaoxuan, et al.
Published: (2026)

OmniInsert: Mask-Free Video Insertion of Any Reference via Diffusion Transformer Models
by: Chen, Jinshu, et al.
Published: (2025)