| Main Authors: | Jiang, Zutao, Fang, Guian, Han, Jianhua, Lu, Guansong, Xu, Hang, Liao, Shengcai, Chang, Xiaojun, Liang, Xiaodan |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Online Access: | https://arxiv.org/abs/2305.19599 |
Similar Items
HumanRefiner: Benchmarking Abnormal Human Generation and Refining with Coarse-to-fine Pose-Reversible Guidance
by: Fang, Guian, et al.
Published: (2024)
LayerDiff: Exploring Text-guided Multi-layered Composable Image Synthesis via Layer-Collaborative Diffusion Model
by: Huang, Runhui, et al.
Published: (2024)
StoryAgent: Customized Storytelling Video Generation via Multi-Agent Collaboration
by: Hu, Panwen, et al.
Published: (2024)
NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning
by: Lin, Bingqian, et al.
Published: (2024)
Realistic and Efficient Face Swapping: A Unified Approach with Diffusion Models
by: Baliah, Sanoojan, et al.
Published: (2024)
DeRaDiff: Denoising Time Realignment of Diffusion Models
by: Manujith, Ratnavibusena Don Shahain, et al.
Published: (2026)
PanGu-Draw: Advancing Resource-Efficient Text-to-Image Synthesis with Time-Decoupled Training and Reusable Coop-Diffusion
by: Lu, Guansong, et al.
Published: (2023)
DirectSwap: Mask-Free Cross-Identity Training and Benchmarking for Expression-Consistent Video Head Swapping
by: Wang, Yanan, et al.
Published: (2025)
Re-Thinking the Automatic Evaluation of Image-Text Alignment in Text-to-Image Models
by: Zhang, Huixuan, et al.
Published: (2025)
SemDiff: Generating Natural Unrestricted Adversarial Examples via Semantic Attributes Optimization in Diffusion Models
by: Dai, Zeyu, et al.
Published: (2025)
SemHiTok: A Unified Image Tokenizer via Semantic-Guided Hierarchical Codebook for Multimodal Understanding and Generation
by: Chen, Zisheng, et al.
Published: (2025)
VidMan: Exploiting Implicit Dynamics from Video Diffusion Model for Effective Robot Manipulation
by: Wen, Youpeng, et al.
Published: (2024)
MagicSeg: Open-World Segmentation Pretraining via Counterfactural Diffusion-Based Auto-Generation
by: Cai, Kaixin, et al.
Published: (2026)
DiffBoost: Enhancing Medical Image Segmentation via Text-Guided Diffusion Model
by: Zhang, Zheyuan, et al.
Published: (2023)
EasyControl: Transfer ControlNet to Video Diffusion for Controllable Generation and Interpolation
by: Wang, Cong, et al.
Published: (2024)
DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions
by: Wang, Guangrun, et al.
Published: (2024)
JoReS-Diff: Joint Retinex and Semantic Priors in Diffusion Model for Low-light Image Enhancement
by: Wu, Yuhui, et al.
Published: (2023)
CorNav: Autonomous Agent with Self-Corrected Planning for Zero-Shot Vision-and-Language Navigation
by: Liang, Xiwen, et al.
Published: (2023)
AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation
by: Gu, Yuchao, et al.
Published: (2026)
ReText: Text Boosts Generalization in Image-Based Person Re-identification
by: Mamedov, Timur, et al.
Published: (2026)
FramePrompt: In-context Controllable Animation with Zero Structural Changes
by: Fang, Guian, et al.
Published: (2025)
DreamVideo: High-Fidelity Image-to-Video Generation with Image Retention and Text Guidance
by: Wang, Cong, et al.
Published: (2023)
LEAD: Latent Realignment for Human Motion Diffusion
by: Andreou, Nefeli, et al.
Published: (2024)
VMU-Diff: A Coarse-to-fine Multi-source Data Fusion Framework for Precipitation Nowcasting
by: Shi, Chunlei, et al.
Published: (2026)
DiffEditor: Boosting Accuracy and Flexibility on Diffusion-based Image Editing
by: Mou, Chong, et al.
Published: (2024)
SEPS: Semantic-enhanced Patch Slimming Framework for fine-grained cross-modal alignment
by: Mao, Xinyu, et al.
Published: (2025)
DiffMorph: Text-less Image Morphing with Diffusion Models
by: Chatterjee, Shounak
Published: (2024)
Diff-Tracker: Text-to-Image Diffusion Models are Unsupervised Trackers
by: Zhang, Zhengbo, et al.
Published: (2024)
DiffPop: Plausibility‐Guided Object Placement Diffusion for Image Composition
by: Jiacheng Liu, et al.
Published: (2024)
DiffPop: Plausibility-Guided Object Placement Diffusion for Image Composition
by: Liu, Jiacheng, et al.
Published: (2024)
TAL: Two-stream Adaptive Learning for Generalizable Person Re-identification
by: Yan, Yichao, et al.
Published: (2021)
DiffRefiner: Coarse to Fine Trajectory Planning via Diffusion Refinement with Semantic Interaction for End to End Autonomous Driving
by: Yin, Liuhan, et al.
Published: (2025)
ILLUME+: Illuminating Unified MLLM with Dual Visual Tokenization and Diffusion Refinement
by: Huang, Runhui, et al.
Published: (2025)
CoCoDiff: Diversifying Skeleton Action Features via Coarse-Fine Text-Co-Guided Latent Diffusion
by: Zhao, Zhifu, et al.
Published: (2025)
ILLUME: Illuminating Your LLMs to See, Draw, and Self-Enhance
by: Wang, Chunwei, et al.
Published: (2024)
The Art of (Mis)alignment: How Fine-Tuning Methods Effectively Misalign and Realign LLMs in Post-Training
by: Zhang, Rui, et al.
Published: (2026)
SteerDiff: Steering towards Safe Text-to-Image Diffusion Models
by: Zhang, Hongxiang, et al.
Published: (2024)
DiffBlender: Composable and Versatile Multimodal Text-to-Image Diffusion Models
by: Kim, Sungnyun, et al.
Published: (2023)
Coarse-to-fine Dynamic Uplift Modeling for Real-time Video Recommendation
by: Meng, Chang, et al.
Published: (2024)
AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning
by: Guo, Yuwei, et al.
Published: (2023)