| Main Authors: | Xu, Pengcheng, Fan, Qingnan, Kou, Fei, Qin, Shuai, Gu, Hong, Zhao, Ruoyu, Ling, Charles, Wang, Boyu |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2501.03495 |
Similar Items
InstructBrush: Learning Attention-based Instruction Optimization for Image Editing
by: Zhao, Ruoyu, et al.
Published: (2024)
FreeDiff: Progressive Frequency Truncation for Image Editing with Diffusion Models
by: Wu, Wei, et al.
Published: (2024)
LIPE: Learning Personalized Identity Prior for Non-rigid Image Editing
by: Liu, Aoyang, et al.
Published: (2024)
Textual and Visual Prompt Fusion for Image Editing via Step-Wise Alignment
by: Feng, Zhanbo, et al.
Published: (2023)
Class Overwhelms: Mutual Conditional Blended-Target Domain Adaptation
by: Xu, Pengcheng, et al.
Published: (2023)
Towards Generalized Multi-Image Editing for Unified Multimodal Models
by: Xu, Pengcheng, et al.
Published: (2026)
RAP-SR: RestorAtion Prior Enhancement in Diffusion Models for Realistic Image Super-Resolution
by: Wang, Jiangang, et al.
Published: (2024)
Visual Textualization for Image Prompted Object Detection
by: Wu, Yongjian, et al.
Published: (2025)
Unveil Inversion and Invariance in Flow Transformer for Versatile Image Editing
by: Xu, Pengcheng, et al.
Published: (2024)
CoMPaSS: Enhancing Spatial Understanding in Text-to-Image Diffusion Models
by: Zhang, Gaoyang, et al.
Published: (2024)
AuthFace: Towards Authentic Blind Face Restoration with Face-oriented Generative Diffusion Prior
by: Liang, Guoqiang, et al.
Published: (2024)
Text-Aware Real-World Image Super-Resolution via Diffusion Model with Joint Segmentation Decoders
by: Hu, Qiming, et al.
Published: (2025)
Tuning-Free Adaptive Style Incorporation for Structure-Consistent Text-Driven Style Transfer
by: Ge, Yanqi, et al.
Published: (2024)
EditTransfer++: Toward Faithful and Efficient Visual-Prompt-Guided Image Editing
by: Chen, Lan, et al.
Published: (2026)
Medal S: Spatio-Textual Prompt Model for Medical Segmentation
by: Shi, Pengcheng, et al.
Published: (2025)
PromptMoG: Enhancing Diversity in Long-Prompt Image Generation via Prompt Embedding Mixture-of-Gaussian Sampling
by: Ruan, Bo-Kai, et al.
Published: (2025)
TIE: Revolutionizing Text-based Image Editing for Complex-Prompt Following and High-Fidelity Editing
by: Zhang, Xinyu, et al.
Published: (2024)
How Do Optical Flow and Textual Prompts Collaborate to Assist in Audio-Visual Semantic Segmentation?
by: Lee, Yujian, et al.
Published: (2026)
ULF-Loc: Unbiased Landmark Feature for Robust Visual Localization with 3D Gaussian Splatting
by: Gu, Yingdong, et al.
Published: (2026)
TinySR: Pruning Diffusion for Real-World Image Super-Resolution
by: Dong, Linwei, et al.
Published: (2025)
TCP:Textual-based Class-aware Prompt tuning for Visual-Language Model
by: Yao, Hantao, et al.
Published: (2023)
Relational Diffusion Distillation for Efficient Image Generation
by: Feng, Weilun, et al.
Published: (2024)
Visual and Textual Prompts in VLLMs for Enhancing Emotion Recognition
by: Wang, Zhifeng, et al.
Published: (2025)
CCEdit: Creative and Controllable Video Editing via Diffusion Models
by: Feng, Ruoyu, et al.
Published: (2023)
Generating Attribute-Aware Human Motions from Textual Prompt
by: Wang, Xinghan, et al.
Published: (2025)
APE: Agentic Prompt Enhancer for Image Generation and Editing
by: Huang, Zijian, et al.
Published: (2026)
Image-to-Brain Signal Generation for Visual Prosthesis with CLIP Guided Multimodal Diffusion Models
by: Xu, Ganxi, et al.
Published: (2025)
RORPCap: Retrieval-based Objects and Relations Prompt for Image Captioning
by: Gu, Jinjing, et al.
Published: (2025)
Exploring Multimodal Diffusion Transformers for Enhanced Prompt-based Image Editing
by: Shin, Joonghyuk, et al.
Published: (2025)
Beyond Textual CoT: Interleaved Text-Image Chains with Deep Confidence Reasoning for Image Editing
by: Zou, Zhentao, et al.
Published: (2025)
BokehDiff: Neural Lens Blur with One-Step Diffusion
by: Zhu, Chengxuan, et al.
Published: (2025)
PromptCIR: Blind Compressed Image Restoration with Prompt Learning
by: Li, Bingchen, et al.
Published: (2024)
Bridging Visual Affective Gap: Borrowing Textual Knowledge by Learning from Noisy Image-Text Pairs
by: Wu, Daiqing, et al.
Published: (2025)
DragNeXt: Rethinking Drag-Based Image Editing
by: Zhou, Yuan, et al.
Published: (2025)
VP-Hype: A Hybrid Mamba-Transformer Framework with Visual-Textual Prompting for Hyperspectral Image Classification
by: Sellam, Abdellah Zakaria, et al.
Published: (2026)
Prompt-Guided Image Editing with Masked Logit Nudging in Visual Autoregressive Models
by: El-Ghoussani, Amir, et al.
Published: (2026)
A Survey of Multimodal-Guided Image Editing with Text-to-Image Diffusion Models
by: Shuai, Xincheng, et al.
Published: (2024)
Forgedit: Text Guided Image Editing via Learning and Forgetting
by: Zhang, Shiwen, et al.
Published: (2023)
Tuning-Free Image Editing with Fidelity and Editability via Unified Latent Diffusion Model
by: Mao, Qi, et al.
Published: (2025)
TSD-SR: One-Step Diffusion with Target Score Distillation for Real-World Image Super-Resolution
by: Dong, Linwei, et al.
Published: (2024)