| Main Authors: | Gan, Yulu, Park, Sungwoo, Schubert, Alexander, Philippakis, Anthony, Alaa, Ahmed M. |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2310.00390 |
Similar Items
Mean-field Chaos Diffusion Models
by: Park, Sungwoo, et al.
Published: (2024)
InstructBooth: Instruction-following Personalized Text-to-Image Generation
by: Chae, Daewon, et al.
Published: (2023)
InstructTA: Instruction-Tuned Targeted Attack for Large Vision-Language Models
by: Wang, Xunguang, et al.
Published: (2023)
InstructEngine: Instruction-driven Text-to-Image Alignment
by: Lu, Xingyu, et al.
Published: (2025)
InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation
by: Yang, Shuai, et al.
Published: (2025)
Dr-LLaVA: Visual Instruction Tuning with Symbolic Clinical Grounding
by: Sun, Shenghuan, et al.
Published: (2024)
From Generalist to Specialist: Adapting Vision Language Models via Task-Specific Visual Instruction Tuning
by: Bai, Yang, et al.
Published: (2024)
InstructUDrag: Joint Text Instructions and Object Dragging for Interactive Image Editing
by: Yu, Haoran, et al.
Published: (2025)
UniVG: A Generalist Diffusion Model for Unified Image Generation and Editing
by: Fu, Tsu-Jui, et al.
Published: (2025)
Instructing Text-to-Image Diffusion Models via Classifier-Guided Semantic Optimization
by: Chang, Yuanyuan, et al.
Published: (2025)
Instruction-Free Tuning of Large Vision Language Models for Medical Instruction Following
by: Kang, Myeongkyun, et al.
Published: (2026)
Learning to Instruct for Visual Instruction Tuning
by: Zhou, Zhihan, et al.
Published: (2025)
Toward a Diffusion-Based Generalist for Dense Vision Tasks
by: Fan, Yue, et al.
Published: (2024)
InstructOCR: Instruction Boosting Scene Text Spotting
by: Duan, Chen, et al.
Published: (2024)
Vision Foundation Models as Generalist Tokenizers for Image Generation
by: Zheng, Anlin, et al.
Published: (2026)
InstructRestore: Region-Customized Image Restoration with Human Instructions
by: Liu, Shuaizheng, et al.
Published: (2025)
Uni-Instruct: One-step Diffusion Model through Unified Diffusion Divergence Instruction
by: Wang, Yifei, et al.
Published: (2025)
Batch-Instructed Gradient for Prompt Evolution: Systematic Prompt Optimization for Enhanced Text-to-Image Synthesis
by: Yang, Xinrui, et al.
Published: (2024)
Prompt-Tuning SAM: From Generalist to Specialist with only 2048 Parameters and 16 Training Images
by: Piater, Tristan, et al.
Published: (2025)
Image Generators are Generalist Vision Learners
by: Gabeur, Valentin, et al.
Published: (2026)
Fine Tuning Text-to-Image Diffusion Models for Correcting Anomalous Images
by: Yoo, Hyunwoo
Published: (2024)
Mitigating Dialogue Hallucination for Large Vision Language Models via Adversarial Instruction Tuning
by: Park, Dongmin, et al.
Published: (2024)
Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model
by: Zhang, Wenqi, et al.
Published: (2024)
OmniText: A Training-Free Generalist for Controllable Text-Image Manipulation
by: Gunawan, Agus, et al.
Published: (2025)
InstructBrush: Learning Attention-based Instruction Optimization for Image Editing
by: Zhao, Ruoyu, et al.
Published: (2024)
Semantic Guidance Tuning for Text-To-Image Diffusion Models
by: Kang, Hyun, et al.
Published: (2023)
Medical Vision Generalist: Unifying Medical Imaging Tasks in Context
by: Ren, Sucheng, et al.
Published: (2024)
StyleInject: Parameter Efficient Tuning of Text-to-Image Diffusion Models
by: Zhou, Mohan, et al.
Published: (2024)
RetouchIQ: MLLM Agents for Instruction-Based Image Retouching with Generalist Reward
by: Wu, Qiucheng, et al.
Published: (2026)
Instruct-Imagen: Image Generation with Multi-modal Instruction
by: Hu, Hexiang, et al.
Published: (2024)
Diffusion Model as a Generalist Segmentation Learner
by: Wang, Haoxiao, et al.
Published: (2026)
Steering Guidance for Personalized Text-to-Image Diffusion Models
by: Park, Sunghyun, et al.
Published: (2025)
Vision Generalist Model: A Survey
by: Wang, Ziyi, et al.
Published: (2025)
InstructSAM: Segment Any Instance with Any Instructions
by: Yuan, Yuqian, et al.
Published: (2026)
Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following
by: Feng, Yutong, et al.
Published: (2023)
Fair Text to Medical Image Diffusion Model with Subgroup Distribution Aligned Tuning
by: Han, Xu, et al.
Published: (2024)
Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy
by: Hou, Zhi, et al.
Published: (2025)
Deep Reward Supervisions for Tuning Text-to-Image Diffusion Models
by: Wu, Xiaoshi, et al.
Published: (2024)
DGSolver: Diffusion Generalist Solver with Universal Posterior Sampling for Image Restoration
by: Wang, Hebaixu, et al.
Published: (2025)
MMCIG: Multimodal Cover Image Generation for Text-only Documents and Its Dataset Construction via Pseudo-labeling
by: Kim, Hyeyeon, et al.
Published: (2025)