:: Library Catalog

Bibliographic Details
Main Authors:	Gan, Yulu; Park, Sungwoo; Schubert, Alexander; Philippakis, Anthony; Alaa, Ahmed M.
Format:	Preprint
Published:	2023
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2310.00390

Similar Items

Mean-field Chaos Diffusion Models
by: Park, Sungwoo, et al.
Published: (2024)

InstructBooth: Instruction-following Personalized Text-to-Image Generation
by: Chae, Daewon, et al.
Published: (2023)

InstructTA: Instruction-Tuned Targeted Attack for Large Vision-Language Models
by: Wang, Xunguang, et al.
Published: (2023)

InstructEngine: Instruction-driven Text-to-Image Alignment
by: Lu, Xingyu, et al.
Published: (2025)

InstructVLA: Vision-Language-Action Instruction Tuning from Understanding to Manipulation
by: Yang, Shuai, et al.
Published: (2025)

Dr-LLaVA: Visual Instruction Tuning with Symbolic Clinical Grounding
by: Sun, Shenghuan, et al.
Published: (2024)

From Generalist to Specialist: Adapting Vision Language Models via Task-Specific Visual Instruction Tuning
by: Bai, Yang, et al.
Published: (2024)

InstructUDrag: Joint Text Instructions and Object Dragging for Interactive Image Editing
by: Yu, Haoran, et al.
Published: (2025)

UniVG: A Generalist Diffusion Model for Unified Image Generation and Editing
by: Fu, Tsu-Jui, et al.
Published: (2025)

Instructing Text-to-Image Diffusion Models via Classifier-Guided Semantic Optimization
by: Chang, Yuanyuan, et al.
Published: (2025)

Instruction-Free Tuning of Large Vision Language Models for Medical Instruction Following
by: Kang, Myeongkyun, et al.
Published: (2026)

Learning to Instruct for Visual Instruction Tuning
by: Zhou, Zhihan, et al.
Published: (2025)

Toward a Diffusion-Based Generalist for Dense Vision Tasks
by: Fan, Yue, et al.
Published: (2024)

InstructOCR: Instruction Boosting Scene Text Spotting
by: Duan, Chen, et al.
Published: (2024)

Vision Foundation Models as Generalist Tokenizers for Image Generation
by: Zheng, Anlin, et al.
Published: (2026)

InstructRestore: Region-Customized Image Restoration with Human Instructions
by: Liu, Shuaizheng, et al.
Published: (2025)

Uni-Instruct: One-step Diffusion Model through Unified Diffusion Divergence Instruction
by: Wang, Yifei, et al.
Published: (2025)

Batch-Instructed Gradient for Prompt Evolution: Systematic Prompt Optimization for Enhanced Text-to-Image Synthesis
by: Yang, Xinrui, et al.
Published: (2024)

Prompt-Tuning SAM: From Generalist to Specialist with only 2048 Parameters and 16 Training Images
by: Piater, Tristan, et al.
Published: (2025)

Image Generators are Generalist Vision Learners
by: Gabeur, Valentin, et al.
Published: (2026)

Fine Tuning Text-to-Image Diffusion Models for Correcting Anomalous Images
by: Yoo, Hyunwoo
Published: (2024)

Mitigating Dialogue Hallucination for Large Vision Language Models via Adversarial Instruction Tuning
by: Park, Dongmin, et al.
Published: (2024)

Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model
by: Zhang, Wenqi, et al.
Published: (2024)

OmniText: A Training-Free Generalist for Controllable Text-Image Manipulation
by: Gunawan, Agus, et al.
Published: (2025)

InstructBrush: Learning Attention-based Instruction Optimization for Image Editing
by: Zhao, Ruoyu, et al.
Published: (2024)

Semantic Guidance Tuning for Text-To-Image Diffusion Models
by: Kang, Hyun, et al.
Published: (2023)

Medical Vision Generalist: Unifying Medical Imaging Tasks in Context
by: Ren, Sucheng, et al.
Published: (2024)

StyleInject: Parameter Efficient Tuning of Text-to-Image Diffusion Models
by: Zhou, Mohan, et al.
Published: (2024)

RetouchIQ: MLLM Agents for Instruction-Based Image Retouching with Generalist Reward
by: Wu, Qiucheng, et al.
Published: (2026)

Instruct-Imagen: Image Generation with Multi-modal Instruction
by: Hu, Hexiang, et al.
Published: (2024)

Diffusion Model as a Generalist Segmentation Learner
by: Wang, Haoxiao, et al.
Published: (2026)

Steering Guidance for Personalized Text-to-Image Diffusion Models
by: Park, Sunghyun, et al.
Published: (2025)

Vision Generalist Model: A Survey
by: Wang, Ziyi, et al.
Published: (2025)

InstructSAM: Segment Any Instance with Any Instructions
by: Yuan, Yuqian, et al.
Published: (2026)

Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following
by: Feng, Yutong, et al.
Published: (2023)

Fair Text to Medical Image Diffusion Model with Subgroup Distribution Aligned Tuning
by: Han, Xu, et al.
Published: (2024)

Dita: Scaling Diffusion Transformer for Generalist Vision-Language-Action Policy
by: Hou, Zhi, et al.
Published: (2025)

Deep Reward Supervisions for Tuning Text-to-Image Diffusion Models
by: Wu, Xiaoshi, et al.
Published: (2024)

DGSolver: Diffusion Generalist Solver with Universal Posterior Sampling for Image Restoration
by: Wang, Hebaixu, et al.
Published: (2025)

MMCIG: Multimodal Cover Image Generation for Text-only Documents and Its Dataset Construction via Pseudo-labeling
by: Kim, Hyeyeon, et al.
Published: (2025)