| Main Authors: | Swami, Kunal; Chittersu, Raghu; Adlinge, Pranav; Irny, Rajeev; Doodekula, Shashavali; Shukla, Alok |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | Computer Vision and Pattern Recognition; Human-Computer Interaction |
| Online Access: | https://arxiv.org/abs/2502.10258 |
| _version_ | 1866908225481211904 |
|---|---|
| author | Swami, Kunal; Chittersu, Raghu; Adlinge, Pranav; Irny, Rajeev; Doodekula, Shashavali; Shukla, Alok |
| author_facet | Swami, Kunal; Chittersu, Raghu; Adlinge, Pranav; Irny, Rajeev; Doodekula, Shashavali; Shukla, Alok |
| contents | We present PromptArtisan, a groundbreaking approach to multi-instruction image editing that achieves remarkable results in a single pass, eliminating the need for time-consuming iterative refinement. Our method empowers users to provide multiple editing instructions, each associated with a specific mask within the image. This flexibility allows for complex edits involving mask intersections or overlaps, enabling the realization of intricate and nuanced image transformations. PromptArtisan leverages a pre-trained InstructPix2Pix model in conjunction with a novel Complete Attention Control Mechanism (CACM). This mechanism ensures precise adherence to user instructions, granting fine-grained control over the editing process. Furthermore, our approach is zero-shot, requiring no additional training, and boasts improved processing complexity compared to traditional iterative methods. By seamlessly integrating multi-instruction capabilities, single-pass efficiency, and complete attention control, PromptArtisan unlocks new possibilities for creative and efficient image editing workflows, catering to both novice and expert users alike. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2502_10258 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| title | PromptArtisan: Multi-instruction Image Editing in Single Pass with Complete Attention Control |
| topic | Computer Vision and Pattern Recognition; Human-Computer Interaction |
| url | https://arxiv.org/abs/2502.10258 |