Library Catalog

Bibliographic Details
Main Authors:	Vemishetty, Sujith; Arora, Advitiya; Sharma, Anupama
Format:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2507.08039

Similar Items

Structured Captions Improve Prompt Adherence in Text-to-Image Models (Re-LAION-Caption 19M)
by: Merchant, Nicholas, et al.
Published: (2025)

Image Captions are Natural Prompts for Text-to-Image Models
by: Lei, Shiye, et al.
Published: (2023)

Naïve PAINE: Lightweight Text-to-Image Generation Improvement with Prompt Evaluation
by: Kim, Joong Ho, et al.
Published: (2026)

Prompt Optimizer of Text-to-Image Diffusion Models for Abstract Concept Understanding
by: Fan, Zezhong, et al.
Published: (2024)

Searching for internal symbols underlying deep learning
by: Lee, Jung H., et al.
Published: (2024)

Text-to-Image Diffusion Models Cannot Count, and Prompt Refinement Cannot Help
by: Guo, Xuyang, et al.
Published: (2025)

Prompt-Based Safety Guidance Is Ineffective for Unlearned Text-to-Image Diffusion Models
by: Shin, Jiwoo, et al.
Published: (2025)

PALADIN: Robust Neural Fingerprinting for Text-to-Image Diffusion Models
by: L, Murthy, et al.
Published: (2025)

Minority-Focused Text-to-Image Generation via Prompt Optimization
by: Um, Soobin, et al.
Published: (2024)

Learning Hyperspectral Images with Curated Text Prompts for Efficient Multimodal Alignment
by: Chatterjee, Abhiroop, et al.
Published: (2025)

Optimizing Negative Prompts for Enhanced Aesthetics and Fidelity in Text-To-Image Generation
by: Ogezi, Michael, et al.
Published: (2024)

Robust Box Prompt based SAM for Medical Image Segmentation
by: Huang, Yuhao, et al.
Published: (2024)

Towards Adversarially Robust Vision-Language Models: Insights from Design Choices and Prompt Formatting Techniques
by: Bhagwatkar, Rishika, et al.
Published: (2024)

One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation Using a Single Prompt
by: Liu, Tao, et al.
Published: (2025)

Towards Understanding and Quantifying Uncertainty for Text-to-Image Generation
by: Franchi, Gianni, et al.
Published: (2024)

ProTIP: Probabilistic Robustness Verification on Text-to-Image Diffusion Models against Stochastic Perturbation
by: Zhang, Yi, et al.
Published: (2024)

AutoRubric-T2I: Robust Rule-Based Reward Model for Text-to-Image Alignment
by: Kao, Kuei-Chun, et al.
Published: (2026)

Self-Evaluation Unlocks Any-Step Text-to-Image Generation
by: Yu, Xin, et al.
Published: (2025)

Towards Effective Usage of Human-Centric Priors in Diffusion Models for Text-based Human Image Generation
by: Wang, Junyan, et al.
Published: (2024)

Automated Black-box Prompt Engineering for Personalized Text-to-Image Generation
by: He, Yutong, et al.
Published: (2024)

How to Train your Text-to-Image Model: Evaluating Design Choices for Synthetic Training Captions
by: Brack, Manuel, et al.
Published: (2025)

EditInspector: A Benchmark for Evaluation of Text-Guided Image Edits
by: Yosef, Ron, et al.
Published: (2025)

PQPP: A Joint Benchmark for Text-to-Image Prompt and Query Performance Prediction
by: Poesina, Eduard, et al.
Published: (2024)

ViPCap: Retrieval Text-Based Visual Prompts for Lightweight Image Captioning
by: Kim, Taewhan, et al.
Published: (2024)

Robust Disaster Assessment from Aerial Imagery Using Text-to-Image Synthetic Data
by: Kalluri, Tarun, et al.
Published: (2024)

CLIP-Inspector: Model-Level Backdoor Detection for Prompt-Tuned CLIP via OOD Trigger Inversion
by: Jindal, Akshit, et al.
Published: (2026)

Text-Aware Image Restoration with Diffusion Models
by: Min, Jaewon, et al.
Published: (2025)

SafeRedir: Prompt Embedding Redirection for Robust Unlearning in Image Generation Models
by: Liu, Renyang, et al.
Published: (2026)

ShapeWords: Guiding Text-to-Image Synthesis with 3D Shape-Aware Prompts
by: Petrov, Dmitry, et al.
Published: (2024)

Towards Resolving Optimization Conflicts Between Image- and Text-Based Person Re-Identification
by: Kvanchiani, Karina, et al.
Published: (2026)

X-Prompt: Towards Universal In-Context Image Generation in Auto-Regressive Vision Language Foundation Models
by: Sun, Zeyi, et al.
Published: (2024)

Tiled Prompts: Overcoming Prompt Misguidance in Image and Video Super-Resolution
by: Kim, Bryan Sangwoo, et al.
Published: (2026)

Editing Massive Concepts in Text-to-Image Diffusion Models
by: Xiong, Tianwei, et al.
Published: (2024)

ECoDepth: Effective Conditioning of Diffusion Models for Monocular Depth Estimation
by: Patni, Suraj, et al.
Published: (2024)

GRADEO: Towards Human-Like Evaluation for Text-to-Video Generation via Multi-Step Reasoning
by: Mou, Zhun, et al.
Published: (2025)

Evaluating Text-to-Visual Generation with Image-to-Text Generation
by: Lin, Zhiqiu, et al.
Published: (2024)

Robustness Evaluation for Video Models with Reinforcement Learning
by: Babu, Ashwin Ramesh, et al.
Published: (2025)

Navigating Text-To-Image Customization: From LyCORIS Fine-Tuning to Model Evaluation
by: Yeh, Shih-Ying, et al.
Published: (2023)

StressDream: Steering Video World Models for Robust Policy Evaluation and Improvement
by: Seo, Junwon, et al.
Published: (2026)

Test-Time Alignment of Text-to-Image Diffusion Models via Null-Text Embedding Optimisation
by: Kim, Taehoon, et al.
Published: (2025)