Bibliographic details
Main authors: Cui, Benlei, Huang, Bukun, Ye, Zhizeng, Dong, Xuemei, Chen, Tuo, Xue, Hui, Yang, Dingkang, Huang, Longtao, Tang, Jingqun, Hong, Haiwen
Format: Preprint
Published: 2026
Subjects:
Online access: https://arxiv.org/abs/2602.23783
_version_ 1866908915410665472
author Cui, Benlei
Huang, Bukun
Ye, Zhizeng
Dong, Xuemei
Chen, Tuo
Xue, Hui
Yang, Dingkang
Huang, Longtao
Tang, Jingqun
Hong, Haiwen
contents Text-to-image (T2I) diffusion models lack an efficient mechanism for early quality assessment, which leads to costly trial and error in multi-generation scenarios such as prompt iteration, agent-based generation, and Flow-GRPO. We reveal a strong correlation between early diffusion cross-attention distributions and final image quality. Based on this finding, we introduce Diffusion Probe, a framework that uses internal cross-attention maps as predictive signals. We design a lightweight predictor that maps statistical properties of early-stage cross-attention, extracted from the initial denoising steps, to the final image's overall quality. This enables accurate forecasting of image quality across diverse evaluation metrics long before full synthesis is complete. We validate Diffusion Probe across a wide range of settings. On multiple T2I models, across early denoising windows, resolutions, and quality metrics, it achieves strong correlation (PCC > 0.7) and high classification performance (AUC-ROC > 0.9). This reliability translates into practical gains. By enabling early quality-aware decisions in workflows such as prompt optimization, seed selection, and accelerated RL training, the probe supports more targeted sampling and avoids computation on low-potential generations. This reduces computational overhead while improving final output quality. Diffusion Probe is model-agnostic, efficient, and broadly applicable, offering a practical solution for improving T2I generation efficiency through early quality prediction.
format Preprint
id arxiv_https___arxiv_org_abs_2602_23783
institution arXiv
publishDate 2026
record_format arxiv
title Diffusion Probe: Generated Image Result Prediction Using CNN Probes
topic Computer Vision and Pattern Recognition
url https://arxiv.org/abs/2602.23783
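note The abstract describes summarising early cross-attention maps into statistics, scoring them with a lightweight predictor, and reporting PCC and AUC-ROC. The sketch below illustrates those pieces with NumPy only. It is not the authors' implementation: the four statistics, the function names (attention_stats, pearson, auc_roc, pick_seeds), and the assumed map shape (tokens, height, width) are all hypothetical choices made for illustration, and the predictor itself is left out.

```python
import numpy as np


def attention_stats(attn):
    """Summarise one cross-attention map of shape (tokens, height, width).

    The four statistics are illustrative assumptions, not the paper's features.
    """
    flat = attn.reshape(attn.shape[0], -1)
    # Normalise each token's map into a spatial probability distribution.
    probs = flat / (flat.sum(axis=1, keepdims=True) + 1e-12)
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    return np.array([
        entropy.mean(),           # how diffuse the attention is on average
        probs.max(axis=1).mean(), # average peak concentration per token
        flat.mean(),              # overall attention magnitude
        flat.var(),               # spread of raw attention values
    ])


def pearson(x, y):
    """Pearson correlation coefficient (PCC) between two 1-D sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))


def auc_roc(scores, labels):
    """AUC-ROC as the chance that a random positive outranks a random negative.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (pos.size * neg.size))


def pick_seeds(predicted, keep):
    """Return indices of the `keep` highest predicted scores, best first."""
    order = np.argsort(-np.asarray(predicted, dtype=float), kind="stable")
    return order[:keep].tolist()
```

In a seed-selection workflow of the kind the abstract mentions, one would presumably compute attention_stats after the first few denoising steps for each seed, score the features with a trained predictor, and continue sampling only for the indices returned by pick_seeds.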