Staff View: A Unified Evaluation-Instructed Framework for Query-Dependent Prompt Optimization :: Library Catalog

Bibliographic Details
Main Authors:	Chen, Ke; Wang, Yifeng; Almosapeeh, Hassan; Wang, Haohan
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2511.19829

_version_	1866917551618916352
author	Chen, Ke; Wang, Yifeng; Almosapeeh, Hassan; Wang, Haohan
author_facet	Chen, Ke; Wang, Yifeng; Almosapeeh, Hassan; Wang, Haohan
contents	Most prompt-optimization methods refine a single static template, making them ineffective in complex and dynamic user scenarios. Existing query-dependent approaches rely on unstable textual feedback or black-box reward models, providing weak and uninterpretable optimization signals. More fundamentally, prompt quality itself lacks a unified, systematic definition, resulting in fragmented and unreliable evaluation signals. Our approach first establishes a performance-oriented, systematic, and comprehensive prompt evaluation framework. Furthermore, we develop and fine-tune an execution-free evaluator that predicts multi-dimensional quality scores directly from text. The evaluator then instructs a metric-aware optimizer that diagnoses failure modes and rewrites prompts in an interpretable, query-dependent manner. Our evaluator achieves the strongest accuracy in predicting prompt performance, and the evaluation-instructed optimization consistently surpasses both static-template and query-dependent baselines across eight datasets and three backbone models. Overall, we propose a unified, metric-grounded perspective on prompt quality and demonstrate that our evaluation-instructed optimization pipeline delivers stable, interpretable, and model-agnostic improvements across diverse tasks.
format	Preprint
id	arxiv_https___arxiv_org_abs_2511_19829
institution	arXiv
publishDate	2025
record_format	arxiv
title	A Unified Evaluation-Instructed Framework for Query-Dependent Prompt Optimization
topic	Artificial Intelligence
url	https://arxiv.org/abs/2511.19829