Staff View: JudgeFlow: Agentic Workflow Optimization via Block Judge :: Library Catalog

Bibliographic Details
Main Authors:	Ma, Zihan; Zhao, Zhikai; Hua, Chuanbo; Berto, Federico; Park, Jinkyoo
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2601.07477

_version_	1866911415926784000
author	Ma, Zihan; Zhao, Zhikai; Hua, Chuanbo; Berto, Federico; Park, Jinkyoo
author_facet	Ma, Zihan; Zhao, Zhikai; Hua, Chuanbo; Berto, Federico; Park, Jinkyoo
contents	Optimizing LLM-based agentic workflows is challenging for scaling AI capabilities. Current methods rely on coarse, end-to-end evaluation signals and lack fine-grained signals on where to refine, often resulting in inefficient or low-impact modifications. To address these limitations, we propose JudgeFlow, an Evaluation-Judge-Optimization-Update pipeline. We incorporate reusable, configurable logic blocks into agentic workflows to capture fundamental forms of logic. On top of this abstraction, we design a dedicated Judge module that inspects execution traces, particularly those of failed runs, and assigns rank-based responsibility scores to problematic blocks. These fine-grained diagnostic signals are then leveraged by an LLM-based optimizer, which focuses modifications on the most problematic block in the workflow. Our approach improves sample efficiency, enhances interpretability through block-level diagnostics, and provides a scalable foundation for automating increasingly complex agentic workflows. We evaluate JudgeFlow on mathematical reasoning and code generation benchmarks, where JudgeFlow achieves superior performance and efficiency compared to existing methods.
format	Preprint
id	arxiv_https___arxiv_org_abs_2601_07477
institution	arXiv
publishDate	2026
record_format	arxiv
title	JudgeFlow: Agentic Workflow Optimization via Block Judge
topic	Artificial Intelligence
url	https://arxiv.org/abs/2601.07477
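
The abstract in the contents field describes a Judge that assigns rank-based responsibility scores to blocks on failed runs, and an optimizer that then targets the most problematic block. The record does not say how the ranks are combined, so the following is only a minimal sketch under assumptions: the block names, the function names, and the Borda-style aggregation (top rank earns the most points) are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def aggregate_responsibility(rankings):
    """Sum Borda-style points per block over all failed runs.

    `rankings` is a list of per-run rankings; each ranking lists block
    names from most to least responsible. This scoring rule is an
    assumption for illustration, not the paper's method.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, block in enumerate(ranking):
            # The top-ranked block earns n points, the last earns 1.
            scores[block] += n - position
    return dict(scores)


def most_problematic_block(rankings):
    """Return the block with the highest total score, or None if empty."""
    scores = aggregate_responsibility(rankings)
    if not scores:
        return None
    # Ties are broken alphabetically so the result is deterministic.
    return min(scores, key=lambda block: (-scores[block], block))


# Hypothetical judge output for three failed runs of a three-block workflow.
failed_run_rankings = [
    ["verify", "plan", "solve"],
    ["solve", "verify", "plan"],
    ["verify", "solve", "plan"],
]

target = most_problematic_block(failed_run_rankings)
print(target)
```

In this toy input the `verify` block accumulates the highest score, so an optimizer following the abstract's description would concentrate its modification there rather than editing the whole workflow.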
