Staff View: Teacher-Aware Evolution of Heuristic Programs from Learned Optimization Policies :: Library Catalog

Bibliographic Details
Main Authors:	Chen, Minyu; Qin, Song; Wu, Ling-I; Xue, Jianxin; Li, Guoqiang
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2605.10634

_version_	1866910209335623680
author	Chen, Minyu; Qin, Song; Wu, Ling-I; Xue, Jianxin; Li, Guoqiang
author_facet	Chen, Minyu; Qin, Song; Wu, Ling-I; Xue, Jianxin; Li, Guoqiang
contents	LLM-based automatic heuristic design has shown promise for generating executable heuristics for combinatorial optimization, but existing methods mainly rely on delayed endpoint performance. We propose a teacher-aware evolutionary framework that uses independently trained learned optimization policies as behavioral teachers. Instead of deploying or imitating the teacher, our method queries it on states visited by candidate heuristic programs and uses its action preferences as local feedback for evolution. The resulting search discovers static executable heuristics guided by both task performance and teacher-derived behavioral signals. Experiments on scheduling, routing, and graph optimization benchmarks show that our method improves over performance-driven LLM heuristic evolution baselines while requiring no neural inference at deployment. These results suggest that learned optimization policies can be repurposed as behavioral feedback sources for automatic heuristic discovery.
format	Preprint
id	arxiv_https___arxiv_org_abs_2605_10634
institution	arXiv
publishDate	2026
record_format	arxiv
title	Teacher-Aware Evolution of Heuristic Programs from Learned Optimization Policies
topic	Artificial Intelligence
url	https://arxiv.org/abs/2605.10634

Similar Items