Bibliographic Details
Main Authors: Liu, Hao; Yang, Xiao-Wen; Sehgal, Atharva; Wang, Yixin; Guo, Lan-Zhe; Li, Yu-Feng; Yue, Yisong
Format: Preprint
Published: 2026
Subjects: Artificial Intelligence
Online Access: https://arxiv.org/abs/2605.03101
author Liu, Hao
Yang, Xiao-Wen
Sehgal, Atharva
Wang, Yixin
Guo, Lan-Zhe
Li, Yu-Feng
Yue, Yisong
contents Symbolic regression (SR), the task of discovering mathematical expressions that best describe a given dataset, remains a fundamental challenge in scientific discovery. Traditional approaches, primarily based on genetic algorithms and related evolutionary methods, have proven useful but suffer from scalability and expressivity limitations. Recently, large language model (LLM)-based evolutionary search methods have been introduced into SR and show promise. However, existing LLM-based approaches typically rely on scalar evaluation metrics, such as mean squared error, as the sole source of feedback during the search process, thereby overlooking the rich information embedded in the dataset. To address this limitation, we propose a novel LLM-based evolutionary search framework that incorporates programmatic context augmentation. By enabling code-based interactions with the dataset, our method can actively perform data analysis and extract informative signals, beyond aggregated evaluation scores. We evaluate our framework on advanced benchmarks, such as LLM-SRBench, and demonstrate superior efficiency and accuracy compared to strong baselines.
format Preprint
id arxiv_https___arxiv_org_abs_2605_03101
institution arXiv
publishDate 2026
record_format arxiv
title Programmatic Context Augmentation for LLM-based Symbolic Regression
topic Artificial Intelligence
url https://arxiv.org/abs/2605.03101
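The abstract contrasts scalar feedback, such as mean squared error, with signals obtained by running code against the dataset. The sketch below illustrates that contrast only; it is not the authors' implementation. The function names (`mse`, `residual_report`, `build_context`), the binning scheme, and the text format of the context are all assumptions made for illustration.

```python
from statistics import mean


def mse(f, xs, ys):
    """Scalar feedback: mean squared error of candidate f on the dataset."""
    return mean((f(x) - y) ** 2 for x, y in zip(xs, ys))


def residual_report(f, xs, ys, n_bins=4):
    """Hypothetical data-analysis signal: mean residual (y - f(x)) per input bin.

    Shows where in the input range the candidate under- or over-predicts,
    which a single aggregated score cannot reveal.
    """
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all x are equal
    bins = [[] for _ in range(n_bins)]
    for x, y in zip(xs, ys):
        idx = min(int((x - lo) / width), n_bins - 1)  # clamp the max x into the last bin
        bins[idx].append(y - f(x))
    return [mean(b) if b else 0.0 for b in bins]


def build_context(f, xs, ys):
    """Assemble a text summary that could be added to an LLM prompt (assumed format)."""
    lines = [f"MSE: {mse(f, xs, ys):.3f}"]
    for i, r in enumerate(residual_report(f, xs, ys)):
        lines.append(f"bin {i}: mean residual {r:+.3f}")
    return "\n".join(lines)


if __name__ == "__main__":
    xs = list(range(8))
    ys = [x * x for x in xs]          # ground truth: y = x^2
    candidate = lambda x: 3 * x       # a linear guess
    print(build_context(candidate, xs, ys))
```

For the linear guess, the per-bin residuals grow with x, which hints at a missing higher-order term; the MSE alone would only report that the fit is poor.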