Staff View: Programmatic Context Augmentation for LLM-based Symbolic Regression :: Library Catalog

Bibliographic Details
Main Authors:	Liu, Hao; Yang, Xiao-Wen; Sehgal, Atharva; Wang, Yixin; Guo, Lan-Zhe; Li, Yu-Feng; Yue, Yisong
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2605.03101

_version_	1866910190718156800
author	Liu, Hao; Yang, Xiao-Wen; Sehgal, Atharva; Wang, Yixin; Guo, Lan-Zhe; Li, Yu-Feng; Yue, Yisong
author_facet	Liu, Hao; Yang, Xiao-Wen; Sehgal, Atharva; Wang, Yixin; Guo, Lan-Zhe; Li, Yu-Feng; Yue, Yisong
contents	Symbolic regression (SR), the task of discovering mathematical expressions that best describe a given dataset, remains a fundamental challenge in scientific discovery. Traditional approaches, primarily based on genetic algorithms and related evolutionary methods, have proven useful but suffer from scalability and expressivity limitations. Recently, large language model (LLM)-based evolutionary search methods have been introduced into SR and show promise. However, existing LLM-based approaches typically rely on scalar evaluation metrics, such as mean squared error, as the sole source of feedback during the search process, thereby overlooking the rich information embedded in the dataset. To address this limitation, we propose a novel LLM-based evolutionary search framework that incorporates programmatic context augmentation. By enabling code-based interactions with the dataset, our method can actively perform data analysis and extract informative signals, beyond aggregated evaluation scores. We evaluate our framework on advanced benchmarks, such as LLM-SRBench, and demonstrate superior efficiency and accuracy compared to strong baselines.
format	Preprint
id	arxiv_https___arxiv_org_abs_2605_03101
institution	arXiv
publishDate	2026
record_format	arxiv
title	Programmatic Context Augmentation for LLM-based Symbolic Regression
topic	Artificial Intelligence
url	https://arxiv.org/abs/2605.03101
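
The contrast drawn in the abstract, between a single scalar score and code-based analysis of the dataset, can be illustrated with a small sketch. This is not the authors' implementation: the function names (`mse`, `pearson`, `residual_report`), the toy dataset, and the choice of residual correlation as the "informative signal" are all assumptions made for illustration only.

```python
import math
import random


def mse(y_true, y_pred):
    """Mean squared error: the scalar feedback the abstract says is typical."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def residual_report(columns, y_true, y_pred):
    """Hypothetical richer feedback: the MSE plus the correlation of the
    residuals with each input variable, hinting at which variable the
    candidate expression fails to account for."""
    residuals = [a - b for a, b in zip(y_true, y_pred)]
    report = {"mse": mse(y_true, y_pred)}
    for name, values in columns.items():
        report[f"corr(residual, {name})"] = round(pearson(values, residuals), 3)
    return report


# Toy data (assumed): the true relation is y = 2*x1 + x2**2.
rng = random.Random(0)
x1 = [rng.uniform(-1, 1) for _ in range(200)]
x2 = [rng.uniform(0, 2) for _ in range(200)]
y = [2 * a + b ** 2 for a, b in zip(x1, x2)]

# A candidate expression that omits the x2 term entirely.
candidate = [2 * a for a in x1]

report = residual_report({"x1": x1, "x2": x2}, y, candidate)
for key, value in report.items():
    print(key, value)
```

In this toy setting the MSE alone only says the candidate is inaccurate, while the residual correlations point toward the omitted variable. A text summary of this kind is one plausible form of context that could be passed back to an LLM during search.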

Similar Items