Staff View: Think like a Scientist: Physics-guided LLM Agent for Equation Discovery :: Library Catalog

Bibliographic Details
Main Authors:	Yang, Jianke; Venkatachalam, Ohm; Kianezhad, Mohammad; Vadgama, Sharvaree; Yu, Rose
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2602.12259

_version_	1866917293158563840
author	Yang, Jianke; Venkatachalam, Ohm; Kianezhad, Mohammad; Vadgama, Sharvaree; Yu, Rose
author_facet	Yang, Jianke; Venkatachalam, Ohm; Kianezhad, Mohammad; Vadgama, Sharvaree; Yu, Rose
contents	Explaining observed phenomena through symbolic, interpretable formulas is a fundamental goal of science. Recently, large language models (LLMs) have emerged as promising tools for symbolic equation discovery, owing to their broad domain knowledge and strong reasoning capabilities. However, most existing LLM-based systems try to guess equations directly from data, without modeling the multi-step reasoning process that scientists often follow: first inferring physical properties such as symmetries, then using these as priors to restrict the space of candidate equations. We introduce KeplerAgent, an agentic framework that explicitly follows this scientific reasoning process. The agent coordinates physics-based tools to extract intermediate structure and uses these results to configure symbolic regression engines such as PySINDy and PySR, including their function libraries and structural constraints. Across a suite of physical equation benchmarks, KeplerAgent achieves substantially higher symbolic accuracy and greater robustness to noisy data than both LLM and traditional baselines.
format	Preprint
id	arxiv_https___arxiv_org_abs_2602_12259
institution	arXiv
publishDate	2026
record_format	arxiv
title	Think like a Scientist: Physics-guided LLM Agent for Equation Discovery
topic	Artificial Intelligence; Machine Learning
url	https://arxiv.org/abs/2602.12259

Similar Items