Bibliographic Details
Main Authors: Xiao, Jing; Chen, Xinhai; Peng, Jiaming; Wang, Qinglin; Jia, Menghan; Lai, Zhiquan; Yu, Guangping; Li, Dongsheng; Li, Tiejun; Liu, Jie
Format: Preprint
Published: 2026
Subjects: Machine Learning; Artificial Intelligence
Online Access: https://arxiv.org/abs/2602.13021
_version_ 1866911449504284672
author Xiao, Jing
Chen, Xinhai
Peng, Jiaming
Wang, Qinglin
Jia, Menghan
Lai, Zhiquan
Yu, Guangping
Li, Dongsheng
Li, Tiejun
Liu, Jie
contents Symbolic Regression (SR) aims to discover interpretable equations from observational data, with the potential to reveal underlying principles behind natural phenomena. However, existing approaches often fall into the Pseudo-Equation Trap: producing equations that fit observations well but remain inconsistent with fundamental scientific principles. A key reason is that these approaches are dominated by empirical risk minimization, lacking explicit constraints to ensure scientific consistency. To bridge this gap, we propose PG-SR, a prior-guided SR framework built upon a three-stage pipeline consisting of warm-up, evolution, and refinement. Throughout the pipeline, PG-SR introduces a prior constraint checker that explicitly encodes domain priors as executable constraint programs, and employs a Prior Annealing Constrained Evaluation (PACE) mechanism during the evolution stage to progressively steer discovery toward scientifically consistent regions. Theoretically, we prove that PG-SR reduces the Rademacher complexity of the hypothesis space, yielding tighter generalization bounds and establishing a guarantee against pseudo-equations. Experimentally, PG-SR outperforms state-of-the-art baselines across diverse domains, maintaining robustness to varying prior quality, noisy data, and data scarcity.
format Preprint
id arxiv_https___arxiv_org_abs_2602_13021
institution arXiv
publishDate 2026
record_format arxiv
title Prior-Guided Symbolic Regression: Towards Scientific Consistency in Equation Discovery
topic Machine Learning
Artificial Intelligence
url https://arxiv.org/abs/2602.13021
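
The abstract in the contents field describes encoding domain priors as executable constraint programs and progressively tightening their influence during the evolution stage. The following is a minimal sketch of that general idea, not the authors' implementation: the two priors, the linear annealing schedule, the penalty weight, the sample data, and every function name are assumptions made purely for illustration.

```python
import math
from typing import Callable, Sequence

Equation = Callable[[float], float]
Constraint = Callable[[Equation], bool]

# Hypothetical check grid: 0.0, 0.5, ..., 5.0 (wider than the observed data range).
GRID = [0.5 * i for i in range(11)]


def non_negative(f: Equation) -> bool:
    """Hypothetical prior: the modelled quantity is never negative."""
    return all(f(x) >= 0.0 for x in GRID)


def monotone_decreasing(f: Equation) -> bool:
    """Hypothetical prior: the modelled quantity never increases with x."""
    values = [f(x) for x in GRID]
    return all(b <= a + 1e-12 for a, b in zip(values, values[1:]))


def violation_rate(f: Equation, constraints: Sequence[Constraint]) -> float:
    """Fraction of prior constraints that the candidate equation violates."""
    return sum(not c(f) for c in constraints) / len(constraints)


def mse(f: Equation, xs: Sequence[float], ys: Sequence[float]) -> float:
    """Mean squared error of the candidate on the observations."""
    return sum((f(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def annealed_score(f: Equation, xs: Sequence[float], ys: Sequence[float],
                   constraints: Sequence[Constraint],
                   step: int, total_steps: int, max_weight: float = 10.0) -> float:
    """Data fit plus a constraint penalty whose weight grows linearly with step.

    The linear schedule is an assumption; lower scores are better.
    """
    weight = max_weight * step / total_steps
    return mse(f, xs, ys) + weight * violation_rate(f, constraints)


# Made-up noisy observations of a decaying quantity.
XS = [0.0, 0.5, 1.0, 1.5, 2.0]
YS = [1.0, 0.61, 0.35, 0.21, 0.20]
PRIORS = [non_negative, monotone_decreasing]


def consistent(x: float) -> float:
    """Candidate that respects both priors."""
    return math.exp(-x)


def pseudo(x: float) -> float:
    """Candidate that fits the samples closely but rises again for x > 1.8."""
    return 1.0 - 0.9 * x + 0.25 * x * x


if __name__ == "__main__":
    for step in (0, 5, 10):
        print(step,
              annealed_score(consistent, XS, YS, PRIORS, step, 10),
              annealed_score(pseudo, XS, YS, PRIORS, step, 10))
```

In this toy setup the quadratic candidate has the lower error on the five samples, so it ranks first while the penalty weight is zero. It violates the monotonicity prior outside the sampled range, so the exponential candidate overtakes it once the weight has grown.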