Vista Equipo: :: Library Catalog

Guardado en:

Detalles Bibliográficos
Autores principales:	Roupassov-Ruiz, Anton, Zuo, Yiyang
Formato:	Preprint
Publicado:	2026
Materias:	Machine Learning
Acceso en línea:	https://arxiv.org/abs/2601.04365
Etiquetas:	Agregar Etiqueta Sin Etiquetas, Sea el primero en etiquetar este registro!

_version_	1866912809558736896
author	Roupassov-Ruiz, Anton Zuo, Yiyang
author_facet	Roupassov-Ruiz, Anton Zuo, Yiyang
contents	In evolutionary reinforcement learning tasks (ERL), agent policies are often encoded as small artificial neural networks (NERL). Such representations lack explicit modular structure, limiting behavioral interpretation. We investigate whether programmatic policies (PERL), implemented as soft, differentiable decision lists (SDDL), can match the performance of NERL. To support reproducible evaluation, we provide the first fully specified and open-source reimplementation of the classic 1992 Artificial Life (ALife) ERL testbed. We conduct a rigorous survival analysis across 4000 independent trials utilizing Kaplan-Meier curves and Restricted Mean Survival Time (RMST) metrics absent in the original study. We find a statistically significant difference in survival probability between PERL and NERL. PERL agents survive on average 201.69 steps longer than NERL agents. Moreover, SDDL agents using learning alone (no evolution) survive on average 73.67 steps longer than neural agents using both learning and evaluation. These results demonstrate that programmatic policies can exceed the survival performance of neural policies in ALife.
format	Preprint
id	arxiv_https___arxiv_org_abs_2601_04365
institution	arXiv
publishDate	2026
record_format	arxiv
spellingShingle	Survival Dynamics of Neural and Programmatic Policies in Evolutionary Reinforcement Learning Roupassov-Ruiz, Anton Zuo, Yiyang Machine Learning In evolutionary reinforcement learning tasks (ERL), agent policies are often encoded as small artificial neural networks (NERL). Such representations lack explicit modular structure, limiting behavioral interpretation. We investigate whether programmatic policies (PERL), implemented as soft, differentiable decision lists (SDDL), can match the performance of NERL. To support reproducible evaluation, we provide the first fully specified and open-source reimplementation of the classic 1992 Artificial Life (ALife) ERL testbed. We conduct a rigorous survival analysis across 4000 independent trials utilizing Kaplan-Meier curves and Restricted Mean Survival Time (RMST) metrics absent in the original study. We find a statistically significant difference in survival probability between PERL and NERL. PERL agents survive on average 201.69 steps longer than NERL agents. Moreover, SDDL agents using learning alone (no evolution) survive on average 73.67 steps longer than neural agents using both learning and evaluation. These results demonstrate that programmatic policies can exceed the survival performance of neural policies in ALife.
title	Survival Dynamics of Neural and Programmatic Policies in Evolutionary Reinforcement Learning
topic	Machine Learning
url	https://arxiv.org/abs/2601.04365

Ejemplares similares