Staff View: REINFORCE-ING Chemical Language Models for Drug Discovery :: Library Catalog

Bibliographic Details
Main Authors:	Thomas, Morgan; Bou, Albert; Gómez-Tamayo, Jose Carlos; Tresadern, Gary; Ahmad, Mazen; De Fabritiis, Gianni
Format:	Preprint
Published:	2025
Subjects:	Machine Learning
Online Access:	https://arxiv.org/abs/2501.15971

_version_	1866915597929938944
author	Thomas, Morgan; Bou, Albert; Gómez-Tamayo, Jose Carlos; Tresadern, Gary; Ahmad, Mazen; De Fabritiis, Gianni
author_facet	Thomas, Morgan; Bou, Albert; Gómez-Tamayo, Jose Carlos; Tresadern, Gary; Ahmad, Mazen; De Fabritiis, Gianni
contents	Chemical language models, combined with reinforcement learning (RL), have shown significant promise to efficiently traverse large chemical spaces for drug discovery. However, the performance of various RL algorithms and their best practices for practical drug discovery are still unclear. Here, starting from the principles of the REINFORCE algorithm, we investigate the effect of different components from RL theory including experience replay, hill-climbing, baselines to reduce variance, and alternative reward shaping. We propose a new regularization method more aligned to REINFORCE than current standard practices, and demonstrate how RL hyperparameters can be fine-tuned for effectiveness and efficiency. Lastly, we apply our learnings to practical drug discovery by demonstrating enhanced learning efficiency on frontier binding affinity models by using Boltz2 as a reward model. We share our RL models used in the ACEGEN repository, and hope the experiments here act as a guide to researchers applying RL to chemical language models for drug discovery.
format	Preprint
id	arxiv_https___arxiv_org_abs_2501_15971
institution	arXiv
publishDate	2025
record_format	arxiv
title	REINFORCE-ING Chemical Language Models for Drug Discovery
topic	Machine Learning
url	https://arxiv.org/abs/2501.15971

Similar Items