Bibliographic Details
Main Authors: Okada, Masahi, Sakai, Kazuki, Yoshida, Hiroaki, Okoshi, Masaki, Taniguchi, Tadahiro
Format: Preprint
Published: 2026
Subjects: Neural and Evolutionary Computing; Materials Science; Machine Learning
Online Access: https://arxiv.org/abs/2602.07002
_version_ 1866911428582047744
author Okada, Masahi
Sakai, Kazuki
Yoshida, Hiroaki
Okoshi, Masaki
Taniguchi, Tadahiro
contents We study sample-efficient molecular optimization under a limited budget of oracle evaluations. We propose MolLIBRA (MultimOdaLity and Language Integrated Bayesian and evolutionaRy optimizAtion), a genetic-algorithm-based framework that pre-ranks candidate molecules using multiple critics before oracle calls: (i) an ensemble of Gaussian process (GP) surrogates defined over multiple molecular fingerprints and (ii) CLAMP, a pretrained text-molecule-aligned encoder. The GP ensemble enables adaptive selection of task-appropriate fingerprints, while CLAMP provides a zero-shot scoring signal from task descriptions by measuring the similarity between molecular and text embeddings. On the Practical Molecular Optimization (PMO) benchmark with a budget of 1,000 evaluations (PMO-1K), MolLIBRA-L, our variant with a language-model-based candidate generator, attains the best Top-10 AUC on 14 of 22 tasks and a higher overall sum of Top-10 AUC across tasks than prior methods.
format Preprint
id arxiv_https___arxiv_org_abs_2602_07002
institution arXiv
publishDate 2026
record_format arxiv
title MolLIBRA: Genetic Molecular Optimization with Multi-Fingerprint Surrogates and Text-Molecule Aligned Critic
topic Neural and Evolutionary Computing
Materials Science
Machine Learning
url https://arxiv.org/abs/2602.07002