Bibliographic Details
Main Authors: Bigaud, Nathan; Cabeli, Vincent; Gürel, Meltem; Pignet, Arthur; Klein, John; Wainrib, Gilles; Durand, Eric
Format: Preprint
Published: 2025
Subjects:
Online Access: https://arxiv.org/abs/2508.16315
_version_ 1866911120470573056
author Bigaud, Nathan
Cabeli, Vincent
Gürel, Meltem
Pignet, Arthur
Klein, John
Wainrib, Gilles
Durand, Eric
contents While large language models (LLMs) are rapidly advancing scientific research, they continue to struggle with core biological reasoning tasks essential for translational and biomedical discovery. To address this limitation, we created and curated eight comprehensive benchmark datasets comprising over 300,000 verifiable question-and-answer pairs, each targeting critical challenges in drug discovery including target druggability, modality suitability, and drug perturbation effects. Using this resource, we developed the OwkinZero models by post-training open-source LLMs through a Reinforcement Learning from Verifiable Rewards strategy. Our results demonstrate that specialized 8-32B OwkinZero models substantially outperform larger, state-of-the-art commercial LLMs on these biological benchmarks. Remarkably, we uncover evidence of a key aspect of generalization: specialist models trained on a single task consistently outperform their base models on previously unseen tasks. This generalization effect is further amplified in our comprehensive OwkinZero models, which were trained on a mixture of datasets and achieve even broader cross-task improvements. This study represents a significant step toward addressing the biological reasoning blind spot in current LLMs, demonstrating that targeted reinforcement learning on carefully curated data can unlock generalizable performance in specialized models, thereby accelerating AI-driven biological discovery.
format Preprint
id arxiv_https___arxiv_org_abs_2508_16315
institution arXiv
publishDate 2025
record_format arxiv
title OwkinZero: Accelerating Biological Discovery with AI
topic Machine Learning
url https://arxiv.org/abs/2508.16315