Bibliographic Details
Main Authors: Bolton, Elliot; Venigalla, Abhinav; Yasunaga, Michihiro; Hall, David; Xiong, Betty; Lee, Tony; Daneshjou, Roxana; Frankle, Jonathan; Liang, Percy; Carbin, Michael; Manning, Christopher D.
Format: Preprint
Published: 2024
Subjects: Computation and Language; Artificial Intelligence
Online Access: https://arxiv.org/abs/2403.18421
author Bolton, Elliot
Venigalla, Abhinav
Yasunaga, Michihiro
Hall, David
Xiong, Betty
Lee, Tony
Daneshjou, Roxana
Frankle, Jonathan
Liang, Percy
Carbin, Michael
Manning, Christopher D.
contents Models such as GPT-4 and Med-PaLM 2 have demonstrated impressive performance on a wide variety of biomedical NLP tasks. However, these models have hundreds of billions of parameters, are computationally expensive to run, require users to send their input data over the internet, and are trained on unknown data sources. Can smaller, more targeted models compete? To address this question, we build and release BioMedLM, a 2.7 billion parameter GPT-style autoregressive model trained exclusively on PubMed abstracts and full articles. When fine-tuned, BioMedLM can produce strong multiple-choice biomedical question-answering results competitive with much larger models, such as achieving a score of 57.3% on MedMCQA (dev) and 69.0% on the MMLU Medical Genetics exam. BioMedLM can also be fine-tuned to produce useful answers to patient questions on medical topics. This demonstrates that smaller models can potentially serve as transparent, privacy-preserving, economical and environmentally friendly foundations for particular NLP applications, such as in biomedicine. The model is available on the Hugging Face Hub: https://huggingface.co/stanford-crfm/BioMedLM.
format Preprint
id arxiv_https___arxiv_org_abs_2403_18421
institution arXiv
publishDate 2024
record_format arxiv
title BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text
topic Computation and Language
Artificial Intelligence
url https://arxiv.org/abs/2403.18421