Bibliographic Details
Main Authors: Beinat, Matilda; Beinat, Julian; Shoaib, Mohammed; Magenti, Jorge Gomez
Format: Preprint
Published: 2024
Subjects:
Online Access: https://arxiv.org/abs/2401.05145
author Beinat, Matilda
Beinat, Julian
Shoaib, Mohammed
Magenti, Jorge Gomez
contents Projected to affect 1.6 million people in the UK by 2040 and costing £25 billion annually, dementia presents a growing challenge to society. This study, a pioneering effort to predict the translational potential of dementia research using machine learning, aims to address the slow translation of fundamental discoveries into practical applications despite dementia's significant societal and economic impact. We used the Dimensions database to extract data from 43,091 UK dementia research publications published between 1990 and 2023: metadata (authors, publication year, etc.), concepts mentioned in the paper, and the paper abstract. To prepare the data for machine learning, we applied methods such as one-hot encoding and word embeddings. We trained a CatBoost classifier to predict whether a publication will be cited in a future patent or clinical trial, and we compared several model variations. The model combining metadata, concept, and abstract embeddings performed best: for patent predictions, it reached an Area Under the Receiver Operating Characteristic Curve (AUROC) of 0.84 and 77.17% accuracy; for clinical trial predictions, an AUROC of 0.81 and 75.11% accuracy. The results indicate that integrating machine learning into current research methodologies can uncover overlooked publications, expediting the identification of promising research and potentially transforming dementia research by predicting real-world impact and guiding translational strategies.
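The abstract names one-hot encoding as a preparation step and reports AUROC and accuracy as evaluation metrics. The following is a minimal sketch of those three pieces in plain Python. It is not the authors' pipeline and does not use CatBoost or the Dimensions data. The function names, the example categories, and the toy labels and scores are all hypothetical and serve only to illustrate the definitions.

```python
from typing import Dict, List, Sequence


def one_hot(values: Sequence[str]) -> List[List[int]]:
    """One-hot encode a categorical column (categories sorted for a stable order)."""
    categories = sorted(set(values))
    index: Dict[str, int] = {c: i for i, c in enumerate(categories)}
    return [[1 if index[v] == j else 0 for j in range(len(categories))]
            for v in values]


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outscores a random
    negative, with ties counted as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of items whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


if __name__ == "__main__":
    # Hypothetical toy data: 1 = cited in a patent, 0 = not cited.
    toy_labels = [1, 0, 1, 0, 0, 1]
    toy_scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4]
    print(one_hot(["journal", "preprint", "journal"]))
    print(round(auroc(toy_labels, toy_scores), 4))
    print(round(accuracy(toy_labels, toy_scores), 4))
```

AUROC depends only on how the scores rank positives against negatives, whereas accuracy depends on the chosen threshold (0.5 here, an assumption). This is why a record like this one reports the two figures separately.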
format Preprint
id arxiv_https___arxiv_org_abs_2401_05145
institution arXiv
publishDate 2024
record_format arxiv
title Machine Learning to Promote Translational Research: Predicting Patent and Clinical Trial Inclusion in Dementia Research
topic Machine Learning
url https://arxiv.org/abs/2401.05145