Internformat: :: Library Catalog

Gespeichert in:

Bibliographische Detailangaben
Hauptverfasser:	Bose, Avishek, Cong, Guojing
Format:	Preprint
Veröffentlicht:	2024
Schlagworte:	Machine Learning
Online-Zugang:	https://arxiv.org/abs/2410.09280
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

_version_	1866929538540240896
author	Bose, Avishek Cong, Guojing
author_facet	Bose, Avishek Cong, Guojing
contents	Graph neural networks (GNNs) have emerged as one of the most effective ML techniques for drug effect prediction from drug molecular graphs. Despite having immense potential, GNN models lack performance when using datasets that contain high-dimensional, asymmetrically co-occurrent drug effects as targets with complex correlations between them. Training individual learning models for each drug effect and incorporating every prediction result for a wide spectrum of drug effects are impractical. Therefore, an opportunity exists to address this challenge as multitarget prediction problems and predict all drug effects at a time. We developed standard and hybrid GNNs to perform two separate tasks: multiregression for continuous values and multilabel classification for categorical values contained in our datasets. Because multilabel classification makes the target data even more sparse and introduces asymmetric label co-occurrence, learning these models becomes difficult and heavily impacts the GNN's performance. To address these challenges, we propose a new data oversampling technique to improve multilabel classification performances on all the given imbalanced molecular graph datasets. Using the technique, we improve the data imbalance ratio of the drug effects while protecting the datasets' integrity. Finally, we evaluate the multilabel classification performance of the best-performing hybrid GNN model on all the oversampled datasets obtained from the proposed oversampling technique. In all the evaluation metrics (i.e., precision, recall, and F1 score), this model significantly outperforms other ML models, including GNN models when they are trained on the original datasets or oversampled datasets with MLSMOTE, which is a well-known oversampling technique.
format	Preprint
id	arxiv_https___arxiv_org_abs_2410_09280
institution	arXiv
publishDate	2024
record_format	arxiv
spellingShingle	Predicting Drug Effects from High-Dimensional, Asymmetric Drug Datasets by Using Graph Neural Networks: A Comprehensive Analysis of Multitarget Drug Effect Prediction Bose, Avishek Cong, Guojing Machine Learning Graph neural networks (GNNs) have emerged as one of the most effective ML techniques for drug effect prediction from drug molecular graphs. Despite having immense potential, GNN models lack performance when using datasets that contain high-dimensional, asymmetrically co-occurrent drug effects as targets with complex correlations between them. Training individual learning models for each drug effect and incorporating every prediction result for a wide spectrum of drug effects are impractical. Therefore, an opportunity exists to address this challenge as multitarget prediction problems and predict all drug effects at a time. We developed standard and hybrid GNNs to perform two separate tasks: multiregression for continuous values and multilabel classification for categorical values contained in our datasets. Because multilabel classification makes the target data even more sparse and introduces asymmetric label co-occurrence, learning these models becomes difficult and heavily impacts the GNN's performance. To address these challenges, we propose a new data oversampling technique to improve multilabel classification performances on all the given imbalanced molecular graph datasets. Using the technique, we improve the data imbalance ratio of the drug effects while protecting the datasets' integrity. Finally, we evaluate the multilabel classification performance of the best-performing hybrid GNN model on all the oversampled datasets obtained from the proposed oversampling technique. In all the evaluation metrics (i.e., precision, recall, and F1 score), this model significantly outperforms other ML models, including GNN models when they are trained on the original datasets or oversampled datasets with MLSMOTE, which is a well-known oversampling technique.
title	Predicting Drug Effects from High-Dimensional, Asymmetric Drug Datasets by Using Graph Neural Networks: A Comprehensive Analysis of Multitarget Drug Effect Prediction
topic	Machine Learning
url	https://arxiv.org/abs/2410.09280

Ähnliche Einträge