Bibliographic details
Main authors: Prevot, Emma; Toogood, Rory; Pagani, Filippo; Kirk, Paul D. W.
Format: Preprint
Published: 2024
Subjects: Computation, Applications
Online access: https://arxiv.org/abs/2411.19262
_version_ 1866916498456444928
author Prevot, Emma
Toogood, Rory
Pagani, Filippo
Kirk, Paul D. W.
author_facet Prevot, Emma
Toogood, Rory
Pagani, Filippo
Kirk, Paul D. W.
contents Cluster analyses of high-dimensional data are often hampered by the presence of large numbers of variables that do not provide relevant information, as well as the perennial issue of choosing an appropriate number of clusters. These challenges are frequently encountered when analysing 'omics datasets, such as in molecular precision medicine, where a key goal is to identify disease subtypes and the biomarkers that define them. Here we introduce an annealed variational Bayes algorithm for fitting high-dimensional mixture models while performing variable selection. Our algorithm is scalable and computationally efficient, and we provide an open source Python implementation, VBVarSel. In a range of simulated and real biomedical examples, we show that VBVarSel outperforms the current state of the art, and demonstrate its use for cancer subtyping and biomarker discovery.
format Preprint
id arxiv_https___arxiv_org_abs_2411_19262
institution arXiv
publishDate 2024
record_format arxiv
title Annealed variational mixtures for disease subtyping and biomarker discovery
topic Computation
Applications
url https://arxiv.org/abs/2411.19262