Bibliographic Details
Main Authors: Gu, Mengyang; Xu, Yanxun
Format: Preprint
Published: 2017
Subjects: Methodology
Online Access: https://arxiv.org/abs/1711.11501
author Gu, Mengyang
Xu, Yanxun
contents The Gaussian stochastic process (GaSP) has been widely used as a prior over functions because of its flexibility and tractability in modeling. However, evaluating the likelihood costs $O(n^3)$ operations, where $n$ is the number of observed points in the process, because it requires inverting the covariance matrix. This bottleneck prevents GaSP from being widely used with large-scale data. We propose a general class of nonseparable GaSP models for multiple functional observations with a fast and exact algorithm, in which computing the likelihood takes linear time ($O(n)$) and requires no approximation. We show that the commonly used linear regression and separable models are special cases of the proposed nonseparable GaSP model. In an epigenetic application, the proposed nonseparable GaSP model accurately predicts genome-wide DNA methylation levels and compares favorably with alternative methods such as linear regression, random forest, and localized Kriging. The fast algorithm is implemented in the ${\tt FastGaSP}$ R package on CRAN.
format Preprint
id arxiv_https___arxiv_org_abs_1711_11501
institution arXiv
publishDate 2017
record_format arxiv
title Fast Nonseparable Gaussian Stochastic Process with Application to Methylation Level Interpolation
topic Methodology
url https://arxiv.org/abs/1711.11501
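
The abstract contrasts the $O(n^3)$ cost of a dense likelihood evaluation with an exact linear-time alternative. This record does not describe the paper's algorithm or the FastGaSP interface, so the sketch below is only an illustration under stated assumptions: a zero-mean process with an exponential covariance, observed without noise at sorted, distinct inputs. In that special case the process is Markov, and the likelihood factorizes into one-step conditionals. The function names and the parameters `sigma2` and `gamma` are hypothetical and are not taken from the paper or the package.

```python
import numpy as np


def loglik_dense(x, y, sigma2=1.0, gamma=1.0):
    """Reference O(n^3) log-likelihood of y ~ N(0, K).

    Assumed kernel (not from the source): K_ij = sigma2 * exp(-|x_i - x_j| / gamma).
    """
    n = len(x)
    K = sigma2 * np.exp(-np.abs(x[:, None] - x[None, :]) / gamma)
    L = np.linalg.cholesky(K)          # K = L L^T, cubic cost in n
    z = np.linalg.solve(L, y)          # z = L^{-1} y, so z.z = y^T K^{-1} y
    return -0.5 * (z @ z) - np.log(np.diag(L)).sum() - 0.5 * n * np.log(2 * np.pi)


def loglik_linear(x, y, sigma2=1.0, gamma=1.0):
    """Exact O(n) log-likelihood for the same model.

    Uses the Markov property of the exponential kernel on sorted inputs:
    y_1 ~ N(0, sigma2) and
    y_i | y_{i-1} ~ N(rho_i * y_{i-1}, sigma2 * (1 - rho_i**2)),
    with rho_i = exp(-(x_i - x_{i-1}) / gamma).
    """
    rho = np.exp(-np.diff(x) / gamma)  # one-step correlations
    resid = y[1:] - rho * y[:-1]       # one-step prediction errors
    var = sigma2 * (1.0 - rho**2)      # one-step conditional variances
    ll = -0.5 * (np.log(2 * np.pi * sigma2) + y[0] ** 2 / sigma2)
    ll -= 0.5 * np.sum(np.log(2 * np.pi * var) + resid**2 / var)
    return ll


# Usage: both routes should agree up to floating-point error.
rng = np.random.default_rng(0)
x_demo = np.cumsum(rng.uniform(0.1, 1.0, size=50))  # sorted, distinct inputs
y_demo = rng.normal(size=50)
print(loglik_dense(x_demo, y_demo), loglik_linear(x_demo, y_demo))
```

The linear-time version never forms the covariance matrix; it needs only the gaps between neighboring inputs. Extending the idea to noisy observations or to other covariance functions requires more machinery than this sketch shows.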