Bibliographic Details
Main Authors: Kumarasinghe, Yohhan, Williams, Jacob, Yuan, Yuxin, Wang, Wenbo, Dias, Julie-Alexia, Zhang, Haoyu, Li, Zilin, Li, Xihao
Format: Digital resource
Published: Zenodo 2026
Online Access: https://doi.org/10.5281/zenodo.19476470
Table of Contents:
  • <p>This dataset serves as the source data for Figure 2, Extended Data Figures 3-6, and Supplementary Figures 1-6 of the manuscript titled "MetaSTAARlite: An all-in-one tool for biobank-scale whole-genome sequencing meta-analysis". MetaSTAARlite provides a scalable and resource-efficient summary statistics-based pipeline for powerful, functionally-informed rare variant meta-analysis of biobank-scale sequencing data.</p>
<p><strong>The files included in this dataset are as follows:</strong></p>
<p><code>UKB_meta_TC_coding.zip</code>: Gene-centric coding meta-analysis results of total cholesterol (TC) for a 1:2:3 random partition of the UK Biobank whole-genome sequencing data ($n_1$ = 31,685; $n_2$ = 63,370; $n_3$ = 95,055).</p>
<p><code>UKB_meta_TC_noncoding.zip</code>: Gene-centric noncoding meta-analysis results of total cholesterol (TC) for a 1:2:3 random partition of the UK Biobank whole-genome sequencing data ($n_1$ = 31,685; $n_2$ = 63,370; $n_3$ = 95,055).</p>
<p><code>UKB_meta_TC_ncRNA.Rdata</code>: Noncoding RNA (ncRNA) meta-analysis results of total cholesterol (TC) for a 1:2:3 random partition of the UK Biobank whole-genome sequencing data ($n_1$ = 31,685; $n_2$ = 63,370; $n_3$ = 95,055).</p>
<p><code>UKB_pooled_TC_coding.zip</code>: Gene-centric coding pooled analysis results of total cholesterol (TC) using individual-level data from the UK Biobank whole-genome sequencing data ($n$ = 190,110).</p>
<p><code>UKB_pooled_TC_noncoding.zip</code>: Gene-centric noncoding pooled analysis results of total cholesterol (TC) using individual-level data from the UK Biobank whole-genome sequencing data ($n$ = 190,110).</p>
<p><code>UKB_pooled_TC_ncRNA.Rdata</code>: Noncoding RNA (ncRNA) pooled analysis results of total cholesterol (TC) using individual-level data from the UK Biobank whole-genome sequencing data ($n$ = 190,110).</p>
<p><code>UKB_meta_TC_coding_INT.zip</code>: Gene-centric coding sensitivity meta-analysis results of total
cholesterol (TC) for a 1:2:3 random partition of the UK Biobank whole-genome sequencing data ($n_1$ = 31,685; $n_2$ = 63,370; $n_3$ = 95,055), in which the rank-based inverse normal transformation of the residuals was applied <em>after</em> the 1:2:3 random partition.</p>
<p><code>UKB_meta_TC_noncoding_INT.zip</code>: Gene-centric noncoding sensitivity meta-analysis results of total cholesterol (TC) for a 1:2:3 random partition of the UK Biobank whole-genome sequencing data ($n_1$ = 31,685; $n_2$ = 63,370; $n_3$ = 95,055), in which the rank-based inverse normal transformation of the residuals was applied <em>after</em> the 1:2:3 random partition.</p>
<p><code>UKB_meta_TC_ncRNA_INT.Rdata</code>: Noncoding RNA (ncRNA) sensitivity meta-analysis results of total cholesterol (TC) for a 1:2:3 random partition of the UK Biobank whole-genome sequencing data ($n_1$ = 31,685; $n_2$ = 63,370; $n_3$ = 95,055), in which the rank-based inverse normal transformation of the residuals was applied <em>after</em> the 1:2:3 random partition.</p>
<p><code>UKB_AoU_meta_TC.zip</code>: Gene-centric coding meta-analysis results of total cholesterol (TC) using the UK Biobank whole-exome sequencing data ($n$ = 446,933) and the All of Us exome callset of short read whole-genome sequencing data ($n$ = 94,532).</p>
<p><code>UKB_AoU_meta_height.zip</code>: Gene-centric coding meta-analysis results of height using the UK Biobank whole-exome sequencing data ($n$ = 467,038) and the All of Us exome callset of short read whole-genome sequencing data ($n$ = 222,316).</p>
<p><code>UKB_AoU_meta_eGFR.zip</code>: Gene-centric coding meta-analysis results of estimated glomerular filtration rate (eGFR) using the UK Biobank whole-exome sequencing data ($n$ = 446,314) and the All of Us exome callset of short read whole-genome sequencing data ($n$ = 22,658).</p>
<p><code>UKB_AoU_meta_calcium.zip</code>: Gene-centric coding meta-analysis results of calcium using the UK Biobank whole-exome sequencing data
($n$ = 409,114) and the All of Us exome callset of short read whole-genome sequencing data ($n$ = 129,547).</p>
<p><code>UKB_AoU_meta_binary_LDL.zip</code>: Gene-centric coding meta-analysis results of elevated low-density lipoprotein cholesterol (adjusted LDL-C > 130 mg/dL) using the UK Biobank whole-exome sequencing data ($n$ = 435,410) and the All of Us exome callset of short read whole-genome sequencing data ($n$ = 89,147).</p>
<p><strong>The files used as source data for each figure are as follows:</strong></p>
<p><strong>Figure 2.</strong> Miami plot, quantile-quantile (Q-Q) plot, and scatterplot comparing the results obtained from gene-centric noncoding meta-analysis of total cholesterol (TC) for a 1:2:3 random partition of the UK Biobank whole-genome sequencing data ($n_1$ = 31,685; $n_2$ = 63,370; $n_3$ = 95,055) and those obtained from a gene-centric noncoding pooled analysis of TC using individual-level data from the same dataset ($n$ = 190,110). Source: <code>UKB_meta_TC_noncoding.zip</code>, <code>UKB_pooled_TC_noncoding.zip</code></p>
<p><strong>Extended Data Figure 3.</strong> Miami plot, quantile-quantile (Q-Q) plot, and scatterplot comparing the results obtained from gene-centric coding meta-analysis of total cholesterol (TC) for a 1:2:3 random partition of the UK Biobank whole-genome sequencing data ($n_1$ = 31,685; $n_2$ = 63,370; $n_3$ = 95,055) and those obtained from a gene-centric coding pooled analysis of TC using individual-level data from the same dataset ($n$ = 190,110). Source: <code>UKB_meta_TC_coding.zip</code>, <code>UKB_pooled_TC_coding.zip</code></p>
<p><strong>Extended Data Figure 4.</strong> Scatterplots comparing results for a 1:2:3 random partition ($n_1$ = 31,685; $n_2$ = 63,370; $n_3$ = 95,055) of the UK Biobank whole-genome sequencing data that were generated by MetaSTAAR-O to results that were generated by other rare variant meta-analysis methods.
Source: <code>UKB_meta_TC_noncoding.zip</code>, <code>UKB_meta_TC_coding.zip</code></p>
<p><strong>Extended Data Figure 5.</strong> Manhattan plot and quantile-quantile (Q-Q) plot for meta-analysis of total cholesterol using the UK Biobank whole-exome sequencing data ($n$ = 446,933) and the All of Us exome callset of short read whole-genome sequencing data ($n$ = 94,532). Source: <code>UKB_AoU_meta_TC.zip</code></p>
<p><strong>Extended Data Figure 6.</strong> Manhattan plot and quantile-quantile (Q-Q) plot for meta-analysis of elevated low-density lipoprotein cholesterol (adjusted LDL-C > 130 mg/dL) using the UK Biobank whole-exome sequencing data ($n$ = 435,410) and the All of Us exome callset of short read whole-genome sequencing data ($n$ = 89,147). Source: <code>UKB_AoU_meta_binary_LDL.zip</code></p>
<p><strong>Supplementary Figure 1.</strong> Quantile-quantile (Q-Q) plots for gene-centric coding meta-analysis of total cholesterol (TC) and gene-centric coding pooled analysis of TC, using a 1:2:3 random partition of the UK Biobank whole-genome sequencing data ($n_1$ = 31,685; $n_2$ = 63,370; $n_3$ = 95,055; $n$ = 190,110). Source: <code>UKB_meta_TC_coding.zip</code>, <code>UKB_pooled_TC_coding.zip</code></p>
<p><strong>Supplementary Figure 2.</strong> Quantile-quantile (Q-Q) plots for gene-centric noncoding meta-analysis of total cholesterol (TC) and gene-centric noncoding pooled analysis of TC, using a 1:2:3 random partition of the UK Biobank whole-genome sequencing data ($n_1$ = 31,685; $n_2$ = 63,370; $n_3$ = 95,055; $n$ = 190,110). Source: <code>UKB_meta_TC_noncoding.zip</code>, <code>UKB_pooled_TC_noncoding.zip</code></p>
<p><strong>Supplementary Figure 3.</strong> Manhattan plot and quantile-quantile (Q-Q) plot for meta-analysis of height using the UK Biobank whole-exome sequencing data ($n$ = 467,038) and the All of Us exome callset of short read whole-genome sequencing data ($n$ = 222,316).
Source: <code>UKB_AoU_meta_height.zip</code></p>
<p><strong>Supplementary Figure 4.</strong> Manhattan plot and quantile-quantile (Q-Q) plot for meta-analysis of estimated glomerular filtration rate (eGFR) using the UK Biobank whole-exome sequencing data ($n$ = 446,314) and the All of Us exome callset of short read whole-genome sequencing data ($n$ = 22,658). Source: <code>UKB_AoU_meta_eGFR.zip</code></p>
<p><strong>Supplementary Figure 5.</strong> Manhattan plot and quantile-quantile (Q-Q) plot for meta-analysis of calcium using the UK Biobank whole-exome sequencing data ($n$ = 409,114) and the All of Us exome callset of short read whole-genome sequencing data ($n$ = 129,547). Source: <code>UKB_AoU_meta_calcium.zip</code></p>
<p><strong>Supplementary Figure 6.</strong> Scatterplots comparing gene-centric unconditional MetaSTAAR-O <em>P</em> values obtained from a sensitivity meta-analysis (in which the rank-based inverse normal transformation of the residuals was applied <em>after</em> the 1:2:3 random partition) to STAAR-O <em>P</em> values obtained from the joint analysis of pooled individual-level data. Source: <code>UKB_meta_TC_coding_INT.zip</code>, <code>UKB_pooled_TC_coding.zip</code>, <code>UKB_meta_TC_noncoding_INT.zip</code>, <code>UKB_pooled_TC_noncoding.zip</code></p>
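<p>As a reading aid, the file list and the per-figure "Source" entries above can be transcribed into a small lookup table. The Python sketch below is illustrative only: the names <code>PARTITION</code>, <code>FILES</code>, <code>SOURCES</code>, <code>partition_is_consistent</code>, <code>figures_using</code>, and <code>unreferenced_files</code> are hypothetical and are not part of the dataset or of MetaSTAARlite. It checks that the three partition sizes stand in a 1:2:3 ratio and sum to the pooled sample size, and it answers which figures draw on a given file according to this record.</p>

```python
# Sample sizes of the 1:2:3 random partition and the pooled analysis,
# copied from the record. All names in this sketch are illustrative.
PARTITION = {"n_1": 31_685, "n_2": 63_370, "n_3": 95_055}
POOLED_N = 190_110

# Files listed in the record, in the order they are described.
FILES = [
    "UKB_meta_TC_coding.zip",
    "UKB_meta_TC_noncoding.zip",
    "UKB_meta_TC_ncRNA.Rdata",
    "UKB_pooled_TC_coding.zip",
    "UKB_pooled_TC_noncoding.zip",
    "UKB_pooled_TC_ncRNA.Rdata",
    "UKB_meta_TC_coding_INT.zip",
    "UKB_meta_TC_noncoding_INT.zip",
    "UKB_meta_TC_ncRNA_INT.Rdata",
    "UKB_AoU_meta_TC.zip",
    "UKB_AoU_meta_height.zip",
    "UKB_AoU_meta_eGFR.zip",
    "UKB_AoU_meta_calcium.zip",
    "UKB_AoU_meta_binary_LDL.zip",
]

# Figure -> source files, transcribed from the "Source:" entries.
SOURCES = {
    "Figure 2": ["UKB_meta_TC_noncoding.zip", "UKB_pooled_TC_noncoding.zip"],
    "Extended Data Figure 3": ["UKB_meta_TC_coding.zip", "UKB_pooled_TC_coding.zip"],
    "Extended Data Figure 4": ["UKB_meta_TC_noncoding.zip", "UKB_meta_TC_coding.zip"],
    "Extended Data Figure 5": ["UKB_AoU_meta_TC.zip"],
    "Extended Data Figure 6": ["UKB_AoU_meta_binary_LDL.zip"],
    "Supplementary Figure 1": ["UKB_meta_TC_coding.zip", "UKB_pooled_TC_coding.zip"],
    "Supplementary Figure 2": ["UKB_meta_TC_noncoding.zip", "UKB_pooled_TC_noncoding.zip"],
    "Supplementary Figure 3": ["UKB_AoU_meta_height.zip"],
    "Supplementary Figure 4": ["UKB_AoU_meta_eGFR.zip"],
    "Supplementary Figure 5": ["UKB_AoU_meta_calcium.zip"],
    "Supplementary Figure 6": [
        "UKB_meta_TC_coding_INT.zip",
        "UKB_pooled_TC_coding.zip",
        "UKB_meta_TC_noncoding_INT.zip",
        "UKB_pooled_TC_noncoding.zip",
    ],
}


def partition_is_consistent(partition, pooled_n):
    """Return True if the parts sum to pooled_n and stand in a 1:2:3 ratio."""
    n1, n2, n3 = (partition[key] for key in ("n_1", "n_2", "n_3"))
    return n1 + n2 + n3 == pooled_n and n2 == 2 * n1 and n3 == 3 * n1


def figures_using(filename):
    """List the figures whose Source entry names the given file."""
    return [fig for fig, files in SOURCES.items() if filename in files]


def unreferenced_files():
    """List files in the record that no Source entry names."""
    used = {name for files in SOURCES.values() for name in files}
    return [name for name in FILES if name not in used]


if __name__ == "__main__":
    print("Partition consistent:", partition_is_consistent(PARTITION, POOLED_N))
    print("Figures using pooled coding results:",
          figures_using("UKB_pooled_TC_coding.zip"))
    print("Files not named in any Source entry:", unreferenced_files())
```

<p>Under this transcription, the three ncRNA <code>.Rdata</code> files are the only ones that no "Source" entry names. That observation reflects this record's text only and says nothing about how the manuscript's figures were actually produced.</p>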