Bibliographic Details
Main Authors: Li, Jiajie, Schmelzle, Jan-Niklas, Du, Yixiao, Heumos, Simon, Guarracino, Andrea, Guidi, Giulia, Prins, Pjotr, Garrison, Erik, Zhang, Zhiru
Format: Preprint
Published: 2024
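Subjects: Distributed, Parallel, and Cluster Computing; Computational Engineering, Finance, and Science; Data Structures and Algorithms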
Online Access: https://arxiv.org/abs/2409.00876
_version_ 1866916056655724544
author Li, Jiajie
Schmelzle, Jan-Niklas
Du, Yixiao
Heumos, Simon
Guarracino, Andrea
Guidi, Giulia
Prins, Pjotr
Garrison, Erik
Zhang, Zhiru
contents Computational Pangenomics is an emerging field that studies genetic variation using a graph structure encompassing multiple genomes. Visualizing pangenome graphs is vital for understanding genome diversity. Yet, handling large graphs can be challenging due to the high computational demands of the graph layout process. In this work, we conduct a thorough performance characterization of a state-of-the-art pangenome graph layout algorithm, revealing significant data-level parallelism, which makes GPUs a promising option for compute acceleration. However, irregular data access and the algorithm's memory-bound nature present significant hurdles. To overcome these challenges, we develop a solution implementing three key optimizations: a cache-friendly data layout, coalesced random states, and warp merging. Additionally, we propose a quantitative metric for scalable evaluation of pangenome layout quality. Evaluated on 24 human whole-chromosome pangenomes, our GPU-based solution achieves a 57.3x speedup over the state-of-the-art multithreaded CPU baseline without layout quality loss, reducing execution time from hours to minutes.
format Preprint
id arxiv_https___arxiv_org_abs_2409_00876
institution arXiv
publishDate 2024
record_format arxiv
title Rapid GPU-Based Pangenome Graph Layout
topic Distributed, Parallel, and Cluster Computing
Computational Engineering, Finance, and Science
Data Structures and Algorithms
url https://arxiv.org/abs/2409.00876