Bibliographic Details
Main Authors: Yogatama, Bobbi, Yang, Yifei, Kristensen, Kevin, Sarda, Devesh, Kim, Abigale, Cockcroft, Adrian, Teng, Yu, Patterson, Joshua, Kimball, Gregory, McKinney, Wes, Gong, Weiwei, Yu, Xiangyao
Format: Preprint
Published: 2025
Online Access: https://arxiv.org/abs/2508.04701
author Yogatama, Bobbi
Yang, Yifei
Kristensen, Kevin
Sarda, Devesh
Kim, Abigale
Cockcroft, Adrian
Teng, Yu
Patterson, Joshua
Kimball, Gregory
McKinney, Wes
Gong, Weiwei
Yu, Xiangyao
contents The era of GPU-powered data analytics has arrived. In this paper, we argue that recent advances in hardware (e.g., larger GPU memory, faster interconnects and I/O, and declining cost) and software (e.g., composable data systems and mature libraries) have removed the key barriers that have limited the wider adoption of GPU data analytics. We present Sirius, a prototype open-source GPU-native SQL engine that offers drop-in acceleration for diverse data systems. Sirius treats the GPU as the primary engine and leverages libraries such as libcudf for high-performance relational operators. It accelerates existing databases through the standard Substrait query representation, replacing the CPU engine without changing the user-facing interface. Sirius achieves 8.3x and 7.4x better cost efficiency on TPC-H and ClickBench, respectively, when integrated with single-node DuckDB, and delivers up to a 12.5x speedup when integrated with the distributed Apache Doris engine.
format Preprint
id arxiv_https___arxiv_org_abs_2508_04701
institution arXiv
publishDate 2025
record_format arxiv
title Rethinking Analytical Processing in the GPU Era
topic Databases
url https://arxiv.org/abs/2508.04701