Bibliographic Details
Main Authors: Ruiter, Skyler, Tian, Jiannan, Song, Fengguang
Format: Preprint
Published: 2025
Subjects: Distributed, Parallel, and Cluster Computing
Online Access: https://arxiv.org/abs/2509.20563
_version_ 1866915511772643328
author Ruiter, Skyler
Tian, Jiannan
Song, Fengguang
contents Modern scientific simulations and instruments generate data volumes that overwhelm memory and storage, throttling scalability. Lossy compression mitigates this by trading controlled error for a reduced footprint and throughput gains, yet optimal pipelines are highly data- and objective-specific and demand compression expertise. GPU compressors supply raw throughput but often hard-code fused kernels that hinder rapid experimentation, and they underperform in rate-distortion. We present FZModules, a heterogeneous framework for assembling custom error-bounded compression pipelines from high-performance modules through a concise, extensible interface. We further utilize an asynchronous task-backed execution library that infers data dependencies, manages memory movement, and exposes branch- and stage-level concurrency for powerful asynchronous compression pipelines. Evaluating three pipelines built with FZModules on four representative scientific datasets, we show that they can achieve end-to-end speedups comparable to those of fused-kernel GPU compressors while achieving rate-distortion similar to that of higher-fidelity CPU or hybrid compressors, enabling rapid, domain-tailored design.
format Preprint
id arxiv_https___arxiv_org_abs_2509_20563
institution arXiv
publishDate 2025
record_format arxiv
title FZModules: A Heterogeneous Computing Framework for Customizable Scientific Data Compression Pipelines
topic Distributed, Parallel, and Cluster Computing
url https://arxiv.org/abs/2509.20563