Bibliographic Details
Main Authors:	Laso, Ruben, Krupitza, Diego, Hunold, Sascha
Format:	Preprint
Published:	2024
Subjects:	Distributed, Parallel, and Cluster Computing; Performance; Programming Languages
Online Access:	https://arxiv.org/abs/2402.06384
Table of Contents:

Since the advent of parallel algorithms in the C++17 Standard Template Library (STL), the STL has become a viable framework for creating performance-portable applications. Given multiple existing implementations of the parallel algorithms, a systematic, quantitative performance comparison is essential for choosing the appropriate implementation for a particular hardware configuration. In this work, we introduce a specialized set of micro-benchmarks to assess the scalability of the parallel algorithms in the STL. By selecting different backends, our micro-benchmarks can be used on multi-core systems and GPUs. Using the suite in a case study on AMD and Intel CPUs and NVIDIA GPUs, we were able to identify substantial performance disparities among different implementations, including GCC+TBB, GCC+HPX, Intel's compiler with TBB, and NVIDIA's compiler with OpenMP and CUDA.

Similar Items