Staff View: Assembly of FETI dual operator using CUDA :: Library Catalog

Bibliographic Details
Main Authors:	Homola, Jakub; Vavřík, Radim; Meca, Ondřej; Brzobohatý, Tomáš; Říha, Lubomír
Format:	Preprint
Published:	2025
Subjects:	Mathematical Software D.1.3; G.1.3; G.4; I.3.1
Online Access:	https://arxiv.org/abs/2502.08382

_version_	1866912603841757184
author	Homola, Jakub; Vavřík, Radim; Meca, Ondřej; Brzobohatý, Tomáš; Říha, Lubomír
author_facet	Homola, Jakub; Vavřík, Radim; Meca, Ondřej; Brzobohatý, Tomáš; Říha, Lubomír
contents	FETI is a numerical method used to solve engineering problems. It builds on the ideas of domain decomposition, which makes it highly scalable and capable of efficiently utilizing whole supercomputers. One of the most time-consuming parts of the FETI solver is the application of the dual operator F in every iteration of the solver. It is traditionally performed on the CPU using an implicit approach of applying the individual sparse matrices that form F right-to-left. Another approach is to apply the dual operator explicitly, which primarily involves a simple dense matrix-vector multiplication and can be efficiently performed on the GPU. However, this requires additional preprocessing on the CPU where the dense matrix is assembled, which makes the explicit approach beneficial only after hundreds of iterations are performed. In this paper, we use the GPU to accelerate the assembly process as well. This significantly shortens the preprocessing time, thus decreasing the number of solver iterations needed to make the explicit approach beneficial. With a proper configuration, we only need a few tens of iterations to achieve speedup relative to the implicit CPU approach. Compared to the CPU-only explicit approach, we achieved up to 10x speedup for the preprocessing and 25x for the application.
format	Preprint
id	arxiv_https___arxiv_org_abs_2502_08382
institution	arXiv
publishDate	2025
record_format	arxiv
title	Assembly of FETI dual operator using CUDA
topic	Mathematical Software D.1.3; G.1.3; G.4; I.3.1
url	https://arxiv.org/abs/2502.08382
