:: Library Catalog

Bibliographic Details
Main Authors:	Oancea, Cosmin E., Watt, Stephen M.
Format:	Preprint
Published:	2024
Subjects:	Distributed, Parallel, and Cluster Computing; Mathematical Software; Programming Languages
Online Access:	https://arxiv.org/abs/2405.14642

Similar Items

Compiler-supported reduced precision and AoS-SoA transformations for heterogeneous hardware
by: Radtke, Pawel K., et al.
Published: (2025)

Julia GraphBLAS with Nonblocking Execution
by: Costanza, Pascal, et al.
Published: (2025)

Toward Portable GPU Performance: Julia Recursive Implementation of TRMM and TRSM
by: Carrica, Vicki, et al.
Published: (2025)

Verifying Properties of Index Arrays in a Purely-Functional Data-Parallel Language
by: Hinnerskov, Nikolaj Hey, et al.
Published: (2025)

Implementing Multi-GPU Scientific Computing Miniapps Across Performance Portable Frameworks
by: Villalobos, Johansell, et al.
Published: (2025)

Ocean: Fast Estimation-Based Sparse General Matrix-Matrix Multiplication on GPU
by: Li, Yifan, et al.
Published: (2026)

LEGO: A Layout Expression Language for Code Generation of Hierarchical Mapping
by: Tavakkoli, Amir Mohammad, et al.
Published: (2025)

Pipelined Dense Symmetric Eigenvalue Decomposition on Multi-GPU Architectures
by: Wang, Hansheng, et al.
Published: (2025)

Communication-Avoiding SpGEMM via Trident Partitioning on Hierarchical GPU Interconnects
by: Bellavita, Julian, et al.
Published: (2026)

Integrating Odeint Time Stepping into OpenFPM for Distributed and GPU Accelerated Numerical Solvers
by: Singh, Abhinav, et al.
Published: (2023)

Performant Unified GPU Kernels for Portable Singular Value Computation Across Hardware and Precision
by: Ringoot, Evelyne, et al.
Published: (2025)

On the energy efficiency of sparse matrix computations on multi-GPU clusters
by: Bernaschi, Massimo, et al.
Published: (2025)

FalconGEMM: Surpassing Hardware Peaks with Lower-Complexity Matrix Multiplication
by: Zhu, Honglin, et al.
Published: (2026)

SPUMA: a minimally invasive approach to the GPU porting of OPENFOAM
by: Bnà, Simone, et al.
Published: (2025)

Black-Scholes Option Pricing on Intel CPUs and GPUs: Implementation on SYCL and Optimization Techniques
by: Panova, Elena, et al.
Published: (2022)

Scheduling Languages: A Past, Present, and Future Taxonomy
by: Hall, Mary, et al.
Published: (2024)

GoldbachGPU: An Open Source GPU-Accelerated Framework for Verification of Goldbach's Conjecture
by: Llorente-Saguer, Isaac
Published: (2026)

GPU Accelerated Newton for Taylor Series Solutions of Polynomial Homotopies in Multiple Double Precision
by: Verschelde, Jan
Published: (2023)

Fray: An Efficient General-Purpose Concurrency Testing Platform for the JVM (Extended Version)
by: Li, Ao, et al.
Published: (2025)

Determinacy with Priorities up to Clocks
by: Liquori, Luigi, et al.
Published: (2026)

A shared compilation stack for distributed-memory parallelism in stencil DSLs
by: Bisbas, George, et al.
Published: (2024)

Robustness and Accuracy in Pipelined Bi-Conjugate Gradient Stabilized Method: A Comparative Study
by: Havdiak, Mykhailo, et al.
Published: (2024)

Efficient N-to-M Checkpointing Algorithm for Finite Element Simulations
by: Ham, David A., et al.
Published: (2024)

Enabling MPI communication within Numba/LLVM JIT-compiled Python code using numba-mpi v1.0
by: Derlatka, Kacper, et al.
Published: (2024)

Enabling mixed-precision in spectral element codes
by: Chen, Yanxiang, et al.
Published: (2025)

High-Performance Star-M SVD for Big Data Compression
by: Hussain, Md Taufique, et al.
Published: (2026)

Accelerating Bidiagonalization of Banded Matrices through Memory-Aware Bulge-Chasing on GPUs
by: Ringoot, Evelyne, et al.
Published: (2025)

On the Challenges of Energy-Efficiency Analysis in HPC Systems: Evaluating Synthetic Benchmarks and Gromacs
by: Machado, Rafael Ravedutti Lucio, et al.
Published: (2025)

A new open source framework for multiscale modeling of fibrous materials on heterogeneous supercomputers
by: Merson, Jacob, et al.
Published: (2023)

SYCL compute kernels for ExaHyPE
by: Loi, Chung Ming, et al.
Published: (2023)

PETSc/TAO Developments for GPU-Based Early Exascale Systems
by: Mills, Richard Tran, et al.
Published: (2024)

Enabling mixed-precision with the help of tools: A Nekbone case study
by: Chen, Yanxiang, et al.
Published: (2024)

Verification Challenges in Sparse Matrix Vector Multiplication in High Performance Computing: Part I
by: Zhang, Junchao
Published: (2025)

Comparative analysis of large data processing in Apache Spark using Java, Python and Scala
by: Borodii, Ivan, et al.
Published: (2025)

Xabclib: A Fully Auto-tuned Sparse Iterative Solver
by: Katagiri, Takahiro, et al.
Published: (2024)

A Communication Avoiding and Reducing Algorithm for Symmetric Eigenproblem for Very Small Matrices
by: Katagiri, Takahiro, et al.
Published: (2024)

Automated MPI-X code generation for scalable finite-difference solvers
by: Bisbas, George, et al.
Published: (2023)

NApy: Efficient Statistics in Python for Large-Scale Heterogeneous Data with Enhanced Support for Missing Data
by: Woller, Fabian, et al.
Published: (2025)

Beating vDSP: A 138 GFLOPS Radix-8 Stockham FFT on Apple Silicon via Two-Tier Register-Threadgroup Memory Decomposition
by: Bergach, Mohamed Amine
Published: (2026)

Performance measurements of modern Fortran MPI applications with Score-P
by: Corbin, Gregor
Published: (2025)