:: Library Catalog

Bibliographic Details
Main Authors:	Carrica, Vicki, Alomairy, Rabab, Ringoot, Evelyne, Edelman, Alan
Type:	Preprint
Published:	2026
Subjects:	Distributed, Parallel, and Cluster Computing; Emerging Technologies; Mathematical Software; Performance
Online Access:	https://arxiv.org/abs/2601.08082

Similar Items

Toward Portable GPU Performance: Julia Recursive Implementation of TRMM and TRSM
by: Carrica, Vicki, et al.
Published: (2025)

Accelerating Bidiagonalization of Banded Matrices through Memory-Aware Bulge-Chasing on GPUs
by: Ringoot, Evelyne, et al.
Published: (2025)

Performant Unified GPU Kernels for Portable Singular Value Computation Across Hardware and Precision
by: Ringoot, Evelyne, et al.
Published: (2025)

Evaluating Fault Tolerance and Scalability in Distributed File Systems: A Case Study of GFS, HDFS, and MinIO
by: Malhotra, Shubham, et al.
Published: (2025)

Optimizing Intra-Container Communication with Memory Protection Keys: A Novel Approach to Secure and Efficient Microservice Interaction
by: Yashu, Fnu, et al.
Published: (2025)

Seamless acceleration of Fortran intrinsics via AMD AI engines
by: Brown, Nick, et al.
Published: (2025)

The Landscape of GPU-Centric Communication
by: Unat, Didem, et al.
Published: (2024)

Active Inference-Based Adaptive Routing for Heterogeneous Edge AI Services
by: Wang, Zihang, et al.
Published: (2026)

On the Performance of Cloud-based ARM SVE for Zero-Knowledge Proving Systems
by: Loghin, Dumitrel, et al.
Published: (2025)

A Communication Avoiding and Reducing Algorithm for Symmetric Eigenproblem for Very Small Matrices
by: Katagiri, Takahiro, et al.
Published: (2024)

A Hybrid Heuristic Framework for Resource-Efficient Querying of Scientific Experiments Data
by: Patel, Mayank, et al.
Published: (2025)

Accelerating High-Order Finite Element Simulations at Extreme Scale with FP64 Tensor Cores
by: Tu, Jiqun, et al.
Published: (2026)

The Sunk Carbon Fallacy: Rethinking Carbon Footprint Metrics for Effective Carbon-Aware Scheduling
by: Bashir, Noman, et al.
Published: (2024)

KPI2KVI: A Multi Agent Workflow for Calculating Key Value Indicators from Service Descriptions
by: Shokrnezhad, Masoud, et al.
Published: (2026)

Wattlytics: A Web Platform for Co-Optimizing Performance, Energy, and TCO in HPC Clusters
by: Afzal, Ayesha, et al.
Published: (2026)

E-QUARTIC: Energy Efficient Edge Ensemble of Convolutional Neural Networks for Resource-Optimized Learning
by: Zhang, Le, et al.
Published: (2024)

pyGinkgo: A Sparse Linear Algebra Operator Framework for Python
by: Tuteja, Keshvi, et al.
Published: (2025)

Model Discovery and Graph Simulation: A Lightweight Gateway to Chaos Engineering
by: Krasnovsky, Anatoly A.
Published: (2025)

Xabclib: A Fully Auto-tuned Sparse Iterative Solver
by: Katagiri, Takahiro, et al.
Published: (2024)

Automated MPI-X code generation for scalable finite-difference solvers
by: Bisbas, George, et al.
Published: (2023)

On the energy efficiency of sparse matrix computations on multi-GPU clusters
by: Bernaschi, Massimo, et al.
Published: (2025)

NApy: Efficient Statistics in Python for Large-Scale Heterogeneous Data with Enhanced Support for Missing Data
by: Woller, Fabian, et al.
Published: (2025)

Beating vDSP: A 138 GFLOPS Radix-8 Stockham FFT on Apple Silicon via Two-Tier Register-Threadgroup Memory Decomposition
by: Bergach, Mohamed Amine
Published: (2026)

Performance measurements of modern Fortran MPI applications with Score-P
by: Corbin, Gregor
Published: (2025)

Black-Scholes Option Pricing on Intel CPUs and GPUs: Implementation on SYCL and Optimization Techniques
by: Panova, Elena, et al.
Published: (2022)

GoldbachGPU: An Open Source GPU-Accelerated Framework for Verification of Goldbach's Conjecture
by: Llorente-Saguer, Isaac
Published: (2026)

Performant Automatic BLAS Offloading on Unified Memory Architecture with OpenMP First-Touch Style Data Movement
by: Li, Junjie
Published: (2024)

Harnessing the Full Potential of RRAMs through Scalable and Distributed In-Memory Computing with Integrated Error Correction
by: Vo, Huynh Q. N., et al.
Published: (2025)

Deep Optimizer States: Towards Scalable Training of Transformer Models Using Interleaved Offloading
by: Maurya, Avinash, et al.
Published: (2024)

Scaling Intelligence: Designing Data Centers for Next-Gen Language Models
by: Tithi, Jesmin Jahan, et al.
Published: (2025)

SCALE: Self-regulated Clustered federAted LEarning in a Homogeneous Environment
by: Puppala, Sai, et al.
Published: (2024)

Easy Acceleration with Distributed Arrays
by: Kepner, Jeremy, et al.
Published: (2025)

Quantum resources in resource management systems
by: Bacher, Utz, et al.
Published: (2025)

Execution Envelopes: A Shared Admission Contract for Backend AI Execution Requests
by: Tallam, Krti
Published: (2026)

Optimizing Spot Instance Reliability and Security Using Cloud-Native Data and Tools
by: Saqib, Muhammad, et al.
Published: (2025)

Generative AI for Software Architecture. Applications, Challenges, and Future Directions
by: Esposito, Matteo, et al.
Published: (2025)

Quantum-HPC Software Stacks and the openQSE Reference Architecture: A Survey
by: Shehata, Amir, et al.
Published: (2026)

Scaling Sample-Based Quantum Diagonalization on GPU-Accelerated Systems using OpenMP Offload
by: Walkup, Robert, et al.
Published: (2026)

GPU-Accelerated Quantum Simulation: Empirical Backend Selection, Gate Fusion, and Adaptive Precision
by: Kumaresan, Poornima, et al.
Published: (2026)

From GPUs to RRAMs: Distributed In-Memory Primal-Dual Hybrid Gradient Method for Solving Large-Scale Linear Optimization Problem
by: Vo, Huynh Q. N., et al.
Published: (2025)