| Main Authors: | Laso, Ruben, Krupitza, Diego, Hunold, Sascha |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2402.06384 |
Similar Items
Comparing Parallel Functional Array Languages: Programming and Performance
by: van Balen, David, et al.
Published: (2025)
pPython Performance Study
by: Byun, Chansup, et al.
Published: (2023)
TAPA: A Scalable Task-Parallel Dataflow Programming Framework for Modern FPGAs with Co-Optimization of HLS and Physical Design
by: Guo, Licheng, et al.
Published: (2022)
Massimult: A Novel Parallel CPU Architecture Based on Combinator Reduction
by: Nicklisch-Franken, Jurgen, et al.
Published: (2024)
Developing a Modular Compiler for a Subset of a C-like Language
by: Dutta, Debasish, et al.
Published: (2025)
CoNST: Code Generator for Sparse Tensor Networks
by: Raje, Saurabh, et al.
Published: (2024)
Simplicity Scales
by: Sampson, Andrew, et al.
Published: (2026)
LEGO: A Layout Expression Language for Code Generation of Hierarchical Mapping
by: Tavakkoli, Amir Mohammad, et al.
Published: (2025)
Scheduling Languages: A Past, Present, and Future Taxonomy
by: Hall, Mary, et al.
Published: (2024)
Iterating Pointers: Enabling Static Analysis for Loop-based Pointers
by: Lepori, Andrea, et al.
Published: (2025)
Towards a Linear-Algebraic Hypervisor
by: Considine, Breandan
Published: (2026)
AI-NativeBench: An Open-Source White-Box Agentic Benchmark Suite for AI-Native Systems
by: Wang, Zirui, et al.
Published: (2026)
SProBench: Stream Processing Benchmark for High Performance Computing Infrastructure
by: Kulkarni, Apurv Deepak, et al.
Published: (2025)
Toward Scalable Docker-Based Emulations of Blockchain Networks for Research and Development
by: Pennino, Diego, et al.
Published: (2024)
OMPILOT: Harnessing Transformer Models for Auto Parallelization to Shared Memory Computing Paradigms
by: Bhattacharjee, Arijit, et al.
Published: (2025)
Agentic Auto-Scheduling: An Experimental Study of LLM-Guided Loop Optimization
by: Merouani, Massinissa, et al.
Published: (2025)
LOOPRAG: Enhancing Loop Transformation Optimization with Retrieval-Augmented Large Language Models
by: Zhi, Yijie, et al.
Published: (2025)
Linear Layouts: Robust Code Generation of Efficient Tensor Computation Using $\mathbb{F}_2$
by: Zhou, Keren, et al.
Published: (2025)
On Orchestrating Parallel Broadcasts for Distributed Ledgers
by: Sheng, Peiyao, et al.
Published: (2024)
Automated Programmatic Performance Analysis of Parallel Programs
by: Cankur, Onur, et al.
Published: (2024)
Optimal Parallel Scheduling under Concave Speedup Functions
by: Li, Chengzhang, et al.
Published: (2025)
Recorder: Comprehensive Parallel I/O Tracing and Analysis
by: Wang, Chen, et al.
Published: (2025)
ParaLog: Consistent Host-side Logging for Parallel Checkpoints
by: Chien, Steven W. D., et al.
Published: (2024)
Cache Blocking of Distributed-Memory Parallel Matrix Power Kernels
by: Lacey, Dane C., et al.
Published: (2024)
Fine-Grained Energy Prediction For Parallellized LLM Inference With PIE-P
by: Dutt, Anurag, et al.
Published: (2025)
Automated Calibration of Parallel and Distributed Computing Simulators: A Case Study
by: McDonald, Jesse, et al.
Published: (2024)
Scalable GPU Performance Variability Analysis framework
by: Lahiry, Ankur, et al.
Published: (2025)
Unified schemes for directive-based GPU offloading
by: Miki, Yohei, et al.
Published: (2024)
Safe Memory Reclamation Techniques
by: Singh, Ajay
Published: (2025)
Fault-Tolerant Hybrid-Parallel Training at Scale with Reliable and Efficient In-memory Checkpointing
by: Wang, Yuxin, et al.
Published: (2023)
PolyTOPS: Reconfigurable and Flexible Polyhedral Scheduler
by: Consolaro, Gianpietro, et al.
Published: (2024)
mLR: Scalable Laminography Reconstruction based on Memoization
by: Ma, Bin, et al.
Published: (2025)
Matryoshka: Optimization of Dynamic Diverse Quantum Chemistry Systems via Elastic Parallelism Transformation
by: Wang, Tuowei, et al.
Published: (2024)
CARAT: Client-Side Adaptive RPC and Cache Co-Tuning for Parallel File Systems
by: Rashid, Md Hasanur, et al.
Published: (2026)
HeteGen: Heterogeneous Parallel Inference for Large Language Models on Resource-Constrained Devices
by: Zhao, Xuanlei, et al.
Published: (2024)
AcceleratedKernels.jl: Cross-Architecture Parallel Algorithms from a Unified, Transpiled Codebase
by: Nicusan, Andrei-Leonard, et al.
Published: (2025)
Parallel I/O Characterization and Optimization on Large-Scale HPC Systems: A 360-Degree Survey
by: Ather, Hammad, et al.
Published: (2024)
Scalable Systems and Software Architectures for High-Performance Computing on cloud platforms
by: Ramesh, Risshab Srinivas
Published: (2024)
Towards a Scalable and Efficient PGAS-based Distributed OpenMP
by: Shan, Baodi, et al.
Published: (2024)
CodeRosetta: Pushing the Boundaries of Unsupervised Code Translation for Parallel Programming
by: TehraniJamsaz, Ali, et al.
Published: (2024)