:: Library Catalog

Bibliographic Details
Main Authors:	Maeda, Yuki; Taura, Kenjiro
Format:	Preprint
Published:	2026
Subjects:	Distributed, Parallel, and Cluster Computing
Online Access:	https://arxiv.org/abs/2604.05982

Similar Items

Parallel Joinable B-Trees in the Fork-Join I/O Model
by: Goodrich, Michael, et al.
Published: (2025)

GPU-Resident Gaussian Process Regression Leveraging Asynchronous Tasks with HPX
by: Möllmann, Henrik, et al.
Published: (2026)

Automatic Tracing in Task-Based Runtime Systems
by: Yadav, Rohan, et al.
Published: (2024)

TURNIP: A "Nondeterministic" GPU Runtime with CPU RAM Offload
by: Ding, Zhimin, et al.
Published: (2024)

Fail-Closed Lowering of Resident KV Claims onto LLM Serving Runtimes
by: Stepanek, Lukas
Published: (2026)

Runtime-optimized Multi-way Stream Join Operator for Large-scale Streaming data
by: Hu, Jinlong, et al.
Published: (2024)

Amoeba: Runtime Tensor Parallel Transformation for LLM Inference Services
by: Chen, Haoyu, et al.
Published: (2025)

GPU-Based Parallel Computing Methods for Medical Photoacoustic Image Reconstruction
by: Yi, Xinyao, et al.
Published: (2024)

GICC: A High-Performance Runtime for GPU-Initiated Communication and Coordination in Modern HPC Systems
by: Shan, Baodi, et al.
Published: (2026)

FlexiWalker: Extensible GPU Framework for Efficient Dynamic Random Walks with Runtime Adaptation
by: Park, Seongyeon, et al.
Published: (2025)

A Readiness-Driven Runtime for Pipeline-Parallel Training under Runtime Variability
by: Liu, Ruitao, et al.
Published: (2026)

Large Scale Multi-GPU Based Parallel Traffic Simulation for Accelerated Traffic Assignment and Propagation
by: Jiang, Xuan, et al.
Published: (2024)

HARP: Orchestrating Automated Parallel Training on Heterogeneous GPU Clusters
by: Liang, Antian, et al.
Published: (2025)

INSPIRIT: Optimizing Heterogeneous Task Scheduling through Adaptive Priority in Task-based Runtime Systems
by: Wang, Yiqing, et al.
Published: (2024)

Pragma driven shared memory parallelism in Zig by supporting OpenMP loop directives
by: Kacs, David, et al.
Published: (2024)

PPipe: Efficient Video Analytics Serving on Heterogeneous GPU Clusters via Pool-Based Pipeline Parallelism
by: Kong, Z. Jonny, et al.
Published: (2025)

A New Execution Model and Executor for Adaptively Optimizing the Performance of Parallel Algorithms Using HPX Runtime System
by: Mohammadiporshokooh, Karame, et al.
Published: (2025)

Concurrent Scheduling of High-Level Parallel Programs on Multi-GPU Systems
by: Knorr, Fabian, et al.
Published: (2025)

MonadBFT: Fast, Responsive, Fork-Resistant Streamlined Consensus
by: Jalalzai, Mohammad Mussadiq, et al.
Published: (2025)

Integrating and Characterizing HPC Task Runtime Systems for hybrid AI-HPC workloads
by: Merzky, Andre, et al.
Published: (2025)

Hetis: Serving LLMs in Heterogeneous GPU Clusters with Fine-grained and Dynamic Parallelism
by: Mo, Zizhao, et al.
Published: (2025)

Multi-GPU Acceleration of PALABOS Fluid Solver using C++ Standard Parallelism
by: Latt, Jonas, et al.
Published: (2025)

Heimdall++: Optimizing GPU Utilization and Pipeline Parallelism for Efficient Single-Pulse Detection
by: Xia, Bingzheng, et al.
Published: (2025)

Parallel GPU-Enabled Algorithms for SpGEMM on Arbitrary Semirings with Hybrid Communication
by: McFarland, Thomas, et al.
Published: (2025)

Radiation Hydrodynamics at Scale: Comparing MPI and Asynchronous Many-Task Runtimes with FleCSI
by: Strack, Alexander, et al.
Published: (2026)

CkIO: Parallel File Input for Over-Decomposed Task-Based Systems
by: Jacob, Mathew, et al.
Published: (2024)

Parallel Collaborative ADMM Privacy Computing and Adaptive GPU Acceleration for Distributed Edge Networks
by: Xia, Mengchun, et al.
Published: (2026)

APEX: Asynchronous Parallel CPU-GPU Execution for Online LLM Inference on Constrained GPUs
by: Fan, Jiakun, et al.
Published: (2025)

Neutron particle transport 3D method of characteristic Multi GPU platform Parallel Computing
by: Zhou, Faguo, et al.
Published: (2025)

SiPipe: Bridging the CPU-GPU Utilization Gap for Efficient Pipeline-Parallel LLM Inference
by: He, Yongchao, et al.
Published: (2025)

Virtual Garbage Collector (VGC): A Zone-Based Garbage Collection Architecture for Python's Parallel Runtime
by: M, Abdulla
Published: (2025)

Maya: Optimizing Deep Learning Training Workloads using GPU Runtime Emulation
by: Yarlagadda, Srihas, et al.
Published: (2025)

GRNND: A GPU-Parallel Relative NN-Descent Algorithm for Efficient Approximate Nearest Neighbor Graph Construction
by: Li, Xiang, et al.
Published: (2025)

CusADi: A GPU Parallelization Framework for Symbolic Expressions and Optimal Control
by: Jeon, Se Hwan, et al.
Published: (2024)

Tasking framework for Adaptive Speculative Parallel Mesh Generation
by: Tsolakis, Christos, et al.
Published: (2024)

EXaCTz: Guaranteed Extremum Graph and Contour Tree Preservation for Distributed- and GPU-Parallel Lossy Compression
by: Li, Yuxiao, et al.
Published: (2026)

Classic and Quantum Task-Based Intelligent Runtime for QIRs Running on Multiple QPUs
by: Miniskar, Narasinga Rao, et al.
Published: (2026)

cuFastTuckerPlus: A Stochastic Parallel Sparse FastTucker Decomposition Using GPU Tensor Cores
by: Li, Zixuan, et al.
Published: (2024)

WRATH: Workload Resilience Across Task Hierarchies in Task-based Parallel Programming Frameworks
by: Zhou, Sicheng, et al.
Published: (2025)

Exploring Performance-Productivity Trade-offs in AMT Runtimes: A Task Bench Study of Itoyori, ItoyoriFBC, HPX, and MPI
by: Lahnor, Torben R., et al.
Published: (2026)