| Main Authors: | Esaulov, Vladislav, Chen, Jieyang, Podhorszki, Norbert, Suter, Fred, Klasky, Scott, Bourgeois, Anu G, Wan, Lipeng |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2506.17084 |
Similar Items
HPDR: High-Performance Portable Scientific Data Reduction Framework
by: Chen, Jieyang, et al.
Published: (2025)
Error-controlled Progressive Retrieval of Scientific Data under Derivable Quantities of Interest
by: Wu, Xuan, et al.
Published: (2024)
Enabling High-Throughput Parallel I/O in Particle-in-Cell Monte Carlo Simulations with openPMD and Darshan I/O Monitoring
by: Williams, Jeremy J., et al.
Published: (2024)
CGSim: A Simulation Framework for Large Scale Distributed Computing Environment
by: Vatsavai, Sairam Sri, et al.
Published: (2025)
HP-MDR: High-performance and Portable Data Refactoring and Progressive Retrieval with Advanced GPUs
by: Li, Yanliang, et al.
Published: (2025)
Automated Calibration of Parallel and Distributed Computing Simulators: A Case Study
by: McDonald, Jesse, et al.
Published: (2024)
Characterizing Adaptive Mesh Refinement on Heterogeneous Platforms with Parthenon-VIBE
by: Poptani, Akash, et al.
Published: (2025)
FACT: Compositional Kernel Synthesis with a Three-Stage Agentic Workflow
by: Heidari, Sina, et al.
Published: (2026)
QoSFlow: Ensuring Service Quality of Distributed Workflows Using Interpretable Sensitivity Models
by: Rashid, Md Hasanur, et al.
Published: (2026)
Denoising Application Performance Models with Noise-Resilient Priors
by: de Morais, Gustavo, et al.
Published: (2025)
RAPID-LLM: Resilience-Aware Performance analysis of Infrastructure for Distributed LLM Training and Inference
by: Karfakis, George, et al.
Published: (2025)
Comprehensive Plugin-Based Monitoring of Nextflow Workflow Executions
by: Kharma, Sami, et al.
Published: (2026)
AARC: Automated Affinity-aware Resource Configuration for Serverless Workflows
by: Jin, Lingxiao, et al.
Published: (2025)
LEO: Tracing GPU Stall Root Causes via Cross-Vendor Backward Slicing
by: Xia, Yuning, et al.
Published: (2026)
CARAT: Client-Side Adaptive RPC and Cache Co-Tuning for Parallel File Systems
by: Rashid, Md Hasanur, et al.
Published: (2026)
AcceleratedKernels.jl: Cross-Architecture Parallel Algorithms from a Unified, Transpiled Codebase
by: Nicusan, Andrei-Leonard, et al.
Published: (2025)
Towards Portability at Scale: A Cross-Architecture Performance Evaluation of a GPU-enabled Shallow Water Solver
by: Villalobos, Johansell, et al.
Published: (2025)
Iterating Pointers: Enabling Static Analysis for Loop-based Pointers
by: Lepori, Andrea, et al.
Published: (2025)
A Hybrid Heuristic Framework for Resource-Efficient Querying of Scientific Experiments Data
by: Patel, Mayank, et al.
Published: (2025)
Active Inference-Based Adaptive Routing for Heterogeneous Edge AI Services
by: Wang, Zihang, et al.
Published: (2026)
Optimal Allocation of Tasks and Price of Anarchy of Distributed Optimization in Networked Computing Facilities
by: Mancuso, Vincenzo, et al.
Published: (2024)
ExpertFlow: Adaptive Expert Scheduling and Memory Coordination for Efficient MoE Inference
by: Shen, Zixu, et al.
Published: (2025)
AdapTBF: Decentralized Bandwidth Control via Adaptive Token Borrowing for HPC Storage
by: Rashid, Md Hasanur, et al.
Published: (2026)
Extracting Practical, Actionable Energy Insights from Supercomputer Telemetry and Logs
by: Cornelius, Melanie, et al.
Published: (2025)
Scaling Large-scale GNN Training to Thousands of Processors on CPU-based Supercomputers
by: Zhuang, Chen, et al.
Published: (2024)
Profiling and optimization of multi-card GPU machine learning jobs
by: Lawenda, Marcin, et al.
Published: (2025)
Optimal Parallel Scheduling under Concave Speedup Functions
by: Li, Chengzhang, et al.
Published: (2025)
WebAssembly and Unikernels: A Comparative Study for Serverless at the Edge
by: Besozzi, Valerio, et al.
Published: (2025)
Efficient GPU-Centered Singular Value Decomposition Using the Divide-and-Conquer Method
by: Liu, Shifang, et al.
Published: (2025)
Resource Management Schemes for Cloud-Native Platforms with Computing Containers of Docker and Kubernetes
by: Mao, Ying, et al.
Published: (2020)
Staging Blocked Evaluation over Structured Sparse Matrices
by: Das, Pratyush, et al.
Published: (2024)
Cloud Performance Decomposition for Long-Term Performance Engineering: A Case Study
by: Debnath, Shimul, et al.
Published: (2026)
Collaborative Processing for Multi-Tenant Inference on Memory-Constrained Edge TPUs
by: Ng, Nathan, et al.
Published: (2026)
Preliminary report: Initial evaluation of StdPar implementations on AMD GPUs for HPC
by: Lin, Wei-Chen, et al.
Published: (2024)
Serving Chain-structured Jobs with Large Memory Footprints with Application to Large Foundation Model Serving
by: Sun, Tingyang, et al.
Published: (2026)
Reducing Tail Latencies Through Environment- and Neighbour-aware Thread Management
by: Jeffery, Andrew, et al.
Published: (2024)
Dissecting the software-based measurement of CPU energy consumption: a comparative analysis
by: Raffin, Guillaume, et al.
Published: (2024)
Bridding OT and PaaS in Edge-to-Cloud Continuum
by: Barrios, Carlos J, et al.
Published: (2025)
Minos: Systematically Classifying Performance and Power Characteristics of GPU Workloads on HPC Clusters
by: Jain, Rutwik, et al.
Published: (2026)
Hardware-Agnostic and Insightful Efficiency Metrics for Accelerated Systems: Definition and Implementation within TALP
by: Rahimi, Ghazal, et al.
Published: (2026)