| Main Authors: | Medel, Víctor, Arronategui, Unai, Rana, Omer, Bañares, José Ángel, Tolosana-Calasanz, Rafael |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2402.04491 |
Similar Items
Characterising resource management performance in Kubernetes
by: Medel, Víctor, et al.
Published: (2024)
An Empirical Characterization of Outages and Incidents in Public Services for Large Language Models
by: Chu, Xiaoyu, et al.
Published: (2025)
SProBench: Stream Processing Benchmark for High Performance Computing Infrastructure
by: Kulkarni, Apurv Deepak, et al.
Published: (2025)
Evaluating HPC-Style CPU Performance and Cost in Virtualized Cloud Infrastructures
by: Tharwani, Jay, et al.
Published: (2025)
RAPID-LLM: Resilience-Aware Performance analysis of Infrastructure for Distributed LLM Training and Inference
by: Karfakis, George, et al.
Published: (2025)
The SAP Cloud Infrastructure Dataset: A Reality Check of Scheduling and Placement of VMs in Cloud Computing
by: Uhlig, Arno, et al.
Published: (2025)
DREAMS: Decentralized Resource Allocation and Service Management across the Compute Continuum Using Service Affinity
by: Dinh-Tuan, Hai, et al.
Published: (2025)
Characterizing Adaptive Mesh Refinement on Heterogeneous Platforms with Parthenon-VIBE
by: Poptani, Akash, et al.
Published: (2025)
FAILS: A Framework for Automated Collection and Analysis of LLM Service Incidents
by: Battaglini-Fischer, Sándor, et al.
Published: (2025)
QoSFlow: Ensuring Service Quality of Distributed Workflows Using Interpretable Sensitivity Models
by: Rashid, Md Hasanur, et al.
Published: (2026)
Efficient Fault Localization in a Cloud Stack Using End-to-End Application Service Topology
by: Mathews, Dhanya R, et al.
Published: (2025)
Parallel I/O Characterization and Optimization on Large-Scale HPC Systems: A 360-Degree Survey
by: Ather, Hammad, et al.
Published: (2024)
Beyond Thread States: Diagnosing Performance Degradation with eBPF and Thread Dynamics
by: Landau, Diogo, et al.
Published: (2026)
Cyclic Data Streaming on GPUs for Short Range Stencils Applied to Molecular Dynamics
by: Rose, Martin, et al.
Published: (2025)
Matryoshka: Optimization of Dynamic Diverse Quantum Chemistry Systems via Elastic Parallelism Transformation
by: Wang, Tuowei, et al.
Published: (2024)
Portable High-Performance Kernel Generation for a Computational Fluid Dynamics Code with DaCe
by: Andersson, Måns I., et al.
Published: (2025)
A Comparison of the Performance of the Molecular Dynamics Simulation Package GROMACS Implemented in the SYCL and CUDA Programming Models
by: Apanasevich, L., et al.
Published: (2024)
SP-IMPact: A Framework for Static Partitioning Interference Mitigation and Performance Analysis
by: Costa, Diogo, et al.
Published: (2025)
Report on Challenges of Practical Reproducibility for Systems and HPC Computer Science
by: Keahey, Kate, et al.
Published: (2025)
Chopin: An Open Source R-language Tool to Support Spatial Analysis on Parallelizable Infrastructure
by: Song, Insang, et al.
Published: (2024)
Automated Programmatic Performance Analysis of Parallel Programs
by: Cankur, Onur, et al.
Published: (2024)
Model-driven development of data intensive applications over cloud resources
by: Tolosana-Calasanz, Rafael, et al.
Published: (2024)
Extracting Practical, Actionable Energy Insights from Supercomputer Telemetry and Logs
by: Cornelius, Melanie, et al.
Published: (2025)
Scaling Large-scale GNN Training to Thousands of Processors on CPU-based Supercomputers
by: Zhuang, Chen, et al.
Published: (2024)
Profiling and optimization of multi-card GPU machine learning jobs
by: Lawenda, Marcin, et al.
Published: (2025)
Optimal Parallel Scheduling under Concave Speedup Functions
by: Li, Chengzhang, et al.
Published: (2025)
WebAssembly and Unikernels: A Comparative Study for Serverless at the Edge
by: Besozzi, Valerio, et al.
Published: (2025)
Efficient GPU-Centered Singular Value Decomposition Using the Divide-and-Conquer Method
by: Liu, Shifang, et al.
Published: (2025)
Resource Management Schemes for Cloud-Native Platforms with Computing Containers of Docker and Kubernetes
by: Mao, Ying, et al.
Published: (2020)
Staging Blocked Evaluation over Structured Sparse Matrices
by: Das, Pratyush, et al.
Published: (2024)
Cloud Performance Decomposition for Long-Term Performance Engineering: A Case Study
by: Debnath, Shimul, et al.
Published: (2026)
Collaborative Processing for Multi-Tenant Inference on Memory-Constrained Edge TPUs
by: Ng, Nathan, et al.
Published: (2026)
Preliminary report: Initial evaluation of StdPar implementations on AMD GPUs for HPC
by: Lin, Wei-Chen, et al.
Published: (2024)
Serving Chain-structured Jobs with Large Memory Footprints with Application to Large Foundation Model Serving
by: Sun, Tingyang, et al.
Published: (2026)
Reducing Tail Latencies Through Environment- and Neighbour-aware Thread Management
by: Jeffery, Andrew, et al.
Published: (2024)
Dissecting the software-based measurement of CPU energy consumption: a comparative analysis
by: Raffin, Guillaume, et al.
Published: (2024)
Bridding OT and PaaS in Edge-to-Cloud Continuum
by: Barrios, Carlos J, et al.
Published: (2025)
Minos: Systematically Classifying Performance and Power Characteristics of GPU Workloads on HPC Clusters
by: Jain, Rutwik, et al.
Published: (2026)
Hardware-Agnostic and Insightful Efficiency Metrics for Accelerated Systems: Definition and Implementation within TALP
by: Rahimi, Ghazal, et al.
Published: (2026)
Node Compass: Multilevel Tracing and Debugging of Request Executions in JavaScript-Based Web-Servers
by: Kabamba, Herve Mbikayi, et al.
Published: (2023)