:: Library Catalog

Bibliographic Details
Main Authors:	Subramanian, Shashank; Rrapaj, Ermal; Harrington, Peter; Chheda, Smeet; Farrell, Steven; Austin, Brian; Williams, Samuel; Wright, Nicholas; Bhimji, Wahid
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Distributed, Parallel, and Cluster Computing
Online Access:	https://arxiv.org/abs/2410.00273

Similar Items

EvoSort: A Genetic-Algorithm-Based Adaptive Parallel Sorting Framework for Large-Scale High Performance Computing
by: Raj, Shashank, et al.
Published: (2025)

PRISM: Probabilistic Runtime Insights and Scalable Performance Modeling for Large-Scale Distributed Training
by: Golden, Alicia, et al.
Published: (2025)

The Time to Consensus in a Blockchain: Insights into Bitcoin's "6 Blocks Rule"
by: Dey, Partha S., et al.
Published: (2025)

Characterizing Production GPU Workloads using System-wide Telemetry Data
by: Cankur, Onur, et al.
Published: (2025)

A New Execution Model and Executor for Adaptively Optimizing the Performance of Parallel Algorithms Using HPX Runtime System
by: Mohammadiporshokooh, Karame, et al.
Published: (2025)

Large Scale Multi-GPU Based Parallel Traffic Simulation for Accelerated Traffic Assignment and Propagation
by: Jiang, Xuan, et al.
Published: (2024)

Beyond Pre-Training: The Full Lifecycle of Foundation Models on HPC Systems
by: Conciatore, Dino, et al.
Published: (2026)

Modular Foundation Model Inference at the Edge: Network-Aware Microservice Optimization
by: Zhu, Juan, et al.
Published: (2026)

LAPIS: A Performance Portable, High Productivity Compiler Framework
by: Kelley, Brian, et al.
Published: (2025)

Rorqual: Speeding up Narwhal with TEEs
by: Freitas, Luciano, et al.
Published: (2024)

Unleashing Scalable Context Parallelism for Foundation Models Pre-Training via FCP
by: Zhao, Yilong, et al.
Published: (2026)

A Comprehensive Hyperledger Fabric Performance Evaluation based on Resources Capacity Planning
by: Melo, Carlos, et al.
Published: (2025)

Benchmarking Message Brokers for IoT Edge Computing: A Comprehensive Performance Study
by: Paul, Tapajit Chandra, et al.
Published: (2026)

From Attention to Disaggregation: Tracing the Evolution of LLM Inference
by: Kumar, Madabattula Rajesh, et al.
Published: (2025)

Empowering the Quantum Cloud User with QRIO
by: Chakraborty, Shmeelok, et al.
Published: (2024)

ARM SVE Unleashed: Performance and Insights Across HPC Applications on Nvidia Grace
by: Shi, Ruimin, et al.
Published: (2025)

System-Level Performance Modeling of Photonic In-Memory Computing
by: Arockiaraj, Jebacyril, et al.
Published: (2026)

Performance Models for a Two-tiered Storage System
by: Sasidharan, Aparna, et al.
Published: (2025)

Optimal Resource Utilization in Hyperledger Fabric: A Comprehensive SPN-Based Performance Evaluation Paradigm
by: Melo, Carlos, et al.
Published: (2025)

PICO: Performance Insights for Collective Operations
by: Pasqualoni, Saverio, et al.
Published: (2025)

Training DNN Models over Heterogeneous Clusters with Optimal Performance
by: Nie, Chengyi, et al.
Published: (2024)

Evaluation of Programming Models and Performance for Stencil Computation on Current GPU Architectures
by: Shan, Baodi, et al.
Published: (2024)

Benchmarking the Performance of Large Language Models on the Cerebras Wafer Scale Engine
by: Zhang, Zuoning, et al.
Published: (2024)

Experiences with Model Context Protocol Servers for Science and High Performance Computing
by: Pan, Haochen, et al.
Published: (2025)

MoFa: A Unified Performance Modeling Framework for LLM Pretraining
by: Zhao, Lu, et al.
Published: (2025)

Privacy-Preserving Sharing of Data Analytics Runtime Metrics for Performance Modeling
by: Will, Jonathan, et al.
Published: (2024)

Design Principles of Dynamic Resource Management for High-Performance Parallel Programming Models
by: Huber, Dominik, et al.
Published: (2024)

Characterizing the Performance of Accelerated Jetson Edge Devices for Training Deep Learning Models
by: K., Prashanthi S., et al.
Published: (2025)

Efficient Training Approaches for Performance Anomaly Detection Models in Edge Computing Environments
by: Fernando, Duneesha, et al.
Published: (2024)

Cascadia: An Efficient Cascade Serving System for Large Language Models
by: Jiang, Youhe, et al.
Published: (2025)

Leveraging HPC Profiling & Tracing Tools to Understand the Performance of Particle-in-Cell Monte Carlo Simulations
by: Williams, Jeremy J., et al.
Published: (2023)

ML-based Modeling to Predict I/O Performance on Different Storage Sub-systems
by: Xu, Yiheng, et al.
Published: (2023)

Evaluating Cross-Architecture Performance Modeling of Distributed ML Workloads Using StableHLO
by: Svedas, Jonas, et al.
Published: (2026)

Parallel Reduced Order Modeling for Digital Twins using High-Performance Computing Workflows
by: de Parga, S. Ares, et al.
Published: (2024)

Performance Modeling and Evaluation of Hyperledger Fabric: An Analysis Based on Transaction Flow and Endorsement Policies
by: Melo, Carlos, et al.
Published: (2025)

PALM: A Efficient Performance Simulator for Tiled Accelerators with Large-scale Model Training
by: Fang, Jiahao, et al.
Published: (2024)

Transactional Dynamics in Hyperledger Fabric: A Stochastic Modeling and Performance Evaluation of Permissioned Blockchains
by: Melo, Carlos, et al.
Published: (2025)

Lumos: Efficient Performance Modeling and Estimation for Large-scale LLM Training
by: Liang, Mingyu, et al.
Published: (2025)

Optimizing Data Distribution and Kernel Performance for Efficient Training of Chemistry Foundation Models: A Case Study with MACE
by: Firoz, Jesun, et al.
Published: (2025)

How Does Stake Distribution Influence Consensus? Analyzing Blockchain Decentralization
by: Motepalli, Shashank, et al.
Published: (2023)