| Main Authors: | Subramanian, Shashank, Rrapaj, Ermal, Harrington, Peter, Chheda, Smeet, Farrell, Steven, Austin, Brian, Williams, Samuel, Wright, Nicholas, Bhimji, Wahid |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2410.00273 |
Similar Items
EvoSort: A Genetic-Algorithm-Based Adaptive Parallel Sorting Framework for Large-Scale High Performance Computing
by: Raj, Shashank, et al.
Published: (2025)
PRISM: Probabilistic Runtime Insights and Scalable Performance Modeling for Large-Scale Distributed Training
by: Golden, Alicia, et al.
Published: (2025)
The Time to Consensus in a Blockchain: Insights into Bitcoin's "6 Blocks Rule"
by: Dey, Partha S., et al.
Published: (2025)
Characterizing Production GPU Workloads using System-wide Telemetry Data
by: Cankur, Onur, et al.
Published: (2025)
A New Execution Model and Executor for Adaptively Optimizing the Performance of Parallel Algorithms Using HPX Runtime System
by: Mohammadiporshokooh, Karame, et al.
Published: (2025)
Large Scale Multi-GPU Based Parallel Traffic Simulation for Accelerated Traffic Assignment and Propagation
by: Jiang, Xuan, et al.
Published: (2024)
Beyond Pre-Training: The Full Lifecycle of Foundation Models on HPC Systems
by: Conciatore, Dino, et al.
Published: (2026)
Modular Foundation Model Inference at the Edge: Network-Aware Microservice Optimization
by: Zhu, Juan, et al.
Published: (2026)
LAPIS: A Performance Portable, High Productivity Compiler Framework
by: Kelley, Brian, et al.
Published: (2025)
Rorqual: Speeding up Narwhal with TEEs
by: Freitas, Luciano, et al.
Published: (2024)
Unleashing Scalable Context Parallelism for Foundation Models Pre-Training via FCP
by: Zhao, Yilong, et al.
Published: (2026)
A Comprehensive Hyperledger Fabric Performance Evaluation based on Resources Capacity Planning
by: Melo, Carlos, et al.
Published: (2025)
Benchmarking Message Brokers for IoT Edge Computing: A Comprehensive Performance Study
by: Paul, Tapajit Chandra, et al.
Published: (2026)
From Attention to Disaggregation: Tracing the Evolution of LLM Inference
by: Kumar, Madabattula Rajesh, et al.
Published: (2025)
Empowering the Quantum Cloud User with QRIO
by: Chakraborty, Shmeelok, et al.
Published: (2024)
ARM SVE Unleashed: Performance and Insights Across HPC Applications on Nvidia Grace
by: Shi, Ruimin, et al.
Published: (2025)
System-Level Performance Modeling of Photonic In-Memory Computing
by: Arockiaraj, Jebacyril, et al.
Published: (2026)
Performance Models for a Two-tiered Storage System
by: Sasidharan, Aparna, et al.
Published: (2025)
Optimal Resource Utilization in Hyperledger Fabric: A Comprehensive SPN-Based Performance Evaluation Paradigm
by: Melo, Carlos, et al.
Published: (2025)
PICO: Performance Insights for Collective Operations
by: Pasqualoni, Saverio, et al.
Published: (2025)
Training DNN Models over Heterogeneous Clusters with Optimal Performance
by: Nie, Chengyi, et al.
Published: (2024)
Evaluation of Programming Models and Performance for Stencil Computation on Current GPU Architectures
by: Shan, Baodi, et al.
Published: (2024)
Benchmarking the Performance of Large Language Models on the Cerebras Wafer Scale Engine
by: Zhang, Zuoning, et al.
Published: (2024)
Experiences with Model Context Protocol Servers for Science and High Performance Computing
by: Pan, Haochen, et al.
Published: (2025)
MoFa: A Unified Performance Modeling Framework for LLM Pretraining
by: Zhao, Lu, et al.
Published: (2025)
Privacy-Preserving Sharing of Data Analytics Runtime Metrics for Performance Modeling
by: Will, Jonathan, et al.
Published: (2024)
Design Principles of Dynamic Resource Management for High-Performance Parallel Programming Models
by: Huber, Dominik, et al.
Published: (2024)
Characterizing the Performance of Accelerated Jetson Edge Devices for Training Deep Learning Models
by: K., Prashanthi S., et al.
Published: (2025)
Efficient Training Approaches for Performance Anomaly Detection Models in Edge Computing Environments
by: Fernando, Duneesha, et al.
Published: (2024)
Cascadia: An Efficient Cascade Serving System for Large Language Models
by: Jiang, Youhe, et al.
Published: (2025)
Leveraging HPC Profiling & Tracing Tools to Understand the Performance of Particle-in-Cell Monte Carlo Simulations
by: Williams, Jeremy J., et al.
Published: (2023)
ML-based Modeling to Predict I/O Performance on Different Storage Sub-systems
by: Xu, Yiheng, et al.
Published: (2023)
Evaluating Cross-Architecture Performance Modeling of Distributed ML Workloads Using StableHLO
by: Svedas, Jonas, et al.
Published: (2026)
Parallel Reduced Order Modeling for Digital Twins using High-Performance Computing Workflows
by: de Parga, S. Ares, et al.
Published: (2024)
Performance Modeling and Evaluation of Hyperledger Fabric: An Analysis Based on Transaction Flow and Endorsement Policies
by: Melo, Carlos, et al.
Published: (2025)
PALM: A Efficient Performance Simulator for Tiled Accelerators with Large-scale Model Training
by: Fang, Jiahao, et al.
Published: (2024)
Transactional Dynamics in Hyperledger Fabric: A Stochastic Modeling and Performance Evaluation of Permissioned Blockchains
by: Melo, Carlos, et al.
Published: (2025)
Lumos: Efficient Performance Modeling and Estimation for Large-scale LLM Training
by: Liang, Mingyu, et al.
Published: (2025)
Optimizing Data Distribution and Kernel Performance for Efficient Training of Chemistry Foundation Models: A Case Study with MACE
by: Firoz, Jesun, et al.
Published: (2025)
How Does Stake Distribution Influence Consensus? Analyzing Blockchain Decentralization
by: Motepalli, Shashank, et al.
Published: (2023)