| Main Authors: | Alliata, Paul Ruiz, Rubaga, Diana, Kumlin, Daniel, Puliga, Alberto |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2509.00937 |
Similar Items
Accelerating Drug Discovery in AutoDock-GPU with Tensor Cores
by: Schieffer, Gabin, et al.
Published: (2024)
Balancing Pipeline Parallelism with Vocabulary Parallelism
by: Yeung, Man Tsung, et al.
Published: (2024)
Accelerating Microswimmer Simulations via a Heterogeneous Pipelined Parallel-in-Time Framework
by: Huang, Ruixiang, et al.
Published: (2026)
Synergistic Tensor and Pipeline Parallelism
by: Qi, Mengshi, et al.
Published: (2025)
A Parallel and Highly-Portable HPC Poisson Solver: Preconditioned Bi-CGSTAB with alpaka
by: Pennati, Luca, et al.
Published: (2025)
Parallel Paradigms in Modern HPC: A Comparative Analysis of MPI, OpenMP, and CUDA
by: ALHafez, Nizar, et al.
Published: (2025)
Towards Exascale Computing for Astrophysical Simulation Leveraging the Leonardo EuroHPC System
by: Shukla, Nitin, et al.
Published: (2025)
A GPU-accelerated Molecular Docking Workflow with Kubernetes and Apache Airflow
by: Medeiros, Daniel, et al.
Published: (2024)
UniPar: A Unified LLM-Based Framework for Parallel and Accelerated Code Translation in HPC
by: Bitan, Tomer, et al.
Published: (2025)
HPC-Coder: Modeling Parallel Programs using Large Language Models
by: Nichols, Daniel, et al.
Published: (2023)
MRSch: Multi-Resource Scheduling for HPC
by: Li, Boyang, et al.
Published: (2024)
Sarus Suite: Cloud-native Containers for HPC
by: Madonna, Alberto, et al.
Published: (2026)
Adaptra: Straggler-Resilient Hybrid-Parallel Training with Pipeline Adaptation
by: Wu, Tianyuan, et al.
Published: (2025)
A Contention-Free Model for Converged Kubernetes on HPC
by: Sochat, Vanessa, et al.
Published: (2024)
HPC with Enhanced User Separation
by: Prout, Andrew, et al.
Published: (2024)
Kub: Enabling Elastic HPC Workloads on Containerized Environments
by: Medeiros, Daniel, et al.
Published: (2024)
Integrating and Characterizing HPC Task Runtime Systems for hybrid AI-HPC workloads
by: Merzky, Andre, et al.
Published: (2025)
A Flexible Programmable Pipeline Parallelism Framework for Efficient DNN Training
by: Jiang, Lijuan, et al.
Published: (2025)
DawnPiper: A Memory-scablable Pipeline Parallel Training Framework
by: Peng, Xuan, et al.
Published: (2025)
SPARS: A Reinforcement Learning-Enabled Simulator for Power Management in HPC Job Scheduling
by: Amrizal, Muhammad Alfian, et al.
Published: (2025)
Understanding Layered Portability from HPC to Cloud in Containerized Environments
by: Medeiros, Daniel, et al.
Published: (2024)
Leveraging HPC Profiling & Tracing Tools to Understand the Performance of Particle-in-Cell Monte Carlo Simulations
by: Williams, Jeremy J., et al.
Published: (2023)
Heimdall++: Optimizing GPU Utilization and Pipeline Parallelism for Efficient Single-Pulse Detection
by: Xia, Bingzheng, et al.
Published: (2025)
Communication-Computation Pipeline Parallel Split Learning over Wireless Edge Networks
by: Liu, Chenyu, et al.
Published: (2025)
JanusPipe: Efficient Pipeline Parallel Training for Machine Learning Interatomic Potentials
by: Wang, Hongyu, et al.
Published: (2026)
Analysis of the carbon footprint of HPC
by: Benhari, Abdessalam, et al.
Published: (2025)
Energy-aware operation of HPC systems in Germany
by: Suarez, Estela, et al.
Published: (2024)
On the Convergence of Malleability and the HPC PowerStack: Exploiting Dynamism in Over-Provisioned and Power-Constrained HPC Systems
by: Arima, Eishi, et al.
Published: (2024)
TD-Pipe: Temporally-Disaggregated Pipeline Parallelism Architecture for High-Throughput LLM Inference
by: Zhang, Hongbin, et al.
Published: (2025)
Bandwidth-Aware and Cost-Efficient Pipeline Parallel Scheduling in Geo-Distributed LLM Training
by: Zhang, Han, et al.
Published: (2026)
Usability Evaluation of Cloud for HPC Applications
by: Sochat, Vanessa, et al.
Published: (2025)
Parallel I/O Characterization and Optimization on Large-Scale HPC Systems: A 360-Degree Survey
by: Ather, Hammad, et al.
Published: (2024)
ARC-V: Vertical Resource Adaptivity for HPC Workloads in Containerized Environments
by: Medeiros, Daniel, et al.
Published: (2025)
SiPipe: Bridging the CPU-GPU Utilization Gap for Efficient Pipeline-Parallel LLM Inference
by: He, Yongchao, et al.
Published: (2025)
gLLM: Global Balanced Pipeline Parallelism System for Distributed LLM Serving with Token Throttling
by: Guo, Tianyu, et al.
Published: (2025)
SPPO:Efficient Long-sequence LLM Training via Adaptive Sequence Pipeline Parallel Offloading
by: Chen, Qiaoling, et al.
Published: (2025)
Enhancing Memory Efficiency in Large Language Model Training Through Chronos-aware Pipeline Parallelism
by: Lin, Xinyuan, et al.
Published: (2025)
Memory Efficient and Staleness Free Pipeline Parallel DNN Training Framework with Improved Convergence Speed
by: Dutta, Ankita, et al.
Published: (2025)
An Elastic Job Scheduler for HPC Applications on the Cloud
by: Bhosale, Aditya, et al.
Published: (2025)
SIREN: Software Identification and Recognition in HPC Systems
by: Jakobsche, Thomas, et al.
Published: (2025)