| Main Authors: | Wang, Lingfei, Harwood, Aaron, Rodriguez, Maria A. |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2401.09910 |
Similar Items
Rubick: Exploiting Job Reconfigurability for Deep Learning Cluster Scheduling
by: Zhang, Xinyi, et al.
Published: (2024)
Scheduling Deep Learning Jobs in Multi-Tenant GPU Clusters via Wise Resource Sharing
by: Luo, Yizhou, et al.
Published: (2024)
Prediction-Assisted Online Distributed Deep Learning Workload Scheduling in GPU Clusters
by: Luo, Ziyue, et al.
Published: (2025)
Evaluating Malleable Job Scheduling in HPC Clusters using Real-World Workloads
by: Zojer, Patrick, et al.
Published: (2026)
Aryl: An Elastic Cluster Scheduler for Deep Learning
by: Li, Jiamin, et al.
Published: (2022)
Deep Reinforcement Learning for Job Scheduling and Resource Management in Cloud Computing: An Algorithm-Level Review
by: Gu, Yan, et al.
Published: (2025)
Scalable HPC Job Scheduling and Resource Management in SST
by: Abdurahman, Abubeker, et al.
Published: (2025)
An Elastic Job Scheduler for HPC Applications on the Cloud
by: Bhosale, Aditya, et al.
Published: (2025)
Quantifying the Carbon Reduction of DAG Workloads: A Job Shop Scheduling Perspective
by: Bostandoost, Roozbeh, et al.
Published: (2025)
Asymptotically Optimal Scheduling of Multiple Parallelizable Job Classes
by: Berg, Benjamin, et al.
Published: (2024)
GPU Cluster Scheduling for Network-Sensitive Deep Learning
by: Sharma, Aakash, et al.
Published: (2024)
Data-Locality-Aware Task Assignment and Scheduling for Distributed Job Executions
by: Zhao, Hailiang, et al.
Published: (2024)
Metronome: Efficient Scheduling for Periodic Traffic Jobs with Network and Priority Awareness
by: Jiang, Hao, et al.
Published: (2025)
Scale: Deep Reinforcement Learning for Container Scheduling in Serverless Edge Computing
by: Chen, Chen, et al.
Published: (2026)
Harpagon: Minimizing DNN Serving Cost via Efficient Dispatching, Scheduling and Splitting
by: Zhao, Zhixin, et al.
Published: (2024)
DeepVM: Integrating Spot and On-Demand VMs for Cost-Efficient Deep Learning Clusters in the Cloud
by: Kim, Yoochan, et al.
Published: (2024)
DeepOps & SLURM: Your GPU Cluster Guide
by: Majee, Arindam
Published: (2024)
SPARS: A Reinforcement Learning-Enabled Simulator for Power Management in HPC Job Scheduling
by: Amrizal, Muhammad Alfian, et al.
Published: (2025)
CarbonFlex: Enabling Carbon-aware Provisioning and Scheduling for Cloud Clusters
by: Hanafy, Walid A., et al.
Published: (2025)
Megha: Decentralized Global Fair Scheduling for Federated Clusters
by: Thiyyakat, Meghana, et al.
Published: (2021)
Eva: Cost-Efficient Cloud-Based Cluster Scheduling
by: Chang, Tzu-Tao, et al.
Published: (2025)
A Deep Reinforcement Learning Approach for Cost Optimized Workflow Scheduling in Cloud Computing Environments
by: Jayanetti, Amanda, et al.
Published: (2024)
Alternative Mixed Integer Linear Programming Optimization for Joint Job Scheduling and Data Allocation in Grid Computing
by: Feng, Shengyu, et al.
Published: (2025)
Deep Reinforcement Learning-based Methods for Resource Scheduling in Cloud Computing: A Review and Future Directions
by: Zhou, Guangyao, et al.
Published: (2021)
Maple: A Multi-agent System for Portable Deep Learning across Clusters
by: Wu, Molang, et al.
Published: (2025)
A Survey on Scheduling Techniques in the Edge Cloud: Issues, Challenges and Future Directions
by: Asghar, Hassan, et al.
Published: (2022)
A Taxonomy of Schedulers -- Operating Systems, Clusters and Big Data Frameworks
by: Sliwko, Leszek
Published: (2025)
A Review of Tools and Techniques for Optimization of Workload Mapping and Scheduling in Heterogeneous HPC System
by: Sharma, Aasish Kumar, et al.
Published: (2025)
GFS: A Preemption-aware Scheduling Framework for GPU Clusters with Predictive Spot Instance Management
by: Duan, Jiaang, et al.
Published: (2025)
TF-DDRL: A Transformer-enhanced Distributed DRL Technique for Scheduling IoT Applications in Edge and Cloud Computing Environments
by: Wang, Zhiyu, et al.
Published: (2024)
PAL: A Variability-Aware Policy for Scheduling ML Workloads in GPU Clusters
by: Jain, Rutwik, et al.
Published: (2024)
Adaptive Job Scheduling in Quantum Clouds Using Reinforcement Learning
by: Luo, Waylon, et al.
Published: (2025)
Capacity Planning and Scheduling for Jobs with Uncertainty in Resource Usage and Duration
by: Patra, Sunandita, et al.
Published: (2025)
An Online Fragmentation-Aware Scheduler for Managing GPU-Sharing Workloads on Multi-Instance GPUs
by: Ting, Hsu-Tzu, et al.
Published: (2025)
A Deep Dive into the Google Cluster Workload Traces: Analyzing the Application Failure Characteristics and User Behaviors
by: Bappy, Faisal Haque, et al.
Published: (2023)
Accurate GPU Memory Prediction for Deep Learning Jobs through Dynamic Analysis
by: Shi, Jiabo, et al.
Published: (2025)
Learning-Based Approaches for Job Shop Scheduling Problems: A Review
by: Rihane, Karima, et al.
Published: (2025)
Evaluating the Efficacy of LLM-Based Reasoning for Multiobjective HPC Job Scheduling
by: Jadhav, Prachi, et al.
Published: (2025)
Prefetching in Deep Memory Hierarchies with NVRAM as Main Memory
by: Lurbe, Manel, et al.
Published: (2025)
Raptor: Distributed Scheduling for Serverless Functions
by: Exton, Kevin, et al.
Published: (2024)