Bibliographic Details
Main Authors: Hattay, Anas; Mboula, Fred Ngole; Gascard, Eric; Yahoun, Zakaria
Format: Preprint
Published: 2026
Subjects:
Online Access: https://arxiv.org/abs/2604.09202
author Hattay, Anas
Mboula, Fred Ngole
Gascard, Eric
Yahoun, Zakaria
contents Cloud providers must assign heterogeneous compute resources to workflow DAGs while balancing competing objectives such as completion time, cost, and energy consumption. In this work, we study a single-workflow, queue-free scheduling setting and consider a graph neural network (GNN)-based deep reinforcement learning scheduler designed to minimize workflow completion time and energy usage. We identify specific out-of-distribution (OOD) conditions under which GNN-based deep reinforcement learning schedulers fail and provide a principled explanation of why these failures occur. Through controlled OOD evaluations, we demonstrate that performance degradation stems from structural mismatches between training and deployment environments, which disrupt message passing and undermine policy generalization. Our analysis exposes fundamental limitations of current GNN-based schedulers and highlights the need for more robust representations to ensure reliable scheduling performance under distribution shifts.
format Preprint
id arxiv_https___arxiv_org_abs_2604_09202
institution arXiv
publishDate 2026
record_format arxiv
title On the Role of DAG topology in Energy-Aware Cloud Scheduling: A GNN-Based Deep Reinforcement Learning Approach
topic Machine Learning
Artificial Intelligence
url https://arxiv.org/abs/2604.09202