Saved in:
Bibliographic Details
Main Authors: Zhou, Hanlin, Chan, Huah Yong, Zhang, Shun Yao, Lin, Meie, Ni, Jingfei
Format: Preprint
Published: 2026
Subjects:
Online Access:https://arxiv.org/abs/2601.13579
Tags: Add Tag
No Tags, Be the first to tag this record!
Table of Contents:
  • With the rise of cloud computing and lightweight containers, Docker has emerged as a leading technology for rapid service deployment, with Kubernetes responsible for pod orchestration. However, for compute-intensive workloads-particularly web services executing containerized machine-learning training-the default Kubernetes scheduler does not always achieve optimal placement. To address this, we propose two custom, reinforcement-learning-based schedulers, SDQN and SDQN-n, both built on the Deep Q-Network (DQN) framework. In compute-intensive scenarios, these models outperform the default Kubernetes scheduler as well as Transformer-and LSTM-based alternatives, reducing average CPU utilization per cluster node by 10%, and by over 20% when using SDQN-n. Moreover, our results show that SDQN-n approach of consolidating pods onto fewer nodes further amplifies resource savings and helps advance greener, more energy-efficient data centers.Therefore, pod scheduling must employ different strategies tailored to each scenario in order to achieve better performance.Since the reinforcement-learning components of the SDQN and SDQN-n architectures proposed in this paper can be easily tuned by adjusting their parameters, they can accommodate the requirements of various future scenarios.