Staff View: :: Library Catalog

Bibliographic Details
Main Authors:	Tarraga-Moreno, Joaquin; Escudero-Sahuquillo, Jesus; Garcia, Pedro Javier; Quiles, Francisco J.
Format:	Preprint
Published:	2025
Subjects:	Hardware Architecture
Online Access:	https://arxiv.org/abs/2502.20965

_version_	1866908605101375488
author	Tarraga-Moreno, Joaquin; Escudero-Sahuquillo, Jesus; Garcia, Pedro Javier; Quiles, Francisco J.
author_facet	Tarraga-Moreno, Joaquin; Escudero-Sahuquillo, Jesus; Garcia, Pedro Javier; Quiles, Francisco J.
contents	In the last decade, special-purpose computing and storage devices, such as GPUs, TPUs, or high-speed storage, have been incorporated into the server nodes of supercomputers and data centers. The development of high-bandwidth memory (HBM) enabled a much more compact form factor for these devices, allowing several of them to be interconnected within a server node, typically through an intra-node interconnection network (e.g., PCIe, NVLink, or Infinity Fabric). These networks allow scaling up the number of special-purpose computing and storage devices per node. In turn, inter-node networks connect thousands of these devices placed in different server nodes of a supercomputer or data center. Unfortunately, the intra- and inter-node networks may become the system's bottleneck because of the increasing communication demand among accelerators in applications such as generative AI. Although current intra-node network designs alleviate this bottleneck by increasing the bandwidth of the intra-node network, we show in this paper that such high intra-node bandwidth may hinder inter-node communication performance: when traffic from outside the node arrives at the intra-node devices, it interferes with intra-node traffic. To analyze the impact of this interference, we have studied the communication operations of realistic traffic patterns that exploit intra-node communication. We have developed a generic intra- and inter-node simulation model based on OMNeT++ and modeled these communication operations. We have also performed extensive simulation experiments confirming that increasing the intra-node network bandwidth and the number of computing devices per node (i.e., accelerators) is counterproductive to inter-node communication performance.
format	Preprint
id	arxiv_https___arxiv_org_abs_2502_20965
institution	arXiv
publishDate	2025
record_format	arxiv
title	On the Impact of Intra-node Communication in the Performance of Supercomputer and Data Center Interconnection Networks
topic	Hardware Architecture
url	https://arxiv.org/abs/2502.20965
