Staff View: Managed-Retention Memory: A New Class of Memory for the AI Era :: Library Catalog

Bibliographic Details
Main Authors:	Legtchenko, Sergey; Stefanovici, Ioan; Black, Richard; Rowstron, Antony; Liu, Junyi; Costa, Paolo; Canakci, Burcu; Narayanan, Dushyanth; Wu, Xingbo
Format:	Preprint
Published:	2025
Subjects:	Hardware Architecture; Artificial Intelligence; Distributed, Parallel, and Cluster Computing; Emerging Technologies
Online Access:	https://arxiv.org/abs/2501.09605

_version_	1866917893886705664
author	Legtchenko, Sergey; Stefanovici, Ioan; Black, Richard; Rowstron, Antony; Liu, Junyi; Costa, Paolo; Canakci, Burcu; Narayanan, Dushyanth; Wu, Xingbo
author_facet	Legtchenko, Sergey; Stefanovici, Ioan; Black, Richard; Rowstron, Antony; Liu, Junyi; Costa, Paolo; Canakci, Burcu; Narayanan, Dushyanth; Wu, Xingbo
contents	AI clusters today are one of the major uses of High Bandwidth Memory (HBM). However, HBM is suboptimal for AI workloads for several reasons. Analysis shows HBM is overprovisioned on write performance, but underprovisioned on density and read bandwidth, and also has significant energy per bit overheads. It is also expensive, with lower yield than DRAM due to manufacturing complexity. We propose a new memory class: Managed-Retention Memory (MRM), which is more optimized to store key data structures for AI inference workloads. We believe that MRM may finally provide a path to viability for technologies that were originally proposed to support Storage Class Memory (SCM). These technologies traditionally offered long-term persistence (10+ years) but provided poor IO performance and/or endurance. MRM makes different trade-offs, and by understanding the workload IO patterns, MRM forgoes long-term data retention and write performance for better potential performance on the metrics important for these workloads.
format	Preprint
id	arxiv_https___arxiv_org_abs_2501_09605
institution	arXiv
publishDate	2025
record_format	arxiv
title	Managed-Retention Memory: A New Class of Memory for the AI Era
topic	Hardware Architecture; Artificial Intelligence; Distributed, Parallel, and Cluster Computing; Emerging Technologies
url	https://arxiv.org/abs/2501.09605

Similar Items