Bibliographic Details
Main Authors: Beigi, Majed Valad, Cao, Yi, Gurumurthi, Sudhanva, Recchia, Charles, Walton, Andrew, Sridharan, Vilas
Format: Preprint
Published: 2024
Subjects: Hardware Architecture
Online Access: https://arxiv.org/abs/2408.15302
author Beigi, Majed Valad
Cao, Yi
Gurumurthi, Sudhanva
Recchia, Charles
Walton, Andrew
Sridharan, Vilas
contents This paper is a corrigendum to the paper by Beigi et al. published at HPCA 2023 (https://doi.org/10.1109/HPCA56546.2023.10071066). The HPCA paper presented a detailed field data analysis of faults observed at scale in DDR4 DRAM from two different memory vendors, including a breakdown of fault patterns, or modes. Upon further study of the data, we found a bug in how we decoded errors from the logged row-bank-column address: some errors that occurred in one column were misinterpreted as occurring in two non-adjacent columns. As a result, some single-bit faults were misclassified as partial-row faults (i.e., two-bit faults), and some single-column faults were misclassified as two-column faults. Consequently, the proportion of single-bit faults is higher than reported in the original paper, with a commensurate reduction in the fraction of certain types of multi-bit faults. These misclassifications also slightly change the Failure In Time (FIT) per DRAM device values presented in the original paper. In this corrigendum, we provide updated versions of the relevant tables and figures and identify the page numbers and references in the original paper that they replace.
format Preprint
id arxiv_https___arxiv_org_abs_2408_15302
institution arXiv
publishDate 2024
record_format arxiv
title Corrigendum to: A Systematic Study of DDR4 DRAM Faults in the Field
topic Hardware Architecture
url https://arxiv.org/abs/2408.15302
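
Illustrative note: the abstract describes how decoding one column as two non-adjacent columns turns single-bit faults into apparent two-bit (partial-row) faults and single-column faults into apparent two-column faults. The Python sketch below reproduces that effect on made-up data. Everything in it is an assumption for illustration: the record layout, the `COLUMN_MASK` value, the stray high bit, and the simplified classification rules are hypothetical and are not taken from the paper or from the authors' tooling.

```python
# Hypothetical sketch: how a column-decoding bug can change fault-mode labels.
# Record layout, mask, and classification rules are illustrative assumptions.

COLUMN_MASK = 0x7F  # assumption: only the low 7 bits of the raw field are the column


def decode_correct(record):
    """Decode a logged (bank, row, raw_column) record, masking off non-column bits."""
    bank, row, raw_col = record
    return (bank, row, raw_col & COLUMN_MASK)


def decode_buggy(record):
    """Same as decode_correct but without the mask, so one physical column
    can appear as two different (non-adjacent) columns."""
    bank, row, raw_col = record
    return (bank, row, raw_col)


def classify_fault(addresses):
    """Label a fault from the set of decoded (bank, row, column) error addresses.

    Simplified rules, assuming each address corresponds to one failing bit.
    """
    rows = {(bank, row) for bank, row, _ in addresses}
    cols = {(bank, col) for bank, _, col in addresses}
    if len(rows) == 1 and len(cols) == 1:
        return "single-bit"
    if len(rows) == 1:
        return "partial-row"      # several bits in one row
    if len(cols) == 1:
        return "single-column"    # one column, several rows
    if len(cols) == 2:
        return "two-column"
    return "other"


def classify_logs(logs, decoder):
    """Decode every logged record with the given decoder, then classify the fault."""
    return classify_fault({decoder(record) for record in logs})


# Made-up logs: the same physical column 5, sometimes logged with a stray high bit.
single_bit_logs = [(0, 10, 5), (0, 10, 5 | 0x80)]
single_column_logs = [(0, 10, 5), (0, 11, 5 | 0x80), (0, 12, 5)]

for name, logs in [("bit", single_bit_logs), ("column", single_column_logs)]:
    print(name, classify_logs(logs, decode_buggy), "->", classify_logs(logs, decode_correct))
```

With the hypothetical buggy decoder, the first fault is labeled partial-row and the second two-column. With the masked decoder they are labeled single-bit and single-column, which is the direction of the correction the abstract reports.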