Bibliographic Details
Main Authors: Duan, Cenlin, Yang, Jianlei, Wang, Yiou, Wang, Yikun, Qi, Yingjie, He, Xiaolin, Yan, Bonan, Wang, Xueyan, Jia, Xiaotao, Zhao, Weisheng
Format: Preprint
Published: 2024
Subjects: Hardware Architecture
Online Access: https://arxiv.org/abs/2404.09497
author Duan, Cenlin
Yang, Jianlei
Wang, Yiou
Wang, Yikun
Qi, Yingjie
He, Xiaolin
Yan, Bonan
Wang, Xueyan
Jia, Xiaotao
Zhao, Weisheng
contents Bit-level sparsity in neural network models remains largely unexploited. Skipping the redundant computations on randomly distributed zero bits can significantly improve computational efficiency. However, traditional digital SRAM-based processing-in-memory (SRAM-PIM) architectures, constrained by a rigid crossbar structure, struggle to exploit this unstructured sparsity. To address this challenge, we propose Dyadic Block PIM (DB-PIM), an algorithm-architecture co-design framework. On the algorithm side, we propose an algorithm coupled with a sparsity pattern, termed a dyadic block (DB), that preserves the random distribution of non-zero bits to maintain accuracy while restricting the number of non-zero bits in each weight to improve regularity. On the architecture side, we develop a custom PIM macro that includes dyadic block multiplication units (DBMUs) and Canonical Signed Digit (CSD)-based adder trees tailored for Multiply-Accumulate (MAC) operations. An input pre-processing unit (IPU) further improves performance and efficiency by exploiting block-wise input sparsity. Results show that the proposed co-design framework achieves a speedup of up to 7.69x and energy savings of 83.43%.
format Preprint
id arxiv_https___arxiv_org_abs_2404_09497
institution arXiv
publishDate 2024
record_format arxiv
title Towards Efficient SRAM-PIM Architecture Design by Exploiting Unstructured Bit-Level Sparsity
topic Hardware Architecture
url https://arxiv.org/abs/2404.09497
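The abstract refers to Canonical Signed Digit (CSD) encoding and to limiting the number of non-zero bits per weight. As a rough illustration of why CSD helps, the sketch below encodes an integer weight into digits from {-1, 0, +1} with no two adjacent non-zero digits, then uses only the non-zero digits in a shift-and-add multiply-accumulate. This is a generic textbook CSD routine, not the DB-PIM algorithm or its dyadic block pattern; the function names `to_csd`, `from_csd`, and `csd_mac` are hypothetical and chosen for this example only.

```python
from typing import List


def to_csd(value: int) -> List[int]:
    """Encode an integer as CSD digits in {-1, 0, 1}, least-significant first."""
    digits = []
    n = value
    while n != 0:
        if n & 1:
            # n mod 4 == 1 -> digit +1, n mod 4 == 3 -> digit -1.
            # This choice makes the next digit zero (no adjacent non-zeros).
            d = 2 - (n & 3)
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits


def from_csd(digits: List[int]) -> int:
    """Decode CSD digits (least-significant first) back to an integer."""
    return sum(d * (1 << i) for i, d in enumerate(digits))


def csd_mac(inputs: List[int], weights: List[int]) -> int:
    """Multiply-accumulate using only the non-zero CSD digits of each weight."""
    acc = 0
    for x, w in zip(inputs, weights):
        for i, d in enumerate(to_csd(w)):
            if d != 0:  # zero digits contribute nothing and are skipped
                acc += d * (x << i)
    return acc


if __name__ == "__main__":
    # 7 is 0b111 (three non-zero bits) but 8 - 1 in CSD (two non-zero digits).
    print(to_csd(7))                   # [-1, 0, 0, 1]
    print(csd_mac([3, 5], [7, -3]))    # 6
```

In this sketch, fewer non-zero digits means fewer shift-and-add steps per weight, which is the general motivation for bounding the non-zero digit count; how DB-PIM actually enforces and schedules that bound is described in the paper itself, not here.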