Bibliographic Details
Main Authors: Schmulbach, Viansa; Kim, Jason; Gao, Ethan; Revina, Lucy; Jha, Nikhil; Wu, Ethan; Nikolic, Borivoje
Format: Preprint
Published: 2025
Subjects: Hardware Architecture
Online Access: https://arxiv.org/abs/2503.14708
author Schmulbach, Viansa
Kim, Jason
Gao, Ethan
Revina, Lucy
Jha, Nikhil
Wu, Ethan
Nikolic, Borivoje
contents This paper introduces NeCTAr (Near-Cache Transformer Accelerator), a 16 nm heterogeneous multicore RISC-V SoC for sparse and dense machine learning kernels with both near-core and near-memory accelerators. A prototype chip runs at 400 MHz at 0.85 V and performs matrix-vector multiplications with 109 GOPs/W. The effectiveness of the design is demonstrated by running inference on a sparse language model, ReLU-Llama.
format Preprint
id arxiv_https___arxiv_org_abs_2503_14708
institution arXiv
publishDate 2025
record_format arxiv
title NeCTAr: A Heterogeneous RISC-V SoC for Language Model Inference in Intel 16
topic Hardware Architecture
url https://arxiv.org/abs/2503.14708