| Main Authors: | Schmulbach, Viansa; Kim, Jason; Gao, Ethan; Revina, Lucy; Jha, Nikhil; Wu, Ethan; Nikolic, Borivoje |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | Hardware Architecture |
| Online Access: | https://arxiv.org/abs/2503.14708 |
| _version_ | 1866908274493751296 |
|---|---|
| author | Schmulbach, Viansa; Kim, Jason; Gao, Ethan; Revina, Lucy; Jha, Nikhil; Wu, Ethan; Nikolic, Borivoje |
| author_facet | Schmulbach, Viansa; Kim, Jason; Gao, Ethan; Revina, Lucy; Jha, Nikhil; Wu, Ethan; Nikolic, Borivoje |
| contents | This paper introduces NeCTAr (Near-Cache Transformer Accelerator), a 16 nm heterogeneous multicore RISC-V SoC for sparse and dense machine learning kernels with both near-core and near-memory accelerators. A prototype chip runs at 400 MHz at 0.85 V and performs matrix-vector multiplications with 109 GOPs/W. The effectiveness of the design is demonstrated by running inference on a sparse language model, ReLU-Llama. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2503_14708 |
| institution | arXiv |
| publishDate | 2025 |
| record_format | arxiv |
| title | NeCTAr: A Heterogeneous RISC-V SoC for Language Model Inference in Intel 16 |
| topic | Hardware Architecture |
| url | https://arxiv.org/abs/2503.14708 |