:: Library Catalog

Bibliographic Details
Main Authors:	Kabir, MD Arafat; Kamucheka, Tendayi; Fredricks, Nathaniel; Mandebi, Joel; Bakos, Jason; Huang, Miaoqing; Andrews, David
Format:	Preprint
Published:	2024
Subjects:	Hardware Architecture
Online Access:	https://arxiv.org/abs/2410.04367

Similar Items

The BRAM is the Limit: Shattering Myths, Shaping Standards, and Building Scalable PIM Accelerators
by: Kabir, MD Arafat, et al.
Published: (2024)

A Runtime-Adaptive Transformer Neural Network Accelerator on FPGAs
by: Kabir, Ehsan, et al.
Published: (2024)

FAMOUS: Flexible Accelerator for the Attention Mechanism of Transformer on UltraScale+ FPGAs
by: Kabir, Ehsan, et al.
Published: (2024)

ProTEA: Programmable Transformer Encoder Acceleration on FPGA
by: Kabir, Ehsan, et al.
Published: (2024)

N-TORC: Native Tensor Optimizer for Real-time Constraints
by: Singh, Suyash Vardhan, et al.
Published: (2025)

Balanced Data Placement for GEMV Acceleration with Processing-In-Memory
by: Ibrahim, Mohamed Assem, et al.
Published: (2024)

SAIL: SRAM-Accelerated LLM Inference System with Lookup-Table-based GEMV
by: Zhang, Jingyao, et al.
Published: (2025)

FireFly-T: High-Throughput Sparsity Exploitation for Spiking Transformer Acceleration with Dual-Engine Overlay Architecture
by: Li, Tenglong, et al.
Published: (2025)

DataMaestro: A Versatile and Efficient Data Streaming Engine Bringing Decoupled Memory Access To Dataflow Accelerators
by: Yi, Xiaoling, et al.
Published: (2025)

To Overlay or to Customize? Revisiting Architectural Choices in Heterogeneous Systems
by: Chen, Xingzhen, et al.
Published: (2026)

Accelerating CRONet on AMD Versal AIE-ML Engines
by: Mhatre, Kaustubh, et al.
Published: (2026)

Modeling Analog-Digital-Converter Energy and Area for Compute-In-Memory Accelerator Design
by: Andrulis, Tanner, et al.
Published: (2024)

Tensor Memory Engine: On-the-fly Data Reorganization for Ideal Locality
by: Hoornaert, Denis, et al.
Published: (2026)

ATLAAS: Automatic Tensor-Level Abstraction of Accelerator Semantics
by: Gao, Ruijie, et al.
Published: (2026)

SkipOPU: An FPGA-based Overlay Processor for Large Language Models with Dynamically Allocated Computation
by: He, Zicheng, et al.
Published: (2026)

Accelerating Elliptic Curve Point Additions on Versal AI Engine for Multi-scalar Multiplication
by: Ohno, Ayumi, et al.
Published: (2025)

GAMA: High-Performance GEMM Acceleration on AMD Versal ML-Optimized AI Engines
by: Mhatre, Kaustubh, et al.
Published: (2025)

TYTAN: Taylor-series based Non-Linear Activation Engine for Deep Learning Accelerators
by: Pramanik, Soham, et al.
Published: (2025)

LogicSparse: Enabling Engine-Free Unstructured Sparsity for Quantised Deep-learning Accelerators
by: Li, Changhong, et al.
Published: (2025)

RACE-IT: A Reconfigurable Analog Computing Engine for In-Memory Transformer Acceleration
by: Zhao, Lei, et al.
Published: (2023)

Stream-HLS: Towards Automatic Dataflow Acceleration
by: Basalama, Suhail, et al.
Published: (2025)

Accelerating Multi-Scale Deformable Attention Using Near-Memory-Processing Architecture
by: Li, Huize, et al.
Published: (2026)

Voxel-CIM: An Efficient Compute-in-Memory Accelerator for Voxel-based Point Cloud Neural Networks
by: Lin, Xipeng, et al.
Published: (2024)

Bancroft: Genomics Acceleration Beyond On-Device Memory
by: Lim, Se-Min, et al.
Published: (2025)

CrossNAS: A Cross-Layer Neural Architecture Search Framework for PIM Systems
by: Amin, Md Hasibul, et al.
Published: (2025)

ADS-IMC: Accelerating Data Sorting with In-Memory Computation
by: Dhakad, Narendra Singh, et al.
Published: (2026)

An Analytical Cost Model for Fast Evaluation of Multiple Compute-Engine CNN Accelerators
by: Qararyah, Fareed, et al.
Published: (2025)

Holistic Optimization Framework for FPGA Accelerators
by: Pouget, Stéphane, et al.
Published: (2025)

Mozart: A Chiplet Ecosystem-Accelerator Codesign Framework for Composable Bespoke Application Specific Integrated Circuits
by: Jin, Haoran, et al.
Published: (2025)

PIMCOMP: An End-to-End DNN Compiler for Processing-In-Memory Accelerators
by: Sun, Xiaotian, et al.
Published: (2024)

AME-PIM: Can Memory be Your Next Tensor Accelerator?
by: Venieri, Emanuele, et al.
Published: (2026)

Generalized Ping-Pong: Off-Chip Memory Bandwidth Centric Pipelining Strategy for Processing-In-Memory Accelerators
by: Wang, Ruibao, et al.
Published: (2024)

CiMLoop: A Flexible, Accurate, and Fast Compute-In-Memory Modeling Tool
by: Andrulis, Tanner, et al.
Published: (2024)

PIMSIM-NN: An ISA-based Simulation Framework for Processing-in-Memory Accelerators
by: Wang, Xinyu, et al.
Published: (2024)

IANUS: Integrated Accelerator based on NPU-PIM Unified Memory System
by: Seo, Minseok, et al.
Published: (2024)

Memory-Guided Unified Hardware Accelerator for Mixed-Precision Scientific Computing
by: Wang, Chuanzhen, et al.
Published: (2026)

AccelCIM: Systematic Dataflow Exploration for SRAM Compute-in-Memory Accelerator
by: Xue, Chenhao, et al.
Published: (2026)

AutoRAC: Automated Processing-in-Memory Accelerator Design for Recommender Systems
by: Cheng, Feng, et al.
Published: (2025)

PIM-GPT: A Hybrid Process-in-Memory Accelerator for Autoregressive Transformers
by: Wu, Yuting, et al.
Published: (2023)

CAMASim: A Comprehensive Simulation Framework for Content-Addressable Memory based Accelerators
by: Li, Mengyuan, et al.
Published: (2024)