:: Library Catalog

Bibliographic Details
Main Author:	Prasad, Rohit
Format:	Preprint
Published:	2025
Subjects:	Hardware Architecture
Online Access:	https://arxiv.org/abs/2511.17235

Similar Items

An ultra-low-power CGRA for accelerating Transformers at the edge
by: Prasad, Rohit
Published: (2025)

STRELA: STReaming ELAstic CGRA Accelerator for Embedded Systems
by: Vazquez, Daniel, et al.
Published: (2024)

Performance evaluation of acceleration of convolutional layers on OpenEdgeCGRA
by: Carpentieri, Nicolò, et al.
Published: (2024)

Evaluation of CGRA Toolchains
by: Walter, Dominik, et al.
Published: (2025)

Accelerating PageRank Algorithmic Tasks with a new Programmable Hardware Architecture
by: Chowdhury, Md Rownak Hossain, et al.
Published: (2024)

Enabling Efficient Hardware Acceleration of Hybrid Vision Transformer (ViT) Networks at the Edge
by: Dumoulin, Joren, et al.
Published: (2025)

Building an Open CGRA Ecosystem for Agile Innovation
by: Juneja, Rohan, et al.
Published: (2025)

Hardware-Aware DNN Compression for Homogeneous Edge Devices
by: Zhang, Kunlong, et al.
Published: (2025)

Monomorphism-based CGRA Mapping via Space and Time Decoupling
by: Tirelli, Cristian, et al.
Published: (2025)

Exploiting pre-optimized kernels with polyhedral transformations for CGRA compilation
by: Wang, Yuxuan, et al.
Published: (2026)

Enhancing CGRA Efficiency Through Aligned Compute and Communication Provisioning
by: Li, Zhaoying, et al.
Published: (2024)

Hardware Acceleration of Kolmogorov-Arnold Network (KAN) for Lightweight Edge Inference
by: Huang, Wei-Hsing, et al.
Published: (2024)

An Efficient Sparse Hardware Accelerator for Spike-Driven Transformer
by: Li, Zhengke, et al.
Published: (2025)

Xpikeformer: Hybrid Analog-Digital Hardware Acceleration for Spiking Transformers
by: Song, Zihang, et al.
Published: (2024)

DR-CGRA: Supporting Loop-Carried Dependencies in CGRAs Without Spilling Intermediate Values
by: Hadar, Elad, et al.
Published: (2024)

TATAA: Programmable Mixed-Precision Transformer Acceleration with a Transformable Arithmetic Architecture
by: Wu, Jiajun, et al.
Published: (2024)

Accelerating PoT Quantization on Edge Devices
by: Saha, Rappy, et al.
Published: (2024)

Synapse: Virtualizing Match Tables in Programmable Hardware
by: Lahmer, Seyyidahmed, et al.
Published: (2025)

TrainDeeploy: Hardware-Accelerated Parameter-Efficient Fine-Tuning of Small Transformer Models at the Extreme Edge
by: Wang, Run, et al.
Published: (2026)

Mamba-X: An End-to-End Vision Mamba Accelerator for Edge Computing Devices
by: Yoon, Dongho, et al.
Published: (2025)

Hardware Efficient Accelerator for Spiking Transformer With Reconfigurable Parallel Time Step Computing
by: Chen, Bo-Yu, et al.
Published: (2025)

Hardware-Software Co-Design for Accelerating Transformer Inference Leveraging Compute-in-Memory
by: Kim, Dong Eun, et al.
Published: (2025)

Hardware-Algorithm Co-Optimization of Early-Exit Neural Networks for Multi-Core Edge Accelerators
by: Zniber, Alaa, et al.
Published: (2025)

Hardware-Efficient Softmax and Layer Normalization with Guaranteed Normalization for Edge Devices
by: Choi, Dawon, et al.
Published: (2026)

VitaLLM: A Versatile and Tiny Accelerator for Mixed-Precision LLM Inference on Edge Devices
by: Lin, Zi-Wei, et al.
Published: (2026)

Designing Efficient LLM Accelerators for Edge Devices
by: Haris, Jude, et al.
Published: (2024)

FPGA-Optimized Hardware Accelerator for Fast Fourier Transform and Singular Value Decomposition in AI
by: Ding, Hong, et al.
Published: (2025)

COBRA: Algorithm-Architecture Co-optimized Binary Transformer Accelerator for Edge Inference
by: Qiao, Ye, et al.
Published: (2025)

AMAZE: Accelerated MiMC Hardware Architecture for Zero-Knowledge Applications on the Edge
by: Ahmed, Anees, et al.
Published: (2024)

CIMR-V: An End-to-End SRAM-based CIM Accelerator with RISC-V for AI Edge Device
by: Guo, Yan-Cheng, et al.
Published: (2025)

A High-Throughput Hardware Accelerator for Lempel-Ziv 4 Compression Algorithm
by: Chen, Tao, et al.
Published: (2024)

DX100: A Programmable Data Access Accelerator for Indirection
by: Khadem, Alireza, et al.
Published: (2025)

Hardware-Accelerated Algorithm for Complex Function Roots Density Graph Plotting
by: Tang, Ruibai, et al.
Published: (2025)

GenPairX: A Hardware-Algorithm Co-Designed Accelerator for Paired-End Read Mapping
by: Eudine, Julien, et al.
Published: (2026)

Bombyx: OpenCilk Compilation for FPGA Hardware Acceleration
by: Shahawy, Mohamed, et al.
Published: (2025)

On-Device Qwen2.5: Efficient LLM Inference with Model Compression and Hardware Acceleration
by: Xiang, Maoyang, et al.
Published: (2025)

Hardware Accelerators for Autonomous Cars: A Review
by: Islayem, Ruba, et al.
Published: (2024)

Resilient and Secure Programmable System-on-Chip Accelerator Offload
by: Gouveia, Inês Pinto, et al.
Published: (2024)

HFRWKV: A High-Performance Fully On-Chip Hardware Accelerator for RWKV
by: Shijie, Liu, et al.
Published: (2026)

J3DAI: A tiny DNN-Based Edge AI Accelerator for 3D-Stacked CMOS Image Sensor
by: Tain, Benoit, et al.
Published: (2025)