| Main Author: | Prasad, Rohit |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2511.17235 |
Similar Items
An ultra-low-power CGRA for accelerating Transformers at the edge
by: Prasad, Rohit
Published: (2025)
STRELA: STReaming ELAstic CGRA Accelerator for Embedded Systems
by: Vazquez, Daniel, et al.
Published: (2024)
Performance evaluation of acceleration of convolutional layers on OpenEdgeCGRA
by: Carpentieri, Nicolò, et al.
Published: (2024)
Evaluation of CGRA Toolchains
by: Walter, Dominik, et al.
Published: (2025)
Accelerating PageRank Algorithmic Tasks with a new Programmable Hardware Architecture
by: Chowdhury, Md Rownak Hossain, et al.
Published: (2024)
Enabling Efficient Hardware Acceleration of Hybrid Vision Transformer (ViT) Networks at the Edge
by: Dumoulin, Joren, et al.
Published: (2025)
Building an Open CGRA Ecosystem for Agile Innovation
by: Juneja, Rohan, et al.
Published: (2025)
Hardware-Aware DNN Compression for Homogeneous Edge Devices
by: Zhang, Kunlong, et al.
Published: (2025)
Monomorphism-based CGRA Mapping via Space and Time Decoupling
by: Tirelli, Cristian, et al.
Published: (2025)
Exploiting pre-optimized kernels with polyhedral transformations for CGRA compilation
by: Wang, Yuxuan, et al.
Published: (2026)
Enhancing CGRA Efficiency Through Aligned Compute and Communication Provisioning
by: Li, Zhaoying, et al.
Published: (2024)
Hardware Acceleration of Kolmogorov-Arnold Network (KAN) for Lightweight Edge Inference
by: Huang, Wei-Hsing, et al.
Published: (2024)
An Efficient Sparse Hardware Accelerator for Spike-Driven Transformer
by: Li, Zhengke, et al.
Published: (2025)
Xpikeformer: Hybrid Analog-Digital Hardware Acceleration for Spiking Transformers
by: Song, Zihang, et al.
Published: (2024)
DR-CGRA: Supporting Loop-Carried Dependencies in CGRAs Without Spilling Intermediate Values
by: Hadar, Elad, et al.
Published: (2024)
TATAA: Programmable Mixed-Precision Transformer Acceleration with a Transformable Arithmetic Architecture
by: Wu, Jiajun, et al.
Published: (2024)
Accelerating PoT Quantization on Edge Devices
by: Saha, Rappy, et al.
Published: (2024)
Synapse: Virtualizing Match Tables in Programmable Hardware
by: Lahmer, Seyyidahmed, et al.
Published: (2025)
TrainDeeploy: Hardware-Accelerated Parameter-Efficient Fine-Tuning of Small Transformer Models at the Extreme Edge
by: Wang, Run, et al.
Published: (2026)
Mamba-X: An End-to-End Vision Mamba Accelerator for Edge Computing Devices
by: Yoon, Dongho, et al.
Published: (2025)
Hardware Efficient Accelerator for Spiking Transformer With Reconfigurable Parallel Time Step Computing
by: Chen, Bo-Yu, et al.
Published: (2025)
Hardware-Software Co-Design for Accelerating Transformer Inference Leveraging Compute-in-Memory
by: Kim, Dong Eun, et al.
Published: (2025)
Hardware-Algorithm Co-Optimization of Early-Exit Neural Networks for Multi-Core Edge Accelerators
by: Zniber, Alaa, et al.
Published: (2025)
Hardware-Efficient Softmax and Layer Normalization with Guaranteed Normalization for Edge Devices
by: Choi, Dawon, et al.
Published: (2026)
VitaLLM: A Versatile and Tiny Accelerator for Mixed-Precision LLM Inference on Edge Devices
by: Lin, Zi-Wei, et al.
Published: (2026)
Designing Efficient LLM Accelerators for Edge Devices
by: Haris, Jude, et al.
Published: (2024)
FPGA-Optimized Hardware Accelerator for Fast Fourier Transform and Singular Value Decomposition in AI
by: Ding, Hong, et al.
Published: (2025)
COBRA: Algorithm-Architecture Co-optimized Binary Transformer Accelerator for Edge Inference
by: Qiao, Ye, et al.
Published: (2025)
AMAZE: Accelerated MiMC Hardware Architecture for Zero-Knowledge Applications on the Edge
by: Ahmed, Anees, et al.
Published: (2024)
CIMR-V: An End-to-End SRAM-based CIM Accelerator with RISC-V for AI Edge Device
by: Guo, Yan-Cheng, et al.
Published: (2025)
A High-Throughput Hardware Accelerator for Lempel-Ziv 4 Compression Algorithm
by: Chen, Tao, et al.
Published: (2024)
DX100: A Programmable Data Access Accelerator for Indirection
by: Khadem, Alireza, et al.
Published: (2025)
Hardware-Accelerated Algorithm for Complex Function Roots Density Graph Plotting
by: Tang, Ruibai, et al.
Published: (2025)
GenPairX: A Hardware-Algorithm Co-Designed Accelerator for Paired-End Read Mapping
by: Eudine, Julien, et al.
Published: (2026)
Bombyx: OpenCilk Compilation for FPGA Hardware Acceleration
by: Shahawy, Mohamed, et al.
Published: (2025)
On-Device Qwen2.5: Efficient LLM Inference with Model Compression and Hardware Acceleration
by: Xiang, Maoyang, et al.
Published: (2025)
Hardware Accelerators for Autonomous Cars: A Review
by: Islayem, Ruba, et al.
Published: (2024)
Resilient and Secure Programmable System-on-Chip Accelerator Offload
by: Gouveia, Inês Pinto, et al.
Published: (2024)
HFRWKV: A High-Performance Fully On-Chip Hardware Accelerator for RWKV
by: Shijie, Liu, et al.
Published: (2026)
J3DAI: A tiny DNN-Based Edge AI Accelerator for 3D-Stacked CMOS Image Sensor
by: Tain, Benoit, et al.
Published: (2025)