:: Library Catalog

Bibliographic Details
Main Authors:	Pramanik, Soham; William, Vimal; Raha, Arnab; Das, Debayan; Mukherjee, Amitava; Paluh, Janet L.
Format:	Preprint
Published:	2025
Subjects:	Hardware Architecture
Online Access:	https://arxiv.org/abs/2512.23062

Similar Items

SafeCiM: Investigating Resilience of Hybrid Floating-Point Compute-in-Memory Deep Learning Accelerators
by: Bhattacharya, Swastik, et al.
Published: (2025)

FlexNN: A Dataflow-aware Flexible Deep Learning Accelerator for Energy-Efficient Edge Devices
by: Raha, Arnab, et al.
Published: (2024)

StruM: Structured Mixed Precision for Efficient Deep Learning Hardware Codesign
by: Wu, Michael, et al.
Published: (2025)

SPARQLe: Sub-Precision Activation Representation for Quantized LLM Inference
by: Parvathy, Aradhana Mohan, et al.
Published: (2026)

Accelerating LLM Inference with Flexible N:M Sparsity via A Fully Digital Compute-in-Memory Accelerator
by: Ramachandran, Akshat, et al.
Published: (2025)

LogicSparse: Enabling Engine-Free Unstructured Sparsity for Quantised Deep-learning Accelerators
by: Li, Changhong, et al.
Published: (2025)

IMAGine: An In-Memory Accelerated GEMV Engine Overlay
by: Kabir, MD Arafat, et al.
Published: (2024)

NL-DPE: An Analog In-memory Non-Linear Dot Product Engine for Efficient CNN and LLM Inference
by: Zhao, Lei, et al.
Published: (2025)

GraNNite: Enabling High-Performance Execution of Graph Neural Networks on Resource-Constrained Neural Processing Units
by: Das, Arghadip, et al.
Published: (2025)

Accelerating CRONet on AMD Versal AIE-ML Engines
by: Mhatre, Kaustubh, et al.
Published: (2026)

Optimizing Neural Networks with Learnable Non-Linear Activation Functions via Lookup-Based FPGA Acceleration
by: Yin, Mengyuan, et al.
Published: (2025)

Accelerating Elliptic Curve Point Additions on Versal AI Engine for Multi-scalar Multiplication
by: Ohno, Ayumi, et al.
Published: (2025)

GAMA: High-Performance GEMM Acceleration on AMD Versal ML-Optimized AI Engines
by: Mhatre, Kaustubh, et al.
Published: (2025)

Multi-Objective Hardware-Mapping Co-Optimisation for Multi-DNN Workloads on Chiplet-based Accelerators
by: Das, Abhijit, et al.
Published: (2022)

FireFly-T: High-Throughput Sparsity Exploitation for Spiking Transformer Acceleration with Dual-Engine Overlay Architecture
by: Li, Tenglong, et al.
Published: (2025)

Special Session: Sustainable Deployment of Deep Neural Networks on Non-Volatile Compute-in-Memory Accelerators
by: Qin, Yifan, et al.
Published: (2025)

An Analytical Cost Model for Fast Evaluation of Multiple Compute-Engine CNN Accelerators
by: Qararyah, Fareed, et al.
Published: (2025)

Polaris: Multi-Fidelity Design Space Exploration of Deep Learning Accelerators
by: Sakhuja, Chirag, et al.
Published: (2024)

SigDLA: A Deep Learning Accelerator Extension for Signal Processing
by: Fu, Fangfa, et al.
Published: (2024)

DataMaestro: A Versatile and Efficient Data Streaming Engine Bringing Decoupled Memory Access To Dataflow Accelerators
by: Yi, Xiaoling, et al.
Published: (2025)

A Low-Power Sparse Deep Learning Accelerator with Optimized Data Reuse
by: Hsu, Kai-Chieh, et al.
Published: (2025)

Escaping Flatland: A Placement Flow for Enabling 3D FPGAs
by: Hao, Cong, et al.
Published: (2026)

Leveraging Application-Specific Knowledge for Energy-Efficient Deep Learning Accelerators on Resource-Constrained FPGAs
by: Qian, Chao
Published: (2025)

In-Pipeline Integration of Digital In-Memory-Computing into RISC-V Vector Architecture to Accelerate Deep Learning
by: Spagnolo, Tommaso, et al.
Published: (2026)

HCiM: ADC-Less Hybrid Analog-Digital Compute in Memory Accelerator for Deep Learning Workloads
by: Negi, Shubham, et al.
Published: (2024)

Communication Characterization of AI Workloads for Large-scale Multi-chiplet Accelerators
by: Musavi, Mariam, et al.
Published: (2024)

Chiplet-Gym: Optimizing Chiplet-based AI Accelerator Design with Reinforcement Learning
by: Mishty, Kaniz, et al.
Published: (2024)

ChipletPart: Cost-Aware Partitioning for 2.5D Systems
by: Graening, Alexander, et al.
Published: (2025)

CARMEN: CORDIC-Accelerated Resource-Efficient Multi-Precision Inference Engine for Deep Learning
by: Kumar, Sonu, et al.
Published: (2026)

CapsBeam: Accelerating Capsule Network based Beamformer for Ultrasound Non-Steered Plane Wave Imaging on Field Programmable Gate Array
by: Rahoof, Abdul, et al.
Published: (2025)

Accelerating Time Series Analysis via Processing using Non-Volatile Memories
by: Fernandez, Ivan, et al.
Published: (2022)

Record Acceleration of the Two-Dimensional Ising Model Using High-Performance Wafer Scale Engine
by: Van Essendelft, Dirk, et al.
Published: (2024)

Exploring the Versal AI Engine for 3D Gaussian Splatting
by: Shimamura, Kotaro, et al.
Published: (2025)

Tensor Memory Engine: On-the-fly Data Reorganization for Ideal Locality
by: Hoornaert, Denis, et al.
Published: (2026)

STAR: An Efficient Softmax Engine for Attention Model with RRAM Crossbar
by: Zhai, Yifeng, et al.
Published: (2024)

DX100: A Programmable Data Access Accelerator for Indirection
by: Khadem, Alireza, et al.
Published: (2025)

Accelerating Electrostatics-based Global Placement with Enhanced FFT Computation
by: Zhang, Hangyu, et al.
Published: (2025)

ApproxPilot: A GNN-based Accelerator Approximation Framework
by: Zhang, Qing, et al.
Published: (2024)

High Utilization Energy-Aware Real-Time Inference Deep Convolutional Neural Network Accelerator
by: Lin, Kuan-Ting, et al.
Published: (2025)

Analysis of Single Event Induced Bit Faults in a Deep Neural Network Accelerator Pipeline
by: Jonckers, Naïn, et al.
Published: (2025)