:: Library Catalog

Bibliographic Details
Main Authors:	Zhang, Jingyao; Sadredini, Elaheh
Format:	Preprint
Published:	2025
Subjects:	Hardware Architecture
Online Access:	https://arxiv.org/abs/2509.22980

Similar Items

A Near-Cache Architectural Framework for Cryptographic Computing
by: Zhang, Jingyao, et al.
Published: (2025)

CryptoSRAM: Enabling High-Throughput Cryptography on MCUs via In-SRAM Computing
by: Zhang, Jingyao, et al.
Published: (2025)

SAIL: SRAM-Accelerated LLM Inference System with Lookup-Table-based GEMV
by: Zhang, Jingyao, et al.
Published: (2025)

Flexible Bit-Truncation Memory for Approximate Applications on the Edge
by: Oswald, William, et al.
Published: (2025)

Stoch-IMC: A Bit-Parallel Stochastic In-Memory Computing Architecture Based on STT-MRAM
by: Hajisadeghi, Amir M., et al.
Published: (2024)

BitMoD: Bit-serial Mixture-of-Datatype LLM Acceleration
by: Chen, Yuzong, et al.
Published: (2024)

Allspark: Workload Orchestration for Visual Transformers on Processing In-Memory Systems
by: Ge, Mengke, et al.
Published: (2024)

MCBP: A Memory-Compute Efficient LLM Inference Accelerator Leveraging Bit-Slice-enabled Sparsity and Repetitiveness
by: Wang, Huizheng, et al.
Published: (2025)

PICO-RAM: A PVT-Insensitive Analog Compute-In-Memory SRAM Macro with In-Situ Multi-Bit Charge Computing and 6T Thin-Cell-Compatible Layout
by: Chen, Zhiyu, et al.
Published: (2024)

A 64-Spin All-to-All CMOS Ising Machine with Landscape Perturbation Achieving 2.28 nJ/Edge-Bit Energy-to-Solution
by: Salim, Ahmet Yusuf, et al.
Published: (2026)

Bit Transition Reduction by Data Transmission Ordering in NoC-based DNN Accelerator
by: Chen, Yizhi, et al.
Published: (2025)

Weight Transformations in Bit-Sliced Crossbar Arrays for Fault Tolerant Computing-in-Memory: Design Techniques and Evaluation Framework
by: Malhotra, Akul, et al.
Published: (2025)

Bit-Flip Fault Attack: Crushing Graph Neural Networks via Gradual Bit Search
by: Abharian, Sanaz Kazemi, et al.
Published: (2025)

Binary Weight Multi-Bit Activation Quantization for Compute-in-Memory CNN Accelerators
by: Zhou, Wenyong, et al.
Published: (2025)

BF-IMNA: A Bit Fluid In-Memory Neural Architecture for Neural Network Acceleration
by: Rakka, Mariam, et al.
Published: (2024)

DaPPA: A Data-Parallel Programming Framework for Processing-in-Memory Architectures
by: Oliveira, Geraldo F., et al.
Published: (2023)

Workload Characterization for Branch Predictability
by: Vikas, FNU, et al.
Published: (2025)

Breaking the HBM Bit Cost Barrier: Domain-Specific ECC for AI Inference Infrastructure
by: Xie, Rui, et al.
Published: (2025)

Platinum: Path-Adaptable LUT-Based Accelerator Tailored for Low-Bit Weight Matrix Multiplication
by: Shan, Haoxuan, et al.
Published: (2025)

RAS: A Bit-Exact rANS Accelerator For High-Performance Neural Lossless Compression
by: Qin, Yuchao, et al.
Published: (2025)

Big-PERCIVAL: Exploring the Native Use of 64-Bit Posit Arithmetic in Scientific Computing
by: Mallasén, David, et al.
Published: (2023)

Bit-Width-Aware Design Environment for Few-Shot Learning on Edge AI Hardware
by: Kanda, R., et al.
Published: (2026)

Commercial Evaluation of Zero-Skipping MAC Design for Bit Sparsity Exploitation in DL Inference
by: Nair, Harideep, et al.
Published: (2024)

Towards Efficient SRAM-PIM Architecture Design by Exploiting Unstructured Bit-Level Sparsity
by: Duan, Cenlin, et al.
Published: (2024)

T-MAN: Enabling End-to-End Low-Bit LLM Inference on NPUs via Unified Table Lookup
by: Wei, Jianyu, et al.
Published: (2025)

Exploring the Performance Improvement of Tensor Processing Engines through Transformation in the Bit-weight Dimension of MACs
by: Wu, Qizhe, et al.
Published: (2025)

Efficient FIR filtering with Bit Layer Multiply Accumulator
by: Liguori, Vincenzo
Published: (2024)

Efficient SRAM-PIM Co-design by Joint Exploration of Value-Level and Bit-Level Sparsity
by: Duan, Cenlin, et al.
Published: (2025)

Analysis of Single Event Induced Bit Faults in a Deep Neural Network Accelerator Pipeline
by: Jonckers, Naïn, et al.
Published: (2025)

FAULT+PROBE: A Generic Rowhammer-based Bit Recovery Attack
by: Derya, Kemal, et al.
Published: (2024)

BitROM: Weight Reload-Free CiROM Architecture Towards Billion-Parameter 1.58-bit LLM Inference
by: Zhang, Wenlun, et al.
Published: (2025)

Panacea: Novel DNN Accelerator using Accuracy-Preserving Asymmetric Quantization and Energy-Saving Bit-Slice Sparsity
by: Kam, Dongyun, et al.
Published: (2024)

ITERA-LLM: Boosting Sub-8-Bit Large Language Model Inference via Iterative Tensor Decomposition
by: Zheng, Keran, et al.
Published: (2025)

Energy-Efficient p-Bit-Based Fully-Connected Quantum-Inspired Simulated Annealer with Dual BRAM Architecture
by: Onizawa, Naoya, et al.
Published: (2026)

Communication Characterization of AI Workloads for Large-scale Multi-chiplet Accelerators
by: Musavi, Mariam, et al.
Published: (2024)

Characterizing and Optimizing Realistic Workloads on a Commercial Compute-in-SRAM Device
by: Zhang, Niansong, et al.
Published: (2025)

A Bit Level Weight Reordering Strategy Based on Column Similarity to Explore Weight Sparsity in RRAM-based NN Accelerator
by: Yang, Weiping, et al.
Published: (2025)

BitDecoding: Unlocking Tensor Cores for Long-Context LLMs with Low-Bit KV Cache
by: Du, Dayou, et al.
Published: (2025)

BBS: Bi-directional Bit-level Sparsity for Deep Learning Acceleration
by: Chen, Yuzong, et al.
Published: (2024)

FedBit: Accelerating Privacy-Preserving Federated Learning via Bit-Interleaved Packing and Cross-Layer Co-Design
by: Meng, Xiangchen, et al.
Published: (2025)