| Main Authors: | Jeon, Kang Eun, Rhe, Johnny, Ko, Jong Hwan |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2502.07820 |
Similar Items
Row-Column Hybrid Grouping for Fault-Resilient Multi-Bit Weight Representation on IMC Arrays
by: Jeon, Kang Eun, et al.
Published: (2025)
Column-wise Quantization of Weights and Partial Sums for Accurate and Efficient Compute-In-Memory Accelerators
by: Kim, Jiyoon, et al.
Published: (2025)
MEMHD: Memory-Efficient Multi-Centroid Hyperdimensional Computing for Fully-Utilized In-Memory Computing Architectures
by: Kang, Do Yeong, et al.
Published: (2025)
Towards Efficient IMC Accelerator Design Through Joint Hardware-Workload Co-optimization
by: Krestinskaya, Olga, et al.
Published: (2024)
A 28 nm AI microcontroller with tightly coupled zero-standby power weight memory featuring standard logic compatible 4 Mb 4-bits/cell embedded flash technology
by: Kim, Daewung, et al.
Published: (2025)
RangeGuard: Efficient, Bounded Approximate Error Correction for Reliable DNNs
by: Ko, Hanum, et al.
Published: (2026)
HH-PIM: Dynamic Optimization of Power and Performance with Heterogeneous-Hybrid PIM for Edge AI Devices
by: Jeon, Sangmin, et al.
Published: (2025)
KAN-SAs: Efficient Acceleration of Kolmogorov-Arnold Networks on Systolic Arrays
by: Errabii, Sohaib, et al.
Published: (2025)
SystolicAttention: Fusing FlashAttention within a Single Systolic Array
by: Lin, Jiawei, et al.
Published: (2025)
Hybrid Systolic Array Accelerator with Optimized Dataflow for Edge Large Language Model Inference
by: Chen, Chun-Ting, et al.
Published: (2025)
ProactivePIM: Accelerating Weight-Sharing Embedding Layer with PIM for Scalable Recommendation System
by: Kim, Youngsuk, et al.
Published: (2024)
Tempus Core: Area-Power Efficient Temporal-Unary Convolution Core for Low-Precision Edge DLAs
by: Vellaisamy, Prabhu, et al.
Published: (2024)
Bitwise Systolic Array Architecture for Runtime-Reconfigurable Multi-precision Quantized Multiplication on Hardware Accelerators
by: Liu, Yuhao, et al.
Published: (2026)
Strassen Multisystolic Array Hardware Architectures
by: Pogue, Trevor E., et al.
Published: (2025)
TrIM, Triangular Input Movement Systolic Array for Convolutional Neural Networks: Dataflow and Analytical Modelling
by: Sestito, Cristian, et al.
Published: (2024)
SA-Kura: An Energy-Efficient Systolic Array Accelerator for Locally-Coupled Kuramoto Drift in Diffusion Sampling
by: Jin, Jeongmin, et al.
Published: (2026)
BitParticle: Partializing Sparse Dual-Factors to Build Quasi-Synchronizing MAC Arrays for Energy-efficient DNNs
by: Qiaoyuan, Feilong, et al.
Published: (2025)
TriGen: NPU Architecture for End-to-End Acceleration of Large Language Models based on SW-HW Co-Design
by: Lee, Jonghun, et al.
Published: (2026)
Neuromorphic Computing for Low-Power Artificial Intelligence
by: Katti, Keshava, et al.
Published: (2026)
Low Power Approximate Multiplier Architecture for Deep Neural Networks
by: Jaswal, Pragun, et al.
Published: (2025)
KANtize: Exploring Low-bit Quantization of Kolmogorov-Arnold Networks for Efficient Inference
by: Errabii, Sohaib, et al.
Published: (2026)
Exploration of Unary Arithmetic-Based Matrix Multiply Units for Low Precision DL Accelerators
by: Vellaisamy, Prabhu, et al.
Published: (2026)
Achieving Trustworthy Real-Time Decision Support Systems with Low-Latency Interpretable AI Models
by: Deng, Zechun, et al.
Published: (2025)
Shavette: Low Power Neural Network Acceleration via Algorithm-level Error Detection and Undervolting
by: Rinkinen, Mikael, et al.
Published: (2024)
HALO: Memory-Centric Heterogeneous Accelerator with 2.5D Integration for Low-Batch LLM Inference
by: Negi, Shubham, et al.
Published: (2025)
Towards Optimal Circuit Generation: Multi-Agent Collaboration Meets Collective Intelligence
by: Qin, Haiyan, et al.
Published: (2025)
FVEval: Understanding Language Model Capabilities in Formal Verification of Digital Hardware
by: Kang, Minwoo, et al.
Published: (2024)
YOCO: A Hybrid In-Memory Computing Architecture with 8-bit Sub-PetaOps/W In-Situ Multiply Arithmetic for Large-Scale AI
by: Xuan, Zihao, et al.
Published: (2023)
Expert Streaming: Accelerating Low-Batch MoE Inference via Multi-chiplet Architecture and Dynamic Expert Trajectory Scheduling
by: Ma, Songchen, et al.
Published: (2026)
ReasoningV: Efficient Verilog Code Generation with Adaptive Hybrid Reasoning Model
by: Qin, Haiyan, et al.
Published: (2025)
Sangam: Chiplet-Based DRAM-PIM Accelerator with CXL Integration for LLM Inferencing
by: Kiyawat, Khyati, et al.
Published: (2025)
LOREN: Low Rank-Based Code-Rate Adaptation in Neural Receivers
by: Van Bolderik, Bram, et al.
Published: (2026)
ELSA: An ELastic SNN Inference Architecture for Efficient Neuromorphic Computing
by: You, Kang, et al.
Published: (2026)
SPADE: Sparse Pillar-based 3D Object Detection Accelerator for Autonomous Driving
by: Lee, Minjae, et al.
Published: (2023)
SAFFIRA: a Framework for Assessing the Reliability of Systolic-Array-Based DNN Accelerators
by: Taheri, Mahdi, et al.
Published: (2024)
Chiplet Placement Order Exploration Based on Learning to Rank with Graph Representation
by: Deng, Zhihui, et al.
Published: (2024)
VUSA: Virtually Upscaled Systolic Array Architecture to Exploit Unstructured Sparsity in AI Acceleration
by: Helal, Shereef, et al.
Published: (2025)
Explainable AI-Guided Efficient Approximate DNN Generation for Multi-Pod Systolic Arrays
by: Siddique, Ayesha, et al.
Published: (2025)
Tensor-Compressed and Fully-Quantized Training of Neural PDE Solvers
by: Lu, Jinming, et al.
Published: (2025)
FASQ: Flexible Accelerated Subspace Quantization for Calibration-Free LLM Compression
by: Qiao, Ye, et al.
Published: (2026)