:: Library Catalog

Bibliographic Details
Main Authors:	Zhang, Yuanpeng; Hu, Xing; Chen, Xi; Yuan, Zhihang; Li, Cong; Zhu, Jingchen; Wang, Zhao; Zhang, Chenguang; Si, Xin; Gao, Wei; Wu, Qiang; Wang, Runsheng; Sun, Guangyu
Format:	Preprint
Published:	2025
Subjects:	Hardware Architecture; Artificial Intelligence; Machine Learning
Online Access:	https://arxiv.org/abs/2511.04321

Similar Items

Inclusive-PIM: Hardware-Software Co-design for Broad Acceleration on Commercial PIM Architectures
by: Alsop, Johnathan, et al.
Published: (2023)

METRO: A Software-Hardware Co-Design of Interconnections for Spatial DNN Accelerators
by: Wang, Zhao, et al.
Published: (2021)

Algorithm-hardware co-design for Energy-Efficient A/D conversion in ReRAM-based accelerators
by: Zhang, Chenguang, et al.
Published: (2024)

Hardware-Software Co-design for 3D-DRAM-based LLM Serving Accelerator
by: Li, Cong, et al.
Published: (2026)

PIM-FW: Hardware-Software Co-Design of All-pairs Shortest Paths in DRAM
by: Lu, Tsung-Han, et al.
Published: (2025)

NeoMem: Hardware/Software Co-Design for CXL-Native Memory Tiering
by: Zhou, Zhe, et al.
Published: (2024)

MERE: Hardware-Software Co-Design for Masking Cache Miss Latency in Embedded Processors
by: You, Dean, et al.
Published: (2025)

Efficient SRAM-PIM Co-design by Joint Exploration of Value-Level and Bit-Level Sparsity
by: Duan, Cenlin, et al.
Published: (2025)

ANCoEF: Asynchronous Neuromorphic Algorithm/Hardware Co-Exploration Framework with a Fully Asynchronous Simulator
by: Zhang, Jian, et al.
Published: (2024)

Pathfinding Future PIM Architectures by Demystifying a Commercial PIM Technology
by: Hyun, Bongjoon, et al.
Published: (2023)

SOLE: Hardware-Software Co-design of Softmax and LayerNorm for Efficient Transformer Inference
by: Wang, Wenxun, et al.
Published: (2025)

PIM-malloc: A Fast and Scalable Dynamic Memory Allocator for Processing-In-Memory (PIM) Architectures
by: Lee, Dongjae, et al.
Published: (2025)

Towards Efficient SRAM-PIM Architecture Design by Exploiting Unstructured Bit-Level Sparsity
by: Duan, Cenlin, et al.
Published: (2024)

HSCO-Bench: An Agent-Driven End-to-End Hardware-Software Co-design Benchmark for Systems-on-Chip
by: Tsai, Pei-Huan, et al.
Published: (2026)

PIM-LLM: A High-Throughput Hybrid PIM Architecture for 1-bit LLMs
by: Malekar, Jinendra, et al.
Published: (2025)

SoftmAP: Software-Hardware Co-design for Integer-Only Softmax on Associative Processors
by: Rakka, Mariam, et al.
Published: (2024)

LP-Spec: Leveraging LPDDR PIM for Efficient LLM Mobile Speculative Inference with Architecture-Dataflow Co-Optimization
by: He, Siyuan, et al.
Published: (2025)

Theseus: Exploring Efficient Wafer-Scale Chip Design for Large Language Models
by: Zhu, Jingchen, et al.
Published: (2024)

LEAP: LLM Inference on Scalable PIM-NoC Architecture with Balanced Dataflow and Fine-Grained Parallelism
by: Wang, Yimin, et al.
Published: (2025)

AccelCIM: Systematic Dataflow Exploration for SRAM Compute-in-Memory Accelerator
by: Xue, Chenhao, et al.
Published: (2026)

CellE: Automated Standard Cell Library Extension via Equality Saturation
by: Ren, Yi, et al.
Published: (2026)

AutoPDR: Circuit-Aware Solver Configuration Prediction for Hardware Model Checking
by: Hu, Guangyu, et al.
Published: (2026)

MixPE: Quantization and Hardware Co-design for Efficient LLM Inference
by: Zhang, Yu, et al.
Published: (2024)

RePart: Efficient Hypergraph Partitioning with Logic Replication Optimization for Multi-FPGA System
by: Fu, Zizhuo, et al.
Published: (2026)

GenDRAM: Hardware-Software Co-Design of General Platform in DRAM
by: Lu, Tsung-Han, et al.
Published: (2026)

Annotated PIM Bibliography
by: Kogge, Peter M.
Published: (2026)

SkyByte: Architecting an Efficient Memory-Semantic CXL-based SSD with OS and Hardware Co-design
by: Zhang, Haoyang, et al.
Published: (2025)

Hardware Software Optimizations for Fast Model Recovery on Reconfigurable Architectures
by: Xu, Bin, et al.
Published: (2025)

The Quest for Reliable AI Accelerators: Cross-Layer Evaluation and Design Optimization
by: Li, Meng, et al.
Published: (2026)

LeGend: A Data-Driven Framework for Lemma Generation in Hardware Model Checking
by: Miao, Mingkai, et al.
Published: (2026)

UpANNS: Enhancing Billion-Scale ANNS Efficiency with Real-World PIM Architecture
by: Chen, Sitian, et al.
Published: (2024)

Hardware-Software Co-Design for Accelerating Transformer Inference Leveraging Compute-in-Memory
by: Kim, Dong Eun, et al.
Published: (2025)

EvolveGen: Algorithmic Level Hardware Model Checking Benchmark Generation through Reinforcement Learning
by: Hu, Guangyu, et al.
Published: (2026)

TokenStack: A Heterogeneous HBM-PIM Architecture and Runtime for Efficient LLM Inference
by: Li, Zhuoran, et al.
Published: (2026)

PIM-MMU: A Memory Management Unit for Accelerating Data Transfers in Commercial PIM Systems
by: Lee, Dongjae, et al.
Published: (2024)

Reconfigurable Stream Network Architecture
by: Wang, Chengyue, et al.
Published: (2024)

LOCALUT: Harnessing Capacity-Computation Tradeoffs for LUT-Based Inference in DRAM-PIM
by: Hong, Junguk, et al.
Published: (2026)

MCMComm: Hardware-Software Co-Optimization for End-to-End Communication in Multi-Chip-Modules
by: Raj, Ritik, et al.
Published: (2025)

L3: DIMM-PIM Integrated Architecture and Coordination for Scalable Long-Context LLM Inference
by: Liu, Qingyuan, et al.
Published: (2025)

CIM-Tuner: Balancing the Compute and Storage Capacity of SRAM-CIM Accelerator via Hardware-mapping Co-exploration
by: Chen, Jinwu, et al.
Published: (2026)