:: Library Catalog

Bibliographic Details
Main Authors:	Chen, Sitian; Zhou, Amelie Chi; Shi, Yucheng; Li, Yusen; Yao, Xin
Format:	Preprint
Published:	2024
Subjects:	Hardware Architecture
Online Access:	https://arxiv.org/abs/2410.23805

Similar Items

Co-Designing Graph-based Approximate Nearest Neighbor Search at Billion Scale for Processing-in-Memory
by: Chen, Sitian, et al.
Published: (2026)

SpANNS: Optimizing Approximate Nearest Neighbor Search for Sparse Vectors Using Near Memory Processing
by: Zhang, Tianqi, et al.
Published: (2026)

FaTRQ: Tiered Residual Quantization for LLM Vector Search in Far-Memory-Aware ANNS Systems
by: Zhang, Tianqi, et al.
Published: (2026)

Pathfinding Future PIM Architectures by Demystifying a Commercial PIM Technology
by: Hyun, Bongjoon, et al.
Published: (2023)

Inclusive-PIM: Hardware-Software Co-design for Broad Acceleration on Commercial PIM Architectures
by: Alsop, Johnathan, et al.
Published: (2023)

PIM-malloc: A Fast and Scalable Dynamic Memory Allocator for Processing-In-Memory (PIM) Architectures
by: Lee, Dongjae, et al.
Published: (2025)

PIM-LLM: A High-Throughput Hybrid PIM Architecture for 1-bit LLMs
by: Malekar, Jinendra, et al.
Published: (2025)

Annotated PIM Bibliography
by: Kogge, Peter M.
Published: (2026)

Towards Efficient SRAM-PIM Architecture Design by Exploiting Unstructured Bit-Level Sparsity
by: Duan, Cenlin, et al.
Published: (2024)

PIM-MMU: A Memory Management Unit for Accelerating Data Transfers in Commercial PIM Systems
by: Lee, Dongjae, et al.
Published: (2024)

LEAP: LLM Inference on Scalable PIM-NoC Architecture with Balanced Dataflow and Fine-Grained Parallelism
by: Wang, Yimin, et al.
Published: (2025)

THERMOS: Thermally-Aware Multi-Objective Scheduling of AI Workloads on Heterogeneous Multi-Chiplet PIM Architectures
by: Kanani, Alish, et al.
Published: (2025)

PIM-FW: Hardware-Software Co-Design of All-pairs Shortest Paths in DRAM
by: Lu, Tsung-Han, et al.
Published: (2025)

CD-PIM: A High-Bandwidth and Compute-Efficient LPDDR5-Based PIM for Low-Batch LLM Acceleration on Edge-Device
by: Lin, Ye, et al.
Published: (2026)

LP-Spec: Leveraging LPDDR PIM for Efficient LLM Mobile Speculative Inference with Architecture-Dataflow Co-Optimization
by: He, Siyuan, et al.
Published: (2025)

Accelerating Multi-Scale Deformable Attention Using Near-Memory-Processing Architecture
by: Li, Huize, et al.
Published: (2026)

BitROM: Weight Reload-Free CiROM Architecture Towards Billion-Parameter 1.58-bit LLM Inference
by: Zhang, Wenlun, et al.
Published: (2025)

L3: DIMM-PIM Integrated Architecture and Coordination for Scalable Long-Context LLM Inference
by: Liu, Qingyuan, et al.
Published: (2025)

FILCO: Flexible Composing Architecture with Real-Time Reconfigurability for DNN Acceleration
by: Chen, Xingzhen, et al.
Published: (2026)

PIMfused: Near-Bank DRAM-PIM with Fused-layer Dataflow for CNN Data Transfer Optimization
by: Yang, Simei, et al.
Published: (2025)

SAL-PIM: A Subarray-level Processing-in-Memory Architecture with LUT-based Linear Interpolation for Transformer-based Text Generation
by: Han, Wontak, et al.
Published: (2024)

Toleo: Scaling Freshness to Tera-scale Memory using CXL and PIM
by: Dong, Juechu, et al.
Published: (2024)

DL-PIM: Improving Data Locality in Processing-in-Memory Systems
by: Tian, Parker Hao, et al.
Published: (2025)

AME-PIM: Can Memory be Your Next Tensor Accelerator?
by: Venieri, Emanuele, et al.
Published: (2026)

Fast-OverlaPIM: A Fast Overlap-driven Mapping Framework for Processing In-Memory Neural Network Acceleration
by: Wang, Xuan, et al.
Published: (2024)

ProactivePIM: Accelerating Weight-Sharing Embedding Layer with PIM for Scalable Recommendation System
by: Kim, Youngsuk, et al.
Published: (2024)

NeuPIMs: NPU-PIM Heterogeneous Acceleration for Batched LLM Inferencing
by: Heo, Guseul, et al.
Published: (2024)

IANUS: Integrated Accelerator based on NPU-PIM Unified Memory System
by: Seo, Minseok, et al.
Published: (2024)

Membrane: Accelerating Database Analytics with Bank-Level DRAM-PIM Filtering
by: Shekar, Akhil, et al.
Published: (2025)

PIM-GPT: A Hybrid Process-in-Memory Accelerator for Autoregressive Transformers
by: Wu, Yuting, et al.
Published: (2023)

A$^3$PIM: An Automated, Analytic and Accurate Processing-in-Memory Offloader
by: Jiang, Qingcai, et al.
Published: (2024)

HH-PIM: Dynamic Optimization of Power and Performance with Heterogeneous-Hybrid PIM for Edge AI Devices
by: Jeon, Sangmin, et al.
Published: (2025)

UpDLRM: Accelerating Personalized Recommendation using Real-World PIM Architecture
by: Chen, Sitian, et al.
Published: (2024)

AIM: Software and Hardware Co-design for Architecture-level IR-drop Mitigation in High-performance PIM
by: Zhang, Yuanpeng, et al.
Published: (2025)

Shared-PIM: Enabling Concurrent Computation and Data Flow for Faster Processing-in-DRAM
by: Mamdouh, Ahmed, et al.
Published: (2024)

The BRAM is the Limit: Shattering Myths, Shaping Standards, and Building Scalable PIM Accelerators
by: Kabir, MD Arafat, et al.
Published: (2024)

Sieve: Dynamic Expert-Aware PIM Acceleration for Evolving Mixture-of-Experts Models
by: Kim, Jungwoo, et al.
Published: (2026)

LOCALUT: Harnessing Capacity-Computation Tradeoffs for LUT-Based Inference in DRAM-PIM
by: Hong, Junguk, et al.
Published: (2026)

PIM-AI: A Novel Architecture for High-Efficiency LLM Inference
by: Ortega, Cristobal, et al.
Published: (2024)

PyPIM: Integrating Digital Processing-in-Memory from Microarchitectural Design to Python Tensors
by: Leitersdorf, Orian, et al.
Published: (2023)