Bibliographic Details
Main Authors: Yuhala, Peterson, Mwaisela, Mpoki, Felber, Pascal, Schiavoni, Valerio
Format: Preprint
Published: 2026
Online Access: https://arxiv.org/abs/2603.23762
Table of Contents:
  • Processing-in-memory (PIM) architectures bring computation closer to data, reducing the processor-memory transfer bottleneck in traditional processor-centric designs. Novel hardware solutions, such as UPMEM's in-memory processing technology, achieve this by integrating low-power DRAM processing units (DPUs) into memory DIMMs, enabling massive parallelism and improved memory bandwidth. However, paradoxically, these PIM architectures introduce mandatory coarse-grained data transfers between host DRAM and DPUs, which often become the new bottleneck. We present PIM-CACHE, a lightweight data staging layer that dynamically eliminates redundant data transfers to PIM DPUs by exploiting workload similarity, achieving content-aware copy (CAC). We evaluate PIM-CACHE on both synthetic workloads and real-world genome datasets, demonstrating its effectiveness in reducing PIM data transfer overhead.
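
The idea of skipping redundant host-to-DPU transfers can be illustrated with a small host-side simulation. This is only a sketch under stated assumptions, not the PIM-CACHE design: the abstract does not say how similarity is detected, so the per-slot SHA-256 digest, the `StagingLayer` class, the `CHUNK_SIZE` value, and the dictionary standing in for DPU memory are all hypothetical. No UPMEM SDK calls are used.

```python
import hashlib

CHUNK_SIZE = 8  # bytes; hypothetical and deliberately tiny for illustration


class StagingLayer:
    """Toy content-aware copy: skip chunks whose content is already resident."""

    def __init__(self):
        self.device_memory = {}   # slot index -> bytes (stand-in for DPU memory)
        self.device_digest = {}   # slot index -> digest of the resident chunk
        self.bytes_sent = 0       # bytes actually "transferred"
        self.bytes_skipped = 0    # bytes whose transfer was avoided

    def copy_to_device(self, data: bytes) -> None:
        """Copy `data` slot by slot, sending only chunks that changed."""
        for slot, offset in enumerate(range(0, len(data), CHUNK_SIZE)):
            chunk = data[offset:offset + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).digest()
            if self.device_digest.get(slot) == digest:
                # Same content already sits in this slot: no transfer needed.
                self.bytes_skipped += len(chunk)
                continue
            self.device_memory[slot] = chunk
            self.device_digest[slot] = digest
            self.bytes_sent += len(chunk)

    def read_back(self, length: int) -> bytes:
        """Return the first `length` bytes as seen by the simulated device."""
        n_chunks = -(-length // CHUNK_SIZE)  # ceiling division
        joined = b"".join(self.device_memory[s] for s in range(n_chunks))
        return joined[:length]


# Two similar inputs: only the last 8-byte chunk differs.
first = b"ACGTACGTTTGACCAAGGGTCCAA"
second = b"ACGTACGTTTGACCAAGGGTCCTT"

layer = StagingLayer()
layer.copy_to_device(first)
layer.copy_to_device(second)
print(layer.bytes_sent, layer.bytes_skipped)
```

In this toy run the second copy re-sends only the chunk that changed, while the simulated device still ends up holding the full second buffer. A real staging layer would also have to weigh the cost of hashing on the host against the transfer time it saves, which this sketch does not model.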