| Main Authors: | Chen, Sitian, Zhou, Amelie Chi, Shi, Yucheng, Li, Yusen, Yao, Xin |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2410.23805 |
Similar Items
Co-Designing Graph-based Approximate Nearest Neighbor Search at Billion Scale for Processing-in-Memory
by: Chen, Sitian, et al.
Published: (2026)
SpANNS: Optimizing Approximate Nearest Neighbor Search for Sparse Vectors Using Near Memory Processing
by: Zhang, Tianqi, et al.
Published: (2026)
FaTRQ: Tiered Residual Quantization for LLM Vector Search in Far-Memory-Aware ANNS Systems
by: Zhang, Tianqi, et al.
Published: (2026)
Pathfinding Future PIM Architectures by Demystifying a Commercial PIM Technology
by: Hyun, Bongjoon, et al.
Published: (2023)
Inclusive-PIM: Hardware-Software Co-design for Broad Acceleration on Commercial PIM Architectures
by: Alsop, Johnathan, et al.
Published: (2023)
PIM-malloc: A Fast and Scalable Dynamic Memory Allocator for Processing-In-Memory (PIM) Architectures
by: Lee, Dongjae, et al.
Published: (2025)
PIM-LLM: A High-Throughput Hybrid PIM Architecture for 1-bit LLMs
by: Malekar, Jinendra, et al.
Published: (2025)
Annotated PIM Bibliography
by: Kogge, Peter M.
Published: (2026)
Towards Efficient SRAM-PIM Architecture Design by Exploiting Unstructured Bit-Level Sparsity
by: Duan, Cenlin, et al.
Published: (2024)
PIM-MMU: A Memory Management Unit for Accelerating Data Transfers in Commercial PIM Systems
by: Lee, Dongjae, et al.
Published: (2024)
LEAP: LLM Inference on Scalable PIM-NoC Architecture with Balanced Dataflow and Fine-Grained Parallelism
by: Wang, Yimin, et al.
Published: (2025)
THERMOS: Thermally-Aware Multi-Objective Scheduling of AI Workloads on Heterogeneous Multi-Chiplet PIM Architectures
by: Kanani, Alish, et al.
Published: (2025)
PIM-FW: Hardware-Software Co-Design of All-pairs Shortest Paths in DRAM
by: Lu, Tsung-Han, et al.
Published: (2025)
CD-PIM: A High-Bandwidth and Compute-Efficient LPDDR5-Based PIM for Low-Batch LLM Acceleration on Edge-Device
by: Lin, Ye, et al.
Published: (2026)
LP-Spec: Leveraging LPDDR PIM for Efficient LLM Mobile Speculative Inference with Architecture-Dataflow Co-Optimization
by: He, Siyuan, et al.
Published: (2025)
Accelerating Multi-Scale Deformable Attention Using Near-Memory-Processing Architecture
by: Li, Huize, et al.
Published: (2026)
BitROM: Weight Reload-Free CiROM Architecture Towards Billion-Parameter 1.58-bit LLM Inference
by: Zhang, Wenlun, et al.
Published: (2025)
L3: DIMM-PIM Integrated Architecture and Coordination for Scalable Long-Context LLM Inference
by: Liu, Qingyuan, et al.
Published: (2025)
FILCO: Flexible Composing Architecture with Real-Time Reconfigurability for DNN Acceleration
by: Chen, Xingzhen, et al.
Published: (2026)
PIMfused: Near-Bank DRAM-PIM with Fused-layer Dataflow for CNN Data Transfer Optimization
by: Yang, Simei, et al.
Published: (2025)
SAL-PIM: A Subarray-level Processing-in-Memory Architecture with LUT-based Linear Interpolation for Transformer-based Text Generation
by: Han, Wontak, et al.
Published: (2024)
Toleo: Scaling Freshness to Tera-scale Memory using CXL and PIM
by: Dong, Juechu, et al.
Published: (2024)
DL-PIM: Improving Data Locality in Processing-in-Memory Systems
by: Tian, Parker Hao, et al.
Published: (2025)
AME-PIM: Can Memory be Your Next Tensor Accelerator?
by: Venieri, Emanuele, et al.
Published: (2026)
Fast-OverlaPIM: A Fast Overlap-driven Mapping Framework for Processing In-Memory Neural Network Acceleration
by: Wang, Xuan, et al.
Published: (2024)
ProactivePIM: Accelerating Weight-Sharing Embedding Layer with PIM for Scalable Recommendation System
by: Kim, Youngsuk, et al.
Published: (2024)
NeuPIMs: NPU-PIM Heterogeneous Acceleration for Batched LLM Inferencing
by: Heo, Guseul, et al.
Published: (2024)
IANUS: Integrated Accelerator based on NPU-PIM Unified Memory System
by: Seo, Minseok, et al.
Published: (2024)
Membrane: Accelerating Database Analytics with Bank-Level DRAM-PIM Filtering
by: Shekar, Akhil, et al.
Published: (2025)
PIM-GPT: A Hybrid Process-in-Memory Accelerator for Autoregressive Transformers
by: Wu, Yuting, et al.
Published: (2023)
A$^3$PIM: An Automated, Analytic and Accurate Processing-in-Memory Offloader
by: Jiang, Qingcai, et al.
Published: (2024)
HH-PIM: Dynamic Optimization of Power and Performance with Heterogeneous-Hybrid PIM for Edge AI Devices
by: Jeon, Sangmin, et al.
Published: (2025)
UpDLRM: Accelerating Personalized Recommendation using Real-World PIM Architecture
by: Chen, Sitian, et al.
Published: (2024)
AIM: Software and Hardware Co-design for Architecture-level IR-drop Mitigation in High-performance PIM
by: Zhang, Yuanpeng, et al.
Published: (2025)
Shared-PIM: Enabling Concurrent Computation and Data Flow for Faster Processing-in-DRAM
by: Mamdouh, Ahmed, et al.
Published: (2024)
The BRAM is the Limit: Shattering Myths, Shaping Standards, and Building Scalable PIM Accelerators
by: Kabir, MD Arafat, et al.
Published: (2024)
Sieve: Dynamic Expert-Aware PIM Acceleration for Evolving Mixture-of-Experts Models
by: Kim, Jungwoo, et al.
Published: (2026)
LOCALUT: Harnessing Capacity-Computation Tradeoffs for LUT-Based Inference in DRAM-PIM
by: Hong, Junguk, et al.
Published: (2026)
PIM-AI: A Novel Architecture for High-Efficiency LLM Inference
by: Ortega, Cristobal, et al.
Published: (2024)
PyPIM: Integrating Digital Processing-in-Memory from Microarchitectural Design to Python Tensors
by: Leitersdorf, Orian, et al.
Published: (2023)