| Main Authors: | Legtchenko, Sergey, Stefanovici, Ioan, Black, Richard, Rowstron, Antony, Liu, Junyi, Costa, Paolo, Canakci, Burcu, Narayanan, Dushyanth, Wu, Xingbo |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2501.09605 |
Similar Items
Good things come in small packages: Should we build AI clusters with Lite-GPUs?
by: Canakci, Burcu, et al.
Published: (2025)
From GPUs to RRAMs: Distributed In-Memory Primal-Dual Hybrid Gradient Method for Solving Large-Scale Linear Optimization Problem
by: Vo, Huynh Q. N., et al.
Published: (2025)
Harnessing the Full Potential of RRAMs through Scalable and Distributed In-Memory Computing with Integrated Error Correction
by: Vo, Huynh Q. N., et al.
Published: (2025)
Open Challenges for a Production-ready Cloud Environment on top of RISC-V hardware
by: Call, Aaron, et al.
Published: (2025)
CLAASIC: a Cortex-Inspired Hardware Accelerator
by: Puente, Valentin, et al.
Published: (2016)
DFabric: Scaling Out Data Parallel Applications with CXL-Ethernet Hybrid Interconnects
by: Zhang, Xu, et al.
Published: (2024)
COMPASS: A Compiler Framework for Resource-Constrained Crossbar-Array Based In-Memory Deep Learning Accelerators
by: Park, Jihoon, et al.
Published: (2025)
Wattlytics: A Web Platform for Co-Optimizing Performance, Energy, and TCO in HPC Clusters
by: Afzal, Ayesha, et al.
Published: (2026)
TreeVQA: A Tree-Structured Execution Framework for Shot Reduction in Variational Quantum Algorithms
by: Hou, Yuewen, et al.
Published: (2025)
Architecting Distributed Quantum Computers: Design Insights from Resource Estimation
by: Filippov, Dmitry, et al.
Published: (2025)
ForgetMeNot: Understanding and Modeling the Impact of Forever Chemicals Toward Sustainable Large-Scale Computing
by: Roy, Rohan Basu, et al.
Published: (2025)
Carbon Connect: An Ecosystem for Sustainable Computing
by: Lee, Benjamin C., et al.
Published: (2024)
Reference Architecture of a Quantum-Centric Supercomputer
by: Seelam, Seetharami, et al.
Published: (2026)
PIM-AI: A Novel Architecture for High-Efficiency LLM Inference
by: Ortega, Cristobal, et al.
Published: (2024)
CMDS: Cross-layer Dataflow Optimization for DNN Accelerators Exploiting Multi-bank Memories
by: Shi, Man, et al.
Published: (2024)
Memory-Centric Computing: Solving Computing's Memory Problem
by: Mutlu, Onur, et al.
Published: (2025)
Scaling Intelligence: Designing Data Centers for Next-Gen Language Models
by: Tithi, Jesmin Jahan, et al.
Published: (2025)
Efficient Optimization Accelerator Framework for Multistate Ising Problems
by: Garg, Chirag, et al.
Published: (2025)
Analyzing a Two-Tier Disaggregated Memory Protection Scheme Based on Memory Replication
by: Volos, Haris, et al.
Published: (2025)
A Modern Primer on Processing in Memory
by: Mutlu, Onur, et al.
Published: (2020)
PIUMA: Programmable Integrated Unified Memory Architecture
by: Aananthakrishnan, Sriram, et al.
Published: (2020)
Transforming the Hybrid Cloud for Emerging AI Workloads
by: Chen, Deming, et al.
Published: (2024)
Towards Memory Specialization: A Case for Long-Term and Short-Term RAM
by: Li, Peijing, et al.
Published: (2025)
Accelerating Triangle Counting with Real Processing-in-Memory Systems
by: Asquini, Lorenzo, et al.
Published: (2025)
Balanced Data Placement for GEMV Acceleration with Processing-In-Memory
by: Ibrahim, Mohamed Assem, et al.
Published: (2024)
Efficient Architecture for RISC-V Vector Memory Access
by: Guan, Hongyi, et al.
Published: (2025)
Memory-Centric Computing: Recent Advances in Processing-in-DRAM
by: Mutlu, Onur, et al.
Published: (2024)
WaferLLM: Large Language Model Inference at Wafer Scale
by: He, Congjie, et al.
Published: (2025)
PASS: An Asynchronous Probabilistic Processor for Next Generation Intelligence
by: Patel, Saavan, et al.
Published: (2024)
Experience Deploying Containerized GenAI Services at an HPC Center
by: Beltre, Angel M., et al.
Published: (2025)
FengHuang: Next-Generation Memory Orchestration for AI Inferencing
by: Li, Jiamin, et al.
Published: (2025)
Handling of Memory Page Faults during Virtual-Address RDMA
by: Psistakis, Antonis
Published: (2025)
Pooling Engram Conditional Memory in Large Language Models using CXL
by: Ma, Ruiyang, et al.
Published: (2026)
New Tools, Programming Models, and System Support for Processing-in-Memory Architectures
by: Oliveira, Geraldo F.
Published: (2025)
PIMDAL: Mitigating the Memory Bottleneck in Data Analytics using a Real Processing-in-Memory System
by: Frouzakis, Manos, et al.
Published: (2025)
TeraPool: A Physical Design Aware, 1024 RISC-V Cores Shared-L1-Memory Scaled-up Cluster Design with High Bandwidth Main Memory Link
by: Zhang, Yichao, et al.
Published: (2026)
BlockAMC: Scalable In-Memory Analog Matrix Computing for Solving Linear Systems
by: Pan, Lunshuai, et al.
Published: (2024)
Survey of Disaggregated Memory: Cross-layer Technique Insights for Next-Generation Datacenters
by: Wang, Jing, et al.
Published: (2025)
A Programming Model for Disaggregated Memory over CXL
by: Assa, Gal, et al.
Published: (2024)
PAM: Processing Across Memory Hierarchy for Efficient KV-centric LLM Serving System
by: Liu, Lian, et al.
Published: (2026)