Bibliographic Details
Main Authors:	Shin, Injae; Tine, Blaise
Format:	Preprint
Published:	2025
Subjects:	Hardware Architecture
Online Access:	https://arxiv.org/abs/2503.17602

Similar Items

Ten-Four: An Open-Source Fused Dot Product Unit for Mixed-Precision GPGPU Tensor Cores
by: Rout, Nikhil, et al.
Published: (2025)

Inside VOLT: Designing an Open-Source GPU Compiler
by: Jeong, Shinnung, et al.
Published: (2025)

Hardware vs. Software Implementation of Warp-Level Features in Vortex RISC-V GPU
by: Pu, Huanzhi, et al.
Published: (2025)

CXL-GPU: Pushing GPU Memory Boundaries with the Integration of CXL Technologies
by: Gouk, Donghyun, et al.
Published: (2025)

CMD: A Cache-assisted GPU Memory Deduplication Architecture
by: Zhao, Wei, et al.
Published: (2024)

Five-Minute Rule 40 Years Later: A First-Principles Revisit for Modern Memory Hierarchy
by: Zhang, Tong, et al.
Published: (2025)

Apparate: Evading Memory Hierarchy with GodSpeed Wireless-on-Chip
by: GS, Nitesh Narayana, et al.
Published: (2024)

PUMA: Efficient and Low-Cost Memory Allocation and Alignment Support for Processing-Using-Memory Architectures
by: Oliveira, Geraldo F., et al.
Published: (2024)

e-GPU: An Open-Source and Configurable RISC-V Graphic Processing Unit for TinyAI Applications
by: Machetti, Simone, et al.
Published: (2025)

Theodosian: A Deep Dive into Memory-Hierarchy-Centric FHE Acceleration
by: Choi, Wonseok, et al.
Published: (2025)

A Configurable and Efficient Memory Hierarchy for Neural Network Hardware Accelerator
by: Bause, Oliver, et al.
Published: (2024)

OpenGL GPU-Based Rowhammer Attack (Work in Progress)
by: Plin, Antoine, et al.
Published: (2025)

Optimized Memory System Architecture for VESA VDC-M Decoder with Multi-Slice Support
by: Yang, Hannah, et al.
Published: (2025)

RoboGPU: Accelerating GPU Collision Detection for Robotics
by: Liu, Lufei, et al.
Published: (2026)

Analyzing Modern NVIDIA GPU cores
by: Huerta, Rodrigo, et al.
Published: (2025)

Memory Hierarchy Design for Caching Middleware in the Age of NVM
by: Ghandeharizadeh, Shahram, et al.
Published: (2025)

RapidChiplet: A Toolchain for Rapid Design Space Exploration of Chiplet Architectures
by: Iff, Patrick, et al.
Published: (2023)

Design of a GPU with Heterogeneous Cores for Graphics
by: Tomás, Aurora, et al.
Published: (2026)

Benchmarking and Dissecting the Nvidia Hopper GPU Architecture
by: Luo, Weile, et al.
Published: (2024)

COOK Access Control on an embedded Volta GPU
by: Lesage, Benjamin, et al.
Published: (2024)

täkōFormal: Enabling Robust Software for Programmable Memory Hierarchies (Extended Version)
by: Srinivasan, Pranav, et al.
Published: (2026)

Choreographer: A Full-System Framework for Fine-Grained Tasks in Cache Hierarchies
by: Nguyen, Hoa, et al.
Published: (2025)

Efficient Open Modification Spectral Library Searching in High-Dimensional Space with Multi-Level-Cell Memory
by: Fan, Keming, et al.
Published: (2024)

CuLifter: Lifting GPU Binaries to Typed IR
by: Zhao, Jisheng, et al.
Published: (2026)

All-rounder: A Flexible AI Accelerator with Diverse Data Format Support and Morphable Structure for Multi-DNN Processing
by: Noh, Seock-Hwan, et al.
Published: (2023)

GAP-LA: GPU-Accelerated Performance-Driven Layer Assignment
by: Zhao, Chunyuan, et al.
Published: (2025)

Thermal Analysis for NVIDIA GTX480 Fermi GPU Architecture
by: Nagendra, Savinay
Published: (2024)

Piccolo: Large-Scale Graph Processing with Fine-Grained In-Memory Scatter-Gather
by: Shin, Changmin, et al.
Published: (2025)

Low-overhead General-purpose Near-Data Processing in CXL Memory Expanders
by: Ham, Hyungkyu, et al.
Published: (2024)

Make LLM Inference Affordable to Everyone: Augmenting GPU Memory with NDP-DIMM
by: Liu, Lian, et al.
Published: (2025)

The Anatomy of Silent Data Corruption: GPU Error Pattern Study and Modeling Guidance
by: Tung, Chung-Hsuan, et al.
Published: (2026)

Empirical Measurements of AI Training Power Demand on a GPU-Accelerated Node
by: Latif, Imran, et al.
Published: (2024)

EnergAIzer: Fast and Accurate GPU Power Estimation Framework for AI Workloads
by: Lee, Kyungmi, et al.
Published: (2026)

Towards Performance-Aware Allocation for Accelerated Machine Learning on GPU-SSD Systems
by: Gundawar, Ayush, et al.
Published: (2024)

Edge GPU Aware Multiple AI Model Pipeline for Accelerated MRI Reconstruction and Analysis
by: Majeed, Ashiyana Abdul, et al.
Published: (2025)

GPU-Accelerated Simulated Oscillator Ising/Potts Machine Solving Combinatorial Optimization Problems
by: Gonul, Yilmaz Ege, et al.
Published: (2025)

TLX: Hardware-Native, Evolvable MIMW GPU Compiler for Large-scale Production Environments
by: Guan, Yue, et al.
Published: (2026)

Memory-Efficient FPGA Implementation of Stochastic Simulated Annealing
by: Shin, Duckgyu, et al.
Published: (2026)

The Case for Replication-Aware Memory-Error Protection in Disaggregated Memory
by: Volos, Haris
Published: (2023)

IANUS: Integrated Accelerator based on NPU-PIM Unified Memory System
by: Seo, Minseok, et al.
Published: (2024)