:: Library Catalog

Bibliographic Details
Main Authors:	Shin, Hery; Kim, Jae-Young; Kim, Donghyuk; Kim, Joo-Young
Format:	Preprint
Published:	2024
Subjects:	Hardware Architecture
Online Access:	https://arxiv.org/abs/2409.16640

Similar Items

RED: Energy Optimization Framework for eDRAM-based PIM with Reconfigurable Voltage Swing and Retention-aware Scheduling
by: Kim, Jae-Young, et al.
Published: (2025)

SAL-PIM: A Subarray-level Processing-in-Memory Architecture with LUT-based Linear Interpolation for Transformer-based Text Generation
by: Han, Wontak, et al.
Published: (2024)

ARAS: An Adaptive Low-Cost ReRAM-Based Accelerator for DNNs
by: Sabri, Mohammad, et al.
Published: (2024)

Hamun: An Approximate Computation Method to Prolong the Lifespan of ReRAM-Based Accelerators
by: Sabri, Mohammad, et al.
Published: (2025)

V-Rex: Real-Time Streaming Video LLM Acceleration via Dynamic KV Cache Retrieval
by: Kim, Donghyuk, et al.
Published: (2025)

DIRC-RAG: Accelerating Edge RAG with Robust High-Density and High-Loading-Bandwidth Digital In-ReRAM Computation
by: Shao, Kunming, et al.
Published: (2025)

Online Soft Error Tolerance in ReRAM Crossbars for Deep Learning Accelerators
by: Khezeli, Benyamin, et al.
Published: (2024)

Securing DRAM at Scale: ARFM-Driven Row Hammer Defense with Unveiling the Threat of Short tRC Patterns
by: Joo, Nogeun, et al.
Published: (2025)

All-in-Memory Stochastic Computing using ReRAM
by: de Lima, João Paulo C., et al.
Published: (2025)

Pointer: An Energy-Efficient ReRAM-based Point Cloud Recognition Accelerator with Inter-layer and Intra-layer Optimizations
by: Zhang, Qijun, et al.
Published: (2024)

FARe: Fault-Aware GNN Training on ReRAM-based PIM Accelerators
by: Dhingra, Pratyush, et al.
Published: (2024)

Sensitivity-Aware Mixed-Precision Quantization for ReRAM-based Computing-in-Memory
by: Chen, Guan-Cheng, et al.
Published: (2025)

A Fully Hardware Implemented Accelerator Design in ReRAM Analog Computing without ADCs
by: Dang, Peng, et al.
Published: (2024)

Algorithm-hardware co-design for Energy-Efficient A/D conversion in ReRAM-based accelerators
by: Zhang, Chenguang, et al.
Published: (2024)

MASQ: Accelerating Masked Diffusion via Stage-Wise Multi-Precision Quantization
by: Kim, Seeyeon, et al.
Published: (2026)

DiSC: Resolution-Scalable Acceleration of Diffusion Models by Exploiting Sparsity and Cached Token Reuse with Hash-based Distribution
by: Yoon, Jieon, et al.
Published: (2026)

ReCross: Efficient Embedding Reduction Scheme for In-Memory Computing using ReRAM-Based Crossbar
by: Lai, Yu-Hong, et al.
Published: (2025)

Zero-Space Cost Fault Tolerance for Transformer-based Language Models on ReRAM
by: Li, Bingbing, et al.
Published: (2024)

ORBIS: Output-Guided Token Reduction with Distribution-Aware Matching for Video Diffusion Acceleration
by: Lee, Hangyeol, et al.
Published: (2026)

APINT: A Full-Stack Framework for Acceleration of Privacy-Preserving Inference of Transformers based on Garbled Circuits
by: Cho, Hyunjun, et al.
Published: (2025)

4T2R X-ReRAM CiM Array for Variation-tolerant, Low-power, Massively Parallel MAC Operation
by: Kihara, Fuyuki, et al.
Published: (2025)

Stuck-at Faults in ReRAM Neuromorphic Circuit Array and their Correction through Machine Learning
by: Sawal, Vedant, et al.
Published: (2024)

SCRec: A Scalable Computational Storage System with Statistical Sharding and Tensor-train Decomposition for Recommendation Models
by: Yang, Jinho, et al.
Published: (2025)

All-in-One Analog AI Hardware: On-Chip Training and Inference with Conductive-Metal-Oxide/HfOx ReRAM Devices
by: Falcone, Donato Francesco, et al.
Published: (2025)

31.1 A 14.08-to-135.69Token/s ReRAM-on-Logic Stacked Outlier-Free Large-Language-Model Accelerator with Block-Clustered Weight-Compression and Adaptive Parallel-Speculative-Decoding
by: Dong, Pingcheng, et al.
Published: (2026)

ADOR: A Design Exploration Framework for LLM Serving with Enhanced Latency and Throughput
by: Kim, Junsoo, et al.
Published: (2025)

TL-nvSRAM-CIM: Ultra-High-Density Three-Level ReRAM-Assisted Computing-in-nvSRAM with DC-Power Free Restore and Ternary MAC Operations
by: Wang, Dengfeng, et al.
Published: (2023)

IANUS: Integrated Accelerator based on NPU-PIM Unified Memory System
by: Seo, Minseok, et al.
Published: (2024)

LPU: A Latency-Optimized and Highly Scalable Processor for Large Language Model Inference
by: Moon, Seungjae, et al.
Published: (2024)

Sparse-on-Dense: Area and Energy-Efficient Computing of Sparse Neural Networks on Dense Matrix Multiplication Accelerators
by: Yoon, Hyunsung, et al.
Published: (2026)

Oaken: Fast and Efficient LLM Serving with Online-Offline Hybrid KV Cache Quantization
by: Kim, Minsu, et al.
Published: (2025)

ABC-FHE : A Resource-Efficient Accelerator Enabling Bootstrappable Parameters for Client-Side Fully Homomorphic Encryption
by: Yune, Sungwoong, et al.
Published: (2025)

RPCAcc: A High-Performance and Reconfigurable PCIe-attached RPC Accelerator
by: Zhang, Jie, et al.
Published: (2024)

Token-Picker: Accelerating Attention in Text Generation with Minimized Memory Transfer via Probability Estimation
by: Park, Junyoung, et al.
Published: (2024)

FlexNeRFer: A Multi-Dataflow, Adaptive Sparsity-Aware Accelerator for On-Device NeRF Rendering
by: Noh, Seock-Hwan, et al.
Published: (2025)

EXION: Exploiting Inter- and Intra-Iteration Output Sparsity for Diffusion Models
by: Heo, Jaehoon, et al.
Published: (2025)

ProactivePIM: Accelerating Weight-Sharing Embedding Layer with PIM for Scalable Recommendation System
by: Kim, Youngsuk, et al.
Published: (2024)

STT-RAM-based Hierarchical In-Memory Computing
by: Gajaria, Dhruv, et al.
Published: (2024)

Towards Performance-Aware Allocation for Accelerated Machine Learning on GPU-SSD Systems
by: Gundawar, Ayush, et al.
Published: (2024)

Bandwidth-Effective DRAM Cache for GPUs with Storage-Class Memory
by: Hong, Jeongmin, et al.
Published: (2024)