| Main Authors: | Shin, Hery, Kim, Jae-Young, Kim, Donghyuk, Kim, Joo-Young |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2409.16640 |
Similar Items
RED: Energy Optimization Framework for eDRAM-based PIM with Reconfigurable Voltage Swing and Retention-aware Scheduling
by: Kim, Jae-Young, et al.
Published: (2025)
SAL-PIM: A Subarray-level Processing-in-Memory Architecture with LUT-based Linear Interpolation for Transformer-based Text Generation
by: Han, Wontak, et al.
Published: (2024)
ARAS: An Adaptive Low-Cost ReRAM-Based Accelerator for DNNs
by: Sabri, Mohammad, et al.
Published: (2024)
Hamun: An Approximate Computation Method to Prolong the Lifespan of ReRAM-Based Accelerators
by: Sabri, Mohammad, et al.
Published: (2025)
V-Rex: Real-Time Streaming Video LLM Acceleration via Dynamic KV Cache Retrieval
by: Kim, Donghyuk, et al.
Published: (2025)
DIRC-RAG: Accelerating Edge RAG with Robust High-Density and High-Loading-Bandwidth Digital In-ReRAM Computation
by: Shao, Kunming, et al.
Published: (2025)
Online Soft Error Tolerance in ReRAM Crossbars for Deep Learning Accelerators
by: Khezeli, Benyamin, et al.
Published: (2024)
Securing DRAM at Scale: ARFM-Driven Row Hammer Defense with Unveiling the Threat of Short tRC Patterns
by: Joo, Nogeun, et al.
Published: (2025)
All-in-Memory Stochastic Computing using ReRAM
by: de Lima, João Paulo C., et al.
Published: (2025)
Pointer: An Energy-Efficient ReRAM-based Point Cloud Recognition Accelerator with Inter-layer and Intra-layer Optimizations
by: Zhang, Qijun, et al.
Published: (2024)
FARe: Fault-Aware GNN Training on ReRAM-based PIM Accelerators
by: Dhingra, Pratyush, et al.
Published: (2024)
Sensitivity-Aware Mixed-Precision Quantization for ReRAM-based Computing-in-Memory
by: Chen, Guan-Cheng, et al.
Published: (2025)
A Fully Hardware Implemented Accelerator Design in ReRAM Analog Computing without ADCs
by: Dang, Peng, et al.
Published: (2024)
Algorithm-hardware co-design for Energy-Efficient A/D conversion in ReRAM-based accelerators
by: Zhang, Chenguang, et al.
Published: (2024)
MASQ: Accelerating Masked Diffusion via Stage-Wise Multi-Precision Quantization
by: Kim, Seeyeon, et al.
Published: (2026)
DiSC: Resolution-Scalable Acceleration of Diffusion Models by Exploiting Sparsity and Cached Token Reuse with Hash-based Distribution
by: Yoon, Jieon, et al.
Published: (2026)
ReCross: Efficient Embedding Reduction Scheme for In-Memory Computing using ReRAM-Based Crossbar
by: Lai, Yu-Hong, et al.
Published: (2025)
Zero-Space Cost Fault Tolerance for Transformer-based Language Models on ReRAM
by: Li, Bingbing, et al.
Published: (2024)
ORBIS: Output-Guided Token Reduction with Distribution-Aware Matching for Video Diffusion Acceleration
by: Lee, Hangyeol, et al.
Published: (2026)
APINT: A Full-Stack Framework for Acceleration of Privacy-Preserving Inference of Transformers based on Garbled Circuits
by: Cho, Hyunjun, et al.
Published: (2025)
4T2R X-ReRAM CiM Array for Variation-tolerant, Low-power, Massively Parallel MAC Operation
by: Kihara, Fuyuki, et al.
Published: (2025)
Stuck-at Faults in ReRAM Neuromorphic Circuit Array and their Correction through Machine Learning
by: Sawal, Vedant, et al.
Published: (2024)
SCRec: A Scalable Computational Storage System with Statistical Sharding and Tensor-train Decomposition for Recommendation Models
by: Yang, Jinho, et al.
Published: (2025)
All-in-One Analog AI Hardware: On-Chip Training and Inference with Conductive-Metal-Oxide/HfOx ReRAM Devices
by: Falcone, Donato Francesco, et al.
Published: (2025)
31.1 A 14.08-to-135.69Token/s ReRAM-on-Logic Stacked Outlier-Free Large-Language-Model Accelerator with Block-Clustered Weight-Compression and Adaptive Parallel-Speculative-Decoding
by: Dong, Pingcheng, et al.
Published: (2026)
ADOR: A Design Exploration Framework for LLM Serving with Enhanced Latency and Throughput
by: Kim, Junsoo, et al.
Published: (2025)
TL-nvSRAM-CIM: Ultra-High-Density Three-Level ReRAM-Assisted Computing-in-nvSRAM with DC-Power Free Restore and Ternary MAC Operations
by: Wang, Dengfeng, et al.
Published: (2023)
IANUS: Integrated Accelerator based on NPU-PIM Unified Memory System
by: Seo, Minseok, et al.
Published: (2024)
LPU: A Latency-Optimized and Highly Scalable Processor for Large Language Model Inference
by: Moon, Seungjae, et al.
Published: (2024)
Sparse-on-Dense: Area and Energy-Efficient Computing of Sparse Neural Networks on Dense Matrix Multiplication Accelerators
by: Yoon, Hyunsung, et al.
Published: (2026)
Oaken: Fast and Efficient LLM Serving with Online-Offline Hybrid KV Cache Quantization
by: Kim, Minsu, et al.
Published: (2025)
ABC-FHE : A Resource-Efficient Accelerator Enabling Bootstrappable Parameters for Client-Side Fully Homomorphic Encryption
by: Yune, Sungwoong, et al.
Published: (2025)
RPCAcc: A High-Performance and Reconfigurable PCIe-attached RPC Accelerator
by: Zhang, Jie, et al.
Published: (2024)
Token-Picker: Accelerating Attention in Text Generation with Minimized Memory Transfer via Probability Estimation
by: Park, Junyoung, et al.
Published: (2024)
FlexNeRFer: A Multi-Dataflow, Adaptive Sparsity-Aware Accelerator for On-Device NeRF Rendering
by: Noh, Seock-Hwan, et al.
Published: (2025)
EXION: Exploiting Inter- and Intra-Iteration Output Sparsity for Diffusion Models
by: Heo, Jaehoon, et al.
Published: (2025)
ProactivePIM: Accelerating Weight-Sharing Embedding Layer with PIM for Scalable Recommendation System
by: Kim, Youngsuk, et al.
Published: (2024)
STT-RAM-based Hierarchical In-Memory Computing
by: Gajaria, Dhruv, et al.
Published: (2024)
Towards Performance-Aware Allocation for Accelerated Machine Learning on GPU-SSD Systems
by: Gundawar, Ayush, et al.
Published: (2024)
Bandwidth-Effective DRAM Cache for GPUs with Storage-Class Memory
by: Hong, Jeongmin, et al.
Published: (2024)