| Main Authors: | Wu, Ruilong, Wang, Yisu, Kutscher, Dirk |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2408.15568 |
Similar Items
MixFP4: Enhancing NVFP4 with Adaptive FP4/INT4 Block Representations
by: Zou, Jiaxiang, et al.
Published: (2026)
A System Development Kit for Big Data Applications on FPGA-based Clusters: The EVEREST Approach
by: Pilato, Christian, et al.
Published: (2024)
When Small Variations Become Big Failures: Reliability Challenges in Compute-in-Memory Neural Accelerators
by: Qin, Yifan, et al.
Published: (2026)
An Affordable Experimental Technique for SRAM Write Margin Characterization for Nanometer CMOS Technologies
by: Alorda, Bartomeu, et al.
Published: (2024)
Leveraging Recurrent Patterns in Graph Accelerators
by: Rahimi, Masoud, et al.
Published: (2025)
Towards An Approach to Identify Divergences in Hardware Designs for HPC Workloads
by: Popovici, Doru Thom, et al.
Published: (2025)
Toward Open-Source Chiplets for HPC and AI: Occamy and Beyond
by: Scheffler, Paul, et al.
Published: (2025)
Big-PERCIVAL: Exploring the Native Use of 64-Bit Posit Arithmetic in Scientific Computing
by: Mallasén, David, et al.
Published: (2023)
Educating for Hardware Specialization in the Chiplet Era: A Path for the HPC Community
by: Yoshii, Kazutomo, et al.
Published: (2024)
Characterization of Real Communication Patterns and Congestion Dynamics in HPC Interconnection Networks
by: de La Rosa, Miguel Sánchez, et al.
Published: (2026)
Reconfigurable Computing Challenge: Real-Time Graph Neural Networks for Online Event Selection in Big Science
by: Neu, Marc, et al.
Published: (2026)
Calibrating DRAMPower Model for HPC: A Runtime Perspective from Real-Time Measurements
by: Shi, Xinyu, et al.
Published: (2024)
MCBP: A Memory-Compute Efficient LLM Inference Accelerator Leveraging Bit-Slice-enabled Sparsity and Repetitiveness
by: Wang, Huizheng, et al.
Published: (2025)
Late Breaking Results: Leveraging Approximate Computing for Carbon-Aware DNN Accelerators
by: Panteleaki, Aikaterini Maria, et al.
Published: (2025)
Hardware-Software Co-Design for Accelerating Transformer Inference Leveraging Compute-in-Memory
by: Kim, Dong Eun, et al.
Published: (2025)
Apple vs. Oranges: Evaluating the Apple Silicon M-Series SoCs for HPC Performance and Efficiency
by: Hübner, Paul, et al.
Published: (2025)
FpgaHub: Fpga-centric Hyper-heterogeneous Computing Platform for Big Data Analytics
by: Wang, Zeke, et al.
Published: (2025)
Leveraging Compute-in-Memory for Efficient Generative Model Inference in TPUs
by: Zhu, Zhantong, et al.
Published: (2025)
Efficient and Accurate Graph Classification with Hyperdimensional Computing on FPGA
by: Arockiaraj, Jebacyril, et al.
Published: (2025)
Enabling Efficient Hybrid Systolic Computation in Shared L1-Memory Manycore Clusters
by: Mazzola, Sergio, et al.
Published: (2024)
Spatz: Clustering Compact RISC-V-Based Vector Units to Maximize Computing Efficiency
by: Perotti, Matteo, et al.
Published: (2023)
PICNIC: Silicon Photonic Interconnected Chiplets with Computational Network and In-memory Computing for LLM Inference Acceleration
by: Chong, Yue Jiet, et al.
Published: (2025)
Make LLM Inference Affordable to Everyone: Augmenting GPU Memory with NDP-DIMM
by: Liu, Lian, et al.
Published: (2025)
SpikeStream: Accelerating Spiking Neural Network Inference on RISC-V Clusters with Sparse Computation Extensions
by: Manoni, Simone, et al.
Published: (2025)
ACS: Concurrent Kernel Execution on Irregular, Input-Dependent Computational Graphs
by: Durvasula, Sankeerth, et al.
Published: (2024)
FuseMax: Leveraging Extended Einsums to Optimize Attention Accelerator Design
by: Nayak, Nandeeka, et al.
Published: (2024)
ADS-IMC: Accelerating Data Sorting with In-Memory Computation
by: Dhakad, Narendra Singh, et al.
Published: (2026)
The Tiny Median Filter: A Small Size, Flexible Arbitrary Percentile Finder Scheme Suitable for FPGA Implementation
by: Wu, Jinyuan
Published: (2024)
ControlPULP: A RISC-V On-Chip Parallel Power Controller for Many-Core HPC Processors with FPGA-Based Hardware-In-The-Loop Power and Thermal Emulation
by: Ottaviano, Alessandro, et al.
Published: (2023)
Automated Physical Design Watermarking Leveraging Graph Neural Networks
by: Zhang, Ruisi, et al.
Published: (2024)
EN-T: Optimizing Tensor Computing Engines Performance via Encoder-Based Methodology
by: Wu, Qizhe, et al.
Published: (2024)
NDSEARCH: Accelerating Graph-Traversal-Based Approximate Nearest Neighbor Search through Near Data Processing
by: Wang, Yitu, et al.
Published: (2023)
Shared-PIM: Enabling Concurrent Computation and Data Flow for Faster Processing-in-DRAM
by: Mamdouh, Ahmed, et al.
Published: (2024)
Multilayer Dataflow: Orchestrate Butterfly Sparsity to Accelerate Attention Computation
by: Wu, Haibin, et al.
Published: (2024)
SiHGNN: Leveraging Properties of Semantic Graphs for Efficient HGNN Acceleration
by: Xue, Runzhen, et al.
Published: (2024)
CoQMoE: Co-Designed Quantization and Computation Orchestration for Mixture-of-Experts Vision Transformer on FPGA
by: Dong, Jiale, et al.
Published: (2025)
RecFlash: Fast Recommendation System on In-Storage Computing with Frequency-Based Data Mapping
by: Baik, Jangho, et al.
Published: (2026)
A Computing-in-Memory-based One-Class Hyperdimensional Computing Model for Outlier Detection
by: Wang, Ruixuan, et al.
Published: (2023)
The Data Conversion Bottleneck in Analog Computing Accelerators
by: Meech, James T., et al.
Published: (2023)
Cohet: A CXL-Driven Coherent Heterogeneous Computing Framework with Hardware-Calibrated Full-System Simulation
by: Wang, Yanjing, et al.
Published: (2025)