| Main Authors: | Langhammer, Martin, Constantinides, George A. |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2406.03227 |
Similar Items
A Statically and Dynamically Scalable Soft GPGPU
by: Langhammer, Martin, et al.
Published: (2024)
Banked Memories for Soft SIMT Processors
by: Langhammer, Martin, et al.
Published: (2025)
A 950 MHz SIMT Soft Processor
by: Langhammer, Martin, et al.
Published: (2025)
ReducedLUT: Table Decomposition with "Don't Care" Conditions
by: Cassidy, Oliver, et al.
Published: (2024)
ROVER: RTL Optimization via Verified E-Graph Rewriting
by: Coward, Samuel, et al.
Published: (2024)
Optimising GPGPU Execution Through Runtime Micro-Architecture Parameter Analysis
by: Sarda, Giuseppe M., et al.
Published: (2024)
NeuraLUT: Hiding Neural Network Density in Boolean Synthesizable Functions
by: Andronic, Marta, et al.
Published: (2024)
PolyLUT: Learning Piecewise Polynomials for Ultra-Low Latency FPGA LUT-based Inference
by: Andronic, Marta, et al.
Published: (2023)
Combining Power and Arithmetic Optimization via Datapath Rewriting
by: Coward, Samuel, et al.
Published: (2024)
Sim-FA: A GPGPU Simulator Framework for Fine-Grained FlashAttention Pipeline Analysis
by: Zhou, Zhongchun, et al.
Published: (2026)
FPGA Resource-aware Structured Pruning for Real-Time Neural Networks
by: Ramhorst, Benjamin, et al.
Published: (2023)
PolyLUT: Ultra-low Latency Polynomial Inference with Hardware-Aware Structured Pruning
by: Andronic, Marta, et al.
Published: (2025)
StruM: Structured Mixed Precision for Efficient Deep Learning Hardware Codesign
by: Wu, Michael, et al.
Published: (2025)
ATHEENA: A Toolflow for Hardware Early-Exit Network Automation
by: Biggs, Benjamin, et al.
Published: (2023)
Exploring FPGA designs for MX and beyond
by: Samson, Ebby, et al.
Published: (2024)
A Dataflow Compiler for Efficient LLM Inference using Custom Microscaling Formats
by: Cheng, Jianyi, et al.
Published: (2023)
AMPLE: Event-Driven Accelerator for Mixed-Precision Inference of Graph Neural Networks
by: Gimenes, Pedro, et al.
Published: (2025)
Deep Learning and Machine Learning with GPGPU and CUDA: Unlocking the Power of Parallel Computing
by: Li, Ming, et al.
Published: (2024)
Ten-Four: An Open-Source Fused Dot Product Unit for Mixed-Precision GPGPU Tensor Cores
by: Rout, Nikhil, et al.
Published: (2025)
BitMoD: Bit-serial Mixture-of-Datatype LLM Acceleration
by: Chen, Yuzong, et al.
Published: (2024)
Register Dispersion: Reducing the Footprint of the Vector Register File in Vector Engines of Low-Cost RISC-V CPUs
by: Titopoulos, Vasileios, et al.
Published: (2025)
High-Performance Pipelined NTT Accelerators with Homogeneous Digit-Serial Modulo Arithmetic
by: Alexakis, George, et al.
Published: (2025)
The xPU-athalon: Quantifying the Competition of AI Acceleration
by: Golden, Alicia, et al.
Published: (2026)
ObfAx: Obfuscation and IP Piracy Detection in Approximate Circuits
by: Sekanina, Lukas, et al.
Published: (2026)
Runtime Energy Monitoring for RISC-V Soft-Cores
by: Scionti, Alberto, et al.
Published: (2025)
Closing the Gap Between Float and Posit Hardware Efficiency
by: Jonnalagadda, Aditya Anirudh, et al.
Published: (2026)
Soft Error Probability Estimation of Nano-scale Combinational Circuits
by: Jockar, Ali, et al.
Published: (2025)
RealBench: Benchmarking Verilog Generation Models with Real-World IP Designs
by: Jin, Pengwei, et al.
Published: (2025)
ICMarks: A Robust Watermarking Framework for Integrated Circuit Physical Design IP Protection
by: Zhang, Ruisi, et al.
Published: (2024)
Towards Closing the Performance Gap for Cryptographic Kernels Between CPUs and Specialized Hardware
by: Zhang, Naifeng, et al.
Published: (2025)
Modeling PFAS in Semiconductor Manufacturing to Quantify Trade-offs in Energy Efficiency and Environmental Impact of Computing Systems
by: Elgamal, Mariam, et al.
Published: (2025)
C2HLSC: Can LLMs Bridge the Software-to-Hardware Design Gap?
by: Collini, Luca, et al.
Published: (2024)
Accelerating Mini-batch HGNN Training by Reducing CUDA Kernels
by: Wu, Meng, et al.
Published: (2024)
NuRedact: Non-Uniform eFPGA Architecture for Low-Overhead and Secure IP Redaction
by: Das, Voktho, et al.
Published: (2026)
Table-Lookup MAC: Scalable Processing of Quantised Neural Networks in FPGA Soft Logic
by: Gerlinghoff, Daniel, et al.
Published: (2024)
Development of High-Performance DSP Algorithms on the European Rad-Hard NG-ULTRA SoC FPGA
by: Leon, Vasileios, et al.
Published: (2024)
Advancing Cloud Computing Capabilities on gem5 by Implementing the RISC-V Hypervisor Extension
by: Fragkoulis, George-Marios, et al.
Published: (2024)
Sensorized Soft Skin for Dexterous Robotic Hands
by: Egli, Jana, et al.
Published: (2024)
CMOS+X: Stacking Persistent Embedded Memories based on Oxide Transistors upon GPGPU Platforms
by: Waqar, Faaiq, et al.
Published: (2025)
BARD: Reducing Write Latency of DDR5 Memory by Exploiting Bank-Parallelism
by: Vittal, Suhas, et al.
Published: (2025)