:: Library Catalog

Bibliographic Details
Main Authors:	Langhammer, Martin; Constantinides, George A.
Format:	Preprint
Published:	2024
Subjects:	Hardware Architecture
Online Access:	https://arxiv.org/abs/2406.03227

Similar Items

A Statically and Dynamically Scalable Soft GPGPU
by: Langhammer, Martin, et al.
Published: (2024)

Banked Memories for Soft SIMT Processors
by: Langhammer, Martin, et al.
Published: (2025)

A 950 MHz SIMT Soft Processor
by: Langhammer, Martin, et al.
Published: (2025)

ReducedLUT: Table Decomposition with "Don't Care" Conditions
by: Cassidy, Oliver, et al.
Published: (2024)

ROVER: RTL Optimization via Verified E-Graph Rewriting
by: Coward, Samuel, et al.
Published: (2024)

Optimising GPGPU Execution Through Runtime Micro-Architecture Parameter Analysis
by: Sarda, Giuseppe M., et al.
Published: (2024)

NeuraLUT: Hiding Neural Network Density in Boolean Synthesizable Functions
by: Andronic, Marta, et al.
Published: (2024)

PolyLUT: Learning Piecewise Polynomials for Ultra-Low Latency FPGA LUT-based Inference
by: Andronic, Marta, et al.
Published: (2023)

Combining Power and Arithmetic Optimization via Datapath Rewriting
by: Coward, Samuel, et al.
Published: (2024)

Sim-FA: A GPGPU Simulator Framework for Fine-Grained FlashAttention Pipeline Analysis
by: Zhou, Zhongchun, et al.
Published: (2026)

FPGA Resource-aware Structured Pruning for Real-Time Neural Networks
by: Ramhorst, Benjamin, et al.
Published: (2023)

PolyLUT: Ultra-low Latency Polynomial Inference with Hardware-Aware Structured Pruning
by: Andronic, Marta, et al.
Published: (2025)

StruM: Structured Mixed Precision for Efficient Deep Learning Hardware Codesign
by: Wu, Michael, et al.
Published: (2025)

ATHEENA: A Toolflow for Hardware Early-Exit Network Automation
by: Biggs, Benjamin, et al.
Published: (2023)

Exploring FPGA designs for MX and beyond
by: Samson, Ebby, et al.
Published: (2024)

A Dataflow Compiler for Efficient LLM Inference using Custom Microscaling Formats
by: Cheng, Jianyi, et al.
Published: (2023)

AMPLE: Event-Driven Accelerator for Mixed-Precision Inference of Graph Neural Networks
by: Gimenes, Pedro, et al.
Published: (2025)

Deep Learning and Machine Learning with GPGPU and CUDA: Unlocking the Power of Parallel Computing
by: Li, Ming, et al.
Published: (2024)

Ten-Four: An Open-Source Fused Dot Product Unit for Mixed-Precision GPGPU Tensor Cores
by: Rout, Nikhil, et al.
Published: (2025)

BitMoD: Bit-serial Mixture-of-Datatype LLM Acceleration
by: Chen, Yuzong, et al.
Published: (2024)

Register Dispersion: Reducing the Footprint of the Vector Register File in Vector Engines of Low-Cost RISC-V CPUs
by: Titopoulos, Vasileios, et al.
Published: (2025)

High-Performance Pipelined NTT Accelerators with Homogeneous Digit-Serial Modulo Arithmetic
by: Alexakis, George, et al.
Published: (2025)

The xPU-athalon: Quantifying the Competition of AI Acceleration
by: Golden, Alicia, et al.
Published: (2026)

ObfAx: Obfuscation and IP Piracy Detection in Approximate Circuits
by: Sekanina, Lukas, et al.
Published: (2026)

Runtime Energy Monitoring for RISC-V Soft-Cores
by: Scionti, Alberto, et al.
Published: (2025)

Closing the Gap Between Float and Posit Hardware Efficiency
by: Jonnalagadda, Aditya Anirudh, et al.
Published: (2026)

Soft Error Probability Estimation of Nano-scale Combinational Circuits
by: Jockar, Ali, et al.
Published: (2025)

RealBench: Benchmarking Verilog Generation Models with Real-World IP Designs
by: Jin, Pengwei, et al.
Published: (2025)

ICMarks: A Robust Watermarking Framework for Integrated Circuit Physical Design IP Protection
by: Zhang, Ruisi, et al.
Published: (2024)

Towards Closing the Performance Gap for Cryptographic Kernels Between CPUs and Specialized Hardware
by: Zhang, Naifeng, et al.
Published: (2025)

Modeling PFAS in Semiconductor Manufacturing to Quantify Trade-offs in Energy Efficiency and Environmental Impact of Computing Systems
by: Elgamal, Mariam, et al.
Published: (2025)

C2HLSC: Can LLMs Bridge the Software-to-Hardware Design Gap?
by: Collini, Luca, et al.
Published: (2024)

Accelerating Mini-batch HGNN Training by Reducing CUDA Kernels
by: Wu, Meng, et al.
Published: (2024)

NuRedact: Non-Uniform eFPGA Architecture for Low-Overhead and Secure IP Redaction
by: Das, Voktho, et al.
Published: (2026)

Table-Lookup MAC: Scalable Processing of Quantised Neural Networks in FPGA Soft Logic
by: Gerlinghoff, Daniel, et al.
Published: (2024)

Development of High-Performance DSP Algorithms on the European Rad-Hard NG-ULTRA SoC FPGA
by: Leon, Vasileios, et al.
Published: (2024)

Advancing Cloud Computing Capabilities on gem5 by Implementing the RISC-V Hypervisor Extension
by: Fragkoulis, George-Marios, et al.
Published: (2024)

Sensorized Soft Skin for Dexterous Robotic Hands
by: Egli, Jana, et al.
Published: (2024)

CMOS+X: Stacking Persistent Embedded Memories based on Oxide Transistors upon GPGPU Platforms
by: Waqar, Faaiq, et al.
Published: (2025)

BARD: Reducing Write Latency of DDR5 Memory by Exploiting Bank-Parallelism
by: Vittal, Suhas, et al.
Published: (2025)