| Main Authors: | Maheswaran, Karthikeya Sharma, Bossut, Camille, Wanna, Andy, Zhang, Qirun, Hao, Cong |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2505.14657 |
Similar Items
ForgeBench: A Machine Learning Benchmark Suite and Auto-Generation Framework for Next-Generation HLS Tools
by: Wanna, Andy, et al.
Published: (2025)
RealProbe: An Automated and Lightweight Performance Profiler for In-FPGA Execution of High-Level Synthesis Designs
by: Kim, Jiho, et al.
Published: (2025)
Hardware-Software Co-Design for Accelerating Transformer Inference Leveraging Compute-in-Memory
by: Kim, Dong Eun, et al.
Published: (2025)
TAPA-CS: Enabling Scalable Accelerator Design on Distributed HBM-FPGAs
by: Prakriya, Neha, et al.
Published: (2023)
Stream-HLS: Towards Automatic Dataflow Acceleration
by: Basalama, Suhail, et al.
Published: (2025)
LightningSimV2: Faster and Scalable Simulation for High-Level Synthesis via Graph Compilation and Optimization
by: Sarkar, Rishov, et al.
Published: (2024)
Flexible In-NAND Cryptographic Processing for Secure Flash Storage
by: Noh, Seock-Hwan, et al.
Published: (2025)
AutoRAC: Automated Processing-in-Memory Accelerator Design for Recommender Systems
by: Cheng, Feng, et al.
Published: (2025)
CogSys: Efficient and Scalable Neurosymbolic Cognition System via Algorithm-Hardware Co-Design
by: Wan, Zishen, et al.
Published: (2025)
OmniSim: Simulating Hardware with C Speed and RTL Accuracy for High-Level Synthesis Designs
by: Sarkar, Rishov, et al.
Published: (2025)
GCoD: Graph Convolutional Network Acceleration via Dedicated Algorithm and Accelerator Co-Design
by: You, Haoran, et al.
Published: (2021)
Prosperity: Accelerating Spiking Neural Networks via Product Sparsity
by: Wei, Chiyue, et al.
Published: (2025)
Holistic Optimization Framework for FPGA Accelerators
by: Pouget, Stéphane, et al.
Published: (2025)
Exploring and Exploiting Runtime Reconfigurable Floating Point Precision in Scientific Computing: a Case Study for Solving PDEs
by: Hao, Cong "Callie"
Published: (2024)
A Near-Cache Architectural Framework for Cryptographic Computing
by: Zhang, Jingyao, et al.
Published: (2025)
Efficient yet Accurate End-to-End SC Accelerator Design
by: Li, Meng, et al.
Published: (2024)
FIFOAdvisor: A DSE Framework for Automated FIFO Sizing of High-Level Synthesis Designs
by: Abi-Karam, Stefan, et al.
Published: (2025)
CIMPool: Scalable Neural Network Acceleration for Compute-In-Memory using Weight Pools
by: Li, Shurui, et al.
Published: (2025)
The BRAM is the Limit: Shattering Myths, Shaping Standards, and Building Scalable PIM Accelerators
by: Kabir, MD Arafat, et al.
Published: (2024)
GenPairX: A Hardware-Algorithm Co-Designed Accelerator for Paired-End Read Mapping
by: Eudine, Julien, et al.
Published: (2026)
The Quest for Reliable AI Accelerators: Cross-Layer Evaluation and Design Optimization
by: Li, Meng, et al.
Published: (2026)
SCREME: A Scalable Framework for Resilient Memory Design
by: Li, Fan, et al.
Published: (2025)
Scope: A Scalable Merged Pipeline Framework for Multi-Chip-Module NN Accelerators
by: Huang, Zongle, et al.
Published: (2026)
SwiftKV: An Edge-Oriented Attention Algorithm and Multi-Head Accelerator for Fast, Efficient LLM Decoding
by: Zhang, Junming, et al.
Published: (2026)
HLS-Eval: A Benchmark and Framework for Evaluating LLMs on High-Level Synthesis Design Tasks
by: Abi-Karam, Stefan, et al.
Published: (2025)
Transitive Array: An Efficient GEMM Accelerator with Result Reuse
by: Guo, Cong, et al.
Published: (2025)
InTAR: Inter-Task Auto-Reconfigurable Accelerator Design for High Data Volume Variation in DNNs
by: He, Zifan, et al.
Published: (2025)
Platinum: Path-Adaptable LUT-Based Accelerator Tailored for Low-Bit Weight Matrix Multiplication
by: Shan, Haoxuan, et al.
Published: (2025)
FractalSync: Lightweight Scalable Global Synchronization of Massive Bulk Synchronous Parallel AI Accelerators
by: Isachi, Victor, et al.
Published: (2025)
ISAAC: Intelligent, Scalable, Agile, and Accelerated CPU Verification via LLM-aided FPGA Parallelism
by: Sun, Jialin, et al.
Published: (2025)
D-Legion: A Scalable Many-Core Architecture for Accelerating Matrix Multiplication in Quantized LLMs
by: Abdelmaksoud, Ahmed J., et al.
Published: (2026)
Oobleck: Low-Compromise Design for Fault Tolerant Accelerators
by: Wilks, Guy, et al.
Published: (2025)
GSIM: Accelerating RTL Simulation for Large-Scale Designs
by: Chen, Lu, et al.
Published: (2025)
STI-SNN: A 0.14 GOPS/W/PE Single-Timestep Inference FPGA-based SNN Accelerator with Algorithm and Hardware Co-Design
by: Wang, Kainan, et al.
Published: (2025)
LaZagna: An Open-Source Framework for Flexible 3D FPGA Architectural Exploration
by: Youssef, Ismael, et al.
Published: (2025)
DiffuSE: Cross-Layer Design Space Exploration of DNN Accelerator via Diffusion-Driven Optimization
by: Ren, Yi, et al.
Published: (2025)
In-place Switch: Reprogramming based SLC Cache Design for Hybrid 3D SSDs
by: Yang, Xufeng, et al.
Published: (2024)
Accelerating PageRank Algorithmic Tasks with a new Programmable Hardware Architecture
by: Chowdhury, Md Rownak Hossain, et al.
Published: (2024)
Towards Closing the Performance Gap for Cryptographic Kernels Between CPUs and Specialized Hardware
by: Zhang, Naifeng, et al.
Published: (2025)
HCiM: ADC-Less Hybrid Analog-Digital Compute in Memory Accelerator for Deep Learning Workloads
by: Negi, Shubham, et al.
Published: (2024)