:: Library Catalog

Bibliographic Details
Main Authors:	He, Zifan; Truong, Anderson; Cao, Yingqi; Cong, Jason
Format:	Preprint
Published:	2025
Subjects:	Hardware Architecture; Machine Learning
Online Access:	https://arxiv.org/abs/2502.08807

Similar Items

FlexLLM: Composable HLS Library for Flexible Hybrid LLM Accelerator Design
by: Zhang, Jiahao, et al.
Published: (2026)

Hardware/Software Co-Design of RISC-V Extensions for Accelerating Sparse DNNs on FPGAs
by: Sabih, Muhammad, et al.
Published: (2025)

PoTAcc: A Pipeline for End-to-End Acceleration of Power-of-Two Quantized DNNs
by: Saha, Rappy, et al.
Published: (2026)

AutoHLS: Learning to Accelerate Design Space Exploration for HLS Designs
by: Ahmed, Md Rubel, et al.
Published: (2024)

Effective and Memory-Efficient Alternatives to ECC for Reliable Large-Scale DNNs
by: Ahmadilivani, Mohammad Hasan, et al.
Published: (2026)

An FPGA-Based Reconfigurable Accelerator for Convolution-Transformer Hybrid EfficientViT
by: Shao, Haikuo, et al.
Published: (2024)

HW-SW Optimization of DNNs for Privacy-preserving People Counting on Low-resolution Infrared Arrays
by: Risso, Matteo, et al.
Published: (2024)

Iceberg: Enhancing HLS Modeling with Synthetic Data
by: Ding, Zijian, et al.
Published: (2025)

Efficient Task Transfer for HLS DSE
by: Ding, Zijian, et al.
Published: (2024)

Learning to Compare Hardware Designs for High-Level Synthesis
by: Bai, Yunsheng, et al.
Published: (2024)

Cross-Modality Program Representation Learning for Electronic Design Automation with High-Level Synthesis
by: Qin, Zongyue, et al.
Published: (2024)

Designing Efficient LLM Accelerators for Edge Devices
by: Haris, Jude, et al.
Published: (2024)

GCoD: Graph Convolutional Network Acceleration via Dedicated Algorithm and Accelerator Co-Design
by: You, Haoran, et al.
Published: (2021)

When Small Variations Become Big Failures: Reliability Challenges in Compute-in-Memory Neural Accelerators
by: Qin, Yifan, et al.
Published: (2026)

A Tiny Supervised ODL Core with Auto Data Pruning for Human Activity Recognition
by: Matsutani, Hiroki, et al.
Published: (2024)

DART: Input-Difficulty-AwaRe Adaptive Threshold for Early-Exit DNNs
by: Patne, Parth, et al.
Published: (2026)

RACE-IT: A Reconfigurable Analog Computing Engine for In-Memory Transformer Acceleration
by: Zhao, Lei, et al.
Published: (2023)

FedChip: Federated LLM for Artificial Intelligence Accelerator Chip Design
by: Nazzal, Mahmoud, et al.
Published: (2025)

EVA: Accelerating LLM Decoding via an Efficient Vector Quantization Architecture
by: Duan, Bowen, et al.
Published: (2026)

RCNet: $ΔΣ$ IADCs as Recurrent AutoEncoders
by: Verdant, Arnaud, et al.
Published: (2025)

FLAASH: Flexible Accelerator Architecture for Sparse High-Order Tensor Contraction
by: Kulp, Gabriel, et al.
Published: (2024)

Stream: Design Space Exploration of Layer-Fused DNNs on Heterogeneous Dataflow Accelerators
by: Symons, Arne, et al.
Published: (2022)

An Efficient Data Reuse with Tile-Based Adaptive Stationary for Transformer Accelerators
by: Li, Tseng-Jen, et al.
Published: (2025)

FORTALESA: Fault-Tolerant Reconfigurable Systolic Array for DNN Inference
by: Cherezova, Natalia, et al.
Published: (2025)

Hardware Software Optimizations for Fast Model Recovery on Reconfigurable Architectures
by: Xu, Bin, et al.
Published: (2025)

DAISM: Digital Approximate In-SRAM Multiplier-based Accelerator for DNN Training and Inference
by: Sonnino, Lorenzo, et al.
Published: (2023)

MetaML-Pro: Cross-Stage Design Flow Automation for Efficient Deep Learning Acceleration
by: Que, Zhiqiang, et al.
Published: (2025)

Deep Inverse Design for High-Level Synthesis
by: Chang, Ping, et al.
Published: (2024)

PowerGenie: Analytically-Guided Evolutionary Discovery of Superior Reconfigurable Power Converters
by: Gao, Jian, et al.
Published: (2026)

GNNBuilder: An Automated Framework for Generic Graph Neural Network Accelerator Generation, Simulation, and Optimization
by: Abi-Karam, Stefan, et al.
Published: (2023)

GPT4AIGChip: Towards Next-Generation AI Accelerator Design Automation via Large Language Models
by: Fu, Yonggan, et al.
Published: (2023)

DP-HLS: A High-Level Synthesis Framework for Accelerating Dynamic Programming Algorithms in Bioinformatics
by: Cao, Yingqi, et al.
Published: (2024)

Hierarchical Mixture of Experts: Generalizable Learning for High-Level Synthesis
by: Li, Weikai, et al.
Published: (2024)

AutoFlows++: Hierarchical Message Flow Mining for System on Chip Designs
by: Nadimi, Bardia, et al.
Published: (2026)

HaShiFlex: A High-Throughput Hardened Shifter DNN Accelerator with Fine-Tuning Flexibility
by: Herbst, Jonathan, et al.
Published: (2025)

Neural Network Acceleration on MPSoC board: Integrating SLAC's SNL, Rogue Software and Auto-SNL
by: Rahali, Hamza Ezzaoui, et al.
Published: (2025)

Torch2Chip: An End-to-end Customizable Deep Neural Network Compression and Deployment Toolkit for Prototype Hardware Accelerator Design
by: Meng, Jian, et al.
Published: (2024)

Hardware-Aware Data and Instruction Mapping for AI Tasks: Balancing Parallelism, I/O and Memory Tradeoffs
by: Chowdhury, Md Rownak Hossain, et al.
Published: (2025)

Dato: A Task-Based Programming Model for Dataflow Accelerators
by: Fang, Shihan, et al.
Published: (2025)

Reconfigurable Computing Challenge: Real-Time Graph Neural Networks for Online Event Selection in Big Science
by: Neu, Marc, et al.
Published: (2026)