Staff View: rule4ml: An Open-Source Tool for Resource Utilization and Latency Estimation for ML Models on FPGA

Bibliographic Details
Main Authors:	Rahimifar, Mohammad Mehdi; Rahali, Hamza Ezzaoui; Therrien, Audrey C.
Format:	Preprint
Published:	2024
Subjects:	Machine Learning; Artificial Intelligence; Hardware Architecture
Online Access:	https://arxiv.org/abs/2408.05314

_version_	1866914910082957312
author	Rahimifar, Mohammad Mehdi; Rahali, Hamza Ezzaoui; Therrien, Audrey C.
author_facet	Rahimifar, Mohammad Mehdi; Rahali, Hamza Ezzaoui; Therrien, Audrey C.
contents	Implementing Machine Learning (ML) models on Field-Programmable Gate Arrays (FPGAs) is becoming increasingly popular across various domains as a low-latency and low-power solution that helps manage large data rates generated by continuously improving detectors. However, developing ML models for FPGAs is time-consuming, as optimization requires synthesis to evaluate FPGA area and latency, making the process slow and repetitive. This paper introduces a novel method to predict the resource utilization and inference latency of Neural Networks (NNs) before their synthesis and implementation on FPGA. We leverage HLS4ML, a tool-flow that helps translate NNs into high-level synthesis (HLS) code, to synthesize a diverse dataset of NN architectures and train resource utilization and inference latency predictors. While HLS4ML requires full synthesis to obtain resource and latency insights, our method uses trained regression models for immediate pre-synthesis predictions. The prediction models estimate the usage of Block RAM (BRAM), Digital Signal Processors (DSP), Flip-Flops (FF), and Look-Up Tables (LUT), as well as the inference clock cycles. The predictors were evaluated on both synthetic and existing benchmark architectures and demonstrated high accuracy with R² scores ranging between 0.8 and 0.98 on the validation set and sMAPE values between 10% and 30%. Overall, our approach provides valuable preliminary insights, enabling users to quickly assess the feasibility and efficiency of NNs on FPGAs, accelerating the development and deployment processes. The open-source repository can be found at https://github.com/IMPETUS-UdeS/rule4ml, while the datasets are publicly available at https://borealisdata.ca/dataverse/rule4ml.
format	Preprint
id	arxiv_https___arxiv_org_abs_2408_05314
institution	arXiv
publishDate	2024
record_format	arxiv
title	rule4ml: An Open-Source Tool for Resource Utilization and Latency Estimation for ML Models on FPGA
topic	Machine Learning; Artificial Intelligence; Hardware Architecture
url	https://arxiv.org/abs/2408.05314
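
The R2 and sMAPE figures quoted in the contents field can be computed for any set of predictions with a few lines of Python. The sketch below is illustrative only: the function names and the sample LUT counts are hypothetical, and the sMAPE form used here (mean of |error| divided by the average of |actual| and |predicted|, expressed as a percentage) is one common definition, assumed rather than taken from the paper.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def smape(y_true, y_pred):
    """Symmetric mean absolute percentage error, in percent.

    Assumed definition: mean of |t - p| / ((|t| + |p|) / 2).
    A term whose denominator is zero is counted as zero error.
    """
    terms = []
    for t, p in zip(y_true, y_pred):
        denom = (abs(t) + abs(p)) / 2.0
        terms.append(0.0 if denom == 0 else abs(t - p) / denom)
    return 100.0 * sum(terms) / len(terms)


# Hypothetical post-synthesis LUT counts and a predictor's estimates.
actual_luts = [100, 200, 300, 400]
predicted_luts = [110, 190, 330, 380]

r2 = r2_score(actual_luts, predicted_luts)
err = smape(actual_luts, predicted_luts)
print(f"R2 = {r2:.3f}, sMAPE = {err:.2f}%")
```

A higher R2 and a lower sMAPE both indicate closer agreement between predicted and synthesized values. The two metrics are complementary: R2 measures explained variance, while sMAPE measures relative error per sample.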