Bibliographic Details
Main Authors: Lu, Yao; Liu, Shang; Zhou, Hangan; Fang, Wenji; Zhang, Qijun; Xie, Zhiyao
Format: Preprint
Published: 2026
Subjects: Artificial Intelligence; Software Engineering
Online Access: https://arxiv.org/abs/2601.01765
author Lu, Yao
Liu, Shang
Zhou, Hangan
Fang, Wenji
Zhang, Qijun
Xie, Zhiyao
contents The rapid progress of artificial intelligence increasingly relies on efficient integrated circuit (IC) design. Recent studies have explored the use of large language models (LLMs) for generating Register Transfer Level (RTL) code, but existing benchmarks mainly evaluate syntactic correctness rather than optimization quality in terms of power, performance, and area (PPA). This work introduces RTL-OPT, a benchmark for assessing the capability of LLMs in RTL optimization. RTL-OPT contains 36 handcrafted digital designs covering diverse implementation categories, including combinational logic, pipelined datapaths, finite state machines, and memory interfaces. Each task provides a pair of RTL implementations: a suboptimal version and a human-optimized reference that reflects industry-proven optimization patterns not captured by conventional synthesis tools. RTL-OPT also integrates an automated evaluation framework that verifies functional correctness and quantifies PPA improvements, enabling standardized assessment of generative models for hardware design optimization.
format Preprint
id arxiv_https___arxiv_org_abs_2601_01765
institution arXiv
publishDate 2026
record_format arxiv
title A New Benchmark for the Appropriate Evaluation of RTL Code Optimization
topic Artificial Intelligence
Software Engineering
url https://arxiv.org/abs/2601.01765