Staff View: PolyBlocks: A Compiler Infrastructure for AI Chips and Programming Frameworks :: Library Catalog

Bibliographic Details
Main Authors:	Bondhugula, Uday; Baviskar, Akshay; Katel, Navdeep; Patel, Vimal; JS, Anoop; Dutta, Arnab
Format:	Preprint
Published:	2026
Subjects:	Programming Languages; Machine Learning
Online Access:	https://arxiv.org/abs/2603.06731

_version_	1866912959074140160
author	Bondhugula, Uday; Baviskar, Akshay; Katel, Navdeep; Patel, Vimal; JS, Anoop; Dutta, Arnab
author_facet	Bondhugula, Uday; Baviskar, Akshay; Katel, Navdeep; Patel, Vimal; JS, Anoop; Dutta, Arnab
contents	We present the design and implementation of PolyBlocks, a modular and reusable MLIR-based compiler infrastructure for AI programming frameworks and AI chips. PolyBlocks is based on pass pipelines that compose transformations on loop nests and SSA, primarily relying on lightweight affine access analysis; the transformations are stitched together in specialized ways to realize high-performance code automatically by the use of analytical cost models and heuristics. The optimizations in these passes include multi-level tiling, fusion, on-chip scratchpad usage, mapping matmuls and convolutions to matrix units, fusing the attention layer, and several other transformations for parallelism and locality. They have been developed in a way that makes it easy to build PolyBlocks-based compilers to target new chips, reusing much of the infrastructure. PolyBlocks' design and architecture enable fully automatic code generation from high-level frameworks to low-level target-specific intrinsics. Experimental results from evaluating PolyBlocks-powered just-in-time compilation for PyTorch and JAX targeting NVIDIA GPUs show that it is able to match or outperform Torch Inductor and XLA in several cases, although the latter rely on a combination of vendor libraries and code generation. For individual operators like matmuls and convolutions, PolyBlocks-generated code is competitive with the best vendor-tuned libraries or hand-written kernels.
format	Preprint
id	arxiv_https___arxiv_org_abs_2603_06731
institution	arXiv
publishDate	2026
record_format	arxiv
title	PolyBlocks: A Compiler Infrastructure for AI Chips and Programming Frameworks
topic	Programming Languages; Machine Learning
url	https://arxiv.org/abs/2603.06731

Similar Items