Bibliographic Details
Main Authors: Lai, Ruihang, Shao, Junru, Feng, Siyuan, Lyubomirsky, Steven S., Hou, Bohan, Lin, Wuwei, Ye, Zihao, Jin, Hongyi, Jin, Yuchen, Liu, Jiawei, Jin, Lesheng, Cai, Yaxing, Jiang, Ziheng, Wu, Yong, Park, Sunghyun, Srivastava, Prakalp, Roesch, Jared G., Mowry, Todd C., Chen, Tianqi
Format: Preprint
Published: 2023
Subjects: Machine Learning, Artificial Intelligence, Programming Languages
Online Access: https://arxiv.org/abs/2311.02103
_version_ 1866912222736809984
author Lai, Ruihang
Shao, Junru
Feng, Siyuan
Lyubomirsky, Steven S.
Hou, Bohan
Lin, Wuwei
Ye, Zihao
Jin, Hongyi
Jin, Yuchen
Liu, Jiawei
Jin, Lesheng
Cai, Yaxing
Jiang, Ziheng
Wu, Yong
Park, Sunghyun
Srivastava, Prakalp
Roesch, Jared G.
Mowry, Todd C.
Chen, Tianqi
contents Dynamic shape computations have become critical in modern machine learning workloads, especially in emerging large language models. The success of these models has driven the demand for their universal deployment across a diverse set of backend environments. In this paper, we present Relax, a compiler abstraction for optimizing end-to-end dynamic machine learning workloads. Relax introduces a cross-level abstraction that encapsulates computational graphs, loop-level tensor programs, and external library calls in a single representation. Relax also introduces first-class symbolic shape annotations to track dynamic shape computations globally across the program, enabling dynamic shape-aware cross-level optimizations. We build an end-to-end compilation framework using the proposed approach to optimize dynamic shape models. Experimental results on LLMs show that Relax delivers performance competitive with state-of-the-art systems across various GPUs and enables deployment of emerging models to a broader set of emerging environments, including mobile phones, embedded devices, and web browsers.
format Preprint
id arxiv_https___arxiv_org_abs_2311_02103
institution arXiv
publishDate 2023
record_format arxiv
title Relax: Composable Abstractions for End-to-End Dynamic Machine Learning
topic Machine Learning
Artificial Intelligence
Programming Languages
url https://arxiv.org/abs/2311.02103