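| Main Authors: | Lai, Ruihang; Shao, Junru; Feng, Siyuan; Lyubomirsky, Steven S.; Hou, Bohan; Lin, Wuwei; Ye, Zihao; Jin, Hongyi; Jin, Yuchen; Liu, Jiawei; Jin, Lesheng; Cai, Yaxing; Jiang, Ziheng; Wu, Yong; Park, Sunghyun; Srivastava, Prakalp; Roesch, Jared G.; Mowry, Todd C.; Chen, Tianqi |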
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | Machine Learning; Artificial Intelligence; Programming Languages |
| Online Access: | https://arxiv.org/abs/2311.02103 |
| _version_ | 1866912222736809984 |
|---|---|
| author | Lai, Ruihang; Shao, Junru; Feng, Siyuan; Lyubomirsky, Steven S.; Hou, Bohan; Lin, Wuwei; Ye, Zihao; Jin, Hongyi; Jin, Yuchen; Liu, Jiawei; Jin, Lesheng; Cai, Yaxing; Jiang, Ziheng; Wu, Yong; Park, Sunghyun; Srivastava, Prakalp; Roesch, Jared G.; Mowry, Todd C.; Chen, Tianqi |
| author_facet | Lai, Ruihang; Shao, Junru; Feng, Siyuan; Lyubomirsky, Steven S.; Hou, Bohan; Lin, Wuwei; Ye, Zihao; Jin, Hongyi; Jin, Yuchen; Liu, Jiawei; Jin, Lesheng; Cai, Yaxing; Jiang, Ziheng; Wu, Yong; Park, Sunghyun; Srivastava, Prakalp; Roesch, Jared G.; Mowry, Todd C.; Chen, Tianqi |
| contents | Dynamic shape computations have become critical in modern machine learning workloads, especially in emerging large language models. The success of these models has driven the demand for their universal deployment across a diverse set of backend environments. In this paper, we present Relax, a compiler abstraction for optimizing end-to-end dynamic machine learning workloads. Relax introduces a cross-level abstraction that encapsulates computational graphs, loop-level tensor programs, and external library calls in a single representation. Relax also introduces first-class symbolic shape annotations to track dynamic shape computations globally across the program, enabling dynamic shape-aware cross-level optimizations. We build an end-to-end compilation framework using the proposed approach to optimize dynamic shape models. Experimental results on LLMs show that Relax delivers performance competitive with state-of-the-art systems across various GPUs and enables deployment of emerging models to a broader set of emerging environments, including mobile phones, embedded devices, and web browsers. |
| format | Preprint |
| id | arxiv_https___arxiv_org_abs_2311_02103 |
| institution | arXiv |
| publishDate | 2023 |
| record_format | arxiv |
| title | Relax: Composable Abstractions for End-to-End Dynamic Machine Learning |
| topic | Machine Learning; Artificial Intelligence; Programming Languages |
| url | https://arxiv.org/abs/2311.02103 |