Machine learning model deployment for training and execution has been an important topic in industry and academic research over the last decade. Much of this attention has focused on developing toolchains to support specific acceleration hardware. In this paper, we present IREE, a unified compiler and runtime stack whose explicit goal is to scale machine learning programs down to the smallest footprints for mobile and edge devices, while retaining the ability to scale up to larger deployment targets. IREE adopts a compiler-based approach and optimizes for heterogeneous hardware accelerators through the MLIR compiler infrastructure, which provides the means to quickly design and implement multi-level compiler intermediate representations (IRs). Specifically, this paper focuses on TinyIREE, a set of deployment options in IREE that accommodates embedded systems and bare-metal platforms, and demonstrates IREE's intuitive workflow for targeting different ISA extensions and ABIs through LLVM.
TinyIREE: An ML Execution Environment for Embedded Systems from Compilation to Deployment