
Research Article
A Deep Learning Compiler for Vector Processor
@INPROCEEDINGS{10.1007/978-3-030-67720-6_46,
  author={Pingping Pan and Jun Wu and Songyuan Zhao and Haoqi Ren and Zhifeng Zhang},
  title={A Deep Learning Compiler for Vector Processor},
  proceedings={Communications and Networking. 15th EAI International Conference, ChinaCom 2020, Shanghai, China, November 20-21, 2020, Proceedings},
  proceedings_a={CHINACOM},
  year={2021},
  month={2},
  keywords={Deep learning compiler, Target optimization, Code generation, Vector processor},
  doi={10.1007/978-3-030-67720-6_46}
}
Pingping Pan
Jun Wu
Songyuan Zhao
Haoqi Ren
Zhifeng Zhang
Year: 2021
CHINACOM
Springer
DOI: 10.1007/978-3-030-67720-6_46
Abstract
Machine learning compilers generally replace hand-optimization with automatic or semi-automatic code generation during the optimization process. This paper presents a deep learning compiler (DLCS) for a target vector processor, built on the LLVM framework, which lowers deep learning (DL) models to a two-level intermediate representation (IR). The high-level IR performs target-independent optimizations, including kernel fusion, data replacement, and data simplification, while the low-level IR lets the compiler perform target-dependent optimizations, such as exploiting the eight-slot VLIW and special intrinsic functions. The proposed compiler customizes the architecture description of the target vector processor to achieve high-quality automatic code generation. We compare DLCS with hand-optimization when deploying the ResNet-18 and MobileNet models to the target vector processor. Experimental results show that DLCS delivers multi-slot parallel performance on the target vector processor and achieves speedups ranging from 1.5× to 3.0× over existing frameworks backed by hand-optimized libraries.
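The abstract names kernel fusion as one of the target-independent optimizations applied on the high-level IR but does not show what such a pass looks like. The following is a minimal sketch of the general idea only, not DLCS's actual IR or pass: the `Op` record, the `FUSABLE` rule table, and the `fuse_kernels` function are hypothetical names introduced for illustration, and the IR is assumed to be a simple linear list of operations.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Op:
    """One node of a toy linear high-level IR (hypothetical, not DLCS's IR)."""
    name: str                # operator name, e.g. "conv2d" or "relu"
    inputs: Tuple[str, ...]  # names of the values this op consumes
    output: str              # name of the value this op produces


# Hypothetical rule table: (producer, consumer) -> fused operator name.
FUSABLE: Dict[Tuple[str, str], str] = {("conv2d", "relu"): "conv2d_relu"}


def fuse_kernels(ops: List[Op]) -> List[Op]:
    """Fuse adjacent producer/consumer pairs listed in FUSABLE.

    A pair is fused only when the consumer reads nothing but the producer's
    output and no other op reads that intermediate value, so dropping the
    intermediate is safe.
    """
    # Count how many times each value is read anywhere in the graph.
    uses: Dict[str, int] = {}
    for op in ops:
        for value in op.inputs:
            uses[value] = uses.get(value, 0) + 1

    fused: List[Op] = []
    i = 0
    while i < len(ops):
        cur = ops[i]
        nxt = ops[i + 1] if i + 1 < len(ops) else None
        if (
            nxt is not None
            and (cur.name, nxt.name) in FUSABLE
            and nxt.inputs == (cur.output,)
            and uses.get(cur.output, 0) == 1
        ):
            # Replace the pair with one op that keeps the producer's inputs
            # and the consumer's output name.
            fused.append(Op(FUSABLE[(cur.name, nxt.name)], cur.inputs, nxt.output))
            i += 2
        else:
            fused.append(cur)
            i += 1
    return fused


# A small residual-style fragment: conv -> relu -> conv -> add.
graph = [
    Op("conv2d", ("x", "w0"), "t0"),
    Op("relu", ("t0",), "t1"),
    Op("conv2d", ("t1", "w1"), "t2"),
    Op("add", ("t2", "x"), "t3"),
]
optimized = fuse_kernels(graph)
print([op.name for op in optimized])  # ['conv2d_relu', 'conv2d', 'add']
```

Fusing at this level is target-independent because it only rewrites the operator graph; in a two-level design like the one described, mapping the fused operator onto VLIW slots and intrinsics would be left to the low-level IR.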