Communications and Networking. 15th EAI International Conference, ChinaCom 2020, Shanghai, China, November 20-21, 2020, Proceedings

Research Article

A Deep Learning Compiler for Vector Processor

Pingping Pan1, Jun Wu2,*, Songyuan Zhao1, Haoqi Ren1, Zhifeng Zhang1
  • 1: Department of Computer Science
  • 2: School of Computer Science
*Contact email: wujun@fudan.edu.cn

Abstract

Deep learning compilers generally replace hand-optimization with automatic or semi-automatic code generation during the optimization process. This paper presents a deep learning compiler (DLCS) for a target vector processor based on the LLVM framework, which lowers deep learning (DL) models to a two-level intermediate representation (IR). The high-level IR performs target-independent optimizations, including kernel fusion, data replacement, and data simplification, while the low-level IR allows the compiler to perform target-dependent optimizations, such as eight-slot VLIW scheduling and special intrinsic functions. The proposed compiler customizes the architecture description of the target vector processor to achieve high-quality automatic code generation. We compare the performance of DLCS against hand-optimization when deploying the ResNet-18 and MobileNet models to the target vector processor. Experimental results show that DLCS exploits the multi-slot parallelism of the target vector processor and achieves speedups ranging from 1.5× to 3.0× over existing frameworks backed by hand-optimized libraries.
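The kernel fusion mentioned in the abstract — merging an element-wise operator into its producer so both execute as one kernel — can be illustrated with a minimal, hypothetical sketch. The `Op` class, the `fuse_kernels` pass, and the op names below are illustrative assumptions, not the compiler's actual IR or API:

```python
from dataclasses import dataclass, field

@dataclass
class Op:
    """A node in a toy high-level IR graph."""
    name: str                  # operator kind, e.g. "conv2d", "relu"
    inputs: list = field(default_factory=list)  # names of producer ops

# Element-wise ops are cheap to fuse into their producer.
ELEMENTWISE = {"relu", "add", "mul"}

def fuse_kernels(ops):
    """Target-independent kernel fusion: when an element-wise op
    consumes only the immediately preceding op, merge the two into
    a single fused kernel node."""
    fused = []
    for op in ops:
        if (op.name in ELEMENTWISE and fused
                and op.inputs == [fused[-1].name]):
            prev = fused.pop()
            fused.append(Op(f"{prev.name}+{op.name}", prev.inputs))
        else:
            fused.append(op)
    return fused

graph = [Op("conv2d", ["x"]), Op("relu", ["conv2d"]), Op("pool", ["relu"])]
print([op.name for op in fuse_kernels(graph)])  # ['conv2d+relu', 'pool']
```

In a real compiler the fused node would then be lowered through the low-level IR, where target-dependent passes (e.g. VLIW slot scheduling and intrinsic selection) generate code for the vector processor.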

Keywords
Deep learning compiler, Target optimization, Code generation, Vector processor
Published
2021-02-02
Appears in
SpringerLink
http://dx.doi.org/10.1007/978-3-030-67720-6_46
Copyright © 2020–2025 ICST