Communications and Networking. 17th EAI International Conference, Chinacom 2022, Virtual Event, November 19-20, 2022, Proceedings

Research Article

Optimization of Tensor Operation in Compiler

Cite
@INPROCEEDINGS{10.1007/978-3-031-34790-0_16,
    author={Chenguang Qiu and Jun Wu and Haoqi Ren and Zhifeng Zhang},
    title={Optimization of Tensor Operation in Compiler},
    proceedings={Communications and Networking. 17th EAI International Conference, Chinacom 2022, Virtual Event, November 19-20, 2022, Proceedings},
    proceedings_a={CHINACOM},
    year={2023},
    month={6},
    keywords={MLIR, Deep learning, Compiler, Vector processor},
    doi={10.1007/978-3-031-34790-0_16}
}
Chenguang Qiu1,*, Jun Wu2, Haoqi Ren1, Zhifeng Zhang1
  • 1: Department of Computer Science
  • 2: School of Computer Science
*Contact email: chenguangqcg@163.com

Abstract

This paper proposes an AI compiler architecture that compiles a trained model and deploys it on a DSP chip. The main difficulty in deploying an inference model on a DSP is multiplication between tensors: tensor multiplication is the dominant and most time-consuming operation during model inference, so its efficiency directly constrains inference performance. However, the DSP chip has no matrix computing unit, only a vector computing unit. We define a new dialect in MLIR (Multi-Level Intermediate Representation) to compile AI models efficiently, especially GEMM and convolution operations. The dialect is built on the basic features of mhlo, so it can reuse mhlo's existing optimization passes. We also add support for architecture-specific optimization, mainly lowering algorithms for operations such as GEMM and convolution. Finally, we map the dialect to the LLVM dialect and convert it into LLVM IR (intermediate representation); the advantage of converting to LLVM IR is that finer-grained instruction scheduling can be carried out at the compiler backend. We compare the efficiency of a speech model compiled by the traditional compiler clang against the code generated by our compiler, and the experimental results show that our approach greatly improves efficiency.
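As a rough illustration of the lowering problem the abstract describes, the sketch below decomposes a GEMM into fixed-width vector multiply-accumulate steps, the only shape of work a chip with a vector computing unit (and no matrix unit) can execute. This is a minimal sketch of the general technique, not the paper's dialect or generated code: the SIMD width, the function name gemm_via_vector_ops, and the NumPy implementation are all illustrative assumptions.

import numpy as np

VECTOR_WIDTH = 8  # assumed SIMD lane count of the DSP's vector unit (illustrative)

def gemm_via_vector_ops(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    """Compute C = A @ B using only width-VECTOR_WIDTH vector multiply-accumulates,
    mimicking how a matrix multiply must be lowered for a chip that has a
    vector computing unit but no matrix computing unit."""
    M, K = A.shape
    K2, N = B.shape
    assert K == K2, "inner dimensions must match"
    C = np.zeros((M, N), dtype=A.dtype)
    for i in range(M):
        for k in range(K):
            a_ik = A[i, k]  # scalar broadcast into all vector lanes
            # Walk row k of B in SIMD-width chunks: each step below is one
            # vector multiply-accumulate, the primitive the vector unit provides.
            for j0 in range(0, N, VECTOR_WIDTH):
                j1 = min(j0 + VECTOR_WIDTH, N)
                C[i, j0:j1] += a_ik * B[k, j0:j1]  # one vector MAC
    return C

# Sanity check against the reference matrix multiply.
A = np.random.rand(5, 7).astype(np.float32)
B = np.random.rand(7, 11).astype(np.float32)
assert np.allclose(gemm_via_vector_ops(A, B), A @ B, atol=1e-5)

Each inner step touches a contiguous slice of a row of C, which is the access pattern a vector unit handles well; the paper performs this kind of lowering at the MLIR level (from its mhlo-based dialect down to the LLVM dialect) rather than in source code.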

Keywords
MLIR, Deep learning, Compiler, Vector processor
Published
2023-06-10
Appears in
SpringerLink
http://dx.doi.org/10.1007/978-3-031-34790-0_16