
Research Article
A DNN Inference Acceleration Algorithm in Heterogeneous Edge Computing: Joint Task Allocation and Model Partition
@INPROCEEDINGS{10.1007/978-3-030-67537-0_15,
  author={Lei Shi and Zhigang Xu and Yi Shi and Yuqi Fan and Xu Ding and Yabo Sun},
  title={A DNN Inference Acceleration Algorithm in Heterogeneous Edge Computing: Joint Task Allocation and Model Partition},
  proceedings={Collaborative Computing: Networking, Applications and Worksharing. 16th EAI International Conference, CollaborateCom 2020, Shanghai, China, October 16--18, 2020, Proceedings, Part I},
  proceedings_a={COLLABORATECOM},
  year={2021},
  month={1},
  keywords={Task allocation, Model partition, Edge computing, Edge intelligence},
  doi={10.1007/978-3-030-67537-0_15}
}
Lei Shi
Zhigang Xu
Yi Shi
Yuqi Fan
Xu Ding
Yabo Sun
Year: 2021
COLLABORATECOM
Springer
DOI: 10.1007/978-3-030-67537-0_15
Abstract
Edge intelligence, a new computing paradigm, aims to execute part of an Artificial Intelligence (AI)-based task at the edge in order to reduce latency and energy consumption and to improve privacy. Deep Neural Networks (DNNs), the most important AI technique, are widely used in various fields. For DNN-based tasks, a computing scheme named DNN model partition can further reduce execution time: it partitions the DNN task into two parts, one executed on the end device and the other on an edge server. However, in a complex edge computing system, it is difficult to coordinate DNN model partition and task allocation. In this work, we study this problem in a heterogeneous edge computing system. We first establish a mathematical model of adaptive DNN model partition and task offloading. The model contains a large number of binary variables, so in a multi-task scenario the solution space is too large for the problem to be solved directly. We then use dynamic programming and a greedy strategy to reduce the solution space while still obtaining a good solution, and propose an offline algorithm named GSPI. Considering practical conditions, we also propose an online algorithm. Experiments and simulations show that the GSPI algorithm reduces the system time cost by at least 32% and the online algorithm reduces it by at least 24%.
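As a rough illustration of the model-partition idea in the abstract, the following Python sketch picks a single split point for one task between one end device and one edge server by exhaustive search. It is not the GSPI algorithm or the online algorithm from the paper. The function name `best_partition`, the per-layer timing lists, the output sizes, and the bandwidth value are all hypothetical, and the cost of returning the result to the device is ignored.

```python
from typing import List, Tuple


def best_partition(
    device_ms: List[float],      # assumed per-layer run time on the end device
    server_ms: List[float],      # assumed per-layer run time on the edge server
    out_bytes: List[float],      # assumed output size of each layer
    input_bytes: float,          # assumed size of the raw input
    bandwidth_bytes_per_ms: float,
) -> Tuple[int, float]:
    """Return (k, latency): layers [0, k) run on the device, layers [k, n) on the server."""
    n = len(device_ms)
    best_k, best_latency = 0, float("inf")
    for k in range(n + 1):
        device_time = sum(device_ms[:k])
        server_time = sum(server_ms[k:])
        if k == n:
            # Everything runs locally, so nothing is uploaded.
            transfer_time = 0.0
        else:
            # Upload the raw input (k == 0) or the output of layer k - 1.
            data = input_bytes if k == 0 else out_bytes[k - 1]
            transfer_time = data / bandwidth_bytes_per_ms
        total = device_time + transfer_time + server_time
        if total < best_latency:
            best_k, best_latency = k, total
    return best_k, best_latency


# Hypothetical three-layer model: the device is slower, early layers emit large outputs.
k, latency = best_partition(
    device_ms=[10, 20, 30],
    server_ms=[1, 2, 3],
    out_bytes=[400, 100, 10],
    input_bytes=1000,
    bandwidth_bytes_per_ms=10,
)
print(k, latency)  # 2 43.0
```

With these made-up numbers, running the first two layers on the device and the last on the server is cheapest, because the second layer's output is small enough to upload quickly. The joint problem described in the abstract additionally has to allocate many tasks across heterogeneous servers, which this single-task sketch does not attempt.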