Quality, Reliability, Security and Robustness in Heterogeneous Systems. 15th EAI International Conference, QShine 2019, Shenzhen, China, November 22–23, 2019, Proceedings

Research Article

Accelerating Face Detection Algorithm on the FPGA Using SDAccel

  • @INPROCEEDINGS{10.1007/978-3-030-38819-5_10,
        author={Jie Wang and Wei Leng},
        title={Accelerating Face Detection Algorithm on the FPGA Using SDAccel},
        proceedings={Quality, Reliability, Security and Robustness in Heterogeneous Systems. 15th EAI International Conference, QShine 2019, Shenzhen, China, November 22--23, 2019, Proceedings},
        proceedings_a={QSHINE},
        year={2020},
        month={1},
        keywords={FPGA, Heterogeneous, Face detection, Architecture, High-level synthesis, SDAccel},
        doi={10.1007/978-3-030-38819-5_10}
    }
    
  • Jie Wang
    Wei Leng
    Year: 2020
    Accelerating Face Detection Algorithm on the FPGA Using SDAccel
    QSHINE
    Springer
    DOI: 10.1007/978-3-030-38819-5_10
Jie Wang1,*, Wei Leng1
  • 1: Dalian University of Technology
*Contact email: wangjie1003@163.com

Abstract

In recent years, the rapid growth of big data and computational demand has drawn wide attention to high-performance and heterogeneous computing. For object detection algorithms, practitioners tend to care less about training time than about algorithm run time, energy efficiency, and processing latency. FPGAs offer data-parallel operation, low power consumption, low latency, and reprogrammability, providing both substantial computing power and flexibility. In this paper, the Xilinx SDAccel tool is used to implement a CPU+FPGA heterogeneous computing platform for face detection, in which the FPGA serves as a coprocessor that accelerates the face detection algorithm. A high-level synthesis (HLS) approach lets developers focus on the architecture of the design and lowers the barrier to entry for software developers. The implementation of the Viola-Jones face detection algorithm on an FPGA is taken as an example to demonstrate the SDAccel development process, to explore the potential parallelism of the algorithm, and to show how the hardware circuit can be optimized in a high-level language. Our final design is 70 times faster than a single-threaded CPU implementation.