Communications and Networking. 17th EAI International Conference, Chinacom 2022, Virtual Event, November 19-20, 2022, Proceedings

Research Article

A Reconfigurable Convolutional Neural Networks Accelerator Based on FPGA

Cite
BibTeX

    @INPROCEEDINGS{10.1007/978-3-031-34790-0_20,
        author={Yalin Tang and Haoqi Ren and Zhifeng Zhang},
        title={A Reconfigurable Convolutional Neural Networks Accelerator Based on FPGA},
        proceedings={Communications and Networking. 17th EAI International Conference, Chinacom 2022, Virtual Event, November 19-20, 2022, Proceedings},
        proceedings_a={CHINACOM},
        year={2023},
        month={6},
        keywords={convolutional neural network, depthwise convolution, quantization, hardware accelerator, EfficientNet},
        doi={10.1007/978-3-031-34790-0_20}
    }
Yalin Tang1,*, Haoqi Ren1, Zhifeng Zhang1
  • 1: School of Electronics and Information Engineering, Tongji University
*Contact email: 2030815@tongji.edu.cn

Abstract

With the development of lightweight convolutional neural networks (CNNs), these newly proposed networks are more powerful than previous conventional models [4,5] and are well suited to Internet-of-Things (IoT) and edge computing. However, they perform inefficiently on conventional hardware accelerators because of the irregular connectivity in their structures. Although some accelerators based on a unified engine (UE) architecture or a separated engine (SE) architecture handle both standard convolution and depthwise convolution, these versatile structures are still not efficient for lightweight CNNs such as EfficientNet-lite. In this paper, we propose a reconfigurable engine (RE) architecture that improves efficiency for communication scenarios such as IoT and edge computing. In addition, we adopt an integer quantization method to reduce computational complexity and memory access. A block-based calculation scheme further reduces off-chip memory access, and a unique computational mode improves the utilization of the processing elements. The proposed architecture is implemented on a Xilinx ZC706 board with a 100 MHz system clock for EfficientNet-lite0. Our accelerator achieves 196 FPS and 72.9% top-1 accuracy on ImageNet classification, a 27% and 18% speedup over the CPU and GPU of the Pixel 4, respectively.
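The sketch below is not from the paper; it is a minimal NumPy illustration of two operations the abstract refers to: symmetric int8 quantization and depthwise convolution, in which each channel is filtered by its own kernel with no reduction across input channels. The function names, the quantization scales, and the 3x3 kernel size are assumptions chosen for the example.

    # Minimal sketch (assumed scheme, not the authors' accelerator design).
    import numpy as np

    def quantize_int8(x, scale):
        """Symmetric quantization: map float values to int8 with a fixed scale."""
        return np.clip(np.round(x / scale), -128, 127).astype(np.int8)

    def depthwise_conv2d(x_q, w_q):
        """Depthwise convolution on quantized data: one KxK kernel per channel,
        accumulated in int32 as a hardware MAC array would."""
        C, H, W = x_q.shape
        K = w_q.shape[-1]
        out = np.zeros((C, H - K + 1, W - K + 1), dtype=np.int32)
        for c in range(C):                      # each channel uses its own kernel
            for i in range(out.shape[1]):
                for j in range(out.shape[2]):
                    patch = x_q[c, i:i + K, j:j + K].astype(np.int32)
                    out[c, i, j] = np.sum(patch * w_q[c].astype(np.int32))
        return out

    # Example: quantize a feature map and weights, then run the depthwise conv.
    x = np.random.randn(8, 16, 16).astype(np.float32)   # C x H x W feature map
    w = np.random.randn(8, 3, 3).astype(np.float32)     # one 3x3 kernel per channel
    y_q = depthwise_conv2d(quantize_int8(x, 0.05), quantize_int8(w, 0.01))
    print(y_q.shape)                                     # (8, 14, 14)

Because each output value here depends on only one input channel, a processing-element array sized for standard convolution (which reduces over all input channels) is left largely idle on such layers; this is the utilization gap that a reconfigurable engine is meant to close.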

Keywords
convolutional neural network, depthwise convolution, quantization, hardware accelerator, EfficientNet
Published
2023-06-10
Appears in
SpringerLink
http://dx.doi.org/10.1007/978-3-031-34790-0_20