Broadband Communications, Networks, and Systems. 13th EAI International Conference, BROADNETS 2022, Virtual Event, March 12-13, 2023 Proceedings

Research Article

An Empirical Study on Model Pruning and Quantization

Cite

BibTeX:

@INPROCEEDINGS{10.1007/978-3-031-40467-2_7,
    author={Yuzhe Tian and Tom H. Luan and Xi Zheng},
    title={An Empirical Study on Model Pruning and Quantization},
    proceedings={Broadband Communications, Networks, and Systems. 13th EAI International Conference, BROADNETS 2022, Virtual Event, March 12-13, 2023 Proceedings},
    proceedings_a={BROADNETS},
    year={2023},
    month={7},
    keywords={Model compression; Deep neural network; Edge computing},
    doi={10.1007/978-3-031-40467-2_7}
}

Plain text:

Yuzhe Tian, Tom H. Luan, Xi Zheng. An Empirical Study on Model Pruning and Quantization. BROADNETS, Springer, 2023. DOI: 10.1007/978-3-031-40467-2_7
Yuzhe Tian1, Tom H. Luan2, Xi Zheng1,*
  • 1: School of Computing, Macquarie University, Macquarie Park
  • 2: School of Cyber Engineering, Xidian University, Xi’an
*Contact email: james.zheng@mq.edu.au

Abstract

In machine learning, model compression is vital for resource-constrained Internet of Things (IoT) devices, such as unmanned aerial vehicles (UAVs) and smartphones. Several state-of-the-art (SOTA) compression methods exist, but little work has evaluated these techniques across different models and datasets. In this paper, we present an in-depth study of two SOTA model compression methods, pruning and quantization. We apply these methods to AlexNet, ResNet18, VGG16BN, and VGG19BN on three well-known datasets: Fashion-MNIST, CIFAR-10, and UCI-HAR. Through our study, we conclude that pruning followed by retraining preserves performance (on average, less than 0.5% degradation) while reducing model size (at a 10× compression rate) on spatial-domain datasets (e.g., pictures); on temporal-domain datasets (e.g., motion sensor data), performance degrades more (on average, about 5.0%); and the performance of quantization depends on the pruning rate and the network architecture. We also compare different clustering methods and reveal their impact on model accuracy and quantization ratio. Finally, we suggest some promising directions for future research.
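To make the pipeline described in the abstract concrete, below is a minimal PyTorch sketch of magnitude pruning followed by k-means weight clustering (one common form of quantization). The toy model, the 90% pruning rate, and the 16-cluster setting are illustrative assumptions for this sketch, not the authors' exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune
from sklearn.cluster import KMeans

# Toy stand-in for the networks studied in the paper (AlexNet, VGG, ...).
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

# Step 1: magnitude pruning -- zero out the 90% smallest-magnitude
# weights in each layer, roughly a 10x compression rate on those layers.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.9)
        prune.remove(module, "weight")  # bake the pruning mask in

# (Retraining would happen here: fine-tune the pruned model on the
# original training set to recover the lost accuracy.)

# Step 2: quantization via weight clustering -- replace every surviving
# weight with the nearest of k shared centroids found by k-means.
def cluster_weights(module: nn.Linear, k: int = 16) -> None:
    w = module.weight.data
    mask = w != 0
    vals = w[mask].reshape(-1, 1).cpu().numpy()
    if len(vals) < k:  # too few weights left to form k clusters
        return
    km = KMeans(n_clusters=k, n_init=10).fit(vals)
    centroids = torch.from_numpy(km.cluster_centers_.ravel()).to(w)
    labels = torch.as_tensor(km.labels_, dtype=torch.long)
    w[mask] = centroids[labels]  # each weight snaps to its centroid

for module in model.modules():
    if isinstance(module, nn.Linear):
        cluster_weights(module)

# After clustering, each layer holds at most k distinct nonzero values,
# so its weights can be stored as small indices into a codebook.
print(f"distinct weight values: {model[0].weight.unique().numel()}")
```

Because the clustered layer stores only cluster indices plus a k-entry codebook, the quantization ratio is tied to k and to how many weights survive pruning, which is consistent with the abstract's observation that quantization performance depends on the pruning rate and architecture.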

Keywords
Model compression; Deep neural network; Edge computing
Published
2023-07-30
Appears in
SpringerLink
http://dx.doi.org/10.1007/978-3-031-40467-2_7