Proceedings of the 2nd International Conference on Machine Learning and Automation, CONF-MLA 2024, November 21, 2024, Adana, Turkey

Research Article

Optimizing Human Pose Estimation Using a Simplified UNet Architecture: An Experimental Analysis on Depth and Width Parameters

Shenghao Ren*
Tongji University, Shanghai, China
*Contact email: 2252452@tongji.edu.cn

Abstract

Human pose estimation (HPE) is an important problem in computer vision, with applications in action recognition, intelligent surveillance, and other areas. Deep learning has significantly improved pose estimation accuracy. However, high-precision models typically have complex network structures and high computational costs, which makes them difficult to deploy in resource-constrained or real-time scenarios. To address this issue, this paper proposes SimpleUNet, a simple convolutional neural network based on UNet, and applies it to human keypoint detection on a dataset of 2,000 athlete images, each paired with an annotation image that visualizes 14 joints. SimpleUNet exposes two adjustable parameters that control the network structure: depth, defined as the number of convolutional modules in the encoder and decoder, and width, defined as the number of channels in the network. We varied the depth from 10 to 100 in steps of 10 and the width from 1 to 9 in steps of 1, for a total of 90 experiments (10 depths × 9 widths). For each experiment we recorded the best model together with its loss, accuracy, and mIoU, and used these to analyze the relationship between network complexity and performance on human keypoint detection. We found that moderate depth and width give the best pose estimation performance, while excessively large or small values of either parameter each have drawbacks.
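The depth and width sweep described in the abstract can be sketched as a plain grid enumeration. This is a minimal illustration under stated assumptions, not the author's code: the names `build_grid`, `run_sweep`, and `train_and_evaluate` are hypothetical, and selecting the best configuration by mIoU is an assumption, since the abstract does not say which recorded metric decides the ranking.

```python
from itertools import product

# Ranges taken from the abstract: depth 10..100 in steps of 10, width 1..9 in steps of 1.
DEPTHS = range(10, 101, 10)
WIDTHS = range(1, 10)


def build_grid():
    """Return every (depth, width) pair in the sweep."""
    return list(product(DEPTHS, WIDTHS))


def run_sweep(train_and_evaluate):
    """Run one experiment per configuration.

    `train_and_evaluate(depth, width)` is a hypothetical callback that trains
    a model and returns a dict with at least an "miou" entry.
    Returns (results, best_config), where best_config has the highest mIoU
    (an assumed selection rule).
    """
    results = {}
    for depth, width in build_grid():
        results[(depth, width)] = train_and_evaluate(depth, width)
    best_config = max(results, key=lambda cfg: results[cfg]["miou"])
    return results, best_config


if __name__ == "__main__":
    print(len(build_grid()))  # 10 depths x 9 widths = 90 configurations
```

Keeping the training routine behind a callback lets the same loop drive any model implementation, which is convenient when the sweep is the only part the text fully specifies.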

Keywords
human pose estimation; human keypoint detection; network structure adjustment; UNet; LSP dataset
Published
2025-03-11
Publisher
EAI
http://dx.doi.org/10.4108/eai.21-11-2024.2354631
Copyright © 2024–2025 EAI