Dimensionality Reduction for Handwritten Digit Recognition

Ankita Das; Tuhin Kundu; Chandran Saravanan

Research Article

Cite (BibTeX):

@ARTICLE{10.4108/eai.12-2-2019.156590,
    author={Ankita Das and Tuhin Kundu and Chandran Saravanan},
    title={Dimensionality Reduction for Handwritten Digit Recognition},
    journal={EAI Endorsed Transactions on Cloud Systems},
    volume={4},
    number={13},
    publisher={EAI},
    journal_a={CS},
    year={2018},
    month={12},
    keywords={Dimensionality Reduction, Feature Descriptors, HOG, Gabor, PCA, LDA, Isomap, SVM, Classification},
    doi={10.4108/eai.12-2-2019.156590}
}

Cite (plain text):

Ankita Das, Tuhin Kundu, Chandran Saravanan. Dimensionality Reduction for Handwritten Digit Recognition. EAI Endorsed Transactions on Cloud Systems (CS), EAI, 2018. DOI: 10.4108/eai.12-2-2019.156590

Ankita Das¹, Tuhin Kundu¹,*, Chandran Saravanan²

1: Computer Science and Engineering, Jalpaiguri Government Engineering College, Jalpaiguri, India
2: Computer Science and Engineering, National Institute of Technology, Durgapur, India

*Contact email: tuhinkundu@outlook.com

Abstract

Human perception is usually limited to two or three dimensions; beyond that, data become difficult to visualize. Machine learning researchers therefore often have to overcome the curse of dimensionality in high-dimensional feature sets by applying dimensionality reduction techniques. In the proposed model, two handwritten digit datasets, CVL Single Digit and MNIST, are used, and two popular feature descriptors, Histogram of Oriented Gradients (HOG) and Gabor filters, generate the feature sets. Linear and nonlinear transformations of the feature sets are investigated using multiple dimensionality reduction techniques such as Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and Isomap. The resulting lower-dimensional vectors are then used to classify the digits with a Support Vector Machine (SVM). With HOG as the feature descriptor and PCA as the dimensionality reduction technique, the experimental model achieved its highest accuracy, 99.29% on the MNIST dataset, with time efficiency comparable to that of a convolutional neural network (CNN). Although the LDA model with HOG as the feature descriptor achieved a lower accuracy of 98.34%, it captured the maximum information in just 9 components of its lower-dimensional subspace, with a 75% reduction in time compared with the PCA-HOG model and the CNN model.
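
To make the dimensionality reduction step concrete, the following is a minimal sketch of PCA in plain Python, using power iteration with deflation on the covariance matrix. It is an illustration only, not the authors' implementation: the function names (`mean_center`, `covariance`, `power_iteration`, `pca`) and the toy data are assumptions, and a real pipeline would apply a library PCA to HOG or Gabor feature vectors before passing the projected vectors to an SVM.

```python
import math
import random


def mean_center(rows):
    """Subtract the per-feature mean from every sample."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix (d x d) of mean-centered data."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def power_iteration(matrix, iters=500, seed=0):
    """Approximate the dominant eigenpair of a symmetric PSD matrix."""
    rng = random.Random(seed)
    d = len(matrix)
    v = [rng.random() + 0.1 for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:  # matrix is numerically zero; nothing left to find
            break
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue estimate for unit vector v.
    eigenvalue = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(d))
                     for i in range(d))
    return eigenvalue, v


def pca(rows, n_components):
    """Project rows onto the top principal components.

    Returns (projected data, component vectors, explained-variance ratios).
    """
    centered = mean_center(rows)
    cov = covariance(centered)
    d = len(cov)
    total_var = sum(cov[i][i] for i in range(d))
    components, ratios = [], []
    for k in range(n_components):
        lam, v = power_iteration(cov, seed=k)
        components.append(v)
        ratios.append(lam / total_var)
        # Deflation: remove the direction just found from the covariance.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    projected = [[sum(r[j] * c[j] for j in range(d)) for c in components]
                 for r in centered]
    return projected, components, ratios


if __name__ == "__main__":
    # Toy 3-D data that actually lies on a line, so one component suffices.
    data = [[float(t), 2.0 * t, 0.0] for t in range(10)]
    projected, components, ratios = pca(data, n_components=1)
    print("first component:", components[0])
    print("explained variance ratio:", ratios[0])
```

The explained-variance ratio is what indicates how many components are worth keeping. LDA behaves differently in this respect: it yields at most C − 1 discriminant directions for C classes, which for the ten digit classes is 9, consistent with the 9 components reported for the LDA-HOG model.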

Keywords: Dimensionality Reduction, Feature Descriptors, HOG, Gabor, PCA, LDA, Isomap, SVM, Classification

Received: 2018-11-03
Accepted: 2018-11-15
Published: 2018-12-07
Publisher: EAI

DOI: http://dx.doi.org/10.4108/eai.12-2-2019.156590

Copyright © 2018 Ankita Das et al., licensed to EAI. This is an open access article distributed under the terms of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/), which permits unlimited use, distribution and reproduction in any medium so long as the original work is properly cited.