phat 22(5): e1

Research Article

Augmentation of Predictive Competence of Non-Small Cell Lung Cancer Datasets through Feature Pre-Processing Techniques

  • @ARTICLE{10.4108/eetpht.v8i5.3169,
        author={M. Sumalatha and Latha Parthiban},
        title={Augmentation of Predictive Competence of Non-Small Cell Lung Cancer Datasets through Feature Pre-Processing Techniques},
        journal={EAI Endorsed Transactions on Pervasive Health and Technology},
        volume={8},
        number={5},
        publisher={EAI},
        journal_a={PHAT},
        year={2022},
        month={11},
        keywords={Non-small Cell Lung Cancer, Competency of Prediction, Relevancy Analysis, Regression Analysis, Cluster Analysis, Feature Pre-Processing (FPP Model), Competency Analytics},
        doi={10.4108/eetpht.v8i5.3169}
    }
    
  • M. Sumalatha
    Latha Parthiban
    Year: 2022
    Augmentation of Predictive Competence of Non-Small Cell Lung Cancer Datasets through Feature Pre-Processing Techniques
    PHAT
    EAI
    DOI: 10.4108/eetpht.v8i5.3169
M. Sumalatha¹,*, Latha Parthiban²
  • 1: Periyar University
  • 2: Pondicherry University
*Contact email: latha7sumaphd@gmail.com

Abstract

The main objective of this study is to improve predictive analytics on Non-Small Cell Lung Cancer (NSCLC) datasets using a Feature Pre-Processing (FPP) technique with three stages: (1) removing base errors through common checks for empty, non-numerical, or missing values in the dataset; (2) removing repeated features through regression analysis; and (3) eliminating irrelevant features through clustering methods. The FPP model is validated using classifiers including simple and complex trees, linear and Gaussian SVM, weighted KNN, and boosted trees, in terms of accuracy, sensitivity, specificity, kappa, and positive and negative likelihood. The results showed that the NSCLC dataset produced by FPP outperformed the raw NSCLC dataset on all performance measures, indicating a clear improvement in predictive analytics for NSCLC datasets. The study concludes that preprocessing is essential for better prediction on complex medical datasets.
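The validation measures named in the abstract can all be derived from a binary confusion matrix. The following is a minimal sketch, not the authors' code: the function name `binary_metrics` and the example counts are hypothetical, "kappa" is assumed to mean Cohen's kappa, and "positive and negative likelihood" is assumed to mean the positive and negative likelihood ratios (LR+ and LR-).

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute common validation measures from confusion-matrix counts.

    tp, fp, fn, tn: true positives, false positives,
    false negatives, true negatives (hypothetical inputs).
    """
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate

    # Cohen's kappa (assumed meaning of "kappa"):
    # observed agreement corrected for chance agreement.
    p_observed = accuracy
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (p_observed - p_expected) / (1 - p_expected)

    # Likelihood ratios (assumed meaning of "positive and negative likelihood").
    # LR+ is infinite when there are no false positives.
    lr_positive = (sensitivity / (1 - specificity)
                   if specificity < 1 else float("inf"))
    # LR- is infinite when there are no true negatives.
    lr_negative = ((1 - sensitivity) / specificity
                   if specificity > 0 else float("inf"))

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
        "lr_positive": lr_positive,
        "lr_negative": lr_negative,
    }


# Illustrative counts only; these are not results from the paper.
example = binary_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

Computing the same measures for a classifier trained on the raw dataset and for one trained on the pre-processed dataset gives the kind of side-by-side comparison the abstract describes.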