sis 19(20): e2

Research Article

A Hybrid Approach for Breast Cancer Classification and Diagnosis

  • @ARTICLE{10.4108/eai.19-12-2018.156086,
        author={Bibhuprasad Sahu and Sachi Nandan Mohanty and Saroj Kumar Rout},
        title={A Hybrid Approach for Breast Cancer Classification and Diagnosis},
        journal={EAI Endorsed Transactions on Scalable Information Systems},
        keywords={Breast cancer diagnosis, feature selection, PCA, ANN},
    }
  • Bibhuprasad Sahu
    Sachi Nandan Mohanty
    Saroj Kumar Rout
    Year: 2019
    A Hybrid Approach for Breast Cancer Classification and Diagnosis
    DOI: 10.4108/eai.19-12-2018.156086
Bibhuprasad Sahu1,*, Sachi Nandan Mohanty2, Saroj Kumar Rout2
  • 1: Research Scholar, North Orissa University, Baripada, Odisha
  • 2: Gandhi Institute for Technology, Bhubaneswar, Odisha
*Contact email:


Feature selection is an important but risky task in the analysis of breast cancer data. Breast cancer is the second leading cause of death among women. The cancer starts in the breast and can spread to other parts of the body. Patients are often unable to identify the disease before it becomes dangerous, yet it can be cured if it is identified at an early stage. Accurate classification of benign tumours can spare patients unnecessary treatment. Data analytics and machine learning methods provide a framework for prognostic studies by accurately classifying data instances into the relevant categories according to cancer severity. In this study we propose a prediction model that combines an artificial-intelligence-based learning technique with a multivariate statistical method. Data mining plays a significant role in automating the diagnosis process, but the data sets available in different repositories are noisy. This study suggests a hybrid feature selection method that uses Principal Component Analysis (PCA) together with an Artificial Neural Network (ANN). PCA is used to preprocess the data and extract the most relevant features. The proposed algorithm is tested on the Wisconsin Breast Cancer Dataset from the UCI Repository of Machine Learning Databases. In the classification phase, 10-fold cross-validation was used. The suggested algorithm was compared against different classifier algorithms on the same dataset. In the evaluation, the proposed algorithm achieved better accuracy, sensitivity, and F-measure than the other classifiers. Extending this concept offers future scope for producing more sophisticated learning models for diagnosis.
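
The evaluation protocol named in the abstract (10-fold cross-validation scored by accuracy, sensitivity, and F-measure) can be sketched in plain Python. This is only an illustrative sketch, not the authors' implementation: the helper names `k_fold_indices` and `binary_metrics`, the contiguous unshuffled folds, and the convention that label 1 means malignant are all assumptions made for the example. The PCA and ANN steps are not shown; they would be fitted on each training split and scored on the matching test split.

```python
from typing import Dict, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k (train, test) index pairs.

    Folds are contiguous and differ in size by at most one sample.
    Shuffling and stratification are left out of this sketch (assumption).
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if i < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity (recall) and F-measure for binary labels.

    Label 1 is treated as the positive (malignant) class; this is an
    assumption of the sketch, not something the source specifies.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f_measure = 2 * precision * sensitivity / denom if denom else 0.0
    return {"accuracy": accuracy, "sensitivity": sensitivity, "f_measure": f_measure}
```

In use, one would loop over `k_fold_indices(len(labels), 10)`, train the classifier on each training split, collect its predictions on the test split, and pass the pooled or per-fold predictions to `binary_metrics`. Sensitivity is reported alongside accuracy because a missed malignant case (a false negative) is the costliest error in a diagnostic setting.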