An Empirical Comparison of Machine Learning Techniques for Software Defect Prediction

Ruchika Malhotra; Rajeev Raje

8th International Conference on Bio-inspired Information and Communications Technologies (formerly BIONETICS)

Research Article

@INPROCEEDINGS{10.4108/icst.bict.2014.257871,
    author={Ruchika Malhotra and Rajeev Raje},
    title={An Empirical Comparison of Machine Learning Techniques for Software Defect Prediction},
    proceedings={8th International Conference on Bio-inspired Information and Communications Technologies (formerly BIONETICS)},
    publisher={ICST},
    proceedings_a={BICT},
    year={2015},
    month={2},
    keywords={defect prediction, object-oriented metrics, machine learning, empirical validation},
    doi={10.4108/icst.bict.2014.257871}
}

Ruchika Malhotra
Rajeev Raje
Year: 2015
An Empirical Comparison of Machine Learning Techniques for Software Defect Prediction
BICT
ACM
DOI: 10.4108/icst.bict.2014.257871

Ruchika Malhotra¹,*, Rajeev Raje¹

1: Indiana University Purdue University

*Contact email: ruchmalh@cs.iupui.edu

Abstract

Software systems are exposed to various types of defects. Identifying defective classes in the early phases of software development is essential for reducing the cost of testing. Software metrics can be used together with defect data to build models that predict defective classes. Various machine learning techniques have been proposed in the literature for analyzing complex relationships and extracting useful information from problems in less time. However, more studies comparing these techniques are needed to provide evidence and establish confidence in the performance of one technique over another. In this paper we address four issues: (i) comparison of machine learning techniques over rarely used data sets, (ii) use of inappropriate performance measures for evaluating defect prediction models, (iii) limited use of statistical tests, and (iv) validation of models on the same data set on which they were trained. To resolve these issues, we compare 18 machine learning techniques to investigate the effect of object-oriented metrics on defective classes. The results are validated on six releases of the ‘MMS’ application package of Android, a recent and widely used mobile operating system. Overall, the results indicate the predictive capability of the machine learning techniques and endorse one particular ML technique for predicting defects.

Keywords: defect prediction, object-oriented metrics, machine learning, empirical validation
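
Issue (iv) in the abstract, validating a model on data other than the data it was trained on, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the toy releases, the single stand-in metric, the trivial "model" that only learns whether higher metric values signal defects, and the choice of area under the ROC curve (AUC) as the performance measure.

```python
from typing import List, Sequence, Tuple

# Hypothetical toy data: one (metric value, defective?) pair per class.
# A single object-oriented metric stands in for a full metric suite.
Release = List[Tuple[float, int]]


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve by pairwise comparison (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need both defective and non-defective classes")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def fit_direction(train: Release) -> float:
    """Learn only whether higher metric values indicate defects (+1) or not (-1)."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    return 1.0 if auc(xs, ys) >= 0.5 else -1.0


def inter_release_auc(train: Release, test: Release) -> float:
    """Fit on one release, then score a different (later) release."""
    sign = fit_direction(train)
    scores = [sign * x for x, _ in test]
    labels = [y for _, y in test]
    return auc(scores, labels)


# Two made-up releases; the values carry no empirical meaning.
release_a: Release = [(2.0, 0), (3.0, 0), (5.0, 0), (8.0, 1), (9.0, 1), (4.0, 1)]
release_b: Release = [(1.0, 0), (6.0, 0), (7.0, 1), (10.0, 1), (3.0, 0)]

if __name__ == "__main__":
    print("train on A, validate on B:", inter_release_auc(release_a, release_b))
```

The point of the sketch is only the separation of roles: `release_a` is used for fitting and `release_b` for evaluation, so the reported score is not inflated by reusing the training data.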

Published: 2015-02-02
Publisher: ICST
Appears in: ACM Digital Library

http://dx.doi.org/10.4108/icst.bict.2014.257871
