Research Article
An Empirical Study of Predictive Modeling Techniques of Software Quality
@INPROCEEDINGS{10.1007/978-3-642-32615-8_29,
  author={Taghi Khoshgoftaar and Kehan Gao and Amri Napolitano},
  title={An Empirical Study of Predictive Modeling Techniques of Software Quality},
  proceedings={Bio-Inspired Models of Network, Information, and Computing Systems. 5th International ICST Conference, BIONETICS 2010, Boston, USA, December 1-3, 2010, Revised Selected Papers},
  proceedings_a={BIONETICS},
  year={2012},
  month={10},
  keywords={filter-based feature ranking techniques, software defect prediction, software metrics, software quality},
  doi={10.1007/978-3-642-32615-8_29}
}
- Taghi Khoshgoftaar
- Kehan Gao
- Amri Napolitano
Year: 2012
BIONETICS
Springer
DOI: 10.1007/978-3-642-32615-8_29
Abstract
The primary goal of software quality engineering is to apply various techniques and processes to produce a high-quality software product. One strategy is to apply data mining techniques to software metrics and defect data collected during the software development process in order to identify potentially low-quality program modules. In this paper, we investigate the use of feature selection in the context of software quality estimation (also referred to as software defect prediction), where a classification model is used to label program modules (instances) as fault-prone or not-fault-prone. Seven filter-based feature ranking techniques are examined. Six of them are commonly used; the seventh, signal-to-noise ratio (SNR), is rarely employed. The objective of the paper is to compare these seven techniques across various software data sets and assess their effectiveness for software quality modeling. A case study is performed on 16 software data sets, and classification models are built with five different learners. Our experimental results are summarized based on statistical tests for significance. The main conclusion is that the SNR technique performs better than or similarly to the best performer among the six commonly used techniques.
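To make the idea of a filter-based ranker concrete, the following is a minimal sketch of a signal-to-noise style feature score. The abstract does not give the formula, so this assumes one widely used definition: the absolute difference between the two class means divided by the sum of the two class standard deviations. The function names `snr_scores` and `rank_features`, the 0/1 label encoding (1 for fault-prone), and the toy data are all illustrative assumptions, not the authors' implementation.

```python
from statistics import mean, pstdev


def snr_scores(X, y):
    """Score each feature (column of X) with an SNR-style statistic.

    Assumed definition: |mean_fp - mean_nfp| / (std_fp + std_nfp),
    where fp = fault-prone rows (label 1) and nfp = the rest (label 0).
    """
    fp = [row for row, label in zip(X, y) if label == 1]
    nfp = [row for row, label in zip(X, y) if label == 0]
    if not fp or not nfp:
        raise ValueError("both classes must be present")

    scores = []
    for j in range(len(X[0])):
        fp_vals = [row[j] for row in fp]
        nfp_vals = [row[j] for row in nfp]
        gap = abs(mean(fp_vals) - mean(nfp_vals))
        spread = pstdev(fp_vals) + pstdev(nfp_vals)
        if spread > 0:
            scores.append(gap / spread)
        else:
            # Constant within both classes: perfectly separating if the
            # means differ, otherwise uninformative.
            scores.append(float("inf") if gap > 0 else 0.0)
    return scores


def rank_features(X, y, k):
    """Return the indices of the k highest-scoring features."""
    scores = snr_scores(X, y)
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:k]


# Toy data: 4 modules, 3 hypothetical software metrics each.
X = [[10, 1, 5], [12, 2, 5], [1, 1, 5], [3, 2, 5]]
y = [1, 1, 0, 0]
top = rank_features(X, y, 1)
```

In this toy data only the first metric separates the two classes, so it receives the highest score. In a filter-based workflow the selected columns would then be passed to a classifier, and the ranking is computed without reference to any particular learner.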