Software defect prediction framework based on hybrid metaheuristic optimization methods

Wahono, Romi Satria (2015) Software defect prediction framework based on hybrid metaheuristic optimization methods. Doctoral thesis, Universiti Teknikal Malaysia Melaka.

Abstract

A software defect is an error, failure, or fault in software that produces an incorrect or unexpected result. Software defects are expensive in terms of both quality and cost. Accurate prediction of defect-prone software modules can assist testing effort, reduce costs, and improve software quality. Classification is a popular machine learning approach to software defect prediction. Unfortunately, software defect prediction remains a largely unsolved problem. Comparison and benchmarking results for defect prediction with machine learning classifiers indicate that poor accuracy is dominant and that no particular classifier performs best on all datasets. Two main problems affect classification performance in software defect prediction: noisy attributes and imbalanced class distribution in the datasets, and the difficulty of selecting optimal classifier parameters. This study proposes a software defect prediction framework that combines metaheuristic optimization methods for feature selection and parameter optimization with meta-learning methods for handling the class imbalance problem, with the aim of improving the accuracy of classification models. The specific research contributions of this thesis are: 1) a comparison framework of classification models for software defect prediction, known as CF-SDP; 2) a hybrid genetic algorithm based feature selection and bagging technique for software defect prediction, known as GAFS+B; 3) a hybrid particle swarm optimization based feature selection and bagging technique for software defect prediction, known as PSOFS+B; and 4) a hybrid genetic algorithm based neural network parameter optimization and bagging technique for software defect prediction, known as NN-GAPO+B. For the purpose of this study, ten classification algorithms were selected. The selection aims at achieving a balance between established classification algorithms used in software defect prediction. The proposed framework and methods were evaluated using the state-of-the-art datasets from the NASA metric data repository. The results indicate that the proposed methods (GAFS+B, PSOFS+B and NN-GAPO+B) yield an impressive improvement in the performance of software defect prediction. GAFS+B and PSOFS+B significantly affected the performance of classifiers that suffer from class imbalance, such as C4.5 and CART. GAFS+B and PSOFS+B also outperformed existing software defect prediction frameworks on most datasets. Based on the conducted experiments, logistic regression performs best on most of the NASA MDP datasets, with or without feature selection. The proposed methods also identified the relevant features for software defect prediction. The ten most relevant features are branch count, decision density, Halstead level of a module, number of operands contained in a module, maintenance severity, number of blank LOC, Halstead volume, number of unique operands contained in a module, total number of LOC, and design density.
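
The idea behind a genetic algorithm based feature selection and bagging technique such as GAFS+B can be illustrated with a minimal sketch. It is not the thesis implementation: the synthetic data, the nearest-centroid base learner, the balanced-accuracy fitness, and all function names and parameter values below are assumptions chosen only to keep the example self-contained. A bit mask encodes which features are kept, a bagged ensemble is trained on the masked features, and the genetic algorithm searches for the mask with the best validation score.

```python
import random
from collections import Counter

def make_data(n, rng):
    """Synthetic imbalanced data (assumption): features 0-1 informative, 2-5 noise."""
    rows = []
    for _ in range(n):
        y = 1 if rng.random() < 0.2 else 0  # roughly 20% "defective" modules
        x = [rng.gauss(2.0 * y, 1.0), rng.gauss(-2.0 * y, 1.0)]
        x += [rng.gauss(0.0, 1.0) for _ in range(4)]
        rows.append((x, y))
    return rows

def train_centroids(rows, mask):
    """Nearest-centroid base learner restricted to features where mask is 1."""
    idx = [i for i, m in enumerate(mask) if m]
    cents = {}
    for label in (0, 1):
        pts = [x for x, y in rows if y == label]
        if pts:  # a bootstrap sample may miss the minority class
            cents[label] = [sum(p[i] for p in pts) / len(pts) for i in idx]
    return idx, cents

def predict_one(model, x):
    idx, cents = model
    return min(cents, key=lambda c: sum((x[i] - m) ** 2 for i, m in zip(idx, cents[c])))

def train_bagging(rows, mask, n_bags, rng):
    """Bagging: train each base learner on a bootstrap resample of the rows."""
    return [train_centroids([rng.choice(rows) for _ in rows], mask) for _ in range(n_bags)]

def bagged_predict(models, x):
    """Majority vote over the ensemble (odd n_bags avoids ties)."""
    return Counter(predict_one(m, x) for m in models).most_common(1)[0][0]

def fitness(mask, train, valid, seed=0):
    """Balanced accuracy of the bagged ensemble on the validation rows."""
    if not any(mask):
        return 0.0
    models = train_bagging(train, mask, 5, random.Random(seed))
    hits, totals = {0: 0, 1: 0}, {0: 0, 1: 0}
    for x, y in valid:
        totals[y] += 1
        hits[y] += bagged_predict(models, x) == y
    present = [c for c in totals if totals[c]]
    return sum(hits[c] / totals[c] for c in present) / len(present)

def ga_select(train, valid, n_feat, rng, pop_size=12, gens=10, p_mut=0.1):
    """Search over feature masks with truncation selection, one-point
    crossover, bit-flip mutation, and elitism."""
    score = lambda m: fitness(m, train, valid)
    pop = [[rng.randint(0, 1) for _ in range(n_feat)] for _ in range(pop_size)]
    for _ in range(gens):
        ranked = sorted(pop, key=score, reverse=True)
        parents = ranked[: pop_size // 2]
        children = [ranked[0][:]]  # elitism: carry the best mask over unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_feat)
            child = a[:cut] + b[cut:]
            children.append([1 - g if rng.random() < p_mut else g for g in child])
        pop = children
    best = max(pop, key=score)
    return best, score(best)

rng = random.Random(42)
data = make_data(450, rng)
train, valid = data[:300], data[300:]
best_mask, best_score = ga_select(train, valid, n_feat=6, rng=rng)
print(best_mask, round(best_score, 3))
```

Because the fitness function uses a fixed seed for the bootstrap resampling, each mask always receives the same score, so elitism guarantees that the best score never decreases from one generation to the next. A particle swarm variant in the spirit of PSOFS+B would replace only the search loop in `ga_select` and could reuse the same fitness function.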

Item Type: Thesis (Doctoral)
Uncontrolled Keywords: Software failures, Computer software, Quality control, Software failures, Prevention, Software Defect Prediction, Hybrid Metaheuristic Optimization Methods
Subjects: Q Science > Q Science (General)
Q Science > QA Mathematics
Divisions: Library > Tesis > FTMK
Depositing User: Mohd Hannif Jamaludin
Date Deposited: 05 Aug 2016 02:15
Last Modified: 02 Jun 2022 10:39
URI: http://eprints.utem.edu.my/id/eprint/16874