Building an Ensemble for Software Defect Prediction Based on Diversity Selection

Electronic data

ESEM2016_paper_157
Rights statement: © ACM, 2016. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in ESEM '16 Proceedings of the 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement http://dx.doi.org/10.1145/2961111.2962610
Accepted author manuscript, 297 KB, PDF document
Available under license: CC BY-NC: Creative Commons Attribution-NonCommercial 4.0 International License

Text available via DOI:

https://doi.org/10.1145/2961111.2962610
Final published version

Keywords

diversity, ensembles of learning machines, software defect prediction, software faults, stacking

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published

Jean Petrić
David Bowes
Tracy Hall
Bruce Christianson
Nathan Baddoo

Publication date	8/09/2016
Host publication	ESEM '16 Proceedings of the 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement
Place of Publication	New York
Publisher	Association for Computing Machinery, Inc
Number of pages	10
ISBN (electronic)	9781450344272
Original language	English
Event	10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM 2016 - Ciudad Real, Spain
Duration	8/09/2016 → 9/09/2016

Conference

Conference	10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, ESEM 2016
Country/Territory	Spain
City	Ciudad Real
Period	8/09/16 → 9/09/16

Abstract

Background: Ensemble techniques have gained attention in various scientific fields. Defect prediction researchers have investigated many state-of-the-art ensemble models and concluded that in many cases these outperform standard single-classifier techniques. Almost all previous work using ensemble techniques in defect prediction relies on the majority voting scheme for combining prediction outputs, and on the implicit diversity among single classifiers. Aim: Investigate whether defect prediction can be improved using an explicit diversity technique with a stacking ensemble, given that different classifiers identify different sets of defects. Method: We used classifiers from four different families and the weighted accuracy diversity (WAD) technique to exploit diversity amongst classifiers. To combine individual predictions, we used the stacking ensemble technique. We used state-of-the-art knowledge in software defect prediction to build our ensemble models, and tested their prediction abilities on 8 publicly available data sets. Conclusion: The results show a performance improvement using stacking ensembles compared to other defect prediction models. Diversity amongst the classifiers used for building ensembles is essential to achieving these performance improvements.
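
The Aim rests on the observation that different classifiers identify different sets of defects. As a rough illustration only, the sketch below computes pairwise disagreement, a generic diversity measure, between classifier outputs. It is not the WAD technique used in the paper; the classifier names (`clf_a`, `clf_b`, `clf_c`), the helper functions, and the toy labels are all hypothetical and do not come from the paper's data sets.

```python
from itertools import combinations


def disagreement(preds_a, preds_b):
    """Fraction of modules on which two classifiers give different labels."""
    if len(preds_a) != len(preds_b):
        raise ValueError("prediction vectors must have equal length")
    differing = sum(a != b for a, b in zip(preds_a, preds_b))
    return differing / len(preds_a)


def defects_found(preds, truth):
    """Indices of truly defective modules that a classifier flags as defective."""
    return {i for i, (p, t) in enumerate(zip(preds, truth)) if p == 1 and t == 1}


# Hypothetical toy data (assumption): 1 = defective module, 0 = clean module.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
predictions = {
    "clf_a": [1, 1, 0, 0, 0, 0, 0, 1],
    "clf_b": [0, 0, 1, 1, 0, 0, 1, 0],
    "clf_c": [1, 1, 0, 0, 0, 0, 1, 0],
}

# Disagreement for every pair of classifiers.
pairwise = {
    (a, b): disagreement(predictions[a], predictions[b])
    for a, b in combinations(predictions, 2)
}

# The pair that disagrees most, and the defects the two find between them.
most_diverse_pair = max(pairwise, key=pairwise.get)
covered = (
    defects_found(predictions[most_diverse_pair[0]], truth)
    | defects_found(predictions[most_diverse_pair[1]], truth)
)

for pair, score in pairwise.items():
    print(pair, score)
print("most diverse pair:", most_diverse_pair)
print("defective modules found by that pair:", sorted(covered))
```

In this toy data each classifier finds exactly two of the four defective modules, yet the most diverse pair together flags all four. That kind of complementarity is what a combiner such as stacking could exploit, whereas a selection based on individual accuracy alone would not distinguish between the three classifiers.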
