

Code cleaning for software defect prediction: A cautionary tale

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN; Conference contribution/Paper; peer-review

Publication date: 29/08/2018
Host publication: 2018 44th Euromicro Conference on Software Engineering and Advanced Applications (SEAA)
Number of pages: 5
ISBN (Electronic): 9781538673836
Original language: English


In this paper, we describe our experience of developing a new technique to improve defect prediction (code cleaning) which performed very encouragingly on the first two systems on which we evaluated it (both systems had their origins in one company). Code cleaning also worked well on an additional open source system (Eclipse). But our code cleaning technique then performed disappointingly on all 69 subsequent open source systems on which we evaluated it. Without our round two evaluations on these 69 open source systems we would have published misleading prediction results. We discuss the need for performance evaluations to be performed on carefully selected samples of systems if reliable conclusions are to be drawn.