
Electronic data

  • ist2018FINAL

    Rights statement: This is the author’s version of a work that was accepted for publication in Information and Software Technology. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Information and Software Technology, 99, 2018 DOI: 10.1016/j.infsof.2018.02.003

    Accepted author manuscript, 416 KB, PDF document

    Available under license: CC BY-NC-ND

Text available via DOI: 10.1016/j.infsof.2018.02.003

Reproducibility and replicability of software defect prediction studies

Research output: Contribution to Journal/Magazine › Review article › peer-review

Published

Standard

Reproducibility and replicability of software defect prediction studies. / Mahmood, Zaheed; Bowes, David; Hall, Tracy et al.
In: Information and Software Technology, Vol. 99, 01.07.2018, p. 148-163.

Vancouver

Mahmood Z, Bowes D, Hall T, Lane PCR, Petrić J. Reproducibility and replicability of software defect prediction studies. Information and Software Technology. 2018 Jul 1;99:148-163. Epub 2018 Feb 10. doi: 10.1016/j.infsof.2018.02.003

Author

Mahmood, Zaheed; Bowes, David; Hall, Tracy et al. / Reproducibility and replicability of software defect prediction studies. In: Information and Software Technology. 2018; Vol. 99, pp. 148-163.

Bibtex

@article{55f480851248425598e15f5c40a066b4,
  title = "Reproducibility and replicability of software defect prediction studies",
  abstract = "Context: Replications are an important part of scientific disciplines. Replications test the credibility of original studies and can separate true results from those that are unreliable. Objective: In this paper we investigate the replication of defect prediction studies and identify the characteristics of replicated studies. We further assess how defect prediction replications are performed and the consistency of replication findings. Method: Our analysis is based on tracking the replication of 208 defect prediction studies identified by a highly cited Systematic Literature Review (SLR) [1]. We identify how often each of these 208 studies has been replicated and determine the type of replication carried out. We identify quality, citation counts, publication venue, impact factor, and data availability from all 208 SLR defect prediction papers to see if any of these factors are associated with the frequency with which they are replicated. Results: Only 13 (6%) of the 208 studies are replicated. Replication seems related to original papers appearing in the Transactions on Software Engineering (TSE) journal. The number of citations an original paper had was also an indicator of replications. In addition, studies conducted using closed source data seem to have more replications than those based on open source data. Where a paper has been replicated, 11 (38%) out of 29 studies revealed different results from the original study. Conclusion: Very few defect prediction studies are replicated. The lack of replication means that it remains unclear how reliable defect prediction is. We provide practical steps for improving the state of replication.",
  keywords = "Replication, Reproducibility, Software defect prediction",
  author = "Zaheed Mahmood and David Bowes and Tracy Hall and Lane, {Peter C.R.} and Jean Petri{\'c}",
  note = "This is the author{\textquoteright}s version of a work that was accepted for publication in Information and Software Technology. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Information and Software Technology, 99, 2018 DOI: 10.1016/j.infsof.2018.02.003",
  year = "2018",
  month = jul,
  day = "1",
  doi = "10.1016/j.infsof.2018.02.003",
  language = "English",
  volume = "99",
  pages = "148--163",
  journal = "Information and Software Technology",
  issn = "0950-5849",
  publisher = "Elsevier",
}

RIS

TY - JOUR

T1 - Reproducibility and replicability of software defect prediction studies

AU - Mahmood, Zaheed

AU - Bowes, David

AU - Hall, Tracy

AU - Lane, Peter C.R.

AU - Petrić, Jean

N1 - This is the author’s version of a work that was accepted for publication in Information and Software Technology. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Information and Software Technology, 99, 2018 DOI: 10.1016/j.infsof.2018.02.003

PY - 2018/7/1

Y1 - 2018/7/1

N2 - Context: Replications are an important part of scientific disciplines. Replications test the credibility of original studies and can separate true results from those that are unreliable. Objective: In this paper we investigate the replication of defect prediction studies and identify the characteristics of replicated studies. We further assess how defect prediction replications are performed and the consistency of replication findings. Method: Our analysis is based on tracking the replication of 208 defect prediction studies identified by a highly cited Systematic Literature Review (SLR) [1]. We identify how often each of these 208 studies has been replicated and determine the type of replication carried out. We identify quality, citation counts, publication venue, impact factor, and data availability from all 208 SLR defect prediction papers to see if any of these factors are associated with the frequency with which they are replicated. Results: Only 13 (6%) of the 208 studies are replicated. Replication seems related to original papers appearing in the Transactions on Software Engineering (TSE) journal. The number of citations an original paper had was also an indicator of replications. In addition, studies conducted using closed source data seem to have more replications than those based on open source data. Where a paper has been replicated, 11 (38%) out of 29 studies revealed different results from the original study. Conclusion: Very few defect prediction studies are replicated. The lack of replication means that it remains unclear how reliable defect prediction is. We provide practical steps for improving the state of replication.

AB - Context: Replications are an important part of scientific disciplines. Replications test the credibility of original studies and can separate true results from those that are unreliable. Objective: In this paper we investigate the replication of defect prediction studies and identify the characteristics of replicated studies. We further assess how defect prediction replications are performed and the consistency of replication findings. Method: Our analysis is based on tracking the replication of 208 defect prediction studies identified by a highly cited Systematic Literature Review (SLR) [1]. We identify how often each of these 208 studies has been replicated and determine the type of replication carried out. We identify quality, citation counts, publication venue, impact factor, and data availability from all 208 SLR defect prediction papers to see if any of these factors are associated with the frequency with which they are replicated. Results: Only 13 (6%) of the 208 studies are replicated. Replication seems related to original papers appearing in the Transactions on Software Engineering (TSE) journal. The number of citations an original paper had was also an indicator of replications. In addition, studies conducted using closed source data seem to have more replications than those based on open source data. Where a paper has been replicated, 11 (38%) out of 29 studies revealed different results from the original study. Conclusion: Very few defect prediction studies are replicated. The lack of replication means that it remains unclear how reliable defect prediction is. We provide practical steps for improving the state of replication.

KW - Replication

KW - Reproducibility

KW - Software defect prediction

U2 - 10.1016/j.infsof.2018.02.003

DO - 10.1016/j.infsof.2018.02.003

M3 - Review article

AN - SCOPUS:85043273528

VL - 99

SP - 148

EP - 163

JO - Information and Software Technology

JF - Information and Software Technology

SN - 0950-5849

ER -