Home > Research > Publications & Outputs > Which Software Faults Are Tests Not Detecting?

Electronic data

  • Full paper

    Rights statement: © ACM, 2020. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in PROCEEDINGS of EASE 2020: Evaluation and Assessment in Software Engineering http://doi.acm.org/10.1145/3383219.3383236

    Accepted author manuscript, 174 KB, PDF document

    Available under license: CC BY-NC: Creative Commons Attribution-NonCommercial 4.0 International License

Links

Text available via DOI:

View graph of relations

Which Software Faults Are Tests Not Detecting?

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSNConference contribution/Paperpeer-review

Published
Publication date15/04/2020
Host publicationPROCEEDINGS of EASE 2020: Evaluation and Assessment in Software Engineering
Place of PublicationNew York
PublisherACM
Pages160-169
Number of pages10
ISBN (print)9781450377317
<mark>Original language</mark>English

Abstract

Context: Software testing plays an important role in assuring the reliability of systems. Assessing the efficacy of testing remains challenging with few established test effectiveness metrics. Those metrics that have been used (e.g. coverage and mutation analysis) have been criticised for insufficiently differentiating between the faults detected by tests. Objective: We investigate how effective tests are at detecting different types of faults and whether some types of fault evade tests more than others. Our aim is to suggest to developers specific ways in which their tests need to be improved to increase fault detection. Method: We investigate seven fault types and analyse how often each goes undetected in 10 open source systems. We statistically look for any relationship between the test set and faults. Results: Our results suggest that the fault detection rates of unit tests are relatively low, typically finding only about a half of all faults. In addition, conditional boundary and method call removals are less well detected by tests than other fault types. Conclusions: We conclude that the testing of these open source systems needs to be improved across the board. In addition, despite boundary cases being long known to attract faults, tests covering boundaries need particular improvement. Overall, we recommend that developers do not rely only on code coverage and mutation score to measure the effectiveness of their tests.

Bibliographic note

© ACM, 2020. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in PROCEEDINGS of EASE 2020: Evaluation and Assessment in Software Engineering http://doi.acm.org/10.1145/3383219.3383236