Rights statement: © ACM, 2020. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in PROCEEDINGS of EASE 2020: Evaluation and Assessment in Software Engineering http://doi.acm.org/10.1145/3383219.3383236
Available under license: CC BY-NC: Creative Commons Attribution-NonCommercial 4.0 International License
Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review
TY - GEN
T1 - Which Software Faults Are Tests Not Detecting?
AU - Petric, J.
AU - Hall, T.
AU - Bowes, David
N1 - © ACM, 2020. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in PROCEEDINGS of EASE 2020: Evaluation and Assessment in Software Engineering http://doi.acm.org/10.1145/3383219.3383236
PY - 2020/4/15
Y1 - 2020/4/15
AB - Context: Software testing plays an important role in assuring the reliability of systems. Assessing the efficacy of testing remains challenging with few established test effectiveness metrics. Those metrics that have been used (e.g. coverage and mutation analysis) have been criticised for insufficiently differentiating between the faults detected by tests. Objective: We investigate how effective tests are at detecting different types of faults and whether some types of fault evade tests more than others. Our aim is to suggest to developers specific ways in which their tests need to be improved to increase fault detection. Method: We investigate seven fault types and analyse how often each goes undetected in 10 open source systems. We statistically look for any relationship between the test set and faults. Results: Our results suggest that the fault detection rates of unit tests are relatively low, typically finding only about a half of all faults. In addition, conditional boundary and method call removals are less well detected by tests than other fault types. Conclusions: We conclude that the testing of these open source systems needs to be improved across the board. In addition, despite boundary cases being long known to attract faults, tests covering boundaries need particular improvement. Overall, we recommend that developers do not rely only on code coverage and mutation score to measure the effectiveness of their tests.
KW - software testing
KW - test effectiveness
KW - unit tests
KW - Fault detection
KW - Open source software
KW - Open systems
KW - Software reliability
KW - Code coverage
KW - Fault detection rate
KW - Fault types
KW - Mutation analysis
KW - Mutation score
KW - Open source system
KW - Software fault
U2 - 10.1145/3383219.3383236
DO - 10.1145/3383219.3383236
M3 - Conference contribution/Paper
SN - 9781450377317
SP - 160
EP - 169
BT - PROCEEDINGS of EASE 2020: Evaluation and Assessment in Software Engineering
PB - ACM
CY - New York
ER -