Rights statement: © © 2018 Copyright held by the owner/author(s). Publication rights licensed to Association for Computing Machinery. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in PROMISE'18 Proceedings of the 14th International Conference on Predictive Models and Data Analytics in Software Engineeringhttp://dx.doi.org/10.1145/3273934.3273939
Accepted author manuscript, 699 KB, PDF document
Available under license: CC BY-NC: Creative Commons Attribution-NonCommercial 4.0 International License
Final published version
Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review
Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review
}
TY - GEN
T1 - How Effectively Is Defective Code Actually Tested?
T2 - An Analysis of JUnit Tests in Seven Open Source Systems
AU - Petric, Jean
AU - Hall, Tracy
AU - Bowes, David
N1 - © © 2018 Copyright held by the owner/author(s). Publication rights licensed to Association for Computing Machinery. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in PROMISE'18 Proceedings of the 14th International Conference on Predictive Models and Data Analytics in Software Engineeringhttp://dx.doi.org/10.1145/3273934.3273939
PY - 2018/10/10
Y1 - 2018/10/10
N2 - Background: Newspaper headlines still regularly report latent software defects. Such defects have often evaded testing for many years. It remains difficult to identify how well a system has been tested. It also remains difficult to assess how successful at finding defects particular tests are. Coverage and mutation testing are frequently used to asses test effectiveness. We look more deeply at the performance of commonly used JUnit testing by assessing how much JUnit testing was done and how effective that testing was at detecting defects in seven open source systems.Aim: We aim to identify whether defective code has been effectively tested by JUnit tests as non-defective code. We also aim to identify the characteristics of JUnit tests that are related to identifying defects.Methodology: We first extract the defects from seven open source projects using the SZZ algorithm. We match those defects with JUnit tests to identify the proportion of defects that were covered by JUnit tests. We also do the same for non-defective code. We then use Principal Component Analysis and machine learning to investigate the characteristics of JUnit tests that were successful in identifying defects.Results: Our findings suggest that most of the open source systems we investigated are under-tested. On average over 66% of defective methods were not linked to any JUnit tests. We show that the number of methods touched by a JUnit test is strongly related to that test uncovering a defect.Conclusion: More JUnit tests need to be produced for the seven open source systems that we investigate. JUnit tests need to be relatively sophisticated, in particular they should touch more than just one method during the test.
AB - Background: Newspaper headlines still regularly report latent software defects. Such defects have often evaded testing for many years. It remains difficult to identify how well a system has been tested. It also remains difficult to assess how successful at finding defects particular tests are. Coverage and mutation testing are frequently used to asses test effectiveness. We look more deeply at the performance of commonly used JUnit testing by assessing how much JUnit testing was done and how effective that testing was at detecting defects in seven open source systems.Aim: We aim to identify whether defective code has been effectively tested by JUnit tests as non-defective code. We also aim to identify the characteristics of JUnit tests that are related to identifying defects.Methodology: We first extract the defects from seven open source projects using the SZZ algorithm. We match those defects with JUnit tests to identify the proportion of defects that were covered by JUnit tests. We also do the same for non-defective code. We then use Principal Component Analysis and machine learning to investigate the characteristics of JUnit tests that were successful in identifying defects.Results: Our findings suggest that most of the open source systems we investigated are under-tested. On average over 66% of defective methods were not linked to any JUnit tests. We show that the number of methods touched by a JUnit test is strongly related to that test uncovering a defect.Conclusion: More JUnit tests need to be produced for the seven open source systems that we investigate. JUnit tests need to be relatively sophisticated, in particular they should touch more than just one method during the test.
KW - JUnit tests, Software testing, test effectiveness
U2 - 10.1145/3273934.3273939
DO - 10.1145/3273934.3273939
M3 - Conference contribution/Paper
SN - 9781450365932
T3 - PROMISE'18
SP - 42
EP - 51
BT - Proceedings of the 14th International Conference on Predictive Models and Data Analytics in Software Engineering
PB - ACM
CY - New York, NY, USA
ER -