Electronic data

  • testing_effectiveness_paper

    Rights statement: © 2018 Copyright held by the owner/author(s). Publication rights licensed to Association for Computing Machinery. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in PROMISE'18: Proceedings of the 14th International Conference on Predictive Models and Data Analytics in Software Engineering, http://dx.doi.org/10.1145/3273934.3273939

    Accepted author manuscript, 699 KB, PDF document

    Available under license: CC BY-NC: Creative Commons Attribution-NonCommercial 4.0 International License

Links

Text available via DOI: https://doi.org/10.1145/3273934.3273939


How Effectively Is Defective Code Actually Tested?: An Analysis of JUnit Tests in Seven Open Source Systems

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published

Standard

How Effectively Is Defective Code Actually Tested? An Analysis of JUnit Tests in Seven Open Source Systems. / Petric, Jean; Hall, Tracy; Bowes, David.
Proceedings of the 14th International Conference on Predictive Models and Data Analytics in Software Engineering. New York, NY, USA: ACM, 2018. p. 42-51 (PROMISE'18).

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Harvard

Petric, J, Hall, T & Bowes, D 2018, How Effectively Is Defective Code Actually Tested? An Analysis of JUnit Tests in Seven Open Source Systems. in Proceedings of the 14th International Conference on Predictive Models and Data Analytics in Software Engineering. PROMISE'18, ACM, New York, NY, USA, pp. 42-51. https://doi.org/10.1145/3273934.3273939

APA

Petric, J., Hall, T., & Bowes, D. (2018). How Effectively Is Defective Code Actually Tested? An Analysis of JUnit Tests in Seven Open Source Systems. In Proceedings of the 14th International Conference on Predictive Models and Data Analytics in Software Engineering (pp. 42-51). (PROMISE'18). ACM. https://doi.org/10.1145/3273934.3273939

Vancouver

Petric J, Hall T, Bowes D. How Effectively Is Defective Code Actually Tested? An Analysis of JUnit Tests in Seven Open Source Systems. In Proceedings of the 14th International Conference on Predictive Models and Data Analytics in Software Engineering. New York, NY, USA: ACM. 2018. p. 42-51. (PROMISE'18). doi: 10.1145/3273934.3273939

Author

Petric, Jean ; Hall, Tracy ; Bowes, David. / How Effectively Is Defective Code Actually Tested? An Analysis of JUnit Tests in Seven Open Source Systems. Proceedings of the 14th International Conference on Predictive Models and Data Analytics in Software Engineering. New York, NY, USA : ACM, 2018. pp. 42-51 (PROMISE'18).

Bibtex

@inproceedings{c861969b7865464b8974b5d06e64a1b5,
title = "How Effectively Is Defective Code Actually Tested?: An Analysis of JUnit Tests in Seven Open Source Systems",
abstract = "Background: Newspaper headlines still regularly report latent software defects. Such defects have often evaded testing for many years. It remains difficult to identify how well a system has been tested. It also remains difficult to assess how successful particular tests are at finding defects. Coverage and mutation testing are frequently used to assess test effectiveness. We look more deeply at the performance of commonly used JUnit testing by assessing how much JUnit testing was done and how effective that testing was at detecting defects in seven open source systems. Aim: We aim to identify whether defective code has been as effectively tested by JUnit tests as non-defective code. We also aim to identify the characteristics of JUnit tests that are related to identifying defects. Methodology: We first extract the defects from seven open source projects using the SZZ algorithm. We match those defects with JUnit tests to identify the proportion of defects that were covered by JUnit tests. We also do the same for non-defective code. We then use Principal Component Analysis and machine learning to investigate the characteristics of JUnit tests that were successful in identifying defects. Results: Our findings suggest that most of the open source systems we investigated are under-tested. On average, over 66% of defective methods were not linked to any JUnit tests. We show that the number of methods touched by a JUnit test is strongly related to that test uncovering a defect. Conclusion: More JUnit tests need to be produced for the seven open source systems that we investigate. JUnit tests need to be relatively sophisticated; in particular, they should touch more than just one method during the test.",
keywords = "JUnit tests, Software testing, test effectiveness",
author = "Jean Petric and Tracy Hall and David Bowes",
note = "{\textcopyright} 2018 Copyright held by the owner/author(s). Publication rights licensed to Association for Computing Machinery. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in PROMISE'18: Proceedings of the 14th International Conference on Predictive Models and Data Analytics in Software Engineering, http://dx.doi.org/10.1145/3273934.3273939",
year = "2018",
month = oct,
day = "10",
doi = "10.1145/3273934.3273939",
language = "English",
isbn = "9781450365932",
series = "PROMISE'18",
publisher = "ACM",
pages = "42--51",
booktitle = "Proceedings of the 14th International Conference on Predictive Models and Data Analytics in Software Engineering",

}
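
The methodology summarised in the abstract above links methods flagged as defective (via an SZZ-style analysis) to the JUnit tests that exercise them, and reports that on average over 66% of defective methods were not linked to any test. The Python sketch below is a minimal, hypothetical illustration of that matching step; the data structures and identifiers are invented for illustration and are not the authors' tooling.

# Hypothetical sketch of the defect-to-test matching step described in the
# abstract: given methods flagged as defective (e.g. by an SZZ-style analysis)
# and a map from each JUnit test to the production methods it touches,
# estimate the share of defective methods that no test ever reaches.

def untested_defective_ratio(defective_methods, test_to_methods):
    """Return the fraction of defective methods not touched by any JUnit test."""
    covered = set()
    for methods in test_to_methods.values():
        covered.update(methods)
    untested = [m for m in defective_methods if m not in covered]
    return len(untested) / len(defective_methods) if defective_methods else 0.0

# Toy example with made-up method and test identifiers.
defective = {"Parser.parse", "Cache.evict", "Util.trim"}
tests = {
    "ParserTest.testParse": {"Parser.parse", "Lexer.next"},
    "UtilTest.testTrim": {"Util.trim"},
}
print(untested_defective_ratio(defective, tests))  # one of three defective methods is untested

The same computation over non-defective methods gives the comparison the abstract describes.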

RIS

TY - GEN

T1 - How Effectively Is Defective Code Actually Tested?

T2 - An Analysis of JUnit Tests in Seven Open Source Systems

AU - Petric, Jean

AU - Hall, Tracy

AU - Bowes, David

N1 - © 2018 Copyright held by the owner/author(s). Publication rights licensed to Association for Computing Machinery. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in PROMISE'18: Proceedings of the 14th International Conference on Predictive Models and Data Analytics in Software Engineering, http://dx.doi.org/10.1145/3273934.3273939

PY - 2018/10/10

Y1 - 2018/10/10

N2 - Background: Newspaper headlines still regularly report latent software defects. Such defects have often evaded testing for many years. It remains difficult to identify how well a system has been tested. It also remains difficult to assess how successful particular tests are at finding defects. Coverage and mutation testing are frequently used to assess test effectiveness. We look more deeply at the performance of commonly used JUnit testing by assessing how much JUnit testing was done and how effective that testing was at detecting defects in seven open source systems. Aim: We aim to identify whether defective code has been as effectively tested by JUnit tests as non-defective code. We also aim to identify the characteristics of JUnit tests that are related to identifying defects. Methodology: We first extract the defects from seven open source projects using the SZZ algorithm. We match those defects with JUnit tests to identify the proportion of defects that were covered by JUnit tests. We also do the same for non-defective code. We then use Principal Component Analysis and machine learning to investigate the characteristics of JUnit tests that were successful in identifying defects. Results: Our findings suggest that most of the open source systems we investigated are under-tested. On average, over 66% of defective methods were not linked to any JUnit tests. We show that the number of methods touched by a JUnit test is strongly related to that test uncovering a defect. Conclusion: More JUnit tests need to be produced for the seven open source systems that we investigate. JUnit tests need to be relatively sophisticated; in particular, they should touch more than just one method during the test.

AB - Background: Newspaper headlines still regularly report latent software defects. Such defects have often evaded testing for many years. It remains difficult to identify how well a system has been tested. It also remains difficult to assess how successful particular tests are at finding defects. Coverage and mutation testing are frequently used to assess test effectiveness. We look more deeply at the performance of commonly used JUnit testing by assessing how much JUnit testing was done and how effective that testing was at detecting defects in seven open source systems. Aim: We aim to identify whether defective code has been as effectively tested by JUnit tests as non-defective code. We also aim to identify the characteristics of JUnit tests that are related to identifying defects. Methodology: We first extract the defects from seven open source projects using the SZZ algorithm. We match those defects with JUnit tests to identify the proportion of defects that were covered by JUnit tests. We also do the same for non-defective code. We then use Principal Component Analysis and machine learning to investigate the characteristics of JUnit tests that were successful in identifying defects. Results: Our findings suggest that most of the open source systems we investigated are under-tested. On average, over 66% of defective methods were not linked to any JUnit tests. We show that the number of methods touched by a JUnit test is strongly related to that test uncovering a defect. Conclusion: More JUnit tests need to be produced for the seven open source systems that we investigate. JUnit tests need to be relatively sophisticated; in particular, they should touch more than just one method during the test.

KW - JUnit tests

KW - Software testing

KW - test effectiveness

U2 - 10.1145/3273934.3273939

DO - 10.1145/3273934.3273939

M3 - Conference contribution/Paper

SN - 9781450365932

T3 - PROMISE'18

SP - 42

EP - 51

BT - Proceedings of the 14th International Conference on Predictive Models and Data Analytics in Software Engineering

PB - ACM

CY - New York, NY, USA

ER -
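
The abstract also describes applying Principal Component Analysis and machine learning to the characteristics of JUnit tests that uncovered defects. The sketch below shows that general kind of analysis, assuming scikit-learn; the features (methods touched, assert count, test size) and the synthetic data are illustrative placeholders, not the paper's actual measurements or pipeline.

# A minimal sketch, assuming scikit-learn, of relating JUnit test characteristics
# to whether a test uncovers a defect, via PCA followed by a classifier.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
# Columns are made-up test characteristics: methods touched, assert count, test LOC.
X = rng.integers(1, 20, size=(200, 3)).astype(float)
# Synthetic label loosely tied to "methods touched", echoing the reported finding
# that tests touching more methods are more likely to expose a defect.
y = (X[:, 0] + rng.normal(0, 3, size=200) > 10).astype(int)

model = make_pipeline(PCA(n_components=2), RandomForestClassifier(random_state=0))
scores = cross_val_score(model, X, y, cv=5)
print("mean cross-validated accuracy:", scores.mean())

On real data, the PCA loadings and the classifier's behaviour would indicate which test characteristics carry the signal; here the setup only illustrates the shape of such an analysis.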