Research output: Contribution to Journal/Magazine › Journal article › peer-review
TY - JOUR
T1 - Coverage-Guided Testing for Recurrent Neural Networks
AU - Huang, Wei
AU - Sun, Youcheng
AU - Zhao, Xingyu
AU - Sharp, James
AU - Ruan, Wenjie
AU - Meng, Jie
AU - Huang, Xiaowei
PY - 2022/9/30
Y1 - 2022/9/30
N2 - Recurrent neural networks (RNNs) have been applied to a broad range of applications, including natural language processing, drug discovery, and video recognition. Their vulnerability to input perturbations is also well known. Aligning with a view from software defect detection, this article aims to develop a coverage-guided testing approach to systematically exploit the internal behavior of RNNs, with the expectation that such testing can detect defects with high probability. Technically, the long short-term memory network (LSTM), a major class of RNNs, is thoroughly studied. A family of three test metrics is designed to quantify not only the values but also the temporal relations (including both stepwise and bounded-length) exhibited when the LSTM processes inputs. A genetic algorithm is applied to efficiently generate test cases. The test metrics and test case generation algorithm are implemented in a tool, testRNN, which is then evaluated on a set of LSTM benchmarks. Experiments confirm that testRNN has advantages over the state-of-the-art tool DeepStellar and attack-based defect detection methods, owing to its finer-grained temporal semantics and its consideration of the naturalness of input perturbations. Furthermore, testRNN enables meaningful information to be collected and presented to help users understand the testing results, which is an important step toward interpretable neural network testing.
AB - Recurrent neural networks (RNNs) have been applied to a broad range of applications, including natural language processing, drug discovery, and video recognition. Their vulnerability to input perturbations is also well known. Aligning with a view from software defect detection, this article aims to develop a coverage-guided testing approach to systematically exploit the internal behavior of RNNs, with the expectation that such testing can detect defects with high probability. Technically, the long short-term memory network (LSTM), a major class of RNNs, is thoroughly studied. A family of three test metrics is designed to quantify not only the values but also the temporal relations (including both stepwise and bounded-length) exhibited when the LSTM processes inputs. A genetic algorithm is applied to efficiently generate test cases. The test metrics and test case generation algorithm are implemented in a tool, testRNN, which is then evaluated on a set of LSTM benchmarks. Experiments confirm that testRNN has advantages over the state-of-the-art tool DeepStellar and attack-based defect detection methods, owing to its finer-grained temporal semantics and its consideration of the naturalness of input perturbations. Furthermore, testRNN enables meaningful information to be collected and presented to help users understand the testing results, which is an important step toward interpretable neural network testing.
KW - Electrical and Electronic Engineering
KW - Safety, Risk, Reliability and Quality
U2 - 10.1109/tr.2021.3080664
DO - 10.1109/tr.2021.3080664
M3 - Journal article
VL - 71
SP - 1191
EP - 1206
JO - IEEE Transactions on Reliability
JF - IEEE Transactions on Reliability
SN - 0018-9529
IS - 3
M1 - 3
ER -