Using domain-specific corpora for improved handling of ambiguity in requirements

Text available via DOI: https://doi.org/10.1109/ICSE43902.2021.00133 (final published version)

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published

Standard

Using domain-specific corpora for improved handling of ambiguity in requirements. / Ezzini, Saad; Abualhaija, Sallam; Arora, Chetan et al.
2021 IEEE/ACM 43rd International Conference on Software Engineering, ICSE 2021. IEEE, 2021. p. 1485-1497 (Proceedings - International Conference on Software Engineering).

Harvard

Ezzini, S, Abualhaija, S, Arora, C, Sabetzadeh, M & Briand, LC 2021, Using domain-specific corpora for improved handling of ambiguity in requirements. in 2021 IEEE/ACM 43rd International Conference on Software Engineering, ICSE 2021. Proceedings - International Conference on Software Engineering, IEEE, pp. 1485-1497. https://doi.org/10.1109/ICSE43902.2021.00133

APA

Ezzini, S., Abualhaija, S., Arora, C., Sabetzadeh, M., & Briand, L. C. (2021). Using domain-specific corpora for improved handling of ambiguity in requirements. In 2021 IEEE/ACM 43rd International Conference on Software Engineering, ICSE 2021 (pp. 1485-1497). (Proceedings - International Conference on Software Engineering). IEEE. https://doi.org/10.1109/ICSE43902.2021.00133

Vancouver

Ezzini S, Abualhaija S, Arora C, Sabetzadeh M, Briand LC. Using domain-specific corpora for improved handling of ambiguity in requirements. In 2021 IEEE/ACM 43rd International Conference on Software Engineering, ICSE 2021. IEEE. 2021. p. 1485-1497. (Proceedings - International Conference on Software Engineering). doi: 10.1109/ICSE43902.2021.00133

Author

Ezzini, Saad ; Abualhaija, Sallam ; Arora, Chetan et al. / Using domain-specific corpora for improved handling of ambiguity in requirements. 2021 IEEE/ACM 43rd International Conference on Software Engineering, ICSE 2021. IEEE, 2021. pp. 1485-1497 (Proceedings - International Conference on Software Engineering).

Bibtex

@inproceedings{3355a533a0174cd89d75c289b6f9b8c9,
  title = "Using domain-specific corpora for improved handling of ambiguity in requirements",
  abstract = "Ambiguity in natural-language requirements is a pervasive issue that has been studied by the requirements engineering community for more than two decades. A fully manual approach for addressing ambiguity in requirements is tedious and time-consuming, and may further overlook unacknowledged ambiguity - the situation where different stakeholders perceive a requirement as unambiguous but, in reality, interpret the requirement differently. In this paper, we propose an automated approach that uses natural language processing for handling ambiguity in requirements. Our approach is based on the automatic generation of a domain-specific corpus from Wikipedia. Integrating domain knowledge, as we show in our evaluation, leads to a significant positive improvement in the accuracy of ambiguity detection and interpretation. We scope our work to coordination ambiguity (CA) and prepositional-phrase attachment ambiguity (PAA) because of the prevalence of these types of ambiguity in natural-language requirements [1]. We evaluate our approach on 20 industrial requirements documents. These documents collectively contain more than 5000 requirements from seven distinct application domains. Over this dataset, our approach detects CA and PAA with an average precision of 80% and an average recall of 89% (90% for cases of unacknowledged ambiguity). The automatic interpretations that our approach yields have an average accuracy of 85%. Compared to baselines that use generic corpora, our approach, which uses domain-specific corpora, has 33% better accuracy in ambiguity detection and 16% better accuracy in interpretation.",
  author = "Saad Ezzini and Sallam Abualhaija and Chetan Arora and Mehrdad Sabetzadeh and Briand, {Lionel C}",
  year = "2021",
  month = may,
  day = "7",
  doi = "10.1109/ICSE43902.2021.00133",
  language = "English",
  isbn = "9781665402965",
  series = "Proceedings - International Conference on Software Engineering",
  publisher = "IEEE",
  pages = "1485--1497",
  booktitle = "2021 IEEE/ACM 43rd International Conference on Software Engineering, ICSE 2021",
}

RIS

TY - GEN
T1 - Using domain-specific corpora for improved handling of ambiguity in requirements
AU - Ezzini, Saad
AU - Abualhaija, Sallam
AU - Arora, Chetan
AU - Sabetzadeh, Mehrdad
AU - Briand, Lionel C
PY - 2021/5/7
Y1 - 2021/5/7

AB - Ambiguity in natural-language requirements is a pervasive issue that has been studied by the requirements engineering community for more than two decades. A fully manual approach for addressing ambiguity in requirements is tedious and time-consuming, and may further overlook unacknowledged ambiguity - the situation where different stakeholders perceive a requirement as unambiguous but, in reality, interpret the requirement differently. In this paper, we propose an automated approach that uses natural language processing for handling ambiguity in requirements. Our approach is based on the automatic generation of a domain-specific corpus from Wikipedia. Integrating domain knowledge, as we show in our evaluation, leads to a significant positive improvement in the accuracy of ambiguity detection and interpretation. We scope our work to coordination ambiguity (CA) and prepositional-phrase attachment ambiguity (PAA) because of the prevalence of these types of ambiguity in natural-language requirements [1]. We evaluate our approach on 20 industrial requirements documents. These documents collectively contain more than 5000 requirements from seven distinct application domains. Over this dataset, our approach detects CA and PAA with an average precision of 80% and an average recall of 89% (90% for cases of unacknowledged ambiguity). The automatic interpretations that our approach yields have an average accuracy of 85%. Compared to baselines that use generic corpora, our approach, which uses domain-specific corpora, has 33% better accuracy in ambiguity detection and 16% better accuracy in interpretation.

U2 - 10.1109/ICSE43902.2021.00133
DO - 10.1109/ICSE43902.2021.00133
M3 - Conference contribution/Paper
SN - 9781665402965
T3 - Proceedings - International Conference on Software Engineering
SP - 1485
EP - 1497
BT - 2021 IEEE/ACM 43rd International Conference on Software Engineering, ICSE 2021
PB - IEEE
ER -
