WikiDoMiner: Wikipedia domain-specific miner

Computing and Communications

Associated organisational units

Text available via DOI:

https://doi.org/10.1145/3540250.3558916
Final published version
Available under license: CC BY-NC: Creative Commons Attribution-NonCommercial 4.0 International License

Keywords

Domain-specific Corpus Generation, Natural Language Processing, Natural-language Requirements, Requirements Engineering, Wikipedia

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published

Saad Ezzini
Sallam Abualhaija
Mehrdad Sabetzadeh

Publication date	9/11/2022
Host publication	ESEC/FSE 2022 - Proceedings of the 30th ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering
Editors	Abhik Roychoudhury, Cristian Cadar, Miryung Kim
Publisher	Association for Computing Machinery (ACM)
Pages	1706-1710
Number of pages	5
ISBN (electronic)	9781450394130
Original language	English

Publication series

Name	ESEC/FSE 2022 - Proceedings of the 30th ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering

Abstract

We introduce WikiDoMiner, a tool for automatically generating domain-specific corpora by crawling Wikipedia. WikiDoMiner helps requirements engineers create an external knowledge resource that is specific to the underlying domain of a given requirements specification (RS). Being able to build such a resource is important since domain-specific datasets are scarce. WikiDoMiner generates a corpus by first extracting a set of domain-specific keywords from a given RS, and then querying Wikipedia for these keywords. The output of WikiDoMiner is a set of Wikipedia articles relevant to the domain of the input RS. Mining Wikipedia for domain-specific knowledge can be beneficial for multiple requirements engineering tasks, e.g., ambiguity handling, requirements classification, and question answering. WikiDoMiner is publicly available on Zenodo under an open-source license (https://doi.org/10.5281/zenodo.6672682).
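
The two-step pipeline described in the abstract (extract keywords from an RS, then query Wikipedia for them) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not WikiDoMiner's actual implementation: the frequency-based keyword ranking, the small stopword list, and the function names extract_keywords and wikipedia_search_url are all hypothetical. The URL targets the public MediaWiki search API; the sketch only builds the query and sends no network request.

```python
import re
from collections import Counter
from urllib.parse import urlencode

# Tiny illustrative stopword list (an assumption, not the tool's own list).
STOPWORDS = {
    "the", "a", "an", "of", "to", "and", "or", "in", "for", "with",
    "by", "on", "that", "is", "be", "shall", "each", "every", "all",
}


def extract_keywords(rs_text, top_k=5):
    """Rank lowercase word tokens by raw frequency, skipping stopwords.

    A stand-in for domain-specific keyword extraction; the real tool
    may use a different technique.
    """
    tokens = re.findall(r"[a-z][a-z-]+", rs_text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_k)]


def wikipedia_search_url(keyword, limit=5):
    """Build a MediaWiki API full-text search URL for one keyword."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": keyword,
        "srlimit": limit,
        "format": "json",
    }
    return "https://en.wikipedia.org/w/api.php?" + urlencode(params)


if __name__ == "__main__":
    rs = (
        "The satellite shall transmit telemetry. "
        "The satellite shall log telemetry to the ground station. "
        "The satellite shall reboot."
    )
    for kw in extract_keywords(rs, top_k=2):
        print(kw, wikipedia_search_url(kw))
```

Fetching each URL and collecting the returned article titles would yield a candidate corpus for the RS domain; how the tool filters or expands those results is not specified here.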
