Home > Research > Publications & Outputs > WikiDoMiner: wikipedia domain-specific miner

Links

Text available via DOI:

View graph of relations

WikiDoMiner: wikipedia domain-specific miner

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSNConference contribution/Paperpeer-review

Published
Close
Publication date9/11/2022
Host publicationESEC/FSE 2022 - Proceedings of the 30th ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering
EditorsAbhik Roychoudhury, Cristian Cadar, Miryung Kim
PublisherAssociation for Computing Machinery (ACM)
Pages1706-1710
Number of pages5
ISBN (electronic)9781450394130
<mark>Original language</mark>English

Publication series

NameESEC/FSE 2022 - Proceedings of the 30th ACM Joint Meeting European Software Engineering Conference and Symposium on the Foundations of Software Engineering

Abstract

We introduce WikiDoMiner - a tool for automatically generating domain-specific corpora by crawling Wikipedia. WikiDoMiner helps requirements engineers create an external knowledge resource that is specific to the underlying domain of a given requirements specification (RS). Being able to build such a resource is important since domain-specific datasets are scarce. WikiDoMiner generates a corpus by first extracting a set of domain-specific keywords from a given RS, and then querying Wikipedia for these keywords. The output of WikiDoMiner is a set of Wikipedia articles relevant to the domain of the input RS. Mining Wikipedia for domain-specific knowledge can be beneficial for multiple requirements engineering tasks, e.g., ambiguity handling, requirements classification, and question answering. WikiDoMiner is publicly available on Zenodo under an open-source license (https: //doi.org/10.5281/zenodo.6672682)