Research output: Contribution to conference - Without ISBN/ISSN › Conference paper › peer-review
TY - CONF
T1 - Towards a Welsh semantic tagger
T2 - The Corpus Linguistics Conference 2017
AU - Piao, Scott Songlin
AU - Rayson, Paul Edward
AU - Knight, Dawn
AU - Watkins, Gareth
AU - Donnelly, Kevin
PY - 2017/7/24
Y1 - 2017/7/24
N2 - Semantic annotation is an important part of corpus linguistics. A major tool for semantic tagging is USAS, developed at Lancaster University, which was originally designed for English but has been extended to cover many more languages. In the CorCenCC Project (http://sites.cardiff.ac.uk/corcencc), we are extending USAS to automatically annotate Welsh language data with the USAS semantic tagset. In this paper, we report on the development of Welsh semantic lexicons for the semantic tagger: we have built a Welsh semantic lexicon containing 143,290 entries, which achieved a lexical coverage of 72.42% in an initial evaluation. An initial version of the Welsh semantic tagger has already been developed based on this lexical resource.
AB - Semantic annotation is an important part of corpus linguistics. A major tool for semantic tagging is USAS, developed at Lancaster University, which was originally designed for English but has been extended to cover many more languages. In the CorCenCC Project (http://sites.cardiff.ac.uk/corcencc), we are extending USAS to automatically annotate Welsh language data with the USAS semantic tagset. In this paper, we report on the development of Welsh semantic lexicons for the semantic tagger: we have built a Welsh semantic lexicon containing 143,290 entries, which achieved a lexical coverage of 72.42% in an initial evaluation. An initial version of the Welsh semantic tagger has already been developed based on this lexical resource.
KW - Semantic Tagger
KW - Welsh Semantic Lexicon
KW - Corpus Linguistics
KW - Natural Language Processing
KW - semantic annotation
M3 - Conference paper
Y2 - 24 July 2017 through 28 July 2017
ER -