Electronic data

  • cl2017-welsh-semtagger

    Accepted author manuscript, 111 KB, PDF document

Towards a Welsh semantic tagger: creating lexicons for a resource poor language

Research output: Contribution to conference - Without ISBN/ISSN, Conference paper, peer-reviewed

Published

Standard

Towards a Welsh semantic tagger: creating lexicons for a resource poor language. / Piao, Scott Songlin; Rayson, Paul Edward; Knight, Dawn et al.
2017. Paper presented at The Corpus Linguistics Conference 2017, Birmingham, United Kingdom.

Harvard

Piao, SS, Rayson, PE, Knight, D, Watkins, G & Donnelly, K 2017, 'Towards a Welsh semantic tagger: creating lexicons for a resource poor language', Paper presented at The Corpus Linguistics Conference 2017, Birmingham, United Kingdom, 24/07/17 - 28/07/17.

APA

Piao, S. S., Rayson, P. E., Knight, D., Watkins, G., & Donnelly, K. (2017). Towards a Welsh semantic tagger: creating lexicons for a resource poor language. Paper presented at The Corpus Linguistics Conference 2017, Birmingham, United Kingdom.

Vancouver

Piao SS, Rayson PE, Knight D, Watkins G, Donnelly K. Towards a Welsh semantic tagger: creating lexicons for a resource poor language. 2017. Paper presented at The Corpus Linguistics Conference 2017, Birmingham, United Kingdom.

Author

Piao, Scott Songlin; Rayson, Paul Edward; Knight, Dawn et al. / Towards a Welsh semantic tagger: creating lexicons for a resource poor language. Paper presented at The Corpus Linguistics Conference 2017, Birmingham, United Kingdom. 4 p.

Bibtex

@conference{bafc6ab18dbc4e17b5e0fea350f9f330,
title = "Towards a Welsh semantic tagger: creating lexicons for a resource poor language",
abstract = "Semantic annotation is an important part of corpus linguistics. A major tool for semantic tagging is USAS, developed at Lancaster University, which was originally designed for English but has been extended to cover many more languages. In the CorCenCC Project (http://sites.cardiff.ac.uk/corcencc), we are extending USAS to automatically annotate Welsh language data with the USAS semantic tagset. In this paper, we report on the development of Welsh semantic lexicons for the semantic tagger: we have built a Welsh semantic lexicon containing 143,290 entries, which achieved a lexical coverage of 72.42% in an initial evaluation. An initial version of the Welsh semantic tagger has already been developed based on this lexical resource.",
keywords = "Semantic Tagger, Welsh Semantic Lexicon, Corpus Linguistics, Natural Language Processing, semantic annotation",
author = "Piao, {Scott Songlin} and Rayson, {Paul Edward} and Dawn Knight and Gareth Watkins and Kevin Donnelly",
year = "2017",
month = jul,
day = "24",
language = "English",
note = "The Corpus Linguistics Conference 2017, CL2017 ; Conference date: 24-07-2017 Through 28-07-2017",
url = "http://www.birmingham.ac.uk/research/activity/corpus/events/2017/cl2017/index.aspx",

}

RIS

TY - CONF

T1 - Towards a Welsh semantic tagger

T2 - The Corpus Linguistics Conference 2017

AU - Piao, Scott Songlin

AU - Rayson, Paul Edward

AU - Knight, Dawn

AU - Watkins, Gareth

AU - Donnelly, Kevin

PY - 2017/7/24

Y1 - 2017/7/24

AB - Semantic annotation is an important part of corpus linguistics. A major tool for semantic tagging is USAS, developed at Lancaster University, which was originally designed for English but has been extended to cover many more languages. In the CorCenCC Project (http://sites.cardiff.ac.uk/corcencc), we are extending USAS to automatically annotate Welsh language data with the USAS semantic tagset. In this paper, we report on the development of Welsh semantic lexicons for the semantic tagger: we have built a Welsh semantic lexicon containing 143,290 entries, which achieved a lexical coverage of 72.42% in an initial evaluation. An initial version of the Welsh semantic tagger has already been developed based on this lexical resource.

KW - Semantic Tagger

KW - Welsh Semantic Lexicon

KW - Corpus Linguistics

KW - Natural Language Processing

KW - semantic annotation

M3 - Conference paper

Y2 - 24 July 2017 through 28 July 2017

ER -