Electronic data

  • lrec2018-cysemtagger

    Accepted author manuscript, 126 KB, PDF document

    Available under license: CC BY-NC: Creative Commons Attribution-NonCommercial 4.0 International License

  • welsh-sem-tagger-lrec2018-proc

    Rights statement: The LREC 2018 Proceedings are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License

    Final published version, 138 KB, PDF document

    Available under license: CC BY-NC: Creative Commons Attribution-NonCommercial 4.0 International License

Towards A Welsh Semantic Annotation System

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published

Standard

Towards A Welsh Semantic Annotation System. / Piao, Scott Songlin; Rayson, Paul Edward; Knight, Dawn et al.
LREC 2018, Eleventh International Conference on Language Resources and Evaluation. ed. / Nicoletta Calzolari; Khalid Choukri; Christopher Cieri; Thierry Declerck; Sara Goggi; Koiti Hasida; Hitoshi Isahara; Bente Maegaard; Joseph Mariani; Helene Mazo; Asuncion Moreno; Jan Odijk; Stelios Piperidis; Takenobu Tokunaga. European Language Resources Association (ELRA), 2018. p. 980-985.

Harvard

Piao, SS, Rayson, PE, Knight, D & Watkins, G 2018, Towards A Welsh Semantic Annotation System. in N Calzolari, K Choukri, C Cieri, T Declerck, S Goggi, K Hasida, H Isahara, B Maegaard, J Mariani, H Mazo, A Moreno, J Odijk, S Piperidis & T Tokunaga (eds), LREC 2018, Eleventh International Conference on Language Resources and Evaluation. European Language Resources Association (ELRA), pp. 980-985, The 11th Edition of the Language Resources and Evaluation Conference, Miyazaki, Japan, 7/05/18. <http://www.lrec-conf.org/proceedings/lrec2018/summaries/458.html>

APA

Piao, S. S., Rayson, P. E., Knight, D., & Watkins, G. (2018). Towards A Welsh Semantic Annotation System. In N. Calzolari, K. Choukri, C. Cieri, T. Declerck, S. Goggi, K. Hasida, H. Isahara, B. Maegaard, J. Mariani, H. Mazo, A. Moreno, J. Odijk, S. Piperidis, & T. Tokunaga (Eds.), LREC 2018, Eleventh International Conference on Language Resources and Evaluation (pp. 980-985). European Language Resources Association (ELRA). http://www.lrec-conf.org/proceedings/lrec2018/summaries/458.html

Vancouver

Piao SS, Rayson PE, Knight D, Watkins G. Towards A Welsh Semantic Annotation System. In Calzolari N, Choukri K, Cieri C, Declerck T, Goggi S, Hasida K, Isahara H, Maegaard B, Mariani J, Mazo H, Moreno A, Odijk J, Piperidis S, Tokunaga T, editors, LREC 2018, Eleventh International Conference on Language Resources and Evaluation. European Language Resources Association (ELRA). 2018. p. 980-985

Author

Piao, Scott Songlin ; Rayson, Paul Edward ; Knight, Dawn et al. / Towards A Welsh Semantic Annotation System. LREC 2018, Eleventh International Conference on Language Resources and Evaluation. editor / Nicoletta Calzolari ; Khalid Choukri ; Christopher Cieri ; Thierry Declerck ; Sara Goggi ; Koiti Hasida ; Hitoshi Isahara ; Bente Maegaard ; Joseph Mariani ; Helene Mazo ; Asuncion Moreno ; Jan Odijk ; Stelios Piperidis ; Takenobu Tokunaga. European Language Resources Association (ELRA), 2018. pp. 980-985

Bibtex

@inproceedings{a0b2b63d791543fd81bb1b852e1712b4,
title = "Towards A Welsh Semantic Annotation System",
abstract = "Automatic semantic annotation of natural language data is an important task in Natural Language Processing, and a variety of semantic taggers have been developed for this task, particularly for English. However, for many languages, particularly for low-resource languages, such tools are yet to be developed. In this paper, we report on the development of an automatic Welsh semantic annotation tool (named CySemTagger) in the CorCenCC Project, which will facilitate semantic-level analysis of Welsh language data on a large scale. Based on Lancaster{\textquoteright}s USAS semantic tagger framework, this tool tags words in Welsh texts with semantic tags from a semantic classification scheme, and is designed to be compatible with multiple Welsh POS taggers and POS tagsets by mapping different tagsets into a core shared POS tagset that is used internally by CySemTagger. Our initial evaluation shows that the tagger can cover up to 91.78% of words in Welsh text. This tagger is under continuous development, and will provide a critical tool for Welsh language corpus and information processing at semantic level.",
keywords = "Welsh semantic tagger, corpus annotation tool, Welsh language, CorCenCC",
author = "Piao, {Scott Songlin} and Rayson, {Paul Edward} and Dawn Knight and Gareth Watkins",
year = "2018",
month = may,
day = "9",
language = "English",
isbn = "9791095546009",
pages = "980--985",
editor = "Nicoletta Calzolari and Khalid Choukri and Christopher Cieri and Thierry Declerck and Sara Goggi and Koiti Hasida and Hitoshi Isahara and Bente Maegaard and Joseph Mariani and Helene Mazo and Asuncion Moreno and Jan Odijk and Stelios Piperidis and Takenobu Tokunaga",
booktitle = "LREC 2018, Eleventh International Conference on Language Resources and Evaluation",
publisher = "European Language Resources Association (ELRA)",
note = "The 11th Edition of the Language Resources and Evaluation Conference, LREC2018 ; Conference date: 07-05-2018 Through 12-05-2018",
url = "http://lrec2018.lrec-conf.org/en/",

}

RIS

TY - GEN

T1 - Towards A Welsh Semantic Annotation System

AU - Piao, Scott Songlin

AU - Rayson, Paul Edward

AU - Knight, Dawn

AU - Watkins, Gareth

PY - 2018/5/9

Y1 - 2018/5/9

N2 - Automatic semantic annotation of natural language data is an important task in Natural Language Processing, and a variety of semantic taggers have been developed for this task, particularly for English. However, for many languages, particularly for low-resource languages, such tools are yet to be developed. In this paper, we report on the development of an automatic Welsh semantic annotation tool (named CySemTagger) in the CorCenCC Project, which will facilitate semantic-level analysis of Welsh language data on a large scale. Based on Lancaster’s USAS semantic tagger framework, this tool tags words in Welsh texts with semantic tags from a semantic classification scheme, and is designed to be compatible with multiple Welsh POS taggers and POS tagsets by mapping different tagsets into a core shared POS tagset that is used internally by CySemTagger. Our initial evaluation shows that the tagger can cover up to 91.78% of words in Welsh text. This tagger is under continuous development, and will provide a critical tool for Welsh language corpus and information processing at semantic level.

AB - Automatic semantic annotation of natural language data is an important task in Natural Language Processing, and a variety of semantic taggers have been developed for this task, particularly for English. However, for many languages, particularly for low-resource languages, such tools are yet to be developed. In this paper, we report on the development of an automatic Welsh semantic annotation tool (named CySemTagger) in the CorCenCC Project, which will facilitate semantic-level analysis of Welsh language data on a large scale. Based on Lancaster’s USAS semantic tagger framework, this tool tags words in Welsh texts with semantic tags from a semantic classification scheme, and is designed to be compatible with multiple Welsh POS taggers and POS tagsets by mapping different tagsets into a core shared POS tagset that is used internally by CySemTagger. Our initial evaluation shows that the tagger can cover up to 91.78% of words in Welsh text. This tagger is under continuous development, and will provide a critical tool for Welsh language corpus and information processing at semantic level.

KW - Welsh semantic tagger

KW - corpus annotation tool

KW - Welsh language

KW - CorCenCC

M3 - Conference contribution/Paper

SN - 9791095546009

SP - 980

EP - 985

BT - LREC 2018, Eleventh International Conference on Language Resources and Evaluation

A2 - Calzolari, Nicoletta

A2 - Choukri, Khalid

A2 - Cieri, Christopher

A2 - Declerck, Thierry

A2 - Goggi, Sara

A2 - Hasida, Koiti

A2 - Isahara, Hitoshi

A2 - Maegaard, Bente

A2 - Mariani, Joseph

A2 - Mazo, Helene

A2 - Moreno, Asuncion

A2 - Odijk, Jan

A2 - Piperidis, Stelios

A2 - Tokunaga, Takenobu

PB - European Language Resources Association (ELRA)

T2 - The 11th Edition of the Language Resources and Evaluation Conference

Y2 - 7 May 2018 through 12 May 2018

ER -