

Leveraging Pre-Trained Embeddings for Welsh Taggers

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Status: Published
Publication date: 2/08/2019
Host publication: Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019): Held at the 57th Annual Meeting of the Association for Computational Linguistics
Place of publication: Florence, Italy
Publisher: Association for Computational Linguistics
Pages: 270-280
Number of pages: 11
Original language: English

Abstract

While the application of word embedding models to downstream Natural Language Processing (NLP) tasks has been shown to be successful, the benefits for low-resource languages are somewhat limited due to the lack of adequate data for training the models. However, NLP research for low-resource languages has increasingly focused on harnessing pre-trained models to improve the performance of NLP systems for these languages without re-inventing the wheel. Welsh is one such language; in this paper, we therefore present the results of our experiments on learning a simple multi-task neural network model for part-of-speech and semantic tagging of Welsh using a pre-trained embedding model from FastText. We compared our model's performance with that of the existing rule-based stand-alone part-of-speech and semantic taggers. Despite its simplicity and its capacity to perform both tasks simultaneously, our tagger compared favourably with the existing taggers.
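The abstract describes a single network that shares representations across the two tagging tasks while keeping the pre-trained FastText embeddings as input features. A minimal NumPy sketch of that multi-task idea follows; all dimensions, vocabulary entries, and weights here are invented for illustration, and the paper's actual architecture and hyperparameters may differ:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (not taken from the paper); FastText vectors are 300-d.
EMB_DIM, HID_DIM, N_POS, N_SEM = 300, 64, 15, 21

# Stand-in for pre-trained FastText embeddings, kept frozen here.
vocab = {"mae": 0, "y": 1, "gath": 2}  # hypothetical Welsh tokens
embeddings = rng.normal(size=(len(vocab), EMB_DIM))

# One shared layer feeds two task-specific softmax heads.
W_shared = rng.normal(scale=0.1, size=(EMB_DIM, HID_DIM))
W_pos = rng.normal(scale=0.1, size=(HID_DIM, N_POS))
W_sem = rng.normal(scale=0.1, size=(HID_DIM, N_SEM))

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def tag(tokens):
    """Return per-token (POS, semantic) tag distributions from one forward pass."""
    x = embeddings[[vocab[t] for t in tokens]]  # look up frozen vectors
    h = np.tanh(x @ W_shared)                   # shared representation
    return softmax(h @ W_pos), softmax(h @ W_sem)

pos_probs, sem_probs = tag(["mae", "y", "gath"])
```

A single forward pass through the shared layer yields both tag distributions at once, which is the practical appeal of multi-task tagging over running two stand-alone taggers.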