Final published version
Licence: CC BY: Creative Commons Attribution 4.0 International License
Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review
Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review
}
TY - GEN
T1 - Leveraging Pre-Trained Embeddings for Welsh Taggers
AU - Ezeani, Ignatius
AU - Piao, Scott
AU - Neale, Steven
AU - Rayson, Paul
AU - Knight, Dawn
PY - 2019/8/2
Y1 - 2019/8/2
N2 - While the application of word embedding models to downstream Natural Language Processing (NLP) tasks has been shown to be successful, the benefits for low-resource languages is somewhat limited due to lack of adequate data for training the models. However, NLP research efforts for low-resource languages have focused on constantly seeking ways to harness pre-trained models to improve the performance of NLP systems built to process these languages without the need to re-invent the wheel. One such language is Welsh and therefore, in this paper, we present the results of our experiments on learning a simple multi-task neural network model for part-of-speech and semantic tagging for Welsh using a pre-trained embedding model from FastText. Our model’s performance was compared with those of the existing rule-based stand-alone taggers for part-of-speech and semantic taggers. Despite its simplicity and capacity to perform both tasks simultaneously, our tagger compared very well with the existing taggers.
AB - While the application of word embedding models to downstream Natural Language Processing (NLP) tasks has been shown to be successful, the benefits for low-resource languages is somewhat limited due to lack of adequate data for training the models. However, NLP research efforts for low-resource languages have focused on constantly seeking ways to harness pre-trained models to improve the performance of NLP systems built to process these languages without the need to re-invent the wheel. One such language is Welsh and therefore, in this paper, we present the results of our experiments on learning a simple multi-task neural network model for part-of-speech and semantic tagging for Welsh using a pre-trained embedding model from FastText. Our model’s performance was compared with those of the existing rule-based stand-alone taggers for part-of-speech and semantic taggers. Despite its simplicity and capacity to perform both tasks simultaneously, our tagger compared very well with the existing taggers.
M3 - Conference contribution/Paper
SP - 270
EP - 280
BT - Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019)
PB - Association for Computational Linguistics
CY - Florence, Italy
ER -