Multi-task projected embedding for Igbo

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN > Conference contribution/Paper > peer-review

Published
Publication date: 8/09/2018
Host publication: Text, Speech, and Dialogue - 21st International Conference, TSD 2018, Proceedings
Editors: Petr Sojka, Aleš Horák, Ivan Kopecek, Karel Pala
Place of Publication: Cham
Publisher: Springer-Verlag
Pages: 285-294
Number of pages: 10
ISBN (Electronic): 9783030007942
ISBN (Print): 9783030007935
Original language: English
Event: 21st International Conference on Text, Speech, and Dialogue, TSD 2018 - Brno, Czech Republic
Duration: 11/09/2018 to 14/09/2018

Conference

Conference: 21st International Conference on Text, Speech, and Dialogue, TSD 2018
Country/Territory: Czech Republic
City: Brno
Period: 11/09/18 to 14/09/18

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11107 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Abstract

NLP research on low-resource African languages is often impeded by the unavailability of basic resources: tools, techniques, annotated corpora, and datasets. Besides the lack of funding for the manual development of these resources, building them from scratch would amount to reinventing the wheel. Therefore, adapting existing techniques and models from well-resourced languages is often an attractive option. Word embeddings are among the most widely applied NLP models. Embedding models often require large amounts of training data, which are not available for most African languages. In this work, we adopt an alignment-based projection method to transfer trained English embeddings to the Igbo language. Various English embedding models were projected and evaluated intrinsically on the odd-word, analogy, and word-similarity tasks, and also on the diacritic restoration task. Our results show that the projected embeddings performed very well across these tasks.