Multi-task projected embedding for Igbo

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published

Standard

Multi-task projected embedding for Igbo. / Ezeani, Ignatius; Hepple, Mark; Onyenwe, Ikechukwu et al.
Text, Speech, and Dialogue - 21st International Conference, TSD 2018, Proceedings. ed. / Petr Sojka; Aleš Horák; Ivan Kopecek; Karel Pala. Cham: Springer-Verlag, 2018. p. 285-294 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11107 LNAI).

Harvard

Ezeani, I, Hepple, M, Onyenwe, I & Enemuo, C 2018, Multi-task projected embedding for Igbo. in P Sojka, A Horák, I Kopecek & K Pala (eds), Text, Speech, and Dialogue - 21st International Conference, TSD 2018, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11107 LNAI, Springer-Verlag, Cham, pp. 285-294, 21st International Conference on Text, Speech, and Dialogue, TSD 2018, Brno, Czech Republic, 11/09/18. https://doi.org/10.1007/978-3-030-00794-2_31

APA

Ezeani, I., Hepple, M., Onyenwe, I., & Enemuo, C. (2018). Multi-task projected embedding for Igbo. In P. Sojka, A. Horák, I. Kopecek, & K. Pala (Eds.), Text, Speech, and Dialogue - 21st International Conference, TSD 2018, Proceedings (pp. 285-294). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11107 LNAI). Springer-Verlag. https://doi.org/10.1007/978-3-030-00794-2_31

Vancouver

Ezeani I, Hepple M, Onyenwe I, Enemuo C. Multi-task projected embedding for Igbo. In Sojka P, Horák A, Kopecek I, Pala K, editors, Text, Speech, and Dialogue - 21st International Conference, TSD 2018, Proceedings. Cham: Springer-Verlag. 2018. p. 285-294. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/978-3-030-00794-2_31

Author

Ezeani, Ignatius ; Hepple, Mark ; Onyenwe, Ikechukwu et al. / Multi-task projected embedding for Igbo. Text, Speech, and Dialogue - 21st International Conference, TSD 2018, Proceedings. editor / Petr Sojka ; Aleš Horák ; Ivan Kopecek ; Karel Pala. Cham : Springer-Verlag, 2018. pp. 285-294 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

Bibtex

@inproceedings{109deae5c40f41e8ae477ecb921c7984,
title = "Multi-task projected embedding for {I}gbo",
abstract = "NLP research on low-resource African languages is often impeded by the unavailability of basic resources: tools, techniques, annotated corpora, and datasets. Besides the lack of funding for the manual development of these resources, building from scratch would amount to reinventing the wheel. Therefore, adapting existing techniques and models from well-resourced languages is often an attractive option. One of the most widely applied NLP models is word embeddings. Embedding models often require large amounts of training data, which are not available for most African languages. In this work, we adopt an alignment-based projection method to transfer trained English embeddings to the Igbo language. Various English embedding models were projected and evaluated intrinsically on the odd-word, analogy, and word-similarity tasks, and also on the diacritic restoration task. Our results show that the projected embeddings performed very well across these tasks.",
keywords = "Diacritics, Embedding models, Igbo, Low-resource, Transfer learning",
author = "Ignatius Ezeani and Mark Hepple and Ikechukwu Onyenwe and Chioma Enemuo",
year = "2018",
month = sep,
day = "8",
doi = "10.1007/978-3-030-00794-2_31",
language = "English",
isbn = "9783030007935",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer-Verlag",
pages = "285--294",
editor = "Petr Sojka and Ale{\v s} Hor{\'a}k and Ivan Kopecek and Karel Pala",
booktitle = "Text, Speech, and Dialogue - 21st International Conference, TSD 2018, Proceedings",
note = "21st International Conference on Text, Speech, and Dialogue, TSD 2018 ; Conference date: 11-09-2018 Through 14-09-2018",

}

RIS

TY - GEN

T1 - Multi-task projected embedding for Igbo

AU - Ezeani, Ignatius

AU - Hepple, Mark

AU - Onyenwe, Ikechukwu

AU - Enemuo, Chioma

PY - 2018/9/8

Y1 - 2018/9/8

N2 - NLP research on low-resource African languages is often impeded by the unavailability of basic resources: tools, techniques, annotated corpora, and datasets. Besides the lack of funding for the manual development of these resources, building from scratch would amount to reinventing the wheel. Therefore, adapting existing techniques and models from well-resourced languages is often an attractive option. One of the most widely applied NLP models is word embeddings. Embedding models often require large amounts of training data, which are not available for most African languages. In this work, we adopt an alignment-based projection method to transfer trained English embeddings to the Igbo language. Various English embedding models were projected and evaluated intrinsically on the odd-word, analogy, and word-similarity tasks, and also on the diacritic restoration task. Our results show that the projected embeddings performed very well across these tasks.

AB - NLP research on low-resource African languages is often impeded by the unavailability of basic resources: tools, techniques, annotated corpora, and datasets. Besides the lack of funding for the manual development of these resources, building from scratch would amount to reinventing the wheel. Therefore, adapting existing techniques and models from well-resourced languages is often an attractive option. One of the most widely applied NLP models is word embeddings. Embedding models often require large amounts of training data, which are not available for most African languages. In this work, we adopt an alignment-based projection method to transfer trained English embeddings to the Igbo language. Various English embedding models were projected and evaluated intrinsically on the odd-word, analogy, and word-similarity tasks, and also on the diacritic restoration task. Our results show that the projected embeddings performed very well across these tasks.

KW - Diacritics

KW - Embedding models

KW - Igbo

KW - Low-resource

KW - Transfer learning

U2 - 10.1007/978-3-030-00794-2_31

DO - 10.1007/978-3-030-00794-2_31

M3 - Conference contribution/Paper

AN - SCOPUS:85053906387

SN - 9783030007935

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 285

EP - 294

BT - Text, Speech, and Dialogue - 21st International Conference, TSD 2018, Proceedings

A2 - Sojka, Petr

A2 - Horák, Aleš

A2 - Kopecek, Ivan

A2 - Pala, Karel

PB - Springer-Verlag

CY - Cham

T2 - 21st International Conference on Text, Speech, and Dialogue, TSD 2018

Y2 - 11 September 2018 through 14 September 2018

ER -
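
The abstract describes an alignment-based projection of trained English embeddings into Igbo space. A minimal sketch of one standard method of this kind — orthogonal Procrustes alignment over a bilingual seed dictionary — is shown below. This is an illustrative assumption, not the authors' implementation: the vectors, dimensions, and seed pairs here are synthetic stand-ins for real English and Igbo embeddings.

```python
# Sketch of alignment-based embedding projection via orthogonal Procrustes.
# Assumption: toy synthetic data, not the paper's actual embeddings or code.
import numpy as np

def learn_projection(src: np.ndarray, tgt: np.ndarray) -> np.ndarray:
    """Find orthogonal W minimizing ||src @ W - tgt||_F (Procrustes)."""
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt

# Toy stand-ins for seed-dictionary vectors: 10 word pairs, 4-dim embeddings.
rng = np.random.default_rng(0)
src = rng.normal(size=(10, 4))            # "English" side of the seed pairs
w_true, _ = np.linalg.qr(rng.normal(size=(4, 4)))
tgt = src @ w_true                        # "Igbo" side, synthetically aligned
w = learn_projection(src, tgt)

# The learned map is orthogonal and recovers the alignment on the seed pairs.
assert np.allclose(w.T @ w, np.eye(4), atol=1e-8)
assert np.allclose(src @ w, tgt, atol=1e-8)
```

Once `w` is learned from the seed dictionary, any source-space vector can be projected with `vec @ w`; the orthogonality constraint preserves distances and angles, which is why such projections transfer intrinsic-task performance reasonably well.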