
Electronic data

  • 2004.00648

    Accepted author manuscript, 65.1 KB, PDF document

    Available under license: CC0


Igbo-English Machine Translation: An Evaluation Benchmark

Research output: Contribution to conference - Without ISBN/ISSN, Conference paper, peer-review

Published
Publication date: 1/04/2020
Number of pages: 4
Original language: English
Event: Eighth International Conference on Learning Representations: ICLR 2020 - Virtual
Duration: 26/04/2020 - 30/04/2020
Conference number: 8th
http://ilcr.cc

Conference

Conference: Eighth International Conference on Learning Representations
Period: 26/04/20 - 30/04/20
Internet address

Abstract

Although researchers and practitioners are pushing the boundaries and enhancing the capacities of NLP tools and methods, work on African languages is lagging. Much of the focus is on well-resourced languages such as English, Japanese, German, French, Russian, and Mandarin Chinese. Over 97% of the world's 7000 languages, including African languages, are low-resourced for NLP, i.e. they have little or no data, tools, and techniques for NLP research. For instance, only 5 out of 2965 (0.19%) authors of full-text papers in the ACL Anthology, extracted from the 5 major conferences in 2018 (ACL, NAACL, EMNLP, COLING and CoNLL), are affiliated with African institutions. In this work, we discuss our effort toward building a standard machine translation benchmark dataset for Igbo, one of the 3 major Nigerian languages. Igbo is spoken by more than 50 million people globally, with over 50% of the speakers in southeastern Nigeria. Igbo is low-resourced, although there have been some efforts toward developing IgboNLP, such as part-of-speech tagging and diacritic restoration.