Final published version
Licence: CC BY
Research output: Contribution to Journal/Magazine › Journal article › peer-review
}
TY - JOUR
T1 - MasakhaNER
T2 - Named Entity Recognition for African Languages
AU - Masakhane
AU - Adelani, David Ifeoluwa
AU - Abbott, Jade
AU - Neubig, Graham
AU - D’souza, Daniel
AU - Kreutzer, Julia
AU - Lignos, Constantine
AU - Palen-Michel, Chester
AU - Buzaaba, Happy
AU - Rijhwani, Shruti
AU - Ruder, Sebastian
AU - Mayhew, Stephen
AU - Azime, Israel Abebe
AU - Muhammad, Shamsuddeen H.
AU - Emezue, Chris Chinenye
AU - Nakatumba-Nabende, Joyce
AU - Ogayo, Perez
AU - Anuoluwapo, Aremu
AU - Gitau, Catherine
AU - Mbaye, Derguene
AU - Alabi, Jesujoba
AU - Yimam, Seid Muhie
AU - Gwadabe, Tajuddeen Rabiu
AU - Ezeani, Ignatius
AU - Niyongabo, Rubungo Andre
AU - Mukiibi, Jonathan
AU - Otiende, Verrah
AU - Orife, Iroro
AU - David, Davis
AU - Ngom, Samba
AU - Adewumi, Tosin
AU - Rayson, Paul
AU - Adeyemi, Mofetoluwa
AU - Muriuki, Gerald
AU - Anebi, Emmanuel
AU - Chukwuneke, Chiamaka
AU - Odu, Nkiruka
AU - Wairagala, Eric Peter
AU - Oyerinde, Samuel
AU - Siro, Clemencia
AU - Bateesa, Tobius Saul
AU - Oloyede, Temilola
AU - Wambui, Yvonne
AU - Akinode, Victor
AU - Nabagereka, Deborah
AU - Katusiime, Maurice
AU - Awokoya, Ayodele
AU - MBOUP, Mouhamadane
AU - Gebreyohannes, Dibora
AU - Tilaye, Henok
AU - Nwaike, Kelechi
PY - 2021/10/1
Y1 - 2021/10/1
N2 - We take a step towards addressing the under-representation of the African continent in NLP research by bringing together different stakeholders to create the first large, publicly available, high-quality dataset for named entity recognition (NER) in ten African languages. We detail the characteristics of these languages to help researchers and practitioners better understand the challenges they pose for NER tasks. We analyze our datasets and conduct an extensive empirical evaluation of state-of-the-art methods across both supervised and transfer learning settings. Finally, we release the data, code, and models to inspire future research on African NLP.
AB - We take a step towards addressing the under-representation of the African continent in NLP research by bringing together different stakeholders to create the first large, publicly available, high-quality dataset for named entity recognition (NER) in ten African languages. We detail the characteristics of these languages to help researchers and practitioners better understand the challenges they pose for NER tasks. We analyze our datasets and conduct an extensive empirical evaluation of state-of-the-art methods across both supervised and transfer learning settings. Finally, we release the data, code, and models to inspire future research on African NLP.
U2 - 10.1162/tacl_a_00416
DO - 10.1162/tacl_a_00416
M3 - Journal article
VL - 9
SP - 1116
EP - 1131
JO - Transactions of the Association for Computational Linguistics
JF - Transactions of the Association for Computational Linguistics
SN - 2307-387X
ER -