
Electronic data

  • 2305.06897v1

    Other version, 408 KB, PDF document

    Available under license: CC BY: Creative Commons Attribution 4.0 International License

Keywords

cs.CL, cs.AI, cs.IR

AfriQA: Cross-lingual Open-Retrieval Question Answering for African Languages

Research output: Working paper › Preprint

Published

Standard

AfriQA: Cross-lingual Open-Retrieval Question Answering for African Languages. / Ogundepo, Odunayo; Gwadabe, Tajuddeen R.; Rivera, Clara E. et al.
Arxiv, 2023.

Research output: Working paper › Preprint

Harvard

Ogundepo, O, Gwadabe, TR, Rivera, CE, Clark, JH, Ruder, S, Adelani, DI, Ezeani, I, Chukwuneke, C, Dossou, BFP, Abdou, ADIOP, Sikasote, C, Hacheme, G, Buzaaba, H, Mabuya, R, Osei, S, Emezue, C, Kahira, AN, Muhammad, SH, Oladipo, A, Owodunni, AT, Tonja, AL, Shode, I, Asai, A, Ajayi, TO, Siro, C, Arthur, S, Adeyemi, M, Ahia, O, Anuoluwapo, A, Awosan, O, Opoku, B, Ayodele, A, Otiende, V, Mwase, C, Sinkala, B, Rubungo, AN, Ajisafe, DA, Onwuegbuzia, EF, Mbow, H, Niyomutabazi, E, Mukonde, E, Lawan, FI, Ahmad, IS, Alabi, JO, Namukombo, M, Chinedu, M, Phiri, M, Putini, N, Mngoma, N, Amuok, PA, Iro, RN & Adhiambo, S 2023 'AfriQA: Cross-lingual Open-Retrieval Question Answering for African Languages' Arxiv. <https://arxiv.org/abs/2305.06897v1>

APA

Ogundepo, O., Gwadabe, T. R., Rivera, C. E., Clark, J. H., Ruder, S., Adelani, D. I., Ezeani, I., Chukwuneke, C., Dossou, B. F. P., Abdou, A. DIOP., Sikasote, C., Hacheme, G., Buzaaba, H., Mabuya, R., Osei, S., Emezue, C., Kahira, A. N., Muhammad, S. H., Oladipo, A., ... Adhiambo, S. (2023). AfriQA: Cross-lingual Open-Retrieval Question Answering for African Languages. Arxiv. https://arxiv.org/abs/2305.06897v1

Vancouver

Ogundepo O, Gwadabe TR, Rivera CE, Clark JH, Ruder S, Adelani DI et al. AfriQA: Cross-lingual Open-Retrieval Question Answering for African Languages. Arxiv. 2023 May 11.

Author

Ogundepo, Odunayo; Gwadabe, Tajuddeen R.; Rivera, Clara E. et al. / AfriQA: Cross-lingual Open-Retrieval Question Answering for African Languages. Arxiv, 2023.

Bibtex

@techreport{1d0bc8bde1ca48e2931ee2244843dc1b,
title = "AfriQA: Cross-lingual Open-Retrieval Question Answering for African Languages",
abstract = "African languages have far less in-language content available digitally, making it challenging for question answering systems to satisfy the information needs of users. Cross-lingual open-retrieval question answering (XOR QA) systems -- those that retrieve answer content from other languages while serving people in their native language -- offer a means of filling this gap. To this end, we create AfriQA, the first cross-lingual QA dataset with a focus on African languages. AfriQA includes 12,000+ XOR QA examples across 10 African languages. While previous datasets have focused primarily on languages where cross-lingual QA augments coverage from the target language, AfriQA focuses on languages where cross-lingual answer content is the only high-coverage source of answer content. Because of this, we argue that African languages are one of the most important and realistic use cases for XOR QA. Our experiments demonstrate the poor performance of automatic translation and multilingual retrieval methods. Overall, AfriQA proves challenging for state-of-the-art QA models. We hope that the dataset enables the development of more equitable QA technology.",
keywords = "cs.CL, cs.AI, cs.IR",
author = "Odunayo Ogundepo and Gwadabe, {Tajuddeen R.} and Rivera, {Clara E.} and Clark, {Jonathan H.} and Sebastian Ruder and Adelani, {David Ifeoluwa} and Ignatius Ezeani and Chiamaka Chukwuneke and Dossou, {Bonaventure F. P.} and Abdou, {Aziz DIOP} and Claytone Sikasote and Gilles Hacheme and Happy Buzaaba and Rooweither Mabuya and Salomey Osei and Chris Emezue and Kahira, {Albert Njoroge} and Muhammad, {Shamsuddeen H.} and Akintunde Oladipo and Owodunni, {Abraham Toluwase} and Tonja, {Atnafu Lambebo} and Iyanuoluwa Shode and Akari Asai and Ajayi, {Tunde Oluwaseyi} and Clemencia Siro and Steven Arthur and Mofetoluwa Adeyemi and Orevaoghene Ahia and Aremu Anuoluwapo and Oyinkansola Awosan and Bernard Opoku and Awokoya Ayodele and Verrah Otiende and Christine Mwase and Boyd Sinkala and Rubungo, {Andre Niyongabo} and Ajisafe, {Daniel A.} and Onwuegbuzia, {Emeka Felix} and Habib Mbow and Emile Niyomutabazi and Eunice Mukonde and Lawan, {Falalu Ibrahim} and Ahmad, {Ibrahim Said} and Alabi, {Jesujoba O.} and Martin Namukombo and Mbonu Chinedu and Mofya Phiri and Neo Putini and Ndumiso Mngoma and Amuok, {Priscilla A.} and Iro, {Ruqayya Nasir} and Sonia Adhiambo",
year = "2023",
month = may,
day = "11",
language = "English",
publisher = "Arxiv",
type = "WorkingPaper",
institution = "Arxiv",
}

RIS

TY - UNPB

T1 - AfriQA

T2 - Cross-lingual Open-Retrieval Question Answering for African Languages

AU - Ogundepo, Odunayo

AU - Gwadabe, Tajuddeen R.

AU - Rivera, Clara E.

AU - Clark, Jonathan H.

AU - Ruder, Sebastian

AU - Adelani, David Ifeoluwa

AU - Ezeani, Ignatius

AU - Chukwuneke, Chiamaka

AU - Dossou, Bonaventure F. P.

AU - Abdou, Aziz DIOP

AU - Sikasote, Claytone

AU - Hacheme, Gilles

AU - Buzaaba, Happy

AU - Mabuya, Rooweither

AU - Osei, Salomey

AU - Emezue, Chris

AU - Kahira, Albert Njoroge

AU - Muhammad, Shamsuddeen H.

AU - Oladipo, Akintunde

AU - Owodunni, Abraham Toluwase

AU - Tonja, Atnafu Lambebo

AU - Shode, Iyanuoluwa

AU - Asai, Akari

AU - Ajayi, Tunde Oluwaseyi

AU - Siro, Clemencia

AU - Arthur, Steven

AU - Adeyemi, Mofetoluwa

AU - Ahia, Orevaoghene

AU - Anuoluwapo, Aremu

AU - Awosan, Oyinkansola

AU - Opoku, Bernard

AU - Ayodele, Awokoya

AU - Otiende, Verrah

AU - Mwase, Christine

AU - Sinkala, Boyd

AU - Rubungo, Andre Niyongabo

AU - Ajisafe, Daniel A.

AU - Onwuegbuzia, Emeka Felix

AU - Mbow, Habib

AU - Niyomutabazi, Emile

AU - Mukonde, Eunice

AU - Lawan, Falalu Ibrahim

AU - Ahmad, Ibrahim Said

AU - Alabi, Jesujoba O.

AU - Namukombo, Martin

AU - Chinedu, Mbonu

AU - Phiri, Mofya

AU - Putini, Neo

AU - Mngoma, Ndumiso

AU - Amuok, Priscilla A.

AU - Iro, Ruqayya Nasir

AU - Adhiambo, Sonia

PY - 2023/5/11

Y1 - 2023/5/11

N2 - African languages have far less in-language content available digitally, making it challenging for question answering systems to satisfy the information needs of users. Cross-lingual open-retrieval question answering (XOR QA) systems -- those that retrieve answer content from other languages while serving people in their native language -- offer a means of filling this gap. To this end, we create AfriQA, the first cross-lingual QA dataset with a focus on African languages. AfriQA includes 12,000+ XOR QA examples across 10 African languages. While previous datasets have focused primarily on languages where cross-lingual QA augments coverage from the target language, AfriQA focuses on languages where cross-lingual answer content is the only high-coverage source of answer content. Because of this, we argue that African languages are one of the most important and realistic use cases for XOR QA. Our experiments demonstrate the poor performance of automatic translation and multilingual retrieval methods. Overall, AfriQA proves challenging for state-of-the-art QA models. We hope that the dataset enables the development of more equitable QA technology.

AB - African languages have far less in-language content available digitally, making it challenging for question answering systems to satisfy the information needs of users. Cross-lingual open-retrieval question answering (XOR QA) systems -- those that retrieve answer content from other languages while serving people in their native language -- offer a means of filling this gap. To this end, we create AfriQA, the first cross-lingual QA dataset with a focus on African languages. AfriQA includes 12,000+ XOR QA examples across 10 African languages. While previous datasets have focused primarily on languages where cross-lingual QA augments coverage from the target language, AfriQA focuses on languages where cross-lingual answer content is the only high-coverage source of answer content. Because of this, we argue that African languages are one of the most important and realistic use cases for XOR QA. Our experiments demonstrate the poor performance of automatic translation and multilingual retrieval methods. Overall, AfriQA proves challenging for state-of-the-art QA models. We hope that the dataset enables the development of more equitable QA technology.

KW - cs.CL

KW - cs.AI

KW - cs.IR

M3 - Preprint

BT - AfriQA

PB - Arxiv

ER -
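
Example use

The record above describes AfriQA as a dataset of 12,000+ cross-lingual open-retrieval QA examples covering 10 African languages. As a hedged illustration only, the sketch below shows one way such a dataset could be inspected with the Hugging Face datasets library; the repository id "masakhane/afriqa", the language configuration name, and the field layout are assumptions not stated in this record.

# A minimal sketch, not the authors' code: it assumes the AfriQA data is published
# on the Hugging Face Hub under an id such as "masakhane/afriqa" with one
# configuration per language (e.g. "hau" for Hausa). Both names are assumptions.
from datasets import load_dataset

afriqa = load_dataset("masakhane/afriqa", "hau")  # hypothetical repo id and config name

print(afriqa)               # available splits and example counts
example = afriqa["train"][0]
print(example.keys())       # inspect the field layout rather than assuming it
print(example)              # one cross-lingual QA example

Printing example.keys() first avoids hard-coding field names, since the example schema is not described in this record.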