AfriMTE and AfriCOMET: Enhancing COMET to Embrace Under-resourced African Languages

Abstract

Despite recent progress in scaling multilingual machine translation (MT) to several under-resourced African languages, accurately measuring this progress remains challenging, since evaluation is often performed with n-gram matching metrics such as BLEU, which typically correlate weakly with human judgments. Learned metrics such as COMET correlate better; however, the lack of evaluation data with human ratings for under-resourced languages, the complexity of annotation guidelines such as Multidimensional Quality Metrics (MQM), and the limited language coverage of multilingual encoders have hampered their applicability to African languages. In this paper, we address these challenges by creating high-quality human evaluation data, with simplified MQM guidelines for error detection and direct assessment (DA) scoring, for 13 typologically diverse African languages. Furthermore, we develop AfriCOMET, COMET evaluation metrics for African languages, by leveraging DA data from well-resourced languages and an African-centric multilingual encoder (AfroXLM-R), yielding state-of-the-art MT evaluation metrics for African languages with respect to Spearman-rank correlation with human judgments (0.441).
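For context on the headline number, the sketch below illustrates how a segment-level Spearman-rank correlation between a metric's scores and human DA ratings can be computed with scipy.stats.spearmanr. The two score lists are illustrative placeholders, not data from the paper.

    # Minimal sketch of metric meta-evaluation: Spearman-rank correlation
    # between automatic metric scores and human direct assessment (DA)
    # ratings. All numbers below are illustrative placeholders.
    from scipy.stats import spearmanr

    # Hypothetical segment-level scores from a COMET-style metric.
    metric_scores = [0.71, 0.54, 0.88, 0.42, 0.65]

    # Hypothetical human DA ratings (0-100 scale) for the same segments.
    human_da = [68.0, 55.0, 90.0, 35.0, 70.0]

    rho, p_value = spearmanr(metric_scores, human_da)
    print(f"Spearman rho = {rho:.3f} (p = {p_value:.3f})")

In practice, the metric scores would come from a trained evaluation model such as AfriCOMET applied to (source, hypothesis, reference) triples, and the correlation would be computed against the human ratings collected for the 13 African languages.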

Publication DOI: https://doi.org/10.48550/arXiv.2311.09828
Divisions: College of Business and Social Sciences > Aston Business School > Operations & Information Management
Additional Information: Paper submitted to NAACL 2024. This is an open access publication distributed under the terms of the Creative Commons Attribution License CC BY [https://creativecommons.org/licenses/by/4.0/], which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Last Modified: 01 Jul 2024 07:35
Date Deposited: 21 May 2024 16:04
Related URLs: https://arxiv.org/abs/2311.09828 (Publisher URL)
PURE Output Type: Working Paper / Preprint
Published Date: 2023-11-16
Authors: Wang, Jiayi
Adelani, David Ifeoluwa
Agrawal, Sweta
Masiak, Marek
Rei, Ricardo
Briakou, Eleftheria
Carpuat, Marine
He, Xuanli
Bourhim, Sofia
Bukula, Andiswa
Mohamed, Muhidin
Olatoye, Temitayo
Adewumi, Tosin
Mokayed, Hamam
Mwase, Christine
Kimotho, Wangui
Yuehgoh, Foutse
Aremu, Anuoluwapo
Ojo, Jessica
Muhammad, Shamsuddeen Hassan
Osei, Salomey
Omotayo, Abdul-Hakeem
Chukwuneke, Chiamaka
Ogayo, Perez
Hourrane, Oumaima
Anigri, Salma El
Ndolela, Lolwethu
Mangwana, Thabiso
Mohamed, Shafie Abdi
Hassan, Ayinde
Awoyomi, Oluwabusayo Olufunke
Alkhaled, Lama
Al-Azzawi, Sana
Etori, Naome A.
Ochieng, Millicent
Siro, Clemencia
Njoroge, Samuel
Muchiri, Eric
Kimotho, Wangari
Momo, Lyse Naomi Wamba
Abolade, Daud
Ajao, Simbiat
Shode, Iyanuoluwa
Macharm, Ricky
Iro, Ruqayya Nasir
Abdullahi, Saheed S.
Moore, Stephen E.
Opoku, Bernard
Akinjobi, Zainab
Afolabi, Abeeb
Obiefuna, Nnaemeka
Ogbu, Onyekachi Raphael
Brian, Sam
Otiende, Verrah Akinyi
Mbonu, Chinedu Emmanuel
Sari, Sakayo Toadoum
Lu, Yao
Stenetorp, Pontus

Version: Draft Version

License: Creative Commons Attribution
