Cracking the Language Barrier for a Multilingual Africa

AI in Africa

This webinar series will be hosted by the International Research Centre on Artificial Intelligence (IRCAI) and supported by UNESCO and the Knowledge 4 All Foundation. It will present the project "Fellowship to develop datasets and strengthen capacities and innovation potential for Low Resource African Languages", which combines research in natural language processing, open dataset creation and publishing, and the development of an interface between the policy and technology spheres. The project delivered four main components:

1. A Fellowship for African AI researchers focused on African languages, building on earlier work on language datasets funded by IDRC and the Knowledge 4 All Foundation. This work contributes to a roadmap for better integration of African languages on digital platforms, lowering the barrier to African participation in the digital economy.

2. Improved representation of AI research on African languages, through resources for a variety of NLP tasks in a variety of African languages that enable good, data-driven results in AI research.

3. An African community of native speakers, attracted as contributors of language resources and language technology tools, to adopt and support Masakhane NLP, a platform for sharing, maintaining and making use of language resources and tools, establishing widely agreed benchmarks for NLP tasks, and stimulating competition between methods and systems.

4. A model case to inform evidence-based African policymaking on Artificial Intelligence, to be included in UNESCO’s AI Decision maker’s Essential to inform policymakers.



Date: TBA

Place: Online via Zoom link

The work within the Fellowship for Low Resource African Languages[13] is currently supported by international donors via a community-led movement towards locally developed and owned datasets that will unlock the power of AI to deliver new social sector solutions and increase the presence of African countries on the international data map. But this work has only just begun. The technical work will always be slower than anyone, including researchers, technologists and entrepreneurs, would want. Once a meaningful set of languages is equipped and digitised, however, real-life applications will start to be produced and used to solve a number of user-driven and consumer-driven problems. The question is how this work in natural language processing influences policy making. The webinar will detail the challenges and opportunities of data collection and archiving, available government support, opportunities for startups and, finally, the role that NLP can play in achieving the SDGs.


Kathleen Siminyu, Regional Coordinator, AI4D Africa
Davor Orlic, COO at IRCAI

Vukosi Marivate, University of Pretoria and Deep Learning Indaba


Date: TBA

Place: Online via Zoom link

Language is essential to how we learn, speak, teach, engage, and innovate in our communities. It captures our culture and cultivates it for generations to come, but that is not possible without timely and accurate datasets and natural language processing technology that digitally connect this wealth of humanity to other continents and cultures across and, most importantly, beyond Africa. A number of donors have supported the Language Fellowship project and several of its side projects and sub-projects. Their proactive AI teams, with 29 researchers, have created openly accessible text and speech datasets that will fuel natural language processing (NLP) technologies in 9 languages, across 22 countries, with 150 million speakers, engaging dozens of institutions across Africa and beyond. What is the roadmap from here onwards? Masakhane, a Network of Excellence in African Languages, is taking shape. Consisting of a number of research centres and researchers from African countries, it is dedicated to building the technological foundations of a multilingual African information society. In this webinar we introduce Masakhane, synthesise the results of the Fellowship, and provide a clear vision for the next steps Member States should take to make the best of the momentum among the AI research communities, policy makers and investors, and to harvest the benefits offered by Language Technology.


Kathleen Siminyu, Regional Coordinator, AI4D Africa
Davor Orlic, COO at IRCAI

Intro by Marielza (Dir/UNESCO)

Project Introduction by
Kathleen Siminyu, Regional Coordinator, AI4D Africa
Davor Orlic, COO at IRCAI

Moderator: UNESCO
David Adelani, Fellow working on Yoruba language
Amelia Taylor, Fellow working on Chichewa language
Vukosi Marivate, University of Pretoria and Deep Learning Indaba
Jade Abbott, Retro Rabbit and Masakhane
Prof. John Shawe-Taylor, Executive Director, IRCAI



Date: Friday, May 28, 2021

Place: Online via Zoom link

Participants: Nairobi, London, Cape Town, Saarbrücken, Kampala

This kick-off is part of a webinar series hosted by IRCAI and partners, and it builds on the results of the Fellowship. It introduces four new projects supported by the Lacuna Fund, in collaboration with Canada’s International Development Research Centre, the German development agency GIZ, The Rockefeller Foundation and

This webinar will look at the structure and delivery plan of the new projects that intersect with the Fellowship to develop datasets and strengthen capacities and innovation potential for Low Resource African Languages projects.

  1. Masakhane MT: Decolonizing Scientific Writing for Africa
  2. Masakhane NER: Named Entity Recognition & Parts of Speech datasets for African Languages
  3. Building NLP Text and Speech Datasets for Low Resourced Languages in East Africa
  4. Nigerian Sentiment Lexicon

The kick-off meeting will cover the following:

  • Formal presentation by the project leads
  • Formal presentation by the project officer
  • Review of project overview and different roles and potential outcomes
  • Definition of different structures and where possible identification of individual responsibilities
  • Presentation of available software and networks that can be exploited
  • Identification of areas where development is needed and which partners will be responsible for what
  • Scheduling of the next set of meetings, including a technical workshop later in the year (perhaps w/c July), PMC meetings, and next project meetings


13:00 – 14:00 (CEST)
Launch opening
Prof. John Shawe-Taylor, Director at IRCAI and K4A, UNESCO Chair in AI at University College London
Background and history
Kathleen Siminyu, Regional Coordinator, AI4D Africa
Davor Orlic, COO at IRCAI and K4A
All projects overview and Masakhane legal entity
Jade Abbott, Retro Rabbit and Masakhane
Discussion with Lacuna Fund Secretariat
Jennifer Pratt Miles and Seth Blum, Meridian Institute

Coffee Break

14:05 – 16:00 (CEST)
Presentation of management of all 4 projects
Jade Abbott, Retro Rabbit and Masakhane
Peter Nabende, Makerere University
Shamsuddeen Hassan, Bayero University
Andrew Katumba, Makerere University

Structure: Rollout, Milestones, List of deliverables, Technologies to be developed, Data collection, Legal framework, Risks, Discussion

Coffee Break

16:05 – 17:00 (CEST)
Presentations of business and outreach
Presentation on data influencing policies: Hacking and Prototyping, Products and Business, Pilots and Architecture, Dissemination and Policy, Invitation to the research community, Wrapping up and Closure


Date: Friday, June 28, 2021

Place: Online via Zoom link

The languages from this Fellowship have not received adequate attention or resources in the rapidly evolving landscape of language technologies and applications. This is therefore the beginning of a long journey of creating training and evaluation datasets for underserved African languages, which will have significant downstream impacts on education, financial inclusion, healthcare, agriculture, communication, and disaster response in Sub-Saharan Africa. The Fellowship recipients have produced training datasets in Eastern, Western, and Southern Africa that will support a range of needs for low resource languages, including machine translation, speech recognition, named entity recognition and part of speech tagging, sentiment analysis, and multi-modal datasets. Here we will present the work done on each language, with a general overview of the main challenges, key insights and lessons from the work of the various Fellows. This includes additional work by the African data science community that builds AI/ML applications on top of the datasets, such as the newly built Text-to-Speech (TTS) platform for the Wolof language.


13:00 – 13:15 (CEST)
Bhanu Neupane, Communication and Information Sector, UNESCO
Jonas Gramse, GIZ
Kathleen Siminyu, Regional Coordinator, AI4D Africa
Davor Orlic, COO at IRCAI

13:15 – 13:30 (CEST)
Keynote speaker
Thierno Diop: Masakhane Text-to-Speech (TTS) platform

Coffee Break

13:30 – 14:15 (CEST)
Fellowship speakers
Language Dataset Fellowships of all 9 languages
Kevin Degila: Ewe language and Fongbe language
David Adelani: Yoruba language
Amelia Taylor: Chichewa language
Thierno Diop: Wolof language
Davis David: Kiswahili language
Chayma Fourati: Tunisian Arabizi language
Swahili language
Lawrence Adu-Gyamfi: Twi language
Joyce Nakatumba-Nabende: Luganda language

14:15 – 14:30 (CEST)
Closing Remarks
Bhanu Neupane, Communication and Information Sector, UNESCO


International Research Centre
on Artificial Intelligence (IRCAI)
under the auspices of UNESCO 

Jožef Stefan Institute
Jamova cesta 39
SI-1000 Ljubljana


The designations employed and the presentation of material throughout this website do not imply the expression of any opinion whatsoever on the part of UNESCO concerning the legal status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries.

Design by Ana Fabjan