
Department of Computer Science and Technology

It's time to define some research priorities for our collaboration with Ethiopian colleagues.

Helen and I had an enjoyable and worthwhile afternoon with Tesfa Tegegne as he passed through Addis Ababa on the way to a community networking workshop in Tanzania (also involving project partner Nic Bidwell).

Over lunch, we talked about two main themes in the Ethiopia leg of the project: the potential for statistics curriculum and methods to be enhanced by more accessible probabilistic programming tools; and the need for natural language processing resources in the Amharic language to support an AI4D agenda in Ethiopia.

Tesfa’s own AI research at the ICT4D Centre in Bahir Dar is currently focused on the development of data resources for training Amharic speech recognition systems. As he told us, if Amharic-speaking people could access IT services via mobile devices using Amharic dialog, this could extend many information benefits to a large proportion of the Ethiopian population that is currently excluded by low literacy (about 50% of the adult population are not literate, and about 30% of the population are primarily Amharic-speaking; we aren’t sure of the intersection).

A significant question here might be whether free dialog systems are the most appropriate mechanism for serving the needs of excluded populations. An existing alternative is interactive voice response (IVR) systems, which are already widely deployed for commercial applications in developed countries. What benefits are anticipated for free dialog beyond those of IVR? Might partially structured dialogs be appropriate, and do they require research questions and methods different from those of corpus-driven language modelling?

With regard to statistics curriculum and methods, this was the first opportunity we had to discuss with Tesfa the development of ideas in the PPIG paper on Usability of Probabilistic Programming Languages, and conversations with the Cambridge Mathematics Project team. There are potentially two “leapfrog” opportunities here for countries like Ethiopia. The first is to establish mathematical reasoning skills more appropriate to a data science generation, perhaps assisted by interactive learning systems and simulations. The second is to develop an experiential understanding of computation as probabilistic rather than deterministic, where concepts of sampling and generalisation might become as important as those currently emphasised in computing curricula, such as iteration and functional abstraction.

Probabilistic languages might themselves be very effective research tools for AI research where quantities of training data are insufficient to apply deep learning methods, and where those responsible for collecting and labelling data might have closer relations and obligations to those building the models.

A more challenging speculation is whether there could be a “gentle slope” of usable functionality, leading toward such research tools, but starting with educational modelling tools that could be accessible to senior high school students, perhaps using mobile phones in the style of Microsoft’s TouchDevelop (https://www.microsoft.com/en-us/research/project/touchdevelop/).