A Survey of Localisation in African Languages, and its Prospects: A Background Document
1. Aside from being the maternal language for a large population in northern Africa, Arabic is also a major world language with significant speakership outside the continent, so some localisation issues implicate large markets and can draw on significant and diverse resources.
2. This observation is frequently made. Herbert (1992:1) is among the recent sources.
3. The term "European language of wider communication" (ELWC) was introduced by Eyamba G. Bokamba (1995). "Europhone" is a more recent coinage, sometimes used to refer to European languages and their speakers in Africa. "Language of wider communication" (LWC or sometimes LOWC) is an established term that refers to any language used vehicularly, generally in contexts where it is a second or additional language. Many African languages, including Arabic, serve as LWCs or as local linguae francae. ELWCs of course dominate in web content and software worldwide.
4. According to one estimate, up to 90% of the people in some countries do not speak the official languages (Mackey 1989:5, quoted in Robinson 1996:5).
5. ISO is the International Organization for Standardization. In effect, this standards organisation and the Unicode Consortium, begun as an industry association, coordinated their efforts in the mid-nineties to have a single coding system. It is sometimes called the "Universal Character Set" (UCS) but is commonly referred to simply as Unicode. This paper will follow the latter practice.
6. This is a subject that cannot be treated in depth here but merits a brief elaboration. Minimisation of the value of all aspects of indigenous cultures in Africa was a fundamental feature of European and North American interaction with Africa for centuries, during which the slave trade and colonisation were rationalised. While such attitudes are no longer acceptable, and indeed there is now a greater appreciation of African cultures elsewhere in the world, African languages have had little value attributed to them outside the limited circles of linguistic specialists. As late as the 1970s, a major introductory text on Africa gave little attention to African languages other than to suggest their future was in doubt (Bohannan and Curtin 1971; this statement was modified in later editions – see Bohannan and Curtin 1995). Chaudenson (2004) notes that the subject of language has been almost entirely absent from the discourse on development in Africa. And Brock-Utne (2005) calls attention to the negative attitude towards multilingualism in Africa among foreign donors, who see it as a "hindrance" to development.
7. In education and literacy training in Africa, one strategy has been to use instruction in first languages primarily as a "bridge" to learning in the official language (this is sometimes called a "subtractive bilingual" approach). Localisation of ICT in this report is not conceived with such a limited end in mind, although it is certainly true that people who learn computer use in a more familiar language would be able to acquire computer skills in an additional language more readily.
8. The focus here is mainly on the written languages, but audio and non-text images – whether alone or in combination with text – are also important in localisation and multilingual computing. These include some applications that will be discussed later.
9. We will mention keyboards briefly in this section. A more in-depth treatment, including discussion of speech recognition and speech-to-text, appears below (section 7.6).
10. Even more broadly, on a "meta" level, one might also include development of tools to facilitate the process of localisation. This is different from the internationalisation of the technology. Such tools are discussed below (section 5.4).
11. This is not to suggest that ELWCs in Africa have no connection, but that the connection is different and, for obvious reasons, less deep.
12. Some examples include models proposed by Duncan (1959), Rambo (1983), and Campbell and Olson (1991).
13. One might note that the South African, Jan Smuts, articulated the concept of "holism" in 1926.
15. In French: "aménagement linguistique intégré."
17. Nsengiyumva and Stork, "Rwanda" in Gillwald (2005).
18. The marketing for the Konyin keyboard, for instance, includes the phrase "Does not change how you type! No cryptic codes to remember! No training required!" - an explicit recognition of the importance of this "sociocultural" factor.
19. The macrolanguage, which "joiners" might in some cases simply call "languages" but which in other cases may approximate language clusters, is a category that arose in the process of reconciling different parts of the ISO-639 standard for codes representing languages.
20. This process involves in effect a blurring of dialect differences due to factors like marriage, movement of people, and broadcast media.
21. This is the phenomenon of speakers not mastering the language fully and, in the extreme, of no speaker or group of speakers mastering the full range of the language.
22. Among recent sources that survey language change in contemporary Africa is one by Batibo (2005).
23. H. Russell Bernard (1996) mentions such diversity of opinions in a discussion of whether linguists should work to preserve indigenous languages.
24. The author encountered the opinion that there is "no huge demand" in Ghana for Ghanaian language interfaces or software from at least two sources. The expectation that large-scale demand must be manifest before interfaces are provided or localisation work begins for various languages fails to take account of latent demand.
25. The author has encountered this attitude among some development professionals.
27. There have been references, for instance, to Igbo – a language of Nigeria spoken by at least 18 million people – as being "endangered" based on perceptions of how the language is and is not being used and passed on (see Daily Champion 2004, Lotanna 2005). This obviously stretches the definition of endangered too far, but it also reflects popular interest and concern among many Igbo.
28. A third area might be suggested in the context of localization for "digital divide" or ICT4D projects, and that is language in development more broadly. This has been treated only to a limited degree in the literature, for instance by Robinson (1996), Prah (2000), Simala (2002), and Ongarora (2002). For this study, however, language in development will be considered as part of the broader issue of language policy.
29. Many African countries do not have a legislated official language (Gadelli 1999). This fact is borne out by country-by-country research on language policy (see the site L'aménagement linguistique dans le monde, http://www.tlfq.ulaval.ca/axl/afrique/afracc.htm, which was one of the references used in compiling the country profiles in Appendix 3 [12.3] of this document). This is not particular to Africa, as numerous countries elsewhere (such as the United States) have not found it necessary to legislate any official language.
30. The website of the African Academy of Languages (ACALAN) has a recapitulation of how many of the declarations and plans of action issued by conferences and meetings in Africa have not been acted on. See http://www.acalan.org
31. See for example John E. Philips (2000) on the history of Hausa orthographies. In terms of developing alphabets for multiple languages in a country, particular note should be made of the process in Cameroon, where an effort to develop an alphabet has apparently met with some success (see Tadadjeu and Sadembouo 1984; Tadadjeu 1993).
32. Roger Blench (personal correspondence, 2006) notes, for instance, that much of what Kay Williamson (1984) compiled on orthographies for several Nigerian languages may not be in current use.
34. For instance, Naira currency notes in Nigeria include the amount of the note in Hausa, written in Ajami. It is the only indigenous language represented on the currency.
35. There are a number of experts who Fallou Ngom (personal correspondence, 2006).
36. See Appendix 2 (section 12.2) for more information on major scripts. Concerning the unsuccessful proposals, there have been, for instance, at least three writing systems proposed over the years for Hausa but not widely used, and in 2005 a retired professor in Senegal (Agence de Presse Sénégalaise 2005) and a merchant in the Gambia (Secka 2005) each announced they had created new scripts for African languages.
37. It is worth noting that there have been numerous conferences and meetings over the years to discuss aspects of the use of African languages in education. Two of the earliest were in 1964 in Abidjan and Ibadan (Sow 1977). Two of the most recent are one on bilingual education in Windhoek, Namibia, in August 2005 and one on languages and education in Africa scheduled for Oslo, Norway, in June 2006. A partial list is available at http://www.bisharat.net/Documents
38. There are actually two terms used for this. One, "multiliteracy," is also and perhaps more frequently used to describe literacy in multiple media. The other, "pluriliteracy," has been used in some European literature in the more strict sense of the ability to read more than one language. The latter term is used here.
40. Access is also an important issue where disabilities are involved, but this report will not address that dimension of access in Africa.
42. It should, of course, be remembered that skilled users may also have an interest in or preference for localized interfaces.
43. Other, non-technical, factors that impinge on levels of ICT usage in Africa include literacy (mentioned above, section 4.4) and income level.
46. The Leland Initiative "Africa Global Information Infrastructure Project" formally began in 1995 with a target of extending "full internet connectivity" to 20 or more African countries (see http://www.usaid.gov/leland/). The IIA was founded in 1996. The two coordinated their efforts to extend connectivity to the maximum number of countries possible (Okpaku 2003).
47. The Balancing Act (2004-2005) reports on the internet in Africa discuss these cables, as does the IDRC (2005) Acacia Atlas.
48. See http://l10n.openoffice.org/languages.html. In general, open source software and operating systems have been localised to a greater degree than proprietary software (see "Open Source's Local Heroes." The Economist 4 Dec. 2003).
54. American Standard Code for Information Interchange.
55. American National Standards Institute. ANSI is a bit of a misnomer as the institute never formally adopted drafts of this standard. Nevertheless they were used as "Windows ANSI" and the term is commonly used.
59. One exception is the Senegalese non-governmental organization ARED, headed by Sonja Fagerberg-Diallo. An early example of the kind of use possible with Macintosh computers of that era was a learning manual for the Pular of Guinea in the extended-Latin orthography that Dr. Fagerberg-Diallo produced in 1986.
61. Information from Yacob (personal correspondence, 2006).
62. Unicode Transformation Format. There are also other UTFs, such as UTF-16 and UTF-32 (the number indicates the number of bits in each code unit). Some background is given at http://en.wikipedia.org/wiki/UTF-8
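As a rough illustration of the point in note 62, the sketch below encodes a single character in UTF-8, UTF-16, and UTF-32 and counts the bytes each form needs. The number in each name is the size in bits of one code unit, so a character may take one or more units depending on the form. The function name `encoded_sizes` is hypothetical and not from the report; the sample character is the capital Y with hook (U+01B3) discussed in note 65.

```python
# A minimal sketch, not part of the original report.
# encoded_sizes is a hypothetical helper name chosen for illustration.

def encoded_sizes(text):
    """Return the number of bytes each Unicode encoding form uses for text."""
    return {
        "utf-8": len(text.encode("utf-8")),
        # The "-be" variants fix the byte order, so no byte-order mark is added.
        "utf-16": len(text.encode("utf-16-be")),
        "utf-32": len(text.encode("utf-32-be")),
    }

hooked_y = "\u01b3"  # LATIN CAPITAL LETTER Y WITH HOOK, used in Fula

for name, size in encoded_sizes(hooked_y).items():
    print(name, size, "bytes")
```

A plain ASCII letter such as "A" needs only one byte in UTF-8 but still two in UTF-16 and four in UTF-32, which is one reason UTF-8 is common for text that is mostly basic Latin.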
63. Réseau international francophone d'aménagement linguistique (International Francophone Network for Language Management). See http://www.rifal.org
64. The author is indebted to Mark Davis, Doug Ewell, and Steve Summit for their clarifications on this matter on the Unicode list (September 2006).
65. A recent example was the sample glyph used for the upper case Y with hook (used for the ejective y sound in Fula), in which the side on which the hook is shown was changed to reflect local usage in West Africa. A discussion of this aspect of this character can be read at http://scripts.sil.org/HooktopYVariants. This was apparently an inheritance from the divergence years before between what the current practice was in Africa (as reflected in the African Reference Alphabet) and the glyph form retained in ISO documents (per ISO-6438). See http://en.wikipedia.org/wiki/%C6%B3
67. This outline benefitted from information from Cunningham (personal communication, 2006) and Hoskins (2003).
68. An example of different assignment of keys is the set of differences between the QWERTY and AZERTY keyboards. The placement of the A, Z, Q, and W keys, among others, differs between the two layouts. Similarly, one can, in a keyboard driver, reassign keys in a customised keyboard layout without changing what is printed on them.
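The key reassignment described in note 68 can be pictured as a lookup table from the label printed on a key to the character the driver actually emits. The sketch below is a simplified illustration only: the table `QWERTY_TO_AZERTY` and the function `remap` are hypothetical names, and the table covers just the four letters named in the note, whereas real QWERTY and AZERTY layouts also differ in other keys.

```python
# A minimal sketch, not an actual keyboard driver.
# Keys: the label printed on the keycap (QWERTY). Values: the character emitted (AZERTY).
# Only the four letters mentioned in note 68 are included; this is an assumption
# made for brevity, not a complete layout.
QWERTY_TO_AZERTY = {"q": "a", "a": "q", "w": "z", "z": "w"}

def remap(typed, table=QWERTY_TO_AZERTY):
    """Translate keystrokes, named by their printed labels, into emitted characters.

    Keys that are absent from the table pass through unchanged.
    """
    return "".join(table.get(key, key) for key in typed)

print(remap("qwerty"))  # prints "azerty"
```

A customised layout for an African language could follow the same pattern, with the table sending rarely used keys to extended-Latin characters while the keycaps stay as they are.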
72. Each of the profiles in the Major Languages section of this report (Appendix I) includes information on ISO-639 codes for that language.
75. Its director, Dr. Nii Quaynor of Ghana, also served from 2000 to 2003 as At-Large Director of ICANN.
76. An organisational meeting was held in Dakar on 7 September 2005 to launch this effort. Mouhamet Diop of the Senegalese company Next SA organised the meeting.
77. This involves testing of two main alternative ways of handling non-ASCII characters and scripts (Crawford 2006).
80. One of the hopes of this study is that continuing to gather progressively more specific information on the country level will facilitate more detailed cross-comparisons of technical possibilities and linguistic needs.
81. Several Yahoogroups with significant Hausa content are one example, and a Senegalese forum in which there is Pulaar content is another.
82. There was even a "web-page by e-mail" service hosted for several years by Kabissa.org, in recognition of the fact that many people in Africa could not access the web but did have limited e-mail access.
83. It would be interesting to know more about the experience of these services. Unfortunately inquiries have yielded no replies.
84. A simple survey of websites by language done in 2000 by Vilaweb, the website of a Barcelona newspaper (Pastore 2000), listed no African languages among the 31. A follow-up to the Vilaweb survey, which ranked the top 48 languages on the web, found Afrikaans 42nd, after languages such as Basque and Slovenian, and Swahili last, following, among others, Frisian and Faeroese (Mas 2003).
85. A more recent survey by UNESCO (2005) on linguistic diversity on the internet recapitulates the information summarised in this section for Africa.
87. ALI stands for "Apprentissage des Langues africaines par l'Internet" (learning African languages on the internet). See http://www.kabissa.org/archives/a12n-forum/msg00187.html. This is not to be confused with another online program for second language learners of Akan called "ALI Akan" based in Switzerland (ALI there standing for African languages on the internet).
89. There are still occasionally new ones created, even though Unicode makes them unnecessary. For example, in 2006 a new 8-bit font was announced for the Ewe language.
90. Williamson (1984:66) mentions some typewriter keyboards for Nigerian languages along with strategies for typing with English keyboards. In the 1980s, IBM developed some typeballs with what we now call extended Latin characters for its Selectric typewriter. Mann and Dalby (1987) proposed a lower-case only keyboard for typewriters and computers based on the Niamey African Reference Alphabet, but this never caught on; see http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&item_id=IntlNiameyKybd (there is actually one keyboard layout that is based on the Mann-Dalby Niamey keyboard, but it includes upper case characters as well).
92. This was the case, for instance, in Mali, where the 8-bit fonts Bambara Arial and Bambara Times were developed by a project facilitated by the French agency ACCT during the late 1990s.
95. In one large cybercafé in Bamako in 2000, for instance, the author encountered French, English, and German language keyboards.
101. Nokia has localised "menu text and predictive input" for at least one phone model in Afrikaans, Arabic, and Swahili, and "menu text only" in Zulu, Xhosa, Sesotho, Yoruba, Hausa, and Igbo. See http://www.europe.nokia.com/A4160009
103. The International Association for Machine Translation (IAMT), for instance, is composed of three regional associations, one each for the Americas, Europe, and the Asia-Pacific region, but none in Africa, a continent that by itself accounts for about a third of the world's languages.
105. Jeff Allen, personal communication, 2006.
110. In the framework of the PLETES model this would refer to two points in the localization dynamic.
111. Helen Ladd, professor of public policy and economics at Duke University, proposed a similar question regarding the South African government: "...part of the broader language policy they need to grapple with is should all eleven [official] languages remain as viable languages?" (Aziz 2004). In other words, she was not talking even about languages in danger of extinction, for which such questions cannot be escaped, but rather the official and widely used languages of the country. This report proposes that similar questions be considered by other African countries.
113. It is our understanding from previous experience that many people working on localization of Arabic often use English or French as working languages. Nevertheless, the possibility of working in Arabic is considered.