Pushing Back the Origin of Bantu Lexicography: The Vocabularium

In this article, the oldest Bantu dictionary hitherto known is explored, that is the Vocabularium Latinum, Hispanicum, e Congense, handed down to us through a manuscript from 1652 by the Flemish Capuchin Joris van Gheel, missionary in the Kongo (present-day north-western Angola and the southern part of the Lower Congo Province of the DRC). The manuscript was heavily reworked by the Belgian Jesuits Joseph van Wing and Constant Penders, and published in 1928. Both works are currently being digitized, linked and added to an interlingual and multimedia database that revolves around Kikongo and the early history of the Kongo kingdom. In Sections 1 and 2 the origins of Bantu lexicography in general and of Kikongo metalexicography in particular are revisited. Sections 3 and 4 are devoted to a study of Van Gheel's manuscript and an analysis of Van Wing and Penders' rework. In Sections 5 and 6 translation equivalence and lexicographical structure in both dictionaries are scrutinized and compared. In Section 7, finally, all the material is brought together.


The origins of Bantu lexicography
In 1964 Benson wrote a remarkable article titled "A Century of Bantu Lexicography". Reading through the recent literature on Bantu lexicography, it seems as if scholars agree that the field, now half a century later, is indeed just 150 years old. In support of his argument Benson starts by retracing the lexicographical efforts of "a pioneer in the field such as Krapf" (p. 65), whose Swahili-English dictionary was published posthumously in 1882, whereas his first manuscript, "a vocabulary which became quite an extensive work" (p. 65), was written in 1844. Also for East Africa, Benson feels that "[a]fter Swahili the major Bantu language meriting consideration is Luganda" (p. 73), for which he starts his account with Le Veux's Luganda-French vocabulary of 1917. For Central Africa, Benson mentions Madan's Lala/Lamba/Wisa-English dictionary of 1913, a Bemba-English dictionary by the White Fathers of 1947, Torrend's English-Bantu-Botatwe dictionary of 1931, Hannan's Shona-English dictionary of 1959, and Scott's encyclopaedic Nyanja-English dictionary which was prepared in about 1870. For Southern Africa, Benson discusses Mabille's Southern Sotho-English dictionary of 1878, Brown's Tswana-English dictionary of the end of the 19th century, Doke and Vilakazi's Zulu-English dictionary of 1948, and McLaren's Xhosa-English dictionary of 1936. For West Central Africa, finally, Benson lists Bentley's Kikongo-English dictionary and grammar of 1887, Van Wing and Penders' Kikongo-French-Flemish dictionary of 1928, and Whitehead's Bobangi-English dictionary and grammar of 1899.
Benson (1964) does not refer to Doke's excellent overview of the "Early Bantu Literature" (1935), published three decades earlier. Doke stresses the invaluable contribution of "[t]he Angola Fathers [who] were the first to give us any monograph in or concerning a Bantu language" (p. 87), singling out Brusciotto as the greatest, being "the discoverer of the Bantu noun class and concord system, and the first recorder of Bantu verbal derivations" (p. 102). Hence the subtitle of Doke's (1935) article: "The Age of Brusciotto". The first four works which Doke discusses all stem from the first half of the 17th century. In 1624 the Portuguese Jesuit Cardoso translates the catechism "Dovtrina Christãa", which is published in Portuguese with interlinear translations into Kikongo, making it the very first text in a Bantu language. Two decades later, in 1643, another catechism, Pacconio and De Couto's "Gentio de Angola" is published, written in Kimbundu with a Portuguese version on the opposite pages. Next comes Brusciotto himself, who is credited with a quadrilingual Kikongo dictionary manuscript as well as a translation of the "Dovtrina Christãa" into Latin and Italian, both in the year 1650. Unfortunately, the quadrilingual Kikongo dictionary is not now extant (Doke 1935: 96), which leads some scholars to doubt whether it was actually compiled (e.g. Van Wing and Penders 1928: xxvii). Conversely, copies of Brusciotto's grammar of Kikongo, published in 1659, are extant and have "earned for him lasting reputation in Bantu language study" (Doke 1935: 97). A manuscript from the same period that has also survived to this date is Van Gheel's (1652) trilingual Latin-Spanish-Kikongo dictionary.
What interests us most here are Brusciotto's 'lost' quadrilingual dictionary manuscript of 1650, and Van Gheel's still-existing trilingual dictionary manuscript of 1652. Given Van Gheel's manuscript survives to this day, it is possible and even necessary to move the origin of the field back to 1652 or, writing in 2012, to state that the field of Bantu lexicography is (at least) 360 years old.

Metalexicographical studies on Kikongo
In a way, it is not surprising that the first dictionary of a Bantu language is one for Kikongo (H16), the Kongo kingdom being one of the first Bantu-speaking regions where the Portuguese landed. With a dictionary history of 360 years, one would therefore expect Kikongo lexicography to be a popular and oft-discussed topic in Bantu metalexicographic circles. Yet nothing is further from the truth. In twenty-one years of Lexikos, for instance, not a single aspect of Kikongo lexicography has been discussed. The closest one has come to the Kongo kingdom and its languages and dialects is via Gabon. Three years ago, Ndinga-Koumba-Binza and Roux (2009) as well as Mavoungou (2009a) each devoted an entire contribution to Civili (H12). Civili, also known as Fiote, belongs to the wider Kongo language cluster (that is, Guthrie's group H10) and is spoken along the coast in Congo-Brazzaville as well as in adjacent coastal areas in Gabon and Angola's Cabinda, and is associated with the historical Loango kingdom. Moving further afield, "sister languages" of Civili (Mavoungou 2006: 141), namely Yipunu (B43) and Yilumbu (B44), have also been covered to some extent in Lexikos (Mavoungou 2002, 2006, 2009). Similarly, in twenty-four years of the International Journal of Lexicography (IJL), Kikongo is only mentioned once in passing, in a dictionary review of French in Congo (Rey-Debove 1992: 160), and once in a definition for Kituba (Tsakona 2007: 120). The lingua franca Kituba (H10b) itself, also known as Munukutuba, Monokutuba or Kikongo ya Leta, a pidgin/creole based on Kikongo as lexifier, would be a good candidate to fill the lack of metalexicographical studies on Kikongo, but both Lexikos and IJL are silent about this language as well, except for a passing mention in De Schryver (2003: 18). While lexicographers may not have concerned themselves with metalexicographical studies on Kikongo, dictionary compilers have been quite busy, as attested by the lists of Kikongo reference works in, for instance, Doke (1945: 17-22) and Hendrix (1982: 45, 96-99, 186-187, 238, 244, 254, 262, 271).

The Capuchin missions in the Kongo and their linguistic works
In the year 1645, the first Capuchins arrived at the port of Mpinda, in Soyo, located in present-day north-western Angola, just south of the Congo River. Their purpose was to spread the Christian faith among the Kongolese population. The missionaries of this first caravan settled in Soyo and Mbanza Kongo (San Salvador), but did not engage in learning the indigenous language, since most of the Africans in these two urban centres already had sufficient knowledge of Portuguese (Hildebrand 1940: 259). Three years later, following the arrival of a second caravan of Capuchin missionaries, they realized the importance of acquiring the native language in order for them to pursue their evangelistic aspirations in the hinterland as well (Hildebrand 1940: 259; Nsondé 1995: 57). This second caravan included such illustrious missionaries as Antonio de Teruel and Girolamo da Montesarchio (Hildebrand 1940: 261), who engaged in the compilation of sermons, vocabulary lists and grammars in Kikongo. Alas, very few of these works have survived.
A later Capuchin caravan to the Kongo included our subject, the Fleming Joris van Gheel. The missionaries had set sail in 1648, but only reached the port of Mpinda in June 1651. After his arrival, Van Gheel was sent into the district of Matari (Van Wing and Penders 1928: xxiii).1 His stay in Kongo was rather short, since he died on the 17th of December 1652, as a result of having been beaten by villagers for disrupting a ritual and destroying their ritual objects (Nsondé 1995: 127; Thornton 2011). It is during this short period that Van Gheel managed to pen a manuscript which includes, in addition to a number of spiritual and worldly texts appended to the front and back, the trilingual Vocabularium Latinum, Hispanicum, e Congense, the oldest surviving source of the Capuchin description of Kikongo.

The question of authorship
It is generally accepted that Joris van Gheel physically wrote the dictionary, although the manuscript does not include any sign of authorship. This assumption is based on the fact that the handwriting clearly corresponds to other texts which are known to have been written by Van Gheel (Van Wing and Penders 1928: xxii-xxiii; Thornton 2011). The question of authorship, on the other hand, has been debated ever since the manuscript was discovered. D'Alençon (1914: 42) claims that Van Gheel cannot possibly be the author of the dictionary, considering that his stay was too short to acquire sufficient knowledge of the language. D'Alençon suggests that Van Gheel copied the dictionary merely for his own use. Van Wing and Penders (1928: xxvi-xxvii) refute this argument and point out that no potential original antedating 1652, from which Van Gheel could have copied, has been found. They consider d'Alençon's argument to be a confirmation of Van Gheel's linguistic capacities and of the extreme, though not insuperable, difficulties of the enterprise. Further on, Van Wing and Penders (1928: xxix) seem to nuance their argument, however, and claim that it might also be possible that Van Gheel actually used a vocabulary list of Antonio de Teruel, the Capuchin missionary who was part of the second caravan. Hildebrand (1940: 263-264), author of a book-length biography of Joris van Gheel, suggests that the Flemish Capuchin copied his dictionary from a vocabulary list previously compiled by the Capuchin prefect Buenaventura d'Alessano, as well as others including Antonio de Teruel and José de Pernambuco.2
Hildebrand (1940: 259-265) is also the first to mention the considerable influence exerted by Manuel Roboredo on the linguistic enterprises of the Capuchins. Roboredo was a Kongolese priest, child of a Portuguese nobleman and a Kongolese mother who belonged to the royal lineage of King García II of Kongo (Hildebrand 1940: 260). According to Hildebrand (1940: 261-265), it is Roboredo who taught the Capuchins the language, and it is also he who directed most of the compilation of their linguistic works. In fact, Hildebrand is very clear with respect to the authorship of the dictionary in question, as he states:3

Le grand mérite de la rédaction revient à Roboredo, en un certain sens, le dictionnaire est son oeuvre. La rédaction a été faite à la demande des Pères; ceux-ci peuvent revendiquer une partie du mérite de la belle entreprise. Le vocabulaire semble le travail collectif des nouveaux missionnaires, surtout d'Antoine de Teruel et de Joseph de Pernambouc, sous la direction de Roboredo … Telle a été la genèse du remarquable vocabulaire latin-espagnol-congolais, que nous connaissons par la copie du P. Georges. (Hildebrand 1940: 264, underlining ours)

Doke (1935), who had had access to an earlier study of Hildebrand (1934), is of the same opinion:

There can be no doubt, however, that he [Van Gheel] copied a manuscript known to be in existence at the Mission Station of San Salvador before his arrival. Joris was only a beginner, having been under two years in the country at the time of his death. Though the dictionary is probably not the work of a single person, it is practically certain that in the main it is to be ascribed to Roboredo, a Spaniard whose name is the only one mentioned in the original text. (Doke 1935: 97, underlining ours)

Contemporary scholars support (parts of) this argument, and especially focus on the merits of Manuel Roboredo. Nsondé (1995: 60), for instance, does not neglect the remarkable linguistic capacities of Joris van Gheel (who mastered Latin, Spanish and English before his arrival in the Kongo, in addition to his mother tongue Flemish), but he attributes the majority of the linguistic works of the Capuchins to Roboredo.4 In this respect, he also mentions the gratitude expressed by Buenaventura d'Alessano, the prefect of the Kongo mission, who openly recognized the merits of Roboredo (Nsondé 1995: 58-59). This view is shared by Thornton (2011), who considers Van Gheel to have copied from a vocabulary list compiled by the Spanish Capuchins José de Pernambuco and Francisco de Veas, with the aid of Roboredo and under the direction of Bonaventura da Sardegna (or da Nuoro). Similar arguments can be found in Bonvini (1996: 140) and Gray (1998), who consider Bonaventura da Sardegna and Manuel Roboredo to be the compilers of the dictionary. Bontinck (1980: 530), on the other hand, singles out José de Pernambuco as the writer of the first vocabulary lists, from which other Capuchins must have copied, such as Antonio de Teruel, Girolamo da Montesarchio and Joris van Gheel. The prefect, Buenaventura d'Alessano, is also often cited in the context of the compilation process, but this may be due to the fact that he reported the event to Rome (Nsondé 1995: 58-59; Thornton 2011).
In Section 3.4, we discuss linguistic evidence indicating that the main dialect represented in the manuscript is the direct ancestor of the Kisikongo variety currently spoken at Mbanza Kongo, the former capital of the Kongo Kingdom, and not the Kisolongo variety spoken along the coast. Given that Roboredo was close to the royal court at Mbanza Kongo, this evidence also supports the hypothesis of his strong contribution to the compilation of the Vocabularium.

The compilation strategy
In Addendum 1, pages 41-42 from the Vocabularium Latinum, Hispanicum, e Congense are shown. As may be seen, in this manuscript a lemma sign in Latin is typically followed by, first its translation into Spanish (although at times this slot remains empty), and second one or more translation equivalents in Kikongo. The interspersed metalanguage, which is used to indicate parts of speech and to clarify grammatical points, is presented in (abbreviated) Latin. That missionaries use Latin should not surprise, but the presence of Spanish in Kongo, rather than Portuguese, may surprise. The reason seems to simply boil down to the availability of existing reference works at the Mission Station. Both Hildebrand (1940: 264) and Bontinck (1976: 155-156) suggested that the source text must have been one of the re-editions of De Nebrija's (1492) Latin-Spanish Dictionarium. In a follow-up study, Bontinck (1980: 531-533) settles for the re-edition of 1581, published in Antequera. On the one hand Bontinck sees some macro- as well as microstructural correlations between De Nebrija's 1581 re-edition and the 1652 manuscript, and on the other he uses the place of publication to go as far as to pinpoint the very missionary (unsurprisingly from Antequera) who must have brought a copy down to the Kongo. That the Dictionarium was used as a base sounds rather plausible, but the evidence for a particular edition is less convincing. More or less any of the numerous works of De Nebrija (Wilkinson 2010: 30-38) that had been published by the mid-17th century could have been a candidate, and indeed, Nsondé (1995: 232) refers to the re-edition of 1570. That edition was published in Antwerp, so one could as well argue that it is Joris van Gheel who brought a copy of the Dictionarium to the Kongo.5
In Addendum 2 the start of the section "C before O" in the 1570 edition of the Dictionarium is shown. A comparison with Addendum 1 reveals some similarities, but especially many differences. Pinpointing the exact edition, however, goes beyond the scope of the present article. Yet, what is interesting to note is the strategy itself. Just as the first monograph in a Bantu language was actually a translation (cf. Section 1), so is the first reference work in a Bantu language. The use of an existing dictionary as a kind of template, to be filled in with the local language, seems to have been a common strategy of the time. An example from Mexico is the 16th century Vocabulario trilingüe, a trilingual Spanish-Latin-Nahuatl dictionary, incidentally also based on one of De Nebrija's dictionaries, the Vocabulario de romance en latin of 1516 (cf. Clayton 2003).

3.4 The language/dialect described
The question of authorship is extremely relevant when it comes to determining the exact variety of Kikongo that is being described in the manuscript, since Kikongo itself does not refer to one single language, but to a large dialect continuum manifesting a family resemblance structure. Neighbouring dialects are mutually intelligible, but dialects at the extreme ends of the chain are not. If Van Gheel copied from another vocabulary list, the variety described in his dictionary does not necessarily represent the varieties of the areas in which he was preaching. Van Wing and Penders, however, make the following, rather contradictory, statement:6

De door hem [Joris van Gheel] opgeteekende taal is die van de streek waar hij werkzaam was; het dialekt van Sogno, wellicht het meest door zijn voorgangers gebruikt, heeft echter de overhand. Deze taal overigens heeft ook P. de Teruel moeten leeren te Mbata, te Nkusu en te Mpemba. (Van Wing and Penders 1928: xxx-xxxi)

While this statement could well be read as an argument favouring the hypothesis that Van Gheel copied from earlier Capuchin work, Van Wing and Penders do not entertain this option and they continue to consider Van Gheel to be the real author of the dictionary. According to John Thornton (personal communication, January 2012), De Cadornega (1680) mentions that there were three dialects of Kikongo and gives their approximate limits. It is not clear to what extent these dialects correspond to the three major Kikongo varieties spoken in northern Angola today: (i) Kisolongo along the coast; (ii) Kisikongo, also known as Kisansala, spoken in the wide vicinity of Mbanza Kongo; and (iii) Kizombo spoken further east.
Van Wing and Penders are not the only ones who believe that the dialect of Soyo (Fl. and Fr. Sogno, Prt. Sonho), of which Kisolongo would be the closest descendant, dominates in Van Gheel's manuscript.7 Bontinck (1976: 156) actually uses the assumed predominance of this dialect as an argument in favour of José de Pernambuco, who stayed in Soyo, to be the compiler of the first vocabulary list. John Thornton (personal communication, January 2012), however, does not believe that it is the coastal dialect of Soyo that is being described, but rather the dialect from Mbanza Kongo (San Salvador), spoken 300 km inland.8
In De Kind (2012), a comparative phonological and morphological study between the 17th century Kikongo described in the manuscript and more recent Kisolongo and Kisikongo varieties is carried out. On purely phonological grounds it is not possible to determine which Kikongo variety is described in the manuscript, since only minor differences have been observed in this regard. However, some remarkable differences have been observed regarding the morphology of the Kikongo varieties concerned. The 17th century variety and the Kisikongo variety share innovations regarding prefix loss or reduction which are not shared by the Kisolongo variety. The clearest examples are the prefixes of classes 5 and 10. The former shifted to e- both in the 17th century variety and in 19th century Kisikongo, and subsequently disappeared in present-day Kisikongo, but is maintained as di- in Kisolongo. The prefix of class 10 is realized as zi- in Kisolongo, but is lost in the 17th century variety and in Kisikongo. The sound changes which the augment or pre-prefix underwent also constitute a shared innovation between the 17th century variety and Kisikongo, both having the e-o-o type, while Kisolongo exhibits the e-e-o type. Both types evolved from the ancestral e-a-o type. In sum, based on shared morphological innovations, we can conclude that the variety described in the manuscript is a predecessor of Kisikongo, and not Kisolongo.

3.5 The orthography used
This question of authorship is also relevant to determine on which language the orthography of the manuscript is based. It can, at present, not be answered with complete certainty, but it seems to be both Portuguese and Spanish based.
Portuguese was the language spoken by Kongolese priests, such as Roboredo, who, as we saw, played a pivotal role in the compilation process. At the same time, many of the Capuchin missionaries came from Spain, although several were also Italian. Especially interesting are José de Pernambuco and Francisco de Veas, who participated in the compilation process and who were both Spanish (Thornton 2011). Moreover, the director of the compilation, Bonaventura da Sardegna, was of Italian origin, but studied in Spain (Gray 1998).

Le plus ancien dictionnaire bantu/Het oudste Bantu-woordenboek (Van Wing and Penders 1928)
So far, we have neatly kept Van Wing and Penders' Kikongo → French/Flemish dictionary of 1928 (mentioned in Section 1), and Van Gheel's Latin/Spanish → Kikongo manuscript of 1652 apart, even though there is a connection. According to Benson (1964: 77), Van Gheel's (1652) manuscript "was edited and reproduced" by Van Wing and Penders (1928). Merely looking at the direction (into Kikongo in 1652, vs. out of Kikongo in 1928) and languages involved (with Latin and Spanish as source languages in 1652, vs. French and Flemish as target languages in 1928), it should be clear that this cannot be a 'reproduction' by any stretch of the imagination. Compare Addendum 3, which shows a random page taken from Van Wing and Penders' dictionary, with the manuscript pages seen in Addendum 1. In this respect we concur with Doke, who rightly said about Van Wing and Penders' effort:

Unfortunately the present Editors have not published the manuscript in the form in which it was written, viz. Latin-Spanish-Kongo, but have taken out the 7000 odd Kongo words alphabetically, and then added French and Dutch equivalents. Since the publishing of such a work to-day is not of everyday practical worth, but of great value to students, such a method of handling the manuscript is the opposite of scientific. (Doke 1935: 96)

The Vocabularium Congense, in its 1928 incarnation (which Van Wing and Penders titled, in French/Flemish, Le plus ancien dictionnaire bantu/Het oudste Bantu-woordenboek, or thus The Oldest Bantu Dictionary), remains the more accessible of the two versions, however, so it is important to submit it to an analysis, in order to judge its scientific value.

The modern Kikongo orthography: base letters
Over and above the changes to the direction and languages involved, an even more obtrusive intervention concerns the adjustment of the Kikongo words to the 'modern' Kikongo orthography. In doing so, several phonemes of the original were obscured and merged in the modern variants. For instance, the grapheme <v> in Van Wing and Penders might refer to <bh> or <u> in the original. It is extremely doubtful that these two graphemes represented the same phonemes, let alone the same sounds.
In (1) we list the principal changes of Van Wing and Penders (1928: xxxiii-xxxiv) with regard to the orthography, and we discuss some of the problems that result from these changes.
(1)
u, ü = v or w, according to the modern orthography
y = y or i, according to the modern orthography
z = z or s, according to the modern orthography

Some of the changes might be considered useful as they clarify the original orthography which was influenced by Portuguese or Spanish and approximate the IPA conventions. The change from <cu> to <kw> in front of vowels should not be considered harmful, nor should the change from <c> to <k>, since <c> always seems to represent the voiceless velar plosive /k/. In modern-day Spanish, the grapheme <c> might refer to the voiceless dental fricative /θ/, when followed by <e> or <i>. The manuscript, however, seems to use the grapheme <z> to represent this voiceless dental fricative, as seen in the Spanish hazer 'do, act' in (2).9

(2) ago.is.hazer.cubhanga: p. npā (ago 'to do, to act') gúiri.
The changes from <gu> to <g> before <i> or <e> and to <gw> before <a> do not imply phonological changes and merely clarify the Portuguese or Spanish orthography. When reading the manuscript, one must thus be conscious of the fact that <gu> before <i> or <e> represents the voiced velar plosive /g/, while <gu> before <a> (or <o>) represents this voiced velar plosive /g/ followed by the voiced labialized velar approximant /w/.10
The change from <qu> to <ku> is problematic, since <qu> only represents /k/ when followed by <i> or <e>. When followed by <a>, <o> or <u>, it represents /kw/, that is the voiceless velar plosive followed by the voiced labialized velar approximant. However, in practice Van
Other orthographical changes do have an impact on phonetic and/or phonological distinctions. Such is the case with <b> and <bh> becoming <b> or <v>.
In most cases <b> remains <b> and <bh> is replaced by <v>, but unfortunately in some cases <bh> is also replaced by <b>. See (5). It seems unlikely that both <bh> and <u> in the original represent the voiced labiodental fricative /v/. <bh> never existed as a grapheme in Portuguese or Spanish and its phonetic value cannot be pinpointed with certainty. It is possible that the indication of an aspiration of /b/ was intended, but in the Bantu languages, it is voiceless rather than voiced plosives that are normally aspirated.12 It is more likely that it represents the voiced bilabial fricative /β/, as is also suggested by Thornton (2011), who mentions the existence of the bilabial fricative in some dialects. It is also attested in Kizombo as a reflex of *p, after a nasal prefix of class 1, for instance in /ɱβaŋgi/ 'creator' (Fernando 2008: 32). However, *p is reflected as /v/ in an intervocalic position, for instance in -vanga 'do, make'. Possibly, the dialect in the dictionary did not yet make a distinction between these two sound changes and *p was always reflected as /β/ before a non-close vowel. It seems, nonetheless, problematic to regard <bh> as /β/ with respect to some Spanish words included in the dictionary, in which the <u> grapheme represents the bilabial fricative /β/, as in example (7), heruir [eɾβiɾ]. As such, two graphemes (<bh> and <u>) would be used to represent the same sound /β/. This can be explained if we assume that the Spanish words were merely copied from the Latin-Spanish dictionary, and that the Kikongo words were added in with a slightly different orthography, namely the already established Kikongo orthography of the time, which must rather have been based on Portuguese. Thus, <u> might represent /β/ in Spanish, while <bh> might represent /β/ in Kikongo.
The <u> grapheme, on the other hand, seems to represent several phonetic values. It might represent the voiced labial velar approximant /w/, as it merges with the /w/ sound of several prefixes. It is, thus, used as semivowel. But from a diachronic perspective, the evolution from /w/ in the 17th century to /v/ in the beginning of the 20th century (i.e. the sound reflected in the Kikongo variant to which Van Wing and Penders have adjusted their orthography) seems unlikely. Since /w/ is a 'weaker' sound than /v/, it would be more logical the other way around, a phenomenon called 'lenition' (Crowley and Bowern 2010: 39). It is, therefore, likely that <u> in the manuscript represents both /w/ and /v/. This is corroborated by the fact that no <v> graphemes can be found in the dictionary, which are all included under <u>. Example (8) illustrates different uses of the <u> grapheme in the Kikongo word eúúa, in which the first <ú> might refer to the labiodental fricative /v/ (or perhaps the bilabial fricative /β/, or even something in-between), while the second <ú> probably refers to the semivowel /w/.13

(8) noúem.eúúa (noúem 'nine')

The change of <ç> to <s> or <z>, and of <z> to <z> or <s>, is also likely to cause phonetic changes, but this needs to be studied in further detail.
To summarize this section one can thus say that the orthographic changes executed by Van Wing and Penders, on the level of the basic letters, include changes that clarify, but unfortunately also changes that obscure the phonetic and/or phonological values of the graphemes used.
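The unambiguous part of these correspondences, namely the respellings of <c>, <cu>, <gu> and <qu> discussed in this section, can be expressed as a small set of ordered rewrite rules. The Python sketch below is purely illustrative and is not Van Wing and Penders' actual procedure: the names `strip_accents` and `modernize`, the rule order, and the vowel sets in the lookaheads are our own assumptions. For <qu> it follows the phonemic reading given above (/k/ before <i> or <e>, /kw/ before <a>, <o> or <u>) rather than a blanket replacement by <ku>. The ambiguous graphemes (<bh>, <u>, <y>, <z>, <ç>) are deliberately left untouched, since their values cannot be decided mechanically.

```python
import re
import unicodedata

# Combining acute and grave accents: the only marks this sketch removes.
# Other marks (e.g. the cedilla of <ç>) are kept on purpose.
_ACCENTS = {"\u0301", "\u0300"}

# Ordered (pattern, replacement) pairs. Assumption: lower-case input and
# plain vowels after accent stripping.
_RULES = [
    (re.compile(r"qu(?=[ie])"), "k"),      # <qu> before <i>/<e> is /k/
    (re.compile(r"qu(?=[aou])"), "kw"),    # <qu> before <a>/<o>/<u> is /kw/
    (re.compile(r"gu(?=[ie])"), "g"),      # <gu> before <i>/<e> is /g/
    (re.compile(r"gu(?=[ao])"), "gw"),     # <gu> before <a>/<o> is /gw/
    (re.compile(r"cu(?=[aeio])"), "kw"),   # <cu> before a vowel becomes <kw>
    (re.compile(r"c"), "k"),               # remaining <c> is /k/
]


def strip_accents(word):
    """Remove acute and grave accents, keeping all other characters."""
    decomposed = unicodedata.normalize("NFD", word)
    kept = "".join(ch for ch in decomposed if ch not in _ACCENTS)
    return unicodedata.normalize("NFC", kept)


def modernize(word):
    """Apply the unambiguous respelling rules to one manuscript form."""
    result = strip_accents(word.lower())
    for pattern, replacement in _RULES:
        result = pattern.sub(replacement, result)
    return result


if __name__ == "__main__":
    for form in ["cubhanga", "múbhiqúi", "músonguela", "mulungua"]:
        print(form, "->", modernize(form))
```

The strings returned are simply what these rules produce; they are not claimed to be the spellings printed in the 1928 edition. Keeping the ambiguous graphemes out of the rule set makes explicit which decisions would still require a philological judgment.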

4.2 The modern Kikongo orthography: diacritic marks
Another remarkable orthography change executed by Van Wing and Penders is their omission of diacritic marks. The precise function of the diacritics in the original is difficult to retrace. One would expect them to represent tone, but this is unlikely for two reasons. First, acute accents, currently associated with high tone in Bantu linguistics, also occur on the Latin and Spanish words, which are definitely not tonal. See (7) and (8) above for Latin examples, and (9) for a Spanish example.
Second, both acute and grave accents occur on the Kikongo data, which would imply a three-tone system, since an unmarked syllable would then be interpreted as mid-tone.14 This is not found in the contemporary Kikongo varieties or in other Bantu languages (Lumwamu 1973: 25). See (10).
An analysis of the diacritics on the Latin and Spanish words in the manuscript does not reveal much either. What is significant is that the diacritics in these two European languages only occur on <u>, and exclusively as acute accents, while they occur on more vowels in Kikongo, and also include grave accents and other diacritics. Neither in Latin nor in Spanish do they seem to indicate stress, as they occur on vocalic, consonantal and semi-vocalic uses of <u>. Moreover, this is not consistently done. In (11), for example, nouus 'new' is written without any accents, while noúitas 'novelty' is written with an acute accent.

iaúbha : (noúitas 'novelty')

Also, both púrgo and purgo occur, as seen in (12), in which the form with the acute accent represents the transitive form of the verb, 'to purify', as indicated in the margin of the manuscript, while the unmarked form represents the reflexive form, 'to apologize'. Unfortunately, no other instances of such a differentiating function have so far been found.

(púrgo 'to purify')
(12b) purgo.as.desculparse.cúicússula (purgo 'to apologize') múqúicúma.p. icúsúiri &

Remarkably, even an unpronounced <u> is occasionally given an acute accent, both in Spanish and in Kikongo, as is illustrated in example (13). The <qú> grapheme in the Spanish qúemar, qúe and qúema is pronounced as the voiceless velar plosive /k/, as it is in the Kikongo múbhiqúi. This conveys the impression that the diacritics have not been used in a systematic way.

(contorqueo 'to throw')
To summarize this section one can thus say that the functions of the diacritic marks in Van Gheel's (1652) manuscript, omitted by Van Wing and Penders (1928), are extremely hard to retrace. At this stage we have to conclude that no apparent system was used for the placement of accents and other marks, but further research may, hopefully, invite us to revise this view. The option of vowel length could also be studied further in this regard. It might also be the case that several diacritic systems are intermingled, one belonging to an as-yet undiscovered original, and others belonging to the copies such as the one made by Van Gheel.

Translation equivalence in Van Gheel (1652) and Van Wing and Penders (1928)
The difficulties of translating an existing dictionary into another language are well known, especially when having to bridge languages with very different grammatical structures. Several issues are dealt with by Clayton (2003: 101-108), when she discusses the addition of Nahuatl to a 16th century Spanish/Latin template. Earlier, Doke (1935: 87), referring to the Bantu languages in the age of Brusciotto, spoke of "the Latin approach to a treatment of Bantu when grammatical elements are dealt with". As any bilingual (or trilingual, quadrilingual, etc.) lexicographer will be able to confirm wholeheartedly, perfect interlingual correspondence is a chimera. With reference to Zulu, De Schryver and Wilkes (2008) coined the term 'complexicography', and offered some modern (corpus-driven) solutions. Summarising the state of the art, Adamska-Sałaciak recently recognized three potentially interconnected reasons underlying the complexity of interlingual lexicography:

The lexicons of natural languages are not isomorphic. Reasons for the anisomorphism can be sought on three interrelated planes: language structure, extralinguistic reality, and conceptualisation. Simply put, the relevant differences may reside in the language, the world, the mind, or any combination of these. (Adamska-Sałaciak 2011: 1)

No doubt, our Capuchins were faced with exactly these problems when adding Kikongo to their Latin/Spanish template. It is, therefore, instructive to look at some of the solutions found to combat anisomorphism in the Vocabularium Latinum, Hispanicum, e Congense of 1652, and to look at the technique that was used when taking out the Kikongo in compiling The Oldest Bantu Dictionary of 1928.

Meaning extensions
A neat solution for imported (here European) concepts is to resort to extending existing meanings, in combination with the general morphological rules for word formation in a language (here Kikongo), as seen in (18) and (19).
(cathecumenus 'catechumen') mulungua .músonguela.pl.a&.

In (18) mulungua (sic, rather mulongua) and músonguela are offered as translation equivalents for cathecumenus 'catechumen', both nouns having been put in class 1 (mu-), and derived from the verb roots -longua 'to learn, to be taught' and -songuela 'to advise' respectively. According to the OED a catechumen is "[a] new convert under instruction before baptism", and as in the original Greek (i.e. κατηχούμενος 'one being instructed (in the rudiments of religion)'), the Capuchins derived the two Kikongo versions from verb roots equivalent in meaning to the Greek ones. In (19) the second and third translation equivalents for discipulus 'disciple' are derived from the same two verb roots as in (18), while the first option muana a mucanda literally means 'child of the book', or thus 'student', and by extension 'disciple'. Lexicologically the Capuchins clearly did a rather good job; terminologically they unfortunately introduced an ambiguous term (with mulongua being both a 'catechumen' and a 'disciple'); and lexicographically they have been sloppy: the Spanish equivalent is present in (18) but missing in (19), the plural of the first Kikongo equivalent in (18) is missing but present elsewhere in (18) and (19), and the structural marker preceding plurals is "pl." in (18) but "p." in (19).17

Paraphrases
When the Capuchins did not manage to create a single-word term for a novel concept, they simply combined words paraphrasing the concept, creating a multi-word term, as in (20), where two connectives (lua and ia) are used.

Loanwords
Unsurprisingly, there are also cases where the Capuchins simply took both the foreign concept and the word itself, with or without phonological adaptation. In (21) and (22) the loanword was taken from Portuguese, while in (23) it was taken from Latin.18

Misnamings
Not only did the Europeans bring elements of their culture to the Kongo area; it is clear that the Kongolese culture also comprised elements unfamiliar to the Europeans. This bias might be less visible to the European scholar, as indigenous terms are used to denote foreign concepts. Their meaning is not just extended; their original meaning (at least in Van Gheel's manuscript, as well as in Van Wing and Penders' reversing out) is denied and abandoned for the foreign concept. This becomes especially clear when comparing these terms to other Bantu languages or to the Proto-Bantu reconstructions. For instance, while the Capuchins were familiar with wild animals such as lions, leopards and elephants, they were apparently not familiar with hyenas and jackals. Examples (24) through (26) show that the translations of lion, leopard and elephant correspond to the respective Proto-Bantu reconstructions, while examples (27) and (28) show that there is a mismatch for hyenas and jackals, as these are offered as equivalents for wolves and foxes respectively.
(24) leo.onis.ncossi.p. id.

Here we have reached a crucial point, and are entering the domain of forensic dictionary analysis (cf. Coleman and Ogilvie 2009). That existing terms may be (re)used to name similar animal species across continents is well known. For instance, the Dutch who settled in the Cape named a certain species of fish they found in the sea snoek, drawing an analogy with the fresh water snoek they knew from home. The two are however different species, prompting the latest Afrikaans-Dutch dictionary to point out: "In Afr. verwys 'snoek' na 'n bepaalde soort seevis, nie 'n varswaterroofvis soos in Ned. nie" (ANNA).19 In the case of snoek, it was one people who used (initially) one language (Old Dutch), to name a new species. Not having a name for the new species, they used a term they already had for a similar fish. This is different from our interlingual Kikongo dictionary. The European-born Capuchins surely had had first-hand experience with wolves and foxes in Europe, and so must have realized that the hyenas and jackals in Africa were different species. Could they then, as suggested at the start of this section, really have taken Kikongo terms in use for other species, to now name animals from Europe? This sounds improbable. More plausible is the situation whereby a native of the Kongo is presented with a description of wolves and foxes, which are unknown to him, to then, based on that evidence, offer terms from his native Kikongo as translation equivalents. If anything, then, the errors noted in (27) and (28) are pointing in the direction of a dictionary compiler whose native language and view of the world are African. In other words, the case in favour of Roboredo as the main compiler of the first Capuchin manuscripts is getting stronger.20

A second crucial point concerns the words-and-things method. This method is founded on the basic idea that a community's culture is reflected in its language. It is therefore used to reconstruct the history of a particular region on the basis of vocabulary reconstructed from the languages spoken there (Bostoen 2007: 175). Looking back at examples such as (27) and (28), it should thus be clear that extreme caution must be exercised before citing 'evidence' from the dictionary. Bontinck (1976, 1980), too, pointed this out, and criticized Vansina (1974) for using Van Wing and Penders (1928) very loosely, for instance with respect to his deductions on the presence of certain craftsmen in the Kongolese society, such as "slave traders, wine merchants, butchers, fishmongers, booksellers, shopkeepers, grocers for spices, clothes sellers, perfume dealers, and pharmacists" (Bontinck 1976: 155, Vansina 1974: 149, Van Wing and Penders 1928: 85). Clearly, the same holds for conclusions regarding the Kongolese wildlife, as illustrated above. One cannot conclude that the Kongolese wildlife included wolves and foxes (cf. Kingdon 1997).21

Retranslations
On top of the anisomorphisms already discussed, Van Wing and Penders added yet another layer of translation inequivalence. In their own words (quoting the French version as it conveys it better than the Flemish):22

En faisant la traduction française et flamande des mots congolais nous avions à tenir compte du sens du mot congolais, tel qu'il nous est connu en congolais moderne et en même temps du sens des mots correspondants en latin et en espagnol donnés par notre auteur. Il arrive parfois que l'auteur rend inexactement en congolais certains mots latins. De la sorte il sera arrivé quelquefois, que nous avons donné une traduction française et flamande qui ne rend pas exactement le sens du mot congolais. (Van Wing and Penders 1928: xvi)

In other words, on top of reversing out the entire dictionary of Van Gheel, Van Wing and Penders also insisted on adding the modern (i.e. late 19th to early 20th century) Kikongo meanings, and being unhappy with some of the Latin to Kikongo translations, they sometimes additionally translated directly from Latin into French/Flemish, regardless of the Kikongo! At all times, and despite the fact that there is no fixed slot for Latin in their dictionary, one thus actually has to 'imagine' there is an underlying layer of Latin 'driving' the entire enterprise. Van Wing and Penders do not give examples of their claim, but a candidate embedding several levels is shown in (29).
With regard to the reversal proper, the Latin lemma sign draco and the Kikongo translation equivalent nboma traded places, becoming the Kikongo lemma sign Mboma and the French/Flemish translation equivalents dragon/draak. A better (zoological) knowledge of Kikongo resulted in the fronting of python/reuzenslang as translation equivalent, while a retranslation from the Latin (with dracō 'snake; dragon') further added espèce de serpent/soort slang 'type of snake'. Importantly, there is no entry for 'python' in the manuscript, nor, of course, for 'type of snake', so Van Wing and Penders' two additional translation equivalents are not the result of reversing out Van Gheel's manuscript.

Lexicographical structure in Van Gheel (1652) and Van Wing and Penders (1928)
Van Gheel's dictionary being a manuscript, no typographical variation is present. All the information is written in a single file with, for dictionary articles longer than one line, some slight indentation, as seen in the images from the dictionary reproduced above. The only non-typographical structural marker used is the full stop, which delimits both the languages (Latin vs. Spanish vs. Kikongo, whence the full stop is typically attached to the last word of the respective language), and separates synonyms (in Spanish and Kikongo, whence the full stop is typically surrounded by white space). Full stops are also used with abbreviations, and end dictionary articles (though this overlaps with the end of the Kikongo slot). In contrast, and for all its faults, Van Wing and Penders' published dictionary is a rather advanced product for early 20th century Bantu lexicography. Theirs uses typography (bold vs. Roman vs. italics) to separate the three languages, and also uses many more non-typographical structural markers (commas, semi-colons, colons, full stops, long dashes (in lieu of the more usual tildes), ellipsis, the symbol "./.", as well as round and square brackets). Recurrent orthographic markers that structure the text include "N.B.", "v.g.", and "dans : in :". The latter is especially interesting, as it signals lemma signs which only take on a translatable meaning when combined with other words. Examples are shown in (30) and (31).
('which flows, runny')

This beautifully solves a lexicographic problem in a user-friendly way, by sidestepping the question of lemma-sign status. A variation is shown in (32), where the lemma sign is either Safiru 'sapphire' or etari ria Safiru 'stone of sapphire'.
Vanga riaka, vangulula, refaire ; herdoen. (exemplification of the grammar)

In (33) a general grammatical rule is explained: "To indicate repetition, add the word riaka, or change the final vowel a to ulula".23 The very same grammatical point and exemplification could of course have been added throughout the dictionary, at many a verb with the potential for repetition (in English re-…). It is not clear why Van Wing and Penders decided to include it with this verb only; apart, perhaps, from the fact that it may be the first verb with this feature in the alphabetically-ordered lemma list, but then, no one reads a dictionary from A to Z. It also does not seem to be copied from Van Gheel's manuscript, as no such note can be found under calefacio 'to heat' or liquefacio 'to melt', neither can it be found under ago 'to do, to act' (see (2) above) or facio 'to do, to make' (the equivalents of Van Wing and Penders' Kikongo vanga/bhanga).
Other grammatical points in Van Wing and Penders do find their origin in Van Gheel's manuscript, as may be seen from a comparison of (34a) with (34b) in terms of the clarification "always requires to be specified further". Example (34) is an excellent illustration of how an entire article from Van Gheel's manuscript was reversed out by Van Wing and Penders. All the information seen in (34b) is derived from (34a), but the reverse is not fully true, as one also needs (34c) in addition to (34b) to complete the information that came from (34a). What is also missing from Van Wing and Penders throughout is the part of speech of each lemma; although one could argue that this is implicit in their treatment (nouns being followed by an indication of how to form their plural, verbs by their first person praeteritum, etc.). Here one dictionary article in the manuscript straightforwardly gave rise to two in the dictionary; one thus deals with divergence.
Examples of convergence also abound, whereby different slots from a series of dictionary articles from the manuscript were combined into one by Van Wing and Penders. In one such case the compilers even included a Latin slot, between square brackets, indicating where the information came from in Van Gheel's manuscript, namely liberare (cf. (10)). If one looks at all the other translation equivalents in, for instance, (10), it should be clear that Van Wing and Penders often had to make use of both divergence and convergence simultaneously, in a first phase taking out each Kikongo word from Van Gheel's manuscript and translating that into French and Flemish via Latin (divergence), and in a second phase collapsing the material that belongs to single dictionary articles (convergence). Over and above this, they added their own material (nonvergence). The result of this approach to compiling a dictionary is that Van Wing and Penders' publication not only looks more dictionary-like but also contains more data. Indeed, what sets Van Wing and Penders most apart visually is their often long lists of combinations, a short version of which is shown in (36).
(playing cards)

While the lemma and its translation equivalent, as well as the first and last combination, have been taken from Van Gheel's manuscript, all the combinations in-between have been added by Van Wing and Penders. No wonder Van Gheel's manuscript of 243 pages grew to 361 printed pages in Van Wing and Penders.
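The mechanics of divergence and convergence during a reversal can be made concrete with a small Python sketch. It is a deliberately simplified illustration, not a reconstruction of Van Wing and Penders' actual working method: the function name reverse_out is ours, the Spanish slot and the French/Flemish translation step are left out, and the two toy articles only loosely follow (18) and (19), with the spelling mulongua normalized and the equivalents of discipulus cut down to two.

```python
from collections import defaultdict

# Toy articles in the Latin -> Kikongo direction (simplified, see lead-in).
source_articles = {
    "cathecumenus": ["mulongua", "músonguela"],
    "discipulus": ["muana a mucanda", "mulongua"],
}


def reverse_out(articles):
    """Turn Latin-headed articles into Kikongo-headed ones."""
    reversed_articles = defaultdict(list)
    for latin, kikongo_forms in articles.items():
        # Divergence: one source article feeds several reversed articles.
        for form in kikongo_forms:
            # Convergence: several source articles feed one reversed article.
            if latin not in reversed_articles[form]:
                reversed_articles[form].append(latin)
    return dict(sorted(reversed_articles.items()))


for kikongo, latin_sources in reverse_out(source_articles).items():
    print(kikongo, "<-", ", ".join(latin_sources))
```

In this toy case the article for cathecumenus diverges into two Kikongo-headed articles, while the article for mulongua converges material from both Latin lemma signs. Nonvergence, the material added by the later compilers, has by definition no counterpart in the source and therefore cannot be produced by such a procedure.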

7. Bringing everything together: the KongoKing Database (2012)

Reading through Van Gheel's (1652) Latin/Spanish → Kikongo manuscript, there can be no doubt about its intended target user: it is an active, encoding dictionary meant to help the missionary produce Kikongo. The main compiler was very likely Roboredo, a Capuchin born in the Kongo. In the front matter to their Kikongo → French/Flemish dictionary, the Belgian Jesuits, Van Wing and Penders (1928: xxxii), are also clear about their goal: it is meant to be a scientific work for both Bantuists and missionaries, which is why they chose Kikongo as a source language, and French and Flemish (the two official languages of Belgium, the colonial power at the time) as target languages. The towering Bantuist Clement Doke was scathing about their effort (see Section 4 above). In the absence of any other edition of Van Gheel's manuscript, however, it has been the only entry point to it for over 80 years now, and as we saw, it has indeed been (mis)used during that period. With roots in both the 17th century and the turn of the 19th to the 20th century, it is also a valuable dictionary in its own right. Today, in 2012, there is a renewed interest in getting easy access to this early Kikongo data as a result of the launch of the KongoKing research project. The interdisciplinary KongoKing team wishes to shed new light on the origins, rise and development of the early Kongo kingdom, by combining and coordinating pioneering archaeological fieldwork in Angola and Congo with novel historical linguistics research. To that end, a digital transcription of Van Gheel's manuscript as well as the digitization of Van Wing and Penders' dictionary has become a necessity. Keeping the need for a long-due critical edition of Van Gheel's manuscript in mind, and the digital reality of the 21st century, we opted for using the dictionary production system TLex (aka TshwaneLex, cf. De Schryver 2011). With the aim to allow for cross-searches and with future multimedia extensions in mind, we also opted to work in a single database. TLex has a feature (called linked-view mode) that can automatically connect distinct dictionaries that are stored in a single database, and linking them through a common language is the ideal route. Given that Kikongo is the only language Van Gheel's manuscript and Van Wing and Penders' dictionary have in common, and given that Kikongo is the main language of interest to the project, one would be tempted to opt for it as the linking language. However, given the varying Kikongo orthographies of the two reference works, it seems better to abstract to a stable language or formalism. In this respect, we are in luck in that we actually have such a language: it is Latin. Recall that we pointed out in Section 5.5 above that Van Wing and Penders also used Latin as an underlying layer during their compilation. In practical terms, by adding a (hidden) Latin slot to the data of Van Wing and Penders, it is possible to automatically coordinate both dictionaries in an electronic environment, and to see the divergences, convergences and nonvergences described in Section 6 above. In metalexicographical terms this amounts to a variation of the hub-and-spoke model (Martin 2004), whereby a hub-language is used to create a series of bilingual dictionaries between it and several spoke-languages, which then allows for a combination of the spokes amongst one another, invisibly through the hub. Latin is our hub-language, but only partly hidden: hidden in Van Wing and Penders, but visible (as the source language) in Van Gheel.
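The linking principle can be illustrated with a short Python sketch. It does not use, and makes no claim about, the TLex data format or its linked-view implementation: the class names, field names and the function link_via_hub are hypothetical, and the single pair of articles is a simplified rendering of the draco/nboma case discussed in Section 5.5.

```python
from dataclasses import dataclass, field


@dataclass
class VanGheelArticle:
    latin: str            # visible hub: the source language of the manuscript
    kikongo: list[str]


@dataclass
class VanWingPendersArticle:
    kikongo: str
    french: list[str]
    flemish: list[str]
    # Hypothetical hidden slot holding the underlying Latin lemma sign(s).
    hidden_latin: list[str] = field(default_factory=list)


def link_via_hub(manuscript, dictionary):
    """Group articles from both works under their shared Latin hub form."""
    links = {}
    for article in manuscript:
        links.setdefault(article.latin, ([], []))[0].append(article)
    for article in dictionary:
        for latin in article.hidden_latin:
            links.setdefault(latin, ([], []))[1].append(article)
    return links


manuscript = [VanGheelArticle("draco", ["nboma"])]
dictionary = [
    VanWingPendersArticle(
        "Mboma",
        ["python", "dragon", "espèce de serpent"],
        ["reuzenslang", "draak", "soort slang"],
        hidden_latin=["draco"],
    )
]

for latin, (older, newer) in link_via_hub(manuscript, dictionary).items():
    print(latin, [a.kikongo for a in older], [a.kikongo for a in newer])
```

The point of routing the link through Latin is visible even in this one pair: nboma and Mboma never have to be matched against each other, so the differing Kikongo orthographies of the two works do not get in the way.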
The digitization of Van Wing and Penders has already been completed. Their publication was scanned and OCRed, and then parsed for importation into TLex. In one of the views (TLex allows for any number of dictionary 'views' of the database data) the printed dictionary is mimicked, typography, punctuation and all, though underlying that, extra slots have been provided for Latin (the linking language), as well as for various aspects needed in the KongoKing project such as fields for the addition of the Proto-Bantu forms, various semantic label sets, cross-references to material about corresponding archaeological finds, cross-references to corresponding academic papers, etc.
The digital transcription of Van Gheel's manuscript is ongoing. A major difficulty here is the poor readability of the original, as well as the rather haphazard use of a flat lexicographic structure. This necessitates occasional changes to the DTD (or document type definition, i.e. the dictionary grammar). The positive aspect, though, is that a rigid structure is being imposed onto the manuscript data in the process, with every part of the data ending up in its proper dictionary slot. In addition to the transcribed but now structured material, images of the original entries also accompany each dictionary article. A notes field was also added, used to point out uncertainties, errors, etc. as in a traditional (i.e. paper) critical edition. A screenshot of the two dictionaries in linked-view mode in TLex may be seen in Addendum 4.
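What imposing a rigid structure amounts to can be sketched as a simple slot check in Python. The slot inventory below is hypothetical and far smaller than any real dictionary grammar; it is not the project's DTD, and the names ALLOWED_SLOTS, REQUIRED_SLOTS and check_article are ours. The first toy article follows (12b); the second is a deliberately incomplete transcription, invented to trigger the checks.

```python
# Hypothetical slot inventory; the project's actual DTD is not reproduced here.
ALLOWED_SLOTS = ("latin", "spanish", "kikongo", "note", "image")
REQUIRED_SLOTS = ("latin", "kikongo")


def check_article(article):
    """Return a list of structural problems found in one transcribed article."""
    problems = []
    for slot in article:
        if slot not in ALLOWED_SLOTS:
            problems.append(f"unknown slot: {slot}")
    for slot in REQUIRED_SLOTS:
        if not article.get(slot):
            problems.append(f"missing slot: {slot}")
    return problems


purgo = {
    "latin": "purgo",
    "spanish": ["desculparse"],
    "kikongo": ["cúicússula"],
    "note": "unaccented form; cf. púrgo 'to purify'",
}
incomplete = {"latin": "leo", "plural": "id."}

print(check_article(purgo))
print(check_article(incomplete))
```

An article that uses a slot the grammar does not yet know either reveals a transcription error or shows that the grammar itself has to be extended, which is the situation described above.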
Having first moved the field of Bantu lexicography back by two centuries, it is now exciting to witness the recreation and digitization of the very first extant Bantu dictionary. As a work in progress, it will be made available on the KongoKing website, at which point the oldest Bantu dictionary and its 19th-20th century rework will not only be searchable in five languages, but also searchable using any combination or restriction of lexicographic metalanguage (such as word classes or semantic fields), and it will moreover function as a stepping stone towards new, multimedia data that aims to uncover the Kongo history of what came before the compilation of this first Kikongo dictionary. This fitting digital lexicographic capstone, then, is only the beginning of writing Kongo's early history.

20. Interestingly, in Van Wing and Penders' (1928) reversed-out version, the wolves and foxes are still featured, even though the earlier Bentley (1887), to which they had access, got it right talking about hyenas and jackals only.

21. Nor bears and tigers for that matter, given that both ursus and tigris have also (wrongly!) been given the translation equivalent ngo, as in (25). Bears and tigers also feature in Van Wing and Penders, but