The Lexicographic Treatment of Ideophones in Zulu

Abstract: The ideophone, a word class not unique to but highly characteristic of the Bantu languages, presents particular challenges in both monolingual and bilingual lexicography. Not only is this part of speech without a counterpart in most other languages, but the meaning of ideophones is also highly elusive. In this research article these challenges are studied by means of an analysis of the treatment of ideophones in a corpus-driven Zulu–English school dictionary project.

Keywords: LEXICOGRAPHY, DICTIONARY, BILINGUAL, CORPUS, FREQUENCY, BANTU, ZULU (ISIZULU), ENGLISH, IDEOPHONE, SEMANTIC IMPORT, PARAPHRASE, PART-OF-SPEECH MISMATCH


1. The coinage and meaning of the term 'ideophone'

Although the term 'ideophone' has not been entered as a lemma sign in the second edition of the Oxford English Dictionary (OED Online 2009), it may be found within the article of the combining form 'ideo-', as shown in (1):

(1) ideo-, combining form of Gr. ἰδέα IDEA, as in […] ideophone (-fəʊn) [Gr. φωνή voice, sound], (a) term used by A. J. Ellis (in contradistinction to ideograph) for a sound or group of sounds denoting an idea, i.e. a spoken word; (b) a term used principally in Bantu linguistics to refer to particular classes of onomatopoeic and sound-symbolic words found in these languages; so ideophoˈnetics, the subject of 'ideophones'; hence ideoˈphonic a.; ideophonous (-ɒfənəs) a., relating to spoken words as sounds denoting ideas; […]

The OED's citation evidence starts with: "1881 A.J. ELLIS Synops. Lect. Lond. Dialectical Soc. 2 Nov., Mimetics, ideographics, and *ideophonetics. Fixed ideograph, variable *ideophone, and their connection." Unfortunately, the reference to the Ellis source is a dead cross-reference. However, according to the OED lexicographer Jesse Sheidlower: "The OED is citing a printed card announcing two of the London Dialectical Society's November meetings, mailed by Ellis to James Murray (they were friends), and subsequently deposited by Murray in the OED archives" (Language Hat 2008), and he points out that: "We'll be clarifying our bibliography to show that this is not a published item" (The Ideophone 2008). First recorded in 1881, then, the meaning assigned to the term 'ideophone' under (1, sense a) was redefined as (2) in the Century Dictionary Supplement of 1909 (Century Dictionary Online 2009):

(2) ideophone (ī-dē'ō-fōn), n. [Gr. ἰδέα, idea, + φωνή, sound.] In phonetics, the auditory symbol of a word or phrase that is perceived as a whole and thus constitutes a single idea. Ideophones are distinguished as sensory or motor, according as the sound or group of sounds corresponding to the word or phrase is heard or spoken. See *ideogram, 2. First used by A.J. Ellis. Scripture, Exper. Phonetics, p. 132.

Since Doke's (1935) publication of Bantu Linguistic Terminology, however, the meaning of the term has been considerably expanded, as seen in the oft-quoted opening section of Doke's definition: 1

(3) IDEOPHONE (Idéophone) [Ideophon]. A vivid representation of an idea in sound. A word, often onomatopoeic, which describes a predicate, qualificative or adverb in respect to manner, colour, sound, smell, action, state or intensity. The ideophone is in Bantu a special part of speech, resembling to a certain extent in function the adverb, together with which it is classified as a descriptive.
In addition to the expansion in meaning compared to the earlier definitions, Doke's more important addition to the concept of ideophones is that he underscores the special status accorded to ideophones in Bantu, namely that they are a word class distinct from the other parts of speech (POSs). No wonder, then, that the OED's definition (1, sense b) singles out Bantu linguistics, even though ideophones (or at least aspects of them) are found in many (if not all) of the world's languages.
In reviewing the Bantu literature, Doke noted that various authors had suggested many a term for the ideophone. He lists: radical, descriptive adverb, descriptive complement, indeclinable verbal particle, intensive interjection, interjectional adverb, onomatopoeic vocable, onomatopoeic adverb, onomatopoeia, onomatopoeic substantive, mimic noun, indeclinable adjective, etc. (Doke 1935: 119). Ironically, Weakley (1973: 2) correctly concludes:

Of all the above terms onomatopoeia is probably one of the least suitable. It would not even be appropriate I should imagine, to say the ideophone is usually onomatopoeic. As Fortune points out: "The fact among others, that ideophones can be used to indicate complete silence makes the term onomatopoeic an unrepresentable term for ideophones as a whole."

In another review of the Bantu literature, Samarin (1971), summarized in Weakley (1973: 7) […]

So far, the focus has been on semantic (and pragmatic) aspects of ideophones, a logical by-product of looking at dictionary definitions. Even more striking, however, are the unique phonological, morphological and syntactic aspects of ideophones, all of which have received considerable attention in the scientific literature. For a global perspective on these aspects, see the collection edited by Voeltz and Kilian-Hatz (2001), and especially the 'Bibliography of ideophone research' therein. The most recent and continuously updated source on ideophones is doubtless the blog The Ideophone by Mark Dingemanse, who includes the function of ideophones in his research. His most recent working definition (June 2009) for the ideophone is: "marked words that vividly evoke sensations and perceptions" (Dingemanse 2009).

2. Ideophones in (Zulu) metalexicography
With now well over a century of linguistic research into the world's ideophones, one can fairly say that this class of words is beginning to be rather well described linguistically. Lexicographic aspects, however, have unfortunately largely been ignored. To the best of our knowledge, only two serious attempts have been made, both in Lexikos, by Childs (1993: 21-23) and Khumalo (2002: 270-271).
Childs starts his discussion of ideophones in Kisi as follows: "Ideophones pose enormous problems to the lexicographer because of their monumental variation and semantic indeterminacy." He then goes on to pose some important questions, including: Is it the lexicographer's task to faithfully record the fact that ideophones exploit prosodic resources? What does one do when the ideophone seems to have no independent meaning? Given the close relationship between ideophones and gestures, how must gestures be represented? Unfortunately, Childs does not give any answers to these questions. Furthermore, while phonological and pragmatic aspects are indeed important, lexicography is primarily concerned with meaning, so of these questions an answer to the second one is most needed.
In that sense, Khumalo's discussion of ideophones is more revealing. Describing the compilation of a monolingual (Zimbabwean) Ndebele dictionary, he shows how three distinct defining formats were developed for ideophones, formats patterned on the COBUILD style of writing definitions (cf. Hanks 1987).
With specific reference to bilingual lexicography, one only finds trivial statements, with no solutions whatsoever. For instance, all that is said by Gauton (2008: 112) on ideophones is: "Languages differ in their parts of speech. For example, a language such as Zulu distinguishes the word category 'ideophone' which does not exist in a language such as English." Or by Jadezweni (1998: 323): "The difficulty of providing English equivalents for Xhosa ideophones can be a nightmare to learners […] It requires a lot of imagination to be able to come up with explanations of ideophones."

Of course, ideophones have been entered into dictionaries, and in dictionaries for the Bantu languages, they have typically been termed 'ideophone' in the slots for the word classes too. What is missing is a proper metalexicographic analysis, especially one that bridges a Bantu language (where ideophones are a distinct part of speech) with a non-Bantu language.
This is exactly what is done in the present contribution. As a case study, the ideophones in a Zulu to English school dictionary are looked into. As such, this study forms part of a series of studies in which each of the various Zulu word classes is analysed from a lexicographic point of view. Earlier instalments in the series looked into the possessive pronouns (De Schryver and Wilkes 2008), the adjectives (De Schryver 2008), and the quantitative pronouns (De Schryver 2008a).
While the number of ideophones in each (Bantu) language varies, it is generally assumed that there are 'many', or that they at least "represent a sizeable proportion of a language's lexicon" (Childs 1994: 179). That this is indeed the case is not disputed, but current dictionaries (for Bantu languages generally, and for Zulu particularly) may well over-represent ideophones to the detriment of words in other word classes. There is likely a sociolinguistic underpinning for this, as Weakley (1973: 9) observes: "It seems that the proper use of ideophones can be correlated with a mastery of the language concerned." 2 In earlier times, lexicographers who worked without access to large electronic corpora may thus very well have overdone it by including in their dictionaries as many ideophones as they could, just as they tended to stock their dictionaries with (often rare) idioms and proverbs. Conversely, it may also be the case that ideophones are used more often in spoken than in written language, and thus that the modern corpus-driven approach to dictionary making will actually under-represent the ideophones, given that corpora are mostly built up from written sources.

Fivaz (1963) proceeded to count all ideophones in the largest dictionary available for Zulu, viz. Doke and Vilakazi's (1953) Zulu-English Dictionary, and arrived at 2 600 ideophones. With about 30 000 lemmas in that dictionary, 3 this corresponds to a massive 8.67%. Based on the occurrence frequencies in our 8.5-million-word written Zulu corpus, however, we conclude that ideophones in (written) Zulu are not particularly frequent: Only five make it into the top 1 500 lemmas, with exactly 100 in a dictionary covering the 5 000 most frequent lemmas. In our project, and expressed in per cent, the ideophones thus make up 2.00% of the lemmas (100 out of 5 000, compared to 2 600 out of 30 000 or 8.67% in Doke and Vilakazi).
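The two proportions quoted above can be re-derived in a few lines. What follows is only a minimal Python sketch: the function name `share_percent` is our own (hypothetical) choice, and the counts are simply the figures reported in the text.

```python
def share_percent(ideophones: int, lemmas: int) -> float:
    """Return the share of ideophones among all lemmas, in per cent (2 decimals)."""
    return round(100 * ideophones / lemmas, 2)


# Figures as reported in the text (not recounted here).
doke_vilakazi = share_percent(2600, 30000)  # Fivaz's count for Doke and Vilakazi (1953)
school_dict = share_percent(100, 5000)      # corpus-driven school dictionary project

print(f"Doke and Vilakazi: {doke_vilakazi:.2f}%")
print(f"School dictionary: {school_dict:.2f}%")
```

Expressed this way, the pre-corpus dictionary devotes more than four times the relative space to ideophones than the corpus-driven lemma list does.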
With the present study, then, the Zulu word classes considered from a lexicographic point of view so far are shown in Figure 1, with the possessive pronouns representing 1.98%, the adjectives 2.52%, the quantitative pronouns 0.66%, and the ideophones 2.00%. Although another 'small' category in dictionary terms, the metalexicographic description of ideophones is highly relevant. For one, in our Zulu to English dictionary project, their compilation took on average three times longer than the compilation of entries in any other word class. It is no exaggeration, then, to say that in Bantu lexicography ideophones are a lexicographer's worst nightmare. This is so, not because of their peculiar linguistic properties, be these phonological, morphological or syntactic (see for example, with specific reference to Zulu: Fivaz (1963), Voeltz (1971), Von Staden (1974, 1977), Taljaard and Bosch (1993: 162), Childs (1996), Poulos and Msimang (1998: Chapter 8), or Msimang and Poulos (2001)), but because of their semantic import, which is hard to pinpoint, describe and represent lexicographically.
Although our Zulu-English dictionary project is a bidirectional one, the focus will be on the Zulu to English side, as that is the side where the Zulu ideophones are lemmatized. As will be clear from the discussion below, attempting to 'reverse out' ideophones, and thus attempting to force ideophones into the reverse side of the dictionary as lemmas on the English to Zulu side, is futile. This does not mean that there are no Zulu ideophones to be found in the English to Zulu side of the dictionary. When translating into Zulu, mother-tongue speakers often feel the need to introduce them, as may be seen from entries (4) and (5), where highlights have been added for ease of reference.

In the definition for retailer under (4), the meaning of the English adverb 'directly' has been rendered with the Zulu ideophone ngqo 'of straightness, of directness', and in the definition of endangered under (5), the English phrasal verb 'cease to exist' has been rendered in Zulu by combining the verb -phela 'come to an end; get finished; run out' with the ideophone nya 'of complete absence'.
Actually, given that our Zulu corpus also contains sources translated from other languages into Zulu, several attested ideophonic uses found their way into the corpus, and from there into our dictionary, through languages other than Zulu. To give one example: When translating the sentence 'Dawn was breaking and they could just make out several villages dotted about on the open hillslopes', taken from Jenny Seed's (1968) The Voice of the Great Elephant, the translator, N.S. Ntuli (1988), introduced an ideophone, as shown in (6).

This bidirectionality is non-trivial, as it clearly indicates that even though there is no word class 'ideophone' in English, in order to produce idiomatic Zulu, certain English concepts and expressions 'require' the use of ideophones when they are translated into Zulu.

3. The semantic import of (Zulu) ideophones
The last statement in Section 2 automatically leads to two areas in need of further investigation in bilingual lexicography. On the one hand one needs to be able to 'map' the meaning of an ideophone in one language onto a relevant concept or expression in another language, and vice versa. On the other hand one needs to be able to 'map' the part of speech ideophone in one language onto a relevant part of speech in another, and vice versa. Although the part of speech ideophone is rarely problematic in monolingual (Bantu) lexicography (various linguistic tests can and have been designed to pinpoint this word class), delineating the meaning remains an arduous task. In this regard, Childs (1993: 22) rightly points out:

Determining the meaning of ideophones can prove incredibly frustrating to the lexicographer since the meaning of an ideophone requires a context for interpretation much more than other words. In addition, ideophones require for their understanding an intensive knowledge of the language, a knowledge often inaccessible to an outsider (Samarin 1967).

3.1 Stacked paraphrases (rather than translation equivalents)
A concern that recurs in the literature on ideophones is thus that "it is extremely difficult to characterize in a simple way, the meanings of ideophones" (Weakley 1973: 8). After having analysed and described the 100 most frequent Zulu ideophones lexicographically, the golden rule that emerges for dictionary makers is that meanings assigned to ideophones should be neither too broad nor too specific: The art is to establish the right level of generalization as far as the semantic import is concerned, with the examples functioning (true to their core function) as possible instances only. Selected ideophones to illustrate this follow in (7) to (10):

(7) do ideophone ► (of nothing)
♦ Wampaya naso sonke isibhedlela kodwa do ukumthola. • He looked all over the hospital but he didn't find him at all.
♦ Ngaquba kwa-Oom Joe, umsebenzi do. […]

(10) ngqi ideophone 1 ► (of tightness, of security, of holding firm)
♦ Wase ezivalela yena endlini ethi ngqi. […]

The task of the lexicographer, when presented with pagefuls of instances of the use of a particular ideophone, is to try to deduce the meaning from the collective evidence. In context, meanings may be distilled on a generic level, roughly led by the highlights in (7) to (10). The problem with ideophones is that it is nearly impossible to demarcate where the meaning of the context starts and stops, and as a consequence, to clearly pinpoint the true semantic import of the ideophone itself, that is, the ideophone in isolation. (The highlights in (7) to (10) are purposely generous in this regard.) Nonetheless, with enough evidence, one or more generic meanings do appear, and in monolingual dictionary making, these then form the basis for the write-up of the definitions for each sense. To the bilingual dictionary compiler, however, no foolproof translation equivalents are available (except when mapping cognate languages onto one another, or when dealing with languages that each have ideophones as word classes). Rather, each sense of an ideophone in translation dictionaries is merely a paraphrase of the semantic import, with the examples not random but hand-picked instances to substantiate the range of possible uses.

http://lexikos.journals.ac.za
As a result, and as seen in (7) to (10), as well as (6), instead of translation equivalents, paraphrases are provided (in italics, and between brackets), using the convention to start those paraphrases with 'of', typically followed by a verb in the -ing form or an abstract noun. Paraphrases may be stacked, as in (10, sense 1), in order to best cover the meaning seen in the majority of the corpus lines. The example sentences in (6) to (10) are not merely illustrative material (there are no translation equivalents to illustrate anyway); rather, these examples truly support the meanings.
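The paraphrase convention just described lends itself to a simple data model. The following Python sketch is purely illustrative and is not the format used in the dictionary project itself: the class names `Sense` and `IdeophoneEntry` and their methods are hypothetical, and only the plain-text layout (►, ♦, •) and the data of entry (7) are taken from the article.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Sense:
    # Stacked paraphrases, stored without the conventional leading 'of'.
    paraphrases: List[str]
    # (Zulu example, English translation) pairs that support the sense.
    examples: List[Tuple[str, str]] = field(default_factory=list)

    def render_paraphrases(self) -> str:
        """Format the stacked paraphrases as '(of X, of Y, ...)'."""
        return "(" + ", ".join(f"of {p}" for p in self.paraphrases) + ")"


@dataclass
class IdeophoneEntry:
    lemma: str
    senses: List[Sense]

    def render(self) -> str:
        """Render the entry on one line; senses are numbered only if there are several."""
        parts = [f"{self.lemma} ideophone"]
        numbered = len(self.senses) > 1
        for i, sense in enumerate(self.senses, start=1):
            label = f"{i} " if numbered else ""
            parts.append(f"{label}► {sense.render_paraphrases()}")
            for zulu, english in sense.examples:
                parts.append(f"♦ {zulu} • {english}")
        return " ".join(parts)


# Entry (7), with its first example only.
do_entry = IdeophoneEntry(
    lemma="do",
    senses=[Sense(
        paraphrases=["nothing"],
        examples=[("Wampaya naso sonke isibhedlela kodwa do ukumthola.",
                   "He looked all over the hospital but he didn't find him at all.")],
    )],
)
print(do_entry.render())

# Stacked paraphrases as in (10, sense 1).
print(Sense(["tightness", "security", "holding firm"]).render_paraphrases())
```

Keeping the paraphrases as a list, rather than as one pre-formatted string, makes the stacking explicit and leaves the 'of' convention to the rendering step.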

3.2 Part-of-speech mismatch
Compared to monolingual dictionary making, the absence of a corresponding word class in the non-Bantu language forces the bilingual dictionary maker to come up with various strategies to 'translate/transpose' the examples. When the ideophones are onomatopoeic-like, English sound words can be inserted or recourse can be taken to English exclamations. Instances of the former may be seen in (11, sense 3) and (12), instances of the latter in (13) and (14, sense 1):

(11) nsi ideophone 1 ► (of tightness, of security)
♦ Yathatha imbeleko, yabopha umntwana yamuthi nsi emhlane. […]

(12) pho 2 ideophone ► (of dripping, of crying heavily)
♦ Ngalesi sikhathi zase zehla izinyembezi ku-Alice zithi pho pho pho, engazi kodwa ukuthi ukhalelani. • At this time the tears came down from Alice's face – drip drip drip – while she did not know why she was crying.

In most cases, however, the Zulu ideophones need to be translated with English verbs, as in (8) above, or with English phrasal verbs, as in (9, sense 1) or (10, sense 2) above.
At times, Zulu ideophones may also conveniently be translated with noun phrases, as in (9, sense 3) above, or (15) below:

(15) qu ideophone ► (of brief period of time)
♦ Ekupheleni konyaka lowo wake wathi qu ekhaya ebuyele uKhisimusi. • At the end of that year he went home for a short while, returning for Christmas.
Finally, there are instances where the ideophone simply disappears, as it is rendered by a paraphrase in idiomatic English, as seen in (20) and (21):

(20) juqu ideophone ► (of cutting off, of snapping)
♦ UMahommed wayithi juqu ngommese intambo. • Mahommed cut the string with a knife.
♦ Uqwashe kuze kuse juqu! • You must be alert until dawn!

(21) shu ideophone ► (of going right into)
♦ Aphenye izingubo, azithi shu kuzona ikhanda livele kancane. • He turned over the blankets and covered himself, with only his head sticking out a little bit.
▪ [NEGATIVE +] … shu
1 ► (of no meaning, of no use)
♦ Udumo aluthi shu, singayibamba i-Chiefs. • Fame doesn't mean anything, we can stop Chiefs.
♦ Ungalindeli amaphilisi ama-antibiotics owathola kudokotela noma ekliniki, awathi shu egciwaneni lomkhuhlane. • You must not wait for the antibiotic pills you get from the doctor or the clinic; they are of no use when it comes to the flu bug.
2 ► (of silence)
♦ Wabadonsa ngendlebe ukuthi bangayithi shu kumuntu leyo ndaba. • He warned them not to say a word to anyone.

3.3 Ideophones as intensifiers of meanings
Ideophones also tend to 'stress' an aspect that was already mentioned, meaning that their function at that point in the sentence is to give more weight to the action expressed by the verb, the pronoun, etc. In (22), for instance, the ideophone qhwaba 'of being alone' intensifies the exclusive quantitative pronoun of class 1 yedwa 'alone; on her/his own', here prefixed by the relative concord of class 1: oyedwa '(only) one; (only) a single' (cf. De Schryver 2008a: 99). Rather than just '(only) one' on the one hand, and 'of being alone' on the other, put together the meaning in English becomes 'one and only'. Given that the prime target of the Zulu-English school dictionary consists of junior users, such examples are, where possible, avoided for exemplification purposes, but not skipped altogether, and in any case always accompanied by more straightforward examples.

3.4 Ideophones in combinations and fixed expressions
While some ideophones could be said to stand on their own, corpus evidence clearly shows that one out of ten typically combines with specific words in addition, as seen in (17) and (19) above, or in (23) below:

(23) phaqa ideophone 1 ► (of reality, of truth)
♦ Mina ngingumZulu phaqa futhi ngiyaziqhenya ngalokho. […]

Although it is still possible to uncover chains of meaning (from the so-called basic meaning of the ideophone in isolation, to the meaning of the combination), the combined meaning is not transparent enough to leave the combination untreated in a corpus-driven lexicographic description.
For another one out of ten ideophones, the only occurrence is in so-called fixed expressions, as seen in (24) and (25).

Unlike ideophones in isolation, which can only be provided with one or more paraphrases, examples (17), (19), (23), (24) and (25) show that combinations and fixed expressions that include ideophones may successfully be translated with (phrasal) verbs, nouns, adjectives, etc. in English. For these, then, proper translation equivalents may be provided (hence the different typography).

3.5 Ideophones in constructions
Corpus evidence further enables one to record typical patterns or constructions, as seen in (21) above, or (26) below. The corpus evidence indicates that for most patterns ideophones are embedded in a negative construction (as is also the case in (21) and (26)).

4. Corpus vs. non-corpus-driven dictionary compilation of (Zulu) ideophones
Starting with the publication of Looking Up (Sinclair 1987), a considerable amount of scholarly lexicographic literature has been devoted to corpus-driven dictionary making. That the results as well as the corpus-driven dictionaries themselves are (very) different compared to those from the pre-corpus era is evident on all levels, be these levels macrostructural, mediostructural or microstructural, and even with regard to the treatment of the extra (front, middle and back) matter. An example of how differently the extra matter may be approached in corpus-driven Bantu lexicography can be found in De Schryver and Taljard (2007).
The corpus-driven lexicographic treatment of the Zulu ideophones will now be compared with the treatment of ideophones in the two most widely used pre-corpus era dictionaries for Zulu, viz. Doke and Vilakazi's (1953) Zulu-English Dictionary, and Dent and Nyembezi's (1995) Scholar's Zulu Dictionary.

4.1 The macrostructure

Given that Doke and Vilakazi's dictionary contains as many as six times more lemmas (30 000 lemmas vs. 5 000 in our project), and Dent and Nyembezi's dictionary (with 13 500 lemmas 4 ) nearly three times more, one could assume that each of the 100 ideophones in our project is also covered in the existing two dictionaries. This should especially be the case given that those 100 are the hundred most frequently used ones. This assumption, however, is not corroborated, as may be deduced from the data in Table 1.

Table 1 brings together the 100 ideophones lemmatized in our project, together with their corpus frequencies (in 8.5 million words) and frequency bands (i.e. star-ratings in the dictionary), and contrasts this with the information in the dictionaries by Doke and Vilakazi (D&V), and Dent and Nyembezi (D&N). Whenever at least one of the senses mentioned by D&V or D&N is also attested in the corpus, that ideophone has been marked with a tick (✓). Both D&V and D&N only cover about four fifths of the top hundred Zulu ideophones. In D&V a total of 8 are missing (-), and for a further 12 the meanings provided are not seen in the corpus (≠). Note that, since the publication of D&V, the spelling has been adapted; the spelling as found in D&V is therefore also shown next to the ticks. In D&N a total of 11 ideophones are missing (-), a further 8 have been assigned meanings unattested in the corpus (≠), and 2 more are errors (?). Regarding these errors: The ideophone ha 'of extreme action' has been lumped with the homonymous interjection ha 'ha!; gosh!', but the ideophonic meanings given are unattested; and the ideophone qhu 'of exploding/bursting sound; of burning, of being dry' has been misspelled qho (while nonetheless being lemmatized between -qhova and -qhuba).
With regard to a dictionary's macrostructure, one thus notes that even though a dictionary like D&V contains as many as 2 600 ideophones, one fifth of the most frequently occurring hundred ideophones are either missing or have been entered with an unattested meaning. For two of these, however, an incomplete form (chithi rather than chithi saka 'of scattering all over') or the unreduplicated form (qala rather than qalaqala 'of looking left and right') has been lemmatized. The ideophone chithi saka was lemmatized correctly in D&N. One other ideophone missing in D&V has been entered correctly in D&N: fahlafahla 'of speaking briefly'. Conversely, five ideophones entered in D&V are missing from D&N: shu 'of going right into', do 'of nothing', gelekeqe 'of completeness; of sudden action', mpu 'of looking around in search of something', and nci 'of happiness, of surprise'.
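The coverage figures reported for the two dictionaries can be tallied as follows. This is a minimal Python sketch under stated assumptions: the dictionary `coverage_gaps` and the helper `attested` are hypothetical names of ours, and the counts are those given in the running text, not re-derived from Table 1.

```python
TOP_IDEOPHONES = 100  # size of the corpus-derived ideophone list

# Counts as reported in the discussion of Table 1.
coverage_gaps = {
    "D&V": {"missing": 8, "unattested_meaning": 12, "error": 0},
    "D&N": {"missing": 11, "unattested_meaning": 8, "error": 2},
}


def attested(gaps: dict, total: int = TOP_IDEOPHONES) -> int:
    """Number of ideophones with at least one corpus-attested sense."""
    return total - sum(gaps.values())


for name, gaps in coverage_gaps.items():
    n = attested(gaps)
    print(f"{name}: {n} of {TOP_IDEOPHONES} attested ({100 * n / TOP_IDEOPHONES:.0f}%)")
```

On these figures the two dictionaries end up at almost the same level of coverage, despite their very different sizes.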
Although it may not immediately be apparent from the statistics presented in Table 1, the semantic characterization found in D&N (✓) is on the whole better than that found in D&V (✓). Indirectly, this may be deduced from the fact that four ideophones with unattested meanings in D&V (≠) have attested meanings in D&N (✓). Of course, one should not rule out the possibility of meaning shifts and changes. Even though the corpus used spans sources from the 1930s to the present (cf. De Schryver 2008: 69), it is entirely plausible to assume that Doke's data collection included ideophones that were common before the 1930s: After all, Doke's D.Litt. dissertation, The Phonetics of the Zulu Language, was published in 1926. A shift in meaning may for example be seen for the ideophone khumu: […]

The absence from both dictionaries of common ideophones, such as qakala, used in the fixed expression -gqizi qakala 'not care at all', or phecelezi 'of saying differently', is regrettable. While the latter may be found in two monolingual Zulu dictionaries, Nyembezi's (1992) Isichazimazwi sanamuhla nangomuso, as well as Mbatha's (2006) Isichazamazwi sesiZulu, other ideophones are again absent there. A corpus-driven approach to the macrostructure, then, truly helps the lexicographer in decisions on what to include and what to omit.

4.2 The mediostructure
With regard to a dictionary's mediostructure, corpus (frequency) information enables one to connect related aspects dispersed throughout the dictionary text. With regard to ideophones, cross-references may be employed to refer the dictionary user from less frequent variants to more frequent ones (e.g. […]).

4.3 The microstructure
Less trivial than the macrostructural differences discussed in Section 4.1 are the microstructural ones. In this regard, the ticks in Table 1 do not imply a perfect correspondence between the meaning(s) uncovered in our corpus-driven study and the meaning(s) recorded in the existing dictionaries. Two examples follow to illustrate this; first the treatment of qho as found in D&V in (27), in D&N in (28), and according to our analysis in (29):

(27) qho (8-9) ideo. [> qhoza; qhoqhoza; uɓuqhoqhoqho; uqhoqhoqho.] […]

[…]kathi lathi: "qho, qho, qho." • He put the smoking horn in his mouth and pulled and pulled, until it sounded "puff, puff, puff" inside.
The analysis seen in (29) summarizes the corpus evidence, with the so-called 'senses' ordered by corpus frequency. In our analysis, a total of four general meanings were uncovered. Each meaning is typically supported by one real example from the corpus, sometimes more when the evidence indicates a wide semantic range (as for the second sense in (29)). Looking back from (29) to (27), one notices that D&V's first two senses are actually two instances of a more generic meaning, 'of repetitive sound', and should thus actually have been lumped. That sense is not the most frequent one. The more frequent meanings, 'of realness, of trueness' and 'of directness', have however been missed by D&V, with only the third meaning overlapping in both studies. When one now considers D&N in (28), one notices that the first two meanings seen in the corpus are again absent, with the next two being variations in (non-generic) formulation.

As a second example, see the treatment of ngqa as found in D&V in (30), in D&N in (31), and according to our analysis in (32):

(30) ngqa (3-8) ideo. [> umngqaka; ngqáɓalazi.]
1. of looking straight in the face; of seeing for the first time. Lomuntu ngimuthi ngqa ukumɓona (I see this person for the first time).
2. of brightness, brilliance. Indlu ekhanyiswa ngogesi ithi ngqa (A room lit by electricity is brilliant).
3. of finishing off at a stroke. Wayithi ngqa ingilazana yotshwala (He drunk off at a single draught the glass of beer).
For the ideophone ngqa corpus evidence overwhelmingly points to one specific fixed expression, as seen in the screenshot reproduced in Figure 2. With a frequency of 564 and a rank of 1 272, the ideophone ngqa belongs to the top 1 500 lemmas in our project, hence the star rating (*) in (32). When this highly frequent ideophone is preceded by the verb -qala 'begin; start; commence', the meaning of -qala ngqa is 'do for the very first time'. This, then, is also the evidence summarized in the dictionary article shown in (32). D&N in (31) got the treatment almost right: They still focus on 'seeing' for the first time rather than the more generic 'doing' for the first time. D&V in (30), however, only vaguely approach the evidence with their first sense. The other senses offered by D&V are very rare: Only about 30 (out of 564!) corpus lines have a meaning that is different from the one seen in (32).

Without wanting to discredit the great contribution to Zulu lexicography by D&V (a dictionary praised for its detailed linguistic description), as well as D&N (a dictionary praised for its inclusion of numerous relevant combinations and fixed expressions), it should be clear that an electronic Zulu corpus enables the dictionary compiler to take especially the semantic aspects of the Zulu lexicon to the next level. Large amounts of real evidence enable a far more precise delineation of the semantic import; not only of single items (cf. Sections 3.1 and 3.2), but also of items frequently collocating with other items (cf. Sections 3.3 and 3.4), or patterns colligating (cf. Section 3.5).
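To make the weight of the ngqa evidence concrete, the sketch below converts the reported counts into shares and orders the meanings by frequency, as is done for the senses in our entries. It is an illustration only: the function `sense_shares` and the labels are hypothetical, and the figure of 30 deviating corpus lines is the approximate count given above ('about 30'), so the resulting percentages are approximate as well.

```python
def sense_shares(total_lines: int, other_lines: int) -> dict:
    """Split corpus lines into the dominant meaning and all other meanings (per cent, 1 decimal)."""
    dominant_lines = total_lines - other_lines
    return {
        "dominant": round(100 * dominant_lines / total_lines, 1),
        "other": round(100 * other_lines / total_lines, 1),
    }


# Reported frequency of ngqa (564) and the approximate number of deviating lines (about 30).
shares = sense_shares(total_lines=564, other_lines=30)
print(shares)

# Ordering meanings by corpus frequency, most frequent first.
sense_counts = {"other meanings": 30, "-qala ngqa 'do for the very first time'": 534}
ordered = sorted(sense_counts.items(), key=lambda item: item[1], reverse=True)
for label, count in ordered:
    print(f"{count:>4}  {label}")
```

On these approximate figures, roughly nineteen out of every twenty corpus lines for ngqa instantiate the fixed expression -qala ngqa.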

Discussion
In his linguistic study of the Zulu ideophone, Von Staden (1977: 195) summarized the semantic aspects as follows: Semantically, ideophones have a binary function. On the one hand, ideophones tend to be more explicit than corresponding non-ideophonic forms; an implication of this feature is that they also intensify meanings. On the other hand, ideophones also differentiate more precisely within a specific semantic field. Many ideophones have a great number of semantic variants, whilst quite a number of synonyms and homonyms are also found. Certain voice quality features and gestures play an important supporting role especially in respect of the semantic aspect of differentiation.
Although it is tempting to accept Von Staden's dual view of ideophones, the second aspect (i.e. that ideophones would also differentiate more precisely within a specific semantic field) is not corroborated by large amounts of corpus data for Zulu. Rather, ideophones acquire a specific meaning only when combined with other parts of speech, especially verbs. Or, as Childs (1993: 22) puts it: "Semantically, ideophones can do as little as simply underscore the meaning of the verb with which it has a close collocational association." The first part of the dual view is attested by corpus data, and is apparent from the lexicographic data presented throughout this article. That there are many semantic variants, synonyms and homonyms is mostly confirmed. One particularly 'popular' ideophonic sense is 'of extreme action'. In the text so far, this meaning has been encountered for the ideophones mpo (9), nsi (11), nse (17), and ha (Section 4.1). It is also attested for the ideophones bhe, hluthu, mbo, ngci, qingqo, wu, and zwi. (Note that these eleven ideophones are not (necessarily) freely interchangeable for this meaning: there is a need for real examples that show typical environments.) Our lexicographic carving up of the homonymy-polysemy (dis)continuum is rather uncommon, in that we decided to group all so-called 'senses' of a particular orthographic form under that form, even when one would be able to recognize different homonyms on phonetic, morphological or even semantic grounds.5 This decision was taken with the target user group in mind (for whom a single listing is easier to process), and was also inspired by the fact that the assigned meanings are generic, which excludes polysemy by definition.
Lastly, although the supporting role played by voice quality and gestures was not studied (these are aspects that are absent from the now common but limited text-only corpora), given the 'vivid' aspect of ideophones, this part of Von Staden's summary can readily be accepted as correct. In a paper dictionary such aspects could be covered in the extra matter, while ideophones are of course prime candidates for audio and especially video illustrations in an electronic dictionary (cf. De Schryver 2003: 165-167).
In conclusion, then, although ideophones are seen as a lexicographer's worst nightmare, a careful corpus-driven study enables the dictionary compiler to present dictionary articles for ideophones that 'look' similar to the articles for lemmas in other word classes. In reality, however, the traditional (monolingual) definitions and (bilingual) translation equivalents are actually carefully crafted, generic paraphrases. These paraphrases are supported by hand-picked authentic examples, meant to substantiate the possible range of each set of paraphrases. Where relevant, the ideophonic import may be, and has to be, stemmed by means of the inclusion of common combinations, fixed expressions, and patterns colligating, at which point generic meanings morph into highly precise vivid language. Specifically for bilingual lexicography with a Bantu language as the source language, the presumed problem of the nonexistence of a word class 'ideophone' in the target language becomes irrelevant from the moment paraphrases rather than translation equivalents are employed.

1.
Although oft-quoted, hardly any scholars seem to go back to the original source, rather copying mistakes in typography and spelling from one another, and even missing out on crucial words (like 'sound'). Typographical errors may be found in the OED citation, missing and misspelled words in, for example, Childs (1994: 180) and Allan (2001: 139).

2.
For more on sociolinguistic aspects of (Zulu) ideophones, see the highly revealing study by Childs (1996).

3.
To determine the total number of lemmas in Doke and Vilakazi's (1953) Zulu-English Dictionary, every fiftieth page was sampled starting with page 50, and the page average (32.67 lemmas per page) was multiplied by the total number of pages (918 pages), resulting in an extrapolated 29 988 lemmas overall.

4.
To determine the total number of lemmas in Dent and Nyembezi's (1995) Scholar's Zulu Dictionary, the first page and every subsequent fifteenth page were sampled, and the page average (62.13 lemmas per page) was multiplied by the total number of pages (218.5 pages), resulting in an extrapolated 13 576 lemmas overall.

5.
For example, the two so-called 'senses' in (19), 'of straightness, of directness' and 'of knocking', are clearly derived from different verbal stems.
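
The page-sampling extrapolation described in notes 3 and 4 can be sketched as follows. This is only an illustrative reconstruction in Python: the function names are hypothetical, the per-page averages are the rounded figures quoted in the notes rather than the raw counts (which are not given), and treating the 218.5 pages of Dent and Nyembezi as ending on page 219 is an assumption.

```python
def sampled_pages(first_page: int, step: int, last_page: int) -> list[int]:
    """Return the page numbers sampled at a fixed interval (hypothetical helper)."""
    return list(range(first_page, last_page + 1, step))


def extrapolate_lemma_count(mean_lemmas_per_page: float, total_pages: float) -> int:
    """Estimate the total lemma count as page average times number of pages."""
    return round(mean_lemmas_per_page * total_pages)


# Doke and Vilakazi (1953): every fiftieth page, starting with page 50.
dv_pages = sampled_pages(first_page=50, step=50, last_page=918)
dv_estimate = extrapolate_lemma_count(32.67, 918)

# Dent and Nyembezi (1995): first page and every subsequent fifteenth page.
# Assumption: the 218.5 pages are treated as ending on page 219.
dn_pages = sampled_pages(first_page=1, step=15, last_page=219)
dn_estimate = extrapolate_lemma_count(62.13, 218.5)

print(len(dv_pages), dv_estimate)
print(len(dn_pages), dn_estimate)
```

With the rounded averages, the sketch yields 29 991 and 13 575 lemmas, within a few lemmas of the 29 988 and 13 576 reported in the notes. The small differences would be consistent with the published averages having been rounded to two decimals, although the notes do not say so explicitly.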

Figure 1: Zulu POS categories studied from a lexicographic point of view

gqwa ideophone ► (of being sparse, of being dotted) ♦ Kwase kusa, isibonakala imizi ithe gqwa gqwa laphaya emaqeleni. • Dawn was breaking and they could just make out several villages dotted about on the open hillslopes.
• And then they rolled over laughing: hi, hi, hi, hi, hi, hi, hi, hi.

Table 1: Top 100 Zulu ideophones from a corpus and dictionary perspective

qhwa (freq. 63) > qwa (freq. 233), used in -mhlophe qwa 'bright-white; snow-white; very white'); or to cross-refer closely related forms, both in terms of orthography and meaning (e.g. ngqo 'of straightness, of directness; of knocking' vs. nqo 'of knocking; of being right on top; of precision'); or to indicate how one ideophone is derived from another, often through reduplication (e.g. qathatha 'of falling down on a certain spot; of arriving at an exact time' < qatha 'of falling down, of dropping; of arriving; of being solid'); or finally to show how one ideophone may form part of another (e.g. saka 'of intensity, of emphasis' vs. chithi saka 'of scattering all over').