Management and Internal Standardization of Chemistry Terminology: A Northern Sotho Case Study



Contextualization
One of the many implications of the process of language democratization which started post-1994 in South Africa is the empowerment of the previously marginalized South African Bantu languages to become languages of higher functions, i.e. languages of learning and teaching, and also of scientific discourse. This in turn implies the development, consolidation and especially standardization of terminology for each of these languages, and the compilation of LSP dictionaries. Rising to this challenge, the Suid-Afrikaanse Akademie vir Wetenskap en Kuns (SAAWK) initiated a project that had as one of its aims the compilation of a Quadrilingual Explanatory Dictionary of Chemistry. A detailed account of issues pertaining to the planning of this dictionary is given in Taljard and Gauton (2000) and will not be repeated here; suffice it to mention briefly that the intended target users of this dictionary are senior secondary school learners and undergraduate chemistry students. Subject field experts, in cooperation with a team of terminographers, compiled an English lemma list of 500 high frequency chemistry terms to be included in the envisaged dictionary. Terminological definitions were compiled in English for each of these terms, followed by the provision of Afrikaans term equivalents and the translation of the definitions into Afrikaans. The SAAWK then approached the Department of African Languages at the University of Pretoria to assist with the translation of the terms and their definitions into Northern Sotho (as representative of the Sotho languages) and Zulu (as representative of the Nguni languages).

This project was subsequently incorporated into the M.A. (course work) programme of the Department of African Languages. Each student participating in the project received 50 terms and their definitions in English (and Afrikaans), which constituted the source text that had to be translated into the two languages mentioned above. Two teams of ten participants each, one for each language, were therefore envisaged to participate in the project. Participants had a dual role to fulfil, i.e. that of both translator and terminologist. The nature of the project, particularly the physical circumstances and terminological context within which the participants found themselves, posed a number of terminological challenges which had to be addressed in order to produce a terminologically sound final product.

The aim of this article is, first, to briefly describe the model of terminology management that was utilized for this project, and secondly, to report on the terminological processing of one of the ten source texts prior to translation, focussing on the internal standardization of terms in the absence of readily available, standardized chemistry terminology. The second point specifically investigates the involvement of target users in preliminary standardization procedures.

Terminology management model
For this particular project, participants were required to translate a technical text which constitutes a random extract from the domain of chemistry. Since the students participating in this project are by no means full-time translators and/or terminologists, the translated text represents an instance of once-off text production, and therefore calls for what Wright and Wright (1997: 147) call ad hoc terminology management. The default model for terminology management advocated in most theoretical treatises is that of systematic terminology management, but as Wright and Wright (op. cit.) point out, this model does not make provision for the limitations that exist in the conventional translation workplace. Consequently, translation-oriented terminology management is set apart from other terminological activities. Terminologists working within a systematic model have access to subject field experts and usually have the time to collect material, select terminology and organize it according to logical concept systems. In contrast, ad hoc terminology management calls on the translator-terminologist to create and manage his/her own terminology resources. Furthermore, whereas systematic terminology management is subject-field driven, ad hoc terminology management is text-driven. Translators/terminologists are often confronted with source texts which constitute random extracts from a specific domain, and their lack of expertise in the subject field makes it difficult, if not impossible, to reconstruct the logical concept structure of the domain in question.

Wright and Wright (1997: 148) list several disadvantages experienced by translators/terminologists, and these are particularly applicable to the participants in the chemistry dictionary project. In the first instance, they are not subject field experts, and as a result have no knowledge of the concept system within which they are working. Secondly, they have no or limited access to subject field experts, and lastly, available research materials on chemistry, especially in the two target languages, i.e. Northern Sotho and Zulu, are almost non-existent.

In order to decide on an appropriate approach to even small-scale terminology management, the conditions that prevail in the individual working environment have to be considered. Every student who participated in the chemistry dictionary project worked alone, without any interaction with other members of the translation team. The majority of participants have only a very basic level of computer literacy, with limited or no access to internet facilities. Translation is done without the benefit of sophisticated electronic terminology management systems or translation memory systems. Despite these constraints, it is only reasonable to expect a minimum level of documentation to support the translation work at hand. For this particular project, and taking the non-ideal situation of the translators/terminologists into consideration, it was decided that a bilingual glossary of source terms with their equivalents in the target languages was the minimum requirement, even though Wright and Wright (1997: 151) indicate that such a list does not really meet the minimum requirement for terminological documentation. They do concede, however, that terminologist-translators need to determine for themselves what the basic minimum terminological entry must look like in their specific working environment.

During the initial planning of the project it was decided that the project leader would take responsibility for collecting and consolidating glossaries submitted by all participants and for making a final bilingual term list available to all (English-Northern Sotho for the Northern Sotho speaking participants and English-Zulu for the Zulu speaking participants). As pointed out by Tufiş (2004), it is common knowledge that terminological consistency over a large collection of thematic documents is hard to attain, and having such an internally standardized list would ensure terminological consistency not only in the work of each individual participant, but also within the project as a whole. It is furthermore our intention to submit the respective glossaries to the National Language Body for Northern Sotho and the National Language Body for Zulu, sub-structures of PanSALB which are responsible for standardization and consequent dissemination of terminology. In cases where multiple term equivalents exist, all equivalents will be retained, but terms which have shown themselves during the course of the investigation to be preferred will be listed first. It would then be the task of the standardization body to make a choice from amongst competing equivalents. In this way, the project can make a positive contribution to terminology development in Northern Sotho and Zulu respectively. The Northern Sotho glossary appears as Addendum A.
It should be clear from the foregoing discussion that some form of rudimentary terminology management is essential in any project dealing with technical translation, even if it is on a small scale and on an ad hoc basis.
The discussion below focuses first on the procedural steps which are followed in order to compile a bilingual (English-Northern Sotho) glossary of chemistry terms. Secondly, the issue of internal standardization is addressed, with particular reference to the potential role that target users can play in this regard. The data generated by one of the participants (who is also the co-author of this article) form the basis for the discussion. Sections of this article are furthermore based on her unpublished mini-dissertation, of which full particulars appear in the bibliography.

Terminological processing of the source text
Translating technical texts into a lesser resourced language such as Northern Sotho requires proper and sometimes innovative terminological processing of the source text prior to the actual translation.

Terminology extraction
The first step in the translation of the source text is the (semi-automatic and manual) extraction of terminology from the source text. To this end, an electronic special purpose corpus was compiled, consisting of the 50 terms and their definitions. This is necessary in order to semi-automatically extract all terms from the source text, making use of WordSmith Tools' KeyWord function. Simply put, the KeyWord function compares the frequency with which an item occurs in the special purpose corpus with its frequency in a larger, general reference corpus, and isolates KeyWords with a significantly higher or lower frequency of occurrence. (For a detailed description of the procedure for the identification of KeyWords, i.e. potential term candidates, see Taljard et al. 2007: 160.) Not all KeyWords thrown up by the KeyWord search are necessarily terms, however, and manual perusal of the list is necessary to eliminate non-terms from the term candidate list.
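The frequency comparison behind a keyword search can be illustrated with a short sketch. It is not WordSmith Tools' own implementation: the statistic used here (Dunning's log-likelihood), the threshold of 10.83, the function names and the toy word counts are all assumptions chosen for illustration, and only items that are relatively more frequent in the special purpose corpus are kept.

```python
import math

def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Log-likelihood (G2) for one item observed in two corpora."""
    total = freq_study + freq_ref
    combined = size_study + size_ref
    # Observed frequency paired with the frequency expected if the item
    # were spread evenly over both corpora.
    pairs = (
        (freq_study, size_study * total / combined),
        (freq_ref, size_ref * total / combined),
    )
    return 2 * sum(o * math.log(o / e) for o, e in pairs if o > 0)

def keyword_candidates(study_counts, size_study, ref_counts, size_ref,
                       threshold=10.83):
    """Items that are unusually frequent in the study corpus.

    The threshold is an assumed cut-off, not a value taken from the project.
    """
    candidates = []
    for word, freq_study in study_counts.items():
        freq_ref = ref_counts.get(word, 0)
        score = log_likelihood(freq_study, size_study, freq_ref, size_ref)
        overused = freq_study / size_study > freq_ref / size_ref
        if score >= threshold and overused:
            candidates.append((word, round(score, 1)))
    return sorted(candidates, key=lambda item: item[1], reverse=True)

# Hypothetical counts; only the corpus sizes echo the figures in this article.
study = {"atom": 20, "the": 80}
reference = {"atom": 500, "the": 800_000}
print(keyword_candidates(study, 1_225, reference, 12_500_000))
```

In this toy run a function word such as "the" has roughly the same relative frequency in both corpora and is filtered out, whereas "atom" is retained as a term candidate. The resulting list would still need manual checking, as described above.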
For the purpose of this study, the existing English definitions of the 50 chemistry terms, which constitute the source text that is to be translated into Northern Sotho, automatically make up the special purpose corpus, whereas the University of Pretoria English Internet Corpus (PEIC) is used as the general or reference corpus. The special purpose corpus consists of 1 225 tokens and 158 types, whereas the reference corpus has approximately 12,5 million tokens and 118 193 types. Running the KeyWord search on our special corpus resulted in 81 term candidates being thrown up, of which 72 turned out to be terms. This procedure can be carried out not only for single word terms, but also for multiword terms, and an additional 18 two word terms were extracted semi-automatically. However, semi-automatic term extraction does not succeed in extracting all terms from a source text. According to Taljard and De Schryver (2002), semi-automatic term extraction accounts for approximately 60% of terms in a running text. Therefore, computational extraction needs to be complemented by manual term excerption. Term conscious reading of the source text resulted in a further 40 single and 19 two word terms being identified, thus giving a total of 149 terms isolated from the source text. These results are summarized in Table 1.
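The extraction figures reported above can be tallied in a few lines. The variable names are illustrative; the counts are those given in the text.

```python
# Counts reported for the source text (single word and two word terms).
semi_automatic = {"single": 72, "two_word": 18}
manual = {"single": 40, "two_word": 19}
keyword_hits = 81  # single word candidates thrown up by the KeyWord search

total_semi = sum(semi_automatic.values())
total_manual = sum(manual.values())
total_terms = total_semi + total_manual

# Share of KeyWord candidates that turned out to be terms.
precision = semi_automatic["single"] / keyword_hits
# Share of all isolated terms that was found semi-automatically.
semi_share = total_semi / total_terms

print(total_semi, total_manual, total_terms)    # 90 59 149
print(f"{precision:.1%}", f"{semi_share:.1%}")  # 88.9% 60.4%
```

The semi-automatic share of about 60% is in line with the figure cited from Taljard and De Schryver (2002).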

Sourcing of term equivalents
The next step is the sourcing of translation equivalents for all terms occurring in the source text. This would result in the bilingual term list or glossary, to be made available to all participants in the project. Availability of such a list will ensure terminological consistency in the project at large, making sure that all participating translators make use of the same translation equivalents for terms occurring in the source texts. As will be pointed out below, this ideal is not always easily attainable.
The first preference when sourcing term equivalents for the 149 source terms should be perusal of standardized sources. These would normally include dictionaries, preferably LSP dictionaries dealing with the subject field at hand, and official term lists. Due to the lack of LSP dictionaries for Northern Sotho, participants in the project had no option but to consult existing LGP dictionaries, and the only available official terminology list, the Terminology and Orthography of 1988. Consultation of these sources produced equivalents for only 57 of the 149 source terms. This means that only 38.2% of source terms can be covered by terms from standardized Northern Sotho sources. Trawling through the available standardized sources revealed a further problem: for 36 of the 57 source terms, multiple equivalents were found; for some terms as many as four equivalents were found. As can be seen from Table 2, the multiplicity of TEs is to be found on various levels: in some cases variation is on the lexical level (cf. TEs for 'decomposition', 'particle' and 'separation'), in other cases on the orthographical level, i.e. different spellings of the same term (cf. nekethifi vs. neketifi), and in yet other cases the variation concerns the term formation strategy (indigenous term vs. transliteration, cf. sedilana vs. esiti). It therefore seems that even so-called standardized sources suffer from proliferation of terms, a symptom of inadequate implementation of standardization procedures.
Seeing that only a little more than a third of the source terms could be covered by consulting standardized sources, the translator/terminologist was compelled to also consult non-standardized sources. For the purpose of this project, non-standardized sources consisted mainly of informal term lists compiled by individuals working in the field of chemistry, who are also speakers of Northern Sotho. It needs to be acknowledged that non-standardized sources must be treated with the necessary circumspection. However, in this particular instance, these sources provided well-formed and appropriate TEs for many of the source terms, revealing an exceptional engagement with both linguistic and conceptual issues. By utilizing these sources, equivalents for a further 22 source terms could be provided, leaving 70 source terms without term equivalents. For these source terms, equivalents had to be coined by making use of the appropriate term formation strategies for Northern Sotho. Figure 1 summarizes the different sources from which term equivalents were harvested. Ideally, coining of terms should be done in collaboration with subject field experts who are also speakers of Northern Sotho, but from a practical point of view this is not always possible. Term translation strategies that are available to the terminologist include the following:

- Semantic transfer, specifically semantic specialization, a process during which a word from the language for general purpose (LGP) attains the status of a language for special purposes (LSP) term by acquiring an additional, more technical meaning.

After having completed the coining of equivalents for those source terms for which no equivalents could be found, the first deliverable of the initial terminological processing of the source text was available, i.e. a bilingual term list containing all source terms isolated from the source text, followed by their Northern Sotho equivalents. However, the problem of multiple term equivalents for 43 of the 149 source terms still persisted. Ideally, such a list of source terms and their multiple equivalents should be submitted to an official standardization body for formal standardization, a process during which a preferred term from amongst multiple translation equivalents is identified, again in consultation with subject field experts. In practice this is rarely feasible, due to the time pressure under which translators normally operate. Furthermore, the standardization process of terminology in South Africa seems rather flawed, one of the main problems being the dissemination and general accessibility of standardized terms. As a result, translators use their own discretion in deciding on appropriate term equivalents for source terms. This practice does not, however, solve the issue of the multiplicity of term equivalents, and may even contribute to the unnecessary proliferation of terms.
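The breakdown of sources reported in this section can be checked with a short calculation. The dictionary keys are illustrative labels. The percentages quoted in this article match the computed shares when these are truncated, rather than rounded, to one decimal place, so the sketch assumes truncation.

```python
# Number of source terms covered by each type of source.
sources = {"standardized": 57, "non_standardized": 22, "coined": 70}
total = sum(sources.values())

# Percentage per source, truncated (not rounded) to one decimal place;
# this is an assumption that reproduces the figures quoted in the text.
shares = {name: (count * 1000 // total) / 10 for name, count in sources.items()}

print(total)   # 149
print(shares)  # {'standardized': 38.2, 'non_standardized': 14.7, 'coined': 46.9}
```

Rounding instead of truncating would give 38.3%, 14.8% and 47.0% respectively; the difference does not affect the argument.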

Internal standardization of the bilingual term list
For this particular project, it was decided to use the preferences of the target users of the terminology as a guideline for internal standardization. Consequently, a small case study was conducted in three secondary schools in the Limpopo province where chemistry forms part of the curriculum. The aim of this case study was to determine the feasibility of involving target users in the standardization process, even if it is only a preliminary and internal standardization. A questionnaire (Addendum B) consisting of four sections was administered to 30 grade 12 learners and three science educators. It was assumed that, being in the final year of schooling, these learners would already have internalized the basic chemistry concepts and that it would be appropriate to administer the questionnaire to them. All of them indicated that their mother tongue is Northern Sotho.

The first section of the questionnaire concerns the attitude of learners towards the use of Northern Sotho as a language of instruction, especially for chemistry. The second section concentrates on establishing the users' preference in the case of multiple TEs, using conceptual appropriateness as guiding principle. In the third section, users' preference with regard to the use of transliterations versus indigenous terms is investigated, and the fourth section examines users' preferences pertaining to the phonological adaptation and resultant spelling of transliterations. The results collected from the questionnaires were used to internally standardize the bilingual term list to be used in the translation of the chemistry texts, thus ensuring terminological consistency in the translated text. The list would then also serve as the minimum level of terminological documentation as required by Wright and Wright (1997) for this particular project. Insight into target users' preferences can also have a wider impact, in that it may provide some guidelines for future terminology development.

Space constraints do not allow a detailed analysis of the results obtained from all four sections of the questionnaire; the results of section 1 will therefore be dealt with very briefly. The two anchor questions put to learners in this section of the questionnaire were the following:
(1) Is it easy to learn chemistry in English?
(2) Do you think that teaching subjects such as chemistry in the mother tongue will have a positive influence on the matric pass rate?
Learners' responses to these questions present an interesting contradiction: 80% of learners indicated that it was easy studying chemistry in English, which is not their mother tongue.On the other hand, they do seem to sense that learning a difficult subject in a language other than their mother tongue may have a negative impact on their successful mastering of the subject: all 30 responded that they believe that teaching subjects such as chemistry in the mother tongue will have a positive influence on the matric pass rate.
In the second section of the questionnaire, learners were presented with 11 source terms for which multiple TEs had been harvested, in order to identify the preferred term. These 11 source terms were selected in such a way that the TEs for any particular source term represented the same term translation strategy. Respondents were provided with all TEs for a specific source term, and asked to select the one they prefer. They were also provided with the definition of each term to ensure that they select the term which is conceptually the closest match to the source term.
The terms which were preferred by the largest share of respondents were then regarded as being internally standardized for the purpose of the project. As can be seen from Table 3, in some cases preferences were very clear: 80% of respondents, for example, preferred the term tlemagano as equivalent for 'bond', and 70% preferred mafolofolo as equivalent for 'energy'. In other cases, preferences were not so clear-cut. Preferences for the equivalent of the source term 'dispersion' were as follows: tšitlano (43%), phatlalatšo (30%), and phatlalalo (24%). Nevertheless, it was possible to identify a preferred term for all the source terms. Compare Table 3 for an analysis of the results of this section of the questionnaire.

The purpose of the third section of the questionnaire was to establish whether the target users have a specific preference for, or resistance against, the use of transliterations to form term equivalents. The use of transliterations is a much debated issue amongst academics, but as far as we could ascertain, no investigation has thus far been made into the preferences of target users. In this section of the questionnaire, respondents were presented with 12 source terms, each of which has two term equivalents, one being a transliteration, the other a so-called indigenous term. Results for this section indicated that 51% of the preferred equivalents were transliterations, the rest (49%) being indigenous terms. The results obtained from the educators present an interesting contrast to those of the learners: an analysis of the educators' preferences indicated that only 28% of their preferred equivalents are transliterations, the rest being indigenous terms. This could be ascribed to the fact that educators feel that they have to promote the use of 'pure' language, thus discouraging the use of transliterations. Compare Table 4 for an analysis of learners' preferences.

Although we acknowledge the fact that the preferences of target users, who can at most be regarded as lay people to perhaps semi-experts, cannot be the final criterion in the selection of a standardized term from multiple equivalents, it surely needs to be taken into consideration that target users seem to have no serious objection to the use of transliterations. Furthermore, since they are probably already familiar with the concepts denoted by these terms, the non-transparency of the transliterated TEs, which is often used as an argument against the use of these forms, is no longer a stumbling block to the conceptual understanding of these terms.
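The selection rule used for the term list, in which the equivalent chosen by the largest share of respondents is listed first and equivalents without preference data are listed alphabetically, can be sketched as follows. The function name and data layout are hypothetical; the percentages are those reported for 'dispersion'. Plain string sorting is used for the alphabetical fallback, which is an assumption and does not implement Northern Sotho collation conventions.

```python
def order_equivalents(equivalents, preferences=None):
    """Order term equivalents for one source term.

    With preference data, the most preferred equivalent comes first and the
    others follow in descending order; without it, the order is alphabetical.
    """
    if not preferences:
        return sorted(equivalents)
    return sorted(equivalents,
                  key=lambda te: preferences.get(te, 0),
                  reverse=True)

dispersion = ["phatlalalo", "phatlalatšo", "tšitlano"]
shares = {"tšitlano": 43, "phatlalatšo": 30, "phatlalalo": 24}

print(order_equivalents(dispersion, shares))
# ['tšitlano', 'phatlalatšo', 'phatlalalo']
print(order_equivalents(dispersion))
# ['phatlalalo', 'phatlalatšo', 'tšitlano']
```

The first element of the ordered list is the internally standardized term for the project; the remaining equivalents are retained so that a standardization body can still choose among them.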
One of the problems with regard to the use of transliterations in Northern Sotho is the phonological adaptation and resultant spelling of these items. The preferred syllable structure in Northern Sotho is a CV-structure, which implies that whenever a word is borrowed from English or Afrikaans, its phonological structure and consequently its spelling need to be adapted to conform to this requirement. However, this rule is not applied consistently, resulting in multiple equivalents which differ on the orthographical level. In the last section of the questionnaire, respondents were requested to choose between two variants, the one displaying an adapted syllable structure and a spelling corresponding to that syllable structure, the other being the non-adapted variant. A second aspect that complicates the spelling of transliterations is the indication of aspiration, specifically with regard to the three voiceless plosives [p], [t] and [k]. Speakers often differ with regard to the pronunciation of these sounds, which consequently leads to differences in spelling, cf. molekhule vs. molekule. Four items were included in the list of terms where respondents had to choose between a version where aspiration was indicated and one where it was not. The official spelling rules of Northern Sotho provide no guideline with regard to these two issues.

Analysis of target users' preferences revealed a clear bias towards those forms where the orthographical representation reflects the adapted phonological structure. With regard to the indication of aspiration, in all four examples preference was for those forms where the aspiration was reflected in the spelling, cf. khomphaonte, molekhule, phosethifi and athomo. It must however again be emphasized that these preferences should by no means be interpreted as definitive principles: they are merely indications of the preferences of a very restricted sample of Northern Sotho speakers.

Conclusion
A statement that is often heard with regard to the African languages of South Africa is that there is a lack of technical terminology in these languages. This statement is however only partially true and represents a very simplified view of a complex matter. First, the fact that TEs need to be coined for almost 50% (46.9%) of terms isolated from the source text is especially worrisome and indeed confirms the need for a concerted effort of proper terminology development. This need is further substantiated by the fact that less than 40% (38.2%) of source terms can be provided with equivalents from standardized sources. The fact that equivalents for 14.7% of source terms could be recovered from non-standardized sources adds another dimension to the picture: it implies that the main challenge is not so much a lack of terminology, but rather a lack of standardized terminology. In view of the seeming inability of official standardization bodies to properly manage terminology development, alternative measures of standardization need to be considered. One such alternative is to use target users' inputs as a guideline for preliminary and project specific standardization. Involving target users in the development of terminology will furthermore encourage them to take ownership thereof. This would also make potential users more inclined to actually use the terminology, since they would feel themselves to have been involved in the creation thereof. However, this can never be more than an interim measure. It can never function as a substitute for a fully functional central standardization body.

Key:
Where statistical information could be retrieved from the questionnaires with regard to multiple equivalents, the preferred equivalent is listed first and marked with p, with other equivalents following in descending order. If no statistical information is available, multiple equivalents are listed in alphabetical order.
= equivalent sourced from standardized sources
= equivalent sourced from non-standardized sources
= coined equivalent

In this section a term will be given together with two possible translation equivalents (TE). One TE is a coined term, the other is a transliteration. Choose the TE that you prefer. Tick with √ next to the chosen TE.

SECTION D
In this section, a term will be given together with two possible term equivalents. The two term equivalents (TEs) are the same, but spelled differently. Choose the TE that you prefer. Tick with √ next to the chosen TE.

Figure 1: Sources of translation equivalents

Source term 1: atomic - Translation equivalent: seka-athomo 'atom-like, like an atom'
Source term 2: time unit - Translation equivalent: motšonako < motšo wa nako 'unit of time'
Source term 3: trivial name - Translation equivalent: leinatlwaelo < leina la tlwaelo 'name of habit'

- Borrowing, which includes the use of loan words, where the term and its meaning are retained intact with no adaptation of the morphological structure of the word. In this particular case, symbols representing chemical elements and units of measurement are, according to international practice, retained as is. Examples are the following:
Source term 1: Debye (unit of measurement named after the Dutch physicist P.J.W. Debye) - Translation equivalent: Debye
Source term 2: H2O (symbol) - Translation equivalent: H2O (seka)
Borrowing also includes the use of adoptives or transliterations, in which case the adopted word is completely adapted, morphologically and phonologically, to the structure of the borrowing language.
A reaction involving the chemical separation of a given compound into two or more simple compounds or substances, e.g. 2H2O → 2H2 + O2
The removal of water from a substance, e.g. CuSO4.5H2O (hydrated copper sulphate) → CuSO4 + 5H2O (dehydrated copper sulphate + water)
Mass per unit of volume, e.g. the density of mercury is 13,5g
The separation of compounds or atoms, e.g. the dissociation of acetic acid in water to form H+ ions and acetate ions.
Stop being combined, to remove elements from each other.
Chemical change produced by two or more substances acting upon each other.

Table 1: Results of term extraction

Table 2: Multiple term equivalents for source terms

Table 3: Learners' preferences with regard to multiple TEs

Table 4: Indigenous words versus transliterations

Table 5: Learners' preferences with regard to spelling of transliterations