Using Semi-automated Term Extraction for IsiNdebele Health Terminology

  • Nomsebenzi Malele, Department of African Languages, University of South Africa, Pretoria, South Africa (https://orcid.org/0000-0001-8384-7853)
  • Sonja Bosch, Department of African Languages, University of South Africa, Pretoria, South Africa (https://orcid.org/0000-0002-9800-5971)

Abstract

IsiNdebele, also known as Southern isiNdebele, has limited language resources and specialised terminology, especially when compared to other members of the Nguni language family. This study therefore explores ways of addressing the shortage of specialised terminology in isiNdebele through semi-automatic term extraction. The focus is on health terminology intended for communication with laypersons rather than between experts in the health field. The semi-automatic approach combines manual identification and extraction of terms from available corpora with the use of a software tool, WordSmith Tools (WST). The study illustrates the necessity of using all WST functions, as they complement each other: terms overlooked by one function may be captured by another. For instance, the KeyWords function identified only a limited number of terms in this research, whereas manual identification proved more fruitful, and the Concord function was particularly effective, identifying a greater number of terms. The use of WST in this research highlights the viability of corpus-driven studies, even for resource-scarce languages like isiNdebele. Given the limited resources available for isiNdebele, particularly the absence of specialised dictionaries, the health terms collected here are ideal candidates for inclusion in a general dictionary.

Keywords: isiNdebele, corpus-driven term extraction, health corpora, language for specific purposes (LSP), language for general purposes (LGP), WordSmith Tools, word list, key words, concordance, semi-automatic extraction
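The three kinds of corpus function named in the abstract (word list, key words, concordance) can be illustrated with a minimal Python sketch. It is neither the WordSmith Tools implementation nor the authors' procedure: the function names, the choice of log-likelihood as the keyness statistic, the 3.84 threshold, and the English toy tokens are all assumptions made for illustration only.

```python
import math
from collections import Counter


def word_list(tokens):
    """Frequency list, most frequent first (ties broken alphabetically)."""
    counts = Counter(tokens)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def log_likelihood(a, b, n_study, n_ref):
    """Keyness of a word seen `a` times in `n_study` tokens of the study
    corpus and `b` times in `n_ref` tokens of the reference corpus."""
    total = n_study + n_ref
    expected_a = n_study * (a + b) / total
    expected_b = n_ref * (a + b) / total
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / expected_a)
    if b > 0:
        ll += b * math.log(b / expected_b)
    return 2 * ll


def keywords(study_tokens, ref_tokens, min_score=3.84):
    """Words relatively more frequent in the study corpus than in the
    reference corpus, ranked by log-likelihood (assumed statistic)."""
    study, ref = Counter(study_tokens), Counter(ref_tokens)
    n_study, n_ref = len(study_tokens), len(ref_tokens)
    scored = []
    for word, a in study.items():
        b = ref.get(word, 0)
        if a / n_study > b / n_ref:  # positive keywords only
            score = log_likelihood(a, b, n_study, n_ref)
            if score >= min_score:
                scored.append((word, score))
    return sorted(scored, key=lambda ws: (-ws[1], ws[0]))


def concordance(tokens, node, window=3):
    """Keyword-in-context lines: (left context, node word, right context)."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, tok, right))
    return lines


# Toy data (invented, English, NOT taken from the study's corpora).
study = "the clinic nurse gave the patient a vaccine at the clinic".split()
reference = "the man gave the dog a bone at the park".split()

for word, score in keywords(study, reference, min_score=0.0):
    print(f"{word}\t{score:.2f}")
for left, node, right in concordance(study, "clinic", window=2):
    print(f"{left:>12} [{node}] {right}")
```

On a corpus this small, no keyness score reaches the default threshold, which is why the usage lines lower `min_score` to 0. A small specialised corpus can limit statistical keyword extraction in this way, so candidate terms would still need to be checked in their concordance lines and confirmed manually.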
Published
2024-08-29
How to Cite
Malele, N., & Bosch, S. (2024). Using Semi-automated Term Extraction for IsiNdebele Health Terminology. Lexikos, 34(1), 269-287. https://doi.org/10.5788/34-1-1926
Section
Artikels/Articles