Semi-Automatic Retrieval of Definitional Information: A Northern Sotho Case Study*

Elsabé Taljard

Abstract


Abstract: Corpus-based terminology is currently gaining ground on the international front. It is therefore important that terminologists working on the South African Bantu languages not only take note of this development, but that they should also follow this trend, even if they do not have the same measure of access to highly sophisticated software. The aim of this article is therefore to establish whether it is possible to retrieve definitional information on key concepts from untagged, running text by making use of affordable and easily accessible software such as WordSmith Tools. In order to answer this question, a case study is done in Northern Sotho, using textual material on linguistics as basis for a special field corpus. Syntactic and lexical patterns serving as textual markers of definitional information are identified and the success rate of the computational retrieval of definitional information is analysed and evaluated. Attention is also paid to the retrieval of specifically conceptual information, which turned out to be a fortunate by-product of semi-automatic retrieval of definitional information. Finally, it is illustrated how definitional information retrieved can be utilised in the writing of a formal terminological definition.
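
The retrieval idea summarised above, searching untagged running text for lexical markers and inspecting each hit in its keyword-in-context (KWIC) window, can be illustrated with a minimal Python sketch. It is not the WordSmith Tools procedure used in the article. The function name `kwic`, the English sample text and the two markers ("is a", "refers to") are hypothetical stand-ins chosen for illustration, and they are not the Northern Sotho patterns identified in the study.

```python
import re


def kwic(text, marker, width=40):
    """Return (left context, keyword, right context) for each hit of `marker`."""
    # Match the marker as a whole phrase, ignoring case.
    pattern = re.compile(r"\b" + re.escape(marker) + r"\b", re.IGNORECASE)
    rows = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        rows.append((left.strip(), match.group(0), right.strip()))
    return rows


# Hypothetical sample text and markers (assumptions, not data from the article).
SAMPLE = (
    "A noun is a word that names a person or thing. "
    "The term morpheme refers to the smallest meaningful unit. "
    "Verbs are discussed in the next chapter."
)
MARKERS = ["is a", "refers to"]

# One concordance per marker; a terminologist would then judge each line by hand.
hits = {marker: kwic(SAMPLE, marker) for marker in MARKERS}

for marker, rows in hits.items():
    for left, keyword, right in rows:
        print(f"{left:>40} [{keyword}] {right}")
```

The manual judgement of each concordance line is what makes such a procedure semi-automatic: the search only proposes candidate contexts, and a human decides which of them actually carry definitional information.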

Keywords: TERMINOLOGY, SOUTH AFRICAN BANTU LANGUAGES, DEFINITIONAL INFORMATION, SEMI-AUTOMATIC INFORMATION RETRIEVAL, TERMINOLOGICAL DEFINITIONS, CONCEPTUAL RELATIONSHIPS, LEXICAL PATTERNS, SYNTACTIC PATTERNS, TEXTUAL MARKERS, KEYWORD-IN-CONTEXT (KWIC), WORDSMITH TOOLS

Opsomming: Semi-outomatiese herwinning van definisie-inligting: 'n Noord-Sotho-gevallestudie. Korpus-gebaseerde terminologie is tans besig om veld te wen op die internasionale front. Dit is daarom belangrik dat terminoloë wat binne die Suid-Afrikaanse Bantoetale werk, nie net sal kennis neem van hierdie ontwikkeling nie, maar dat hulle ook hierdie neiging sal volg, selfs al het hulle nie dieselfde mate van toegang tot gesofistikeerde rekenaarprogrammatuur nie. Die doel van hierdie artikel is daarom om vas te stel of dit moontlik is om definisie-inligting oor sleutelkonsepte uit ongemerkte, lopende teks te herwin deur bekostigbare en toeganklike sagteware soos WordSmith Tools te gebruik. Ten einde hierdie vraag te beantwoord, is 'n gevallestudie in Noord-Sotho gedoen, met gebruikmaking van teksmateriaal oor die linguistiek as basis vir 'n gespesialiseerde korpus. Sintaktiese en leksikale patrone wat as tekstuele merkers van definisie-inligting dien, word geïdentifiseer en die suksesratio van rekenaarmatige herwinning van definisie-inligting word ontleed en beoordeel. Aandag word ook gegee aan die herwinning van spesifiek konseptuele inligting, wat 'n onverwagse byproduk van die semi-outomatiese herwinning van definisie-inligting is. Ten slotte word geïllustreer hoe definisie-inligting aangewend kan word by die skryf van 'n formele terminologiese definisie.

Sleutelwoorde: TERMINOLOGIE, SUID-AFRIKAANSE BANTOETALE, DEFINISIE-INLIGTING, SEMI-OUTOMATIESE INLIGTINGSHERWINNING, TERMINOLOGIESE DEFINISIES, KONSEPTUELE VERHOUDINGE, LEKSIKALE PATRONE, SINTAKTIESE PATRONE, TEKSTUELE MERKERS, KEYWORD-IN-CONTEXT (KWIC), WORDSMITH TOOLS





DOI: https://doi.org/10.5788/14-0-689




ISSN 2224-0039 (online); ISSN 1684-4904 (print)

Creative Commons License CC BY 4.0



