Semi-Automatic Detection of New Words in Modern Georgian
Abstract
The study of neologisms in Georgian has gained significance because of the rapid socio-political changes that followed the collapse of the Soviet Union and the restoration of the country's independence. The technological advances of the 21st century have also played a role. Together, these developments have introduced numerous new terms and concepts into the language. However, no established methodology has existed for identifying neologisms in modern Georgian. To address this gap, a methodology was developed at Ilia State University, based on a study of existing methods applied to other languages. A corpus of Georgian was compiled from texts retrieved from online sources: online newspapers and magazines, media websites, and the websites of non-governmental organisations and governmental agencies. Two lemmatisation tools were then applied to the corpus to identify potential neologisms. This paper presents the resulting methodology for the semi-automatic detection of new words in modern Georgian.
Keywords: neologism, Georgian language corpus, lemmatiser, out-of-vocabulary lexis, neologism detection methodology
Copyright of all material published in Lexikos will be vested in the Board of Directors of the Woordeboek van die Afrikaanse Taal. Authors are free, however, to use their material elsewhere provided that Lexikos (AFRILEX Series) is acknowledged as the original publication source.
Creative Commons License CC BY 4.0