Abstract

This article provides a comprehensive review of contemporary research in the field of natural language processing (NLP) and speech technologies for Central Asian Turkic languages, including Kazakh, Kyrgyz, and Uzbek. Although a number of theoretical and applied studies have been published in recent years, these languages continue to be classified as low-resource. This situation is primarily caused by the limited availability of annotated text corpora, insufficient speech data, the parallel use of Cyrillic and Latin scripts, and the absence of unified annotation and evaluation standards. The article systematically examines current approaches to morphological segmentation, named entity recognition, sentiment analysis, and automatic speech recognition. Agglutinative morphology and vowel harmony are discussed as key typological features of Turkic languages that strongly influence computational processing strategies. The effectiveness of both rule-based and neural morphological analyzers is highlighted. The paper also describes the adaptation of computational models originally developed for Turkish, English, and Russian through subword modeling, character-level embeddings, and multilingual transformer architectures. In addition, cross-lingual transfer learning is evaluated as a promising approach to mitigating data scarcity. The study identifies corpus fragmentation, inconsistent annotation schemes, and the lack of standardized speech resources as major challenges. The author argues for the development of open-access datasets, the introduction of shared evaluation tasks, and the strengthening of institutional collaboration between linguists and computational language technology specialists. The findings of the study are of both theoretical and practical importance for the development of sustainable and effective language technologies for low-resource languages.

Keywords

speech languages language computational article
