
Abstract

<jats:p>The rapid development of artificial intelligence and natural language processing has opened new opportunities for automating document workflows in educational institutions. This study presents a comprehensive comparative analysis of five prominent generative language models (GPT-4, GPT-3.5-Turbo, T5-Large, fine-tuned BERT, and BLOOM-7B), evaluated on their capacity to generate high-quality corporate letter templates for educational systems. Experiments were conducted on a corpus of 200 authentic institutional letters from Uzbek higher education institutions spanning five letter types. Model performance was assessed using BLEU, ROUGE-L, and F1 metrics alongside a structured human evaluation framework covering fluency, formality, and structural accuracy. Results demonstrate that instruction-tuned large language models significantly outperform encoder-based and smaller generative models, with GPT-4 achieving a BLEU score of 42.3 and a human approval rate of 87%. The study further investigates the impact of prompt-engineering strategies, showing that structured few-shot prompts raise GPT-4's BLEU score to 44.8. The findings provide actionable guidelines for educational institutions considering the deployment of generative AI for administrative document automation.</jats:p>
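As context for the BLEU figures reported above, the sketch below shows how a sentence-level BLEU-style score can be computed in plain Python: modified n-gram precision up to 4-grams with add-one smoothing and a brevity penalty. This is an illustrative simplification, not the evaluation pipeline used in the study; production evaluations typically use a library such as sacreBLEU, which adds standardized tokenization and multi-reference support.

```python
import math
from collections import Counter

def bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Illustrative single-sentence BLEU-style score (0-100).

    Tokens are whitespace-split; add-one smoothing keeps a zero
    n-gram count from collapsing the geometric mean to zero.
    """
    cand = candidate.split()
    ref = reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
        ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
        overlap = sum((cand_ngrams & ref_ngrams).values())  # clipped matches
        total = max(sum(cand_ngrams.values()), 1)
        log_precisions.append(math.log((overlap + 1) / (total + 1)))
    # Brevity penalty: candidates shorter than the reference are discounted
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return 100 * bp * math.exp(sum(log_precisions) / max_n)
```

An exact match scores 100, and scores fall as n-gram overlap with the reference letter decreases, which is the sense in which the abstract's 42.3 vs 44.8 comparison measures improvement.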


Keywords

language, educational institutions, generative models
