Abstract

<jats:p>The main components of a system developed for conducting fire safety briefings are presented, including the ability to monitor trainees' comprehension of the material using artificial intelligence tools. Computer vision and speech recognition are applied. Usage examples are shown with explanations of the key features and advantages. The principal differences from traditional fire safety briefing conditions are identified. It is substantiated that the developed system will improve the objectivity of comprehension monitoring and provide a tool for reducing violations of fire safety rules and incidents caused by human error through improved preventive training.</jats:p> <jats:p>This paper presents the development of a software system designed for the automated analysis of the effectiveness of fire safety briefings using artificial intelligence technologies – computer vision and natural language processing. The scientific novelty of this development lies in the creation of an integrated methodology combining cross-modal analysis of verbal and non-verbal training components. The YOLOv8 convolutional neural network, capable of real-time detection of people, postures, and gestures, was used for video analysis. Participants' gaze direction was assessed using eye-tracking algorithms. The Whisper model, which transcribes the audio track, was used to process speech content. Subsequent NLP analysis included topic modeling, keyword extraction, and emotional assessment of speech. Based on these data, the system calculates new metrics such as an engagement index, an information density coefficient, and an interactivity level. The analysis of 137 video recordings revealed significant variability in key parameters. The average briefing duration was 23.4±7.8 minutes, with statistically significant differences between disciplines. A computer vision algorithm demonstrated high accuracy (0.89) in assessing participant engagement. A visual attention index (0.72) was found to correlate positively with subsequent safety compliance. A speech content analysis of 487,000 words revealed that only 34% of the content consisted of precise terminology, while 41% consisted of vague wording, which reduced comprehension effectiveness. Topic modeling identified five semantic clusters, of which the "Clear Algorithms for Action in Emergency Situations" cluster demonstrated the greatest predictive value for safety. Validation of the results by comparison with expert annotation confirmed high consistency, with the system identifying 12% of discrepancies related to previously unaccounted-for nonverbal parameters. These results support the proposed hypothesis and the potential of the automated approach. The implementation of the system enables a transition from subjective assessments to objective quantitative analysis of training quality.</jats:p>
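The abstract names aggregate metrics (engagement index, information density coefficient) without publishing their formulas. The sketch below is a minimal illustration of how such ratio-based metrics could be computed downstream of the YOLOv8 and Whisper stages; the function names and the simple ratio definitions are assumptions for illustration, not the authors' method.

```python
# Hypothetical sketch of the briefing-quality metrics described in the
# abstract. The exact formulas are not given in the paper; the simple
# ratio definitions below are illustrative assumptions.

def engagement_index(attentive_frames: int, total_frames: int) -> float:
    """Fraction of sampled video frames in which detected participants
    were classified as attentive (e.g. gaze toward the presenter),
    as would be produced by a per-frame detector such as YOLOv8."""
    if total_frames == 0:
        return 0.0
    return attentive_frames / total_frames

def information_density(precise_term_words: int, total_words: int) -> float:
    """Share of precise safety terminology in the transcribed speech
    (the transcript itself would come from a model such as Whisper)."""
    if total_words == 0:
        return 0.0
    return precise_term_words / total_words

# Toy numbers echoing the proportions reported in the abstract:
print(round(engagement_index(72, 100), 2))      # 0.72
print(round(information_density(34, 100), 2))   # 0.34
```

A production pipeline would derive `attentive_frames` from per-frame detections and `precise_term_words` from a terminology lexicon matched against the transcript; both inputs are treated here as already computed.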

Keywords

analysis, system, safety, speech