Abstract
<jats:p>Tuberculosis (TB) remains a major infectious disease in Indonesia, and identifying patient severity levels in healthcare facilities is often time-consuming because it relies on manual assessment of medical records. At Puskesmas Bonang 1, TB cases increased from 41 in 2023 to 57 in 2024, yet no data-driven analytical system is available to support rapid and objective risk evaluation. This study uses 2,546 TB patient medical records from 2023–2024 and applies preprocessing, normalization, encoding, and K-Means clustering, followed by the development of a baseline model and a hybrid model. The evaluation results indicate that the hybrid K-Means + Random Forest model with hyperparameter tuning outperforms the standalone Random Forest model. The baseline Random Forest achieved an accuracy of 81.72% and an F1-score of 80.98%, while the tuned hybrid model achieved an accuracy of 82.51% and an F1-score of 81.34%, gains of 0.79 and 0.36 percentage points, respectively. These results suggest that cluster-based features extracted with K-Means enrich the data representation and modestly improve the predictive performance of TB severity risk classification.</jats:p>
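<jats:p>The hybrid pipeline described above can be sketched roughly as follows. This is a minimal illustration with scikit-learn, not the study's implementation: the data are synthetic, and the number of clusters, the form of the cluster features (cluster label plus centroid distances), the parameter grid, and the weighted F1 averaging are all assumptions, since the abstract does not specify them.</jats:p>

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for encoded patient records (NOT the study's data).
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=5, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Normalization: fit on the training split only to avoid leakage.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# K-Means fitted on training data; n_clusters=3 is an assumed value.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_train_s)


def add_cluster_features(km, X_scaled):
    """Append the cluster label and the distances to each centroid."""
    labels = km.predict(X_scaled).reshape(-1, 1)
    distances = km.transform(X_scaled)  # shape: (n_samples, n_clusters)
    return np.hstack([X_scaled, labels, distances])


X_train_h = add_cluster_features(kmeans, X_train_s)
X_test_h = add_cluster_features(kmeans, X_test_s)

# Baseline: Random Forest on the original (scaled) features.
baseline = RandomForestClassifier(random_state=0).fit(X_train_s, y_train)

# Hybrid + tuning: Random Forest on augmented features with a small,
# assumed hyperparameter grid.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 8]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,
    scoring="f1_weighted",
)
search.fit(X_train_h, y_train)


def evaluate(model, X_eval, y_true):
    """Return (accuracy, weighted F1) for a fitted classifier."""
    y_pred = model.predict(X_eval)
    return accuracy_score(y_true, y_pred), f1_score(y_true, y_pred, average="weighted")


baseline_scores = evaluate(baseline, X_test_s, y_test)
hybrid_scores = evaluate(search.best_estimator_, X_test_h, y_test)
print("baseline (accuracy, F1):", baseline_scores)
print("hybrid + tuning (accuracy, F1):", hybrid_scores)
```

<jats:p>Fitting the scaler and K-Means on the training split alone keeps test information out of the cluster features. On synthetic data the two models may score similarly, so the printed numbers are not expected to match the figures reported in the abstract.</jats:p>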