Neonatal Sepsis Detection Using Decision Tree Ensemble Methods: Random Forest and XGBoost

Detta är en Kandidat-uppsats från KTH/Skolan för elektroteknik och datavetenskap (EECS)

Sammanfattning: Neonatal sepsis is a potentially fatal medical conditiondue to an infection and is attributed to about 200 000annual deaths globally. With healthcare systems that are facingconstant challenges, there exists a potential for introducingmachine learning models as a diagnostic tool that can beautomatized within existing workflows and would not entail morework for healthcare personnel. The Herlenius Research Teamat Karolinska Institutet has collected neonatal sepsis data thathas been used for the development of many machine learningmodels across several papers. However, none have tried to studydecision tree ensemble methods. In this paper, random forestand XGBoost models are developed and evaluated in order toassess their feasibility for clinical practice. The data contained24 features of vital parameters that are easily collected througha patient monitoring system. The validation and evaluationprocedure needed special consideration due to the data beinggrouped based on patient level and being imbalanced. Theproposed methods developed in this paper have the potentialto be generalized to other similar applications. Finally, usingthe measure receiver-operating-characteristic area-under-curve(ROC AUC), both models achieved around ROC AUC= 0.84.Such results suggest that the random forest and XGBoost modelsare potentially feasible for clinical practice. Another gainedinsight was that both models seemed to perform better withsimpler models, suggesting that future work could create a moreexplainable model.

  HÄR KAN DU HÄMTA UPPSATSEN I FULLTEXT. (följ länken till nästa sida)