medRxiv. 2020 Dec 5:2020.12.02.20235879. doi: 10.1101/2020.12.02.20235879. Preprint.
OBJECTIVE: To retrospectively study COVID-19-positive patients treated at NYU Langone Health (NYULH) and identify clinical markers predictive of disease severity, in order to assist clinical triage decisions and provide additional biological insight into disease progression.
MATERIALS AND METHODS: We analyzed the clinical activity of 3740 de-identified patients at NYULH between January and August 2020. Models were trained on clinical data from different parts of each patient's hospital stay to predict three clinical outcomes: deceased, ventilated, or admitted to ICU.
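As an illustration of what training on "different parts of the hospital stay" could involve, the sketch below selects the observations that fall within the first or the final 24 hours of a stay. It is a minimal example under stated assumptions, not the study's pipeline: the function name `window_records`, the record layout (a dict with a `time` key), and all timestamps and values are hypothetical.

```python
from datetime import datetime, timedelta


def window_records(records, admit, discharge, which="first", hours=24):
    """Return the records timestamped within the first or final `hours` of a stay.

    `records` is a list of dicts with a "time" key (hypothetical layout).
    The window is clipped so it never extends outside the stay itself.
    """
    span = timedelta(hours=hours)
    if which == "first":
        start, end = admit, min(admit + span, discharge)
    elif which == "final":
        start, end = max(discharge - span, admit), discharge
    else:
        raise ValueError("which must be 'first' or 'final'")
    return [r for r in records if start <= r["time"] <= end]


# Made-up stay and observations, for illustration only.
admit = datetime(2020, 4, 1, 8, 0)
discharge = datetime(2020, 4, 5, 12, 0)
records = [
    {"time": datetime(2020, 4, 1, 9, 0), "resp_rate": 22},
    {"time": datetime(2020, 4, 2, 10, 0), "resp_rate": 24},
    {"time": datetime(2020, 4, 5, 6, 0), "resp_rate": 30},
]

first_day = window_records(records, admit, discharge, which="first")
final_day = window_records(records, admit, discharge, which="final")
```

Each window would then be summarized into one feature row per patient before model training; how that aggregation is done is a separate design choice not specified here.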
RESULTS: An XGBoost model trained on clinical data from the final 24 hours excelled at predicting mortality (AUC=0.92, specificity=86%, sensitivity=85%). Respiration rate was the most important feature, followed by SpO2 and age 75+. This model's ability to predict the deceased outcome extended to 5 days prior, with AUC=0.81, specificity=70%, and sensitivity=75%. When only clinical data from the first 24 hours were used, AUCs of 0.79, 0.80, and 0.77 were obtained for the deceased, ventilated, and ICU-admitted outcomes, respectively. Respiration rate and SpO2 levels offered the highest feature importance, whereas other canonical markers, including diabetic history, age, and temperature, offered minimal gain. When lab values were incorporated, prediction of mortality benefited the most from blood urea nitrogen (BUN) and lactate dehydrogenase (LDH). Features predictive of morbidity included LDH, calcium, glucose, and C-reactive protein (CRP).
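For readers less familiar with the reported metrics, the following sketch shows how sensitivity, specificity, and AUC are computed from binary outcomes and model scores. It uses only the Python standard library. The labels and scores are made-up toy values rather than study data, and the 0.5 decision threshold and the function names are assumptions chosen for illustration.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data: 1 = outcome occurred, 0 = it did not (not from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

threshold = 0.5  # assumed cut-off; the abstract does not state one
y_pred = [1 if s >= threshold else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc:.4f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the abstract reports all three.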
CONCLUSION: Together, this work summarizes efforts to systematically examine the importance of a wide range of features across different outcome endpoints and at different hospitalization time points.