medRxiv. 2021 Jan 15:2021.01.13.21249540. doi: 10.1101/2021.01.13.21249540. Preprint.
Patients with influenza and SARS-CoV2/Coronavirus disease 2019 (COVID-19) infections have different clinical course and outcomes. We developed and validated a supervised machine learning pipeline to distinguish the two viral infections using the available vital signs and demographic dataset from the first hospital/emergency room encounters of 3,883 patients who had confirmed diagnoses of influenza A/B, COVID-19 or negative laboratory test results. The models were able to achieve an area under the receiver operating characteristic curve (ROC AUC) of at least 97% using our multiclass classifier. The predictive models were externally validated on 15,697 encounters in 3,125 patients available on TrinetX database that contains patient-level data from different healthcare organizations. The influenza vs. COVID-19-positive model had an AUC of 98%, and 92% on the internal and external test sets, respectively. Our study illustrates the potentials of machine-learning models for accurately distinguishing the two viral infections. The code is made available at https://github.com/ynaveena/COVID-19-vs-Influenza and may be have utility as a frontline diagnostic tool to aid healthcare workers in triaging patients once the two viral infections start cocirculating in the communities.