Machine learning discovery of distinguishing laboratory features for severity classification of COVID-19 patients

Yang Xiao, Li Yan, Mingyang Zhang, Kent E. Pinkerton, Haosen Cao, Ying Xiao, Wei Li, Shuai Li, Yancheng Wang, Shusheng Li, Zhiguo Cao, Gary Wing Kin Wong, Hui Xu, Hai Tao Zhang

Research output: Contribution to journal › Article › peer-review


COVID-19 has spread exponentially worldwide, with devastating outbreaks primarily in the United States, Spain, Italy, the United Kingdom, France, Germany, Turkey and Russia. As of 1 May 2020, a total of 3,308,386 confirmed cases had been reported worldwide, with a cumulative mortality of 233,093. Because the pathology of COVID-19 is complex and uncertain, it is difficult for front-line doctors to consistently categorise clinical COVID-19 cases by severity level, that is, as general or severe/critical. More than 300 laboratory features, coupled with underlying disease, combine to complicate proper and rapid patient diagnosis. However, such screening is necessary for early triage, diagnosis, assignment to the appropriate level of care facility, and timely intervention. A machine learning analysis was carried out on data from confirmed COVID-19 patients admitted to Tongji Hospital in Wuhan, China, between 10 January and 18 February 2020. A softmax neural network-based machine learning model was established to categorise patient severity levels. Based on an analysis of clinical and laboratory data from 2662 cases, the model identifies the top 30 distinguishing features among more than 300 laboratory features, yielding 86.30% blind test accuracy, an F1-score of 0.8195, and 100% consistency in a two-way classification of patients as severe/critical or general. For severe/critical cases, the F1-score is 0.9081 (recall 0.9050, precision 0.9113). The model classifies a patient at a millisecond-level computational cost, in contrast to minute-level manual classification. Built on available COVID-19 patient diagnosis and therapy data, such an artificial intelligence model can help doctors classify patients quickly, with a high degree of accuracy and 100% consistency, significantly improving diagnostic and classification efficiency.
The discovered top 30 laboratory features can be used for finer differentiation and serve as an essential supplement to current guidelines, thus enabling a more comprehensive assessment of COVID-19 cases during the early stages of infection. Such early differentiation will help in assigning the appropriate level of care to individual patients.
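The reported severe/critical F1-score can be checked against the stated precision and recall, since F1 is their harmonic mean. The short Python sketch below does that arithmetic and also illustrates, in general terms, how a softmax output layer turns two raw scores into class probabilities. It is not the authors' model: the function names, the logit values and the label order are assumptions made purely for illustration.

```python
import math


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)  # subtract the maximum to avoid overflow in exp
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Precision and recall for severe/critical cases, as stated in the abstract.
severe_f1 = f1_score(precision=0.9113, recall=0.9050)
print(round(severe_f1, 4))  # 0.9081, matching the reported value

# Hypothetical raw scores for one patient (assumed values, not from the paper).
labels = ["general", "severe/critical"]
probs = softmax([0.3, 1.7])
predicted = labels[probs.index(max(probs))]
print(predicted, [round(p, 3) for p in probs])
```

Because the softmax output is a deterministic function of the input features, the same patient record always receives the same label, which is one plausible reading of the 100% consistency figure.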

Original language: English (US)
Pages (from-to): 31-43
Number of pages: 13
Journal: IET Cyber-systems and Robotics
Issue number: 1
State: Published - Mar 2021

ASJC Scopus subject areas

  • Artificial Intelligence
  • Human-Computer Interaction
  • Computational Theory and Mathematics
  • Computer Networks and Communications
  • Information Systems
  • Hardware and Architecture

