A Novel Scoring System for Prediction of Disease Severity in COVID-19

Background: Coronavirus disease 2019 (COVID-19), caused by a novel enveloped RNA betacoronavirus, has led to severe and even fatal pneumonia in China and other countries since December 2019. Early detection of patients with severe COVID-19 is of great significance for shortening the disease course and reducing mortality. Methods: We assembled a retrospective cohort of 80 patients (56 mild and 24 severe) with COVID-19 treated at Beijing You'an Hospital. We used univariable and multivariable logistic regression analyses to select the risk factors for severe and even fatal pneumonia and to build a scoring system for prediction, which was later validated in a group of 22 COVID-19 patients. Results: Age, white blood cell count, neutrophil count, glomerular filtration rate, and myoglobin were selected by multivariate analysis as components of the scoring system for prediction of disease severity in COVID-19. When the scoring system was applied to the cohort, the percentages of ICU admission (20%, 6/30) and ventilation (16.7%, 5/30) in high-risk patients were much higher than those (2%, 1/50; 2%, 1/50) in low-risk patients (p = 0.009; p = 0.026). The AUC of the scoring system was 0.906, with a sensitivity of 70.8% and a specificity of 89.3%. According to the scoring system, the probability of patients in the high-risk group developing severe disease was 20.24 times that in the low-risk group. Conclusions: The probability of severe COVID-19 predicted by the scoring system could help patients receive different therapy strategies at a very early stage. Topic: COVID-19, severe and fatal pneumonia, logistic regression, scoring system, prediction.


INTRODUCTION
A cluster of cases of acute respiratory illness of unknown etiology was reported in Wuhan City, Hubei Province, China, from December 2019. The pathogen was identified as a novel enveloped RNA betacoronavirus by the Chinese Center for Disease Control and Prevention (CDC) (Wu et al., 2020) and was designated severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) (Zhu et al., 2020). The World Health Organization (WHO) declared the novel coronavirus disease, COVID-19, a public health emergency of international concern, and on 11 March 2020 the COVID-19 outbreak was declared a global pandemic. According to the WHO coronavirus disease 2019 (COVID-19) situation report, a total of 191,127 cases had been laboratory confirmed and 7,807 of these patients had died by 18 March 2020 (World Health Organization, 2020).
Infection in the majority of people is mild, with common clinical characteristics including fever, cough, and sputum production. Some infected patients also reported gastrointestinal symptoms including vomiting and diarrhea (Perlman and Netland, 2009; Fehr and Perlman, 2015). Dyspnea and/or hypoxemia occurred after 1 week, with 50% of severe patients quickly progressing to acute respiratory distress syndrome, septic shock, refractory metabolic acidosis, coagulation disorders, and multiorgan failure, which can be life-threatening (China National Health Commission, 2020). However, there are still no clear predictive factors or models to prognosticate the severity of the disease. In this article, we conducted a cohort study of 80 patients with COVID-19 in a tertiary teaching hospital specializing in infectious diseases to screen for critical factors related to disease severity and to establish a predictive model. Early detection of patients with severe COVID-19 is of great significance for shortening the disease course and reducing mortality.

Study Population
Patients were recruited from Beijing You'an Hospital, Capital Medical University, Beijing. A discovery cohort (80 cases) was set up between January 2020 and February 2020, and a validation group (22 cases) was set up from March to April 2020. All participants were hospitalized patients with laboratory-confirmed COVID-19. Their clinical data were collected from the Electronic Medical Record System (EMRS), Laboratory Information System (LIS), and Picture Archiving and Communication System (PACS). The study was approved by the Institutional Review Board of Beijing You'an Hospital.

Clinical Definitions
COVID-19 was diagnosed according to the guidelines for the diagnosis and treatment of coronavirus disease 2019 (COVID-19) recommended by the National Health Commission of China (China National Health Commission, 2020). A laboratory-confirmed patient was defined as one with a positive result on high-throughput sequencing or real-time reverse-transcriptase polymerase chain reaction (RT-PCR) assay of nasal and pharyngeal swab specimens. Severity was classified as mild infection or severe infection. Severe infection was defined as confirmed COVID-19 with at least one of the following conditions: respiratory distress with respiratory rate (RR) > 30/min; blood oxygen saturation < 93%; ratio of arterial oxygen partial pressure (PaO2) to fraction of inspired oxygen (FiO2) < 300 mmHg; respiratory failure requiring mechanical ventilation; shock; or other organ failure requiring intensive care in the ICU. The initial stage of COVID-19 infection was defined as the first week of infection, during which patients had only the common clinical characteristics, such as fever, cough, sputum production, vomiting, and diarrhea.

Treatment Procedure and End-Point of Observation
All patients received standard therapy according to the "Diagnosis and Treatment of Coronavirus Disease 2019" guidelines recommended by the National Health Commission of China (China National Health Commission, 2020). The observed end-point was defined as recovery or death within 28 days of hospitalization.

Statistical Analysis
Categorical data were compared using the Chi-square test; Fisher's exact test was used when the Chi-square approximation might not hold because of the relatively small sample size. Student's t-test was used to compare continuous values between the mild and severe infection groups when data were normally distributed (evaluated with the Kolmogorov-Smirnov test), and the non-parametric Mann-Whitney test was used when data were not normally distributed. Univariate and multivariate logistic regression analyses were performed on variables potentially associated with the severity of COVID-19 infection. The optimal cutoff values were calculated from the receiver operating characteristic curves and Youden's index. The predictive value of the scoring system was determined by the area under the curve (AUC). Differences were considered statistically significant if the P-values were <0.05. Analyses were performed with SPSS software v 25.5 (IBM, NY, USA).
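The cutoff selection and AUC described above can be illustrated with a minimal sketch. It assumes a biomarker for which higher values indicate severe disease; the function names `youden_cutoff` and `auc` and the toy values are hypothetical and are not taken from the study data or from SPSS. Youden's index is J = sensitivity + specificity - 1, and the AUC is computed here as the probability that a randomly chosen severe patient has a higher value than a randomly chosen mild patient (ties count one half).

```python
from typing import Sequence, Tuple


def youden_cutoff(values: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Return (cutoff, J) maximising J = sensitivity + specificity - 1.

    A patient is test-positive when value > cutoff; labels are 1 (severe)
    or 0 (mild). The first cutoff reaching the maximum J is returned.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if y == 1 and v > cutoff)
        tn = sum(1 for v, y in zip(values, labels) if y == 0 and v <= cutoff)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


def auc(values: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up illustrative values, NOT patient data from the study.
toy_values = [2.5, 3.1, 4.0, 4.4, 5.2, 5.8, 6.3, 6.9, 7.4, 8.0, 9.1]
toy_labels = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1]

cutoff, j = youden_cutoff(toy_values, toy_labels)
print(cutoff, round(j, 3), round(auc(toy_values, toy_labels), 3))
```

For a marker where lower values indicate risk (as with GFR in this study), the same sketch can be applied to the negated values.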

Clinical and Laboratory Characteristics of Discovery Cohort
In total, eighty hospitalized patients with laboratory-confirmed COVID-19 were recruited in the study, and all were divided into those with "mild" and "severe" disease according to the clinical definitions from the National Health Commission of China. Mild disease (n = 56) was defined as fever, respiratory symptoms, and pneumonia on imaging. Patients with severe disease (n = 24) were those who had the symptoms described above but deteriorated and developed respiratory distress or respiratory failure. Blood oxygen saturation was below 93% in all patients in the severe group (24/24) and in none of the 56 patients in the mild group. The ratio of arterial oxygen partial pressure (PaO2) to fraction of inspired oxygen (FiO2) was 223.5 ± 45.77 mmHg in the severe group, much lower than that in the mild group (466.7 ± 135.6 mmHg, p < 0.001). Seven patients in the severe group received intensive care in the ICU, six were mechanically ventilated, and among them three severely infected patients died. Demographic data are shown in Table 1.

Clinical Indicators Associated With the Severity of COVID-19 Infection
Demographic and clinical data were compared between the mild and severe groups. First, age was strongly associated with the severity of disease (45.34 ± 15.25 in the mild vs. 64.75 ± 14.76 in the severe group, p = 1.0E-06). Second, respiratory disease (p = 0.0067), cardiac disease (p = 0.0186), hypertension (p = 0.0011), and more than two comorbidities (p = 0.0024) were identified as factors associated with severity. Several biomarkers from the first laboratory test were also identified as potential factors related to the severity of the disease, including white blood cell count (4.15 ± 1.37 in the mild vs. 6.08 ± 2.02 in the severe group, p = 1.

Scoring System for Prediction of Disease Severity in COVID-19
The factors associated with severity of COVID-19 in Table 1 were analyzed by univariate and multivariate logistic regression analysis. Age, pre-existing conditions (cardiac disease, hypertension, and more than two comorbidities), and first laboratory test results (WBC, NEU, LYM%, NEU%, NLR, FIB, CRP, TBIL, ALB, GFR, CK-MB, Myoglobin, and Troponin) were identified as predictors of disease severity by univariate analysis. Amongst them, age, WBC, NEU, GFR, and Myoglobin were selected by multivariate analysis as components of the scoring system for prediction of disease severity in COVID-19 (Table 2). Each variable selected by multivariate analysis was assigned a score according to its hazard ratio (HR). Age above 59 years was assigned a score of 1; WBC above 6.09 and neutrophil count above 2.89 were each assigned a score of 2; GFR below 103.75 and myoglobin above 43 were each assigned a score of 1. Summing these items for each patient gave a total score ranging from 0 to 7. Individuals with scores of 0-4 were defined as being at low risk of severe disease, and those with scores of 5-7 as being at high risk (Table 3).
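The scoring rule described above can be written as a short function. This is a sketch under stated assumptions rather than the authors' implementation: the function and parameter names are hypothetical, and because the measurement units are not given in this section, the inputs are assumed to be in the same units as the published cutoffs.

```python
def severity_score(age: float, wbc: float, neu: float, gfr: float, myoglobin: float) -> int:
    """Sum the five items of the scoring system (total range 0-7).

    Cutoffs and weights are those stated in the text; inputs are assumed
    to use the same (unstated) units as the cutoffs.
    """
    score = 0
    if age > 59:          # age above 59 years: 1 point
        score += 1
    if wbc > 6.09:        # white blood cell count above 6.09: 2 points
        score += 2
    if neu > 2.89:        # neutrophil count above 2.89: 2 points
        score += 2
    if gfr < 103.75:      # glomerular filtration rate below 103.75: 1 point
        score += 1
    if myoglobin > 43:    # myoglobin above 43: 1 point
        score += 1
    return score


def risk_group(score: int) -> str:
    """Scores 0-4 are low risk and 5-7 are high risk."""
    if not 0 <= score <= 7:
        raise ValueError("score must be between 0 and 7")
    return "high" if score >= 5 else "low"


# Hypothetical patient, for illustration only.
s = severity_score(age=65, wbc=7.0, neu=3.0, gfr=110.0, myoglobin=20.0)
print(s, risk_group(s))  # 5 high
```

Because WBC and neutrophil count carry 2 points each, a patient reaches the high-risk band only if at least one of these two items is positive: the other three items together contribute at most 3 points.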

Predictive Value and Validation of Scoring System to the Severity of COVID-19
The scoring system was applied to the discovery cohort to calculate its predictive value. The percentages of ICU admission (20%, 6/30) and ventilation (16.7%, 5/30) in high-risk patients were much higher than those (2%, 1/50; 2%, 1/50) in low-risk patients (p = 0.009; p = 0.026). The accuracy of the scoring system in predicting severity was then evaluated: the AUC was 0.906 (Figure 1A), the sensitivity was 70.8%, and the specificity was 89.3%. The probability of patients in the high-risk group developing severe disease was 20.24 times that in the low-risk group (p = 1.0E-06, Table 4). In addition, another 22 patients with COVID-19 were recruited from March to April 2020 as the validation cohort. Amongst them, 18 patients were diagnosed with "mild" disease and 4 patients with "severe" disease. The variables of the scoring system, namely age, WBC, NEU, GFR, and Myoglobin, were collected, and the patients were divided into two groups (high risk vs. low risk) according to the scoring system. In this cohort the AUC was 0.958, the sensitivity was 100%, and the specificity was 88.9% (Figure 1B).
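For readers who want to see how sensitivity, specificity, and a between-group ratio relate to a 2x2 table, the following sketch may help. The counts used (17 true positives, 7 false negatives, 6 false positives, 50 true negatives) are an assumption: they are back-calculated from the reported 70.8% and 89.3% and the 24 severe and 56 mild patients, they are not stated in the text, and they do not reproduce the 30/50 high-risk/low-risk split reported in this section, so they should be treated as illustrative only. Reading the reported 20.24 as an odds ratio is likewise an assumption.

```python
from typing import Tuple


def diagnostic_summary(tp: int, fn: int, fp: int, tn: int) -> Tuple[float, float, float]:
    """Return (sensitivity, specificity, odds ratio) from 2x2 counts."""
    sensitivity = tp / (tp + fn)   # severe patients classified as high risk
    specificity = tn / (tn + fp)   # mild patients classified as low risk
    # The odds ratio is undefined when fn or fp is zero; report infinity.
    odds_ratio = (tp * tn) / (fn * fp) if fn * fp else float("inf")
    return sensitivity, specificity, odds_ratio


# Assumed counts, inferred from the reported percentages (not source data).
sens, spec, odds = diagnostic_summary(tp=17, fn=7, fp=6, tn=50)
print(f"{sens:.1%} {spec:.1%} {odds:.2f}")  # 70.8% 89.3% 20.24
```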

DISCUSSION
COVID-19 is a novel disease that has spread throughout the world and resulted in over seven thousand deaths worldwide in a few months. Most patients had mild symptoms, with only 6.1% of patients progressing to severe disease requiring admission to the ICU or the use of mechanical ventilation (Guan et al., 2020). There is an urgent need for a simple and precise tool to predict the development of severe COVID-19 infection at an early stage of disease (Wynants et al., 2020).
In the current study, we developed a novel scoring system that could help predict the severity of COVID-19 infection from patient characteristics and clinical parameters collected on the first day of presentation to hospital. Although several factors, for example, age and NLR (Wang et al., 2020; Yang et al., 2020; Zhou et al., 2020), have previously been reported to be associated with the incidence of severe illness, we are the first to use a scoring system to classify high and low risk of severity. We found that 63.33% of patients in the high-risk group developed severe infection, compared with only 10% of patients in the low-risk group, which indicated that the risk of severe disease in the high-risk group was about 20 times that in the low-risk group. This will help set up different strategies for the high- and low-risk groups, which is very important for governments managing limited medical resources and also useful for easing patients' anxiety.
FIGURE 1 | Predictive value and validation of scoring system to the severity of COVID-19. (A) Discovery cohort: AUC 0.906, sensitivity 70.8%, specificity 89.3%. (B) Validation cohort: AUC 0.958, sensitivity 100%, specificity 88.9%.
A second feature of this scoring system is that it covers the patient's condition from pre-existing conditions to presenting symptoms. We found that pre-existing conditions, including respiratory disease, cardiac disease, hypertension, and more than two comorbidities, are risk factors strongly associated with severity, although in the scoring system all of them were replaced by white blood cell count, absolute neutrophil count, glomerular filtration rate, and myoglobin, which itself indicates the importance of pre-existing conditions for the severity of COVID-19 infection. Amongst the five factors in the scoring system, age is the basic determinant of severity, on which there is consensus in recent studies of COVID-19, severe acute respiratory syndrome (SARS) (Chan et al., 2003), and Middle East respiratory syndrome (MERS) (Arabi et al., 2017). Moreover, several pre-existing conditions that are high-risk factors were reported by Gong et al. (2020). In this study, we also found that such pre-existing conditions, for example cardiac disease and hypertension, were strongly associated with severity, but they were excluded from the scoring system because they are age-dependent factors.
In this study, white blood cell count and absolute neutrophil count were selected as biomarkers to predict the progress of the disease. In line with previously published papers, our data also showed that the lymphocyte percentage decreased with disease severity, which indicates a direct result of viral infection (Dymond, 2018; Qin et al., 2019, 2020; Liu Z. et al., 2020). More interestingly, we also found that the higher the white blood cell count and absolute neutrophil count, the higher the risk of severe disease, which suggests that abnormal cross talk between the virus and the immune response at an early stage might affect the outcome of the disease (da Silva-Malta et al., 2017; Abd El-Kader and Al-Jiffri, 2018).
In addition, the biomarkers used in the scoring system are common and easily obtainable at an early stage of the disease (Havrilesky et al., 2008; Matthews et al., 2018). White blood cell count, absolute neutrophil count, GFR, and myoglobin are routine clinical tests in hospital and can be obtained on the first day of admission. The availability of these biomarkers indicates that this scoring system could be used in an outpatient setting to classify patients as being at high or low risk of severe disease so that they can receive different therapy strategies.
In conclusion, our data present a simple and precise scoring system to predict the probability of severe COVID-19 infection. Age, white blood cell count and pre-existing conditions could help calculate the score and further classify the risk of disease severity. Whilst the convenience of this scoring system is very important for current therapy during the COVID-19 pandemic, further validation in a large cohort is required.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Institutional Review Board of Beijing You'an Hospital. The patients/participants provided their written informed consent to participate in this study. Written informed consent was obtained from the individual(s) for the publication of any potentially identifiable images or data included in this article.