A Risk Prediction Model for Evaluating the Disease Progression of COVID-19 Pneumonia

Background and Objective: The epidemic of coronavirus disease 2019 (COVID-19) pneumonia, caused by infection with severe acute respiratory syndrome coronavirus 2 (SARS-CoV2), has spread from China throughout the world. This study aims to estimate the risk of disease progression in patients with confirmed COVID-19. Methods: A meta-analysis of the existing literature was performed to identify risk factors associated with COVID-19 pneumonia progression. Patients with COVID-19 pneumonia who were admitted to hospitals in Wuhan or Hangzhou were retrospectively enrolled. The risk prediction model and nomogram were developed from the Wuhan cohort by logistic regression and then validated in the Hangzhou and Yinchuan cohorts. Results: A total of 270 patients admitted to hospital between Dec 30, 2019, and Mar 30, 2020, were retrospectively enrolled (Table 1). The development cohort (Wuhan cohort) included 87 (43%) men and 115 (57%) women, and the median age was 53 years. The Hangzhou validation cohort included 20 (48%) men and 22 (52%) women, and the median age was 59 years. The Yinchuan validation cohort included 12 (46%) men and 14 (54%) women, and the median age was 44 years. The meta-analysis, together with univariate logistic analysis in the development cohort, showed that age, fever, diabetes, hypertension, CREA, BUN, CK, LDH, and neutrophil count were significantly associated with disease progression of COVID-19 pneumonia. The model and nomogram derived from the development cohort performed well in both the development and validation cohorts. Conclusion: Severe COVID-19 pneumonia is associated with several types of risk factors, including age, fever, comorbidities, and some laboratory indexes. A model that integrates these factors can help to evaluate the disease progression of COVID-19 pneumonia.


INTRODUCTION
In December 2019, China reported a cluster of cases of pneumonia of unknown cause in Wuhan, Hubei (1). On Jan 7, 2020, Chinese health authorities confirmed that these cases were associated with a novel coronavirus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV2; previously called 2019-nCoV), via next-generation sequencing analysis of patients' respiratory tract samples (2,3). SARS-CoV2 is the seventh member of the Coronaviridae found to infect humans, and it has been shown to infect human cells by interacting with the angiotensin-converting enzyme 2 (ACE2) receptor on the cell surface (4). The ACE2 receptor is widely distributed in various types of human cells, including type II alveolar cells, renal tubular cells, and Leydig cells (5). Thus, SARS-CoV2 possesses a strong ability to infect humans.
Most of the original cases of coronavirus disease 2019 (COVID-19) pneumonia were reported to have been exposed to the Huanan seafood market in Wuhan (6). However, medical and nursing staff, as well as patients without exposure to the market but with a history of travel to Wuhan, have been found to be infected by SARS-CoV2, suggesting that human-to-human transmission is occurring (7,8). The number of diagnosed cases has been increasing rapidly: by March 27, 2020, more than 500,000 cases of COVID-19 pneumonia had been reported in China and other countries worldwide (including Japan, South Korea, Spain, Italy, the UK, and the USA), and over 23,000 patients had died, equivalent to a case fatality rate of around 4% (9). Epidemic prevention is becoming increasingly challenging.
As in respiratory diseases caused by other betacoronaviruses, such as severe acute respiratory syndrome (SARS) and Middle East respiratory syndrome (MERS) (10,11), patients with both mild and severe COVID-19 pneumonia showed fever, dry cough, and dyspnea (6). Furthermore, patients with severe COVID-19 pneumonia were more likely to progress to acute respiratory distress syndrome (ARDS), which had a relatively higher case fatality rate (12). However, few studies have evaluated the risk of disease progression of COVID-19 pneumonia.
In this study, we performed a meta-analysis of 6,061 cases of COVID-19 from 32 studies. The results showed that severe COVID-19 pneumonia was clearly correlated with severe complications, including ARDS, shock, acute kidney injury, and acute cardiac injury, and that comorbidities significantly increased the risk of progressive COVID-19. In addition, a total of 270 COVID-19 pneumonia patients were collected from Wuhan, Hangzhou, and Yinchuan. A predictive model and nomogram were then established in the Wuhan cohort, based on previously identified risk factors including age, fever, comorbidities, CK, LDH, CREA, BUN, and neutrophil count, to predict the risk of disease progression. A nomogram is a statistical instrument that accounts for numerous variables to predict an outcome of interest (13). The nomogram performed well in predicting the probability of severe COVID-19 pneumonia, which was further validated in two independent cohorts.

Meta-Analyses
Thirty-two eligible studies were analyzed by meta-analysis. Odds ratios (ORs), risk ratios (RRs), standardized mean differences (SMDs), 95% confidence intervals (CIs), and forest plots were calculated with Stata 12.0. Specific analysis procedures, including the search strategy, inclusion and exclusion criteria, data extraction, quality evaluation, and statistics, are provided in Supplementary Texts 1, 2.
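The pooling step behind a forest plot can be illustrated with a minimal sketch. It assumes a fixed-effect, inverse-variance model on the log odds ratio scale; the text does not state which pooling model was used in Stata, and the function name and the per-study numbers below are hypothetical.

```python
import math

def pooled_odds_ratio(studies):
    """Fixed-effect (inverse-variance) pooling of per-study odds ratios.

    Each study is a tuple (odds_ratio, ci_lower, ci_upper) with a 95% CI.
    Returns (pooled_or, pooled_lower, pooled_upper).
    """
    z = 1.96  # normal quantile for a 95% interval
    weighted_sum = 0.0
    weight_total = 0.0
    for odds_ratio, lower, upper in studies:
        log_or = math.log(odds_ratio)
        # Recover the standard error of log(OR) from the width of the CI.
        se = (math.log(upper) - math.log(lower)) / (2 * z)
        weight = 1.0 / se ** 2
        weighted_sum += weight * log_or
        weight_total += weight
    pooled_log = weighted_sum / weight_total
    pooled_se = math.sqrt(1.0 / weight_total)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - z * pooled_se),
        math.exp(pooled_log + z * pooled_se),
    )

# Hypothetical per-study results, for illustration only.
example_studies = [(2.0, 1.0, 4.0), (3.0, 1.5, 6.0), (2.5, 1.25, 5.0)]
pooled, low, high = pooled_odds_ratio(example_studies)
print(f"pooled OR = {pooled:.2f} (95% CI {low:.2f}-{high:.2f})")
```

A random-effects model would add a between-study variance term to each weight; that variant is not shown here.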

Study Design and Participants
This was a retrospective study conducted at centers in Wuhan, Hangzhou, and Yinchuan. Patients with confirmed COVID-19 pneumonia who were admitted to Zhongnan Hospital of Wuhan University, the First People's Hospital of Xiaoshan District, or the Fourth People's Hospital of Ningxia Hui Autonomous Region were retrospectively enrolled (Table 1). Next-generation sequencing or real-time PCR of throat swab specimens was used to confirm the SARS-CoV2 infection of each patient according to a previously published protocol (14). The following primers or probe targeting the envelope gene of SARS-CoV2 were used: Forward primer: 5′-GACCCCAAAATCAGCGAAAT-3′; Reverse primer: […]. Patients with a respiratory rate over 30 per min, SpO2 <93%, or PaO2/FiO2 <300 mmHg were considered severe cases (15). The clinical characteristics, laboratory findings, and comorbidities of the patients were recorded at the time of admission to hospital. The study was approved by the Ethics Committee of the Zhongnan Hospital of Wuhan University, the First People's Hospital of Xiaoshan District, and the Fourth People's Hospital of Ningxia Hui Autonomous Region under accession number 2020035.
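The severity criteria stated above can be written as a simple rule. The following is a minimal sketch; the function and parameter names are hypothetical, and treating a missing measurement as "criterion not met" is an assumption made only for this illustration.

```python
from typing import Optional

def is_severe(
    respiratory_rate: Optional[float] = None,  # breaths per minute
    spo2_percent: Optional[float] = None,      # SpO2, in percent
    pao2_fio2_mmhg: Optional[float] = None,    # PaO2/FiO2, in mmHg
) -> bool:
    """Return True if any available measurement meets a severity criterion."""
    if respiratory_rate is not None and respiratory_rate > 30:
        return True
    if spo2_percent is not None and spo2_percent < 93:
        return True
    if pao2_fio2_mmhg is not None and pao2_fio2_mmhg < 300:
        return True
    return False

# Hypothetical patient values, for illustration only.
print(is_severe(respiratory_rate=32, spo2_percent=97, pao2_fio2_mmhg=400))
print(is_severe(respiratory_rate=20, spo2_percent=97, pao2_fio2_mmhg=350))
```

The thresholds are strict inequalities, so a patient exactly at 30 breaths per min, 93% SpO2, or 300 mmHg is not classified as severe by this rule.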

Statistical Analysis
Normality of the data was evaluated using the Kolmogorov-Smirnov test. Normally distributed data are presented as mean ± SD, non-normally distributed data as median (IQR), and categorical variables as frequency (%). Differences between two groups were analyzed by Fisher's exact test (for categorical variables) or the Mann-Whitney U test (for continuous variables). The odds ratios (ORs) and corresponding 95% CIs were calculated using univariate or multivariate logistic regression.
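For a single binary risk factor, the estimate from a univariate logistic regression coincides with the odds ratio of the corresponding 2×2 table, which gives a quick way to check such results by hand. The sketch below uses the Woolf (log) method for the 95% CI, which is an assumption about the interval type; the function name and counts are hypothetical.

```python
import math

def odds_ratio_with_ci(exposed_severe, exposed_mild, unexposed_severe, unexposed_mild):
    """Odds ratio and Woolf 95% CI for a 2x2 table of exposure vs. severity.

    All four cell counts must be positive.
    """
    a, b, c, d = exposed_severe, exposed_mild, unexposed_severe, unexposed_mild
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, for illustration only.
or_value, ci_low, ci_high = odds_ratio_with_ci(10, 20, 5, 40)
print(f"OR = {or_value:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```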

Model Development and Validation
For the development of the nomogram, we tested the significance of potential risk factors by univariate logistic regression and further filtered them by meta-analysis. As a result, eight important predictors of the risk of severe COVID-19 pneumonia were used to create the model via multivariate logistic regression. The model produces a linear predictor, which represents a patient's predicted log odds of severe disease. We tested the accuracy of the model and of the nomogram derived from the development cohort in the validation cohorts by discrimination, receiver operating characteristic (ROC) analysis, and calibration curves. The nomogram converts the risk of each variable into a points system, and the points can be added to produce an overall risk estimate.
Discrimination is evaluated by the c-statistic, which measures the capability of the model to predict a high risk for a patient who is considered high risk. The closer the c-statistic is to one, the better the discrimination; a value of 0.5 indicates that the model is no better than chance. The calibration curve of the model measures the relationship between the outcomes predicted by the model and the actual outcomes in the indicated cohort. A 45° line indicates perfect calibration, in which the outcome predicted by the model perfectly matches the patient's observed outcome. Any deviation above or below the 45° line indicates underprediction or overprediction, respectively. The ROC analysis is summarized by the area under the curve (AUC), which reflects the sensitivity and specificity of the model. The closer the AUC is to one, the more sensitive and specific the model. All statistical analyses were performed using R (version 3.4.4).
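The step from the linear predictor of a logistic model to a predicted probability can be sketched as follows. The intercept, coefficients, and variable names are hypothetical placeholders chosen for illustration; they are not the fitted values of the model described here.

```python
import math

# Hypothetical intercept and coefficients, for illustration only.
INTERCEPT = -4.0
COEFFICIENTS = {
    "age_years": 0.03,
    "fever": 0.8,          # 1 if fever is present, 0 otherwise
    "ldh_u_per_l": 0.005,
}

def linear_predictor(patient):
    """Intercept plus the coefficient-weighted sum of predictor values."""
    return INTERCEPT + sum(
        beta * patient[name] for name, beta in COEFFICIENTS.items()
    )

def predicted_probability(patient):
    """Apply the logistic function to the linear predictor."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(patient)))

# Hypothetical patient, for illustration only.
patient = {"age_years": 60, "fever": 1, "ldh_u_per_l": 280}
print(f"predicted probability = {predicted_probability(patient):.3f}")
```

Each coefficient is a log odds ratio, so exponentiating it gives the multiplicative change in the odds of severe disease per unit increase in that predictor.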
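The c-statistic described above can be computed directly from its pairwise definition. The sketch below counts, over all pairs of one severe and one non-severe patient, how often the severe patient received the higher predicted risk, with ties counted as one half; for a binary outcome this quantity equals the AUC of the ROC curve. The function name and sample values are hypothetical.

```python
def c_statistic(predicted, observed):
    """Concordance statistic for a binary outcome.

    predicted: predicted risks, one per patient.
    observed: outcomes coded 1 (severe) or 0 (not severe).
    """
    positives = [p for p, y in zip(predicted, observed) if y == 1]
    negatives = [p for p, y in zip(predicted, observed) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one patient with each outcome")
    concordant = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                concordant += 1.0
            elif pos == neg:
                concordant += 0.5  # ties count as half-concordant
    return concordant / (len(positives) * len(negatives))

# Hypothetical predictions and outcomes, for illustration only.
risks = [0.9, 0.7, 0.4, 0.6, 0.2]
outcomes = [1, 1, 1, 0, 0]
print(f"c-statistic = {c_statistic(risks, outcomes):.3f}")
```

The double loop is quadratic in cohort size, which is acceptable for cohorts of a few hundred patients; a rank-based formula would be preferable for much larger data.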

RESULTS
According to our search strategy, a total of 142 studies were collected from the online database. After applying the inclusion and exclusion criteria, we retained 32 studies for meta-analysis (Supplementary Figure 1), and no publication bias was detected (Supplementary Figure 2). […] 18; P < 0.001) occurred frequently in patients with severe COVID-19 pneumonia compared with ordinary cases (Figure 1). Death (risk ratio [RR] = 30.09, 95% CI: 11.46-79.01; P < 0.001) was more common in patients with disease progression of COVID-19 pneumonia. The detailed meta-analysis results are summarized in Supplementary Table 1.
To investigate the variables associated with severe COVID-19 pneumonia, we analyzed the relationship between patients' clinical characteristics and the risk of severe COVID-19 pneumonia through meta-analysis. Supplementary Table 2 […] 10.54, P < 0.001), and pulmonary disease (OR = 4.17, 95% CI: 2.86-6.08, P < 0.001) may be closely related to the risk that patients infected with SARS-CoV2 progress to severe COVID-19 pneumonia (Figure 2). In addition, different clinical symptoms, including cough, fatigue, fever, and muscular soreness, as well as CT performance, were also correlated with severe COVID-19 pneumonia. These symptoms may occur frequently in patients with progressive COVID-19 pneumonia. Next, we analyzed the relationship between patients' laboratory findings and the risk of severe COVID-19 pneumonia. The results showed that hepatic function indexes (albumin, ALT, AST), renal function indexes (BUN, CREA), cardiac function indexes (CK, CK-MB, LDH, cTnI, MYO), coagulation function indexes (D-dimer, PLT, PT), blood routine indexes (Hb, lymphocyte, neutrophil, CD4 lymphocyte, CD8 lymphocyte), inflammation indexes (WBC, CRP, PCT, ESR, ferritin), and the electrolyte index Na+ were markedly altered in patients with severe COVID-19 pneumonia compared with patients with mild COVID-19 pneumonia. Moreover, Supplementary Table 3 showed that blood routine […] Table 2). We further analyzed the potential risk factors identified by meta-analysis using univariate logistic regression (Supplementary Figure 5).
To evaluate the probability that a patient has severe COVID-19 pneumonia, we developed a model for predicting the risk of disease progression using multivariate logistic regression with the above variables in the Wuhan cohort. The final model contained nine investigated variables (age, fever, hypertension, diabetes, BUN, CREA, CK, LDH, and neutrophil count) and one interaction term. The model for predicting the risk of severe pneumonia had a c-statistic of 0.86, with a good calibration curve (Figure 3). The ROC analysis in the Wuhan cohort revealed that the model had high sensitivity and specificity (Figure 3). Moreover, we collected two further independent cohorts of patients from Hangzhou and Yinchuan as validation cohorts to evaluate the performance of the model. The Hangzhou validation cohort consisted of 42 patients, of whom 16 (38%) were diagnosed with severe COVID-19 pneumonia and 26 with mild COVID-19 pneumonia (Table 1). In this cohort, the model had a c-statistic of 0.93, with a good calibration curve (Figure 3), and an AUC of 0.928, which indicated that it also performed well in the validation cohort (Figure 3). The Yinchuan validation cohort consisted of 26 patients, of whom 6 (23%) were diagnosed with severe COVID-19 pneumonia (Table 1). In this cohort, the model had a c-statistic of 0.90, with a good calibration curve (Figure 3), and an AUC of 0.908, which indicated that it also performed well in the validation cohort (Figure 3).
To guard against overfitting, we performed a correlation analysis of the laboratory parameters used in our model. The CREA level was significantly correlated with the BUN level in COVID-19 patients (R = 0.77, P < 0.001; Supplementary Figure 6). We therefore tested an interaction between these two parameters in our model. Although the AUC of the model increased to 0.874 in the development cohort (Supplementary Figure 6), it dropped to 0.837 and 0.900 in the Hangzhou and Yinchuan validation cohorts, respectively (Supplementary Figure 6). Therefore, we used all the variables to generate the model without interactions.
We created the nomogram to predict the probability of severe COVID-19 pneumonia (Figure 4). The line labeled Points is used to read off the points assigned to each of the nine risk factors. The subsequent lines (lines 2-10 in Figure 4) are the risk factors involved in this model. The value of each variable is located on its line, and a vertical line is drawn up to the Points line to find the points associated with that value. All of the points are summed, and the total is located on the Total points line. From the total points, we can obtain the probability of severe COVID-19 pneumonia for each patient.
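The point-reading procedure above is a graphical form of the logistic model, which a short sketch can make explicit. It assumes the common convention that the predictor with the largest effect over its axis range spans 0 to 100 points, and that all coefficients are positive; the intercept, coefficients, axis ranges, and names are hypothetical and are not taken from Figure 4.

```python
import math

# Hypothetical model: name -> (coefficient, axis minimum, axis maximum).
INTERCEPT = -6.0
PREDICTORS = {
    "age_years": (0.04, 20.0, 90.0),
    "fever": (0.9, 0.0, 1.0),
    "neutrophil_1e9_per_l": (0.15, 0.0, 20.0),
}

# The predictor with the largest effect over its range defines 100 points.
MAX_EFFECT = max(beta * (hi - lo) for beta, lo, hi in PREDICTORS.values())

def points_for(name, value):
    """Points read off the Points line for one predictor value."""
    beta, lo, _ = PREDICTORS[name]
    return 100.0 * beta * (value - lo) / MAX_EFFECT

def total_points(patient):
    """Sum of the points over all predictors of one patient."""
    return sum(points_for(name, value) for name, value in patient.items())

def probability_from_points(points):
    """Map total points back to the logistic scale and to a probability."""
    baseline = INTERCEPT + sum(beta * lo for beta, lo, _ in PREDICTORS.values())
    lp = baseline + points * MAX_EFFECT / 100.0
    return 1.0 / (1.0 + math.exp(-lp))

# Hypothetical patient, for illustration only.
patient = {"age_years": 60.0, "fever": 1.0, "neutrophil_1e9_per_l": 5.0}
points = total_points(patient)
print(f"total points = {points:.1f}, risk = {probability_from_points(points):.3f}")
```

Because the points are a linear rescaling of each term of the linear predictor, the probability obtained from the total points is the same as the one obtained by evaluating the logistic model directly.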

DISCUSSION
The Coronaviridae are a family of enveloped, non-segmented, positive-stranded RNA viruses that are widely distributed in humans and other mammals such as bats, masked palm civets, and pangolins (16,17). Six of these coronaviruses were previously found to infect humans, and four of them cause only mild respiratory symptoms similar to the common cold. However, MERS-CoV and SARS-CoV, two beta-coronaviruses, infected more than 1,000 and 8,000 patients, respectively, with high case fatality rates (37% for MERS-CoV and 10% for SARS-CoV) (18,19). The novel coronavirus SARS-CoV2 is the seventh member of the Coronaviridae family found to infect human beings (20). Though the case fatality rate of SARS-CoV2 so far is lower than that of SARS-CoV or MERS-CoV disease, SARS-CoV2 is contagious in humans and is the cause of the ongoing pandemic, which has been designated a Public Health Emergency of International Concern by the World Health Organization (WHO) (21).
In our study, we used a meta-analysis to identify variables relevant to COVID-19 pneumonia progression. Severe COVID-19 pneumonia was clearly correlated with severe complications, including ARDS (OR = 49.03), shock (OR = 45.48), acute kidney injury (OR = 24.82), and acute cardiac injury (OR = 37.93). Progressive COVID-19 also increased the risk of death (RR = 30.09). Previous studies have implied that SARS-CoV2 infection may cause multi-organ injuries, and our results support this conclusion with evidence-based analysis (22,23). Moreover, age and comorbidities such as hypertension, diabetes, cancers, heart diseases, pulmonary diseases, kidney diseases, and cerebrovascular diseases can significantly increase the risk of COVID-19 progression. Meanwhile, we also found that clinical symptoms might not contribute to the disease progression of COVID-19 pneumonia. Laboratory findings, especially the renal function indexes (CREA, BUN), the myocardial enzyme profile (CK, LDH), and neutrophil counts, were consistently higher in severe COVID-19 patients than in ordinary patients.
Age and comorbidities were important potential risk factors in progressive COVID-19 pneumonia (24). Older patients may progress to severe COVID-19 pneumonia because of weakened immunity and comorbidities. Laboratory examinations can reflect the status of different organs. Through meta-analysis, we observed that a range of indicators representing the functions of different organs had changed in COVID-19 patients, especially in severe COVID-19 patients, suggesting the existence of multi-organ dysfunction in patients with severe COVID-19 pneumonia. Organ injuries and progressive COVID-19 pneumonia may be reciprocally related: progressive COVID-19 causes organ injuries, and indexes of organ injury can in turn help us to monitor the risk of COVID-19 pneumonia progression. Evaluating disease progression in COVID-19 patients on the basis of a single index might be misleading. However, there are no explicit standards or risk models to assist diagnosis. Thus, we collected a total of 270 COVID-19 patients to establish a model for evaluating the risk of COVID-19 pneumonia progression.
In our development cohort, concentrations of C-reactive protein were increased in most patients, similar to observations in previous beta-coronavirus infections (10,11). In the subgroup of patients with severe COVID-19 pneumonia, concentrations of C-reactive protein (74.3 mg/L) were higher than those in mild patients. The concentration of albumin was significantly reduced in the severe group, which might lead to hypoalbuminemia. This finding is similar to those of a previous study of patients with H1N1 infection (25). Furthermore, renal function indexes (CREA and BUN), the myocardial enzyme profile (CK and LDH), and liver function indexes (ALT and AST) were elevated in patients with severe COVID-19 pneumonia. Previous studies based on biopsy samples also showed that severe COVID-19 pneumonia could cause kidney dysfunction (26). Thus, our findings indicate that patients with severe COVID-19 pneumonia may suffer from multiple organ dysfunction.
Several variables have been documented to be associated with severe COVID-19 pneumonia, such as cancer, hypertension, and heart diseases (24). However, no study has integrated these risk factors to predict the probability of severe COVID-19 pneumonia. First, we screened the predictive variables for severe COVID-19 pneumonia using the 202 patients from the Wuhan cohort. Eight risk factors were found to be significantly correlated with severe COVID-19 pneumonia, and we subsequently created a model and a nomogram based on these risk factors by multivariate logistic regression. To validate the model and nomogram, we collected two further cohorts of COVID-19 pneumonia patients from Hangzhou and Yinchuan. The calibration, discrimination, and ROC analyses together show that the model performs well in two independent validation cohorts (Figure 3).
Our prediction model and nomogram can help doctors to evaluate the risk of disease progression of COVID-19 pneumonia patients on admission to hospital. The summed nomogram points indicate the risk of each patient and sensitively reflect multi-organ dysfunction, and the risk score might be related to the outcome of the patient. Accurate risk evaluation may also assist therapeutic decision-making for different patients. According to previous reports, steroid drugs may bring no benefit for patients with mild COVID-19 pneumonia, indicating that patients at low risk may not need steroid treatment (27,28). In addition, disease management decisions for COVID-19 pneumonia are complex and depend on various factors, including the patient's baseline disease burden and overall clinical picture, not solely on the risk of severe COVID-19 pneumonia. Thus, our model and nomogram can assist doctors by providing an objective and quantifiable estimate of the probability of progression to severe COVID-19 pneumonia.
Our study still has several limitations. First, our nomogram does not include other important predictive variables, including heart diseases, chronic obstructive pulmonary disease (COPD), and CT images, because of the relatively small cohort population. Second, multi-organ dysfunction in patients with severe COVID-19 pneumonia was evaluated by laboratory examination of the indicated indexes, which needs further pathological validation with biopsy specimens. Third, follow-up data were not available for the majority of patients, so we could not monitor the change in patients' risk over the whole course of illness. Finally, although we have validated our nomogram in two independent cohorts, large-scale cohort validation and long-term follow-up are still needed to confirm our findings. Therefore, our nomogram may be improved as additional predictive variables are incorporated, and it still needs follow-up data to accurately monitor the disease progression of COVID-19 pneumonia.

CONCLUSION
In conclusion, our study summarized the existing literature (published or preprinted) on the factors of disease progression of COVID-19 pneumonia. Severe COVID-19 pneumonia is significantly correlated with severe complications such as ARDS, shock, acute kidney injury, and acute cardiac injury, and comorbidities significantly increased the risk of progressive COVID-19 pneumonia. In addition, a model and a nomogram for predicting the risk of severe COVID-19 pneumonia, based on previously identified risk factors including age, fever, comorbidities, CK, LDH, CREA, BUN, and neutrophil count, were established in a development cohort and further validated in two independent cohorts.

DATA AVAILABILITY STATEMENT
All datasets presented in this study are included in the article/supplementary material.

ETHICS STATEMENT
The study was approved by the Ethics Committee of the Zhongnan Hospital of Wuhan University and the First People's Hospital of Xiaoshan District under accession number 2020035. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
GC, PL, and YC contributed equally to this work. QS and MG had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis. GC, MG, RZ, and QS: concept and design. GC, PL, KF, YC, SW, BC, XF, and QS: acquisition, analysis, or interpretation of data. GC, PL, and QS: drafting of the manuscript. MX and MG: critical revision of the manuscript for important intellectual content. GC and QS: statistical analysis. RZ and YC: administrative, technical, or material support. MX, RZ, MG, and QS: supervision. MG: obtained funding. All authors contributed to the article and approved the submitted version.

FUNDING
This study was supported by funding from the Natural Science Foundation of Jiangsu Province (MG, No. BK20171183).