Performance of Three Mortality Prediction Scores and Evaluation of Important Determinants in Eight Pediatric Intensive Care Units in China

Background: Mortality prediction scores are widely used in pediatric intensive care units (PICUs). However, their performance in Chinese patients is unclear, and no reports based on large samples in China are available. This study aimed to evaluate the performance of three existing severity assessment scores in predicting PICU mortality and to identify important determinants.
Methods: This prospective observational cohort study was carried out in eight multidisciplinary, tertiary-care PICUs of teaching hospitals in China. All eligible patients admitted to the PICUs between Aug 1, 2016, and Jul 31, 2017, were consecutively enrolled, among whom 3,957 were included in the analysis. We calculated PCIS, PRISM IV, and PELOD-2 scores from patient data collected in the first 24 h after PICU admission. In-hospital mortality was defined as all-cause death within 3 months after admission. Discrimination was assessed using the area under the receiver operating characteristic curve (AUC), and calibration was assessed using the Hosmer–Lemeshow goodness-of-fit test.
Results: A total of 4,770 eligible patients were recruited (median age 18.2 months, overall mortality rate 4.7%, median length of PICU stay 6 days), and 3,957 participants were included in the analysis. The AUCs (95% confidence interval, CI) were 0.74 (0.71–0.78), 0.76 (0.73–0.80), and 0.80 (0.77–0.83) for PCIS, PRISM IV, and PELOD-2, respectively. The Hosmer–Lemeshow test gave a chi-square of 3.16 for PCIS, 2.16 for PRISM IV, and 4.81 for PELOD-2 (p ≥ 0.19). Cox regression identified five predictors from the score items that were more strongly associated with death risk, with a C-index of 0.83 (95% CI 0.79–0.86): higher platelet (HR = 1.85, 95% CI 1.59–2.16), invasive ventilation (HR = 1.40, 95% CI 1.26–1.55), and pupillary light reflex (HR = 1.31, 95% CI 1.22–1.42) scores, lower pH (HR = 0.89, 95% CI 0.84–0.94), and extreme PaO2 (HR = 2.60, 95% CI 1.61–4.19 for the 1st vs. 4th quantile) scores.
Conclusions: The three scores performed comparably in predicting PICU mortality, and five predictors with better prediction of PICU mortality in Chinese patients were identified.


INTRODUCTION
Patients in pediatric intensive care units (PICUs) have a high risk of death. The PICU mortality rate in China is two to three times that of developed countries in America and Europe (1)(2)(3)(4). Identifying predictors or determinants of death in the PICU is therefore important. Since PICUs were first established, critical care researchers have continually explored scores for predicting the risk of death. At present, the most widely used scores in PICUs are PRISM III/IV (5,6), PIM3 (7), and PELOD-2 (8), but their performance has not been reported or compared in large samples of Chinese PICU patients. The Pediatric Critical Illness Score (PCIS) (9) has been commonly used in China for the severity assessment of PICU patients. It was established in 1995 based on Chinese experts' experience and is available only in a Chinese version for domestic use (a translated PCIS scale is provided in Supplementary Table 1).
The performance of PRISM and PIM in PICU patients was assessed earlier in Hong Kong, showing good predictive accuracy with an area under the receiver operating characteristic curve (AUC) over 0.9 (10,11). However, studies of patients in mainland China are limited, and most are based on single-center samples (12)(13)(14), partly because the data and facilities required by the international scores are often unavailable in China. Compared with reports in American and European patients, these studies showed less ideal performance in predicting PICU mortality in Chinese patients; the AUCs were 0.73-0.83 for PRISM, 0.72-0.75 for PIM, 0.77 for PELOD-2, and 0.64 for PCIS. How the scores perform in a larger number of Chinese patients is unknown. The aim of the current study is to evaluate the performance of PCIS, PRISM IV, and PELOD-2 in predicting PICU mortality in a large multicenter cohort of Chinese patients, and to explore whether a smaller number of important determinants of mortality, which are more easily acquired in most PICUs in China, can be identified.

Study Design and Participants
This was a multicenter prospective observational cohort study including eight PICUs of tertiary teaching hospitals in China with similar organizations, staffing structures, and management protocols: four in Shanghai, two in Jiangsu Province, and two in Zhejiang Province. These eight hospitals are located in the prosperous Yangtze River Delta region and represent PICUs above the medium level in China. Patients who were admitted to these PICUs between Aug 1, 2016, and Jul 31, 2017, and met the following criteria were eligible and recruited consecutively: a) age over 28 days and below 18 years on admission (consistent with the PICU admission criteria in China) and b) a PICU stay of more than 4 h (to minimize missing data). Readmissions to the PICU during the same hospitalization were recorded as two admissions. No interventions or procedures beyond routine clinical practice were implemented. This study was reviewed and approved by the Institutional Review Board (IRB) at the Children's Hospital of Fudan University. Guardians of all participants were informed and signed a routine consent form on admission that covered the future use of their data for research purposes. No consent form specific to this study was signed. The study protocol was registered at ClinicalTrials.gov (NCT02961153).
Abbreviations: PCIS, pediatric critical illness score; PELOD-2, Pediatric Logistic Organ Dysfunction-2; PICU, pediatric intensive care unit; PIM, pediatric index of mortality; PRISM, pediatric risk of mortality.
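The two eligibility criteria stated above can be expressed as a simple screening rule. The following is a minimal sketch only; the function name, argument names, and date handling are assumptions made for illustration and are not taken from the study's case report form or database.

```python
from datetime import date

MIN_AGE_DAYS = 28     # criterion a): older than 28 days on admission
MAX_AGE_YEARS = 18    # criterion a): younger than 18 years on admission
MIN_STAY_HOURS = 4    # criterion b): PICU stay longer than 4 h


def is_eligible(birth_date: date, admission_date: date, picu_stay_hours: float) -> bool:
    """Return True when an admission meets both stated eligibility criteria."""
    age_days = (admission_date - birth_date).days
    try:
        eighteenth_birthday = birth_date.replace(year=birth_date.year + MAX_AGE_YEARS)
    except ValueError:
        # Born on 29 Feb and the target year is not a leap year: use 28 Feb.
        eighteenth_birthday = birth_date.replace(
            year=birth_date.year + MAX_AGE_YEARS, day=28
        )
    old_enough = age_days > MIN_AGE_DAYS
    young_enough = admission_date < eighteenth_birthday
    long_enough = picu_stay_hours > MIN_STAY_HOURS
    return old_enough and young_enough and long_enough
```

Because readmissions during the same hospitalization were recorded as separate admissions, a rule of this kind would be applied per admission rather than per patient.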

Data Collection
A uniform case report form was developed for prospective data collection, covering the hospital facilities of each PICU and the demographic, clinical, laboratory, and therapeutic data of patients. Demographic data included age, gender, date of birth, and payment type. Clinical data included admission diagnosis classified by the system of primary dysfunction based on the reason for admission (International Classification of Diseases, ICD-9); etiology of disease (infections, poisonings, accidents, immunity, tumors, congenital malformations, metabolic disorders, and others); admission source (general wards, emergency departments, outpatient clinics, operating rooms, or transfer from another hospital); status at the time of admission [cardiopulmonary resuscitation performed 24 h before PICU admission, previous PICU hospitalization associated with this admission, 24 h after surgery but not including postanesthetic recovery patients (who were admitted to the PICU because of difficulty in obtaining beds), invasive ventilator support, and vasopressor support]; underlying diseases (chronic health status in the 3 months before PICU admission); dynamic vital signs (temperature, heart rate, respiratory rate, systolic and diastolic blood pressures) from the time of admission to 24 h after admission; and pupillary reactions and Glasgow Coma Score (recorded only for patients with central nervous system diseases).
Laboratory data included blood gases (pH, PCO2, bicarbonate, total CO2, arterial PaO2, lactate), chemistry tests (alanine aminotransferase, aspartate aminotransferase, total and direct bilirubin, albumin, creatinine, urea nitrogen, glucose, serum potassium, and serum sodium), and hematology tests (white blood cell count, hemoglobin, platelet count, prothrombin time or partial thromboplastin time, international normalized ratio, and fibrinogen). PaO2 data were obtained from ventilated patients (∼24%) and other patients who had an arterial blood gas measurement. A small proportion of PaO2 values came from venous or capillary samples (∼10% of included patients) because arterial puncture failed. PaO2 data were analyzed uniformly, and some assumptions were made.
Therapeutic data included the fraction of inspired oxygen and whether the patient was treated with invasive ventilator support, vasopressor support (dopamine, dobutamine, epinephrine, or norepinephrine), or continuous blood purification within the first 24 h of admission. Items were measured only if the doctor considered it appropriate. If an item was not measured, its value was assumed to be normal or identical to the previous measurement. PRISM IV, PELOD-2, and PCIS were calculated as reported.
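The rule for unmeasured items (assume a normal value, or reuse the previous measurement) can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name is hypothetical, and the "normal" value passed in is a placeholder chosen by the caller, not a reference value reported by the study.

```python
from typing import List, Optional


def fill_unmeasured(series: List[Optional[float]], normal_value: float) -> List[float]:
    """Fill unmeasured entries (None) in a time-ordered series.

    An unmeasured entry takes the most recent earlier measurement if one
    exists; otherwise it takes the caller-supplied normal value.
    """
    filled: List[float] = []
    last_measured: Optional[float] = None
    for value in series:
        if value is not None:
            last_measured = value
        filled.append(last_measured if last_measured is not None else normal_value)
    return filled


# Hypothetical pH readings over the first 24 h; 7.40 is an assumed placeholder.
ph_readings = [None, 7.25, None, 7.31, None]
print(fill_unmeasured(ph_readings, normal_value=7.40))
```

A consequence of this convention is that patients with fewer measurements tend to receive less severe scores, which is worth keeping in mind when comparing scores across centers with different testing habits.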

Outcomes
The primary outcome of this study was in-hospital mortality (defined as all-cause death within 3 months after admission). Patients who survived or were transferred to general wards were defined as survivors (coded as 0). Patients discharged against medical advice were excluded from the analysis because their outcomes were unknown and including them would bias the association analysis.
Patients were routinely transferred to the general ward when the following criteria were met: effective control of the primary disease; no mechanical ventilation, blood purification, or vasoactive drugs for more than 48 h; or stable vital signs. Transfer to the general ward was confirmed by the PICU attending doctor or senior doctors.
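The outcome definition in this section can be sketched as a small coding function. The text states only that survival was coded as 0; coding death as 1 and the status labels used here are assumptions made for illustration.

```python
from typing import List, Optional


def code_outcome(status: str) -> Optional[int]:
    """Map a discharge status to the analysis outcome.

    Returns 1 for death (assumed coding), 0 for survival or transfer to a
    general ward, and None for discharge against medical advice, which is
    excluded because the outcome is unknown.
    """
    if status == "died":
        return 1
    if status in ("survived", "transferred_to_ward"):
        return 0
    if status == "discharged_against_advice":
        return None
    raise ValueError(f"unknown status: {status!r}")


def analysis_outcomes(statuses: List[str]) -> List[int]:
    """Keep only admissions with a known outcome."""
    coded = [code_outcome(s) for s in statuses]
    return [c for c in coded if c is not None]
```

For example, `analysis_outcomes(["died", "transferred_to_ward", "discharged_against_advice"])` keeps two of the three admissions.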

Quality Control
A database was developed in Microsoft Access based on the uniform case report form, with automatic data checks. Two investigators at each site were trained to collect, check, and enter the records. The coordination center monitored the database for quality control, and the data manager resolved queries with the sites by phone or networking applications. Each center recorded recruitment and reported the number of patients discharged against medical advice.

Statistical Analysis
Characteristics were summarized for all patients. Categorical variables were summarized as counts and percentages, and numerical variables as medians and interquartile ranges (IQR).
The three scores were calculated for patients who were not discharged against medical advice, and their ability to discriminate mortality was presented with ROC curves. Pairwise comparisons were used to test differences between the AUCs of the three scores. For sensitivity analysis, Cox regression models were fitted, and C-indices were calculated and compared among the three scores. Bonferroni adjustment was applied for multiple tests.
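As a reminder of what the AUC measures, it equals the probability that a randomly chosen non-survivor has a higher score than a randomly chosen survivor, with ties counted as one half. The sketch below computes it directly from that definition in plain Python; it is not the Stata procedure used in the study, and the assumption of three pairwise tests for the Bonferroni adjustment is ours.

```python
from typing import Sequence


def auc(scores: Sequence[float], died: Sequence[int]) -> float:
    """AUC as the proportion of (non-survivor, survivor) pairs ranked correctly."""
    non_survivors = [s for s, d in zip(scores, died) if d == 1]
    survivors = [s for s, d in zip(scores, died) if d == 0]
    if not non_survivors or not survivors:
        raise ValueError("both outcome classes are required")
    correct = 0.0
    for s_dead in non_survivors:
        for s_alive in survivors:
            if s_dead > s_alive:
                correct += 1.0
            elif s_dead == s_alive:
                correct += 0.5  # ties count as half
    return correct / (len(non_survivors) * len(survivors))


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni adjustment."""
    return alpha / n_tests


# Toy data, not study data.
toy_scores = [1, 2, 3, 4, 5, 6]
toy_died = [0, 0, 1, 0, 1, 1]
print(auc(toy_scores, toy_died))      # 8 of 9 pairs ranked correctly
print(bonferroni_alpha(0.05, 3))      # assumed: three pairwise AUC comparisons
```

For a score on which lower values indicate greater severity, the score would be negated before calling `auc` so that higher values correspond to higher risk.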
The Hosmer–Lemeshow goodness-of-fit test was applied to examine the extent to which observed and predicted risks of death agreed within quintiles of death risk. The χ2 statistic was calculated as a summary indicator of calibration. A higher χ2 means greater discrepancy between observed and predicted risks of death.
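The calibration statistic can be illustrated with a short sketch. It groups patients into equal-sized risk groups (five by default, matching the quintiles mentioned above) and sums the squared difference between observed and expected deaths, scaled by the binomial variance in each group. The grouping details and function name are assumptions; the p-value, which would come from a chi-square distribution, is omitted to keep the sketch free of dependencies.

```python
from typing import Sequence


def hosmer_lemeshow_chi2(predicted: Sequence[float], died: Sequence[int],
                         n_groups: int = 5) -> float:
    """Hosmer-Lemeshow chi-square over equal-sized groups of predicted risk."""
    pairs = sorted(zip(predicted, died))  # order patients by predicted risk
    n = len(pairs)
    chi2 = 0.0
    for g in range(n_groups):
        group = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        n_g = len(group)
        if n_g == 0:
            continue
        expected = sum(p for p, _ in group)   # expected deaths in the group
        observed = sum(d for _, d in group)   # observed deaths in the group
        mean_risk = expected / n_g
        variance = n_g * mean_risk * (1.0 - mean_risk)
        if variance == 0.0:
            continue  # group with predicted risk of exactly 0 or 1
        chi2 += (observed - expected) ** 2 / variance
    return chi2


# Toy example: four patients each predicted at 0.5, all of whom died.
print(hosmer_lemeshow_chi2([0.5] * 4, [1, 1, 1, 1], n_groups=1))
```

In the toy call, 2 deaths are expected and 4 observed, so the statistic is (4 - 2)^2 / (4 × 0.5 × 0.5) = 4.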
To identify significant predictors of death, a Cox regression model was constructed. Items from the three scoring systems were incorporated sequentially, starting from the univariate model with the highest C-index.
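The sequential procedure can be pictured as greedy forward selection guided by the C-index. The sketch below is an illustration under several assumptions: `harrell_c_index` is a plain-Python version of Harrell's concordance index, `c_index_of` stands in for fitting a Cox model on a set of items and returning its C-index, and the stopping threshold `min_gain` is hypothetical because the text does not state a stopping rule.

```python
from typing import Callable, List, Sequence


def harrell_c_index(times: Sequence[float], events: Sequence[int],
                    risk: Sequence[float]) -> float:
    """Share of comparable pairs in which the earlier death has the higher risk."""
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        for j in range(len(times)):
            # A pair is comparable when patient i died before patient j's time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def forward_select(candidates: Sequence[str],
                   c_index_of: Callable[[List[str]], float],
                   min_gain: float = 0.005) -> List[str]:
    """Add, one at a time, the item that raises the C-index most."""
    selected: List[str] = []
    remaining = list(candidates)
    best_c = 0.5  # C-index of a model with no discriminating information
    while remaining:
        trial_c, trial_item = max(
            (c_index_of(selected + [item]), item) for item in remaining
        )
        if trial_c - best_c < min_gain:
            break  # hypothetical stopping rule
        selected.append(trial_item)
        remaining.remove(trial_item)
        best_c = trial_c
    return selected
```

The first pass of `forward_select` picks the single item with the highest C-index, which mirrors starting from the best univariate model.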
All statistical analyses were conducted in Stata 15.0 (StataCorp, College Station, TX), and the significance threshold α for statistical tests was set at 0.05.

RESULTS
A total of 4,770 eligible patients out of 4,983 admissions were recruited in the eight PICU centers during the study period (Figure 1). A total of 226 patients died in hospital, giving an overall mortality of 4.7%. A total of 3,731 patients improved and were transferred to the general ward, and 813 were discharged against medical advice. Of the 4,770 patients enrolled, the 3,957 who were not discharged against medical advice were included in the analysis.
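The patient flow above can be checked with simple arithmetic. The sketch below uses only the counts given in this paragraph; the mortality among analyzed patients is derived here for illustration and is not a figure reported in the text.

```python
eligible = 4770
died = 226
transferred_to_ward = 3731
discharged_against_advice = 813

# The three outcome groups should account for every eligible patient.
assert died + transferred_to_ward + discharged_against_advice == eligible

analyzed = eligible - discharged_against_advice
overall_mortality_pct = round(100 * died / eligible, 1)
mortality_among_analyzed_pct = round(100 * died / analyzed, 1)

print(analyzed)                      # patients included in the analysis
print(overall_mortality_pct)         # deaths as a share of all eligible patients
print(mortality_among_analyzed_pct)  # deaths as a share of analyzed patients (derived)
```

The reported 4.7% uses all eligible patients as the denominator; excluding those discharged against medical advice gives a somewhat higher proportion, because those patients' outcomes are unknown.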
As shown in Table 1, 2,023 (42.4%) patients were younger than 1 year old, and the mortality rate was similar among age groups (chi-square test p = 0.82). Boys accounted for 2,893 (60.6%) patients. The median length of PICU stay was 6 days. The contribution of each PICU to the total sample ranged from 6.7 to 18.0%. A total of 2,698 (56.6%) patients were diagnosed with infection at admission, among whom 116 died. Non-survivors had worse risk scores (lower PCIS and higher PRISM IV and PELOD-2).
Outcomes by hospital are shown in Table 4. Mortality rates varied among hospitals: seven ranged from 2.2% (hospital F) to 5.3% (hospital G), whereas hospital C had by far the highest death rate at 16.7%, which was significantly higher than that of hospital A.

DISCUSSION
This study compared the performance of three scores (PCIS, PRISM IV, and PELOD-2) in predicting the risk of death in Chinese PICU patients based on a large sample. We found that the three scores performed comparably but less satisfactorily than reported in previous studies of American and European patients (15)(16)(17)(18). Five items from the scores were identified as better predictors of mortality in Chinese patients, namely, platelet, invasive ventilation, pupillary light reflex, pH, and PaO2.
This study was the first multicenter epidemiological survey of PICU deaths in China, including eight PICUs from the Yangtze River Delta region, which represent the highest level of PICUs in the country. The overall mortality rate was 4.7%, which was markedly lower than the mortality in the early days of PICUs (12.8%) (19) but still about twice that of US/European PICUs, which have a mortality rate of about 2.5% (2,3). The causes of the higher mortality in Chinese PICUs are multifactorial. Firstly, the patients in this study came from eight centers that routinely receive the most seriously ill patients referred from local hospitals. Secondly, the characteristics of PICU patients in our study differed from those in studies from western countries. For example, severe pneumonia still ranked as the leading cause of death, and one-third of the patients who died of severe pneumonia had underlying diseases, including congenital heart disease, primary immunodeficiency disease, neuromuscular disease, and congenital genetic metabolic diseases, which made treatment more difficult. In addition, the common use of broad-spectrum antibiotics to achieve timely control of bacterial infections may increase the risk of pan-resistant bacteria, superbugs, and superinfections, which are important causes of mortality. Thirdly, the higher mortality in the PICUs included in our study was largely attributable to the less developed level and quality of medical care in China. Compared with developed countries, a large gap in the quality of medical care remains, and great effort is needed to close it.
PRISM and PELOD scores are widely used internationally and showed excellent performance in the settings where they were developed, with AUCs close to 1 (5,6,8). However, their performance was less ideal in the current study sample. One possible explanation is the different characteristics of the study population. For example, more than half of the PICU admissions in our study were due to respiratory infection, a much higher proportion than in western countries (3,20). Additionally, mortality risk varies among countries and hospitals, which is likely attributable to differences in the quality of PICU care and clinical settings (21). The AUCs in this study were similar to those of a study in Pakistan (22) (AUC 0.78 for PRISM and 0.77 for PELOD), which evaluated earlier versions of the scores in a single center with a small sample.
We found that PCIS performed worse than PRISM IV and PELOD-2. Unlike the evidence-based PRISM IV and PELOD-2, PCIS was established based on experts' experience and lacked scientific evidence. It is older, and its predictors have not been updated to keep pace with changes in clinical monitoring indicators. To further explore better predictors of mortality in the most recent patient cohort, this study pooled all the variables of the three scores and analyzed each of them. Five predictors were significantly associated with higher death risk, only two of which are included in the PCIS score. The remaining three items, invasive ventilation, platelet, and pupillary light reflex, were consistent with the characteristics of the patient cohort, in which severe respiratory infections, neuromuscular disease, accidents, and hematological tumors were the most common diagnoses on admission. The five predictors identified in the current study improved on the performance of the scores; however, they need to be verified in additional independent study samples. More attention should be paid to updating PCIS.
This study has the following strengths in methodology. Firstly, this was a multicenter study with the largest sample size of PICU patients in China, making the findings more generalizable. Secondly, this study was prospectively designed and carried out; possible predictors were deliberately collected with quality control procedures, making the data more reliable.

LIMITATIONS
A total of 813 eligible patients were discharged against medical advice, and their outcomes could not be observed. The reasons these patients left hospital may include poor prognosis, low quality of life, and religious belief. This poses a potential bias in this study. Among all participating units, the PICU death rate in Hospital C was higher than in the others, mainly because Hospital C is the largest transport center and receives very critically ill patients transferred from other hospitals. This may also introduce some bias.

CONCLUSION
The three scores performed comparably in predicting PICU mortality, but less well than in previous reports. Five items from the three scores were identified as better predictors of mortality in Chinese PICU patients. Our findings provide important evidence for developing or updating a mortality prediction score for Chinese PICU patients.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Institutional Review Board of the Children's Hospital of Fudan University. Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin.

AUTHOR CONTRIBUTIONS
GL conceptualized and designed the study and reviewed and revised the manuscript. ZZ coordinated and supervised data collection, drafted the initial manuscript, and reviewed and revised the manuscript. WY and XH carried out the analyses, interpreted the results, drafted the initial manuscript, and reviewed and revised the manuscript. YW, YL, HoM, CZ, GP, YZ, XZ, WC, JL, DS, YB, ZC, BJ, HuM, XK, YChen, YCheng, and GY contributed to data collection and reviewed and revised the manuscript. All authors contributed to manuscript revision and read and approved the submitted version.