Clinical Profile, Prognostic Factors, and Outcome Prediction in Hospitalized Patients With Bloodstream Infection: Results From a 10-Year Prospective Multicenter Study

Background: Bloodstream infection (BSI) is one of the most common serious bacterial infections worldwide and a major contributor to in-hospital mortality. Determining the predictors of mortality is crucial for prevention and for improving clinical prognosis in patients with nosocomial BSI. Methods: A nationwide prospective cohort study was conducted from 2007 to 2016 in 16 teaching hospitals across China. Microbiological results, clinical information, and patient outcomes were collected to investigate the pathogenic spectrum and mortality rate in patients with BSI and to identify outcome predictors using multivariate regression, a prediction model, and Kaplan-Meier analysis. Results: No significant change was observed in the causative pathogen distribution during the 10-year period, and the overall in-hospital mortality was 12.83% (480/3,741). An increasing trend was found in the mortality of patients infected with Pseudomonas aeruginosa or Acinetobacter baumannii, while a decreasing mortality rate was noted in Staphylococcus aureus-related BSI. In multivariable-adjusted models, a higher mortality rate was significantly associated with older age, cancer, sepsis diagnosis, ICU admission, and prolonged hospital stay prior to BSI onset. These factors were also identified by a machine learning-based predictive model built with the random forest algorithm, which achieved satisfactory performance in outcome prediction. Conclusions: Our study described the clinical and microbiological characteristics and the predictors of mortality in patients with BSI. These predictors could inform clinical practice and support effective therapeutic strategies to improve patient outcomes.


INTRODUCTION
Despite great advances in medical diagnosis and therapy over the past decades, bloodstream infection (BSI) remains a major cause of infectious disease morbidity and mortality in low-, middle-, and high-income countries alike (1,2). Several studies have reported that BSI is the seventh most common cause of death and the leading cause of infection-related death (1,2). It is estimated that at least 23 per 100,000 people die each year shortly after an episode of BSI (2). Immunocompromise, chemotherapy, intravascular catheters, and high antibiotic consumption render hospitalized patients highly vulnerable to bacterial colonization, local infection, and even systemic infection (3). A previous study showed that the causative bacterial species had a significant impact on the prognosis of bacteremia, but the pathogenic spectrum responsible for BSI varied substantially over time and by region (4). Moreover, BSI can lead to sepsis, an extreme systemic response to infection, which is associated with increased mortality, longer hospital stay, and additional medical costs (5). Previous work has demonstrated that rapid assessment and intervention are crucial for the prognosis of BSI patients, especially in the emergency department and ICU, because timely and effective treatment can significantly reduce the incidence of BSI-associated deaths (6,7).
Accurate identification of predictors associated with mortality in patients with BSI is critical to informing clinical interventions and improving clinical outcomes. Although some prognostic factors have been identified as potential predictors of BSI mortality, most previous reports focused on a single group of people, such as children or the elderly, on specific clinical conditions, including cancer and trauma, or on multidrug-resistant causative organisms (8-11). In addition, multiple machine learning approaches have been developed and are increasingly used to predict unfavorable outcomes and to identify predictors of mortality for different types of disease, with better performance than classical multivariate regression analysis (12). Previous studies have explored the use of the random forest model, one such machine learning approach, to predict multidrug-resistant bacterial infection and sepsis-related mortality in the emergency department and to identify significant clinical outcome predictors based on the permutation importance of different variables (13,14). Until now, however, very few studies have examined the feasibility of machine learning for predicting all-cause mortality in hospitalized patients with BSI.
In the present study, we sought to describe the trends in the incidence of key bloodstream pathogens and in BSI-associated mortality over the period 2007-2016, using data collected by a national prospective surveillance program. Independent factors for all-cause mortality in hospitalized patients with BSI were also assessed. These results might facilitate physicians' decision-making concerning rational treatment for high-risk individuals with bacteremia and help optimize clinical resources.

Study Design
This study was an investigative and predictive analysis based on BSI patients' clinical data from the CARES study (Chinese Antimicrobial Resistance Surveillance of Nosocomial Infections), a nationwide, longitudinal, prospective study encompassing 16 tertiary-care teaching hospitals in 10 provinces in China between 2007 and 2016 (15-17). Each hospital has at least 1,200 beds, an infectious disease department, and an infection control committee with specialist doctors, nurses, and microbiological laboratory personnel. A positive blood culture is treated as a critical value, and clinicians are informed of it immediately by telephone. Our aim in this study was to analyze the pathogenic spectrum of bacteremia and to elucidate the independent predictors of all-cause mortality at 28 days among hospitalized patients with BSI. Cases were eligible for this study if they had a positive blood culture for gram-negative or gram-positive bacteria and sufficient documentation in the electronic health records to assess therapy and outcomes within 28 days of the positive blood culture. This study was approved by the Research Ethics Board at Peking University People's Hospital, which waived the need for informed consent because of the observational nature of the study.

Clinical Data Collection
All cases considered in this study were hospitalized patients aged ≥18 years who had at least one documented isolate from a positive blood culture during their hospitalization. To identify the clinical predictors of BSI mortality, each patient was included only once, at the time of the first bacterial isolation from blood culture during the study period. Demographic data, antibiotic administration records, laboratory and microbiological results, and clinical information, including potential predictors of mortality, were extracted from electronic health records (EHRs) by trained reviewers. The primary outcome variable was in-hospital mortality within the first 28 days after the positive blood cultures were drawn. Patients with missing observations or treated on an outpatient basis were excluded retrospectively. All bloodstream isolates were transferred to a reference laboratory (Peking University People's Hospital) and identified by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. Cases with coagulase-negative Staphylococcus isolated from a single blood culture without any clinical evidence of infection were also excluded.

Prognostic Factor Analysis
Cox multivariable regression analysis was performed to identify independent predictors for BSI 28-day mortality. We conducted univariate logistic regression analysis for each candidate variable using Pearson's chi-square test or Fisher's exact test, with a P < 0.10 being the criterion for further analysis in the backward, conditional stepwise multivariable regression model. The Hosmer-Lemeshow goodness-of-fit test was used to assess model fit. Hazard ratios, 95% confidence intervals, and associated P-values were also reported.
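The univariate screening step can be illustrated with a minimal sketch. It assumes a 2x2 table per binary candidate variable and uses Pearson's chi-square test without continuity correction; Fisher's exact test and the multivariable step are not shown. The function names and the counts are hypothetical and are not taken from the study.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    Table layout:            died   survived
        factor present         a       b
        factor absent          c       d
    Returns (chi-square statistic, two-sided P-value with 1 degree of freedom).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

def screen_candidates(tables, threshold=0.10):
    """Keep the variables whose univariate P-value is below the threshold."""
    kept = []
    for name, counts in tables.items():
        _, p = chi_square_2x2(*counts)
        if p < threshold:
            kept.append(name)
    return kept

# Hypothetical counts for illustration only; these are NOT the study's data.
example_tables = {
    "factor_A": (60, 140, 40, 360),   # 30% vs 10% mortality
    "factor_B": (25, 175, 50, 350),   # 12.5% vs 12.5% mortality
}
selected = screen_candidates(example_tables)
print(selected)
```

In this toy input only `factor_A` passes the P < 0.10 criterion and would be carried forward to the multivariable model.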

Statistical Analysis
For univariate analysis, continuous variables were expressed as means ± standard deviations (SD) and compared using the t-test or the Mann-Whitney U test, as appropriate. Categorical variables were expressed as absolute numbers and relative frequencies and compared using Pearson's chi-square test or Fisher's exact test. The multivariate regression analyses were performed to identify independent predictors using IBM SPSS software (version 24.0) for Windows. The chi-square test for trend in proportions was performed to determine significant variations in etiology and mortality during the study period. All reported P-values are two-sided, and statistical significance was set at P < 0.05. In addition, all cases and potential factors were used to develop the random forest model and to extract strong predictors of BSI-related mortality using the R package randomForest. We also quantified the discriminative performance using the area under the ROC curve (AUC), sensitivity, specificity, positive predictive value, and negative predictive value. For variables significantly associated with mortality in both the multivariate analysis and the random forest model, a Kaplan-Meier curve was plotted to show the survival probabilities at 28 days.

FIGURE 2 | All-cause mortality among patients with nosocomial BSI by 2-year period (A) and 10-year trend in mortality related to different causative bacteria (B). The S. aureus-related mortality rate dropped significantly during the 10-year period (P < 0.05). Upward trends in A. baumannii- and P. aeruginosa-related mortality were noted but were not statistically significant as assessed by the chi-square test (P > 0.05). E. coli- and K. pneumoniae-associated mortality remained stable and relatively low (P > 0.05).

Demographic and Clinical Characteristics
To identify predictors of BSI-associated mortality, a total of 3,741 hospitalized patients who fulfilled our inclusion criteria were included in the final analysis (Figure 1). The overall 28-day mortality rate was 12.83% (480/3,741) during the 10-year study period and did not vary significantly between years (Figure 2A). However, BSI-associated mortality varied somewhat over time by causative pathogen (Figure 2B). Demographic and clinical characteristics of bacteremia patients, as well as the results of the univariate comparison between patients who survived and those who died, are shown in Table 2. The mean age of all BSI cases was 56 years (SD = 17.33, range = 18-99), and patients were predominantly male (60.09%). The most common underlying condition was malignancy (28.82%), and the source of the BSI was primary (unknown origin) in 57.04% of the cases. Mortality varied according to comorbidities, type of catheter, and clinical therapy. The highest mortality was observed in patients presenting with sepsis (25.74%) and in those admitted to the ICU (24.25%). Multiple statistically significant predictors (P < 0.05) were identified in the univariate analysis. Compared with patients who survived, patients with BSI who died were more likely to be >65 years of age, to have a hospital stay of >14 days prior to BSI, to present with sepsis, to have an intermittent temperature <35 or >40 °C, to be admitted to the ICU, and to have received inappropriate empirical treatment.
We also identified the potential predictors with the highest permutation importance using the random forest algorithm. ICU admission [variable importance (VI), 53.89], presentation with sepsis (VI, 21.66), age >65 years (VI, 11.58), inappropriate empirical treatment (VI, 9.29), temperature <35 or >40 °C (VI, 8.92), pre-infection length of stay >14 days (VI, 8.32), malignancy (VI, 5.62), cardiovascular disease (VI, 4.42), surgery within the past 14 days (VI, 4.28), and central line-associated BSI (VI, 3.31) were the top 10 predictors in the random forest model. These results were consistent with the findings of the multivariate analysis.
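The idea behind permutation importance can be shown with a minimal, model-agnostic sketch: shuffle one variable at a time and measure how much predictive accuracy drops. This is a simplified illustration and not the randomForest package's own procedure, which permutes variables within the out-of-bag samples of each tree. The `toy_model` function, the variable names, and the data are hypothetical.

```python
import random

def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    hits = sum(1 for row, y in zip(rows, labels) if predict(row) == y)
    return hits / len(rows)

def permutation_importance(predict, rows, labels, n_repeats=20, seed=0):
    """Mean drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = {}
    for name in rows[0]:
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[name] for row in rows]
            rng.shuffle(shuffled)
            # Copy each row with only the current feature replaced.
            permuted = [dict(row, **{name: v}) for row, v in zip(rows, shuffled)]
            drops.append(baseline - accuracy(predict, permuted, labels))
        importances[name] = sum(drops) / n_repeats
    return importances

# Hypothetical stand-in for a fitted model: predicts death (1) from ICU admission only.
def toy_model(row):
    return 1 if row["icu"] else 0

# Synthetic rows for illustration only; these are NOT the study's data.
data_rng = random.Random(42)
rows = [{"icu": data_rng.random() < 0.3, "noise": data_rng.random() < 0.5}
        for _ in range(500)]
labels = [1 if row["icu"] else 0 for row in rows]

scores = permutation_importance(toy_model, rows, labels)
print(scores)
```

Because the toy model ignores `noise`, shuffling that column leaves accuracy unchanged and its importance is zero, whereas shuffling `icu` degrades accuracy substantially.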
In addition, we evaluated the model's ability to discriminate outcomes based on all clinical data collected. The sensitivity (0.81), specificity (0.74), negative predictive value (0.95), and AUC (0.856) indicated moderate to high predictive performance, whereas the positive predictive value was only 0.32 (Figure 3).
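The low positive predictive value is largely what Bayes' rule implies when the outcome is uncommon. The sketch below recomputes both predictive values from the reported sensitivity and specificity, assuming (as an approximation not stated in the text) that the evaluation data had the same mortality as the whole cohort, 480/3,741. The function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) implied by Bayes' rule for a given outcome prevalence."""
    tp = sensitivity * prevalence              # true positives per patient
    fn = (1 - sensitivity) * prevalence        # false negatives per patient
    tn = specificity * (1 - prevalence)        # true negatives per patient
    fp = (1 - specificity) * (1 - prevalence)  # false positives per patient
    return tp / (tp + fp), tn / (tn + fn)

# Reported sensitivity and specificity; prevalence taken as overall 28-day mortality.
ppv, npv = predictive_values(0.81, 0.74, 480 / 3741)
print(round(ppv, 2), round(npv, 2))
```

With these rounded inputs the implied values are about 0.31 and 0.96, close to the reported 0.32 and 0.95; the small differences are expected given rounding and the prevalence assumption.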

Survival Curve Analysis
To evaluate the trends in in-hospital mortality, the five predictors identified in both the multivariate regression analysis and the random forest predictive model were selected for survival curve analysis (Figure 4). Kaplan-Meier curves demonstrated that 28-day survival distributions differed significantly by age >65 years (P < 0.001), pre-infection length of stay >14 days (P < 0.001), ICU admission (P < 0.001), and presentation with sepsis (P < 0.001). Although BSI patients with malignancy tended to have a worse outcome, the log-rank test was not significant (P = 0.061), and the two survival curves crossed early, at around 4 days. The survival curves ran parallel during the first week and then started to diverge, with a continuously higher death rate among patients with the corresponding prognostic factor.
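For readers unfamiliar with how such curves are built, the following is a minimal sketch of the Kaplan-Meier (product-limit) estimator. The function name and the follow-up data are hypothetical and purely illustrative; the log-rank comparison between groups is not shown.

```python
def kaplan_meier(durations, events):
    """Product-limit survival estimate.

    durations: follow-up time in days for each patient
    events:    1 if the patient died at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct death time.
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Patients censored at time t still count as at risk at t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

# Hypothetical follow-up data for illustration only; these are NOT the study's data.
durations = [3, 5, 5, 8, 12, 28, 28, 28]
events = [1, 1, 0, 1, 0, 0, 0, 0]
curve = kaplan_meier(durations, events)
for day, s in curve:
    print(day, round(s, 3))
```

Patients still alive at day 28 are treated as censored, so they contribute to the number at risk throughout follow-up without lowering the estimated survival.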

DISCUSSION
Globally, the incidence of bacteremia remains high and continues to contribute to increased patient morbidity and mortality, as well as medical costs (1). In this study, a total of 4,708 BSI cases were obtained from the well-studied nationwide dataset over a 10-year period, and we found that the 28-day all-cause mortality rate among hospitalized patients with BSI in China was 12.83%, slightly higher than that reported in a recent large multicenter study in the USA (12%) (18). Demographics, comorbidity, and clinical treatment information were investigated in our study to evaluate the predictors of mortality. The multivariate analysis identified five independent predictors of BSI mortality in the dataset: older age, malignancy, prolonged hospital stay before BSI onset, presentation with sepsis, and ICU admission. These predictive factors were also identified by a machine learning model and survival curve analysis. Patients at increased risk of death after bacteremia could therefore be identified in real time according to these prognostic factors. Previous studies have reported that multiple clinical factors, including underlying medical conditions, previous antibiotic exposure, and severity of bacteremia, were independently associated with poor outcome in patients with BSI (19,20).
Patient-related factors, including older age, female sex, and recent hospitalization, were additional significant predictors of mortality (21). However, most of these studies focused on a specific subpopulation with BSI or on individuals infected with a multidrug-resistant pathogen (21), whereas the current study derived the predictive factors from a general patient population. This diverse population could increase the generalizability of the identified predictors. A broader range of clinical presentations was represented in the analysis, so these predictors might be more broadly applicable in clinical practice.
Previous studies have shown that machine learning techniques are capable of harnessing a large number of clinical variables and the interactions between them and, ultimately, of predicting clinical outcomes of interest with satisfactory accuracy in real time (12). In the field of infectious disease prediction, machine learning models have mostly been limited to predicting infection with multidrug-resistant organisms and sepsis in the ICU and emergency department (22-25). In this study, we used a machine learning prediction model to predict mortality among hospitalized patients with BSI, and it performed satisfactorily, with an AUC of 0.856. The prediction model also identified important predictors of BSI mortality, including ICU admission, presentation with sepsis, and inappropriate empirical treatment. These variables could aid physicians' judgment and provide clinicians with real-time prognostic information to assist in decision-making and reduce preventable BSI-related adverse events. Moreover, no parameter optimization was performed for this model, in order to simplify the application of machine learning approaches in healthcare settings. This suggests that parameter optimization could further improve predictive performance. In addition, other machine learning-based models, such as support vector machines, artificial neural networks, or deep learning, may also be constructed with this dataset and compared with the random forest model in this study.
Based on the nationwide culture-confirmed BSI cohort, we found that the overall mortality of BSI patients during the 10-year period was relatively stable, but mortality due to different causative pathogens showed different trends. P. aeruginosa and A. baumannii, two clinically important non-fermenters, were linked to increasing mortality during the study period. Conversely, the S. aureus-related mortality rate showed a gradually decreasing trend. A possible explanation for these observations is the extremely limited therapeutic options for bloodstream infections due to carbapenem-resistant P. aeruginosa and A. baumannii, which have spread increasingly in recent years, whereas vancomycin- or daptomycin-resistant S. aureus remained relatively rare in China.
In conclusion, our study determined the overall mortality rate of patients with bloodstream infection during a 10-year period and identified multiple predictors associated with poorer outcomes using multivariable-adjusted analysis and random forest predictive model. These clinically important predictive factors, including abnormal body temperature, longer hospital stay, and presentation with sepsis, could aid clinicians in identifying patients at high risk of death and lead to timely medical interventions to improve patient outcomes.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author/s.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Research Ethics Board at Peking University People's Hospital. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

AUTHOR CONTRIBUTIONS
LJ and HW conceived and designed the study. CZ, HL, RW, and QW led the data collection and analysis. LJ wrote the manuscript. HW critically reviewed and edited the manuscript. All authors have read, commented, and approved the final version of the article.

FUNDING
This study (CARES Network) was supported by research funding from Pfizer Inc. The funder had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.