Prognostic Value of Multiple Circulating Biomarkers for 2-Year Death in Acute Heart Failure With Preserved Ejection Fraction

Background: Heart failure with preserved ejection fraction (HFpEF) is increasingly recognized as a major global public health burden and lacks effective risk stratification. We aimed to assess whether a multi-biomarker model improves risk prediction in HFpEF. Methods: We analyzed 18 biomarkers from the main pathophysiological domains of HF in 380 patients hospitalized for HFpEF from a prospective cohort. The association between these biomarkers and 2-year risk of all-cause death was assessed with Cox proportional hazards models. A support vector machine (SVM), a supervised machine learning method, was used to develop prediction models of 2-year all-cause and cardiovascular death using a combination of 18 biomarkers and clinical indicators. The improvement offered by this model was evaluated by c-statistics, net reclassification improvement (NRI), and integrated discrimination improvement (IDI). Results: The median age of patients was 71 years, and 50.5% were female. Multiple biomarkers independently predicted the 2-year risk of death in the Cox regression models, including N-terminal pro-B-type natriuretic peptide (NT-proBNP), high-sensitivity cardiac troponin T (hs-TnT), growth differentiation factor-15 (GDF-15), tumor necrosis factor-α (TNFα), endoglin, and 3 biomarkers of extracellular matrix turnover [tissue inhibitor of metalloproteinases (TIMP)-1, matrix metalloproteinase (MMP)-2, and MMP-9] (FDR < 0.05). The SVM model effectively predicted the 2-year risk of all-cause death in patients with acute HFpEF in the training set (AUC 0.834, 95% CI: 0.771–0.895) and the validation set (AUC 0.798, 95% CI: 0.719–0.877). The NRI and IDI indicated that the SVM model significantly improved patient classification compared to the reference model in both sets (p < 0.05). Conclusions: Multiple circulating biomarkers coupled with an appropriate machine-learning method could effectively predict the risk of long-term mortality in patients with acute HFpEF. This is a promising strategy for improving risk stratification in HFpEF.


INTRODUCTION
Heart failure (HF) is a leading cardiovascular disorder with high morbidity and mortality (1). Based on left ventricular ejection fraction (LVEF), HF is categorized into HF with reduced ejection fraction (HFrEF, LVEF <40%), HF with preserved ejection fraction (HFpEF, LVEF ≥50%), and HF with a mid-range ejection fraction of 40 to 50% (2,3). HFpEF accounts for nearly half of HF patients worldwide and is increasingly recognized as a major challenge for clinical practice because effective management strategies and pharmacological interventions are lacking (2)(3)(4). Therefore, accurate risk stratification is critical for tailoring treatment and long-term management strategies for individual patients.
The underlying pathophysiology is currently considered to differ between HFrEF and HFpEF (5,6). HFrEF manifests as eccentric remodeling accompanied by chamber dilatation and often volume overload, leading to forward failure, typically as a consequence of myocardial infarction. HFpEF is characterized by concentric remodeling and/or ventricular hypertrophy with impaired ventricular relaxation and/or filling, resulting in increased filling pressure and usually leading to backward failure. Recent evidence suggests that the interplay of cardiovascular and non-cardiovascular comorbidities [e.g., obesity (7), hypertension (8), diabetes (9), coronary artery disease (10), and chronic kidney disease (11)] induces an inflammatory state, leading to myocardial structural and functional alterations in patients with HF. The guidelines of the European Society of Cardiology (ESC) (2) and the American Heart Association (AHA) (3) suggest that incorporating biomarkers, including brain-type natriuretic peptide (BNP), N-terminal pro-BNP (NT-proBNP), and cardiac troponin, alongside clinical and imaging tools can help establish the diagnosis and assess disease severity in heart failure. Other biomarkers, such as soluble suppression of tumorigenicity 2 (sST2), galectin-3, and growth differentiation factor-15 (GDF-15), could be beneficial in guiding HF therapy. However, most of the clinical biomarker data have been derived from studies in undifferentiated HF or HFrEF, while valuable prognostic biomarkers in patients with HFpEF are still very limited.
Emerging studies focusing on HFpEF have reported that strategies based on multiple biomarkers and supervised or unsupervised machine learning models could improve risk stratification and prognostic prediction in HFpEF patients (12)(13)(14)(15); however, most of them focused on traditional biomarkers, and more accurate risk stratification strategies are still needed.
In this study, we examined 18 biomarkers that cover the main pathophysiological domains of HF, have been reported to be associated with heart failure prognosis, and can be accurately quantified in more than 95% of samples. In addition, high-sensitivity reagents for testing these biomarkers are currently available in the Chinese market. Our objectives were to assess the prognostic value of the candidate biomarkers from HF pathophysiologic pathways for 2-year all-cause mortality in patients with acute HFpEF, and to establish multi-biomarker risk prediction models based on machine learning for 2-year all-cause death and cardiovascular (CV) death in patients with acute HFpEF.

Study Design and Patients
The current analysis included patients enrolled in the China Patient-centered Evaluative Assessment of Cardiac Events Prospective Heart Failure Study (China PEACE 5p-HF Study) between August 1, 2016 and July 31, 2017, with LVEF >50% on echocardiography performed according to the standard procedure. The design of the China PEACE 5p-HF Study has been described previously (16). In brief, it is a large multi-center prospective study that consecutively recruited patients hospitalized for HF between August 2016 and May 2018 from 52 hospitals (48 tertiary and 4 secondary hospitals) across China. One of the specific aims of the prospective cohort study was to identify the predictors of adverse outcomes. Patients were eligible if they were ≥18 years of age, local residents, and hospitalized with a primary diagnosis of new-onset HF or decompensation of chronic HF. Enrolled patients were interviewed during the index hospitalization and followed up at 1, 6, and 12 months after discharge, and annually thereafter.
The central ethics committee at Fuwai Hospital and local internal ethics committees at study hospitals approved the China PEACE prospective HF study. All participants provided written informed consent. The study was registered on clinicaltrials.gov (NCT02878811).

Data Collection
Medical history, clinical characteristics on admission, and treatments (during index hospitalization and at discharge) were centrally abstracted from medical records, with a 2-level quality control approach. In-person interviews with a standardized questionnaire were conducted during index hospitalization and follow-up to collect additional patient characteristics and outcomes. Data were entered directly into laptop computers equipped with a customized electronic data collection system, allowing real-time monitoring to verify the accuracy and completeness of entered data.

Clinical Variables
Coronary heart disease (CHD), myocardial infarction (MI), valvular heart disease (VHD), atrial fibrillation, hypertension, chronic obstructive pulmonary disease (COPD), and ischemic stroke during admission were defined according to the diagnosis in medical records. Diabetes mellitus was defined according to the diagnosis in medical records or positive laboratory test results (HbA1c ≥6.5%). Reduced renal function was defined as an estimated glomerular filtration rate (eGFR) <60 mL/min/1.73 m². The Acute Study of Clinical Effectiveness of Nesiritide in Decompensated Heart Failure (ASCEND-HF) outcome model was used as a reference model for predicting long-term mortality risk in patients with acute decompensated HF. The ASCEND-HF outcome model is a simplified prediction model that includes 5 commonly available clinical variables (age, dyspnea, blood urea nitrogen, sodium, and systolic blood pressure) and has a relatively good prognostic value for mortality within 30 and 180 days (17).

Clinical Outcome
The outcomes of this study were all-cause death and CV death within 2 years after hospitalization. CV death included sudden cardiac death, death due to HF, and other CV deaths (cerebrovascular events, acute myocardial infarction, aortic vascular disease, peripheral arterial disease, and pulmonary heart disease). We ascertained outcome events with the approach employed in international multi-center clinical trials (18). Local site staff sought information on pre-specified clinical events during follow-up interviews. If in-person follow-up visits were not feasible, information was gathered through telephone interviews with patients, their relatives, or physicians. We also collected information on death from the national cause-of-death database. Outcome events were centrally adjudicated by trained clinicians according to standard criteria.

Statistical Analysis
Continuous variables were summarized as median [interquartile range (IQR)] and categorical variables as frequency (percentage). Non-parametric tests (Mann-Whitney U) and chi-square tests were used to compare patients' baseline characteristics grouped by 2-year survival status.
We first determined the high-risk threshold for each biomarker to divide patients into high- and low-risk groups by using maximally selected rank statistics from the "maxstat" R package (http://cran.r-project.org/web/packages/maxstat/index.html), an outcome-oriented method that provides the cutpoint with the most significant relation to the outcome. We plotted Kaplan-Meier curves to compare 2-year all-cause death between the high- and low-risk groups of each dichotomized biomarker. We used three Cox proportional hazards regression models to evaluate the relationship between individual biomarkers as binary variables and the 2-year risk of all-cause death (model 1: unadjusted model; model 2: adjusting for ASCEND-HF score and history of HF; and model 3: adjusting for ASCEND-HF score, history of HF, and NT-proBNP level). A false discovery rate (FDR) < 0.05 was used to identify the significant biomarkers.
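The cutpoint search and the FDR step described above can be illustrated with a minimal Python sketch. It is not the authors' R code: the function names (`logrank_chi2`, `best_cutpoint`, `bh_adjust`) are hypothetical, the scan uses a plain two-group log-rank statistic and omits the p-value adjustment that maxstat applies for the cutpoint search itself, and the Benjamini-Hochberg procedure is an assumption, since the source does not name the FDR method used.

```python
from typing import List, Sequence, Tuple


def logrank_chi2(times: Sequence[float], events: Sequence[int],
                 high: Sequence[bool]) -> float:
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    observed = expected = variance = 0.0
    # Loop over the distinct times at which at least one death occurred.
    for t in sorted({ti for ti, ei in zip(times, events) if ei}):
        at_risk = [i for i, ti in enumerate(times) if ti >= t]
        n = len(at_risk)
        n_high = sum(1 for i in at_risk if high[i])
        dead = [i for i in at_risk if times[i] == t and events[i]]
        d = len(dead)
        observed += sum(1 for i in dead if high[i])
        expected += d * n_high / n
        if n > 1:  # hypergeometric variance of deaths in the high group
            variance += d * (n_high / n) * (1 - n_high / n) * (n - d) / (n - 1)
    return (observed - expected) ** 2 / variance if variance > 0 else 0.0


def best_cutpoint(values: Sequence[float], times: Sequence[float],
                  events: Sequence[int],
                  min_frac: float = 0.1) -> Tuple[float, float]:
    """Scan observed values as cutpoints; return (cutpoint, chi2) with the
    largest log-rank statistic. Values > cutpoint form the high-risk group."""
    n = len(values)
    best = (float("nan"), -1.0)
    for c in sorted(set(values))[:-1]:
        high = [v > c for v in values]
        n_high = sum(high)
        # Skip cutpoints that leave one group too small (assumed safeguard).
        if min(n_high, n - n_high) < min_frac * n:
            continue
        chi2 = logrank_chi2(times, events, high)
        if chi2 > best[1]:
            best = (c, chi2)
    return best


def bh_adjust(pvalues: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In this sketch a biomarker would be dichotomized at the value returned by `best_cutpoint`, and the per-biomarker Cox p-values would then be passed together to `bh_adjust` and compared against 0.05.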
We also developed a prediction model for the 2-year risk of all-cause death with multiple biomarkers based on a support vector machine (SVM) (model 6), a supervised machine learning approach. First, we randomly split the study samples into two groups, a training set and a validation set, in a ratio of 3:2. In the training set, with 2-year death as the outcome, we trained a model with 18 biomarkers (log-NT-proBNP, hs-TnT, hs-CRP, endoglin, sTNFRI, sTNFRII, TIMP-1, TIMP-2, MMP-2, MMP-8, MMP-9, galectin-3, MCP-1, TNFα, GDF-15, lipocalin-2, cystatin-C, sST2), history of HF, and ASCEND-HF score, using 10-fold cross-validation, classification type "C-classification," kernel "linear," and cost 1. We obtained each patient's probability of 2-year death from the SVM model, which was defined as the SVM risk score. In addition, two Cox regression models (model 4 and model 5) were developed to compare their predictive ability with that of the SVM model (model 6). Model 4 included only ASCEND-HF score and history of HF. Model 5 included ASCEND-HF score, history of HF, and the NT-proBNP level. We compared the area under the receiver operating characteristic (ROC) curve of model 6 with those of model 4 and model 5, and calculated the net reclassification improvement (NRI) and integrated discrimination improvement (IDI) with the R package survIDINRI to quantify the added predictive value of the 18 biomarkers in the training set and validation set, respectively.
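Two of the steps above, the random 3:2 split and the comparison of ROC areas, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `split_3_to_2` and `auc` are hypothetical names, the seed is arbitrary, the AUC treats 2-year death as a simple binary label and ignores censoring, and the SVM fit itself (done with e1071 in the source) is not reproduced here.

```python
import random
from typing import List, Sequence, Tuple


def split_3_to_2(ids: Sequence[int],
                 seed: int = 0) -> Tuple[List[int], List[int]]:
    """Randomly split patient ids into training and validation sets (3:2)."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = (len(shuffled) * 3) // 5     # three fifths, rounded down
    return shuffled[:n_train], shuffled[n_train:]


def auc(scores: Sequence[float], died: Sequence[int]) -> float:
    """Area under the ROC curve: the probability that a patient who died
    has a higher risk score than a survivor (ties count as one half)."""
    pos = [s for s, y in zip(scores, died) if y == 1]
    neg = [s for s, y in zip(scores, died) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

Applied to 380 patient ids, this split would yield 228 training and 152 validation patients; the source does not report the actual set sizes, so those numbers are arithmetic only. Calling `auc` once with the SVM risk score and once with a reference-model score on the same set gives the two areas being compared.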
Similarly, an SVM model for 2-year risk of CV death was developed and the value of adding 18 biomarkers to the reference model was evaluated by c-statistics, NRI, and IDI.
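To make the two reclassification measures concrete, the following Python sketch computes a category-free NRI and the IDI from the predicted risks of a reference model and a new model. It is a simplified binary-outcome version with hypothetical function names; the survIDINRI package used in the source is designed for censored survival data, so this should be read as an illustration of the definitions, not of that package's estimator.

```python
from typing import Sequence


def continuous_nri(old: Sequence[float], new: Sequence[float],
                   died: Sequence[int]) -> float:
    """Category-free NRI: net share of deaths whose predicted risk went up
    plus net share of survivors whose predicted risk went down."""
    ev = [n - o for o, n, y in zip(old, new, died) if y == 1]
    ne = [n - o for o, n, y in zip(old, new, died) if y == 0]

    def frac_up(diffs):
        return sum(1 for x in diffs if x > 0) / len(diffs)

    def frac_down(diffs):
        return sum(1 for x in diffs if x < 0) / len(diffs)

    return (frac_up(ev) - frac_down(ev)) + (frac_down(ne) - frac_up(ne))


def idi(old: Sequence[float], new: Sequence[float],
        died: Sequence[int]) -> float:
    """IDI: gain in discrimination slope, where the slope is the mean
    predicted risk in deaths minus the mean predicted risk in survivors."""
    def slope(risks):
        ev = [p for p, y in zip(risks, died) if y == 1]
        ne = [p for p, y in zip(risks, died) if y == 0]
        return sum(ev) / len(ev) - sum(ne) / len(ne)

    return slope(new) - slope(old)
```

Positive values of either measure indicate that the new model moves predicted risks in the right direction relative to the reference model; the category-free NRI ranges from -2 to 2.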
We conducted a sensitivity analysis by first dividing the study samples into a training set and a validation set in a ratio of 3:2 according to the date of index admission, and then re-developing an SVM model and two reference models for 2-year risk of all-cause death and CV death with the method described above. We also evaluated whether the prediction models improved, using c-statistics, NRI, and IDI, in both the training set and the validation set.
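A date-based split of this kind could look like the sketch below. It assumes that the earliest three fifths of admissions form the training set and the remainder the validation set; the source gives the ratio and the splitting variable but not the direction or tie-breaking, so those details and the name `temporal_split` are assumptions.

```python
from datetime import date
from typing import Dict, List, Tuple


def temporal_split(admissions: Dict[str, date]) -> Tuple[List[str], List[str]]:
    """Order patients by index admission date and split them 3:2,
    earlier admissions going to the training set."""
    # Break ties on the patient id so the split is deterministic.
    ordered = sorted(admissions, key=lambda pid: (admissions[pid], pid))
    n_train = (len(ordered) * 3) // 5
    return ordered[:n_train], ordered[n_train:]
```

Unlike the random split, this version tests whether a model developed on earlier patients still performs on those admitted later.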
All calculations were performed using software SAS 9.4 and R version 4.0.3 with packages "e1071" and "maxstat." Statistical significance was defined as a 2-tailed p < 0.05.

Baseline Characteristics
We included 380 patients hospitalized for HFpEF in this analysis, whose median age was 71 years (IQR 63 to 78) and 192 of whom were female (50.5%) (Table 1). CHD (54%), VHD (28.2%), cardiomyopathy (13.2%), atrial fibrillation (56.1%), hypertension (61.1%), diabetes mellitus (34.2%), COPD (25.8%), reduced renal function (38.7%), and ischemic stroke (20%) were common comorbidities. Two-thirds of the patients had a history of HF. Most patients were in New York Heart Association (NYHA) class III/IV (87.4%), with a median (IQR) LVEF of 59% (53.4, 65.0%). During the 2-year follow-up, 102 (26.8%) patients died, among whom 84 died from CV disease. Compared with those who survived the 2-year follow-up, patients who died were older (74 years vs. 70 years, p = 0.005), more likely to have COPD (p < 0.001), and had a higher ASCEND-HF score (p < 0.001) and a higher SVM risk score (p < 0.001) (Table 1). Table 2 shows the high-risk threshold for each biomarker and the percentage of high-risk patients by individual markers at baseline in the death and survival groups. We carried out multiple comparisons with FDR analysis. The percentages of high-risk patients in the death group were significantly higher than those in the survival group for NT-proBNP (FDR  Table 2).

Risk Prediction Model Based on Multiple Marker Panels
We developed 3 prediction models (model 4, model 5, and model 6) for all-cause death and CV death using different marker panels in the training set and validation set, respectively (Figure 2). All markers were used as categorical variables in these models.  Figure 1).

DISCUSSION
In the present study, we assessed the prognostic value of circulating levels of multiple biomarkers for 2-year risk of all-cause death and CV death in patients hospitalized for HFpEF. In Cox proportional hazards models, we found that NT-proBNP (a cardiac stretch biomarker), hs-TnT (a cardiomyocyte injury biomarker), 2 inflammation-related biomarkers (TNFα and GDF-15), endoglin (an endothelial function biomarker), and 3 biomarkers of extracellular matrix turnover (TIMP-1, MMP-2, and MMP-9) were independently associated with 2-year risk of all-cause death. We also developed prediction models of 2-year risk of all-cause death and CV death based on 18 biomarkers, history of HF, and ASCEND-HF score by machine learning, and found that the SVM model markedly improved prediction power for 2-year risk of all-cause death in both the training set and the validation set. This is a potentially effective approach to improve risk prediction in HFpEF patients and may provide insights into the possible pathogenesis underlying the progression of HFpEF.
In this study, we identified an association between the endothelial dysfunction marker endoglin and 2-year risk of all-cause death, which was independent of ASCEND-HF score, history of HF, and NT-proBNP. To the best of our knowledge, our study is the first to report the independent predictive value of this biomarker for long-term risk of death in patients with HFpEF. Endoglin (also known as CD105) is a membrane co-receptor for transforming growth factor-β (TGFβ), which is released into the circulation in a soluble form and disrupts TGFβ1 signaling in the endothelium, thereby promoting inflammation, endothelial dysfunction, cardiac fibrosis, and vascular remodeling (19). Circulating levels of soluble endoglin were reported to be elevated in patients with increased left heart filling pressures and to decrease in association with reduced cardiac filling pressure after diuresis (20). Plasma endoglin has also been reported as a predictor of cardiovascular events following percutaneous coronary intervention in patients with chronic coronary artery disease (21). The elevated level of endoglin during the acute phase initially maintains cardiac output and hemodynamics in the circulation; however, it may also reflect the severity of cardiac impairment. Cardiac function deteriorates progressively when these compensatory mechanisms eventually fail over time. This may be one reason why the biomarker can predict long-term risk of death. We also identified multiple markers of extracellular matrix turnover that were independently associated with the 2-year risk of death, including TIMP-1, MMP-2, and MMP-9; in particular, TIMP-1 showed the strongest association with the risk of death. In a cross-sectional study of 275 hypertensive patients, HFpEF was associated with an increased matrix turnover signal (MMP-2 and MMP-9). Alterations in MMP-9 and TIMP-1 were found to be significant indicators of greater degrees of asymptomatic left ventricular diastolic dysfunction (22). Similarly, Zile et al. reported a distinguishing role of a plasma multi-biomarker panel consisting of increased MMP-2, TIMP-4, and PIIINP and decreased MMP-8 in identifying patients with HFpEF vs. LV hypertrophy (23). Our results extend the literature by showing that abnormal extracellular matrix turnover, which plays a pivotal role in structural and functional alterations, is associated with long-term risk of death in HFpEF.
In our study, GDF-15, galectin-3, and sST2 were also found to predict the 2-year risk of death in patients with HFpEF. The results are consistent with previous studies, although the associations were attenuated after adjusting for ASCEND-HF score, history of HF, and NT-proBNP. GDF-15 is a member of the transforming growth factor-β cytokine superfamily, and its expression is increased upon cell injury and inflammation. Several studies reported that GDF-15 was an independent predictor of long-term death (24) and of the composite outcome of death or HF re-hospitalization in patients with HFpEF (25). Galectin-3 is a marker associated with inflammation and fibrosis. Serum levels of galectin-3 have been found to be elevated in both acute and chronic HFpEF, and they have been related to 1-year and 5-year all-cause mortality (26). sST2 is a marker associated with inflammation, myocyte hypertrophy, and fibrosis. Elevated plasma levels of sST2 have been reported to be an independent predictor of mortality and disease progression in both acute and chronic HFpEF (27,28). Our findings support the view that these biomarkers reflect disease progression and can contribute to more accurate risk stratification of HFpEF patients, especially when used in combination.
Although several biomarkers have been reported to predict the outcomes in patients with HFpEF, the predictive value of individual biomarkers is limited. Machine learning has great potential to improve predictive power by combining the information of multiple biomarkers from the main pathophysiological domains of HF. Recently, Chirinos et al. (12) evaluated the prognostic value of a supervised machine-learning-derived model that combined 49 plasma biomarkers in 379 patients with chronic HFpEF. The authors found that the model was strongly predictive of the risk of HF-related hospital admission and markedly improved the risk prediction power when combined with the MAGGIC (Meta-Analysis Global Group in Chronic Heart Failure) risk score. In addition, several studies applied unsupervised machine learning methods to identify phenotype-based subpopulations in patients with HFpEF based on clinical, laboratory, and/or cardiac ultrasound data, and assessed the differences in characteristics, outcomes, and levels of circulating biomarkers between phenogroups. Hedman et al. (13) applied model-based clustering to 32 echocardiographic and 11 clinical and laboratory variables collected in 320 HFpEF outpatients, and found that the composite end point (all-cause mortality or HF hospitalization) and 15 out of 86 plasma proteins varied significantly among 6 phenogroups. Cohen et al. (14) identified 3 HFpEF phenogroups based on 8 clinical features, and observed important differences in 10 circulating biomarkers (corrected P < 0.05), cardiac/arterial characteristics, and prognosis (composite of cardiovascular death, heart failure hospitalization, or aborted cardiac arrest) across the clinical HFpEF phenogroups. Woolley et al. (15) performed an unsupervised cluster analysis using 363 biomarkers from 429 patients with HFpEF and identified four distinct patient subgroups. The rate of death or HF hospitalization during a median follow-up of 21 months was highest in cluster 4 (62.8%) and lowest in cluster 3 (25.6%). These studies provide evidence that circulating biomarkers, combined with clinical information, can help accurately identify different phenotypes in patients with HFpEF, which may reflect different pathophysiological pathways and contribute to targeted interventions for patients.
In this study, we developed a risk prediction model combining ASCEND-HF score, history of HF, and 18 circulating biomarkers based on the SVM method. This model accurately predicted the 2-year risk of all-cause death in patients with acute HFpEF, suggesting that multi-biomarker models based on machine learning are a promising strategy for improving risk stratification in HFpEF. For the CV death prediction model, we found that the addition of 18 markers significantly improved the predictive value of the SVM model by ROC analysis, NRI, and IDI in the training set. However, in the validation set, NRI and IDI showed that the improvement of the model was not statistically significant. One possible reason is the small sample size, with fewer CV deaths in the validation set. In addition, given that heart failure can cause systemic multi-organ ischemia and dysfunction, there may also be cardiac injury in patients who died from non-cardiac causes, which may affect the expression levels of these markers and thus influence the predictive power of the model. Regarding practical application, this multi-biomarker prediction model holds promise for future clinical practice. Several analytical platforms can already quantify multiple protein biomarkers simultaneously using a very small volume of plasma. Moreover, in light of the rapid development and increasing accessibility of analytical techniques, multi-biomarker tests are likely to become affordable for most patients.

Study Strengths and Limitations
This study has several strengths. First, our data are from a prospective HFpEF cohort with clear diagnoses, comprehensive baseline data, and 2-year follow-up information. Second, we used machine learning to develop a model combining 18 biomarkers with traditional clinical indicators, which predicted the risk of death better than the models developed by traditional methods. Our study also has some limitations. First, external validation of the developed risk model in independent samples was not performed in this study; a larger, independent HFpEF cohort is needed to verify the results. Second, the patients included in this study are all Chinese, which limits the generalizability of our findings. Third, the ASCEND-HF outcome model, which has good prognostic value for 30-day and 180-day mortality, may not be the most appropriate reference model for this study with its 2-year follow-up. However, currently established models do not predict longer-term risk of death in patients with acute HF. Finally, due to the low sensitivity and limited availability of detection reagents, we did not include some interesting biomarkers in this study.

CONCLUSIONS
Multi-biomarker models based on an appropriate machine learning method can be a powerful tool for predicting long-term risk of death in patients hospitalized for HFpEF. Our findings should be verified in future studies in other ethnic groups.

DATA AVAILABILITY STATEMENT
The data analyzed in this study is subject to the following licenses/restrictions: The China PEACE 5p-HF Study is a national program, and as the government policy stipulates, it is not permissible for the researchers to make the raw data publicly available at this time.
Requests to access these datasets should be directed to jing.li@fwoxford.org.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Central Ethics Committee at Fuwai Hospital and Local Internal Ethics Committees at Study Hospitals. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
JinL, YG, and JiaL designed the study. XB and HD designed the biostatistical methods and analyzed the data. YG drafted the manuscript. Other authors revised the manuscript for important intellectual content. All the authors participated in interpretation of the data and approved the final version of the manuscript.