Predicting mortality in critically ill patients with hypertension using machine learning and deep learning models

Zhang, Ziyang; Ye, Jiancheng

doi:10.3389/fcvm.2025.1568907

ORIGINAL RESEARCH article

Front. Cardiovasc. Med., 08 August 2025

Sec. Hypertension

Volume 12 - 2025 | https://doi.org/10.3389/fcvm.2025.1568907

Predicting mortality in critically ill patients with hypertension using machine learning and deep learning models

Ziyang Zhang¹

Jiancheng Ye^2*

¹Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL, United States
²Weill Cornell Medicine, Cornell University, New York, NY, United States

Background: Accurate prediction of mortality in critically ill patients with hypertension admitted to the Intensive Care Unit (ICU) is essential for guiding clinical decision-making and improving patient outcomes. Traditional prognostic tools often fall short in capturing the complex interactions between clinical variables in this high-risk population. Recent advances in machine learning (ML) and deep learning (DL) offer the potential for developing more sophisticated and accurate predictive models.

Objective: This study aims to evaluate the performance of various ML and DL models in predicting mortality among critically ill patients with hypertension, with a particular focus on identifying key clinical predictors and assessing the comparative effectiveness of these models.

Methods: We conducted a retrospective analysis of 30,096 critically ill patients with hypertension admitted to the ICU. Various ML models, including logistic regression, decision trees, and support vector machines, were compared with advanced DL models, including 1D convolutional neural networks (CNNs) and long short-term memory (LSTM) networks. Model performance was evaluated using area under the receiver operating characteristic curve (AUC) and other performance metrics. SHapley Additive exPlanations (SHAP) values were used to interpret model outputs and identify key predictors of mortality.

Results: The 1D CNN model with an initial selection of predictors achieved the highest AUC (0.7744), outperforming both traditional ML models and other DL models. Key clinical predictors of mortality identified across models included the APS-III score, age, and length of ICU stay. The SHAP analysis revealed that these predictors had a substantial influence on model predictions, underscoring their importance in assessing mortality risk in this patient population.

Conclusion: Deep learning models, particularly the 1D CNN, demonstrated superior predictive accuracy compared to traditional ML models in predicting mortality among critically ill patients with hypertension. The integration of these models into clinical workflows could enhance the early identification of high-risk patients, enabling more targeted interventions and improving patient outcomes. Future research should focus on the prospective validation of these models and the ethical considerations associated with their implementation in clinical practice.

Introduction

Hypertension is a prevalent chronic condition, affecting nearly half of the adult population globally and contributing significantly to the burden of cardiovascular and renal disease (1). In the context of critical illness, hypertension is not only a common comorbidity but also a dynamic clinical factor that can complicate patient management. In Intensive Care Units (ICUs), patients with preexisting hypertension may present with exacerbations of end-organ damage, such as acute heart failure, ischemic stroke, or acute kidney injury, often requiring aggressive interventions (2). Additionally, critical illness itself can cause blood pressure variability due to sepsis, trauma, fluid shifts, medication effects, and mechanical ventilation, further complicating hypertension control.

Management of hypertension in the ICU is challenging. While chronically elevated blood pressure may be tolerated or even protective in some cases, acute blood pressure elevations (hypertensive emergencies) demand immediate treatment to prevent life-threatening complications like intracerebral hemorrhage or aortic dissection. Conversely, overcorrection can lead to hypoperfusion, ischemia, and worsened outcomes. Therapeutic strategies often include intravenous antihypertensives, volume management, and continuous hemodynamic monitoring. Despite advances in monitoring and pharmacologic management, patients with hypertension remain at increased risk of in-hospital mortality, prolonged ICU stays, and adverse neurologic or cardiovascular events (3).

Given the complex interplay of comorbidities, physiologic instability, and treatment responses, accurate and early risk stratification in this population is critical. Traditional prognostic tools, such as scoring systems like the Acute Physiology and Chronic Health Evaluation (APACHE) and Sequential Organ Failure Assessment (SOFA), offer general severity estimates but may not fully capture the nonlinear and multifactorial relationships that govern outcomes in hypertensive ICU patients (4). While these tools provide valuable insights, they are often limited by their reliance on a predefined set of clinical variables and may not fully capture the complex interactions between patient characteristics and clinical outcomes. Furthermore, the predictive accuracy of these traditional models can be compromised by the heterogeneous nature of critically ill populations, including those with hypertension (5).

Recent advances in machine learning (ML) and deep learning (DL) have opened new avenues for developing more sophisticated and accurate predictive models (6). These models can handle large, multidimensional datasets and automatically identify complex patterns and interactions within the data, offering the potential for more personalized and precise predictions (7). In particular, deep learning models, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), have shown promise in various clinical applications, including the prediction of mortality, disease progression, and treatment response (8, 9).

Despite the potential advantages of ML and DL models, their application in predicting mortality among critically ill patients with hypertension remains underexplored. This study seeks to address this gap by evaluating the performance of various ML and DL models in predicting mortality in this high-risk population. Specifically, we compare traditional ML models, such as logistic regression and decision trees, with advanced DL models, including 1D CNNs (10) and long short-term memory (LSTM) networks (11). We also investigate the importance of different clinical features in predicting mortality, with a particular focus on the role of commonly used ICU scoring systems, such as the APS-III score and SOFA score (12).

The objectives of this study are twofold: first, to identify the most effective predictive model for mortality in critically ill patients with hypertension; and second, to provide insights into the key clinical features that drive these predictions. By leveraging the strengths of both ML and DL approaches, we aim to contribute to the development of more accurate and clinically useful prognostic tools that can support decision-making in the ICU.

Methods

Dataset processing

This study utilized the Medical Information Mart for Intensive Care IV (MIMIC-IV) dataset, which contains de-identified healthcare information for approximately 40,000 patients admitted to critical care units at Beth Israel Deaconess Medical Center (BIDMC) between 2008 and 2019. (13) Patients with hypertension were identified using ICD-9 and ICD-10 codes (14). We extracted the following data from the patients’ electronic health records (EHRs): (1) Demographics, including self-reported race, sex, and age; (2) Vital signs; (3) Laboratory test results; (4) Maximum creatinine levels on days two and three; and (5) Patient mortality.

We also calculated the Sequential Organ Failure Assessment (SOFA) Score and the Acute Physiology Score III (APS-III) for each patient. The SOFA score assesses organ failure in ICU patients, tracking the status of six organ systems: respiratory, cardiovascular, hepatic, coagulation, renal, and neurological (15). SOFA is advantageous as it does not require specific tests and relies on routinely collected ICU data. The APS-III score evaluates the severity of illness in adult ICU patients (16) and is an enhanced version of the Acute Physiology and Chronic Health Evaluation (APACHE) system (17). APS-III is designed to predict hospital mortality with greater accuracy by assessing the patient's current health status, underlying medical conditions, and complications arising during their ICU stay. It is widely used in critical care research to compare illness severity across patients and to adjust outcomes in clinical studies (18).

We addressed missing data using Multivariable Imputation by Chained Equations (MICE) (19). MICE is a robust statistical technique that imputes missing values with plausible estimates, creating complete datasets. Initially, missing values are filled with preliminary imputations, such as means and standard deviations. The method then iteratively models each variable with missing data as a function of other variables in a chained equation system, updating the imputations at each step. This process is repeated over multiple cycles until convergence. Multiple complete datasets are generated, analyzed separately, and the results are combined to account for imputation uncertainty.

This study aimed to predict patient mortality using various predictors and to examine the relationship between these predictors and mortality. We employed multiple machine learning models, including Decision Trees and Support Vector Machines (SVM), as well as deep learning models, such as one-dimensional convolutional neural networks (1D CNNs) and recurrent neural networks (RNNs). We excluded data with values outside reasonable ranges (e.g., heart rate above 200).

Statistical analysis

Statistical analysis was conducted using Python and the open-source library “statsmodels” (20). The analysis had two main objectives: (1) to explore the bivariate association between patient mortality and predictors, and (2) to perform initial feature selection using conventional statistical machine learning models, specifically multivariable logistic regression. We defined a model with all predictors as the initial selection and a model with highly associated predictors as the backward selection. For the initial selection, we performed multivariable logistic regression to obtain unadjusted relationships with p-values. Predictors with the highest p-values greater than 0.05 were iteratively removed until all remaining variables had p-values below 0.05. The resulting set of predictors was considered strongly relevant and used as input variables for the backward selection. Model performance was assessed using root mean square error (RMSE) through 10-fold cross-validation. For logistic regression, odds ratios, p-values, and 95% confidence intervals were computed. Receiver operating characteristic (ROC) curves were drawn for both the initial and backward selections during the 10 runs of cross-validation, and the area under the curve (AUC) was calculated for each run. A higher AUC (closer to 1) indicates better model performance, while an AUC of 0.5 suggests no discrimination (21).

Model

We developed multiple classifiers to predict mortality using scikit-learn (22) for machine learning models and TensorFlow (23) for deep learning models. Categorical variables, such as race, were encoded as vectors using one-hot encoding, and continuous variables, such as lab test results, were standardized by subtracting the mean and dividing by the standard deviation to facilitate model fitting.

Several algorithms were used to predict mortality, including logistic regression, support vector machines, multilayer perceptrons (a type of artificial neural network), random forests, bagging decision trees, and boosting decision trees. For logistic regression, we optimized the binary cross-entropy loss using the “Newton-Cholesky” solver, which is well-suited for binary classification with a large number of samples compared to features. For the support vector machine, we employed the stochastic gradient descent algorithm to minimize the log-loss function. The multilayer perceptron architecture included two hidden layers, with 7 neurons in the first layer and 4 neurons in the second, using Rectified Linear Unit (ReLU) activation functions and the Adam solver (24). For the random forest, we set the number of estimators to 100 and the maximum number of features per tree to the square root of the total features. The bagging decision trees also had 100 estimators, with decision trees as the base estimator. Lastly, the boosting decision trees were implemented using a histogram-based gradient boosting strategy, which is more powerful and efficient for large datasets than other strategies.

To ensure generalizability and robustness, we employed three prominent deep learning methods: 1D CNNs, RNNs, and long short-term memory networks (LSTMs). The 1D CNN, a variant of the traditional 2D CNN, is specifically designed for processing sequential data, such as time-series data. Our model architecture included two 1D convolutional layers with a max-pooling layer in between, followed by two fully connected layers, a dropout layer for regularization, and a SoftMax output layer for classification. For the RNN and LSTM models (11), we replaced the 1D CNN with RNN and LSTM units, respectively, while keeping the rest of the architecture the same. The binary cross-entropy loss was optimized using the Adam optimizer with a learning rate of 0.0001 over 20 epochs. To avoid overfitting, we monitored the training process by calculating the loss on the validation set and saved the optimal model at the point where validation loss began to increase. In our study, each sample from the dataset was treated as having 36 time steps, with a single feature per time step. Additionally, we incorporated SHAP (SHapley Additive exPlanations) values into our predictor selection process to evaluate predictor importance and refine the backward selection (25). SHAP is a game-theoretic approach that explains the output of machine learning models by connecting optimal credit allocation with local explanations using the classic Shapley values from game theory.

We evaluated all models using multiple metrics, including AUC, accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). The dataset was split into a training set (80%), validation set (10%), and testing set (10%). Hyperparameter tuning was conducted on the training and validation sets, and model performance was evaluated on the test set.

Study approval

This study exclusively used publicly available MIMIC-IV data.

Results

Statistics overview

A total of 30,096 patients met the inclusion criteria for this study. The demographic and clinical characteristics of these patients are summarized in Table 1. The majority of the cohort was white (67.55%), with 13,619 (45.25%) patients deceased following ICU admission, and 16,477 (54.75%) patients surviving. This near-equitable distribution indicates that the classification task is reasonably balanced.

Table 1

Table 1. Characteristics of patients with hypertension in ICU.

Supplementary Table S1 presents the bivariate associations between patient mortality and the predictors of interest. In the unadjusted logistic regression analysis, variables such as minimum SpO2, mean SpO2, minimum platelet count, and maximum sodium levels (p > 0.05) were not significantly associated with patient mortality post-ICU admission. The multivariable logistic regression model using backward selection is detailed in Supplementary Table S2. The analysis indicates that patients with elevated red blood cell counts and creatinine levels are more likely to succumb upon ICU admission.

Models

We conducted logistic regression analyses using 10-fold cross-validation across 10 iterations, each with different random seeds. The results, shown in Supplementary Table S3, suggest that the initial selection model exhibits superior classification performance compared to the backward selection model. This finding implies that incorporating all predictors may lead to more accurate predictions.

Table 2 outlines the performance metrics on the testing set across the models employed in the backward selection, where the top fifteen predictors ranked by the SHAP framework were used as input. Each metric provides unique and critical insights into the model's performance. Among the models, the 1D CNN achieved the highest AUC (0.7744), demonstrating both good sensitivity and a strong positive predictive value. This suggests that the model not only excels at identifying positive cases but also produces highly reliable positive predictions. These results indicate that the models effectively fit the data without overfitting, due to the cross-validation strategy employed. Notably, the LSTM model reached the highest NPV (0.7317). Overall, the 1D CNN model from the initial selection outperformed the other models across all metrics, indicating that deep learning approaches may be particularly powerful for this study.

Table 2

Table 2. Performance metrics on the models.

Figure 1 illustrates the ROC curves for the machine learning models. Among these, the boosting decision trees model achieved the best AUC (0.762), regardless of whether the initial or backward selection was used. Figure 2 presents the ROC curves for the deep learning models, with the 1D CNN emerging as the top performer (AUC: 0.770).

Figure 1

ROC curve showing the performance of backward selection for multiple machine learning models. The y-axis represents the true positive rate and the x-axis represents the false positive rate. Models include Boosting DT (area 0.762), MLP (area 0.757), LR (area 0.753), RF (area 0.746), Bagging DT (area 0.741), and SVM (area 0.728). A dashed line indicates the random classifier threshold.

Figure 1. ROC curve of machine learning models.

Figure 2

ROC curve comparing deep learning models: 1D CNN (orange), LSTM (blue), and RNN (red). The 1D CNN has the largest area under the curve at 0.770, followed by LSTM at 0.759, and RNN at 0.758. The plot includes a diagonal line representing a random classifier.

Figure 2. ROC curves of deep learning models.

Model interpretation

We provided an interpretation of the 1D CNN model, which had the best overall performance (AUC: 0.774). First, we identified the top 25 predictors based on their mean absolute SHAP values. Figure 3 shows the relationship between these predictors and their influence on the model's output. Globally, the APS-III score, age, and length of ICU stay were the top three predictors.

Figure 3

Dot plot showing SHAP values for various features impacting a model's output. Features like APS-III score, age, and ICU stay length are listed along the y-axis. Colors range from blue to pink, indicating low to high feature values. The x-axis displays SHAP values, demonstrating each feature's influence.

Figure 3. SHAP beeswarm plot of 1D CNN.

We further analyzed individual predictors with scatter plots. Figure 4(a) demonstrates that younger age (left side, negative SHAP values) contributes negatively to the model's output, while older age (right side, positive SHAP values) contributes positively. Extreme ages appear to have a more significant impact on the model's predictions, indicating that age has a monotonically increasing effect on the prediction of patient mortality. Figure 4(b) shows that SHAP values increase with the APS-III score, suggesting that a higher APS-III score correlates with an elevated probability of mortality, consistent with the score's role in assessing patient risk.

Figure 4

Two scatter plots showing SHAP values. The top plot relates SHAP values to Age, with values from negative four to four. The bottom plot relates SHAP values to APS-III scores, with a similar value range. Both plots indicate a positive correlation, with SHAP values increasing as Age and APS-III scores increase.

Figure 4. Scatter plots for SHAP values: age and APS-III score.

Finally, we used SHAP values to interpret two individual cases. Figure 5(a) displays the contribution of clinical measurements and patient characteristics to a positive mortality prediction, where age, APS-III score, and maximum blood urea nitrogen had the strongest positive contributions. Conversely, Figure 5(b) illustrates a case where the model predicted survival, with APS-III score, length of ICU stays, and SOFA score being the most significant factors in the negative prediction.

Figure 5

Top graph shows a bar chart with various medical features affecting a model prediction at \\(f(x) = 0.807\\), with age having the highest positive impact. Bottom graph indicates another prediction at \\(f(x) = 0.162\\) with APS-III score having the largest negative impact. Bars are colored red for positive and blue for negative impacts, illustrating feature contributions to predictions.

Figure 5. Waterfall plots for two individual cases. Top: true positive case; below: true negative case.

SHAP values were calculated for all models discussed in this paper (26). Across models, we identified a consistent set of significant predictors for mortality in critically ill patients with hypertension. These predictors include minimum bicarbonate, APS-III score, maximum blood urea nitrogen, maximum creatinine during days 2 and 3, mean respiratory rate, minimum creatinine, maximum red blood cell count, length of ICU stay, age, minimum prothrombin, SOFA score, minimum systolic blood pressure, minimum diastolic blood pressure, and minimum hemoglobin.

Discussion

Accurate prognosis prediction is vital for patient-centered care, facilitating shared decision-making and informing treatment strategies. In this study, we evaluated various machine learning and deep learning models to predict mortality in critically ill patients with hypertension upon ICU admission. Our findings reveal that deep learning models, particularly the 1D CNN, consistently outperformed traditional machine learning models in predictive accuracy, as measured by AUC and other performance metrics. Specifically, the 1D CNN model, using an initial selection of predictors, achieved the highest AUC (0.7744), demonstrating its superior ability to distinguish between patients who survived and those who did not after ICU admission.

Our analysis also underscored the significance of specific clinical features, with the APS-III score, age, and length of ICU stay emerging as the most influential predictors of mortality. These results highlight the importance of comprehensive physiological assessments and detailed patient history in developing predictive models for critically ill populations. A higher APS-III score was directly associated with an increased risk of mortality, reflecting its design as a severity-of-illness measure; patients with more acute physiological disturbances or chronic comorbidities had markedly higher predicted mortality probabilities. Similarly, older age showed a monotonic relationship with mortality risk—SHAP values revealed that as age increased, the model's predicted probability of death also increased, especially among patients at the extreme ends of the age distribution. The length of ICU stay also exhibited a positive correlation with mortality risk, where longer stays—often indicative of more severe illness or complications—were associated with higher SHAP values contributing to death predictions. These patterns not only validate the clinical relevance of the predictors but also provide interpretability for how the deep learning model distinguishes between survivors and non-survivors.

The identification of key predictors such as APS-III score, age, and length of ICU stay carries significant implications for clinical practice (27). These factors are routinely available in ICU settings, making the integration of the 1D CNN model into real-time clinical decision-making practical. With its high AUC and reliable positive predictive value, this model could assist healthcare providers in the early identification of high-risk patients, enabling more targeted interventions that may improve patient outcomes and optimize resource allocation (28).

Moreover, the model's ability to efficiently process large and complex datasets, coupled with its capacity for ongoing learning and adaptation, suggests that deep learning approaches could be instrumental in advancing precision medicine in critical care. This study demonstrates the effectiveness of both classic machine learning algorithms and modern deep learning models in predicting mortality among critically ill patients with hypertension. Our predictive models incorporated a diverse set of variables, leveraging professional medical knowledge by including SOFA and APS-III scores, which assess overall organ failure and the severity of illness. This approach enhanced the models' ability to predict patient mortality more accurately.

Among the models evaluated, the 1D CNN exhibited the best performance, likely due to its architecture that combines the learning capabilities of neural networks with the efficiency of convolutional filters. The 1D CNN's ability to capture local dependencies and recognize patterns within the data enabled it to outperform other models, including RNNs and LSTMs, which did not perform as well, possibly due to the lack of strong sequential dependencies in the predictors used (29). Decision tree algorithms also performed well, consistent with existing literature, while support vector machines (SVMs) lagged behind, potentially due to the high-dimensional complexity of the dataset.

The 1D CNN prediction model has the potential to facilitate advanced clinical decision-making for critically ill patients with hypertension. The 1D CNN is designed to process sequential data, such as time series or any data with a temporal dimension. Its structure, which includes convolutional layers, activation functions, max pooling, and fully connected layers, is particularly suited for tasks requiring the recognition of local and positionally invariant patterns (30). This makes 1D CNNs robust and efficient for analyzing large input sequences, eliminating the need for manual feature engineering, and ensuring robustness to shifts and variations in the input data. By reducing the number of parameters through shared weights and pooling, 1D CNNs are computationally efficient and can automatically learn to detect important features from raw data. This makes them particularly beneficial in complex clinical data scenarios, where data may not always be perfectly aligned or uniform. The model's robustness to slight variations in input data is especially valuable in clinical contexts, where data consistency can vary (30).

Our findings are consistent with previous research that has shown the utility of machine learning in predicting patient outcomes in the ICU (31). However, this study advances the field by demonstrating the particular effectiveness of deep learning models, such as the 1D CNN, in handling the complex and multidimensional data typically found in ICU settings. Unlike traditional machine learning models, which often require extensive feature engineering, deep learning models can automatically discern intricate patterns and interactions within the data, leading to potentially more accurate predictions (32).

The study also builds on prior work by using SHAP values to interpret the models, thereby providing transparency and insight into the decision-making process. Through the SHAP framework, we explored the impact of predictors across all employed models. This feature selection method has become widely recognized for explaining the effects of each feature on learning algorithms. Addressing one of the main criticisms of deep learning models—namely, their “black-box” nature—this approach elucidates the contributions of individual predictors to the model's predictions. Such interpretability is crucial for gaining the trust of clinicians and ensuring that these models are used effectively and ethically in practice.

The deep learning models developed in this study can be effectively integrated into clinical workflows in several ways. For example, they could be embedded into automated risk scoring systems at the time of ICU admission, using routinely collected data such as vitals, laboratory results, and scoring systems (e.g., APS-III and SOFA). These models could also be deployed as part of an early warning system (EWS) that continuously monitors patient data in real-time, flagging high-risk individuals for early intervention. Integration into existing EHR systems would allow the model to run passively in the background and trigger clinical alerts when a patient's predicted mortality risk surpasses a predefined threshold. This would enable prioritization of care, more timely allocation of critical resources (e.g., specialist consultation, escalation of therapy), and facilitate shared decision-making discussions with patients and families. Importantly, the use of SHAP values to explain individual predictions enhances model transparency, making it easier for clinicians to trust and act on the model's outputs in high-stakes environments like the ICU.

Limitations

Despite its strengths, this study has several limitations. First, the study was conducted using a retrospective dataset, which may limit the generalizability of the findings to other patient populations or clinical settings. Future research should aim to validate these models prospectively in diverse ICU populations to confirm their utility and robustness in real-world settings. Second, although we employed SHAP values to enhance model interpretability, there remains a need for further research into the ethical implications of using such models in clinical practice. Issues such as algorithmic bias, patient consent, and the transparency of decision-making processes should be rigorously examined to ensure that these technologies are implemented responsibly (33). Finally, while the models demonstrated good predictive performance, there is room for improvement. Exploring ensemble methods that combine multiple deep learning models or incorporating time-series data could further enhance predictive accuracy. Additionally, integrating multi-modal data sources, such as genomic information, clinical notes (34), patient-generated health data (35), and imaging data (36), could provide a more comprehensive view of patient health, leading to even more personalized and precise predictions (37).

Conclusion

This study demonstrates the potential of deep learning models, particularly the 1D CNN, in predicting mortality in critically ill patients with hypertension in the ICU. The superior performance of these models, coupled with their ability to handle complex datasets and provide interpretable results, positions them as powerful tools for improving patient outcomes in critical care settings. However, ongoing research is needed to validate these findings in broader patient populations and to address the ethical challenges associated with the deployment of such technologies in clinical practice. By continuing to refine these models and ensuring their responsible use, we can move closer to realizing the promise of precision medicine in the ICU.

Data availability statement

Publicly available datasets were analyzed in this study. This data can be found here: https://mimic.physionet.org/.

Author contributions

ZZ: Formal analysis, Methodology, Validation, Visualization, Writing – original draft. JY: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing.

Funding

The author(s) declare that no financial support was received for the research and/or publication of this article.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declare that no Generative AI was used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fcvm.2025.1568907/full#supplementary-material

References

1. Flack JM, Adekola B. Blood pressure and the new ACC/AHA hypertension guidelines. Trends Cardiovasc Med. (2020) 30(3):160–4. doi: 10.1016/j.tcm.2019.05.003

PubMed Abstract | Crossref Full Text | Google Scholar

2. Malbrain ML, Chiumello D, Pelosi P, Bihari D, Innes R, Ranieri VM, et al. Incidence and prognosis of intraabdominal hypertension in a mixed population of critically ill patients: a multiple-center epidemiological study. Crit Care Med. (2005) 33(2):315–22. doi: 10.1097/01.ccm.0000153408.09806.1b

PubMed Abstract | Crossref Full Text | Google Scholar

3. Ye J, Sanchez-Pinto LN. Three data-driven phenotypes of multiple organ dysfunction syndrome preserved from early childhood to middle adulthood. AMIA Annu Symp Proc. (2021) 2020:1345–53.33936511

PubMed Abstract | Google Scholar

4. Beigmohammadi MT, Amoozadeh L, Rezaei Motlagh F, Rahimi M, Maghsoudloo M, Jafarnejad B, et al. Mortality predictive value of APACHE II and SOFA scores in COVID-19 patients in the intensive care unit. Can Respir J. (2022) 2022:5129314. doi: 10.1155/2022/5129314

PubMed Abstract | Crossref Full Text | Google Scholar

5. Roncon L, Zuin M, Zuliani G, Rigatelli G. Patients with arterial hypertension and COVID-19 are at higher risk of ICU admission. Br J Anaesth. (2020) 125(2):e254–e255. doi: 10.1016/j.bja.2020.04.056

PubMed Abstract | Crossref Full Text | Google Scholar

6. Syed M, Syed S, Sexton K, Syeda HB, Garza M, Zozus M, et al. Application of machine learning in intensive care unit (ICU) settings using MIMIC dataset: systematic review. Informatics (MDPI). (2021) 8(1):16. doi: 10.3390/informatics8010016

PubMed Abstract | Crossref Full Text | Google Scholar

7. Ye J, Woods D, Jordan N, Starren J. The role of artificial intelligence for the application of integrating electronic health records and patient-generated data in clinical decision support. AMIA Jt Summits Transl Sci Proc. (2024) 2024:459–67.38827061

PubMed Abstract | Google Scholar

8. van de Sande D, van Genderen ME, Huiskens J, Gommers D, van Bommel J. Moving from bytes to bedside: a systematic review on the use of artificial intelligence in the intensive care unit. Intensive Care Med. (2021) 47(7):750–60. doi: 10.1007/s00134-021-06446-7

PubMed Abstract | Crossref Full Text | Google Scholar

9. Ye J. Patient safety of perioperative medication through the lens of digital health and artificial intelligence. JMIR Perioper Med. (2023) 6:e34453. doi: 10.2196/34453

PubMed Abstract | Crossref Full Text | Google Scholar

10. Kiranyaz S, Ince T, Abdeljaber O, Avci O, Gabbouj M. 1-D convolutional neural networks for signal processing applications. In: ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); IEEE (2019). p. 8360–4.

Google Scholar

11. Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. (1997) 9(8):1735–80. doi: 10.1162/neco.1997.9.8.1735

PubMed Abstract | Crossref Full Text | Google Scholar

12. Johnson AE, Kramer AA, Clifford GD. A new severity of illness scale using a subset of acute physiology and chronic health evaluation data elements shows comparable predictive accuracy. Crit Care Med. (2013) 41(7):1711–8. doi: 10.1097/CCM.0b013e31828a24fe

PubMed Abstract | Crossref Full Text | Google Scholar

13. Johnson AEW, Bulgarelli L, Shen L, Gayles A, Shammout A, Horng S, et al. MIMIC-IV, a freely accessible electronic health record dataset. Sci Data. (2023) 10(1):1. doi: 10.1038/s41597-022-01899-x. (Erratum in: Sci Data. 2023 Jan 16;10(1):31. doi: 10.1038/s41597-023-01945-2. Erratum in: Sci Data. 2023 Apr 18;10(1):219. doi: 10.1038/s41597-023-02136-9).36596836

PubMed Abstract | Crossref Full Text | Google Scholar

14. Quan H, Sundararajan V, Halfon P, Fong A, Burnand B, Luthi JC, et al. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Med Care. (2005) 43(11):1130–9. doi: 10.1097/01.mlr.0000182534.19832.83

PubMed Abstract | Crossref Full Text | Google Scholar

15. Vincent JL, Moreno R, Takala J, Willatts S, De Mendonça A, Bruining H, et al. The SOFA (Sepsis-related Organ Failure Assessment) score to describe organ dysfunction/failure. On behalf of the Working Group on Sepsis-Related Problems of the European Society of Intensive Care Medicine. Intensive Care Med. (1996) 22(7):707–10. doi: 10.1007/BF01709751

PubMed Abstract | Crossref Full Text | Google Scholar

16. Knaus WA, Wagner DP, Draper EA, Zimmerman JE, Bergner M, Bastos PG, et al. The APACHE III prognostic system. Risk prediction of hospital mortality for critically ill hospitalized adults. Chest. (1991) 100(6):1619–36. doi: 10.1378/chest.100.6.1619

PubMed Abstract | Crossref Full Text | Google Scholar

17. Wagner DP, Draper EA, Abizanda Campos R, Nikki P, Le Gall JR, Loirat P, et al. Initial international use of APACHE. An acute severity of disease measure. Med Decis Making. (1984) 4(3):297–313. doi: 10.1177/0272989X8400400305

PubMed Abstract | Crossref Full Text | Google Scholar

18. Ye J, Yao L, Shen J, Janarthanam R, Luo Y. Predicting mortality in critically ill patients with diabetes using machine learning and clinical notes. BMC Med Inform Decis Mak. (2020) 20(Suppl 11):295. doi: 10.1186/s12911-020-01318-4

PubMed Abstract | Crossref Full Text | Google Scholar

19. van Buuren S, Groothuis-Oudshoorn K. mice: Multivariate imputation by chained equations in R. J Stat Softw. (2011) 45(3):1–67. doi: 10.18637/jss.v045.i03

Crossref Full Text | Google Scholar

20. Seabold S, Perktold J. Statsmodels: econometric and statistical modeling with python. SciPy. (2010) 7(1):92–6. doi: 10.25080/Majora-92bf1922-011

Crossref Full Text | Google Scholar

21. Myerson J, Green L, Warusawitharana M. Area under the curve as a measure of discounting. J Exp Anal Behav. (2001) 76(2):235–43. doi: 10.1901/jeab.2001.76-235

PubMed Abstract | Crossref Full Text | Google Scholar

22. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: machine learning in Python. J Mach Learn Res. (2011) 12:2825–30.

Google Scholar

23. Abadi M, Barham P, Chen J, Chen Z, Davis A, Dean J, et al. {TensorFlow}: a system for {Large-Scale} machine learning. 12th USENIX symposium on operating systems design and implementation (OSDI 16) (2016). p. 265–83

Google Scholar

24. Eckle K, Schmidt-Hieber J. A comparison of deep networks with ReLU activation function and linear spline-type methods. Neural Netw. (2019) 110:232–42. doi: 10.1016/j.neunet.2018.11.005

PubMed Abstract | Crossref Full Text | Google Scholar

25. Lundberg SM, Lee S-I. A unified approach to interpreting model predictions. Adv Neural Inf Process Syst. (2017) 30. doi: 10.48550/arXiv.1705.07874

Crossref Full Text | Google Scholar

26. Marcílio WE, Eler DM. From explanations to feature selection: assessing SHAP values as feature selection mechanism. In: 2020 33rd SIBGRAPI conference on Graphics, Patterns and Images (SIBGRAPI); IEEE (2020). p. 340–7.

Google Scholar

27. Ye J. Design and development of an informatics-driven implementation research framework for primary care studies. AMIA Annu Symp Proc. (2022) 2021:1208–14.35308925

PubMed Abstract | Google Scholar

28. Ye J, Sanuade OA, Hirschhorn LR, Walunas TL, Smith JD, Birkett MA, et al. Interventions and contextual factors to improve retention in care for patients with hypertension in primary care: hermeneutic systematic review. Prev Med. (2024) 180:107880. doi: 10.1016/j.ypmed.2024.107880

PubMed Abstract | Crossref Full Text | Google Scholar

29. Wu H, Prasad S. Convolutional recurrent neural networks for hyperspectral data classification. Remote Sens (Basel). (2017) 9(3):298. doi: 10.3390/rs9030298

Crossref Full Text | Google Scholar

30. Kiranyaz S, Avci O, Abdeljaber O, Ince T, Gabbouj M, Inman DJ. 1D convolutional neural networks and applications: a survey. Mech Syst Signal Process. (2021) 151:107398. doi: 10.1016/j.ymssp.2020.107398

Crossref Full Text | Google Scholar

31. Shillan D, Sterne JAC, Champneys A, Gibbison B. Use of machine learning to analyse routinely collected intensive care unit data: a systematic review. Crit Care. (2019) 23(1):284. doi: 10.1186/s13054-019-2564-9

PubMed Abstract | Crossref Full Text | Google Scholar

32. Kong G, Lin K, Hu Y. Using machine learning methods to predict in-hospital mortality of sepsis patients in the ICU. BMC Med Inform Decis Mak. (2020) 20:1–10. doi: 10.1186/s12911-020-01271-2

PubMed Abstract | Crossref Full Text | Google Scholar

33. Ye J, Ren Z. Examining the impact of sex differences and the COVID-19 pandemic on health and health care: findings from a national cross-sectional study. JAMIA Open. (2022) 5(3):ooac076. doi: 10.1093/jamiaopen/ooac076

PubMed Abstract | Crossref Full Text | Google Scholar

34. Ye J, He L, Hai J, Xu C, Ding S, Beestrum M. Development and application of natural language processing on unstructured data in hypertension: a scoping review. medRxiv. (2024):2024–02. doi: 10.1101/2024.02.27.24303468

Crossref Full Text | Google Scholar

35. Ye J, Hai J, Song J, Wang Z. The role of artificial intelligence in the application of the integrated electronic health records and patient-generated health data. medRxiv. (2024):2024–05. doi: 10.1101/2024.05.01.24306690

Crossref Full Text | Google Scholar

36. Cohen JP, Morrison P, Dao L, Roth K, Duong TQ, Ghassemi M. Covid-19 image data collection: Prospective predictions are the future. arXiv [Preprint]. arXiv:2006.11988 (2020).

Google Scholar

37. Ye J, Hai J, Song J, Wang Z. Multimodal data hybrid fusion and natural language processing for clinical prediction models. AMIA Jt Summits Transl Sci Proc. (2024) 2024:191–200.38827058

PubMed Abstract | Google Scholar

Keywords: hypertension, intensive care unit, mortality prediction, machine learning, deep learning, convolutional neural networks, SHAP analysis

Citation: Zhang Z and Ye J (2025) Predicting mortality in critically ill patients with hypertension using machine learning and deep learning models. Front. Cardiovasc. Med. 12:1568907. doi: 10.3389/fcvm.2025.1568907

Received: 30 January 2025; Accepted: 14 July 2025;
Published: 8 August 2025.

Edited by:

Tlili Barhoumi, King Abdullah International Medical Research Center (KAIMRC), Saudi Arabia

Reviewed by:

Gianluca Paternoster, San Carlo Hospital, Italy
Larisa Anghel, Institute of Cardiovascular Diseases, Romania

Copyright: © 2025 Zhang and Ye. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Jiancheng Ye, amlhbmNoZW5nLnllQHUubm9ydGh3ZXN0ZXJuLmVkdQ==

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.