A machine learning model for prediction of cardiac arrest-associated acute kidney injury in the ICU: an internal and external validation study

Xu, Wenbo; Cheng, Shenchi; Wu, Chenxi; Li, Chen; Ni, Tianhao; Ni, Peifeng; Zhang, Gensheng; Diao, Mengyuan; Hu, Wei

doi:10.3389/fmed.2025.1717973

ORIGINAL RESEARCH article

Front. Med., 20 January 2026

Sec. Intensive Care Medicine and Anesthesiology

Volume 12 - 2025 | https://doi.org/10.3389/fmed.2025.1717973

This article is part of the Research Topic: The Future of Artificial Intelligence in Acute Kidney Injury

Wenbo Xu^1,2^†

Shenchi Cheng^1,2^†

Chenxi Wu^1,2^†

Chen Li^1,2^†

Tianhao Ni^1,2^†

Peifeng Ni^2,3

Gensheng Zhang⁴

Mengyuan Diao²^*

Wei Hu²^*

¹The Fourth School of Clinical Medicine, Zhejiang Chinese Medical University, Hangzhou, Zhejiang, China
²Department of Critical Care Medicine, Affiliated Hangzhou First People’s Hospital, School of Medicine, Westlake University, Hangzhou, Zhejiang, China
³Department of Critical Care Medicine, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
⁴Department of Critical Care Medicine, Second Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, China

Background: Cardiac arrest-associated acute kidney injury is common after cardiac arrest and adversely affects patient survival and disease outcomes. Early prediction of acute kidney injury is essential for guiding clinical management, especially in cardiac arrest patients admitted to the intensive care unit. Early detection of acute kidney injury can improve long-term outcomes.

Methods: Data were obtained from two local hospitals and the Medical Information Mart for Intensive Care (MIMIC)-IV database. Feature selection was performed using least absolute shrinkage and selection operator regression. Model performance was evaluated using decision curve analysis and calibration curves, and the best-performing model was interpreted with SHapley Additive exPlanations.

Results: This study included 873 patients from local hospitals and 719 patients from the MIMIC-IV database as an external validation cohort. Least absolute shrinkage and selection operator regression identified 10 predictor variables. The logistic regression model demonstrated the best performance in predicting cardiac arrest-associated acute kidney injury, with an area under the curve of 0.958 (95% CI: 0.942–0.974) in the training set, 0.953 (95% CI: 0.920–0.987) in the internal validation set, and 0.825 (95% CI: 0.791–0.859) in the external validation set. The model was further interpreted using the SHAP framework.

Conclusion: An externally validated logistic regression model incorporating 10 variables effectively predicted early acute kidney injury onset in cardiac arrest patients. The SHapley Additive exPlanations algorithm facilitated model interpretation, helping clinicians understand the contribution of each variable to acute kidney injury risk and identify the factors that contribute most to a patient's risk.

1 Introduction

Cardiac arrest is an acute condition with an extremely high mortality risk. In-hospital survival rates are approximately 25%, while out-of-hospital survival can be as low as 10–12% (1). Survivors often develop post-cardiac arrest syndrome, a clinical state characterized by neurological dysfunction, systemic inflammation, and vasoregulatory failure following the return of spontaneous circulation (ROSC). This syndrome results from systemic ischemia during circulatory arrest and subsequent reperfusion injury, with acute kidney injury (AKI) being a frequent complication (2–5).

Respiratory failure, neurological dysfunction, liver failure, and acute kidney injury are common complications following cardiac arrest. Among these, cardiac arrest-associated acute kidney injury (CA-AKI), characterized by rapid deterioration of renal function, significantly impacts patient prognosis, with an incidence ranging from 37 to 80% (6). It is associated with poorer clinical outcomes, including decreased survival (7, 8) and increased intensive care unit (ICU) mortality (9). Early prediction and tailored management of AKI are therefore essential in post-arrest care. Current diagnosis relies on the Kidney Disease: Improving Global Outcomes (KDIGO) criteria, which use serum creatinine levels or urine output. However, these markers have limitations: urine output can be influenced by hemodynamic status and treatments, while rises in serum creatinine are often delayed (10). These shortcomings underscore the need for more reliable predictive tools to identify AKI earlier in this high-risk population. Existing prediction tools are often developed at a single center, which makes them difficult to generalize across diverse patient populations. Higher-quality models that are based on multicenter data, incorporate larger patient cohorts, and are externally validated are therefore crucial for addressing these challenges.

The integration of machine learning (ML) with electronic health records has advanced predictive capabilities across various medical fields (11–13). Yet, its application for predicting AKI following cardiac arrest remains limited. Most existing models are derived from single-center, small-scale datasets and lack external validation, which restricts their generalizability and clinical applicability. To address this gap, our study developed an ML model using multicenter data and performed rigorous internal and external validation. Our aim was to create a robust, clinically generalizable tool for early prediction of AKI in patients after cardiac arrest.

The main contributions of this study are as follows:

• A high-performance prediction model based on logistic regression was developed: We integrated multicenter clinical data and utilized LASSO regression to select 10 key features, constructing a simplified and interpretable logistic regression (LR) model for predicting AKI within 48 h after cardiac arrest. This model demonstrated exceptional discriminatory ability (AUC > 0.95) in both the training set and internal validation set.

• Rigorous external validation was conducted: We externally validated the model using the publicly available international MIMIC-IV database as an independent validation cohort. The model retained good predictive performance (AUC = 0.825) across different patient populations and clinical settings, supporting its generalizability.

• SHAP-based clinical interpretability analysis was provided: We applied the SHAP algorithm to interpret the model, not only quantifying the global importance of each predictor but also revealing the nonlinear relationship between key variables (such as SOFA score and lactate) and AKI risk, along with specific clinical risk thresholds. This transformed model outputs into understandable clinical insights.

2 Related works

Research on the early prediction of acute kidney injury (AKI) has progressively shifted from reliance on clinical scoring systems and novel biomarkers toward data-driven analytical models (14–16). Despite these advances, there remains a notable lack of predictive tools specifically for AKI following cardiac arrest that possess robust generalization capability. The application of ML to electronic health records (EHRs) represents a paradigm shift, allowing for the capture of complex, nonlinear relationships and offering superior predictive performance in routine intensive care settings. Notably, models developed using large ICU databases such as MIMIC have demonstrated the ability to identify AKI hours before clinical diagnosis (17).

Over the past 5 years, the deep integration of ML with EHRs has substantially improved the accuracy and real-time processing of critical illness prognostic models. For example, one study (18) utilized an improved particle swarm optimization-based gradient optimizer to tune long short-term memory networks, achieving 89.9% accuracy on gait time series data from 73 patients with Parkinson’s disease. This illustrates the potential of combining meta-heuristic algorithms with recurrent neural networks for analyzing small-sample medical time-series data. Bacanin et al. (19) proposed an extreme learning machine (ELM) optimized by the group search firefly algorithm, which outperformed nine classical meta-heuristic algorithms across 16 medical benchmark datasets, including fetal heart rate curves. This suggests ELM as a candidate for ultra-low-latency scenarios in critical care. Zivkovic et al. (20) applied an improved arithmetic optimization algorithm to a convolutional neural network-XGBoost hybrid model, achieving 99.4% accuracy on a dataset of 12,000 COVID-19 chest X-ray images. Their two-stage “convolutional features + tree model” framework provides a structural reference for the “LSTM + LightGBM” cascade adopted in the present study. Jovanovic et al. (21) employed a recurrent neural network tuned by the crayfish optimization algorithm on the PTB-XL database, attaining an F1-score of 0.995, which confirms the robustness of meta-heuristic parameter optimization in physiological signal analysis.

In summary, ML research in critical care has transitioned from a paradigm of “traditional statistics + small feature sets” toward one centered on “machine learning + large-scale feature data.” This evolution establishes a methodological foundation for our study, which focuses on predicting AKI in the specific ICU subpopulation of post-cardiac arrest patients.

3 Methods

3.1 Study population

This study included cardiac arrest patients admitted to the intensive care units (ICUs) of the First People’s Hospital of Hangzhou and the Second Affiliated Hospital of Zhejiang University between 2017 and 2024. Data from these two centers constituted the training and internal validation cohort. For external validation, we used data from the Medical Information Mart for Intensive Care IV (MIMIC-IV, version 2.2) database. Local clinical data were extracted by the investigators from the electronic medical record systems of the participating hospitals. MIMIC-IV data were retrieved using Structured Query Language (SQL) queries executed via the PostgreSQL database management system with Navicat software, focusing on patients who were admitted to the ICU following cardiac arrest.

3.2 Inclusion and exclusion criteria

Adult patients with cardiac arrest (defined as sustained cessation of cardiac mechanical activity requiring external cardiopulmonary resuscitation) who were admitted to the ICU for the first time, had an ICU length of stay >72 h, and were aged ≥18 years at ICU admission were included. Exclusion criteria were: (1) pregnancy; (2) pre-existing chronic kidney disease (CKD); (3) anatomical renal abnormalities (including renal transplant recipients or congenital/acquired solitary kidney).

3.3 Data collection

The following variables were initially collected:

• Demographics: sex, age.

• Physical parameter: body mass index (BMI).

• Medical history: hypertension, diabetes, heart failure, myocardial infarction, cerebral infarction, chronic obstructive pulmonary disease (COPD), cirrhosis, cancer, chronic renal failure (CRF).

• Blood markers: alanine aminotransferase (ALT), aspartate aminotransferase (AST), total bilirubin (TBil), blood urea nitrogen (BUN), glucose, sodium (Na), potassium (K), chloride (Cl), calcium (Ca), prothrombin time (PT), international normalized ratio (INR), initial serum creatinine (InitialCr), lactate (Lac), albumin.

• Blood cell counts: hemoglobin, white blood cell count (WBC), platelet count (PLT).

• Interventions: percutaneous coronary intervention (PCI), extracorporeal membrane oxygenation (ECMO), continuous renal replacement therapy (CRRT), intra-aortic balloon pump (IABP), targeted temperature management (TTM), mechanical ventilation.

• Blood gas analysis: partial pressure of oxygen (PaO₂), partial pressure of carbon dioxide (PaCO₂), pH, bicarbonate (HCO₃⁻), base excess.

• Scoring systems: Charlson Comorbidity Index, Glasgow Coma Scale (GCS) score, Sequential Organ Failure Assessment (SOFA) score.

• Vital signs: heart rate (HR), mean systolic blood pressure (SBP), mean diastolic blood pressure (DBP), mean arterial pressure (MAP), respiratory rate (RR), temperature.

• Medications: vasoactive drugs, sodium bicarbonate, glucocorticoids, antiarrhythmic drugs.

In total, 53 variables were recorded. Serum creatinine levels were also measured at 24 and 48 h after ICU admission to assess acute kidney injury according to KDIGO criteria.
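The creatinine-based outcome labelling described above can be sketched as follows. This is an illustrative Python sketch, not the authors' code: the function name is hypothetical, the initial ICU creatinine is assumed to stand in for the baseline value, and only the KDIGO serum creatinine criteria (a rise of at least 0.3 mg/dL within 48 h, or a rise to at least 1.5 times baseline) are modelled. Urine output criteria are left out.

```python
UMOL_PER_MGDL = 88.4  # serum creatinine: 1 mg/dL is about 88.4 umol/L


def kdigo_creatinine_aki(initial_cr, cr_24h, cr_48h):
    """Flag AKI from serum creatinine values given in umol/L.

    Assumption: the initial ICU value is used as the baseline.
    Returns None when there is no baseline or no follow-up value.
    """
    followups = [v for v in (cr_24h, cr_48h) if v is not None]
    if initial_cr is None or not followups:
        return None
    peak = max(followups)
    # KDIGO: absolute rise of at least 0.3 mg/dL within 48 h
    absolute_rise = (peak - initial_cr) >= 0.3 * UMOL_PER_MGDL
    # KDIGO: rise to at least 1.5 times the baseline value
    relative_rise = peak >= 1.5 * initial_cr
    return absolute_rise or relative_rise
```

For example, a patient with creatinine values of 80, 90, and 110 μmol/L would be flagged, because the 30 μmol/L rise exceeds the 26.5 μmol/L (0.3 mg/dL) threshold.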

3.4 Statistical analysis and characteristic variable screening

To ensure model consistency and generalizability across multicenter cohorts, we first aligned variables by comparing the data of the local dataset and the MIMIC-IV dataset. Only variables present in both cohorts with consistent clinical definitions were retained; variables unique to either dataset were excluded to enable external validation. For shared variables, unit conversions were applied as needed to achieve uniformity (e.g., bilirubin units were converted from mg/dL in MIMIC-IV to μmol/L to match the local dataset). Missing data were handled in two stages. First, variables with >30% missing values in either dataset were excluded. For remaining missing values, multiple imputation was performed using the mice package in R. To prevent data leakage and ensure unbiased performance evaluation, imputation was conducted separately on the training set (local data) and the external test set (MIMIC-IV data). Specifically, the imputation model was fitted only on the training set and then applied to the test set. All analyzed variables represent the mean of clinical measurements recorded during the first 3 days after ICU admission for cardiac arrest.
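The harmonization steps in this paragraph can be illustrated with a small sketch. The authors worked in R with the mice package; the Python below is only a simplified stand-in. All function names are hypothetical, single median imputation replaces multiple imputation, and the bilirubin factor of 17.1 μmol/L per mg/dL is an approximation. What it shows is the leakage-avoiding pattern: fill values are learned from the training cohort only and then applied unchanged to the external cohort.

```python
from statistics import median

BILIRUBIN_UMOL_PER_MGDL = 17.1  # approximate conversion factor for total bilirubin


def to_umol_per_l(bilirubin_mg_dl):
    """Convert total bilirubin from mg/dL to umol/L, passing missing values through."""
    return None if bilirubin_mg_dl is None else bilirubin_mg_dl * BILIRUBIN_UMOL_PER_MGDL


def missing_fraction(column):
    return sum(v is None for v in column) / len(column)


def shared_usable_variables(local, external, max_missing=0.30):
    """Keep variables present in both cohorts with at most max_missing missing in each."""
    shared = sorted(set(local) & set(external))
    return [
        name for name in shared
        if missing_fraction(local[name]) <= max_missing
        and missing_fraction(external[name]) <= max_missing
    ]


def fit_median_imputer(train):
    """Learn one fill value per variable from the training cohort only."""
    return {
        name: median(v for v in column if v is not None)
        for name, column in train.items()
    }


def apply_imputer(fill_values, cohort):
    """Fill missing values using the values learned on the training cohort."""
    return {
        name: [fill_values[name] if v is None else v for v in column]
        for name, column in cohort.items()
        if name in fill_values
    }
```

Each cohort is represented here as a dictionary mapping a variable name to a list of values, with `None` marking a missing measurement.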

Baseline data were processed and analyzed using R (version 4.2.1). Categorical variables were compared with Fisher’s exact test and reported as frequencies (percentages). Continuous variables were compared with the Mann–Whitney U test and summarized as median (interquartile range). To identify significant predictors and mitigate multicollinearity, we calculated the variance inflation factor (VIF) for each variable; variables with VIF > 10 were excluded. Continuous variables were evaluated using analysis of variance to retain only those significantly associated with the outcome; categorical variables were screened using the chi-square test. All tests were two-tailed, with statistical significance set at p < 0.05.

Following the above steps, the remaining variables were further screened using least absolute shrinkage and selection operator (LASSO). LASSO regression is a widely used feature-selection method that penalizes regression coefficients based on their magnitude, thereby effectively eliminating non-informative variables and enhancing model interpretability.
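For reference, the penalty described above can be written out explicitly. The form below is the standard L1-penalized logistic regression objective for a binary outcome; the paper does not state the exact parameterization, so the notation is an assumption: $y_i \in \{0, 1\}$ is the AKI label, $x_i$ the vector of candidate predictors for patient $i$, $\beta_0$ the intercept, and $\lambda \ge 0$ the penalty parameter.

```latex
\hat{\beta}_0, \hat{\beta}
  = \arg\min_{\beta_0,\, \beta}
    \left\{
      -\frac{1}{n} \sum_{i=1}^{n}
        \left[
          y_i \left( \beta_0 + x_i^{\top} \beta \right)
          - \log\!\left( 1 + e^{\beta_0 + x_i^{\top} \beta} \right)
        \right]
      + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
    \right\}
```

The first term is the average negative log-likelihood, and the second shrinks coefficients toward zero. Because the penalty is on absolute values, sufficiently large $\lambda$ sets some $\beta_j$ exactly to zero, which is what removes non-informative variables.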

3.5 Model development and explainability

Five machine learning algorithms were applied to predict early AKI following cardiac arrest: logistic regression (LR), random forest (RF), support vector machine (SVM), extreme gradient boosting (XGBoost), and multilayer perceptron (MLP). Hyperparameters for each algorithm were optimized through randomized search tuning. To reduce overfitting and evaluation bias, a 5-fold cross-validation scheme was employed during training. The local cohort was randomly split into training and internal validation sets at a 7:3 ratio. Model performance was first assessed on the internal validation set. Subsequently, the finalized model was evaluated on the independent MIMIC-IV dataset for external validation to assess its generalizability and reliability.
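The data partitioning described here can be sketched in a few lines. This is an illustrative Python sketch with hypothetical function names; the paper does not say whether the split was stratified by outcome or which random seed was used, so a plain shuffled split with an arbitrary seed is assumed.

```python
import random


def split_70_30(indices, seed=0):
    """Shuffle patient indices and split them 7:3 into training and internal validation."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def k_fold(indices, k=5):
    """Yield (train, validation) index lists so that each index is held out exactly once."""
    folds = [indices[i::k] for i in range(k)]
    for i, held_out in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, held_out
```

With 873 local patients this gives 611 for training and 262 for internal validation. The 5 folds are drawn from the training portion only, so the internal validation set plays no part in hyperparameter tuning.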

Model performance was quantified using the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, specificity, F1-score, and Cohen’s kappa. Clinical utility across different risk thresholds was evaluated using decision curve analysis (DCA) and calibration curves. To enhance interpretability, SHapley Additive exPlanations (SHAP) were applied to illustrate the contribution of each predictor to the model’s output. This approach translates complex model decisions into an intuitive feature-outcome framework, supporting clinical understanding and practical application.
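The discrimination and classification metrics listed above have simple definitions, sketched below in plain Python. The function names are hypothetical, and the code assumes binary labels coded 0/1 with at least one case in each class and at least one positive prediction (otherwise some ratios are undefined).

```python
def confusion(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded 0/1."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn


def classification_metrics(y_true, y_pred):
    tp, tn, fp, fn = confusion(y_true, y_pred)
    n = tp + tn + fp + fn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Cohen's kappa: observed agreement corrected for agreement expected by chance
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - p_chance) / (1 - p_chance)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "kappa": kappa,
    }


def auc(y_true, scores):
    """Probability that a random case outscores a random non-case (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

The pairwise AUC is quadratic in the number of patients, which is acceptable for cohorts of this size; a rank-based formula gives the same value more efficiently.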

4 Results

4.1 Characteristics of the study population

From the two participating hospitals, 968 patients were initially identified, and 979 patients were retrieved from the MIMIC-IV database. After applying exclusion criteria to the local data, 873 patients were included, of whom 339 (39%) developed early AKI following cardiac arrest. The same inclusion and exclusion criteria were applied to the MIMIC-IV cohort, resulting in 719 patients for external validation, including 267 (37%) with early AKI. The patient selection process is summarized in Figure 1.

Figure 1. Flowchart of model development and verification.

Baseline characteristics of the local cohort are presented in Table 1. Patients with higher BMI showed a greater incidence of early AKI, independent of hypertension, diabetes, or myocardial infarction status. Those with early AKI required more frequent use of medications (e.g., vasoactive drugs, sodium bicarbonate) and interventions such as ECMO and CRRT (Supplementary Table 1). Additionally, early-AKI patients had significantly higher HR, RR, WBC, ALT, AST, TBil, InitialCr, BUN, glucose, K, Lac, PaCO₂, and SOFA scores. In contrast, they exhibited lower mean SBP, temperature, hemoglobin, PLT, Cl, Ca, and GCS scores compared to non-AKI patients. Baseline characteristics of the MIMIC-IV cohort are provided in Supplementary Table 2.

Table 1. Baseline data table for the study population.

4.2 Feature screening

The local dataset was randomly divided into a training set (70%) and an internal validation set (30%). A total of 34 candidate variables were included in LASSO regression for feature selection. Figure 2 illustrates the trajectory of variable coefficients and the cross-validation curve of the LASSO model. The penalty parameter (λ) that minimized the cross-validated binomial deviance was λ = 0.004, which retained 27 variables. At λ = 0.019 (the largest λ with an error within one standard error of the minimum), 10 variables were selected: diabetes, myocardial infarction, sodium bicarbonate use, heart rate, AST, glucose, lactate, PaCO₂, SOFA score, and initial creatinine. To balance model simplicity and predictive performance, the latter set of 10 variables was chosen for the final model.
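The choice between the two penalty values follows the usual one-standard-error rule, which can be sketched as below. The function name is hypothetical and the code is not tied to any particular LASSO package. It takes a cross-validation curve (candidate λ values, mean error per λ, and its standard error) and returns both the error-minimizing λ and the largest λ whose error stays within one standard error of that minimum.

```python
def select_lambdas(lambdas, cv_error, cv_se):
    """Return (lambda_min, lambda_1se) from a cross-validation curve."""
    best = min(range(len(lambdas)), key=lambda i: cv_error[i])
    threshold = cv_error[best] + cv_se[best]
    # Largest penalty whose mean error is still within one SE of the minimum
    lambda_1se = max(lam for lam, err in zip(lambdas, cv_error) if err <= threshold)
    return lambdas[best], lambda_1se


# Error values below are made up for illustration; they are not the study's curve.
lambdas = [0.001, 0.004, 0.010, 0.019, 0.040]
cv_error = [0.62, 0.60, 0.61, 0.625, 0.70]
cv_se = [0.03, 0.03, 0.03, 0.03, 0.03]
lambda_min, lambda_1se = select_lambdas(lambdas, cv_error, cv_se)
```

Because a larger penalty zeroes out more coefficients, the one-standard-error choice gives a sparser model at a small cost in cross-validated error.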

Figure 2

Plot A shows coefficient paths for different features against Log Lambda, with lines converging towards zero as Log Lambda increases. Plot B displays binomial deviance over Log Lambda values, with a curve that increases as Log Lambda decreases, and red markers indicating error ranges. Both plots have vertical dashed lines marking specific Log Lambda values.

Figure 2. LASSO regression for feature selection in CA-AKI prediction. (A) Path of coefficients as a function of the regularization parameter λ. (B) Ten-fold cross-validation for optimal λ selection.

4.3 Predictive modeling and performance evaluation

4.3.1 Training cohort and internal validation cohort

Five machine learning algorithms were developed to predict early CA-AKI. In the training set, the area under the receiver operating characteristic curve (AUC) values were: XGBoost, 0.999 (95% CI: 0.998–1.000); random forest (RF), 0.980 (95% CI: 0.969–0.991); LR, 0.958 (95% CI: 0.942–0.974); support vector machine (SVM), 0.929 (95% CI: 0.909–0.948); and multilayer perceptron (MLP), 0.918 (95% CI: 0.896–0.940) (Figures 3A,B). In the internal validation set, XGBoost and LR achieved the highest accuracy. SVM showed the highest sensitivity (0.906), while LR had the highest specificity (0.949). Detailed performance metrics are presented in Tables 2, 3.

Figure 3

Four-panel figure displaying machine learning model evaluations. Panel A shows the ROC curve for training data, while Panel B depicts the ROC curve for validation data, both highlighting model sensitivity and specificity. Panel C presents a validation decision curve, analyzing net benefit across different threshold probabilities. Panel D illustrates a calibration curve, comparing predicted values to actual outcomes, indicating model reliability. Different models are represented by distinct colored lines in each panel.

Figure 3. Model performance evaluation in the training set and validation sets. (A) Receiver operating characteristic (ROC) curve for training set. (B) Receiver operating characteristic (ROC) curve for validation set. (C) Decision curve analysis (DCA) evaluating clinical net benefit. (D) Calibration curve assessing agreement between predicted and observed probabilities.

Table 2. Performance of the ML models in the training cohort.

Table 3. Performance of the ML models in the internal validation cohort.

Calibration curves and decision curve analysis (DCA) further compared the five models (Figures 3C,D). XGBoost and LR demonstrated the best calibration, with curves closest to the ideal diagonal. XGBoost achieved a Brier score of 0.062 and LR 0.073, both superior to the other models. DCA also indicated higher clinical net benefit for XGBoost and LR across most risk thresholds. However, while XGBoost showed near-perfect discrimination in the training set (AUC = 0.999), its performance declined in internal validation (AUC = 0.962) and more markedly in external validation (AUC = 0.812). In contrast, LR exhibited stable performance across the training (AUC = 0.958), internal validation (AUC = 0.953), and external validation (AUC = 0.825) sets, indicating better generalizability.
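The Brier score and the net benefit plotted in a decision curve are both short formulas, sketched here in plain Python with hypothetical function names. The Brier score is the mean squared difference between predicted probability and observed outcome (lower is better). Net benefit at a threshold probability weighs true positives against false positives using the odds of that threshold.

```python
def brier_score(y_true, probs):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, probs)) / len(y_true)


def net_benefit(y_true, probs, threshold):
    """Net benefit of acting on patients whose predicted risk is at or above threshold."""
    n = len(y_true)
    tp = sum(y == 1 and p >= threshold for y, p in zip(y_true, probs))
    fp = sum(y == 0 and p >= threshold for y, p in zip(y_true, probs))
    # False positives are weighted by the odds of the threshold probability
    return tp / n - (fp / n) * threshold / (1 - threshold)
```

A decision curve is obtained by evaluating `net_benefit` over a grid of thresholds strictly between 0 and 1 and comparing the result with the "treat all" and "treat none" strategies.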

4.3.2 External validation cohort

External validation was performed on 719 patients from the MIMIC-IV database. Among the five models, LR achieved the highest AUC of 0.825 (95% CI: 0.791–0.859) (Figure 4), demonstrating robust performance despite differences in data sources. Detailed external validation metrics are listed in Table 4.

Figure 4

Five ROC curves compare machine learning models: A) MLP with AUC 0.679, B) Logistic Regression with AUC 0.825, C) Random Forest with AUC 0.765, D) XGBoost with AUC 0.812, E) SVM with AUC 0.816. Each graph plots sensitivity versus 1-specificity. Red dashed lines indicate the diagonal reference.

Figure 4. Model performance evaluation in the external validation set. (A) ROC curve in external validation for MLP; (B) ROC curve in external validation for LR; (C) ROC curve in external validation for RF; (D) ROC curve in external validation for XGBoost; (E) ROC curve in external validation for SVM.

Table 4. Performance of the ML models in the external validation cohort.

4.4 Interpretability analysis

The contribution of each variable in the LR model was assessed using SHapley Additive exPlanations (SHAP). Variables ranked by overall impact were: initial creatinine, SOFA score, PaCO₂, Heart rate, myocardial infarction, glucose, AST, lactate, diabetes, and sodium bicarbonate use (Figure 5). Local interpretability analysis illustrated how specific values of each variable influenced the prediction (Figure 6), where the x-axis represents the variable value and the y-axis its corresponding SHAP value.
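For a logistic regression model, SHAP values have a closed form when features are treated as independent: on the log-odds scale, the contribution of a feature is its coefficient times the deviation of the patient's value from the background mean. The sketch below illustrates this. The function names, coefficients, and means are hypothetical and are not the fitted values from this study, and the paper does not state which SHAP explainer was used.

```python
import math


def linear_shap(coefs, x, background_means):
    """SHAP values in log-odds units for a linear model with independent features."""
    return {name: coefs[name] * (x[name] - background_means[name]) for name in coefs}


def logit(intercept, coefs, x):
    return intercept + sum(coefs[name] * x[name] for name in coefs)


def predicted_risk(intercept, coefs, x):
    return 1 / (1 + math.exp(-logit(intercept, coefs, x)))


# Hypothetical coefficients and cohort means, for illustration only
coefs = {"sofa": 0.2, "lactate": 0.1}
means = {"sofa": 8.0, "lactate": 3.0}
patient = {"sofa": 12.0, "lactate": 5.0}
contributions = linear_shap(coefs, patient, means)
```

The contributions add up to the difference between the patient's log-odds and the log-odds at the background means, which is the additivity property that makes SHAP summary and dependence plots interpretable.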

Figure 5

SHAP summary plot showing the impact of various features on model output. Features include InitialCr, SOFA, PaCO2, Heart rate, Myocardial Infarction, Glucose, AST, Lactate, Diabetes, and Sodium Bicarbonate. The x-axis represents SHAP values, indicating the feature's effect size, while colors denote feature value, with red as high and blue as low.

Figure 5. Model interpretation of the LR model using SHAP. SHAP summary dot plot, showing the global importance, direction, and distribution of features. From top to bottom: InitialCr, SOFA, PaCO2, Heart rate, Myocardial infarction, Glucose, AST, Lactate, Diabetes, and Sodium bicarbonate.

Figure 6

Ten scatter plots labeled A to J show SHAP (SHapley Additive exPlanations) values on the y-axis against different medical parameters on the x-axis: AST, SOFA, Sodium Bicarbonate, PaCO2, Myocardial Infarction, Lactate, InitialCr, Heart rate, Glucose, and Diabetes. Each plot has data points colored according to a SHAP value scale, indicating the parameter's impact on the model's prediction.

Figure 6. SHAP dependence plots for clinical features in the LR model. (A) AST; (B) SOFA; (C) Sodium bicarbonate; (D) PaCO2; (E) Myocardial infarction; (F) Lactate; (G) InitialCr; (H) Heart rate; (I) Glucose; (J) Diabetes.

5 Discussion

In this study, we developed a predictive model for the early occurrence (within 48 h) of AKI in patients admitted to the ICU following cardiac arrest. Clinical data were collected from two hospitals between 2017 and 2024. Feature selection was performed using LASSO regression, which identified 10 predictive variables. Five machine learning algorithms were then trained and evaluated. All models performed well in the training and internal validation cohorts. However, a marked disparity in generalizability emerged during external validation using the independent MIMIC-IV database. While the LR, XGBoost, and SVM models all maintained AUC values above 0.8, the performance of the more complex models, particularly XGBoost, exhibited a significant decline from near-perfect training performance (AUC = 0.999) to a lower external validation AUC (0.812), a pattern suggestive of overfitting to the nuances of the training data. This likely stems from the model’s high complexity and capacity to capture not only the generalizable signal but also dataset-specific noise and subtle, non-generalizable patterns present in the training data. Factors such as differences in patient populations, ICU practices, and data collection protocols between the development data and external validation databases may have contributed to this decline, highlighting the challenge of achieving model portability across institutions. In contrast, the LR model demonstrated not only the highest external validation performance (AUC = 0.825) but also the most stable generalizability across all three datasets (training, internal, and external validation), and was therefore selected as the final model.

LR is a widely used multivariate method for analyzing the relationship between multiple predictors and a categorical outcome (22). In machine learning studies, LR has frequently outperformed more complex models, particularly in AKI prediction, where it has shown optimal performance in approximately 25.6% of published models, substantially higher than the 18–19% reported for other algorithms (22, 23). This advantage is often attributed to its lower propensity for overfitting, especially with limited clinical sample sizes, which leads to more reliable performance in new patient populations. Our results corroborate this pattern and support the robust predictive capability and generalizability of LR in clinical settings.

To enhance interpretability, we applied SHAP, a well-established post-hoc analysis framework that enables both global and local interpretation of model predictions (24, 25). Our analysis included global feature importance ranking and local explanation plots, the latter illustrating how individual variable values influence the prediction for specific patients.

During cardiac arrest, systemic ischemia–reperfusion injury induces oxidative stress, inflammatory mediator release, mitochondrial dysfunction, and microcirculatory impairment in renal tissue. Reduced oxidative phosphorylation exacerbates the production of reactive oxygen species, whose accumulation leads to apoptosis and kidney damage (26, 27). Clinical studies have consistently linked post-cardiac arrest AKI with decreased survival (28, 29), underscoring the importance of early detection and intervention for improving outcomes.

Previous attempts to predict AKI after cardiac arrest have been limited. Lin et al. used LASSO and LR to select four predictors from 15 clinical indicators (12). Hou et al. developed a model based on shock status, CRP, LDH, and ALP, achieving an AUC of 0.731 using the Dryad database (30). However, these models were derived from single-center data, lacked external validation, and did not incorporate a comprehensive set of clinically relevant variables, potentially omitting important predictors. In contrast, our study utilized multicenter data for model development, performed internal validation, and conducted external validation using MIMIC-IV. By initially evaluating 53 clinical indicators, we aimed to improve the comprehensiveness and robustness of the predictive model.

Our analysis identified several admission factors associated with early AKI in post-cardiac arrest patients: InitialCr, SOFA score, HR, AST, glucose, Lac, and PaCO₂. Additionally, diabetes, history of myocardial infarction, and the need for sodium bicarbonate were linked to increased AKI risk.

InitialCr reflects baseline renal function at ICU admission and served as a strong predictor in our model. The SHAP analysis indicated a distinct increase in AKI risk when InitialCr exceeded 150 μmol/L, providing a practical threshold that may aid clinicians in risk stratification when prior renal function data are unavailable. SOFA score, a well-established measure of multi-organ dysfunction, was also significantly associated with AKI risk. SHAP-based local interpretation identified a SOFA score >10 as a critical risk threshold, beyond which the predicted probability of AKI rose markedly. Elevated blood glucose, particularly levels >10 mmol/L and especially >20 mmol/L, was associated with higher SHAP values and increased AKI risk. This observation aligns with prior evidence suggesting that intensive glycemic control can reduce AKI incidence in critically ill patients (31, 32). HR > 100 beats per minute emerged as another relevant predictor, consistent with findings from earlier predictive models (12, 33–35). Elevated HR may reflect underlying physiological stress such as infection, hypovolemia, or cardiac dysfunction, each of which can predispose patients to AKI. Hypercapnia (PaCO₂ > 50 mmHg) was associated with elevated AKI risk, likely mediated through reduced renal blood flow, as suggested in previous studies (36–38). Similarly, lactate levels >10 mmol/L were strongly predictive, underscoring the role of tissue hypoperfusion and acidosis in AKI pathogenesis, a context in which sodium bicarbonate is frequently used (39, 40). Aspartate aminotransferase (AST), a marker of hepatic and muscular injury, also contributed to model predictions. Its inclusion aligns with prior studies where AST served as a predictor in trauma- and burn-associated AKI (41, 42), suggesting that systemic injury and inflammation may further exacerbate renal vulnerability after cardiac arrest.

Collectively, these predictors highlight the multifactorial nature of CA-AKI, incorporating elements of baseline renal reserve, systemic organ dysfunction, metabolic stress, and perfusion-related injury. They also suggest several practical implications. Patients with an initial creatinine above 150 μmol/L may warrant enhanced monitoring and renal protection strategies even if they do not yet meet acute kidney injury (AKI) criteria. A rapid increase in SOFA score or a score >10 should be regarded as a warning signal, prompting prioritized assessment of hemodynamics and optimization of systemic supportive therapy. Concurrently, intensified glycemic control (target <10 mmol/L), management of tachycardia (HR > 100 bpm), investigation of reversible causes, and optimization of volume status and cardiac output are feasible interventions for AKI prevention. In mechanically ventilated patients, particularly those at risk for AKI, prolonged permissive hypercapnia should be avoided by adjusting respiratory parameters. Persistent lactate elevation indicates the need for aggressive restoration of tissue perfusion, not merely correction of acidosis; sodium bicarbonate use requires careful evaluation. Significant AST elevation should be interpreted as a marker of multiorgan injury, not solely as an indicator of liver function.

This study has several limitations. First, differences in variable availability between the local datasets and MIMIC-IV necessitated the exclusion of some clinically relevant factors (such as cardiac arrest location and resuscitation timing), which may affect model completeness. Second, missing data and potential subjectivity in clinical assessments (e.g., GCS scores) could introduce inaccuracies. Third, although the model demonstrated good generalizability in internal and external validation, further prospective validation in independent cohorts is needed to confirm its clinical utility and strengthen the evidence for routine application.

6 Conclusion

In this study, we constructed an LR model based on 10 clinical variables to predict the early onset of CA-AKI. The model demonstrated good discriminatory power and generalization ability in both internal and external validation, supporting its feasibility for predicting CA-AKI. We used SHAP to elucidate the contribution of each variable, enabling clinicians to gain deeper insight into the predictions and into how relevant clinical data relate to the occurrence of CA-AKI.

Data availability statement

Publicly available datasets were analyzed in this study. These data were obtained from the MIMIC database. The first author, Wenbo Xu, has completed the mandatory training courses and obtained authorization to extract data from the database (certification number: 62628171). Due to ongoing research requirements, the local data are not currently available for release. Upon completion of subsequent studies, interested parties may contact the first author if data are required.

Author contributions

WX: Conceptualization, Data curation, Methodology, Resources, Validation, Visualization, Writing – original draft. SC: Conceptualization, Resources, Validation, Writing – original draft. CW: Formal analysis, Resources, Software, Visualization, Writing – original draft. CL: Data curation, Formal analysis, Resources, Software, Writing – original draft. TN: Data curation, Writing – original draft. PN: Data curation, Methodology, Software, Writing – review & editing. GZ: Data curation, Investigation, Validation, Writing – original draft. MD: Investigation, Project administration, Resources, Supervision, Writing – review & editing. WH: Funding acquisition, Project administration, Resources, Supervision, Writing – review & editing.

Funding

The author(s) declared that financial support was received for this work and/or its publication. This work was supported by the Zhejiang Provincial Medical and Health Technology Project (Grant: 2025KY1091), the Zhejiang Provincial Medical and Health Technology Project (Grant: WKJ-ZJ-2315), and the Construction Fund of Key Medical Disciplines of Hangzhou (Grant: 2025HZZD04).

Acknowledgments

We would like to express our gratitude to Zhejiang Chinese Medical University and Hangzhou First People’s Hospital for their support of this study. We extend our gratitude to all individuals, organizations, and hospitals that contributed to this research.

Conflict of interest

The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declared that Generative AI was not used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmed.2025.1717973/full#supplementary-material

References

1. Andersen, LW, Holmberg, MJ, Berg, KM, Donnino, MW, and Granfeldt, A. In-hospital cardiac arrest: a review. JAMA J Am Med Assoc. (2019) 321:1200.

2. Sandroni, C, Dell'Anna, AM, Tujjar, O, Geri, G, Cariou, A, and Taccone, FS. Acute kidney injury after cardiac arrest: a systematic review and meta-analysis of clinical studies. Minerva Anestesiol. (2016) 82:989.

3. Geri, G, Guillemet, L, Dumas, F, Charpentier, J, Antona, M, Lemiale, V, et al. Acute kidney injury after out-of-hospital cardiac arrest: risk factors and prognosis in a large cohort. Intensive Care Med. (2015) 41:1273–80. doi: 10.1007/s00134-015-3848-4

4. Nolan, JP, Neumar, RW, Adrie, C, Aibiki, M, Berg, RA, Böttiger, BW, et al. Post-cardiac arrest syndrome: epidemiology, pathophysiology, treatment, and prognostication. A scientific statement from the international liaison committee on resuscitation; the American Heart Association emergency cardiovascular care committee; the council on cardiovascular surgery and anesthesia; the council on cardiopulmonary, perioperative, and critical care; the council on clinical cardiology; the council on stroke. Resuscitation. (2008) 79:350–79. doi: 10.1016/j.resuscitation.2008.09.017

5. Neumar, RW, Nolan, JP, Adrie, C, Aibiki, M, Berg, RA, Böttiger, BW, et al. Post-cardiac arrest syndrome: epidemiology, pathophysiology, treatment, and prognostication. A consensus statement from the international liaison committee on resuscitation (American Heart Association, Australian and New Zealand council on resuscitation, European resuscitation council, Heart and Stroke Foundation of Canada, InterAmerican Heart Foundation, resuscitation Council of Asia, and the resuscitation Council of Southern Africa); the American Heart Association emergency cardiovascular care committee; the council on cardiovascular surgery and anesthesia; the council on cardiopulmonary, perioperative, and critical care; the council on clinical cardiology; and the stroke council. Circulation. (2008) 118:2452–83.

6. Martensson, J, and Bellomo, R. Acute kidney injury after cardiac arrest: an unappreciated complication. Minerva Anestesiol. (2016) 82:929–31.

7. Prasitlumkum, N, Cheungpasitporn, W, Sato, R, Chokesuwattanaskul, R, Thongprayoon, C, Patlolla, SH, et al. Acute kidney injury and cardiac arrest in the modern era: an updated systematic review and meta-analysis. Hosp Pract (1995). (2021) 49:280–91. doi: 10.1080/21548331.2021.1931234

8. Storm, C, Krannich, A, Schachtner, T, Engels, M, Schindler, R, Kahl, A, et al. Impact of acute kidney injury on neurological outcome and long-term survival after cardiac arrest - a 10 year observational follow up. J Crit Care. (2018) 47:254–9. doi: 10.1016/j.jcrc.2018.07.023

9. Mattana, J, and Singhal, PC. Prevalence and determinants of acute renal failure following cardiopulmonary resuscitation. JAMA Netw. (1993) 153:235–9.

10. Bellomo, R, and See, EJ. Novel renal biomarkers of acute kidney injury and their implications. Intern Med J. (2021) 51:316–8. doi: 10.1111/imj.15229

11. Nevin, L. Advancing the beneficial use of machine learning in health care and medicine: toward a community understanding. PLoS Med. (2018) 15:e1002708. doi: 10.1371/journal.pmed.1002708

12. Lin, L, Chen, L, Jiang, Y, Gao, R, Wu, Z, Lv, W, et al. Construction and validation of a risk prediction model for acute kidney injury in patients after cardiac arrest. Ren Fail. (2023) 45:2285865. doi: 10.1080/0886022X.2023.2285865

13. Collins, GS, de Groot, JA, Dutton, S, Omar, O, Shanyinde, M, Tajar, A, et al. External validation of multivariable prediction models: a systematic review of methodological conduct and reporting. BMC Med Res Methodol. (2014) 14:40. doi: 10.1186/1471-2288-14-40

14. Dong, J, Feng, T, Thapa-Chhetry, B, Cho, BG, Shum, T, Inwald, DP, et al. Machine learning model for early prediction of acute kidney injury (AKI) in pediatric critical care. Crit Care. (2021) 25:288. doi: 10.1186/s13054-021-03724-0

15. Zhang, Y, Xu, D, Gao, J, Wang, R, Yan, K, Liang, H, et al. Development and validation of a real-time prediction model for acute kidney injury in hospitalized patients. Nat Commun. (2025) 16:68. doi: 10.1038/s41467-025-67402-3

16. Xia, X, Liu, R, and Jiang, X. Integration of mitochondrial gene expression and immune landscape in acute kidney injury prediction. Ren Fail. (2025) 47:2502608. doi: 10.1080/0886022X.2025.2502608

17. Fan, Z, Jiang, J, Xiao, C, Chen, Y, Xia, Q, Wang, J, et al. Construction and validation of prognostic models in critically ill patients with sepsis-associated acute kidney injury: interpretable machine learning approach. J Transl Med. (2023) 21:406. doi: 10.1186/s12967-023-04205-4

18. Markovic, F, Jovanovic, L, Spalevic, P, Kaljevic, J, Zivkovic, M, Simic, V, et al. Parkinsons detection from gait time series classification using modified metaheuristic optimized long short term memory. Neural Process Lett. (2025) 57:14. doi: 10.1007/s11063-025-11735-z

19. Bacanin, N, Stoean, C, Markovic, D, Zivkovic, M, Rashid, TA, Chhabra, A, et al. Improving performance of extreme learning machine for classification challenges by modified firefly algorithm and validation on medical benchmark datasets. Multimed Tools Appl. (2024) 83:76035–75. doi: 10.1007/s11042-024-18295-9

20. Zivkovic, M, Bacanin, N, Antonijevic, M, Nikolic, B, Kvascev, G, Marjanovic, M, et al. Hybrid CNN and XGBoost model tuned by modified arithmetic optimization algorithm for COVID-19 early diagnostics from X-ray images. Electronics-Switz. (2022) 11:3798.

21. Jovanovic, L, Bacanin, N, Zivkovic, M, Antonijevic, M, Petrovic, A, and Zivkovic, T. “Anomaly detection in ECG using recurrent networks optimized by modified metaheuristic algorithm,” 2023 31st Telecommunications Forum (TELFOR), Belgrade, Serbia, (2023) 1–4. doi: 10.1109/TELFOR59449.2023.10372802

22. Boateng, EY, and Abaye, DA. A review of the logistic regression model with emphasis on medical research. J Data Anal Inf Process. (2019) 7:190–207. doi: 10.4236/jdaip.2019.74012

23. Song, X, Liu, X, Liu, F, and Wang, C. Comparison of machine learning and logistic regression models in predicting acute kidney injury: a systematic review and meta-analysis. Int J Med Inform. (2021) 151:104484. doi: 10.1016/j.ijmedinf.2021.104484

24. Hu, C, Tan, Q, Zhang, Q, Li, Y, Wang, F, Zou, X, et al. Application of interpretable machine learning for early prediction of prognosis in acute kidney injury. Comput Struct Biotechnol J. (2022) 20:2861–70. doi: 10.1016/j.csbj.2022.06.003

25. Ali, S, Akhlaq, F, Imran, AS, Kastrati, Z, Daudpota, SM, and Moosa, M. The enlightening role of explainable artificial intelligence in medical & healthcare domains: a systematic literature review. Comput Biol Med. (2023) 166:107555. doi: 10.1016/j.compbiomed.2023.107555

26. Ishimoto, Y, Tanaka, T, Yoshida, Y, and Inagi, R. Physiological and pathophysiological role of reactive oxygen species and reactive nitrogen species in the kidney. Clin Exp Pharmacol Physiol. (2018) 45:1097–105. doi: 10.1111/1440-1681.13018

27. Tsivilika, M, Kavvadas, D, Karachrysafi, S, Kotzampassi, K, Grosomanidis, V, Doumaki, E, et al. Renal injuries after cardiac arrest: a morphological ultrastructural study. Int J Mol Sci. (2022) 23:6147. doi: 10.3390/ijms23116147

28. Mah, KE, Alten, JA, Cornell, TT, Selewski, DT, Askenazi, D, Fitzgerald, JC, et al. Acute kidney injury after in-hospital cardiac arrest. Resuscitation. (2021) 160:49–58. doi: 10.1016/j.resuscitation.2020.12.023

29. Jeppesen, KK, Rasmussen, SB, Kjaergaard, J, Schmidt, H, Mølstrøm, S, Beske, RP, et al. Acute kidney injury after out-of-hospital cardiac arrest. Crit Care. (2024) 28:169. doi: 10.1186/s13054-024-04936-w

30. Hou, S, Zhang, L, Ji, H, Zhao, T, Hu, M, Jiang, Y, et al. Development and evaluation of the model for acute kidney injury in patients with cardiac arrest after successful resuscitation. BMC Cardiovasc Disord. (2024) 24:440. doi: 10.1186/s12872-024-04110-8

31. Mendez, CE, Der Mesropian, PJ, Mathew, RO, and Slawski, B. Hyperglycemia and acute kidney injury during the perioperative period. Curr Diabetes Rep. (2016) 16:10. doi: 10.1007/s11892-015-0701-7

32. Van Den Berghe, G, Wilmer, A, and Hermans, G. Intensive insulin therapy in the medical ICU. N Engl J Med. (2006) 354:449–61. doi: 10.1056/NEJMoa052521

33. Grand, J, Bro-Jeppesen, J, Hassager, C, Rundgren, M, Winther-Jensen, M, Thomsen, JH, et al. Cardiac output during targeted temperature management and renal function after out-of-hospital cardiac arrest. J Crit Care. (2019) 54:65–73. doi: 10.1016/j.jcrc.2019.07.013

34. Li, M, Han, S, Liang, F, Hu, C, Zhang, B, Hou, Q, et al. Machine learning for predicting risk and prognosis of acute kidney disease in critically ill elderly patients during hospitalization: internet-based and interpretable model study. J Med Internet Res. (2024) 26:e51354. doi: 10.2196/51354

35. Zhou, H, Liu, L, Zhao, Q, Jin, X, Peng, Z, Wang, W, et al. Machine learning for the prediction of all-cause mortality in patients with sepsis-associated acute kidney injury during hospitalization. Front Immunol. (2023) 14:1140755. doi: 10.3389/fimmu.2023.1140755

36. Sharkey, RA, Mulloy, EM, and O'Neill, SJ. Acute effects of hypoxaemia, hyperoxaemia and hypercapnia on renal blood flow in normal and renal transplant subjects. Eur Respir J. (1998) 12:653–7. doi: 10.1183/09031936.98.12030653

37. Eastwood, GM, Bailey, M, Nichol, AD, Parke, R, Nielsen, N, Dankiewicz, J, et al. Impact of mild hypercapnia on renal function after out-of-hospital cardiac arrest. Resuscitation. (2025) 207:110480. doi: 10.1016/j.resuscitation.2024.110480

38. Chapman, CL, Schlader, ZJ, Reed, EL, Worley, ML, and Johnson, BD. Renal and segmental artery hemodynamic response to acute, mild hypercapnia. Am J Physiol Regul Integr Comp Physiol. (2020) 318:R822–7. doi: 10.1152/ajpregu.00035.2020

39. An, S, Yao, Y, Hu, H, Wu, J, Li, J, Li, L, et al. PDHA1 hyperacetylation-mediated lactate overproduction promotes sepsis-induced acute kidney injury via Fis1 lactylation. Cell Death Dis. (2023) 14:457. doi: 10.1038/s41419-023-05952-4

40. Demirjian, S, Bashour, CA, Shaw, A, Schold, JD, Simon, J, Anthony, D, et al. Predictive accuracy of a perioperative laboratory test-based prediction model for moderate to severe acute kidney injury after cardiac surgery. JAMA J Am Med Assoc. (2022) 327:956.

41. Omrani, H, Najafi, I, Bahrami, K, Najafi, F, and Safari, S. Acute kidney injury following traumatic rhabdomyolysis in Kermanshah earthquake victims; a cross-sectional study. Am J Emerg Med. (2021) 40:127–32. doi: 10.1016/j.ajem.2020.01.043

42. Kym, D, Cho, Y, Yoon, J, Yim, H, and Yang, H. Evaluation of diagnostic biomarkers for acute kidney injury in major burn patients. Ann Surg Treat Res. (2015) 88:281–8. doi: 10.4174/astr.2015.88.5.281

Keywords: acute kidney injury, cardiac arrest, machine learning, MIMIC-IV database, predictive modeling

Citation: Xu W, Cheng S, Wu C, Li C, Ni T, Ni P, Zhang G, Diao M and Hu W (2026) A machine learning model for prediction of cardiac arrest-associated acute kidney injury in the ICU: an internal and external validation study. Front. Med. 12:1717973. doi: 10.3389/fmed.2025.1717973

Received: 03 October 2025; Revised: 15 December 2025; Accepted: 22 December 2025;
Published: 20 January 2026.

Edited by:

Xianghong Yang, Zhejiang Provincial People's Hospital, China

Reviewed by:

Miodrag Zivkovic, Singidunum University, Serbia
Harikrishna Choudary Ponnam, Summa Health System, United States

Copyright © 2026 Xu, Cheng, Wu, Li, Ni, Ni, Zhang, Diao and Hu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Wei Hu, huwei@hospital.westlake.edu.cn; Mengyuan Diao, diaomengyuan@hospital.westlake.edu.cn

^†These authors have contributed equally to this work
