Stage prediction of acute kidney injury in sepsis patients using explainable machine learning approaches

Quan, Zhen; Han, Zheng; Zeng, Siyao; Wen, Lianghe; Wang, Jingkai; Li, Yue; Wang, Hongliang

doi:10.3389/fmed.2025.1667488

ORIGINAL RESEARCH article

Front. Med., 15 October 2025

Sec. Intensive Care Medicine and Anesthesiology

Volume 12 - 2025 | https://doi.org/10.3389/fmed.2025.1667488


Zhen Quan¹

Zheng Han¹

Siyao Zeng¹

Lianghe Wen²

Jingkai Wang¹

Yue Li²

Hongliang Wang²^*

¹The Second Clinical Medical College of Harbin Medical University, Harbin, Heilongjiang Province, China
²Department of Critical Care Medicine, The Second Affiliated Hospital of Harbin Medical University, Harbin, Heilongjiang Province, China

Background: Acute kidney injury (AKI) is a prevalent and serious complication among sepsis patients, closely associated with high mortality rates and substantial disease burden. Early prediction of AKI is vital for prompt and effective intervention and improved prognosis. This research seeks to construct and assess forecasting frameworks that leverage advanced machine learning algorithms to anticipate AKI progression in high-risk sepsis patients.

Methods: This study utilized the MIMIC-IV database, a large, publicly available critical care dataset containing comprehensive, de-identified electronic health records of over 70,000 ICU admissions at Beth Israel Deaconess Medical Center, to extract sepsis patient data for model training and testing. Following feature selection, various machine learning algorithms were employed, including Decision Tree (DT), Efficient Neural Network (ENet), k-Nearest Neighbor (KNN), Light Gradient Boosting Machine (LightGBM), Multi-Layer Perceptron (MLP), Multinomial Mixture Model (Multinom), Random Forest (RF), and eXtreme Gradient Boosting (XGBoost). A five-fold cross-validation strategy was implemented to minimize bias and assess model performance. SHapley Additive exPlanations (SHAP) was used to interpret the results.

Results: A total of 6,866 critically ill sepsis patients were analyzed, of whom 5,896 developed AKI during hospitalization. The RF model demonstrated superior performance, attaining an average AUC score of 0.89 on the ROC curve. SHAP analysis provided detailed insights into feature importance, including urine output, BMI, SOFA score, and maximum blood urea nitrogen, enhancing the clinical applicability of the model.

Conclusion: The machine learning models developed in this study effectively predicted the stages of AKI in severely ill sepsis patients, with the Random Forest model demonstrating optimal performance. SHAP analysis offered crucial insights into the risk factors, facilitating timely and personalized interventions within a clinical setting. Additional multi-center research is essential to confirm the validity of these findings and to ultimately improve patient outcomes and quality of life.

Introduction

Sepsis-associated acute kidney injury (SA-AKI) frequently occurs as a complication among severely ill patients (1–3). Patients with SA-AKI face substantially higher mortality rates than those without AKI or those whose AKI stems from other causes (3, 4). Although numerous therapeutic strategies have been explored, no effective clinical treatment is currently available, making early identification crucial for successful intervention (5, 6). In 2023, the 28th Acute Disease Quality Initiative (ADQI) workgroup similarly emphasized the urgent need for early identification of sepsis patients at risk of developing AKI or progressing to severe and/or persistent AKI, which is critical for timely initiation of supportive interventions, including hemodynamic optimization, fluid management, avoidance of nephrotoxic drugs, and renal replacement therapy when indicated (7). In recent years, with the advancement of machine learning (ML) models, large amounts of clinical data have been efficiently utilized, leading to numerous studies on early prediction of SA-AKI and demonstrating high diagnostic performance in related applications such as cancer and sepsis prediction (8–14). Current research mainly focuses on binary classification to predict whether AKI occurs, which presents an apparent limitation: a binary label does not stratify patients finely enough to guide clinical diagnosis and treatment, nor can it differentiate the severity of AKI across individuals. Therefore, developing an ML model capable of multi-class prediction of Kidney Disease: Improving Global Outcomes (KDIGO) stages in SA-AKI is crucial for better management of these patients, as the KDIGO classification provides internationally recognized criteria for defining and staging acute kidney injury based on serum creatinine levels and urine output.

By applying the SHapley Additive exPlanations (SHAP) method, the opaque nature typical of ML models has been partially reduced. SHAP is a widely used technique in machine learning for unraveling the intricate relationships between features and predictive results. It provides personalized insights by explaining the role of specific features in shaping model predictions, which helps clinicians understand how the importance of features changes across different severities of the disease and provides more specific targets for early individualized intervention (15).

Therefore, this study aims to develop machine learning models for AKI in sepsis patients and to identify key risk factors that may enable personalized clinical intervention. It has two specific objectives: first, to develop an ML model that best predicts the stages of SA-AKI in sepsis patients; second, to employ SHAP to interpret the model, visualize the risk factors, and explain the outcomes.

Methods

Data source

This study retrospectively analyzed data from the MIMIC-IV database (version 2.2), encompassing records from over 70,000 ICU admissions collected between 2008 and 2019 at Boston’s Beth Israel Deaconess Medical Center (16). The MIMIC-IV database is a large, publicly available critical care dataset that is continuously updated and widely used for clinical and machine learning research. It provides comprehensive and high-resolution clinical information, including patient demographics, vital signs, laboratory results, medications, procedures, and diagnostic codes from the International Classification of Diseases, Ninth and Tenth Revisions (ICD-9 and ICD-10). All data are fully de-identified in compliance with the Health Insurance Portability and Accountability Act (HIPAA), and therefore informed consent was not required.

Participants

Patients with sepsis were identified in the MIMIC-IV database according to the Sepsis-3 criteria, which define sepsis as life-threatening organ dysfunction caused by a dysregulated host response to infection. Organ dysfunction was assessed using the Sequential Organ Failure Assessment (SOFA) score, with a ≥2-point increase from baseline indicating clinically significant organ dysfunction. Patients with non-first admissions, non-first ICU stays, age <18 or >85 years, and ICU stay less than 48 h were excluded. The data were then matched with the highest KDIGO stage during the ICU stay (0, 1, 2, 3), and patients were categorized into four groups: sepsis without AKI, sepsis with AKI stage 1, 2, and 3. The KDIGO stage served as the outcome for the prediction model. The dataset was randomly split into a training set (70%) and a hold-out test set (30%) to evaluate the final model performance. No separate internal validation set was created because hyperparameter tuning and cross-validation were conducted within the training set only. The screening process is shown in Figure 1.
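The exclusion criteria and the 70/30 random split described above can be illustrated with a minimal sketch. It is not the authors' extraction code: the record fields (`first_admission`, `first_icu_stay`, `age`, `icu_hours`, `kdigo_max`), the seed, and the toy records are all hypothetical and chosen only to show the logic.

```python
import random

def eligible(stay):
    """Apply the stated exclusion criteria to one ICU-stay record."""
    return (
        stay["first_admission"]          # exclude non-first admissions
        and stay["first_icu_stay"]       # exclude non-first ICU stays
        and 18 <= stay["age"] <= 85      # exclude age <18 or >85 years
        and stay["icu_hours"] >= 48      # exclude ICU stays shorter than 48 h
    )

def split_cohort(stays, train_fraction=0.7, seed=42):
    """Randomly split eligible stays into a training set and a hold-out test set."""
    cohort = [s for s in stays if eligible(s)]
    rng = random.Random(seed)            # fixed seed (an assumption) for reproducibility
    rng.shuffle(cohort)
    cut = round(len(cohort) * train_fraction)
    return cohort[:cut], cohort[cut:]

# Toy records (entirely made up) to show the call pattern.
stays = [
    {"id": i, "first_admission": True, "first_icu_stay": True,
     "age": 40 + i, "icu_hours": 72, "kdigo_max": i % 4}
    for i in range(10)
]
stays.append({"id": 99, "first_admission": True, "first_icu_stay": True,
              "age": 90, "icu_hours": 72, "kdigo_max": 1})   # excluded: age > 85
stays.append({"id": 100, "first_admission": True, "first_icu_stay": True,
              "age": 50, "icu_hours": 24, "kdigo_max": 2})   # excluded: stay < 48 h

train, test = split_cohort(stays)
```

In this sketch the outcome label is the `kdigo_max` field, standing in for the highest KDIGO stage (0 to 3) recorded during the ICU stay.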

Figure 1

Flowchart depicting the selection of sepsis patients from the MIMIC-IV database. Initially, 32,971 sepsis records are noted. Exclusion criteria are applied, narrowing the cohort to 9,806 sepsis patients. These patients are categorized into four groups: sepsis without acute kidney injury (AKI) (1,373 cases), sepsis with AKI Stage 1 (1,586 cases), AKI Stage 2 (4,055 cases), and AKI Stage 3 (2,792 cases).

Figure 1. Data screening was conducted according to the Sepsis 3.0 criteria and AKI-KDIGO staging definitions, with subsequent inclusion and exclusion criteria applied.

Data extraction

Data extraction was performed using Navicat Premium software (version 12.0.11) and Structured Query Language (SQL). The extracted information included demographics (e.g., gender, age), comorbidities (e.g., diabetes, hypertension, pneumonia, hepatitis, heart failure), vital signs within 24 h of ICU admission (e.g., minimum/maximum systolic and diastolic blood pressure, respiratory rate, temperature, heart rate, SpO2), and laboratory indicators within 24 h of ICU admission (e.g., minimum/maximum hemoglobin, platelets, white blood cell count, anion gap, bicarbonate, blood urea nitrogen, chloride, creatinine, glucose, sodium, potassium, international normalized ratio, prothrombin time, partial thromboplastin time, SOFA score, urine output).

BMI values that were clearly implausible, resulting from data entry errors or inconsistent height and weight units, were excluded from the analysis. To reduce data bias, populations with missing values exceeding 10% were excluded, while missing values below 10% were imputed using the KNN method.
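As an illustration of KNN-based imputation, the sketch below fills each missing entry with the mean of that variable among the k most similar records. It is a simplified stand-in rather than the procedure actually used in the study: the distance definition (root mean squared difference over the variables both records have observed), the value of k, and the toy data are assumptions.

```python
import math

def _distance(a, b):
    """Root mean squared difference over the columns observed in both rows."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

def knn_impute(rows, k=5):
    """Return a copy of rows with None entries replaced by a k-nearest-neighbour mean."""
    completed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows in which column j is observed.
            donors = [(_distance(row, other), other[j])
                      for m, other in enumerate(rows)
                      if m != i and other[j] is not None]
            donors.sort(key=lambda d: d[0])
            nearest = [v for _, v in donors[:k]]
            if nearest:                      # leave the gap if no donor exists
                completed[i][j] = sum(nearest) / len(nearest)
    return completed

# Toy example: the last record is missing its second variable.
rows = [[1.0, 10.0], [1.1, 12.0], [5.0, 50.0], [1.05, None]]
filled = knn_impute(rows, k=2)
```

In practice, variables are usually scaled before distances are computed so that no single laboratory value dominates the neighbour search.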

Model development and evaluation

The dataset was imbalanced, which could affect model training and performance. Compared to binary classification tasks, multi-class imbalance problems are more complex and require more attention (17). Because the distribution of AKI stages was highly imbalanced (class 0: 1,292; class 1: 1,395; class 2: 3,587; class 3: 2,573), we applied the Synthetic Minority Oversampling Technique (SMOTE), introduced by Chawla et al. (18), to the training set to balance class sizes to approximately 4,055 samples per class before model training. A five-fold cross-validation strategy was performed within the training set to optimize hyperparameters and prevent overfitting. The final model was trained on the training set and evaluated on the independent test set, with performance assessed by average ROC–AUC and calibration metrics.
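The core idea of SMOTE is to create synthetic minority-class samples by interpolating between an existing sample and one of its nearest neighbours from the same class. The sketch below shows that interpolation step only; the function name, k, seed, and toy points are hypothetical, and a production analysis would normally rely on an established implementation.

```python
import random

def smote(minority, n_new, k=5, seed=42):
    """Generate n_new synthetic points from a list of minority-class feature vectors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen sample (squared Euclidean distance).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()               # position along the segment base -> mate
        synthetic.append([a + gap * (b - a) for a, b in zip(base, mate)])
    return synthetic

# Toy minority class: four points on the unit square.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority, n_new=6, k=2)
```

For a multi-class outcome such as the four KDIGO stages, the same step would be repeated for every class smaller than the target size. Oversampling is restricted to the training data so that the hold-out test set contains only real patients.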

Machine learning models

The dataset was input into eight machine learning algorithms: Decision Tree (DT), Efficient Neural Network (ENet), K-Nearest Neighbor (KNN), Light Gradient Boosting Machine (LightGBM), Multi-Layer Perceptron (MLP), Multinomial Mixture Model (Multinom), Random Forest Model (RF), and Extreme Gradient Boosting (XGBoost).

Eight machine learning models were used to predict the stages of AKI. Model evaluation metrics included Accuracy, Balanced Accuracy (Bal Accuracy), Detection Prevalence, F Measure (F Meas), Jaccard index (J index), Kappa (Kap), Matthews Correlation Coefficient (MCC), Negative Predictive Value (NPV), Positive Predictive Value (PPV), Precision, Recall, Area Under the Curve (AUC), Sensitivity (Sens), and Specificity (Spec).
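Several of the listed metrics can be derived directly from a multi-class confusion matrix. The following sketch computes accuracy, balanced accuracy, Cohen's kappa, and the multi-class Matthews correlation coefficient; the confusion matrix is invented for illustration and does not reproduce any result of this study.

```python
import math

def multiclass_metrics(cm):
    """cm[i][j] = number of patients with true class i predicted as class j."""
    n_classes = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n_classes))
    true_counts = [sum(cm[i]) for i in range(n_classes)]                      # row sums
    pred_counts = [sum(cm[i][j] for i in range(n_classes))
                   for j in range(n_classes)]                                 # column sums

    accuracy = correct / total
    # Balanced accuracy: unweighted mean of per-class recall (sensitivity).
    recalls = [cm[i][i] / true_counts[i] for i in range(n_classes)]
    balanced_accuracy = sum(recalls) / n_classes

    # Cohen's kappa: agreement beyond what the marginal totals give by chance.
    cross = sum(t * p for t, p in zip(true_counts, pred_counts))
    expected = cross / total ** 2
    kappa = (accuracy - expected) / (1 - expected)

    # Multi-class Matthews correlation coefficient.
    numerator = correct * total - cross
    denominator = math.sqrt((total ** 2 - sum(t * t for t in true_counts))
                            * (total ** 2 - sum(p * p for p in pred_counts)))
    mcc = numerator / denominator

    return {"accuracy": accuracy, "bal_accuracy": balanced_accuracy,
            "kap": kappa, "mcc": mcc}

# Hypothetical 4-class confusion matrix (rows: true stage 0-3, columns: predicted).
cm = [[8, 1, 1, 0],
      [2, 6, 2, 0],
      [1, 2, 14, 3],
      [0, 0, 2, 8]]
metrics = multiclass_metrics(cm)
```

Balanced accuracy and MCC are less sensitive to class imbalance than plain accuracy, which is why they are informative for a cohort in which stage 2 is far more common than the other classes.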

For all machine learning models, key hyperparameters were tuned using grid search within reasonable ranges based on previous studies (19) and recent evidence of its effectiveness in clinical prediction contexts. Parameters were selected by 5-fold cross-validation within the training set. Final parameters were as follows:

Decision Tree: max_depth = 5

Efficient Neural Network: alpha = 1.0, l1_ratio = 0.5, max_iter = 1,000, random_state = 42

K-Nearest Neighbor: n_neighbors = 5

Light Gradient Boosting Machine: n_estimators = 200, num_leaves = 31, learning_rate = 0.1

Multi-Layer Perceptron: hidden_layer_sizes = (100, 50), activation = ‘relu’, solver = ‘adam’, learning_rate_init = 0.001

Multinomial Mixture Model: penalty = ‘l2’, C = 1.0, solver = ‘lbfgs’, multi_class = ‘multinomial’, max_iter = 1,000, random_state = 42

Random Forest Model: n_estimators = 100, random_state = 42, n_jobs = −1

Extreme Gradient Boosting: n_estimators = 200, max_depth = 6, learning_rate = 0.1
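The tuning procedure described above, a grid search scored by 5-fold cross-validation inside the training set, can be sketched generically as follows. The candidate values in `param_grid` and the scoring function are placeholders (the searched ranges are not reported here), so this shows only the control flow, not the authors' actual search.

```python
import itertools
import random

def k_fold_indices(n_samples, k=5, seed=42):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, val

def grid_search(param_grid, score_fn, n_samples, k=5):
    """Return the parameter combination with the best mean validation score."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = [score_fn(params, train_idx, val_idx)
                  for train_idx, val_idx in k_fold_indices(n_samples, k)]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score

# Placeholder scorer: a real one would fit the model on train_idx and return,
# for example, the macro-AUC on val_idx. This stand-in simply peaks at depth 5.
def toy_score(params, train_idx, val_idx):
    return -abs(params["max_depth"] - 5)

param_grid = {"max_depth": [3, 5, 7]}          # hypothetical candidate values
best_params, best_score = grid_search(param_grid, toy_score, n_samples=23)
```

Because every candidate is scored only on validation folds drawn from the training data, the hold-out test set plays no part in choosing the hyperparameters.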

Results

Baseline characteristics and feature selection

Following screening and data imputation, the training set comprised a total of 6,866 patients, including 970 patients with sepsis without AKI (14.12%), 1,102 patients with SA-AKI stage 1 (16.05%), 2,839 patients with SA-AKI stage 2 (41.35%), and 1,955 patients with SA-AKI stage 3 (28.47%). Differences in characteristics among the groups are shown in Table 1. Initially, univariable analysis was conducted on these features, and those with statistical significance were subsequently included in multivariable analysis. Features with statistical significance in both univariable and multivariable analyses were adopted for model training.

Table 1

Table 1. AKI stage 0: sepsis without AKI; hepatitis, diabetes, hypertension, pneumonia, heart failure: 0 indicates absence of the comorbidity, 1 indicates presence of the comorbidity.

Model performance

The RF model demonstrated the highest performance, achieving an average macro-AUC of 0.888 across all AKI stages during five-fold cross-validation (Supplementary Figure 1). In the independent test set, the ROC–AUC values for each class were: sepsis without AKI, 0.934; SA-AKI stage 1, 0.903; SA-AKI stage 2, 0.784; and SA-AKI stage 3, 0.925 (Figure 2). The ROC–AUC values of the other models were as follows: Multinomial Mixture Model (Multinom), 0.760; Efficient Neural Network (ENet), 0.759; Decision Tree (DT), 0.710; XGBoost, 0.804; Multi-Layer Perceptron (MLP), 0.782; LightGBM, 0.750; and k-Nearest Neighbor (KNN), 0.833 (Table 2, Figure 3). Comparison between training and test sets showed similar AUC distributions (Supplementary Figure 2, Table 1), supporting the stability of the RF model. These results indicate that the RF model outperformed the other algorithms and discriminated well across AKI stages, although discrimination was weaker for stage 2 (0.784) than for the other three classes (all above 0.90).
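For a four-class outcome, a per-class ROC–AUC is commonly obtained one-vs-rest, and the macro-AUC is the unweighted mean of the per-class values. The sketch below assumes that convention (the averaging scheme is not spelled out in the text) and uses a rank-based AUC on toy predictions; only the four per-class test-set values in `reported` come from this article.

```python
def binary_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)  # ties count half
    return wins / (len(pos) * len(neg))

def macro_ovr_auc(y_true, proba, n_classes=4):
    """Per-class one-vs-rest AUCs and their unweighted (macro) mean."""
    per_class = []
    for c in range(n_classes):
        labels = [1 if y == c else 0 for y in y_true]
        scores = [row[c] for row in proba]
        per_class.append(binary_auc(labels, scores))
    return per_class, sum(per_class) / n_classes

# Toy predictions: two patients per class, each assigned 0.7 to the true class.
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
proba = [[0.7 if c == y else 0.1 for c in range(4)] for y in y_true]
per_class, macro = macro_ovr_auc(y_true, proba)

# Unweighted mean of the four per-class test-set AUCs reported above.
reported = [0.934, 0.903, 0.784, 0.925]
reported_mean = sum(reported) / len(reported)
```

The mean of the reported test-set values is about 0.887, close to the cross-validated macro-AUC of 0.888, although the two figures are computed on different data and are not the same quantity.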

Figure 2

Four ROC curve plots show the performance of various models in predicting sepsis with and without acute kidney injury (AKI). Each plot represents a different condition: sepsis without AKI, sepsis with AKI stage 1, stage 2, and stage 3. Models include Decision Tree (DT), Elastic Net (ENET), K-Nearest Neighbors (KNN), LightGBM, Multilayer Perceptron (MLP), Multinomial, Random Forest (RF), and XGBoost, with respective AUC values differing across the conditions. Sensitivity is plotted against one minus specificity.

Figure 2. ROC–AUC of eight machine learning models for the four AKI stages.

Table 2

Table 2. Evaluation metrics values for the performance of eight machine learning models.

Figure 3

Line graph comparing the performance of multiple machine learning models, including dt, enet, knn, lightgbm, mlp, multinom, rf, and xgboost, across various metrics like accuracy, precision, and recall. Models are differentiated by color.

Figure 3. Comparison of eight machine learning models based on a Line graph.

Interpretability analysis

Features were ranked by SHAP values in descending order, which helps analyze the occurrence of AKI and display the importance of different predictive variables across groups. Figure 4 shows the top eight important features, while Figure 5 presents the SHAP bee swarm plot for the four groups. Each patient’s feature is depicted as a dot, with colors reflecting feature values: red for higher values and blue for lower values. Urine output, BMI, SOFA score, and maximum blood urea nitrogen were the most important factors across groups. The importance of different features varied among groups; for example, the importance of SOFA score and minimum anion gap was positively correlated with AKI stage severity.

Figure 4

Bar chart showing the top eight feature importances for classes zero to three. Features include urine output, BMI, and age. Urine output has the highest importance for class three at 0.13. Color coding distinguishes classes.

Figure 4. Top eight important features ranked by SHAP values: The X-axis represents the importance of the features, while the Y-axis shows the different features; Class 0, 1, 2, and 3 represent sepsis without AKI, and sepsis with AKI stages 1, 2, and 3, respectively.

Figure 5

Four SHAP value plots compare feature impacts on model output for sepsis patients with different acute kidney injury (AKI) stages. Each plot includes features like urine output and body mass index, with color bars indicating high (red) to low (blue) feature values. The plots are labeled as

Figure 5. Bee swarm plot of the RF model: each point represents the data of a patient within the corresponding class. Red indicates relatively high values, and blue indicates relatively low values. The X-axis represents the magnitude of SHAP values, while the Y-axis shows features ranked by importance from top to bottom.

The SHAP force plot (Figure 6) helps understand local interpretability (i.e., individual patients) by showing how features contribute to the prediction for a particular patient. The force plot displays whether a feature promotes or inhibits the prediction outcome and shows its relative strength, providing explicit guidance for clinical diagnosis and treatment.
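The additive property that a force plot visualizes (base value plus the sum of feature contributions equals the prediction for that patient) can be demonstrated with exact Shapley values on a toy model. The sketch below enumerates all feature coalitions by brute force, which is feasible only for a handful of features; the three-feature model, the single reference point used as background, and all numbers are hypothetical and unrelated to the RF model in this study.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, background):
    """Exact Shapley values for instance x; absent features take background values."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else background[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

# Hypothetical risk model with one interaction term between features 0 and 2.
def toy_model(z):
    return 0.2 + 0.5 * z[0] - 0.1 * z[1] + 0.3 * z[0] * z[2]

x = [1.0, 1.0, 1.0]            # the patient being explained
background = [0.0, 0.0, 0.0]   # reference point defining the base value
phi = shapley_values(toy_model, x, background)
base_value = toy_model(background)
```

Positive entries of `phi` push the prediction above the base value and negative entries pull it below, which corresponds to the promoting and inhibiting effects shown in a force plot. The interaction term is shared equally between the two features involved.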

Figure 6

Three waterfall charts display SHAP values for sepsis patients with different AKI stages. The first chart shows

Figure 6. The force plot of the RF model visualizes the result for a randomly selected patient from the four groups. The base value represents the average predicted outcome, with feature values and names listed at the bottom of the plot. Features are sorted from the center outward based on their impact on the prediction.

The SHAP dependence plot (Supplementary Figures 3–5) displays the interaction effects between features, showing how two primary features influence each other.

Discussion

This study involved screening data from the MIMIC-IV database to examine multiple indicators of sepsis patients within the first 24 h of ICU admission and their associations with the occurrence and progression of AKI. Currently, dozens of studies on sepsis use this database to construct ML models, making our use of the same data reasonable and feasible (20–22).

We used univariable and multivariable analyses to identify 16 early clinical parameters for developing and validating the prediction model. The results showed that the RF model exhibited better discrimination and calibration capabilities than other ML algorithms. Compared with traditional logistic regression or simple scoring systems, our multi-class RF model integrates multiple routinely collected clinical indicators and provides more accurate and granular risk stratification, which enhances its potential for bedside application. To investigate how these features influence RF algorithm decisions, we used SHAP to interpret predictions. The SHAP bee swarm plots illustrated the importance of features across the different groups, while the dependence plots demonstrated feature relationships and their effect on model output. Additionally, SHAP force plots and waterfall plots illustrated, for individual patients, the relationship between features and the predicted AKI stage.

SA-AKI is a sepsis complication with high mortality. Although several novel biomarkers for detecting kidney injury and predicting AKI development, such as NGAL, KIM-1, cystatin C, and IL-18, have been discovered, they are still insufficiently sensitive for early detection, which makes research into early prediction of SA-AKI indispensable (5). Our approach leverages only standard clinical and laboratory data available within 24 h of ICU admission, avoiding the need for costly or time-consuming biomarker testing, and thus increasing its feasibility in routine critical care settings.

With the development of artificial intelligence, ML models have become increasingly important tools in medical research. ML automatically learns patterns and features from large datasets and generates prediction and decision models to make predictions for new data. Previous studies have predominantly used binary classification to predict whether SA-AKI occurs, but the severity classification of AKI is crucial for treatment and prognosis. Studies have shown that the higher the AKI stage, the greater the likelihood of requiring renal replacement therapy, and the higher the mortality rate (23). Therefore, this study adopted a multi-class classification approach to predict AKI KDIGO stages, which aligns better with treatment guideline variations for patients at different KDIGO stages and has greater clinical application potential. By simultaneously predicting all KDIGO stages, our model provides clinicians with a more refined tool for individualized risk assessment and early intervention planning.

An important concern in ML is the black-box issue: early studies lacked explanations of ML models, so how input variables affect model results, and to what extent, was often unknown. This is one of the major barriers to clinical application. This study used SHAP to visualize and interpret the multi-class results, allowing us to see how features impact each stage of the model. Another advantage of SHAP is comparing changes in feature importance across different AKI stages, which offers a deeper understanding of how feature importance changes with disease severity and helps guide targeted treatment. For example, for a patient predicted to have a high risk of stage 2–3 AKI, SHAP analysis identified low urine output, elevated BMI, high SOFA score, and increased maximum BUN as the top contributing factors. This explanation helps clinicians understand why the model predicts high risk and facilitates targeted interventions, such as closer monitoring of renal function or adjustment of fluid and medication management. This combination of multi-class modeling and interpretable AI not only improves predictive performance but also enhances clinical trust and facilitates translation into practice.

Recent studies indicate that a high BMI is associated with the early occurrence of SA-AKI and correlates with its severity, which was also confirmed by our model. Further, by introducing SHAP, the impact of high BMI on SA-AKI occurrence was quantified and visualized (24). Multivariate regression analyses have identified SOFA score as an independent risk factor for persistent severe SA-AKI, which is consistent with the predictions of our model (25). In addition, previous studies have determined 12 risk factors associated with early SA-AKI development, including age, BMI, and urine output—key features also captured by our model (26). Since SOFA score was introduced to define sepsis, numerous studies have either modified it or combined it with other biomarkers for prediction (27, 28). By integrating ML with large and complex datasets, our model demonstrated that SOFA score is one of the important influencing factors (Figure 5). SOFA score ranked second in importance in the KDIGO stage 3 model, while ranking lower in the other three models, suggesting that its accuracy in predicting patients with different severity levels should be considered when using the SOFA score, thus underscoring the importance of a multi-class model for predicting SA-AKI.

Nonetheless, this study has certain limitations. The MIMIC-IV database originates from a single U.S. center, which may limit generalizability to other regions or populations, and the use of imputation for missing data could introduce bias. Additionally, the model has not been externally validated, which may affect its robustness. We also only used the minimum and maximum values within the first 24 h, potentially overlooking important temporal dynamics. Future work will focus on external validation in multi-center prospective cohorts across different regions and populations, exploration of novel biomarkers, incorporation of continuous time-series data (e.g., dynamic trends of creatinine and urine output) to capture temporal patterns of disease progression, and assessment of the model in real ICU settings. These improvements aim to enhance predictive accuracy, clinical applicability, and the robustness of our findings, while helping identify optimal time-points for stage-specific clinical interventions.

Conclusion

This research effectively established robust machine learning models for predicting stages of AKI in severely ill sepsis patients, with the RF model exhibiting optimal performance. Through the application of SHAP analysis, critical risk factors such as urine output, body mass index, SOFA score, and peak blood urea nitrogen were identified, highlighting the potential for personalized risk assessment. These results lay the groundwork for early interventions, supporting improved management and survival outcomes in sepsis patients.

Data availability statement

The original contributions presented in the study are included in the article/Supplementary material; further inquiries can be directed to the corresponding author.

Ethics statement

This study utilized the MIMIC-IV database, which is publicly available and de-identified. As the study involves secondary analysis of anonymized data, no direct ethics approval was required. The MIMIC-IV database was originally collected with ethical approval from the Institutional Review Board (IRB) at Beth Israel Deaconess Medical Center (BIDMC), Boston, USA. The studies were conducted in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required from the participants or the participants’ legal guardians/next of kin in accordance with the national legislation and institutional requirements.

Author contributions

ZQ: Data curation, Methodology, Conceptualization, Writing – original draft. ZH: Data curation, Writing – review & editing, Conceptualization. SZ: Writing – review & editing, Conceptualization, Data curation. LW: Supervision, Conceptualization, Writing – review & editing. JW: Writing – review & editing, Methodology, Data curation. YL: Investigation, Writing – review & editing, Data curation. HW: Supervision, Funding acquisition, Writing – review & editing.

Funding

The author(s) declare that financial support was received for the research and/or publication of this article. This study was supported by the National Natural Science Foundation of China (No. 82472184); Key projects of NSFC Technology Department and Provincial Natural Science Foundation Outstanding Youth Project (No. JQ2021H002); the National Key Research and Development Program of China (No. 2021YFC2501800); Harbin Medical University Youth Fund (No. PYQN2023-9); Fundamental Research Funds for the Heilongjiang Provincial Universities (No. 2023-KYYWF-0192).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The authors declare that no Gen AI was used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fmed.2025.1667488/full#supplementary-material

References

1. Uchino, S, Kellum, JA, Bellomo, R, Doig, GS, Morimatsu, H, Morgera, S, et al. Acute renal failure in critically ill patients: a multinational multicenter study. JAMA. (2005) 294:813–8. doi: 10.1001/jama.294.7.813


2. Yegenaga, I, Hoste, E, Van Biesen, W, Vanholder, R, Benoit, D, Kantarci, G, et al. Clinical characteristics of patients developing ARF due to sepsis/systemic inflammatory response syndrome: results of a prospective study. Am J Kidney Dis. (2004) 43:817–24. doi: 10.1053/j.ajkd.2003.12.045


3. Bagshaw, SM, Lapinsky, S, Dial, S, Arabi, Y, Dodek, P, Wood, G, et al. Acute kidney injury in septic shock: clinical outcomes and impact of duration of hypotension prior to initiation of antimicrobial therapy. Intensive Care Med. (2009) 35:871–81. doi: 10.1007/s00134-008-1367-2


4. Morrell, ED, Kellum, JA, Pastor-Soler, NM, and Hallows, KR. Septic acute kidney injury: molecular mechanisms and the importance of stratification and targeting therapy. Crit Care. (2014) 18:501. doi: 10.1186/s13054-014-0501-5


5. Poston, JT, and Koyner, JL. Sepsis associated acute kidney injury. BMJ. (2019) 364:k4891. doi: 10.1136/bmj.k4891


6. Chawla, LS, Bellomo, R, Bihorac, A, Goldstein, SL, Siew, ED, Bagshaw, SM, et al. Acute kidney disease and renal recovery: consensus report of the acute disease quality initiative (ADQI) 16 workgroup. Nat Rev Nephrol. (2017) 13:241–57. doi: 10.1038/nrneph.2017.2


7. Zarbock, A, Nadim, MK, Pickkers, P, Gomez, H, Bell, S, Joannidis, M, et al. Sepsis-associated acute kidney injury: consensus report of the 28th acute disease quality initiative workgroup. Nat Rev Nephrol. (2023) 19:401–17. doi: 10.1038/s41581-023-00683-3


8. Bruse, N, Pardali, K, Kraan, M, Kox, M, and Pickkers, P. Phenotype-specific therapeutic efficacy of ilofotase alfa in patients with sepsis-associated acute kidney injury. Crit Care. (2024) 28:50. doi: 10.1186/s13054-024-04837-y


9. Peng, J, Tang, R, Yu, Q, Wang, D, and Qi, D. No sex differences in the incidence, risk factors and clinical impact of acute kidney injury in critically ill patients with sepsis. Front Immunol. (2022) 13:895018. doi: 10.3389/fimmu.2022.895018


10. Li, X, Wu, R, Zhao, W, Shi, R, Zhu, Y, Wang, Z, et al. Machine learning algorithm to predict mortality in critically ill patients with sepsis-associated acute kidney injury. Sci Rep. (2023) 13:5223. doi: 10.1038/s41598-023-32160-z


11. Tang, J, Huang, J, He, X, Zou, S, Gong, L, Yuan, Q, et al. The prediction of in-hospital mortality in elderly patients with sepsis-associated acute kidney injury utilizing machine learning models. Heliyon. (2024) 10:e26570. doi: 10.1016/j.heliyon.2024.e26570


12. Lai, W, Kuang, M, Wang, X, Ghafariasl, P, Sabzalian, MH, and Lee, S. Skin cancer diagnosis (SCD) using artificial neural network (ANN) and improved gray wolf optimization (IGWO). Sci Rep. (2023) 13:19377. doi: 10.1038/s41598-023-45039-w


13. Ansari, KA, and Ghafariasl, P. Advanced meta-ensemble machine learning models for early and accurate sepsis prediction to improve patient outcomes. (2024). arXiv. abs:2407.08107.


14. Ghafariasl, P, Zeinalnezhad, M, and Chang, S. Fine-tuning pre-trained networks with emphasis on image segmentation: a multi-network approach for enhanced breast cancer detection. Eng Appl Artif Intell. (2025) 139:109666. doi: 10.1016/j.engappai.2024.109666

15. Ferrettini, G, Escriva, E, Aligon, J, Excoffier, JB, and Soulé-Dupuy, C. Coalitional strategies for efficient individual prediction explanation. Inf Syst Front. (2022) 24:49–75. doi: 10.1007/s10796-021-10141-9

16. Johnson, AEW, Bulgarelli, L, Shen, L, Gayles, A, Shammout, A, Horng, S, et al. MIMIC-IV, a freely accessible electronic health record dataset. Sci Data. (2023) 10:1. doi: 10.1038/s41597-022-01899-x

17. Yang, Y, Khorshidi, HA, and Aickelin, U. A review on over-sampling techniques in classification of multi-class imbalanced datasets: insights for medical problems. Front Digit Health. (2024) 6:1430245. doi: 10.3389/fdgth.2024.1430245

18. Chawla, NV, Bowyer, KW, Hall, LO, and Kegelmeyer, WP. SMOTE: synthetic minority over-sampling technique. J Artif Intell Res. (2002) 16:321–57. doi: 10.48550/arXiv.1106.1813

19. Hidayaturrohman, QA, and Hanada, E. A comparative analysis of hyper-parameter optimization methods for predicting heart failure outcomes. Appl Sci. (2025) 15:3393. doi: 10.3390/app15063393

20. Kalimouttou, A, Lerner, I, Cheurfa, C, Jannot, AS, and Pirracchio, R. Machine-learning-derived sepsis bundle of care. Intensive Care Med. (2023) 49:26–36. doi: 10.1007/s00134-022-06928-2

21. Zheng, R, Qian, S, Shi, Y, Lou, C, Xu, H, and Pan, J. Association between triglyceride-glucose index and in-hospital mortality in critically ill patients with sepsis: analysis of the MIMIC-IV database. Cardiovasc Diabetol. (2023) 22:307. doi: 10.1186/s12933-023-02041-w

22. Yan, F, Chen, X, Quan, X, Wang, L, Wei, X, and Zhu, J. Association between the stress hyperglycemia ratio and 28-day all-cause mortality in critically ill patients with sepsis: a retrospective cohort study and predictive model establishment based on machine learning. Cardiovasc Diabetol. (2024) 23:163. doi: 10.1186/s12933-024-02265-4

23. Zarbock, A, Kellum, JA, Schmidt, C, Van Aken, H, Wempe, C, Pavenstädt, H, et al. Effect of early vs delayed initiation of renal replacement therapy on mortality in critically ill patients with acute kidney injury: the ELAIN randomized clinical trial. JAMA. (2016) 315:2190–9. doi: 10.1001/jama.2016.5828

24. Ahn, YH, Yoon, SM, Lee, J, Lee, SM, Oh, DK, Lee, SY, et al. Early sepsis-associated acute kidney injury and obesity. JAMA Netw Open. (2024) 7:e2354923. doi: 10.1001/jamanetworkopen.2023.54923

25. Luo, X, Liu, D, Li, C, Liao, J, Lv, W, Wang, Y, et al. The predictive value of the serum creatinine-to-albumin ratio (sCAR) and lactate dehydrogenase-to-albumin ratio (LAR) in sepsis-related persistent severe acute kidney injury. Eur J Med Res. (2025) 30:25. doi: 10.1186/s40001-024-02269-6

26. Zhao, CC, Nan, ZH, Li, B, Yin, YL, Zhang, K, Liu, LX, et al. Development and validation of a novel risk-predicted model for early sepsis-associated acute kidney injury in critically ill patients: a retrospective cohort study. BMJ Open. (2025) 15:e088404. doi: 10.1136/bmjopen-2024-088404

27. Lee, HJ, Ko, BS, Ryoo, SM, Han, E, Suh, GJ, Choi, SH, et al. Modified cardiovascular SOFA score in sepsis: development and internal and external validation. BMC Med. (2022) 20:263. doi: 10.1177/08850666241282294

28. Lee, CW, Kou, HW, Chou, HS, Chou, HH, Huang, SF, Chang, CH, et al. A combination of SOFA score and biomarkers gives a better prediction of septic AKI and in-hospital mortality in critically ill surgical patients: a pilot study. World J Emerg Surg. (2018) 13:41. doi: 10.1186/s13017-018-0202-5

Glossary

AKI - Acute kidney injury

DT - Decision Tree

ENet - Efficient Neural Network

KNN - K-Nearest Neighbor

LightGBM - Light Gradient Boosting Machine

MLP - Multi-Layer Perceptron

Multinom - Multinomial Mixture Model

RF - Random Forest

XGBoost - eXtreme Gradient Boosting

SA-AKI - Acute kidney injury associated with sepsis

ADQI - Acute Disease Quality Initiative

ML - Machine learning

SHAP - SHapley Additive exPlanations

SQL - Structured Query Language

SOFA - Sequential Organ Failure Assessment

SMOTE - Synthetic Minority Oversampling Technique

Bal Accuracy - Balanced Accuracy

F Meas - F Measure

J index - Jaccard index

Kap - Kappa

MCC - Matthews Correlation Coefficient

NPV - Negative Predictive Value

PPV - Positive Predictive Value

AUC - Area Under the Curve

Sens - Sensitivity

Spec - Specificity

Keywords: acute kidney injury, MIMIC-IV database, machine learning, prediction model, sepsis

Citation: Quan Z, Han Z, Zeng S, Wen L, Wang J, Li Y and Wang H (2025) Stage prediction of acute kidney injury in sepsis patients using explainable machine learning approaches. Front. Med. 12:1667488. doi: 10.3389/fmed.2025.1667488

Received: 16 July 2025; Accepted: 01 October 2025; Published: 15 October 2025.

Edited by:

Nozomi Takahashi, University of British Columbia, Canada

Reviewed by:

Harikrishna Choudary Ponnam, Summa Health System, United States
Parviz Ghafariasl, Kansas State University Olathe, United States

Copyright © 2025 Quan, Han, Zeng, Wen, Wang, Li and Wang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Hongliang Wang, icuwanghongliang@hrbmu.edu.cn

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.