
ORIGINAL RESEARCH article

Front. Immunol., 19 January 2026

Sec. Alloimmunity and Transplantation

Volume 16 - 2025 | https://doi.org/10.3389/fimmu.2025.1745873

This article is part of the Research Topic: Computational and Machine Learning Approaches in Solid Organ and Hematopoietic Stem Cell Transplantation

Machine learning-based prediction of one-year mortality after alloHCT identifies the impact of pre-transplant immunity and inflammation

Thomas Meyer1,2, Robert Meyer3, Maren Hackenberg4, Daniela Oelke5, Laura Gengenbach1, Christoph Rummelt1, Hauke Wilcken1, Kristina Maas-Bauer1,6, Ralph Wäsch1, Justus Duyster1, Hartmut Bertz1, Jesús Duque-Afonso1,2, Jürgen Finke1, Robert Zeiser1, Claudia Wehr1*
  • 1Department of Internal Medicine I, Hematology, Oncology and Stem Cell Transplantation, Medical Center - University of Freiburg, Faculty of Medicine, Freiburg, Germany
  • 2Collaborative Research Institute Intelligent Oncology (CRIION), Freiburg, Germany
  • 3Alcemy GmbH, Berlin, Germany
  • 4Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg, Freiburg, Germany
  • 5Offenburg University of Applied Sciences, Offenburg, Germany
  • 6IMMediate Advanced Clinician Scientist-Program, Department of Medicine II, Medical Center – University of Freiburg and Faculty of Medicine, University of Freiburg, Freiburg, Germany

Accurate prediction of mortality after allogeneic hematopoietic stem cell transplantation (alloHCT) is essential for individualized treatment decisions, yet existing clinical risk scores capture only a limited number of variables and show modest predictive performance. In our single-center retrospective analysis, we included data from 909 adult patients with hematologic malignancies undergoing alloHCT. We used 31 features to build machine-learning models to predict death within the first year after alloHCT. These features included established clinical risk factors together with pre-transplant lymphocyte subsets and inflammatory markers. Among four models, a random forest algorithm showed the best performance (AUC = 0.773) and retained good generalizability in an independent test set (AUC = 0.748). SHapley Additive exPlanations (SHAP)-based interpretation of the machine-learning models showed that age together with five easily measurable pre-transplant immunological and inflammatory parameters influenced the outcome: pre-transplant CD4+, CD8+, and B-lymphocyte counts, albumin, and C-reactive protein (CRP) levels. Based on these features, our random forest approach outperformed established clinical risk scores (HCT-CI, EASIX, rDRI, mGPS) in predicting one-year mortality after alloHCT and more effectively distinguished patients at low and high risk of an adverse outcome. Our study shows that machine-learning-based models can not only predict patient outcomes after alloHCT but also serve as powerful tools for data exploration, confirming the prognostic relevance of pre-transplant inflammation while uncovering the critical role of lymphocyte subsets as previously unknown risk factors. External validation in independent multicenter cohorts will be required to confirm generalizability.

Introduction

Allogeneic hematopoietic stem cell transplantation (alloHCT) is a potentially curative treatment option for patients with hematologic malignancies. Over recent decades, survival following alloHCT has steadily improved, with current estimates showing ~20-30% overall mortality and ~15% non-relapse mortality within the first year post-transplant (1). Accurately estimating individual mortality risk after alloHCT is essential for informed decision-making by both physicians and patients, and also plays a critical role in safeguarding donors. Several prognostic scores have been developed to predict overall and/or non-relapse mortality (2–7). The Hematopoietic Cell Transplantation Comorbidity Index (HCT-CI) quantifies comorbidity burden and predicts non-relapse mortality, and is also widely used to estimate overall survival (7, 8). The refined Disease Risk Index (rDRI) reflects disease-related risks (2). The Endothelial Activation and Stress Index (EASIX) assesses endothelial dysfunction (5, 6), and the modified Glasgow Prognostic Score (mGPS) uses albumin and CRP for outcome prediction (9, 10). However, current scoring systems lack the desirable accuracy for individualized outcome prediction after alloHCT, as demonstrated by reported AUC values for one-year survival prediction typically ranging between 0.53 and 0.64 (8).

To address the potential limitations in the complexity of traditional scores, machine learning-based approaches have been implemented for mortality prediction, demonstrating promising results (11–15). Nevertheless, these models lack the explainability aspect of a scoring system. Machine learning has also been employed to explore the impact of individual features on relapse or mortality risk (16–18). While the prognostic relevance of pre-transplant immunological factors is increasingly recognized (9), most studies have focused on post-transplant immune reconstitution and its association with transplantation characteristics. In parallel, accumulating evidence highlights that a heightened systemic inflammatory state before transplantation represents an important and independent determinant of post-transplant outcomes (19–23).

In this single-center study of 909 adults undergoing alloHCT, we set out to identify new, accessible predictors, focusing on immunological features and incorporating emerging inflammatory risk factors, and to determine their relative importance for one-year mortality after transplantation using SHapley Additive exPlanations (SHAP) values (24, 25). We applied machine learning to analyze pre-transplant disease, donor, and recipient factors, including lymphocyte subset counts and inflammatory markers, to predict one-year mortality within a rigorous nested cross-validation framework followed by an independent test validation step. The approach yielded a well-performing, disease-agnostic, generalizable model. Model interpretation underscored the prognostic value of pre-transplant immune and inflammatory status. We benchmarked the model against established clinical risk scores (HCT-CI, rDRI, EASIX, mGPS), which it outperformed in predictive performance.

Methods

Patient and donor characteristics

This section describes the patient cohort, variable definitions, and data sources used for model development. The work was conducted in accordance with the tenets of the Declaration of Helsinki, approved by the local ethics committee (EK-FR: 22-1490-S1-retro). Informed consent for data analysis was obtained from all patients. We retrospectively identified all patients who underwent first alloHCT between 2008 and 2023 and for whom lymphocyte subsets data were available in our institutional database. Among 1,346 first alloHCT recipients, 909 patients with available lymphocyte subset data were included (Table 1, Supplementary Figure S1A). Data were extracted retrospectively from our prospectively maintained database. The 31 features used in model development are shown in Table 2 and represent routinely available variables across transplant centers.


Table 1. Patient characteristics.


Table 2. Univariate analysis of 31 features used for machine learning models.

We defined any disease status other than a partial response (PR) or complete response (CR) recorded before alloHCT as active disease. For this study, donor-recipient mismatch was defined as a 5–9 out of 10 human leukocyte antigen (HLA) match. Conditioning intensity was quantified using the Transplant Conditioning Intensity (TCI) score, calculated according to published criteria (26). Lymphocyte subsets were measured as described previously (21) between day -20 and day -6 before the start of conditioning for alloHCT. For laboratory parameters, we included lactate dehydrogenase (LDH), platelet count, creatinine, albumin, C-reactive protein (CRP), and alanine aminotransferase (ALT). For each patient, the median value within the same pre-transplant time window (day -20 to -6) was used.
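The per-patient summary of laboratory values described above can be illustrated with a minimal sketch. The function name `window_median`, the `(day, value)` tuple layout, and the example CRP readings are hypothetical and not taken from the study; only the day -20 to day -6 window and the use of the median come from the text.

```python
from statistics import median

def window_median(measurements, start_day=-20, end_day=-6):
    """Median of all values whose relative day lies in [start_day, end_day].

    `measurements` is a list of (day, value) pairs, where `day` is counted
    relative to the reference day (negative = before). Returns None when
    no measurement falls inside the window.
    """
    in_window = [value for day, value in measurements if start_day <= day <= end_day]
    return median(in_window) if in_window else None

# Hypothetical CRP readings (mg/L); days -25 and -3 fall outside the window.
crp_readings = [(-25, 40.0), (-18, 12.0), (-12, 8.0), (-7, 20.0), (-3, 55.0)]
crp_median = window_median(crp_readings)
print(crp_median)
```

Returning `None` for an empty window keeps missing values explicit, so that they can be handled later by the imputation step rather than silently replaced.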

In addition to these variables, we assessed four established clinical scores: the Hematopoietic Cell Transplantation-Comorbidity Index (HCT-CI) (7), the refined Disease Risk Index (rDRI) (2), the Endothelial Activation and Stress Index (EASIX) (27), and the modified Glasgow Prognostic Score (mGPS) (9, 10). These scores were computed according to their published definitions. For the EASIX and mGPS scores, the above-mentioned median laboratory values were used for calculation. For Kaplan-Meier analyses concerning EASIX, we applied the thresholds published by Shouval et al. (8), which approximately correspond to the quantile-based categories used in other studies (27, 28). One patient had a missing HCT-CI value due to incomplete pulmonary function testing, 11 patients had missing mGPS values owing to unavailable pre-transplant albumin measurements, and 51 patients had missing rDRI scores because the index was not applicable to their underlying disease according to its definition (2).
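As an illustration of how the two laboratory-based scores can be derived from the median values, the following sketch uses the commonly cited definitions: EASIX as LDH × creatinine / platelets, and mGPS graded by CRP > 10 mg/L and albumin < 35 g/L. The units, function names, and handling of missing values are assumptions made here; the exact implementation used by the authors is not given in the text.

```python
def easix(ldh_u_per_l, creatinine_mg_per_dl, platelets_1e9_per_l):
    """EASIX = LDH [U/L] x creatinine [mg/dL] / platelets [10^9/L]."""
    return ldh_u_per_l * creatinine_mg_per_dl / platelets_1e9_per_l

def mgps(crp_mg_per_l, albumin_g_per_l):
    """Modified Glasgow Prognostic Score (0, 1, or 2).

    0: CRP <= 10 mg/L
    1: CRP > 10 mg/L and albumin >= 35 g/L
    2: CRP > 10 mg/L and albumin < 35 g/L
    Returns None if either input is missing (assumption: score left undefined).
    """
    if crp_mg_per_l is None or albumin_g_per_l is None:
        return None
    if crp_mg_per_l <= 10:
        return 0
    return 2 if albumin_g_per_l < 35 else 1

# Hypothetical patient values, for illustration only.
print(easix(200.0, 1.0, 100.0))
print(mgps(20.0, 30.0))
```

Leaving the score undefined when albumin is unavailable mirrors the missing mGPS values reported for patients without pre-transplant albumin measurements.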

Model generation and feature analysis

To ensure transparent, unbiased, and reproducible model development, we implemented a machine-learning pipeline combining nested cross-validation, SHAP-based feature selection, and hyperparameter optimization (Figure 1). The study was conducted in accordance with the TRIPOD-AI (Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis-Artificial Intelligence extension) guidelines (29). A version of the TRIPOD-AI checklist is provided in the Supplementary Materials (Supplementary Table S1).


Figure 1. Visualization of model generation. The cohort (n = 909) was split 75/25 into a training set and a locked test set (gray, A). The training data underwent 10 × repeated 5-fold nested cross-validation (orange): The outer folds (B) supplied 50 AUC estimates per algorithm and patient-level SHAP values (D), while the inner folds were used to tune hyperparameters (C). SHAP values were recalculated whenever a patient re-entered an outer training fold and then averaged (example in purple). The final model was refitted on the complete training set and evaluated once on the independent test set to confirm generalizability (E).

We selected death within the first year after alloHCT as the primary categorical endpoint for model development. This time frame was chosen because of its clinical relevance and relatively high event frequency, which enables robust statistical modeling. Moreover, early mortality is often strongly influenced by pre-transplant patient-, donor-, and disease-related factors, making it a suitable outcome for prediction based on baseline variables. Follow-up was complete during the first year after transplantation, with no censoring before the one-year time point. Categorical features were one-hot encoded. The overall degree of missingness was low, with only 14 missing values out of 28,179 data points. Median imputation was applied where necessary for all models except XGBoost, which inherently handles missing values. Imputation was implemented within the machine-learning pipeline and therefore performed separately within each cross-validation training fold, with imputer parameters learned exclusively from the respective training data and applied to the corresponding validation or test folds, thereby preventing information leakage. Imputation affected four variables: pre-transplant albumin (n=11), pre-transplant ALT (n=1), and donor gender and donor age (n=1 each). The single missing values for donor gender and donor age arose from a cord blood transplantation, for which donor characteristics (sex and age) are not consistently defined or recorded.

We compared four algorithms: logistic regression, gradient boosting, XGBoost, and random forest. Before model generation, a 75/25 stratified data split was performed to maintain an event frequency of ~25% in both the 75% training and the 25% test set (Figure 1A). The test set was only used after model selection to ensure generalizability.

Using the training set, we performed 10 repeats of 5-fold nested cross-validation. The outer loop (Figure 1B) yielded unbiased performance estimates. For each of the 50 outer folds, we refit the model with the hyperparameters selected in the inner loop, and both the area under the receiver operating characteristic curve (AUC) and patient-level SHAP values were computed (Figure 1D). Within each outer training split, hyperparameters were optimized by randomized search over a predefined grid using 5-fold cross-validation with receiver operating characteristic (ROC) AUC scoring (Figure 1C).
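The overall structure of the stratified split, fold-wise imputation, and nested cross-validation can be sketched with scikit-learn as follows. This is a simplified illustration, not the authors' code: the data are synthetic, the hyperparameter grid is hypothetical, the SHAP-based feature selector is omitted, and only 2 repeats and 3 search iterations are used to keep the run short (the study used 10 repeats).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import (RandomizedSearchCV, RepeatedStratifiedKFold,
                                     StratifiedKFold, cross_val_score, train_test_split)
from sklearn.pipeline import Pipeline

# Synthetic stand-in data (hypothetical): roughly 25% events, a few missing values.
X, y = make_classification(n_samples=400, n_features=10, weights=[0.75], random_state=0)
rng = np.random.default_rng(0)
X[rng.integers(0, 400, 5), rng.integers(0, 10, 5)] = np.nan

# 75/25 stratified split; the test set stays locked until the very end.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# The imputer lives inside the pipeline, so medians are learned per training fold.
pipe = Pipeline([("impute", SimpleImputer(strategy="median")),
                 ("model", RandomForestClassifier(random_state=0))])

# Hypothetical search space; the study's actual grid is not given in the text.
param_distributions = {"model__n_estimators": [25, 50],
                       "model__max_depth": [2, 4, None],
                       "model__min_samples_leaf": [1, 5]}

# Inner loop: randomized search with 5-fold CV and ROC AUC scoring.
inner = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = RandomizedSearchCV(pipe, param_distributions, n_iter=3,
                            scoring="roc_auc", cv=inner, random_state=0)

# Outer loop: repeated stratified 5-fold CV yields one AUC per outer fold.
outer = RepeatedStratifiedKFold(n_splits=5, n_repeats=2, random_state=0)
outer_aucs = cross_val_score(search, X_train, y_train, scoring="roc_auc", cv=outer)
print(f"nested CV AUC: {outer_aucs.mean():.3f} +/- {outer_aucs.std():.3f}")

# Final step: tune on the full training set, then evaluate once on the test set.
search.fit(X_train, y_train)
test_auc = roc_auc_score(y_test, search.predict_proba(X_test)[:, 1])
print(f"test AUC: {test_auc:.3f}")
```

Passing the search object itself to `cross_val_score` is what makes the procedure nested: every outer training split runs its own hyperparameter search, and the outer test fold is never seen during tuning.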

To avoid information leakage and ensure unbiased feature selection, feature selection was performed within the inner cross-validation loop via a SHAP-based selector implemented in the pipeline: a lightweight model matching the main algorithm was fit on each inner training fold, features were ranked by mean absolute SHAP values, and the top k features were passed forward, with k treated as a tunable hyperparameter. The final model for each outer fold was refitted on the full outer training data with the hyperparameters and features selected through randomized search, then evaluated on the held-out outer test fold (Figure 1E).

For feature analysis, we used SHAP values (24, 25). SHAP values decompose individual predictions into feature-specific contributions, enabling both patient-level and global interpretation of feature importance and interactions. Due to the cross-validation scheme, we acquired 10 sets of SHAP values per patient in the training set from the outer loop. We then averaged these SHAP values for further interpretation.
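To make the decomposition concrete, the sketch below computes exact Shapley values for a toy three-feature risk function by brute-force enumeration of feature subsets, then averages the values from several repeats. This is not the SHAP package's tree algorithm used in the study: the risk function, the baseline patient, and the convention that an "absent" feature takes its baseline value are all assumptions chosen for illustration.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance.

    Features outside a coalition are set to their baseline value.
    Cost grows as 2^n, so this is only feasible for a handful of features.
    """
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

def toy_risk(z):
    """Hypothetical risk score with an age x CRP interaction (not the study model)."""
    age, albumin, crp = z
    return 0.01 * age - 0.02 * albumin + 0.005 * crp + 0.0001 * age * crp

patient = [70.0, 30.0, 50.0]    # age, albumin, CRP (hypothetical)
baseline = [50.0, 40.0, 5.0]    # reference patient (hypothetical)
phi = shapley_values(toy_risk, patient, baseline)

# Averaging per-patient values across repeats, as done for the outer-loop SHAP sets.
repeats = [phi, [p * 1.1 for p in phi]]   # second repeat is made up for illustration
mean_phi = [sum(col) / len(col) for col in zip(*repeats)]
print(phi, mean_phi)
```

The contributions sum to the difference between the patient's prediction and the baseline prediction, which is the additivity property that allows patient-level SHAP values to be read as shares of the predicted risk.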

For the validation analysis, hyperparameter selection for the best-performing algorithm was done on the training set using the same grid and randomized search as in the 10×5-fold nested cross-validation, with 2,500 iterations in a 5-fold cross-validation. We did not perform post hoc probability calibration, as the primary comparisons between models and clinical risk scores were based on rank-based metrics (AUCs) and risk group stratifications derived from predicted probability percentiles in the training cohort. Moreover, calibration of the final random forest model on the independent test set was already acceptable, with a Brier score of 0.172 (Supplementary Figure S1B).
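For readers unfamiliar with the Brier score, it is the mean squared difference between predicted probability and observed outcome (0 = perfect). The sketch below uses made-up predictions; the comparison value is simply what a constant prediction equal to an illustrative 25% event rate would yield.

```python
def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and binary outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

# Made-up outcomes and predicted probabilities, for illustration only.
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.8, 0.3]
score = brier_score(y_true, y_prob)

# Reference: predicting the event rate p for everyone gives p * (1 - p).
prevalence = 0.25
uninformative = prevalence * (1 - prevalence)
print(score, uninformative)
```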

Models were built and evaluated using scikit-learn (30) on Python 3.11 (31). SHAP values were computed using the SHAP package (24, 25).

Statistical analysis and additional information

Additional statistical analyses were performed to assess the independent prognostic contribution of key immune and inflammatory features and to benchmark model performance against established clinical scores.

To assess whether pre-transplant lymphocyte subsets represented independent risk factors for one-year mortality, we performed a multivariable logistic regression analysis. SHAP values from the best-performing machine-learning model were smoothed using locally weighted scatterplot smoothing (LOWESS) and subsequently used to evaluate non-linear associations between each feature and the model-predicted risk. Specifically, for pre-transplant B, CD4+ T, CD8+ T cells, and CRP, thresholds were defined at points where the smoothed SHAP values crossed zero, indicating a shift in predictive direction. These thresholds were used to construct binary risk indicators. These risk flags, along with the top remaining SHAP-ranked features (excluding the original continuous lymphocyte counts) and additional covariates selected based on clinical relevance and presumed interaction potential, were included in a multivariable logistic regression model fitted on the combined dataset. To evaluate the incremental contribution of risk variables, reduced models excluding these flags were also constructed. Likelihood ratio tests were then performed to compare the full and reduced models.
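The threshold step can be sketched as follows, assuming the smoothed SHAP curve is already available as paired lists of feature values and smoothed SHAP values (the LOWESS step itself is not shown). The curve, the resulting cut points, and the helper names are hypothetical; the example mimics a U-shaped relationship with two zero crossings.

```python
def zero_crossings(xs, ys):
    """Feature values where a curve sampled at increasing xs changes sign.

    Crossing positions are located by linear interpolation between the two
    neighbouring samples.
    """
    crossings = []
    for x0, y0, x1, y1 in zip(xs, ys, xs[1:], ys[1:]):
        if y0 * y1 < 0:
            crossings.append(x0 - y0 * (x1 - x0) / (y1 - y0))
    return crossings

def risk_flag(value, low, high):
    """Binary indicator: 1 if the value lies in the zone with positive SHAP values."""
    return int(value < low or value > high)

# Hypothetical smoothed SHAP curve for a cell count (cells/µL); U-shaped.
counts = [0, 100, 200, 300, 400, 500]
smoothed_shap = [0.04, 0.01, -0.02, -0.01, 0.01, 0.03]

low, high = zero_crossings(counts, smoothed_shap)
print(low, high, risk_flag(80, low, high), risk_flag(250, low, high))
```

For a monotone relationship, such as the one described for low B-cell counts, a single crossing would result and the flag would reduce to a one-sided comparison.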

To benchmark the predictive performance of the random forest model against established clinical risk scores, we compared ROC-derived AUC and test-set risk stratifications. Specifically, the mean AUC from the outer folds of the nested cross-validation (± standard deviation) was used as the unbiased estimate of model performance on the training data, while test-set AUCs were computed on the independent holdout cohort restricted to feature-complete cases. For the clinical scores (HCT-CI, rDRI, EASIX, and mGPS), AUCs were calculated separately on the training and test set. For interpretability, test-set patients were divided into three or four risk groups (tertiles or quartiles) based on predicted probabilities from the training set, and Kaplan-Meier curves were generated to visualize overall survival across risk strata in comparison with the respective clinical scores.
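Two pieces of this benchmarking can be shown in a few lines: the AUC as a rank-based statistic, and risk groups whose cut points are learned on training-set probabilities and then applied unchanged to test-set patients. The probabilities below are invented, and the use of `statistics.quantiles` with the inclusive method is an assumption; the study does not state how percentiles were computed.

```python
from statistics import quantiles

def roc_auc(y_true, scores):
    """AUC as the probability that a random event case outranks a random non-event
    case (ties count one half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def risk_cutoffs(train_probs, n_groups):
    """Cut points (n_groups - 1 values) taken from training-set predicted probabilities."""
    return quantiles(train_probs, n=n_groups, method="inclusive")

def assign_group(prob, cutoffs):
    """Risk group index: 0 = lowest risk, len(cutoffs) = highest risk."""
    return sum(prob > c for c in cutoffs)

# Invented example values.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

train_probs = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.40, 0.50, 0.60]
tertile_cutoffs = risk_cutoffs(train_probs, 3)
test_groups = [assign_group(p, tertile_cutoffs) for p in [0.08, 0.25, 0.70]]
print(auc, tertile_cutoffs, test_groups)
```

Because the cut points come from the training cohort only, the test-set groups need not be equally sized, which is the intended behavior for an honest out-of-sample stratification.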

Writing assistance was provided using large language model-based tools to improve clarity and readability under full author supervision.

Results

Patient cohort

We included 909 patients who underwent alloHCT between 2008 and 2023, with a median follow-up of survivors of 2,256 days (Table 1). The cohort was broadly distributed in age and gender, with acute myeloid leukemia (AML) as the leading diagnosis, followed by myelodysplastic neoplasia (MDS), lymphoma, and multiple myeloma. A majority of patients had active disease at the time of transplant. Matched unrelated and related donors with peripheral blood stem cell grafts were most often used. One third of patients experienced acute graft-versus-host disease (GvHD) beyond grade I, and a similar proportion developed moderate to severe chronic GvHD. During follow-up, about half of the patients died, with disease progression (PD) and infections accounting for most deaths. Of these, 232 patients (26%) died within the first year after alloHCT. Among these early deaths, 102 were due to PD (44%) and 91 (39%) were due to infections.

We tested a total of 31 clinical, immunological, inflammatory, and transplant-related variables for association with one-year mortality after alloHCT (Table 2). Of these, a substantial number showed statistically significant differences between survivors and non-survivors, underscoring the complexity of risk prediction. Patients who died within the first year were significantly older and had poorer pre-transplant performance status, elevated LDH, CRP, and creatinine levels, and lower platelet and albumin levels. Regarding immune status, lower pre-transplant counts of B cells, CD4+ and CD8+ T-cells were significantly associated with one-year mortality. Disease-related factors also contributed: active disease at the time of transplantation was more common among non-survivors, and patients with lymphoma were significantly overrepresented in the early death group. Transplant-related factors also differed between groups: patients who died within one year more often underwent HLA-mismatched transplantation, while GvHD prophylaxis with anti-T lymphocyte globulin (ATLG) was used more frequently in survivors. Use of post-transplant cyclophosphamide (PT-CY) showed a nonsignificant trend toward more frequent application in the early mortality group.

Given the number of significant associations, potential interactions, and non-linear relationships, classical multivariable regression would be limited for this cohort. Therefore, we chose machine learning-based modeling approaches, which are better suited to capture complex, high-dimensional, and non-linear patterns in clinical data.

Random forest outperforms other machine-learning models in predicting one-year mortality after alloHCT

The goal of our modeling approach was to predict one-year mortality after alloHCT as a binary outcome. Model performance was evaluated using a 10× repeated 5-fold nested cross-validation on the training set (75% of the data, Figure 1), with the remaining 25% reserved as an independent test set.

Across all 50 outer folds, random forest achieved the highest median AUC (0.773), followed by XGBoost (0.761), gradient boosting (0.750), and logistic regression (0.742), as shown in Figure 2A (adjusted p < 0.01 for all comparisons with the random forest). A Friedman test showed a significant overall difference (p < 0.01). Pairwise Wilcoxon tests (Bonferroni-corrected) indicated that XGBoost and random forest each outperformed logistic regression (p < 0.05), and random forest also outperformed gradient boosting and XGBoost (p < 0.01). Based on its superior performance, random forest was selected as the final model for our SHAP value analysis.


Figure 2. Model performance comparison and validation. (A) Area under the curve (AUC) values from 10× repeated 5-fold cross-validation for logistic regression, gradient boosting, XGBoost, and random forest (* p < 0.01 for all comparisons with the random forest). (B) Receiver operating characteristic (ROC) curves for the random forest model. The gray line shows the mean ROC across 50 folds (AUC of mean curve = 0.771; shaded area = ± 1 SD). The solid red line represents the ROC on the independent held-out test set (AUC = 0.748).

When applied to the test set, the random forest model maintained robust performance, achieving an AUC of 0.748, indicating good generalizability (Figure 2B). Decision curve analysis further confirmed the clinical utility of the model, showing consistent net benefit across relevant threshold probabilities in both the training and test set (Supplementary Figure S1C). Calibration of the final random forest model was assessed on the independent test set. The calibration curve demonstrated good alignment in the low-to-intermediate risk range, with a tendency toward modest overestimation at higher predicted probabilities (Brier score of 0.172, calibration intercept −0.14; slope 1.59; Supplementary Figure S1B).
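Decision curve analysis rests on the net benefit at a given threshold probability, commonly defined as TP/n − FP/n × pt/(1 − pt). The sketch below computes it for a model and for the "treat all" reference strategy; the outcomes and probabilities are invented, and the authors' exact implementation is not described in the text.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of classifying patients with y_prob >= threshold as high risk."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

# Invented outcomes and predicted probabilities.
y_true = [1, 1, 0, 0, 0, 1, 0, 0]
y_prob = [0.8, 0.6, 0.4, 0.2, 0.1, 0.3, 0.7, 0.05]

nb_model = net_benefit(y_true, y_prob, 0.5)
nb_treat_all = net_benefit(y_true, [1.0] * len(y_true), 0.5)  # everyone flagged
print(nb_model, nb_treat_all)
```

A decision curve is obtained by repeating this calculation over a range of thresholds and plotting the model against the "treat all" and "treat none" (net benefit 0) strategies.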

To assess whether median imputation influenced model performance, we performed a complete-case sensitivity analysis restricted to patients without missing predictor values (training: 672/681; test: 224/228). Restricting to complete cases resulted in negligible changes in discrimination (ΔAUC +0.0002 in the training set and −0.0018 in the independent test set), indicating that the low level of missing data and the imputation strategy did not affect performance.

SHAP values illustrate feature importance for outcome prediction

SHAP values were used to quantify each feature’s contribution to the predicted probability of death within the first year after alloHCT (Figure 3).


Figure 3. Feature importance and SHAP-based interpretation of the random forest model. (A) Bar plot of the mean absolute SHAP values for the top ten features across all ten cross-validation repeats. The SHAP value represents the average magnitude of each feature’s contribution to predicting death within the first year after alloHCT. (B) SHAP summary plot showing per-patient SHAP values for each of the top 10 features. Each point represents one patient, with color indicating the feature value (red = high, blue = low). Points to the right of the vertical axis contribute to the prediction of death within the first year, while those on the left contribute against it. (C, D) SHAP dependence plots. Each dot represents one patient. SHAP values (y-axis) indicate the direction and magnitude of each feature’s contribution to the model output, while the x-axis shows raw feature value. Dashed vertical and horizontal lines denote observed inflection points, suggesting threshold effects. Color gradients represent interacting feature values. (C) Age at alloHCT, pre-transplant albumin, and C-reactive protein (CRP) levels, showing their influence on predicted one-year mortality. (D) Pre-transplant lymphocyte subpopulations (CD4+, CD8+, and B-cells), highlighting non-linear associations with outcome.

The feature with the highest impact was age at alloHCT (mean |SHAP| = 0.028), followed by pre-transplant albumin levels (0.027) and pre-transplant CD4+ T-cell count (0.021), CRP levels (0.018), CD8+ T-cell count (0.018), and B-cell count (0.018). Notably, three immune cell subsets (CD4+, CD8+, and B cells) and two inflammatory markers (albumin and CRP) accounted for five of the six most influential predictors, all of which were quantitative and non-disease-specific (Figure 3A, Supplementary Table S2).

At the individual patient level (Figure 3B), higher age, lower albumin, higher CRP, and deviations in pre-transplant CD4+, CD8+, and B-cell counts contributed most strongly to predicted one-year mortality. SHAP dependence plots revealed distinct patterns: age and albumin showed approximately linear associations with risk, CRP exhibited a plateau effect at higher values, and lymphocyte subsets demonstrated non-linear relationships (Figures 3C, D). In particular, low B-cell counts were associated with a steep increase in risk, while CD4+ and CD8+ T-cell counts displayed U-shaped relationships, with increased risk at both low and high values. These U-shaped relationships persisted across disease categories (Supplementary Figure S2A).

These patterns were consistently observed across all tree-based models, and SHAP analysis of the final random forest on the independent test set showed similar feature rankings and dependency curves, supporting the robustness of these findings.

To assess the independent prognostic value of immune and inflammatory features, we fitted a multivariable logistic regression model using one-hot-encoded indicators derived from SHAP-defined thresholds for CD4+, CD8+, and B-cell counts, as well as CRP (Supplementary Figure S2B, Supplementary Table S2), capturing the nonlinear relationships observed in the SHAP dependency plots. These indicators remained independently associated with one-year mortality after adjustment for clinically relevant covariates (Supplementary Figure S2C). Excluding all three lymphocyte indicators significantly worsened model fit (LR = 24.3, df = 3, p < 0.0001), as did removal of CRP and albumin (LR = 12.1, df = 2, p < 0.01).
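The reported likelihood ratio statistics can be checked against the chi-square distribution with a short standard-library sketch. The helper `chi2_sf` is written here for illustration (a statistics package would normally be used); only the LR values and degrees of freedom are taken from the text.

```python
from math import erfc, exp, gamma, sqrt

def lr_statistic(loglik_full, loglik_reduced):
    """Likelihood ratio statistic: 2 * (log-likelihood full - log-likelihood reduced)."""
    return 2.0 * (loglik_full - loglik_reduced)

def chi2_sf(stat, df):
    """Upper-tail probability of a chi-square distribution with integer df >= 1.

    Uses the recurrence Q(a + 1, x) = Q(a, x) + x^a * exp(-x) / Gamma(a + 1)
    for the regularized upper incomplete gamma function, with x = stat / 2.
    """
    x = stat / 2.0
    if df % 2 == 0:
        a, q = 1.0, exp(-x)           # Q(1, x)
    else:
        a, q = 0.5, erfc(sqrt(x))     # Q(1/2, x)
    while a < df / 2.0:
        q += x ** a * exp(-x) / gamma(a + 1.0)
        a += 1.0
    return q

p_lymphocytes = chi2_sf(24.3, 3)   # removal of the three lymphocyte indicators
p_inflammation = chi2_sf(12.1, 2)  # removal of CRP and albumin
print(f"{p_lymphocytes:.2e}, {p_inflammation:.4f}")
```

Both p-values fall below the thresholds stated in the text (p < 0.0001 and p < 0.01, respectively).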

Random forest outperforms established clinical risk scores in prediction and risk discrimination

To benchmark our random forest (RF) model against established clinical risk scores, we compared its performance with HCT-CI, rDRI, EASIX, and mGPS. ROC analysis demonstrated that the RF model clearly outperformed all four scores in both the training and independent test sets (Figure 4A). In the nested cross-validation of the training set, the RF achieved the highest mean area under the ROC curve (AUC = 0.771), whereas the clinical scores showed lower discriminatory power (HCT-CI AUC = 0.599, rDRI AUC = 0.644, EASIX AUC = 0.644, and mGPS AUC = 0.590). When applied to the independent test set, the RF model retained robust generalizability with an AUC of 0.747, again exceeding the performance of all established scores. Interestingly, rDRI performed comparatively better on the test set (AUC = 0.690) than in the training set, while HCT-CI (AUC = 0.562), EASIX (AUC = 0.555), and mGPS (AUC = 0.617) remained substantially lower.


Figure 4. Model performance and model comparison to established clinical scores. (A) Receiver operating characteristic (ROC) curves summarizing model discrimination. The left panel shows the mean nested cross-validation ROC for the random forest (RF) on the training set (shaded area = ± 1 SD) compared with clinical scores (HCT-CI, rDRI, EASIX, and mGPS). The right panel shows the ROC curves on the independent test set restricted to feature-complete cases without missing values. (B) Kaplan-Meier curves for RF, HCT-CI, and mGPS. For RF, test-set Kaplan-Meier curves were generated using three risk groups (tertiles) defined from training-set predicted probabilities. (C) Kaplan-Meier curves for RF, rDRI, and EASIX. For RF, test-set Kaplan-Meier curves were generated using four risk groups (quartiles) defined from the training set.

Beyond overall discrimination, we evaluated the model’s ability to stratify survival risk within the test cohort (Figures 4B, C). To ensure comparability with conventional scoring systems that categorize patients into three risk levels, we defined RF-based tertiles of predicted mortality probability from the training set and compared them directly with HCT-CI and mGPS (Figure 4B). The RF model clearly separated test-set patients into three distinct survival trajectories (global log-rank p < 0.001), outperforming both HCT-CI (p = 0.26) and mGPS (p < 0.001). Compared with these scores, the RF identified a true low-risk group with a one-year overall survival of 94% and a clearly defined high-risk group with only 57% survival. In contrast, the corresponding low-risk groups of HCT-CI and mGPS showed substantially lower survival rates of 78% and 81%, respectively. This highlights the RF model’s ability to distinguish both patients with excellent post-transplant outcomes and those at markedly increased risk of early mortality after alloHCT.

When extending the analysis to four groups to mirror the four-tiered risk stratifications of the rDRI and EASIX, the RF model again demonstrated a clear and stepwise separation across all strata (global log-rank p < 0.001; Figure 4C, left). One-year overall survival declined progressively from 96% in the lowest-risk quartile (Q1) to 55% in the highest-risk quartile (Q4), illustrating the model’s ability to reliably identify both patients at very low and very high risk of post-transplant mortality. Among the comparator scores, rDRI showed overall significance (p < 0.001; Figure 4C, middle) and successfully identified a very-high-risk group with poor outcomes (1-year OS 41%). However, in our test population, rDRI did not define a genuine low-risk group, as only four patients were classified as low risk, all of whom survived beyond one year. In contrast, EASIX produced only modest separation (p = 0.03; Figure 4C, right), with overlapping survival between intermediate groups (1-year OS 82-70%).

Taken together, the RF model demonstrated both superior predictive performance and more meaningful clinical stratification than HCT-CI, rDRI, EASIX, or mGPS. Notably, the RF approach is largely based on routine laboratory parameters and remains disease-agnostic, heavily relying on immunological and inflammatory features. By leveraging these predictors, the RF reliably identified patients with excellent one-year survival as well as those with markedly elevated early mortality risk, offering a refined and potentially easily reproducible framework for individualized risk assessment after alloHCT (Figures 4A–C). Complementary Kaplan-Meier analyses based on out-of-fold training predictions confirmed consistent risk-group separation (Supplementary Figure S1D).
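For readers less familiar with the Kaplan-Meier method referred to above, the product-limit estimate of survival at a fixed horizon can be sketched in a few lines. This is a generic illustration with hypothetical follow-up data, not the analysis code of the study. It also shows that, when no patient is censored before the horizon, the estimate coincides with the simple proportion of survivors.

```python
def kaplan_meier(times, events, horizon):
    """Product-limit estimate of survival at `horizon`.

    times  : follow-up time per patient (e.g. days after transplant)
    events : 1 if the patient died at that time, 0 if censored
    """
    survival = 1.0
    at_risk = len(times)
    for t in sorted(set(times)):
        if t > horizon:
            break
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Deaths at time t are counted among everyone still at risk at t
            survival *= 1 - deaths / at_risk
        at_risk -= leaving
    return survival


# Hypothetical cohort without censoring before day 365
times_complete = [60, 150, 400, 500, 700, 800, 900, 1000]
events_complete = [1, 1, 0, 1, 0, 0, 0, 0]
os_complete = kaplan_meier(times_complete, events_complete, horizon=365)

# Hypothetical cohort with one patient censored before the only death
times_censored = [100, 200, 300]
events_censored = [0, 1, 0]
os_censored = kaplan_meier(times_censored, events_censored, horizon=365)
```

In the first cohort the estimate equals 6/8, the raw fraction alive at one year. In the second, the early censoring shrinks the risk set, so the estimate (0.5) differs from the naive proportion (2/3).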

Discussion

Accurately predicting mortality after alloHCT is a long-standing challenge. Conventional scores offer quick assessments but capture only a small slice of the complex interaction between recipient, disease, and transplant factors, and their performance has not kept pace with changes in conditioning regimens, donor selection, and supportive care. A recent registry-wide machine-learning (ML) effort has shown that flexible algorithms outperform rule-based scores in this setting (12).

By analyzing 31 routinely available pre-transplant variables in 909 recipients with diverse hematologic malignancies, our study demonstrates that machine-learning approaches can improve predictive accuracy. Importantly, these models also identify clinically meaningful and biologically interpretable features. Among the algorithms tested, the random forest model achieved the best discrimination, with a median AUC of 0.773 in cross-validation and 0.748 on the independent test set, outperforming established clinical scores, and demonstrating performance highly competitive with other machine-learning models reported in the alloHCT field (11–13, 32). Given the exceptional biological and clinical complexity of allogeneic transplantation, where outcomes are shaped by intricate interactions between patient, disease, donor, and procedural factors, this improvement in discrimination represents a meaningful advance and highlights the robustness and clinical relevance of the model. SHAP interpretation ranked immune and inflammatory markers such as pre-transplant CD4+, CD8+, B-cell counts, albumin, and CRP together with patient age at transplantation as the six most influential predictors of one-year mortality. Remarkably, all these parameters are objective, disease-agnostic laboratory measures that are readily available in routine pre-transplant evaluation, underscoring their potential for broad clinical applicability. Our random forest model driven by immunological and inflammatory predictors outperformed established composite indices such as the HCT-CI, rDRI, EASIX, or mGPS. The independent prognostic relevance of lymphocyte subsets and inflammatory markers was further supported by multivariable analysis. This work highlights the exploratory power of machine-learning approaches to uncover clinically relevant, data-driven patterns beyond the scope of conventional risk models (16, 17).
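As a reminder of what the reported AUC values measure, the statistic can be read as the probability that a randomly chosen patient who died within one year received a higher predicted risk than a randomly chosen survivor. The following sketch computes it directly from that definition; the scores and labels are hypothetical, and in practice a library routine would be used instead of this quadratic loop.

```python
def auc(scores, labels):
    """AUC as the fraction of (death, survivor) pairs ranked correctly.

    Ties between a death and a survivor count as half a correct pair.
    """
    deaths = [s for s, y in zip(scores, labels) if y == 1]
    survivors = [s for s, y in zip(scores, labels) if y == 0]
    if not deaths or not survivors:
        raise ValueError("need at least one patient in each class")
    correct = 0.0
    for d in deaths:
        for s in survivors:
            if d > s:
                correct += 1.0
            elif d == s:
                correct += 0.5
    return correct / (len(deaths) * len(survivors))


# Hypothetical predicted risks; label 1 = died within one year
scores = [0.9, 0.7, 0.6, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0]
example_auc = auc(scores, labels)  # 8 of 9 pairs are ordered correctly
```

On this scale, 0.5 corresponds to random ordering and 1.0 to perfect separation of the two outcome groups.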

In selecting our modeling framework, we deliberately focused on binary classification algorithms rather than time-to-event models such as random survival forests. Because our follow-up was complete for the first year after transplantation with no censoring before the one-year endpoint, a binary framework was both statistically appropriate and computationally efficient. Moreover, this approach enabled the use of TreeSHAP (25), which allows consistent and interpretable estimation of feature importance across nested cross-validation folds, ensuring generalizable insights into the drivers of risk. Comparable interpretability is currently not feasible for survival-based models or deep-learning approaches within such a rigorous validation design. Thus, tree-based classification models offered an optimal balance between transparency, reproducibility, and computational efficiency without loss of clinically relevant information.
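To make the quantity behind SHAP more concrete, the sketch below computes exact Shapley values for a toy risk function by averaging marginal contributions over all feature orderings. It is not TreeSHAP and not the study code: TreeSHAP obtains such attributions efficiently for tree ensembles, whereas this brute-force version is only practical for a handful of features. The function `toy_risk`, its thresholds, and the single baseline patient are invented for illustration and carry no clinical meaning.

```python
from itertools import permutations


def shapley_values(model, x, baseline):
    """Exact Shapley values for one patient `x` relative to `baseline`.

    Features not yet revealed in an ordering keep their baseline value.
    """
    names = list(x)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = model(current)
        for name in order:
            current[name] = x[name]
            updated = model(current)
            totals[name] += updated - previous
            previous = updated
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_risk(f):
    """Hypothetical risk function with one interaction term."""
    risk = 0.2
    if f["crp"] > 10:
        risk += 0.2
    if f["cd4"] < 200:
        risk += 0.1
    if f["crp"] > 10 and f["albumin"] < 35:
        risk += 0.2  # interaction: its credit is shared by crp and albumin
    return risk


patient = {"crp": 40, "albumin": 30, "cd4": 150}
baseline = {"crp": 3, "albumin": 42, "cd4": 600}
phi = shapley_values(toy_risk, patient, baseline)
```

The attributions sum to the difference between the patient's prediction and the baseline prediction, which is the additivity property that makes SHAP values comparable across patients and folds.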

Our work is based on data from a single-center cohort, which may limit generalizability to broader populations. The majority of patients received PBSC grafts and ATLG-based serotherapy, and our cohort tended to be older with more active disease, which may differ from other transplant populations (1, 33). To partly mitigate such center-specific bias, we implemented a rigorous nested cross-validation framework and an independent hold-out test set, ensuring robust internal validation. This strategy increases confidence that the identified predictive patterns reflect true biological and clinical relationships. Hence, our findings support the integration of simple pre-transplant immune and inflammatory profiling into standard alloHCT evaluation to improve risk stratification and guide clinical decision-making, thereby facilitating much-needed multicenter validation and broader implementation.
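The structure of such a validation design, a hold-out test set plus nested cross-validation on the remaining patients, can be sketched with index bookkeeping alone. The test fraction, the fold counts, and the function names below are assumptions chosen for illustration and are not taken from the study; stratification by outcome is also omitted for brevity.

```python
import random


def k_folds(indices, k, seed):
    """Shuffle indices and deal them into k roughly equal folds."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def nested_cv_splits(n_patients, test_fraction=0.2, outer_k=5, inner_k=3, seed=0):
    """Return hold-out indices and a plan of outer and inner splits."""
    indices = list(range(n_patients))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_patients * test_fraction))
    holdout, development = indices[:n_test], indices[n_test:]

    plan = []
    outer = k_folds(development, outer_k, seed + 1)
    for i, outer_val in enumerate(outer):
        outer_train = [idx for j, fold in enumerate(outer) if j != i for idx in fold]
        # Inner folds are drawn only from the outer training part, so
        # hyperparameter tuning never sees the outer validation patients.
        inner = k_folds(outer_train, inner_k, seed + 2 + i)
        inner_pairs = []
        for m, inner_val in enumerate(inner):
            inner_train = [idx for j, fold in enumerate(inner) if j != m for idx in fold]
            inner_pairs.append((inner_train, inner_val))
        plan.append({"outer_train": outer_train, "outer_val": outer_val, "inner": inner_pairs})
    return holdout, plan


holdout, plan = nested_cv_splits(909)
```

The inner loop would be used to select hyperparameters, the outer loop to estimate performance, and the hold-out set would be evaluated once with the final model.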

As the importance of CD4+, CD8+, and B-cell counts as predictors of mortality after alloHCT is a novel finding, the biological relevance of these pre-transplant immune markers warrants particular attention. Low lymphocyte counts likely reflect cumulative treatment burden, disease activity, and immune senescence. Conversely, high T-cell counts may indicate dysregulated or exhausted immunity. Either extreme could compromise early engraftment, infection control, or graft-versus-leukemia effects. Such non-linear relationships are difficult to capture with classic scores but become visible in SHAP dependence plots and similar tools for model explanation (17). These results suggest that simple flow-cytometry-based immune profiling adds an informative dimension to pre-transplant evaluation that is not reflected in existing risk algorithms. This finding also introduces a new perspective on immune profiling in transplant hematology. Whereas most work to date centers on post-transplant immune reconstitution and its association with survival benchmarks (21, 34–36), our findings emphasize the value of baseline immune competence.

Our machine-learning approach achieved good performance by incorporating the granularity of lymphocyte subsets. In contrast, total lymphocyte counts alone would not capture the complex, non-linear relationships between B- and T-cell levels and survival. Natural killer (NK) cell counts, however, were not identified as important predictors in our models. A possible explanation is that NK cell reconstitution occurs very rapidly after alloHCT (21, 37), with the newly emerging NK cell pool consisting entirely of donor-derived cells, making pre-transplant recipient NK levels potentially less impactful for post-transplant survival.

Beyond immunological factors, our results underscore the growing recognition of systemic inflammation as a key determinant of post-transplant outcome. Markers such as CRP and albumin, established as powerful prognostic indicators in solid malignancies through the modified Glasgow Prognostic Score (mGPS) framework (10), have recently been validated in the alloHCT setting (9). In our work, integrating these inflammatory markers into a machine-learning framework alongside immune cell subsets not only reaffirmed their prognostic relevance but also highlighted their strong, quantifiable impact on early post-transplant mortality. This convergence of evidence from conventional and data-driven approaches emphasizes that systemic inflammation represents a biologically and clinically robust axis of risk, one that complements immunological competence and enhances prediction in alloHCT.
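For orientation, the mGPS combines the two markers into three categories. The sketch below assumes the commonly cited thresholds (CRP above 10 mg/L and albumin below 35 g/L); the exact cut-offs and units applied in this study are not restated in this section, so they should be treated as assumptions.

```python
def mgps(crp_mg_per_l, albumin_g_per_l):
    """Modified Glasgow Prognostic Score under commonly cited thresholds.

    0: CRP <= 10 mg/L (regardless of albumin)
    1: CRP > 10 mg/L and albumin >= 35 g/L
    2: CRP > 10 mg/L and albumin < 35 g/L
    """
    if crp_mg_per_l <= 10:
        return 0
    if albumin_g_per_l >= 35:
        return 1
    return 2


# Hypothetical patients, for illustration only
score_normal = mgps(crp_mg_per_l=3, albumin_g_per_l=30)
score_crp_only = mgps(crp_mg_per_l=25, albumin_g_per_l=40)
score_both = mgps(crp_mg_per_l=25, albumin_g_per_l=30)
```

Unlike this fixed three-level rule, a tree-based model can use CRP and albumin as continuous inputs and combine them with lymphocyte subsets, which is one plausible reason for the finer risk separation reported above.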

In summary, our disease-agnostic machine-learning model, built from routine baseline data, improves one-year mortality prediction after alloHCT and highlights pre-transplant lymphocyte counts (CD4, CD8, B lymphocytes) and systemic inflammatory markers (CRP, albumin) as complementary risk factors. Identifying and confirming such emerging predictors will sharpen pre-transplant risk assessment and support better shared decision-making between clinicians and patients. In clinical practice, this approach may facilitate more individualized pre-transplant counseling and risk-adapted planning of conditioning intensity, donor selection, and prophylactic or supportive care strategies. Future work will focus on external validation of this model in independent multicenter cohorts to confirm generalizability and to evaluate its feasibility for integration into routine pre-transplant risk assessment. Moreover, these findings will stimulate mechanistic research into how pre-existing immune and inflammatory states shape post-transplant outcomes, potentially uncovering new biological targets for intervention.

Data availability statement

The data analyzed in this study is subject to the following licenses/restrictions: The dataset contains sensitive personal health information from our institutional clinical database and cannot be fully anonymized under local data protection regulations. Access can therefore only be granted upon reasonable request. Requests to access these datasets should be directed to claudia.wehr@uniklinik-freiburg.de.

Ethics statement

The studies involving humans were approved by Ethics committee Freiburg (EK-FR: 22-1490-S1-retro). The studies were conducted in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required from the participants or the participants’ legal guardians/next of kin in accordance with the national legislation and institutional requirements.

Author contributions

TM: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing. RM: Methodology, Resources, Supervision, Writing – original draft, Writing – review & editing. MH: Methodology, Writing – original draft, Writing – review & editing. DO: Writing – original draft, Writing – review & editing. LG: Resources, Writing – original draft, Writing – review & editing. CR: Resources, Writing – original draft, Writing – review & editing. HW: Resources, Writing – original draft, Writing – review & editing. KM-B: Resources, Writing – original draft, Writing – review & editing. RW: Resources, Writing – original draft, Writing – review & editing. JD: Resources, Writing – original draft, Writing – review & editing. HB: Resources, Writing – original draft, Writing – review & editing. JD-A: Resources, Writing – original draft, Writing – review & editing. JF: Resources, Supervision, Writing – original draft, Writing – review & editing. RZ: Resources, Supervision, Writing – original draft, Writing – review & editing. CW: Conceptualization, Formal analysis, Investigation, Methodology, Project administration, Resources, Supervision, Validation, Writing – original draft, Writing – review & editing.

Funding

The author(s) declared that financial support was not received for this work and/or its publication. We acknowledge support by the Open Access Publication Fund of the University of Freiburg.

Acknowledgments

We thank Irmgard Matt for thoroughly maintaining our database, Melanie Sieder and Darina Siegmund for performing routine diagnostics on lymphocyte counts, and Regine Mayer for IT support. The work of Thomas Meyer was funded by the Mertelsmann Foundation. The work of Maren Hackenberg was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), Project-ID 499552394, SFB 1597 Small Data.

Conflict of interest

Author RM was employed by company Alcemy GmbH, Berlin, Germany. JD-A received speaker honoraria from Roche, Amgen, Riemser, SOBI, IPSEN, Abbvie, Beigene, NovoNordisk, AstraZeneca and Lilly and travel support from Lilly, Roche, Gilead, IPSEN, SOBI, and Beigene. RW consulted for and/or received honoraria from Abbvie, Alexion, Amgen, BMS, Johnson & Johnson, Kite/Gilead, Novartis, Pfizer, Sanofi, Takeda and received research funding from Johnson & Johnson and Sanofi, all paid to UKF. CR has received honoraria from Abbvie, Servier, and BMS, research funding from Astellas, and travel support from Abbvie, Servier, and Jazz. KM-B has received travel support from Sanofi. RZ has received honoraria from Therakos, Medac, Novartis, Neovii, VectivBio, Incyte. CW has received travel support, honoraria, or research funding from Jazz Pharmaceuticals, Mundipharma, Grifols, Medac, and Takeda.

The remaining author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

Writing assistance was provided using large language model-based tools (GPT-4, GPT-5) to improve clarity and readability under full author supervision. The author(s) declared that generative AI was used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fimmu.2025.1745873/full#supplementary-material

References

1. Penack O, Peczynski C, Mohty M, Yakoub-Agha I, Styczynski J, Montoto S, et al. How much has allogeneic stem cell transplant–related mortality improved since the 1980s? A retrospective analysis from the EBMT. Blood Adv. (2020) 4:6283–90. doi: 10.1182/bloodadvances.2020003418

2. Armand P, Kim HT, Logan BR, Wang Z, Alyea EP, Kalaycio ME, et al. Validation and refinement of the Disease Risk Index for allogeneic stem cell transplantation. Blood. (2014) 123:3664–71. doi: 10.1182/blood-2014-01-552984

3. Au BKC, Gooley TA, Armand P, Fang M, Madtes DK, Sorror ML, et al. Reevaluation of the pretransplant assessment of mortality score after allogeneic hematopoietic transplantation. Biol Blood Marrow Transplant J Am Soc Blood Marrow Transpl. (2015) 21:848–54. doi: 10.1016/j.bbmt.2015.01.011

4. Gratwohl A. The EBMT risk score. Bone Marrow Transpl. (2012) 47:749–56. doi: 10.1038/bmt.2011.110

5. Luft T, Benner A, Jodele S, Dandoy CE, Storb RF, Gooley TA, et al. It is easix to predict non-relapse mortality (NRM) of allogeneic stem cell transplantation (alloSCT). Blood. (2016) 128:519. doi: 10.1182/blood.V128.22.519.519

6. Sanchez-Escamilla M, Flynn J, Devlin S, Maloy M, Fatmi SA, Tomas AA, et al. EASIX score predicts inferior survival after allogeneic hematopoietic cell transplantation. Bone Marrow Transpl. (2023) 58:498–505. doi: 10.1038/s41409-023-01922-8

7. Sorror ML, Maris MB, Storb R, Baron F, Sandmaier BM, Maloney DG, et al. Hematopoietic cell transplantation (HCT)-specific comorbidity index: a new tool for risk assessment before allogeneic HCT. Blood. (2005) 106:2912–9. doi: 10.1182/blood-2005-05-2004

8. Shouval R, Fein JA, Shouval A, Danylesko I, Shem-Tov N, Zlotnik M, et al. External validation and comparison of multiple prognostic scores in allogeneic hematopoietic stem cell transplantation. Blood Adv. (2019) 3:1881–90. doi: 10.1182/bloodadvances.2019032268

9. Bertz H, Sahlmann J, McMillan DC, Wehr C, Duque-Afonso J, Maas-Bauer K, et al. The importance of systemic inflammatory response measurements as pretransplant risk factors for outcome after allogeneic haematopoietic cell transplantation. Br J Haematol. (2025) 207(4):1517–28. doi: 10.1111/bjh.70049

10. Dolan RD, McSorley ST, Horgan PG, Laird B, and McMillan DC. The role of the systemic inflammatory response in predicting outcomes in patients with advanced inoperable cancer: Systematic review and meta-analysis. Crit Rev Oncol Hematol. (2017) 116:134–46. doi: 10.1016/j.critrevonc.2017.06.002

11. Choi EJ, Jun TJ, Park HS, Lee J-H, Lee K-H, Kim Y-H, et al. Predicting long-term survival after allogeneic hematopoietic cell transplantation in patients with hematologic malignancies: machine learning–based model development and validation. JMIR Med Inform. (2022) 10:e32313. doi: 10.2196/32313

12. Hernández-Boluda JC, Mosquera-Orgueira A, Gras L, Koster L, Tuffnell J, Kröger N, et al. Use of machine learning techniques to predict poor survival after hematopoietic cell transplantation for myelofibrosis. Blood. (2025) 145:3139–52. doi: 10.1182/blood.2024027287

13. Mussetti A, Rius-Sansalvador B, Moreno V, Peczynski C, Polge E, Galimard JE, et al. Artificial intelligence methods to estimate overall mortality and non-relapse mortality following allogeneic HCT in the modern era: an EBMT-TCWP study. Bone Marrow Transplant. (2023) 59(2):232–8. doi: 10.1038/s41409-023-02147-5

14. Shouval R, Labopin M, Bondi O, Mishan-Shamay H, Shimoni A, Ciceri F, et al. Prediction of allogeneic hematopoietic stem-cell transplantation mortality 100 days after transplantation using a machine learning algorithm: A European group for blood and marrow transplantation acute leukemia working party retrospective data mining study. J Clin Oncol. (2015) 33:3144–51. doi: 10.1200/JCO.2014.59.1339

15. Shouval R, Labopin M, Unger R, Giebel S, Ciceri F, Schmid C, et al. Prediction of hematopoietic stem cell transplantation related mortality- lessons learned from the in-silico approach: A European society for blood and marrow transplantation acute leukemia working party data mining study. PloS One. (2016) 11:e0150637. doi: 10.1371/journal.pone.0150637

16. Le Bris Y, Costes D, Bourgade R, Guillaume T, Peterlin P, Garnier A, et al. Impact on outcomes of mixed chimerism of bone marrow CD34+ sorted cells after matched or haploidentical allogeneic stem cell transplantation for myeloid malignancies. Bone Marrow Transpl. (2022) 57:1435–41. doi: 10.1038/s41409-022-01747-x

17. Li H, Sachdev V, Tian X, Nguyen M-L, Hsieh M, Fitzhugh C, et al. A machine learning-based workflow for predicting transplant outcomes in patients with sickle cell disease. Br J Haematol. (2025) 206:919–23. doi: 10.1111/bjh.19842

18. Zhou Y, Smith J, Keerthi D, Li C, Sun Y, Mothi SS, et al. Longitudinal clinical data improve survival prediction after hematopoietic cell transplantation using machine learning. Blood Adv. (2024) 8:686–98. doi: 10.1182/bloodadvances.2023011752

19. Massoud R, Gagelmann N, Fritzsche-Friedland U, Zeck G, Heidenreich S, Wolschke C, et al. Comparison of immune reconstitution between anti-T-lymphocyte globulin and posttransplant cyclophosphamide as acute graft-versus-host disease prophylaxis in allogeneic myeloablative peripheral blood stem cell transplantation. Haematologica. (2022) 107:857–67. doi: 10.3324/haematol.2020.271445

20. Meyer T, Ihorst G, Bartsch I, Zeiser R, Wäsch R, Bertz H, et al. Cellular and humoral SARS-CoV-2 vaccination responses in 192 adult recipients of allogeneic hematopoietic cell transplantation. Vaccines. (2022) 10:1782. doi: 10.3390/vaccines10111782

21. Meyer T, Maas-Bauer K, Wäsch R, Duyster J, Zeiser R, Finke J, et al. Immunological reconstitution and infections after alloHCT - a comparison between post-transplantation cyclophosphamide, ATLG and non-ATLG based GvHD prophylaxis. Bone Marrow Transpl. (2025) 60:286–96. doi: 10.1038/s41409-024-02474-1

22. Velardi E, Tsai JJ, and van den Brink MRM. T cell regeneration after immunological injury. Nat Rev Immunol. (2021) 21:277–91. doi: 10.1038/s41577-020-00457-z

23. Soiffer RJ, Kim HT, McGuirk J, Horwitz ME, Johnston L, Patnaik MM, et al. Prospective, randomized, double-blind, phase III clinical trial of anti-T-lymphocyte globulin to assess impact on chronic graft-versus-host disease-free survival in patients undergoing HLA-matched unrelated myeloablative hematopoietic cell transplantation. J Clin Oncol Off J Am Soc Clin Oncol. (2017) 35:4003–11. doi: 10.1200/JCO.2017.75.8177

24. Lundberg SM and Lee SI. A unified approach to interpreting model predictions. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. NIPS’17. 57 Morehouse Lane, Red Hook, NY 12571, USA: Curran Associates Inc (2017). p. 4768–77.

25. Lundberg SM, Erion G, Chen H, DeGrave A, Prutkin JM, Nair B, et al. From local explanations to global understanding with explainable AI for trees. Nat Mach Intell. (2020) 2:56–67. doi: 10.1038/s42256-019-0138-9

26. Spyridonidis A, Labopin M, Gedde-Dahl T, Ganser A, Stelljes M, Craddock C, et al. Validation of the transplant conditioning intensity (TCI) index for allogeneic hematopoietic cell transplantation. Bone Marrow Transpl. (2024) 59:217–23. doi: 10.1038/s41409-023-02139-5

27. Luft T, Benner A, Jodele S, Dandoy CE, Storb R, Gooley T, et al. EASIX in patients with acute graft-versus-host disease: a retrospective cohort analysis. Lancet Haematol. (2017) 4:e414–23. doi: 10.1016/S2352-3026(17)30108-4

28. Luft T, Benner A, Terzer T, Jodele S, Dandoy CE, Storb R, et al. EASIX and mortality after allogeneic stem cell transplantation. Bone Marrow Transpl. (2020) 55:553–61. doi: 10.1038/s41409-019-0703-1

29. Collins GS, Moons KGM, Dhiman P, Riley RD, Beam AL, Calster BV, et al. TRIPOD+AI statement: updated guidance for reporting clinical prediction models that use regression or machine learning methods. BMJ. (2024) 385:e078378. doi: 10.1136/bmj-2023-078378

30. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: machine learning in python. J Mach Learn Res. (2011) 12:2825–30.

31. van Rossum G and Drake FL. The Python Language Reference. Release 3.0.1 [Repr.]. Wilmington, Delaware, USA: Python Software Foundation (2010).

32. Asteris PG, Gandomi AH, Armaghani DJ, Mohammed AS, Bousiou Z, Batsis I, et al. Pre-transplant and transplant parameters predict long-term survival after hematopoietic cell transplantation using machine learning. Transpl Immunol. (2025) 90:102211. doi: 10.1016/j.trim.2025.102211

33. Admiraal R, Nierkens S, de Witte MA, Petersen EJ, Fleurke G, Verrest L, et al. Association between anti-thymocyte globulin exposure and survival outcomes in adult unrelated haemopoietic cell transplantation: a retrospective, pharmacodynamic cohort analysis. Lancet Haematol. (2017) 4:e183–91. doi: 10.1016/S2352-3026(17)30029-7

34. Kim DH, Sohn SK, Won DI, Lee NY, Suh JS, and Lee KB. Rapid helper T-cell recovery above 200 × 10⁶/l at 3 months correlates to successful transplant outcomes after allogeneic stem cell transplantation. Bone Marrow Transpl. (2006) 37:1119–28. doi: 10.1038/sj.bmt.1705381

35. Troullioud Lucas AG, Lindemans CA, Bhoopalan SV, Dandis R, Prockop SE, Naik S, et al. Early immune reconstitution as predictor for outcomes after allo-HCT; a tri-institutional analysis. Cytotherapy. (2023) 25:977–85. doi: 10.1016/j.jcyt.2023.05.012

36. Zhou G, Zhan Q, Huang L, Dou X, Cui J, Xiang L, et al. The dynamics of B-cell reconstitution post allogeneic hematopoietic stem cell transplantation: A real-world study. J Intern Med. (2024) 295:634–50. doi: 10.1111/joim.13776

37. Russo A, Oliveira G, Berglund S, Greco R, Gambacorta V, Cieri N, et al. NK cell recovery after haploidentical HSCT with posttransplant cyclophosphamide: dynamics and clinical implications. Blood. (2018) 131:247–62. doi: 10.1182/blood-2017-05-780668

Keywords: alloHCT, explainable AI, immunocompetence, inflammation, lymphocytes, machine learning, one-year mortality, risk prediction

Citation: Meyer T, Meyer R, Hackenberg M, Oelke D, Gengenbach L, Rummelt C, Wilcken H, Maas-Bauer K, Wäsch R, Duyster J, Bertz H, Duque-Afonso J, Finke J, Zeiser R and Wehr C (2026) Machine learning-based prediction of one-year mortality after alloHCT identifies the impact of pre-transplant immunity and inflammation. Front. Immunol. 16:1745873. doi: 10.3389/fimmu.2025.1745873

Received: 13 November 2025; Revised: 18 December 2025; Accepted: 23 December 2025; Published: 19 January 2026.

Edited by:

Kelley M. K. Hitchman, University of Texas Health Science Center San Antonio, United States

Reviewed by:

Eman M. Elsabbagh, CITADEL Lab [Computational Immunology & Transplant AI Data Engineering Lab], United States
Maximilian Alexander Röhnert, Technical University Dresden, Germany

Copyright © 2026 Meyer, Meyer, Hackenberg, Oelke, Gengenbach, Rummelt, Wilcken, Maas-Bauer, Wäsch, Duyster, Bertz, Duque-Afonso, Finke, Zeiser and Wehr. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Claudia Wehr, claudia.wehr@uniklinik-freiburg.de
