Prediction of hemorrhagic transformation after thrombolysis based on machine learning models combined with platelet distribution width-to-count ratio

Li, Xiaosheng; Lei, Chunyan; Xu, Hongyun; Yuan, Churan; Zhou, Yuzhen; Jiang, Wen

doi:10.3389/fneur.2025.1466543

ORIGINAL RESEARCH article

Front. Neurol., 02 October 2025

Sec. Stroke

Volume 16 - 2025 | https://doi.org/10.3389/fneur.2025.1466543

This article is part of the Research Topic: Advancing Precision Medicine in Acute Stroke Care: Personalized Treatment Strategies and Outcomes

Xiaosheng Li^1,2^†

Chunyan Lei^1,3^†

Hongyun Xu⁴

Churan Yuan⁵

Yuzhen Zhou⁵

Wen Jiang^1,3^*

¹Department of Neurology I, First Affiliated Hospital of Kunming Medical University, Kunming, China
²Department of Neurology, The 920th Hospital of Joint Logistics Support Force, Kunming, China
³Yunnan Clinical Center for Neurological and Cardiovascular Diseases, Kunming, China
⁴Department of Endocrinology, Xuanwei Hospital Affiliated to Yunnan University of Chinese Medicine, Xuanwei, China
⁵Department of Rehabilitation Medicine, The Affiliated Hospital of Yunnan University, Kunming, China

Background: Hemorrhagic transformation (HT) is a common and potentially serious complication following intravenous thrombolysis (IVT) in patients with acute ischemic stroke (AIS). Despite its high incidence, there remains a lack of simple and effective tools for predicting HT risk.

Objective: This study aimed to develop an interpretable machine learning (ML) model incorporating the platelet distribution width to platelet count ratio (PPR) to predict HT occurrence in AIS patients after IVT.

Methods: We included AIS patients who underwent IVT at the First Affiliated Hospital of Kunming Medical University between July 2019 and April 2024. Four ML models—logistic regression (LR), random forest (RF), support vector machine (SVM), and extreme gradient boosting (Xgboost)—were constructed using 5-fold cross-validation, with HT after IVT as the outcome. Feature selection was performed using least absolute shrinkage and selection operator (LASSO) regression. Model performance was evaluated based on the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, specificity, and balanced F-score. The best-performing model was selected for interpretability analysis, and feature importance was assessed.

Results: LASSO regression identified six predictive features with non-zero coefficients: age, diabetes, malignancy, onset-to-treatment time (OTT), baseline National Institutes of Health Stroke Scale (NIHSS) score, and PPR. Among the models, LR demonstrated the highest predictive performance, achieving an optimal AUC of 0.919, along with average accuracy, sensitivity, and specificity of 0.825, 0.830, and 0.832, respectively. Feature importance in the LR model ranked as follows: baseline NIHSS score, diabetes, PPR, malignancy, age, and OTT.

Conclusion: The LR-based model incorporating PPR effectively predicts HT risk in AIS patients after IVT, providing clinicians with a rapid and accurate tool to assess thrombolytic hemorrhage risk and support treatment decision-making.

1 Introduction

Intravenous thrombolytic therapy (IVT) with alteplase is effective in treating acute ischemic stroke (AIS) (1). Hemorrhagic transformation (HT) is a frequent and potentially serious complication of IVT in AIS patients, and its incidence is significantly higher in patients who receive thrombolysis than in those who do not (2). HT is classified into hemorrhagic infarction (HI) and parenchymal hemorrhage (PH); PH, as a progressive manifestation of HI, usually indicates poor prognosis (3–5). Therefore, early identification of patients at risk of HT during thrombolysis, and of patients with asymptomatic HI after thrombolysis, is crucial.

Currently, most studies use multiple linear regression (MLR) to identify risk factors for HT following thrombolysis (6, 7), including white blood cells (8), coagulation function (9), bilirubin (10), and uric acid (11). However, MLR is constrained by the assumption of a linear relationship between predictors and outcome, and its sensitivity to outliers may impair predictive performance (12). Machine learning (ML), an emerging discipline in the medical field, leverages computer science and statistical techniques to address healthcare challenges (13); it can compensate for these shortcomings and has therefore been widely applied. Because the performance of ML algorithms varies across application scenarios, appropriate algorithms must be selected to optimize model performance and accuracy before constructing a risk model for predicting HT after thrombolysis.

Platelets are a major component of blood and play a crucial role in the onset and progression of AIS by maintaining the integrity of vascular endothelial cells, supporting coagulation, and performing other pathophysiological functions (14, 15). Platelet count changes in hemorrhagic diseases are typically more rapid and pronounced than changes in fibrinolytic markers. Thrombocytopenia or platelet dysfunction is often an early sign of bleeding, whereas fibrinolytic indicators need time to accumulate before showing any change and therefore may not reflect risk in the early stages of HT. Additionally, fibrinolytic markers can be influenced by more complex factors, such as liver dysfunction and inflammatory responses (16), which can limit their clinical utility. These markers primarily reflect the degree of fibrinolytic activity, and although fibrin degradation products and D-dimer have some predictive value for HT after IVT (17, 18), their accuracy and timeliness are inferior to those of platelet count. Studies have demonstrated a positive correlation between elevated platelet distribution width (PDW) and a heightened likelihood of severe HT (19). As a new hematological indicator, the PDW to platelet count ratio (PPR) can reflect platelet function more comprehensively, and its prognostic value has been confirmed in other diseases (20, 21).

The objective of this study was to assess the predictive capabilities of models built with different algorithms, develop an ML model that incorporates the PPR index for predicting the risk of HT after thrombolysis, and compare the performance of the models to establish an effective assessment tool.

2 Methods

2.1 Study design and object

A single-center, observational, and retrospective study was conducted, and all subjects were collected from the First Affiliated Hospital of Kunming Medical University. The Ethics Committee of the hospital [No. 2022-L-157] provided approval for this study. Due to the retrospective nature of this study, which entailed anonymous and non-invasive data collection, the requirement for obtaining informed consent was waived. All procedures were performed in compliance with the principles outlined in the Declaration of Helsinki.

This study enrolled AIS patients who received IVT from July 2019 to April 2024. The patients were divided into two groups, HT and non-HT, based on CT or magnetic resonance imaging (MRI) findings. A predictive model was then constructed to assess the risk of HT following AIS.

The inclusion criteria were as follows: (1) Patients who met the World Health Organization (WHO) diagnostic criteria for AIS; (2) Hospitalized patients who received rt-PA IVT after cerebral hemorrhage had been excluded by head CT or MRI; (3) Patients aged ≥ 18 years; (4) Patients without a recent history of surgical treatment or brain injury. The exclusion criteria were as follows: (1) Patients with concurrent vital organ diseases, such as liver and kidney impairment; (2) Patients with blood system diseases, coagulation dysfunction, connective tissue diseases, cerebral aneurysms, or cerebrovascular malformations; (3) Patients lacking PDW and PLT data at admission (Figure 1).

Figure 1

Flowchart detailing patient selection for a study from the First Affiliated Hospital of Kunming Medical University. Eligible patients were those who received IVT from July 2019 to April 2024. Exclusion criteria included vital organ diseases (6 excluded), hematologic and cerebrovascular disorders (11 excluded), and lack of PDW and PLT data (4 excluded). Final participants were 331, divided into endovascular (25) and non-endovascular treatment groups (306), further classified into HT (43) and non-HT (288) patients.

Figure 1. Patient selection process. AIS, Acute Ischemic Stroke; IVT, Intravenous Thrombolytic Therapy; PDW, Platelet Distribution Width; PLT, Platelet Count; HT, Hemorrhagic Transformation.

2.2 Data collection and processing

Demographic data and clinical characteristics of the study participants were collected, including age, gender, diabetes, hypertension, atrial fibrillation, body mass index (BMI), smoking history, drinking history, malignant tumor, previous use of lipid-lowering drugs, previous use of antiplatelet drugs, previous use of anticoagulants, endovascular treatment after onset, blood pressure, baseline National Institutes of Health Stroke Scale (NIHSS) score, and time from onset to thrombolysis. Initial laboratory test results, obtained on admission before IVT initiation, were also collected: blood routine, coagulation function, fibrinolysis, liver function, kidney function, electrolytes, blood lipids, glucose, myoglobin, and brain natriuretic peptide (BNP). All data were retrieved from the laboratory information system of the First Affiliated Hospital of Kunming Medical University. The PPR was calculated as PDW/PLT. Missing values in continuous variables were filled by multiple imputation when the proportion of missing values was less than 20%. Categorical variables with more than 20% missing values were excluded. To mitigate multicollinearity, variables with a variance inflation factor (VIF) exceeding 5 were eliminated from the model.
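As a minimal illustration of two of the preprocessing rules described above, the following Python sketch computes the PPR as PDW/PLT and checks a variable's missing-value proportion against the 20% threshold. The study itself was analyzed in R; the function names, the data layout, and all sample values below are hypothetical and are not taken from the study data.

```python
from typing import Optional, Sequence


def ppr(pdw: float, plt: float) -> float:
    """Platelet distribution width to platelet count ratio (PDW / PLT)."""
    if plt <= 0:
        raise ValueError("platelet count must be positive")
    return pdw / plt


def missing_fraction(values: Sequence[Optional[float]]) -> float:
    """Proportion of entries recorded as None (missing)."""
    if not values:
        raise ValueError("empty column")
    return sum(v is None for v in values) / len(values)


def below_missing_threshold(values: Sequence[Optional[float]],
                            threshold: float = 0.20) -> bool:
    """True when the missing proportion is under the threshold (20% in the text)."""
    return missing_fraction(values) < threshold


# Hypothetical admission values, for illustration only.
example_ppr = ppr(pdw=16.0, plt=200.0)
example_column = [4.1, None, 3.8, 5.0, 4.4, 4.9, 5.2, 4.0, 3.9, 4.6]
print(round(example_ppr, 4), missing_fraction(example_column),
      below_missing_threshold(example_column))
```

The imputation step itself and the VIF screen are not reproduced here, since they depend on the fitted data and on the imputation settings, which the text does not specify.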

2.3 LASSO regression for feature selection

LASSO regression was used to select features associated with HT, leveraging its ability to perform both variable selection and regularization. The primary strength of LASSO lies in its L1 regularization, which shrinks some regression coefficients to exactly zero and thereby excludes irrelevant predictors. This automatic feature selection focuses the model on the variables most relevant to HT and enhances its interpretability. The optimal regularization parameter (λ) was determined through cross-validation, which selects the λ value that minimizes model error when the model is tested on different subsets of the data. This mitigates overfitting, so that the model generalizes to new data while retaining the most meaningful predictors.
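To make the shrink-to-zero behavior concrete, the sketch below uses a textbook special case rather than the penalized logistic fit used in the study: for an orthonormal design with the objective scaled as (1/2)·RSS + λ·‖β‖₁, the LASSO estimate is the soft-thresholded least-squares coefficient. The coefficient values and λ grid are hypothetical.

```python
from typing import Dict, List, Sequence


def soft_threshold(beta: float, lam: float) -> float:
    """Shrink beta toward zero by lam; return 0.0 when |beta| <= lam."""
    if beta > lam:
        return beta - lam
    if beta < -lam:
        return beta + lam
    return 0.0


def lasso_path(betas: Sequence[float],
               lambdas: Sequence[float]) -> Dict[float, List[float]]:
    """Coefficients at each lambda (orthonormal-design special case)."""
    return {lam: [soft_threshold(b, lam) for b in betas] for lam in lambdas}


def n_selected(coefs: Sequence[float]) -> int:
    """Number of predictors with a non-zero coefficient."""
    return sum(c != 0.0 for c in coefs)


# Hypothetical unpenalized coefficients for four predictors.
path = lasso_path([0.9, -0.4, 0.05, 0.0], [0.0, 0.1, 0.5, 1.0])
for lam, coefs in path.items():
    print(lam, n_selected(coefs))
```

As λ grows, fewer predictors keep a non-zero coefficient, which is the same qualitative pattern as a coefficient-path plot.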

2.4 Model construction

This retrospective study employed four widely used ML algorithms, namely logistic regression (LR), random forest (RF), support vector machine (SVM), and XGBoost, to predict the onset of HT following thrombolysis, as illustrated in Figure 2. The process began by selecting the features most strongly associated with the occurrence of HT. These features were then used as the input variables for training each of the four ML models. Hyperparameter tuning was conducted for each algorithm using a grid search approach, which systematically explored combinations of hyperparameters within a predefined parameter space specific to each algorithm. The following hyperparameters were tuned. LR: the regularization strength (L1 or L2 regularization). RF: the number of trees, the maximum depth, and the minimum number of samples required to split a node. SVM: the penalty parameter C and the type of kernel function. XGBoost: the learning rate, the number of trees, and the maximum depth. These hyperparameters were optimized through techniques such as cross-validation, grid search, and random search to find the combination that maximized the model’s generalization ability on the validation set.
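The grid search described above can be sketched in a few lines of Python. This is an illustration only: the parameter names, the candidate values, and the scoring function are hypothetical, since the text does not report the grids that were searched. In practice the score would be a cross-validated metric such as mean AUC.

```python
from itertools import product
from typing import Any, Callable, Dict, List, Sequence, Tuple

Params = Dict[str, Any]


def expand_grid(grid: Dict[str, Sequence[Any]]) -> List[Params]:
    """Every combination of the candidate values, one dict per combination."""
    names = list(grid)
    return [dict(zip(names, combo)) for combo in product(*(grid[n] for n in names))]


def grid_search(grid: Dict[str, Sequence[Any]],
                score: Callable[[Params], float]) -> Tuple[Params, float]:
    """Return the combination with the highest score (first one wins ties)."""
    best_params: Params = {}
    best_score = float("-inf")
    for params in expand_grid(grid):
        s = score(params)
        if s > best_score:
            best_params, best_score = params, s
    return best_params, best_score


# Hypothetical random-forest grid; names and values are assumptions.
rf_grid = {"n_trees": [100, 300, 500], "max_depth": [3, 5, 7], "min_samples_split": [2, 5]}


def toy_score(p: Params) -> float:
    """Stand-in for a cross-validated metric; peaks at max_depth=5, n_trees=300."""
    return -abs(p["max_depth"] - 5) - abs(p["n_trees"] - 300) / 1000


best, best_value = grid_search(rf_grid, toy_score)
print(len(expand_grid(rf_grid)), best)
```

The cost of an exhaustive grid grows multiplicatively with the number of candidate values per hyperparameter, which is one reason random search is sometimes used alongside it.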

Figure 2

Figure 2. Machine learning flowchart for constructing predictive models. AIS, Acute Ischemic Stroke; IVT, Intravenous Thrombolytic Therapy; LR, Logistic Regression; RF, Random Forest; SVM, Support Vector Machine; xgboost, Extreme Gradient Boosting.

2.5 Training model

To train our models and mitigate the risk of overfitting, we implemented 5-fold cross-validation. The dataset was divided into 5 separate folds, the largest number of folds that still left each fold with a sufficient number of HT patients (8–9 positive samples per fold). In each round, the model was trained on 4 folds and validated on the remaining fold. The process was repeated 5 times, with each fold acting as the validation set once, and the final model performance was calculated as the average of the metrics obtained across iterations. Because every model was trained and validated on different combinations of the folds, this strategy reduced potential bias in the performance assessment and produced metrics that were less reliant on any single partition of the data.
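The fold arithmetic can be checked with a short Python sketch. It assumes the folds were stratified by outcome, which the text implies but does not state explicitly; the function names and the random seed are hypothetical. With 43 HT and 288 non-HT patients, dealing each class round-robin across 5 folds yields 8 or 9 positives per fold.

```python
import random
from typing import Iterator, List, Sequence, Tuple


def stratified_folds(labels: Sequence[int], k: int = 5, seed: int = 0) -> List[List[int]]:
    """Assign sample indices to k folds, keeping the class ratio roughly equal."""
    rng = random.Random(seed)
    folds: List[List[int]] = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)  # deal indices round-robin within each class
    return folds


def train_validation_splits(folds: List[List[int]]) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, validation) index lists; each fold validates exactly once."""
    for held_out, validation in enumerate(folds):
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, validation


# Class counts reported in the study: 43 HT (1) and 288 non-HT (0).
labels = [1] * 43 + [0] * 288
folds = stratified_folds(labels)
positives_per_fold = [sum(labels[i] for i in fold) for fold in folds]
print(positives_per_fold, [len(fold) for fold in folds])
```

Each validation fold therefore contains only a handful of HT cases, so fold-level sensitivity estimates can vary noticeably from fold to fold.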

2.6 Model evaluation

To evaluate and ensure the generalizability of each model, the performance was assessed by calculating the mean [standard deviation (SD)] of key metrics across the 5-fold cross-validation, including the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC), accuracy, sensitivity, specificity, and balanced F-score. These metrics provided a comprehensive evaluation of each model’s predictive capabilities. To explain the impact of predictors in a model, feature importance analysis was used.
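For reference, the metrics listed above can be computed as in the following Python sketch. It assumes that the balanced F-score is the F1 score and that both classes are present in each fold; the labels, scores, 0.5 threshold, and fold-level AUC values are hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, Sequence, Tuple


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, specificity, and F1 from binary labels (1 = HT)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"accuracy": (tp + tn) / len(pairs), "sensitivity": sensitivity,
            "specificity": specificity, "f1": f1}


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_sd(values: Sequence[float]) -> Tuple[float, float]:
    """Mean and sample standard deviation across folds."""
    return mean(values), stdev(values)


# Hypothetical validation fold: true labels and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.7, 0.4, 0.2, 0.1, 0.05]
y_pred = [int(s >= 0.5) for s in scores]
metrics = classification_metrics(y_true, y_pred)
fold_auc = auc(y_true, scores)
print(metrics, fold_auc, mean_sd([0.80, 0.90]))
```

AUC is threshold-free, whereas accuracy, sensitivity, specificity, and F1 all depend on the cut-off chosen to turn probabilities into predicted classes.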

2.7 Nomogram for HT prediction after IVT

The best-performing model, selected based on the highest AUC and overall metric scores, was used to identify the key features most strongly associated with HT. These selected features, along with the model’s predictions, were then incorporated into the development of a nomogram. The nomogram allows for the calculation of the probability of HT occurrence using multiple clinical variables. For each predictive variable, a horizontal line was drawn, with a scale beneath it indicating the possible values of that variable. Based on the actual observed value of each variable, the corresponding score was located on the scale. The scores for all variables were then summed to obtain a total score. This total score was finally mapped to a probability curve on the nomogram, allowing for the conversion of the total score into the predicted probability of HT occurrence in AIS patients who received IVT. The nomogram provided a visual, intuitive tool that allowed clinicians to estimate the probability of HT in individual patients following thrombolysis, facilitating decision-making and personalized care.
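The point-and-sum procedure described above follows from the form of a logistic model. The Python sketch below shows one common construction, in which the variable with the largest possible contribution spans 0 to 100 points; it is not the study's nomogram. The intercept, coefficients, variable ranges, and patient values are all hypothetical, because the fitted coefficients are not reported in the text.

```python
import math
from typing import Dict, Tuple

Coefs = Dict[str, float]
Ranges = Dict[str, Tuple[float, float]]


def logistic(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def direct_probability(intercept: float, coefs: Coefs, x: Dict[str, float]) -> float:
    """Probability computed directly from the logistic model."""
    return logistic(intercept + sum(b * x[n] for n, b in coefs.items()))


def scale(coefs: Coefs, ranges: Ranges, max_points: float = 100.0):
    """Log-odds per point, and the zero-point reference value of each variable."""
    span = max(abs(b) * (ranges[n][1] - ranges[n][0]) for n, b in coefs.items())
    refs = {n: ranges[n][0] if b >= 0 else ranges[n][1] for n, b in coefs.items()}
    return span / max_points, refs


def points(coefs: Coefs, ranges: Ranges, x: Dict[str, float]) -> Dict[str, float]:
    """Points per variable: its log-odds contribution above the reference, rescaled."""
    unit, refs = scale(coefs, ranges)
    return {n: b * (x[n] - refs[n]) / unit for n, b in coefs.items()}


def probability_from_total(total: float, intercept: float, coefs: Coefs, ranges: Ranges) -> float:
    """Map summed points back to a probability through the logistic function."""
    unit, refs = scale(coefs, ranges)
    base = intercept + sum(b * refs[n] for n, b in coefs.items())
    return logistic(base + unit * total)


# Hypothetical model and patient; none of these numbers come from the study.
intercept = -5.0
coefs = {"nihss": 0.15, "diabetes": 0.9, "age": 0.02}
ranges = {"nihss": (0.0, 40.0), "diabetes": (0.0, 1.0), "age": (18.0, 100.0)}
patient = {"nihss": 12.0, "diabetes": 1.0, "age": 70.0}

pts = points(coefs, ranges, patient)
total = sum(pts.values())
print(pts, total, probability_from_total(total, intercept, coefs, ranges))
```

Summing points and reading off the probability gives the same result as evaluating the model directly, so the nomogram is a graphical restatement of the fitted model rather than a separate predictor.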

2.8 Statistical analysis

Continuous variables were expressed as mean (SD) or median (lower and upper quartiles) and compared using Student’s t-test or non-parametric tests, as appropriate. Categorical variables were expressed as frequency and percentage (%), and Pearson’s chi-square test or Fisher’s exact test was used for comparison between groups. p < 0.05 was considered statistically significant. Statistical analysis was performed using R software (version 4.3.2).

3 Results

3.1 Characteristics of patients

In this study, 331 AIS patients after IVT were included, of whom 43 (13.0%) developed HT. The patients had a median age of 68 [58, 77] years, and there were 205 (61.9%) males and 126 (38.1%) females. Differences were observed regarding the following variables between the HT group and the non-HT group: age, baseline NIHSS score, hemoglobin (Hb), BNP, D-dimer (D-D) (p < 0.05). The details are presented in Table 1.

Table 1

Table 1. Clinical baseline characteristics.

3.2 Selection of predictive variables

Feature selection was carried out using the Least Absolute Shrinkage and Selection Operator (LASSO) method, where the penalty on the β coefficients was determined by the tuning parameter λ (λ = 0.02651381). In this study, 37 variables were included, and 37 lines of different colors were obtained, each representing the trajectory of a specific independent variable’s coefficient. As the value of λ increased, the coefficients gradually shrank toward zero, reflecting the regularization effect of the LASSO method (Figure 3A). The dashed line on the left marked the λ value that minimized the cross-validated error and corresponded to the optimal model fit. At this value, the model selected 6 variables, indicating that these variables provided the most reliable and predictive relationship with the outcome. Consequently, six feature variables with non-zero coefficients were chosen: age, diabetes, malignancy, onset to treatment time (OTT), baseline NIHSS score, and PPR (Figure 3B).

Figure 3

Two graphs are shown. Graph A: A plot of coefficients versus log lambda, displaying multiple colored lines converging towards the right. Graph B: A plot of binomial deviance versus log lambda, showing a red curve with error bars, descending and then stabilizing. Dotted vertical lines are near the right end.

Figure 3. Results of variable screening using the LASSO regression. (A) The 5-fold cross-validation was performed, and the coefficients of all predictors gradually returned to zero. λ = 0.02651381. (B) There were 6 predictors of non-zero coefficients at the far right dotted line.

3.3 Model performance

Four ML algorithms, namely LR, RF, SVM, and XGBoost, were selected to construct models. Each optimized ML model underwent 5-fold cross-validation, and the mean value across folds was used as the result for that algorithm. The mean accuracy values of the LR, RF, SVM, and XGBoost models were 0.825, 0.743, 0.773, and 0.813, respectively; the mean AUC values were 0.851, 0.763, 0.711, and 0.718, respectively; the mean sensitivity values were 0.830, 0.821, 0.731, and 0.636, respectively; the mean specificity values were 0.832, 0.725, 0.776, and 0.841, respectively. The details are presented in Table 2. The optimal ROC curves for the different models are shown in Figure 4. The optimal AUC values for all four models were above 0.8 (LR > XGBoost > SVM > RF), indicating good discriminative performance.

Table 2

Table 2. Mean and standard deviation of 5-fold cross-validation for four ML Models.

Figure 4

ROC curves comparing four models: LR (violet line, AUC 0.919), RF (yellow line, AUC 0.874), SVM (red line, AUC 0.846), and XGBoost (blue line, AUC 0.862). Sensitivities on y-axis and specificities on x-axis.

Figure 4. Optimal ROC curves for four ML models. LR, Logistic Regression; RF, Random Forest; SVM, Support Vector Machine; xgboost, Extreme Gradient Boosting.

The results in Table 2 and Figure 4 indicated that, after comprehensive evaluation of the four models, the LR model exhibited the best performance in terms of mean AUC, accuracy, sensitivity, precision, and F1. Therefore, the LR model was considered the best of the four models. According to the nomogram constructed from the LR model, for AIS patients undergoing IVT, a total score of 226 corresponded to an estimated probability of 0.64 for HT (Figure 5).

Figure 5

Nomogram depicting relationships between various factors and total points. Factors include malignancy, PPR, age, OTT, diabetes, and NIHSS, each with corresponding histograms and red markers. Total points range from 120 to 260, with a specific calculation at 226 points corresponding to a probability of 0.646.

Figure 5. Nomogram model constructed based on LR. OTT, Onset to Treatment Time; PPR, Platelet Distribution Width to Platelet Count Ratio; Blue curve: Represents the relationship between one modeling variable and the occurrence of HT in AIS patients who received IVT; Gray shaded area: Represents the range of fluctuations in the occurrence of HT in AIS patients who received IVT as the input variables change.

Figure 6

Bar graph ranking feature importance after permutation; order: NIHSS > Diabetes > PPR > Malignancy > Age > OTT. The RMSE values range from approximately 0.300 to 0.370.

Figure 6. Feature importance results of the LR model. OTT, Onset to Treatment Time; PPR, Platelet Distribution Width to Platelet Count Ratio.

3.4 Feature importance

The LR model demonstrated the best overall performance. We therefore ranked its features by their contribution to the model’s predictive capability. The feature importance of the LR model is summarized in Figure 6. In descending order of contribution, the feature variables were baseline NIHSS score, diabetes, PPR, malignancy, age, and OTT.
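One way to obtain such a ranking is permutation importance, which the description of Figure 6 suggests was used with RMSE as the loss. The Python sketch below illustrates the idea under that assumption: shuffle one feature at a time and measure how much the loss increases. The toy model, feature names, and data are hypothetical and unrelated to the study's fitted model.

```python
import math
import random
from typing import Callable, Dict, List, Sequence

Row = Dict[str, float]


def rmse(y: Sequence[float], pred: Sequence[float]) -> float:
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, pred)) / len(y))


def permutation_importance(predict: Callable[[Row], float], rows: List[Row],
                           y: Sequence[float], n_repeats: int = 20,
                           seed: int = 0) -> Dict[str, float]:
    """Mean increase in RMSE after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = rmse(y, [predict(r) for r in rows])
    importance: Dict[str, float] = {}
    for name in rows[0]:
        losses = []
        for _ in range(n_repeats):
            column = [r[name] for r in rows]
            rng.shuffle(column)  # break the link between this feature and the outcome
            permuted = [{**r, name: v} for r, v in zip(rows, column)]
            losses.append(rmse(y, [predict(r) for r in permuted]))
        importance[name] = sum(losses) / n_repeats - baseline
    return importance


def toy_model(row: Row) -> float:
    """Hypothetical logistic model that uses 'nihss' and ignores 'noise'."""
    z = -3.0 + 0.2 * row["nihss"] + 0.0 * row["noise"]
    return 1.0 / (1.0 + math.exp(-z))


rows = [{"nihss": float(i), "noise": float((7 * i) % 5)} for i in range(20)]
targets = [toy_model(r) for r in rows]  # toy targets equal to the model output
importance = permutation_importance(toy_model, rows, targets)
print(sorted(importance, key=importance.get, reverse=True))
```

A feature the model does not use shows no increase in loss when shuffled, while an informative feature shows a positive increase; sorting by that increase gives the ranking.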

4 Discussion

In this study, the incidence of HT in AIS patients following IVT was 13.0%, similar to that reported in previous studies (22). The ML models incorporating PPR performed favorably in predicting HT, with optimal AUC values exceeding 0.8. The LR model in particular performed well, with an optimal AUC exceeding 0.9 and mean accuracy, sensitivity, and specificity all above 0.8. Overall, the LR model performed well in performance evaluation and model calibration, providing strong support for clinical decision-making. LR may have outperformed the other models because it is a relatively simple linear model that is particularly suitable for small datasets with clear linear relationships between features. While more complex models such as XGBoost have advantages in handling non-linear relationships, LR may demonstrate better predictive performance when the data are linear, contain less noise, or when a lower risk of overfitting is required (23). Moreover, the ML-based predictive model developed in this study demonstrated superior risk prediction capabilities compared to previous MLR models (6). Among ML models developed by other researchers, the model of Wang et al. (24) was slightly inferior in terms of optimal performance (AUC = 0.82), and it included variables with more than 30% missing values, which might bias HT prediction. The study by Li et al. (25) showed that the XGBoost model exhibited the highest performance in terms of AUC (AUC > 0.95). However, the CO2-CP included in that model is not a routine test at AIS admission, which might limit the adoption of the model in primary hospitals (25). The modeling variables in this study were age, diabetes, malignant tumor, OTT, baseline NIHSS score, and PPR, all of which are readily available, which facilitates wider adoption. In summary, the HT risk predictive model developed in this study performed well on multiple performance indicators and had high clinical application potential compared to other ML models.

The role of platelets in ischemia–reperfusion injury has gained increasing attention in the pathophysiology of AIS. Platelet activation can exacerbate post-stroke ischemia–reperfusion injury, and disruption of the blood–brain barrier by reperfusion injury is one of the important causes of HT (26–28). Platelet aggregation and clot retraction play important roles in the bleeding process: platelet aggregation is a key step in hemostasis, while clot retraction helps stabilize the thrombus and reduce the risk of hemorrhage (29). Alteplase can inhibit platelet aggregation and clot retraction by inhibiting ADP, collagen, and adrenaline, thereby affecting platelet function (30). Platelets can also enhance fibrinolysis by participating in the plasminogen activation system, thereby increasing the risk of HT after IVT (31). In addition, platelets form platelet-fibrin aggregates around the thrombus, leading to contraction of the cerebral thrombus and thereby affecting the severity and prognosis of AIS (32). Mean platelet volume (MPV) and PDW together provide comprehensive information on platelet production, activation, and functional status. Compared to MPV, PDW is a more sensitive marker of variation in platelet volume, providing more comprehensive information on platelet activation and effectively indicating the severity of the disease (19, 33, 34). In addition, Chen et al. (35) found a clear association between PDW and the severity of stroke. However, PDW may be affected by platelet count (36), and some scholars believe that PDW should not be used alone as a direct indicator of thromboembolic diseases (37). The study by Lin et al. (21) showed that the AUCs for predicting 120-day mortality in severe burn patients using PDW, PLT, and PPR on the third day post-burn were 0.792, 0.782, and 0.816, respectively.
Therefore, as a novel biological indicator, PPR, by reflecting both the distribution width and platelet count, can more comprehensively reflect platelet function and predict the risk of HT occurrence. In this study, the baseline PPR lacked statistical significance, but p-values can be influenced by sample size and may not accurately reflect the true relationship between variables and outcomes, potentially leading to bias in variable selection. This is especially true when p-values are used as the sole criterion for feature selection, which may result in the inclusion of variables with no practical significance. In contrast, regularization methods like LASSO regression help address this issue by penalizing coefficients and automatically selecting variables, effectively removing less important ones and improving the model’s generalizability (38). Given that PPR was included in the LASSO model and ranked third in feature importance, its role as an independent impact factor is justifiable, as it demonstrated a certain effect in predicting HT. Future studies should further explore the predictive ability of PPR in different patient populations and evaluate its application value in clinical practice.

In our study, the key factors influencing the prediction results included malignant tumors and diabetes. Malignancy had a certain impact on the prediction results, which may be due to the higher levels of coagulation, platelet, and endothelial dysfunction markers, as well as more circulating tumor microemboli, in stroke patients with malignant tumors (39). However, patients with malignant tumors do not necessarily develop HT (40–42), and other variables still need to be explored for comprehensive evaluation. In addition, diabetes may have an important influence on the prediction results in our model (ranked 2nd in feature importance), which may be due to the combined effect of multiple mechanisms such as endothelial dysfunction, changes in the coagulation and fibrinolytic systems, abnormal platelet function, and direct tissue damage caused by hyperglycemia in patients with diabetes (43). Previous studies have shown that admission glucose performs better than diabetes in predicting adverse outcomes in AIS patients (44, 45). It should be noted that chronic hyperglycemia and stress hyperglycemia differ in physiological mechanism (46). Therefore, considering diabetes or admission glucose alone may not fully reflect the actual condition of patients. Future studies should further explore biomarkers and clinical parameters that reflect overall blood glucose levels and construct more accurate predictive models. The inclusion of age, OTT, and baseline NIHSS score in the model corroborates the findings of previous studies (5, 47–49).

Imaging variables pose certain challenges in predicting HT risk after IVT. Although IVT can effectively dissolve thrombi and restore cerebral blood flow, it also increases the risk of intracranial hemorrhage (50). Therefore, when deciding whether to perform IVT in AIS patients, the risk of HT is one of the primary factors that clinicians should consider. IVT is suitable for AIS patients within 3 to 4.5 h after onset (51), which requires rapid and accurate evaluation by clinicians. Imaging plays a pivotal role in the rapid diagnosis and treatment of ischemic stroke. Head computed tomography (CT) can quickly and accurately identify cerebral hemorrhage; CT angiography (CTA) can locate ischemic blood vessels; CT perfusion (CTP) imaging can detect the ischemic penumbra through automated post-processing; MRI and diffusion-weighted imaging (DWI) can clarify the diagnosis of AIS and the extent of cerebral infarction (52). Because CT and CTA involve radiation exposure and contrast agents, and CTP and MRI take longer, cost more, and are less accessible, clinicians may choose different imaging protocols, which results in different imaging variables. MRI and CTP may have moderate diagnostic performance in predicting HT in patients with AIS (53, 54), but current clinical evidence is insufficient to support the use of these imaging parameters in predicting HT (55). Therefore, challenges still exist in incorporating imaging variables into predictive models to assess the risk of HT after thrombolysis. Moreover, multi-center studies have shown that early active treatment and dehydration therapy for asymptomatic HT patients can reduce the risk of hematoma enlargement and death (4). This study aimed to develop a model based on ML combined with laboratory indicators that enables rapid and accurate prediction of HT following IVT.
This will assist clinicians in making informed decisions regarding the administration of thrombolytic therapy and facilitate the early identification of asymptomatic HT patients after IVT, so as to prevent them from developing PH. By comprehensively analyzing various clinical and laboratory data, and combining with ML algorithms, the predictive model developed in this study has been able to efficiently and accurately evaluate the risk of HT (optimal AUC > 0.9). Future studies will focus on standardizing multiple imaging variables to further optimize the predictive ability of HT.

4.1 Limitations

This study used single-center data, lacked external validation, and adopted a retrospective design, which could limit the generalizability and accuracy of the findings. Additionally, HT was assessed as a whole and was not further divided into its subtypes, HI and PH. PH is generally associated with more severe outcomes and poorer prognosis than HI, making it a critical factor for risk stratification and prediction in ischemic stroke patients. Future studies should aim to distinguish between HI and PH to better predict and manage the more severe forms of hemorrhagic transformation. Incorporating multi-center data and adopting prospective designs would also improve the generalizability and accuracy of predictive models.

5 Conclusion

In this study, age, diabetes, malignancy, OTT, baseline NIHSS score, and PPR were independent predictors of HT. Among the models constructed with four ML algorithms, the LR model performed best and was selected as the final HT prediction model. This model predicts HT after IVT accurately, supporting clinicians in promptly assessing the risk of hemorrhage after thrombolysis and in identifying patients with asymptomatic HT after IVT.

Data availability statement

The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.

Ethics statement

The studies involving humans were approved by the First Affiliated Hospital of Kunming Medical University (Approval number: 2022-L-157). The studies were conducted in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required from the participants or the participants’ legal guardians/next of kin in accordance with the national legislation and institutional requirements.

Author contributions

XL: Conceptualization, Formal analysis, Investigation, Methodology, Writing – original draft, Writing – review & editing. CL: Formal analysis, Investigation, Resources, Supervision, Writing – review & editing. HX: Conceptualization, Writing – original draft, Writing – review & editing. CY: Methodology, Writing – original draft. YZ: Methodology, Writing – original draft. WJ: Conceptualization, Funding acquisition, Resources, Supervision, Writing – review & editing.

Funding

The author(s) declare that financial support was received for the research and/or publication of this article. This work was supported by Grant No. 202201AY070001-049 from the Joint Special Project on Applied Basic Research of Yunnan Provincial Science and Technology Department & Kunming Medical University and by Grant No. YNLCYXZX2023300077 from the Yunnan Clinical Center for Neurological and Cardiovascular Diseases (corrected from the previously listed YWLCYXZX2023300077).

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

1. Wardlaw, JM, Murray, V, Berge, E, and del Zoppo, GJ. Thrombolysis for acute ischaemic stroke. Cochrane Database Syst Rev. (2014). doi: 10.1002/14651858.CD000213

2. Honig, A, Percy, J, Sepehry, AA, Gomez, AG, Field, TS, and Benavente, OR. Hemorrhagic transformation in acute ischemic stroke: a quantitative systematic review. J Clin Med. (2022) 11:1162. doi: 10.3390/jcm11051162

3. Yaghi, S, Willey, JZ, Cucchiara, B, Goldstein, JN, Gonzales, NR, Khatri, P, et al. Treatment and outcome of hemorrhagic transformation after intravenous alteplase in acute ischemic stroke: a scientific statement for healthcare professionals from the American Heart Association/American Stroke Association. Stroke. (2017) 48:e343–61. doi: 10.1161/str.0000000000000152

4. Liu, J, Wang, Y, Li, J, Zhang, S, Wu, Q, Wei, C, et al. Treatment and outcomes of thrombolysis related hemorrhagic transformation: a multi-center study in China. Front Aging Neurosci. (2022) 14:847648. doi: 10.3389/fnagi.2022.847648

5. Qiu, L, Fu, F, Zhang, W, He, J, Zhan, Z, and Cheng, Z. Prevalence, risk factors, and clinical outcomes of remote intracerebral hemorrhage after intravenous thrombolysis in acute ischemic stroke: a systematic review and meta-analysis. J Neurol. (2023) 270:651–61. doi: 10.1007/s00415-022-11414-2

6. Zhong, K, An, X, Kong, Y, and Chen, Z. Predictive model for the risk of hemorrhagic transformation after rt-PA intravenous thrombolysis in patients with acute ischemic stroke: a systematic review and meta-analysis. Clin Neurol Neurosurg. (2024) 239:108225. doi: 10.1016/j.clineuro.2024.108225

7. Lei, YS, Li, H, Lei, JY, Li, SX, and Li, DF. Effect of intravenous thrombolysis in acute ischemic stroke patients with cerebral microbleeds and analysis of risk factors for hemorrhagic transformation. Eur Rev Med Pharmacol Sci. (2022) 26:779–86. doi: 10.26355/eurrev_202202_27986

8. Xie, J, Pang, C, Yu, H, Zhang, W, Ren, C, and Deng, B. Leukocyte indicators and variations predict worse outcomes after intravenous thrombolysis in patients with acute ischemic stroke. J Cereb Blood Flow Metab. (2023) 43:393–403. doi: 10.1177/0271678x221142694

9. Huang, P, and Yi, XY. Predictive role of admission serum glucose, baseline NIHSS score, and fibrinogen on hemorrhagic transformation after intravenous thrombolysis with alteplase in acute ischemic stroke. Eur Rev Med Pharmacol Sci. (2023) 27:9710–20. doi: 10.26355/eurrev_202310_34141

10. Chen, X, Yang, X, Xu, X, Fu, F, and Huang, X. Higher serum bilirubin levels are associated with hemorrhagic transformation after intravenous thrombolysis in acute ischemic stroke. Front Aging Neurosci. (2023) 15:1159102. doi: 10.3389/fnagi.2023.1159102

11. Tian, Y, Xie, Q, You, J, Yang, S, Zhao, H, and Song, Y. Lower uric acid level may be associated with hemorrhagic transformation after intravenous thrombolysis. Neurol Sci. (2022) 43:3113–20. doi: 10.1007/s10072-021-05760-8

12. Krzywinski, M, and Altman, N. Multiple linear regression: when multiple variables are associated with a response, the interpretation of a prediction equation is seldom simple. Nat Methods. (2015) 12:1103–4. doi: 10.1038/nmeth.3665

13. Deo, RC. Machine learning in medicine. Circulation. (2015) 132:1920–30. doi: 10.1161/circulationaha.115.001593

14. Rawish, E, Nording, H, Münte, T, and Langer, HF. Platelets as mediators of Neuroinflammation and thrombosis. Front Immunol. (2020) 11:548631. doi: 10.3389/fimmu.2020.548631

15. Burnouf, T, and Walker, TL. The multifaceted role of platelets in mediating brain function. Blood. (2022) 140:815–27. doi: 10.1182/blood.2022015970

16. Medcalf, RL, and Keragala, CB. The fibrinolytic system: mysteries and opportunities. HemaSphere. (2021) 5:e570. doi: 10.1097/hs9.0000000000000570

17. Jin, T, Chen, D, Chen, Z, Feng, D, Zheng, M, Wang, P, et al. Post-thrombolytic D-dimer elevation predicts symptomatic intracranial hemorrhage and poor functional outcome after intravenous thrombolysis in acute ischemic stroke patients. Neuropsychiatr Dis Treat. (2022) 18:2737–45. doi: 10.2147/ndt.S389469

18. Liu, C, Zhang, Y, Niu, L, and Li, J. High level of the fibrin degradation products at admission predicts parenchymal hematoma and unfavorable outcome of ischemic stroke after intravenous thrombolysis. Front Neurol. (2021) 12:797394. doi: 10.3389/fneur.2021.797394

19. Dourado Sotero, F, Calçada, A, Aguiar de Sousa, D, Dias, M, Fonseca, AC, Pinho, EMT, et al. Mean platelet volume is a prognostic marker in acute ischemic stroke patients treated with intravenous thrombolysis. J Stroke Cerebrovasc Dis. (2021) 30:105718. doi: 10.1016/j.jstrokecerebrovasdis.2021.105718

20. Purbiya, P, Golwala, ZM, Manchanda, A, Sreenivas, V, and Puliyel, JM. Platelet distribution width to platelet count ratio as an index of severity of illness. Indian J Pediatr. (2018) 85:10–4. doi: 10.1007/s12098-017-2432-z

21. Lin, JC, Wu, GH, Zheng, JJ, Chen, ZH, and Chen, XD. Prognostic values of platelet distribution width and platelet distribution width-to-platelet ratio in severe burns. Shock. (2022) 57:494–500. doi: 10.1097/shk.0000000000001890

22. Xu, X, Li, C, Wan, T, Gu, X, Zhu, W, Hao, J, et al. Risk factors for hemorrhagic transformation after intravenous thrombolysis in acute cerebral infarction: a retrospective single-center study. World Neurosurg. (2017) 101:155–60. doi: 10.1016/j.wneu.2017.01.091

23. Gui, J, Chen, T, Zhang, J, Cao, Q, Sun, Z, Luo, H, et al. A survey on self-supervised learning: algorithms, applications, and future trends. IEEE Trans Pattern Anal Mach Intell. (2024) 46:9052–71. doi: 10.1109/TPAMI.2024.3415112

24. Wang, F, Huang, Y, Xia, Y, Zhang, W, Fang, K, Zhou, X, et al. Personalized risk prediction of symptomatic intracerebral hemorrhage after stroke thrombolysis using a machine-learning model. Ther Adv Neurol Disord. (2020) 13:1756286420902358. doi: 10.1177/1756286420902358

25. Li, X, Xu, C, Shang, C, Wang, Y, Xu, J, and Zhou, Q. Machine learning predicts the risk of hemorrhagic transformation of acute cerebral infarction and in-hospital death. Comput Methods Prog Biomed. (2023) 237:107582. doi: 10.1016/j.cmpb.2023.107582

26. Bernardo-Castro, S, Sousa, JA, Brás, A, Cecília, C, Rodrigues, B, Almendra, L, et al. Pathophysiology of blood-brain barrier permeability throughout the different stages of ischemic stroke and its implication on hemorrhagic transformation and recovery. Front Neurol. (2020) 11:594672. doi: 10.3389/fneur.2020.594672

27. Whyte, CS, Mitchell, JL, and Mutch, NJ. Platelet-mediated modulation of fibrinolysis. Semin Thromb Hemost. (2017) 43:115–28. doi: 10.1055/s-0036-1597283

28. Stegner, D, Klaus, V, and Nieswandt, B. Platelets as modulators of cerebral ischemia/reperfusion injury. Front Immunol. (2019) 10:2505. doi: 10.3389/fimmu.2019.02505

29. Gremmel, T, Frelinger, AL 3rd, and Michelson, AD. Platelet physiology. Semin Thromb Hemost. (2016) 42:191–204. doi: 10.1055/s-0035-1564835

30. Lu, J, Hu, P, Wei, G, Luo, Q, Qiao, J, and Geng, D. Effect of alteplase on platelet function and receptor expression. J Int Med Res. (2019) 47:1731–9. doi: 10.1177/0300060519829991

31. Napolitano, F, and Montuori, N. Role of plasminogen activation system in platelet pathophysiology: emerging concepts for translational applications. Int J Mol Sci. (2022) 23:6065. doi: 10.3390/ijms23116065

32. Khismatullin, RR, Nagaswami, C, Shakirova, AZ, Vrtková, A, Procházka, V, Gumulec, J, et al. Quantitative morphology of cerebral thrombi related to Intravital contraction and clinical features of ischemic stroke. Stroke. (2020) 51:3640–50. doi: 10.1161/strokeaha.120.031559

33. Cui, MM, Li, N, Liu, X, Yun, ZY, Niu, Y, Zhang, Y, et al. Platelet distribution width correlates with prognosis of non-small cell lung cancer. Sci Rep. (2017) 7:3456. doi: 10.1038/s41598-017-03772-z

34. Izzi, B, Gialluisi, A, Gianfagna, F, Orlandi, S, De Curtis, A, Magnacca, S, et al. Platelet distribution width is associated with P-selectin dependent platelet function: results from the Moli-family cohort study. Cells. (2021) 10:2737. doi: 10.3390/cells10102737

35. Chen, Z, He, Y, Su, Y, Sun, Y, Zhang, Y, and Chen, H. Association of inflammatory and platelet volume markers with clinical outcome in patients with anterior circulation ischaemic stroke after endovascular thrombectomy. Neurol Res. (2021) 43:503–10. doi: 10.1080/01616412.2020.1870359

36. Pecci, A, and Balduini, CL. Inherited thrombocytopenias: an updated guide for clinicians. Blood Rev. (2021) 48:100784. doi: 10.1016/j.blre.2020.100784

37. Yalcinkaya, E, Bugan, B, Celik, M, Yasar, S, and Gursoy, E. Platelet distribution width should not be used alone as a direct indicator of thromboembolic disorders. Angiology. (2014) 65:65. doi: 10.1177/0003319713496629

38. Feng, G, Xu, H, Wan, S, Wang, H, Chen, X, Magari, R, et al. Twelve practical recommendations for developing and applying clinical predictive models. Innov Med. (2024) 2:100105-1. doi: 10.59717/j.xinn-med.2024.100105

39. Navi, BB, Sherman, CP, Genova, R, Mathias, R, Lansdale, KN, LeMoss, NM, et al. Mechanisms of ischemic stroke in patients with cancer: a prospective study. Ann Neurol. (2021) 90:159–69. doi: 10.1002/ana.26129

40. Huang, S, Lu, X, Tang, LV, and Hu, Y. Efficacy and safety of intravenous thrombolysis for acute ischemic stroke in cancer patients: a systemic review and meta-analysis. Am J Transl Res. (2020) 12:4795–806. doi: 10.1007/s11739-023-03312-w

41. Rael, S, Webb, M, Brown, RD Jr, Ruff, MW, Keser, Z, and Sener, U. Safety of intravenous thrombolysis for ischemic stroke in patients with hematologic malignancies: a single institution experience. J Stroke Cerebrovasc Dis. (2023) 32:107294. doi: 10.1016/j.jstrokecerebrovasdis.2023.107294

42. Sobolewski, P, Brola, W, Szczuchniak, W, Fudala, M, and Sobota, A. Safety of intravenous thrombolysis for acute ischaemic stroke including concomitant neoplastic disease sufferers - experience from Poland. Int J Clin Pract. (2015) 69:666–73. doi: 10.1111/ijcp.12586

43. Kautzky-Willer, A, Harreiter, J, and Pacini, G. Sex and gender differences in risk, pathophysiology and complications of type 2 diabetes mellitus. Endocr Rev. (2016) 37:278–316. doi: 10.1210/er.2015-1137

44. Desilles, JP, Meseguer, E, Labreuche, J, Lapergue, B, Sirimarco, G, Gonzalez-Valcarcel, J, et al. Diabetes mellitus, admission glucose, and outcomes after stroke thrombolysis: a registry and systematic review. Stroke. (2013) 44:1915–23. doi: 10.1161/strokeaha.111.000813

45. Pajo, AT, Diestro, JDB, Espiritu, AI, Dmytriw, AA, Enriquez-Marulanda, A, Sarmiento, RJC, et al. Thrombolysis outcomes in patients with diabetes and previous stroke: a meta-analysis. Can J Neurol Sci. (2020) 47:486–93. doi: 10.1017/cjn.2020.63

46. Marik, PE, and Bellomo, R. Stress hyperglycemia: an essential survival response! Crit Care. (2013) 17:305. doi: 10.1186/cc12514

47. Berge, E, Whiteley, W, Audebert, H, De Marchis, GM, Fonseca, AC, Padiglioni, C, et al. European stroke organisation (ESO) guidelines on intravenous thrombolysis for acute ischaemic stroke. Eur Stroke J. (2021) 6:I-lxii. doi: 10.1177/2396987321989865

48. Yang, M, Zhong, W, Zou, W, Peng, J, and Tang, X. A novel nomogram to predict hemorrhagic transformation in ischemic stroke patients after intravenous thrombolysis. Front Neurol. (2022) 13:913442. doi: 10.3389/fneur.2022.913442

49. Iancu, A, Buleu, F, Chita, DS, Tutelca, A, Tudor, R, and Brad, S. Early hemorrhagic transformation after reperfusion therapy in patients with acute ischemic stroke: analysis of risk factors and predictors. Brain Sci. (2023) 13:840. doi: 10.3390/brainsci13050840

50. Phipps, MS, and Cronin, CA. Management of acute ischemic stroke. BMJ. (2020) 368:l6983. doi: 10.1136/bmj.l6983

51. Warner, JJ, Harrington, RA, Sacco, RL, and Elkind, MSV. Guidelines for the early Management of Patients with Acute Ischemic Stroke: 2019 update to the 2018 guidelines for the early Management of Acute Ischemic Stroke. Stroke. (2019) 50:3331–2. doi: 10.1161/strokeaha.119.027708

52. Czap, AL, and Sheth, SA. Overview of imaging modalities in stroke. Neurology. (2021) 97:S42–S51. doi: 10.1212/wnl.0000000000012794

53. Suh, CH, Jung, SC, Cho, SJ, Woo, DC, Oh, WY, Lee, JG, et al. MRI for prediction of hemorrhagic transformation in acute ischemic stroke: a systematic review and meta-analysis. Acta Radiol. (2020) 61:964–72. doi: 10.1177/0284185119887593

54. Suh, CH, Jung, SC, Cho, SJ, Kim, D, Lee, JB, Woo, DC, et al. Perfusion CT for prediction of hemorrhagic transformation in acute ischemic stroke: a systematic review and meta-analysis. Eur Radiol. (2019) 29:4077–87. doi: 10.1007/s00330-018-5936-7

55. Ande, SR, Grynspan, J, Aviv, RI, and Shankar, JJS. Imaging for predicting hemorrhagic transformation of acute ischemic stroke-a narrative review. Can Assoc Radiol J. (2022) 73:194–202. doi: 10.1177/08465371211018369

Glossary

HT - Hemorrhagic transformation

AIS - acute ischemic stroke

IVT - intravenous thrombolytic therapy

ML - machine learning

PPR - platelet distribution width to platelet count ratio

AUC - area under the curve

ROC - receiver operating characteristic

OTT - onset to treatment time

HI - hemorrhagic infarction

PH - parenchymal hemorrhage

PDW - platelet distribution width

WHO - World Health Organization

BMI - body mass index

NIHSS - National Institutes of Health Stroke Scale

BNP - brain natriuretic peptide

VIF - variance inflation factor

SD - standard deviation

MPV - Mean platelet volume

CT - computed tomography

CTA - CT angiography

CTP - CT perfusion

DWI - diffusion-weighted imaging

Keywords: hemorrhagic transformation, intravenous thrombolytic therapy, stroke, platelet distribution width-to-count ratio, machine learning

Citation: Li X, Lei C, Xu H, Yuan C, Zhou Y and Jiang W (2025) Prediction of hemorrhagic transformation after thrombolysis based on machine learning models combined with platelet distribution width-to-count ratio. Front. Neurol. 16:1466543. doi: 10.3389/fneur.2025.1466543

Received: 22 July 2024; Accepted: 16 September 2025;
Published: 02 October 2025.

Edited by:

Jean-Claude Baron, University of Cambridge, United Kingdom

Reviewed by:

Kais Gadhoumi, Duke University, United States
Yao Yu, The Affiliated Hospital of Qingdao University, China

Copyright © 2025 Li, Lei, Xu, Yuan, Zhou and Jiang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Wen Jiang, arainjw@126.com

^†These authors have contributed equally to this work and share first authorship
