ORIGINAL RESEARCH article

Front. Neurol.

Sec. Stroke

Comparison of logistic regression and machine learning methods for predicting early neurological deterioration after thrombolysis in patients with mild stroke

  • Dongyang Hospital of Wenzhou Medical University, Dongyang, China

The final, formatted version of the article will be published soon.

Abstract

Background: We aimed to identify risk factors for early neurological deterioration after thrombolysis in patients with mild stroke. We established machine learning and logistic regression models and compared them, with the goal of identifying early those patients with mild stroke who still experience early neurological deterioration after thrombolysis. Early identification can alert the physician so that clinical remedial measures can be prepared in advance.
Methods: We studied patients with mild stroke who underwent thrombolysis in the emergency department from April 1, 2017 to April 1, 2024. Four common machine learning methods, Extreme Gradient Boosting (XGBoost), K-Nearest Neighbors (KNN), Random Forest (RF), and Support Vector Machine (SVM), were used to build predictive models from the data of eligible participants. The imbalanced data were preprocessed using four different methods. Pairing each machine learning method with each of the four preprocessing schemes gave 16 workflows, from which we selected the optimal machine learning model. Additionally, logistic regression models were established using five methods, and the optimal one was selected.
Results: A total of 625 patients with mild stroke were included, of whom 80 experienced early neurological deterioration after thrombolysis. Using 10-fold stratified cross-validation and a simulated annealing algorithm, the SVM model with data balanced by upsampling was selected as the optimal model among the 16 workflows. The area under the curve (AUC) of the SVM model was 0.889 (95% CI: 0.853, 0.926) in the training set and 0.859 in the test set processed by upsampling. Among the five logistic regression models, model m4 was optimal, with an AUC of 0.848 in the test set.
Conclusion: We explored the risk factors for early neurological deterioration after thrombolysis in patients with mild stroke. The logistic regression and machine learning models showed comparable performance in this single-center retrospective dataset.
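
The abstract names three building blocks of the evaluation: stratified 10-fold cross-validation, upsampling of the minority class, and AUC. The sketch below illustrates them in plain Python, under stated assumptions. The labels are synthetic and only mirror the reported class sizes (80 of 625), and the helper names `stratified_folds`, `upsample`, and `auc` are hypothetical. Here upsampling is applied only to the training portion of a split, which is one common choice and not necessarily the procedure used in the study.

```python
import random
from collections import Counter

def stratified_folds(labels, k, seed=0):
    """Assign each index to one of k folds, keeping class ratios similar."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin across folds.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds

def upsample(indices, labels, seed=0):
    """Randomly duplicate minority-class indices until classes are balanced."""
    rng = random.Random(seed)
    by_class = {}
    for i in indices:
        by_class.setdefault(labels[i], []).append(i)
    target = max(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced.extend(members)
        # Sample with replacement to make up the shortfall (zero for the majority).
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced

def auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Synthetic labels (assumption): 80 deteriorated, 545 not, as in the reported counts.
labels = [1] * 80 + [0] * 545
folds = stratified_folds(labels, k=10)

# Use fold 0 as the held-out part and balance only the remaining nine folds.
test_idx = folds[0]
train_idx = [i for fold in folds[1:] for i in fold]
balanced_train = upsample(train_idx, labels)

train_counts = Counter(labels[i] for i in balanced_train)
test_counts = Counter(labels[i] for i in test_idx)
```

In a full evaluation, a classifier would be fitted on `balanced_train` for each of the ten splits and its predicted scores on the held-out fold passed to `auc`; the model and its tuning are left out here because the abstract does not specify them.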

Keywords

Early neurological deterioration, intravenous thrombolysis, machine learning, mild ischemic stroke, prediction model

Received

15 September 2025

Accepted

18 February 2026

Copyright

© 2026 Lou, Zhang, Li and Xu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Dongjuan Xu

Disclaimer

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.
