- 1Second Clinical Medical College, Heilongjiang University of Chinese Medicine, Harbin, Heilongjiang Province, China
- 2Second Department of Rehabilitation, The First Affiliated Hospital of Heilongjiang University of Chinese Medicine, Harbin, Heilongjiang Province, China
- 3First Department of Pediatrics, The First Affiliated Hospital of Heilongjiang University of Chinese Medicine, Harbin, Heilongjiang Province, China
- 4Fourth Department of Acupuncture and Moxibustion, The Second Affiliated Hospital of Heilongjiang University of Chinese Medicine, Harbin, Heilongjiang Province, China
Background: Malnutrition is a critical concern associated with increased mortality rates and adverse outcomes in adults with stroke undergoing subacute rehabilitation. Despite its clinical significance, predictive tools for assessing malnutrition risk in this population remain limited. This study aimed to develop and validate an interpretable machine learning (ML) model to predict malnutrition risk among stroke patients during subacute rehabilitation.
Methods: This multicenter study comprised a development cohort of 802 patients from a single institution, which was randomly split into training and testing sets at a 7:3 ratio. An external validation cohort of 345 patients was recruited from an independent hospital. Feature selection was conducted using Least Absolute Shrinkage and Selection Operator (LASSO) regression combined with the Boruta algorithm. Eight ML models—Logistic Regression (LR), Random Forests (RF), Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LGBM), Support Vector Machines (SVM), k-Nearest Neighbors (KNN), Neural Network (NNet), and CatBoost (CAT)—were trained using five-fold cross-validation. These models were evaluated in terms of discrimination, calibration curves, and decision curve analysis (DCA). Model interpretability was assessed via Shapley Additive Explanations (SHAP) analysis.
Results: The CAT algorithm exhibited superior predictive performance in the training and testing sets, achieving areas under the receiver operating characteristic curve (AUC) of 0.848 (95% CI: 0.817–0.879) and 0.806 (95% CI: 0.752–0.861), respectively. Calibration metrics underscored the model’s robustness, and DCA emphasized its clinical utility. External validation further corroborated the generalizability of the CAT model, demonstrating an AUC of 0.772 (95% CI: 0.723–0.820). SHAP analysis identified age, handgrip strength, and Barthel Index (BI) score as the most significant predictors of malnutrition.
Conclusion: This study successfully developed and validated an ML model for efficiently screening malnutrition risk in patients with subacute stroke. The interpretable CAT-based model serves as a clinically actionable tool, enabling early stratification of malnutrition risk in subacute stroke patients. This facilitates the implementation of targeted nutritional interventions and personalized rehabilitation strategies, potentially improving outcomes in this vulnerable population.
Introduction
Stroke remains a leading cause of mortality and long-term disability worldwide, placing significant socioeconomic burdens on healthcare systems and communities (1). Post-stroke malnutrition, a prevalent yet frequently overlooked complication, affects 19–72% of patients upon admission to rehabilitation (2). This condition is strongly associated with poor functional recovery, an increased risk of stroke-associated pneumonia, prolonged hospitalization, higher mortality rates, and elevated healthcare costs (3). The subacute phase of stroke recovery, spanning from onset to 3 months post-stroke, represents a critical therapeutic window during which neuroplasticity peaks and targeted interventions can maximize functional recovery (4). Paradoxically, this period is also marked by increased vulnerability to malnutrition due to complications, metabolic dysregulation, and impaired self-care capacity (5). Post-stroke malnutrition results from a variety of etiological pathways. Dysphagia, affecting up to 50% of stroke survivors, directly impairs nutritional intake and is a primary cause of malnutrition (6). Furthermore, poor functional status—manifested by hemiparesis, reduced mobility, depression, post-stroke dementia, or systemic inflammation—severely disrupts nutritional balance (7). Malnutrition also exacerbates sarcopenia, a progressive decline in muscle mass and strength that impairs rehabilitation effectiveness and functional outcomes (8). Despite its clinical significance, malnutrition in subacute stroke patients remains systematically underrecognized, primarily due to the absence of standardized assessment protocols (9).
Identifying malnutrition is crucial for implementing effective nutritional interventions during the subacute rehabilitation phase (4). However, current nutritional assessment methods exhibit significant limitations, primarily due to the absence of a universally accepted definition of malnutrition and the lack of a gold standard for evaluating nutritional status (10). Although the Global Leadership Initiative on Malnutrition (GLIM) criteria recently established a global consensus on diagnostic criteria for malnutrition in adults (11), they provide standardized phenotypic and etiologic diagnostic criteria that are impractical for rapid assessment in acute and subacute stroke settings and fail to proactively assess malnutrition risk (12). Furthermore, the dynamic nature of post-stroke recovery, during which the critical subacute rehabilitation window coincides with peak malnutrition risk, necessitates tools that offer early, actionable insights (2, 13). Machine learning (ML), a subset of artificial intelligence, focuses on developing algorithms that autonomously improve their performance through experience (14). Over the past few decades, ML has demonstrated proficiency in handling large-scale datasets and has been widely applied in medicine for early detection, diagnosis, and outcome prediction (15). Recently, a growing body of studies has emerged on ML techniques for predicting cerebrovascular events, particularly in stratifying stroke patients to optimize therapeutic interventions and forecast outcomes (16).
Several studies have developed and validated diagnostic models utilizing ML algorithms to predict malnutrition across diverse cohorts (17–19). However, limited research has specifically addressed malnutrition prediction in subacute stroke patients. To fill these gaps, this study proposes an interpretable ML-based predictive model specifically designed for malnutrition risk stratification in subacute post-stroke patients. By integrating variables such as dysphagia, sarcopenia indicators, and functional status metrics, the model seeks to balance predictive performance with interpretability, enabling clinicians to identify high-risk patients and customize interventions during this critical rehabilitation phase. This approach facilitates the implementation of nutrition-focused rehabilitation protocols and addresses the need for rapid-assessment tools across various clinical settings.
Methods
Study design and population
This prospective, multicenter, cross-sectional study consecutively enrolled stroke patients admitted to inpatient rehabilitation departments at two tertiary care hospitals affiliated with Heilongjiang University of Chinese Medicine from April 2021 to December 2024. The model development cohort consisted of patients recruited from the Second Affiliated Hospital between April 2021 and August 2024 (39-month recruitment period), whereas the external validation cohort comprised participants from the First Affiliated Hospital enrolled between March 2023 and December 2024 (22-month recruitment period). Inclusion criteria were: (1) age ≥18 years; (2) first-ever ischemic or hemorrhagic stroke, confirmed via magnetic resonance imaging (MRI) or computed tomography (CT); (3) subacute phase (1–3 months after stroke onset); (4) sufficient cognitive function and language ability to permit assessment completion; (5) voluntary participation in the study with written informed consent; (6) stable vital signs; (7) availability of comprehensive nutritional status evaluations. Exclusion criteria were: (1) diagnosis of transient ischemic attack or subarachnoid hemorrhage; (2) history of prior stroke; (3) active psychiatric disorders; (4) concurrent severe life-threatening diseases (e.g., malignant tumors, end-stage cardiac/renal dysfunction); (5) missing data >20%; (6) receipt of nutritional supplementation within 3 months prior to admission; (7) estimated life expectancy <6 months. The recruitment process is illustrated in Figure 1.
This study adhered to the ethical principles of the Declaration of Helsinki and received approval from the respective institutional review boards: the Second Affiliated Hospital (Approval No. [2021]K179) and the First Affiliated Hospital (Approval No. KY[2023]983) of Heilongjiang University of Chinese Medicine. Written informed consent was obtained from all participants.
Malnutrition screening and diagnosis
This study employed a two-step methodology to identify malnutrition in subacute post-stroke patients, following the GLIM criteria (20). The same outcome definitions and diagnostic protocols were applied uniformly to both cohorts to ensure consistency. Initially, nutritional risk stratification was conducted within 24 h of admission using the Nutritional Risk Screening 2002 (NRS-2002) (21). This validated tool assessed three domains: nutritional status impairment, disease severity, and age-adjusted risk stratification. Patients scoring ≥3 points were classified as nutritionally at-risk and progressed to secondary assessment. The subsequent malnutrition diagnosis followed GLIM standards, requiring fulfillment of at least one phenotypic and one etiological criterion. The phenotypic criteria included: (1) weight loss: >5% within 6 months or >10% beyond 6 months; (2) low BMI: BMI <18.5 kg/m² for patients aged <70 years or <20 kg/m² for those ≥70 years; (3) reduced muscle mass, measured via calf circumference (<34 cm for males and <33 cm for females) (22) or handgrip strength (<28 kg for males or <18 kg for females) (10). The etiological criteria included: (1) reduced nutrient intake: persistent consumption ≤50% of estimated energy requirements for >7 days or sustained intake reduction of any magnitude for ≥14 days; (2) inflammation: acute or chronic inflammatory burden resulting from stroke-related complications, comorbidities, or systemic disease (23).
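The two-step screen-then-diagnose logic above can be sketched as a small decision function. This is an illustrative simplification (the function and argument names are our own, and GLIM severity grading is omitted):

```python
def nrs2002_at_risk(score: int) -> bool:
    """Step 1: NRS-2002 screening; a total score >= 3 flags nutritional risk."""
    return score >= 3

def glim_malnourished(*, nrs_score, weight_loss_pct, months, bmi, age,
                      calf_cm, grip_kg, male, intake_le_50pct_gt7d,
                      reduced_intake_ge14d, inflammation) -> bool:
    """Step 2: GLIM diagnosis, applied only to patients screened at risk;
    requires at least one phenotypic AND one etiologic criterion."""
    if not nrs2002_at_risk(nrs_score):
        return False
    phenotypic = any([
        weight_loss_pct > 5 if months <= 6 else weight_loss_pct > 10,       # weight loss
        bmi < (18.5 if age < 70 else 20),                                   # low BMI
        calf_cm < (34 if male else 33) or grip_kg < (28 if male else 18),   # reduced muscle mass
    ])
    etiologic = intake_le_50pct_gt7d or reduced_intake_ge14d or inflammation
    return phenotypic and etiologic
```

For example, a 65-year-old man with a BMI of 17 kg/m², an NRS-2002 score of 4, and stroke-related inflammation fulfills one phenotypic plus one etiologic criterion and is classified as malnourished.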
Data collection
Demographic and clinical data were systematically gathered upon hospital admission through electronic health records (EHRs) and questionnaires. Demographic characteristics included age (in years), gender, body mass index (BMI, kg/m²), medical payment method, education level, employment status, living status, monthly household income, drinking history, smoking history, comorbidities (hypertension, diabetes, dyslipidemia, cardiovascular disease, digestive disease, chronic kidney disease), eating habits, polypharmacy status (concurrent use of ≥5 medications), functional independence measured by the Barthel Index (BI) score, and handgrip strength (assessed via dynamometer on the non-hemiparetic side). Clinical stroke profiles included days since stroke onset, stroke classification (ischemic/hemorrhagic), lesion localization, stroke severity quantified using the National Institutes of Health Stroke Scale (NIHSS), and post-stroke complications (including pneumonia, dominant arm paresis, anorexia, dysphagia, dysarthria/aphasia). Biological parameters were analyzed from fasting venous blood samples obtained within 24 h of admission: hemoglobin (HGB, g/L), total protein (TP, g/L), albumin (Alb, g/L), total cholesterol (TC, mmol/L), triglycerides (TG, mmol/L), serum creatinine (Scr, μmol/L), fibrinogen (FIB, g/L), D-dimer (μg/mL), high-density lipoprotein cholesterol (HDL-C, mmol/L), low-density lipoprotein cholesterol (LDL-C, mmol/L), C-reactive protein (CRP, mg/L), uric acid (UA, μmol/L), white blood cell count (WBC, 10⁹/L), neutrophils (NEU, 10⁹/L), lymphocytes (LYM, 10⁹/L), platelets (PLA, 10⁹/L), neutrophil-lymphocyte ratio (NLR), platelet-lymphocyte ratio (PLR), and prognostic nutritional index (PNI, calculated as Alb + 5 × LYM).
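The composite indices listed above are simple arithmetic on routine labs. A minimal sketch, with units as in the text (the helper name is our own):

```python
def derived_indices(alb_g_l, lym, neu, pla):
    """Composite indices from routine labs; cell counts in 10^9/L, albumin in g/L."""
    return {
        "NLR": neu / lym,          # neutrophil-lymphocyte ratio
        "PLR": pla / lym,          # platelet-lymphocyte ratio
        "PNI": alb_g_l + 5 * lym,  # prognostic nutritional index (Alb + 5 x LYM)
    }
```

For instance, albumin 40 g/L with lymphocytes 2.0 × 10⁹/L yields a PNI of 50.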
Data preprocessing
Initially, variables (e.g., laboratory indicators) across participating centers were standardized to ensure consistency and robustness in multicenter data integration. Variables with >20% missing values were excluded from the analysis to reduce bias associated with incomplete data. Remaining missing values were imputed using a random forest-based approach, which utilizes feature correlations to iteratively predict plausible replacements while maintaining data structure (24). The procedure ran for a maximum of 10 iterations, with convergence automatically assessed through stabilization of the out-of-bag (OOB) imputation error (relative change tolerance: <0.5% between iterations). Continuous variables were imputed via regression forests, while categorical variables were imputed via classification forests (Supplementary Figure S1). Subsequently, continuous variables were normalized through Z-score transformation. Categorical variables (e.g., gender, comorbidities) were converted into dummy variables using one-hot encoding, ensuring compatibility with ML algorithms while avoiding ordinal assumptions.
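The two transformation steps can be illustrated with stdlib-only helpers; the random-forest imputation itself is omitted, and the function names are our own:

```python
import statistics

def z_score(values):
    """Standardize a continuous variable to zero mean and unit (sample) SD."""
    mu, sd = statistics.mean(values), statistics.stdev(values)
    return [(v - mu) / sd for v in values]

def one_hot(values, categories=None):
    """Expand a categorical variable into 0/1 dummy columns, avoiding any
    ordinal assumption about category order."""
    cats = categories or sorted(set(values))
    return [[int(v == c) for c in cats] for v in values]
```

One-hot encoding of a three-observation gender column, for example, yields one binary column per category rather than a single ordered code.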
Feature selection process
To balance model parsimony with robust feature identification, we employed a hybrid two-step feature selection process that included the Least Absolute Shrinkage and Selection Operator (LASSO) and Boruta algorithms to identify independent risk factors for malnutrition. LASSO regression, a regularization technique, was utilized to mitigate overfitting while identifying parsimonious predictors through L1-penalized optimization (25). The optimal regularization parameter (λ) was determined using 10-fold cross-validation with one standard error (λ.1se) as the selection criterion. Concurrently, the Boruta algorithm iteratively assessed variable importance against permuted “shadow features,” retaining only attributes that demonstrated statistically significant predictive power (26). This approach was chosen to leverage their complementary strengths: LASSO addresses multicollinearity through coefficient shrinkage, while Boruta’s permutation-based ensemble method captures non-linear and interaction effects often overlooked by linear models (27, 28).
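Boruta's shadow-feature principle (keep a feature only if its importance reliably beats the best importance among permuted copies of the data) can be illustrated in a few lines. This toy version substitutes absolute correlation with the outcome for the random-forest importance Boruta actually uses, so it is a conceptual sketch only:

```python
import numpy as np

def boruta_style_screen(X, y, n_rounds=50, seed=0):
    """Keep features whose importance exceeds the best 'shadow' importance
    in more than 90% of permutation rounds."""
    rng = np.random.default_rng(seed)
    def importance(M):
        # stand-in for RF importance: |correlation| of each column with y
        return np.abs([np.corrcoef(M[:, j], y)[0, 1] for j in range(M.shape[1])])
    wins = np.zeros(X.shape[1])
    for _ in range(n_rounds):
        shadows = rng.permuted(X, axis=0)      # permuting rows destroys any real link to y
        best_shadow = importance(shadows).max()
        wins += importance(X) > best_shadow
    return wins / n_rounds > 0.9

# toy check: an informative feature should survive the shadow comparison
rng = np.random.default_rng(1)
x_signal = rng.normal(size=200)
x_noise = rng.normal(size=200)
y = x_signal + 0.1 * rng.normal(size=200)
keep = boruta_style_screen(np.column_stack([x_signal, x_noise]), y)
```

Real Boruta additionally uses statistical tests over the win counts and a tentative category; this sketch only conveys the shadow-comparison idea.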
Model development
Eight ML algorithms were employed to develop the malnutrition prediction framework: Logistic Regression (LR), Random Forests (RF), Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LGBM), Support Vector Machines (SVM), k-Nearest Neighbors (KNN), Neural Network (NNet), and CatBoost (CAT). A detailed summary of the algorithmic rationale is provided in Supplementary Table S1. In this study, the dataset from the model development cohort was randomly divided into training and testing sets in a 7:3 ratio. To address potential overfitting and optimize the predictive models, the final hyperparameters for each model were obtained through five-fold cross-validation combined with grid search (Supplementary Table S2).
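The 7:3 partition can be sketched as follows. Note this shows a stratified variant, an assumption on our part (the text specifies only a random split), so that the malnutrition prevalence is preserved in both subsets:

```python
import random

def stratified_split(labels, test_frac=0.3, seed=42):
    """Split sample indices 7:3 while preserving the outcome ratio per class."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in set(labels):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_frac)
        test += idx[:n_test]
        train += idx[n_test:]
    return sorted(train), sorted(test)
```

Applied to 802 labels with 459 malnourished cases, this yields roughly 70%/30% subsets with matching prevalence.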
Model validation
The predictive performance of the optimal model was rigorously evaluated using both internal and external validation cohorts, employing multiple evaluation metrics, including the area under the receiver operating characteristic curve (AUC) with 95% confidence intervals (CIs), accuracy, precision, recall, specificity, Brier score, log loss, and F1 score. Calibration accuracy was evaluated by comparing observed versus predicted probabilities via calibration curves, while clinical utility was quantified through decision curve analysis (DCA) to estimate net benefit across probability thresholds. Statistical differences in AUC values between candidate models were examined using the Delong test. The final model was selected following a comprehensive comparative analysis of predictive performance, calibration integrity, and clinical applicability.
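Three of these quantities have compact closed forms; a stdlib sketch (the helper names are our own):

```python
import math

def brier(p, y):
    """Brier score: mean squared error between predicted probability and outcome."""
    return sum((pi - yi) ** 2 for pi, yi in zip(p, y)) / len(y)

def log_loss(p, y, eps=1e-15):
    """Negative mean log-likelihood of the observed binary labels."""
    return -sum(yi * math.log(max(pi, eps)) + (1 - yi) * math.log(max(1 - pi, eps))
                for pi, yi in zip(p, y)) / len(y)

def net_benefit(p, y, threshold):
    """DCA net benefit at a decision threshold: true-positive rate minus
    false-positive rate weighted by the threshold odds."""
    n = len(y)
    tp = sum(1 for pi, yi in zip(p, y) if pi >= threshold and yi == 1)
    fp = sum(1 for pi, yi in zip(p, y) if pi >= threshold and yi == 0)
    return tp / n - fp / n * threshold / (1 - threshold)
```

A perfectly calibrated, perfectly discriminating model attains a Brier score of 0; an uninformative 0.5 prediction on every case scores 0.25.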
Model interpretability
Model interpretability was rigorously assessed using the Shapley Additive Explanation (SHAP) framework (29), a method derived from game theory that calculates feature importance through coalitional contribution analysis. SHAP values yield mathematically consistent attribution scores, decomposing each prediction into the marginal impact of individual features while preserving the global model behavior. To operationalize this framework, we implemented two complementary analytical visualizations: (1) a SHAP beeswarm plot, which quantitatively illustrates the impact of features at the population level by aggregating SHAP values across the entire cohort, with each point representing the magnitude of a feature’s directional contribution (positive/negative) for a single observation; (2) individualized waterfall plots were constructed to illustrate how specific features cumulatively modify baseline population risk estimates, ultimately producing final predictions for representative clinical cases.
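SHAP's defining additivity property (each prediction equals the population baseline plus the sum of its feature attributions) can be verified exactly on a linear model, where the attributions have a closed form. This is a generic illustration, not the CAT model itself:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
coef = np.array([2.0, -1.0, 0.5])
f = X @ coef                           # a toy linear "model" on the margin scale

# For a linear model with independent features, the SHAP value of feature j is
# coef_j * (x_j - E[X_j]); the attributions decompose the prediction exactly.
baseline = f.mean()                    # E[f(X)], the expected model output
phi = coef * (X[0] - X.mean(axis=0))   # per-feature attributions for one case
reconstructed = baseline + phi.sum()   # equals f(X[0]) up to floating point
```

Tree-based explainers such as the one used for gradient-boosted models preserve this same decomposition, which is what the waterfall plots visualize case by case.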
Statistical analysis
All statistical analyses were conducted using SPSS Statistics (version 27.0; IBM Corp.) and R software (version 4.4.3; R Foundation for Statistical Computing). Continuous variables characterized by a non-normal distribution were presented as medians with interquartile ranges (IQRs). Inter-group differences in medians were evaluated using the Mann–Whitney U test. Categorical variables were expressed as frequencies and percentages and were subjected to comparisons using the chi-square test. A two-tailed p-value < 0.05 was considered statistically significant.
Results
Baseline characteristics
A total of 802 eligible stroke patients were included in the model development cohort. The median age of the enrolled patients was 64 years (range: 44–85), with 71.4% (573/802) being male. Ischemic stroke was the most prevalent subtype, accounting for 86.5% of cases. Baseline characteristics of the patients are summarized in Table 1. The overall prevalence of malnutrition within the cohort was 57.2% (459/802). Statistical analysis showed significant differences in age (p < 0.001), BMI (p = 0.030), smoking history (p = 0.048), eating habits (p < 0.001), BI score (p < 0.001), handgrip strength (p < 0.001), NIHSS score (p < 0.001), loss of appetite (p = 0.032), dysphagia (p < 0.001), FIB (p = 0.006), and CRP (p = 0.025) between the malnutrition and non-malnutrition groups (Supplementary Table S3). For model development, the cohort was randomly divided into training (n = 562) and testing (n = 240) groups. Demographic and clinical variables exhibited comparable distributions between the two groups, with no significant differences observed (p > 0.05, Supplementary Table S4).
Feature selection
The LASSO regression method was utilized to analyze independent variables associated with malnutrition (Figure 2A). The optimal regularization parameter (λ) was determined using 10-fold cross-validation, yielding a λ.1se value of 0.0434 (Figure 2B). This process identified seven candidate variables that were predictive of malnutrition: age, BI score, dysphagia, eating habits, handgrip strength, NIHSS score, and PNI. Subsequently, the Boruta algorithm was applied, independently selecting seven critical predictors: age, BI score, dysphagia, eating habits, handgrip strength, NIHSS score, and BMI (Figure 2C). A comparative analysis of feature subsets derived from LASSO and Boruta revealed a consensus set of six variables shared by both methods: age, BI score, dysphagia, eating habits, handgrip strength, and NIHSS score (Figure 2D).
Figure 2. Feature screening process. (A) Dynamic plot for LASSO variable selection. (B) Predictor screening using LASSO model with 10-fold cross-validation. The left dashed line indicates the λ value corresponding to the minimum error (λ.min), while the right dashed line represents the λ value within one standard error (λ.1se). (C) Feature identification using Boruta algorithm. The blue boxes represent the minimum, average, and maximum shadow scores. Green boxes indicate important variables, while red ones are rejected. (D) Common predictors identified by both LASSO and Boruta.
Model development and performance evaluation
Eight ML algorithms, including LR, RF, XGBoost, CAT, LGBM, KNN, NNet, and SVM, were employed to develop and optimize malnutrition prediction models. In the training set, the CAT model demonstrated the best overall performance, achieving the highest AUC of 0.848 (95% CI: 0.817–0.879). This was followed by XGBoost (AUC = 0.821; 95% CI: 0.787–0.855) and RF (AUC = 0.798; 95% CI: 0.761–0.835; Figure 3A). Furthermore, the CAT model exhibited superior performance across multiple metrics, including accuracy (0.762), precision (0.785), specificity (0.726), F1 score (0.787), and log loss (0.489; Figures 4A,C; Table 1). In the testing set, the CAT model maintained the highest AUC at 0.806 (95% CI: 0.752–0.861), followed by XGBoost (AUC = 0.783; 95% CI: 0.725–0.841) and LR (AUC = 0.775; 95% CI: 0.716–0.834) models (Figure 3B). Consistently, the CAT model outperformed all other evaluated models, yielding the highest accuracy (0.738), precision (0.793), specificity (0.695), F1 score (0.779), and log loss (0.539; Figures 4B,D; Table 1).
Figure 3. Evaluation of the discriminative ability of the eight ML models. (A,B) ROC curves for the training and testing sets.
Figure 4. Comparison of the performance of eight ML models. (A,B) Line plots comparing the evaluation metrics across eight ML models in the training and testing cohorts. (C,D) Heatmap of evaluation metrics for eight ML models in the training and testing cohorts.
In the training set, the Delong test confirmed that the AUC of the CAT model differed significantly from that of the other models (all p < 0.05; Figure 5A; Supplementary Table S5). Similarly, in the testing set, the CAT model yielded a significantly higher AUC compared to the other models (Figure 5B; Supplementary Table S5).
Figure 5. Comparison of AUC for eight ML models using the Delong test. (A) Comparison within the training set. (B) Comparison within the testing set.
Cross-validation and model stability
To assess the stability of the CAT model, five-fold cross-validation was conducted on the training dataset. Across the five folds, the model yielded AUC values ranging from 0.721 (95% CI: 0.619–0.823) to 0.827 (95% CI: 0.750–0.904), resulting in a mean AUC of 0.763 (95% CI: 0.706–0.820; Supplementary Figure S2; Supplementary Table S6). The minimal variation of the CAT model throughout the cross-validated iterations underscored its reliability.
Calibration curves and decision curve analysis
In the training set, calibration analysis demonstrated strong agreement between predicted and observed malnutrition probabilities across most models. The CAT model exhibited near-ideal calibration, with a Brier score of 0.162 (Figure 6A; Table 1). DCA showed that the CAT model performed the best across the entire threshold range, followed by the XGBoost and NNet models (Figure 6C). In the testing set, calibration curves showed that the CAT model demonstrated excellent agreement between predicted and observed malnutrition probabilities, with the lowest Brier score of 0.182 (Figure 6B; Table 1). DCA further indicated that the CAT model offered the greatest net benefits across a wide range of threshold probabilities, consistently outperforming other models (Figure 6D). Collectively, based on the comprehensive evaluation, the CAT algorithm emerged as the optimal model for predicting malnutrition in subacute post-stroke patients.
Figure 6. Calibration curves and clinical utility of the eight ML models. (A,B) Calibration curves for the training and testing sets. (C,D) DCA curves for the training and testing sets.
External validation
External validation of the CAT model was performed using a prospective cohort (n = 345) from an independent hospital. The baseline characteristics of this cohort are summarized in Supplementary Table S7. Although the AUC of the CAT model decreased slightly in the external validation (AUC = 0.772; 95% CI: 0.723–0.820), it still had the best discriminative capacity (Figure 7A; Supplementary Table S8). Calibration analysis revealed strong agreement between predicted and observed malnutrition probabilities (Figure 7B), characterized by a Brier score of 0.193 (Supplementary Table S8). DCA underscored the model’s clinical applicability, demonstrating sustained net benefit across a wide range of threshold probabilities (Figure 7C). These findings further underscored the CAT model’s strong stability and reliability in predicting malnutrition risk across different datasets.
Figure 7. External validation of the CAT-based predictive model. (A) ROC curve. (B) Calibration curve. (C) DCA.
SHAP for model interpretation
The SHAP framework was employed to interpret the contribution of predictor variables to malnutrition risk predictions generated by the CAT model. The SHAP summary plot quantified feature importance based on mean absolute SHAP values, ranking predictors in descending order of influence: age, handgrip strength, BI score, NIHSS score, dysphagia, and eating habits (Figure 8A). A complementary beeswarm plot further elucidated the directional relationships between individual features and predicted outcomes (Figure 8B). In this visualization, SHAP values (horizontal axis) represent the magnitude and direction of feature effects, with color intensity denoting high (yellow) or low (purple) feature values. Features positioned farther from the neutral SHAP value of zero exhibited stronger associations with malnutrition risk. Specifically, advanced age, increased NIHSS score, reduced handgrip strength, lower BI score, presence of dysphagia, and tube feeding dependency contributed to malnutrition outcomes.
Figure 8. Visual interpretation of CAT-based predictive model by SHAP. (A) SHAP summary plot. (B) SHAP beeswarm plot. Yellow color represents a high feature value for malnutrition, whereas purple represents a low feature value. (C) SHAP waterfall plot for a case of malnutrition. (D) SHAP waterfall plot for a case of non-malnutrition. Yellow indicates positive contributions and red indicates negative impacts.
Case-specific interpretations were generated using SHAP waterfall plots to illustrate individual prediction mechanisms (Figures 8C,D). In these visualizations, yellow arrows indicate features contributing to higher risk (positive SHAP values), whereas red arrows denote protective effects (negative SHAP values). The baseline model output, denoted as E[f(x)], represented the expected risk across the population, whereas f(x) reflected the model’s predicted output for a specific individual. For the malnourished patient (Figure 8C), the model yielded a prediction value of 3.51, substantially exceeding the baseline value of 0. This elevated risk was primarily attributed to three factors: unfavorable BI score (+2.62), tube feeding (+0.909), and the presence of dysphagia (+0.297). Conversely, for the non-malnourished patient (Figure 8D), the prediction value was −0.694, markedly below the baseline. Protective factors contributing to this risk reduction included younger age (−0.398), absence of dysphagia (−0.215), oral feeding (−0.116), a higher BI score (−0.114), and a lower NIHSS score (−0.0869).
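The f(x) values quoted above are on the model's raw margin (log-odds) scale, the default output scale for gradient-boosted classifiers such as CatBoost, so a sigmoid transform shows the implied probabilities. The resulting probability figures are our own arithmetic, not values reported in the text:

```python
import math

def sigmoid(logit):
    """Map a raw margin (log-odds) to a predicted probability."""
    return 1 / (1 + math.exp(-logit))

# the two illustrative cases from the waterfall plots
p_malnourished = sigmoid(3.51)   # margin well above the baseline: high risk
p_well = sigmoid(-0.694)         # margin below the baseline: low risk
```

Under this reading, a margin of 3.51 corresponds to a predicted malnutrition probability near 0.97, and −0.694 to roughly 0.33.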
Discussion
This study conducted a comprehensive analysis of clinical data from 802 subacute post-stroke patients in China to develop and validate an ML-based predictive model for malnutrition risk. To the best of our knowledge, this represents the first interpretable ML framework specifically designed for malnutrition screening in stroke populations. The CAT algorithm was identified as the optimal model following a rigorous comparative evaluation of predictive performance across multiple ML algorithms. This model demonstrated strong predictive performance across the training, testing, and external validation cohorts. To ensure clinical interpretability, SHAP analysis was employed to elucidate the model’s decision-making logic. The SHAP analysis identified six key predictors of malnutrition: age, NIHSS score, handgrip strength, BI score, dysphagia, and eating habits. Regarding clinical translation, future work should prioritize integrating this model into EHR systems via user-friendly digital interfaces. Such integration, augmented by real-time, patient-specific explanations of risk factors, is essential to foster clinician trust and streamline the model’s adoption into routine care protocols.
Previous studies have developed predictive models to assess malnutrition risk in stroke populations. For example, nomograms validated in diverse stroke cohorts have demonstrated satisfactory predictive accuracy (30, 31). However, existing models predominantly focus on acute-phase stroke patients and rely on specialized assessments, potentially limiting their clinical scalability. Furthermore, conventional multivariate regression approaches are susceptible to small-sample bias and often lack generalizability when modeling complex, nonlinear relationships among variables. To address these limitations, ML algorithms that leverage routinely collected clinical data offer a promising avenue for enhancing predictive performance. Recent literature highlights the superiority of ML models over traditional LR for stratifying malnutrition risk across diverse clinical populations (17, 19, 32). Consistent with these advancements, in our study, the CAT model outperformed conventional LR in discriminative accuracy across training, testing, and external validation cohorts. These findings underscore the potential of ML-based tools to facilitate the early identification of high-risk individuals in clinical practice, thereby alleviating the burden on healthcare systems through targeted interventions.
Unlike linear classifiers such as LR, which rely on predefined parametric assumptions, the CAT model inherently addresses feature importance estimation through an iterative, hierarchical splitting mechanism. This process systematically partitions heterogeneous clinical data into interpretable decision pathways while mitigating overfitting through ensemble techniques (33). This adaptability is particularly advantageous for malnutrition prediction, given that multifactorial risk factors (e.g., dysphagia severity, functional status) exhibit nonlinear interdependencies that conventional linear models often fail to capture adequately (19). Performance evaluations demonstrated that the CAT-based model consistently outperformed competing algorithms across multiple aspects—including discrimination, calibration, and clinical utility—in both internal and external validations. These findings highlight the utility of the CAT framework for malnutrition risk stratification in subacute post-stroke care. In contrast to traditional nutritional screening tools such as GLIM and NRS-2002, which are susceptible to variability in patient-reported data and evaluator-dependent biases, the CAT model addresses these limitations by leveraging ML to provide an objective, data-driven evaluation framework. This approach minimizes subjectivity and facilitates timely nutritional interventions in clinical settings.
ML models are frequently criticized for their inherent “black-box” nature, which limits insight into their decision-making processes. To enhance the interpretability of our CAT algorithm, we employed the SHAP framework to elucidate the model’s predictive mechanisms (34). Although traditional logistic regression identified associations between predictors and outcomes, it is constrained by assumptions of linearity and often lacks the granularity required to capture complex feature interactions. In contrast, our SHAP analysis not only validated established predictors—such as age, dysphagia, and handgrip strength—but also quantified the nuanced contributions of additional variables, including NIHSS score, dietary patterns, and BI score. Crucially, SHAP revealed potential non-linear relationships and complex interactions that were overlooked by logistic regression, thereby deepening the understanding of the mechanisms underlying malnutrition in subacute post-stroke patients (35). These insights facilitate individualized risk stratification, providing clinicians with a comprehensive perspective on how various clinical, biochemical, and demographic factors synergistically influence malnutrition risk. For instance, while logistic regression quantified age as an independent linear predictor, SHAP elucidated its dynamic interaction with markers of stroke severity (e.g., elevated NIHSS scores), which multiplicatively amplified risk. By translating complex model outputs into actionable evidence, this approach enhances both predictive accuracy and clinical utility, bridging the divide between algorithmic complexity and clinical applicability.
The interplay of older age, dysphagia, and tube feeding dependency emerged as a critical nexus influencing malnutrition risk in our cohort. Advanced age inherently predisposes patients to nutritional deficits due to age-related physiological changes—including diminished physiological reserve, altered appetite regulation, and reduced metabolic efficiency—which are often compounded by various comorbidities (36). These vulnerabilities are exacerbated in post-stroke contexts, where older adults frequently experience delayed recovery of independent swallowing function (37). Dysphagia, a prevalent sequela of stroke, directly disrupts oral intake and elevates aspiration risk, necessitating compensatory strategies such as texture-modified diets or tube feeding (38). However, while tube feeding ensures caloric delivery in dysphagic patients, delayed initiation or suboptimal calibration of enteral formulas—common in resource-constrained settings—may fail to meet elevated protein-energy demands, particularly in older adults with reduced physiological resilience (39). These observations align with prior studies (9, 40), demonstrating a significant association between older age, dysphagia, and malnutrition. Furthermore, previous literature has consistently identified tube feeding as a significant predictor of malnutrition risk among post-stroke patients (41, 42). Consequently, these findings underscore the importance of regular nutritional monitoring, tailored formula adjustments in tube-fed cohorts, and early dysphagia rehabilitation to mitigate long-term dependency.
The association of elevated NIHSS scores and diminished BI scores with malnutrition underscores the complex interaction between stroke severity, functional dependency, and nutritional compromise. Higher NIHSS scores, reflecting greater neurological impairment, are associated with prolonged immobilization, systemic inflammation, and metabolic dysregulation (43, 44). Concurrently, diminished BI scores, indicative of reduced functional independence, exacerbate malnutrition risk by limiting patients’ ability to self-feed or access adequate nutrition, particularly in settings with insufficient caregiver support (45). This functional dependency may delay or disrupt meal schedules, reduce dietary diversity, and precipitate unintentional weight loss (30, 46). Furthermore, reduced handgrip strength—a marker of sarcopenia and global muscle wasting—serves as both a contributor to and consequence of malnutrition (47). Notably, the predictive value of handgrip strength aligns with its utility as a surrogate for overall nutritional status, reflecting both neuromuscular integrity and protein-energy reserves in stroke patients (5). Consistent with these observations, previous studies utilizing nomogram models have identified dysphagia, BI score, and grip strength as crucial risk factors for malnutrition (30, 31). Similarly, Zheng et al. identified the NIHSS score as an independent risk factor for malnutrition in 774 stroke patients with bulbar paralysis via multiple logistic regression analysis (48). Collectively, these findings emphasize that severe stroke and functional impairment converge to amplify nutritional vulnerability, while sarcopenia mediates and exacerbates this risk.
Clinical implications
The interpretable CAT model offers a robust framework for the proactive and personalized management of malnutrition during the subacute stage of post-stroke care. By integrating model-derived risk stratification with modifiable predictors, clinicians can effectively implement evidence-based strategies to mitigate malnutrition. Upon initial assessment, the model classifies patients into low- or high-risk categories based on predicted probabilities, thereby enabling targeted interventions for high-risk individuals characterized by factors such as advanced age, severe dysphagia, diminished handgrip strength, or dependence on tube feeding. To facilitate clinical integration, the model may be deployed as a web-based calculator, allowing clinicians to input patient data and generate immediate risk assessments. Furthermore, integrating model outputs into electronic health records (EHRs) could trigger real-time alerts for at-risk patients, prompting timely nutritional support. These automated alerts serve to guide multidisciplinary teams—comprising neurologists, dietitians, and rehabilitation specialists—in coordinating nutritional strategies with broader recovery objectives, ensuring the effective translation of model insights into clinical practice. Targeted intervention pathways focus on optimizing modifiable predictors identified via SHAP analysis; these include resistance training and protein supplementation to enhance handgrip strength, early swallowing therapy for dysphagia management, and caloric adjustments tailored to neurological severity. Additionally, for high-risk subgroups, particularly those dependent on tube feeding or subject to prolonged immobilization, immunonutrition is recommended to counteract catabolic states and mitigate infection risks.
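The deployment step described above—mapping a predicted probability to a low- or high-risk category that triggers an action—can be sketched as a small helper. Everything here is hypothetical: the 0.35 threshold is illustrative only (in practice it would be chosen from the decision curve analysis on the validation cohort), and the recommended actions are placeholders for site-specific protocols.

```python
# Hypothetical risk-stratification helper for a web calculator or EHR alert.
# Threshold and actions are illustrative assumptions, not the study's values.
HIGH_RISK_THRESHOLD = 0.35

def stratify(probability: float) -> dict:
    """Map a model's predicted malnutrition probability to a risk tier."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    high = probability >= HIGH_RISK_THRESHOLD
    return {
        "probability": round(probability, 3),
        "risk": "high" if high else "low",
        "action": ("refer to dietitian; begin nutritional monitoring"
                   if high else "routine rescreening"),
    }

print(stratify(0.62))  # a patient who would be flagged for targeted intervention
```

In an EHR integration, a function like this would run on each new model output and raise the real-time alert described above when the high-risk branch fires.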
Limitations
This study has several limitations that warrant consideration. First, the cross-sectional design inherently restricts causal inference, introducing potential selection bias and precluding the establishment of temporal relationships between predictors and malnutrition outcomes. Second, although the model integrates routine clinical parameters, it is essential to acknowledge the omission of critical confounders, including premorbid nutritional status indicators (e.g., pre-existing sarcopenia and frailty), psychological factors (e.g., depression severity and adequacy of caregiver support), and dynamic metabolic biomarkers (e.g., oxidative stress markers). The omission of these variables may obscure significant modifiers of malnutrition risk and introduce residual confounding. Furthermore, due to clinical workflow constraints, gold-standard assessments for sarcopenia—specifically muscle mass and strength measurements via dual-energy X-ray absorptiometry (DEXA) and bioelectrical impedance analysis (BIA), as recommended by the GLIM criteria—were unavailable. Instead, calf circumference and handgrip strength served as surrogate measures, potentially compromising diagnostic accuracy. Consequently, these constraints obscure the delineation of the complex interplay between nutritional status and sarcopenia progression. Addressing these gaps could significantly enhance understanding of malnutrition and sarcopenia development. Third, there are ethical considerations concerning data privacy and the potential for algorithmic bias. The model relies on patient data, necessitating stringent safeguards to protect confidentiality and comply with data protection regulations. Moreover, the use of ML models raises concerns of bias if training data do not adequately represent all subpopulations, which could lead to inequities in health outcomes.
Fourth, although external validation was performed, both the development and validation cohorts were recruited from hospitals affiliated with the same institutional network in the same city. This institutional overlap, combined with geographic and demographic homogeneity, restricts the generalizability of the model to broader healthcare contexts, particularly rural regions or populations with differing cultural, economic, or infrastructural characteristics that influence post-stroke recovery dynamics. Moreover, a key concern is the potential overfitting of the model due to these constraints. Overfitting may occur when the model captures noise specific to the training data rather than underlying patterns, resulting in reduced predictive performance on new unseen datasets. This risk underscores the importance of conducting additional external multicenter validation involving geographically dispersed institutions and heterogeneous patient populations to ensure the model’s robustness and validity across diverse clinical environments.
Conclusion
This study demonstrates that an ML-based approach, specifically utilizing the CAT model, effectively predicts malnutrition risk in subacute post-stroke patients. This model not only exhibited strong predictive performance across both internal and external validations but also identified key clinical determinants of malnutrition, including advanced age, reduced handgrip strength, reliance on tube feeding, elevated NIHSS scores, diminished BI scores, and dysphagia. These findings highlight the potential of ML tools to enhance patient care by providing reliable risk stratification, thereby enabling clinicians to implement timely and targeted interventions. The integration of ML models into clinical workflows facilitates decision-making in complex scenarios, ultimately aiming to improve patient outcomes. Future work should focus on further refining these models and assessing their applicability across diverse clinical settings to enhance their generalizability and clinical utility.
Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
Ethics statement
This study adhered to the ethical principles of the Declaration of Helsinki and received approval from the respective institutional review boards: the Second Affiliated Hospital (Approval No. [2021]K179) and the First Affiliated Hospital (Approval No. KY[2023]983) of Heilongjiang University of Chinese Medicine. The studies were conducted in accordance with the local legislation and institutional requirements. Written informed consent was obtained from all participants.
Author contributions
PS: Investigation, Software, Writing – review & editing, Conceptualization, Methodology, Writing – original draft, Validation, Visualization, Data curation. JL: Methodology, Validation, Data curation, Investigation, Software, Writing – original draft, Visualization. GD: Data curation, Investigation, Validation, Software, Writing – original draft. QS: Data curation, Writing – original draft, Investigation. GL: Conceptualization, Supervision, Writing – review & editing.
Funding
The author(s) declared that financial support was not received for this work and/or its publication.
Acknowledgments
We would like to express our gratitude to the healthcare colleagues who assisted us in collecting the data at our hospitals.
Conflict of interest
The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declared that Generative AI was not used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnut.2025.1692020/full#supplementary-material
References
1. Feigin, VL, Brainin, M, Norrving, B, Martins, SO, Pandian, J, Lindsay, P, et al. World Stroke Organization: global stroke fact sheet 2025. Int J Stroke. (2025) 20:132–44. doi: 10.1177/17474930241308142
2. Huppertz, V, Guida, S, Holdoway, A, Strilciuc, S, Baijens, L, Schols, JMGA, et al. Impaired nutritional condition after stroke from the hyperacute to the chronic phase: a systematic review and meta-analysis. Front Neurol. (2022) 12:780080. doi: 10.3389/fneur.2021.780080
3. Sabbouh, T, and Torbey, MT. Malnutrition in stroke patients: risk factors, assessment, and management. Neurocrit Care. (2018) 29:374–84. doi: 10.1007/s12028-017-0436-1
4. Dobkin, BH, and Carmichael, ST. The specific requirements of neural repair trials for stroke. Neurorehabil Neural Repair. (2016) 30:470–8. doi: 10.1177/1545968315604400
5. Siotto, M, Cocco, C, Guerrini, A, Bertoncini, C, Germanotta, M, Cipollini, V, et al. Nutritional status in subacute post-stroke patients undergoing rehabilitation treatment: a protocol for a prospective observational study. BMC Sports Sci Med Rehabil. (2025) 17:138. doi: 10.1186/s13102-025-01174-7
6. Cohen, DL, Roffe, C, Beavan, J, Blackett, B, Fairfield, CA, Hamdy, S, et al. Post-stroke dysphagia: a review and design considerations for future trials. Int J Stroke. (2016) 11:399–411. doi: 10.1177/1747493016639057
7. Di Vincenzo, O, Pagano, E, Cervone, M, Natale, R, Morena, A, Esposito, A, et al. High nutritional risk is associated with poor functional status and prognostic biomarkers in stroke patients at admission to a rehabilitation unit. Nutrients. (2023) 15:4144. doi: 10.3390/nu15194144
8. Siotto, M, Germanotta, M, Guerrini, A, Pascali, S, Cipollini, V, Cortellini, L, et al. Relationship between nutritional status, food consumption and sarcopenia in post-stroke rehabilitation: preliminary data. Nutrients. (2022) 14:4825. doi: 10.3390/nu14224825
9. Yoon, J, Baek, S, Jang, Y, Lee, CH, Lee, ES, Byun, H, et al. Malnutrition and associated factors in acute and subacute stroke patients with dysphagia. Nutrients. (2023) 15:3739. doi: 10.3390/nu15173739
10. Wong, HJ, Harith, S, Lua, PL, and Ibrahim, KA. Comparison of concurrent validity of different malnutrition screening tools with the global leadership initiative on malnutrition (GLIM) among stroke survivors in Malaysia. Sci Rep. (2023) 13:5189. doi: 10.1038/s41598-023-31006-y
11. Jensen, GL, Cederholm, T, Correia, MITD, Gonzalez, MC, Fukushima, R, Higashiguchi, T, et al. GLIM criteria for the diagnosis of malnutrition: a consensus report from the global clinical nutrition community. JPEN J Parenter Enteral Nutr. (2019) 43:32–40. doi: 10.1002/jpen.1440
12. Liu, P, Tian, H, Ji, T, Zhong, T, Gao, L, and Chen, L. Predictive value of malnutrition, identified via different nutritional screening or assessment tools, for functional outcomes in patients with stroke: a systematic review and meta-analysis. Nutrients. (2023) 15:3280. doi: 10.3390/nu15143280
13. Bernhardt, J, Hayward, KS, Kwakkel, G, Ward, NS, Wolf, SL, Borschmann, K, et al. Agreed definitions and a shared vision for new standards in stroke recovery research: the stroke recovery and rehabilitation roundtable taskforce. Int J Stroke. (2017) 12:444–50. doi: 10.1177/1747493017711816
14. Jordan, MI, and Mitchell, TM. Machine learning: trends, perspectives, and prospects. Science. (2015) 349:255–60. doi: 10.1126/science.aaa8415
15. Jiang, F, Jiang, Y, Zhi, H, Dong, Y, Li, H, Ma, S, et al. Artificial intelligence in healthcare: past, present and future. Stroke Vasc Neurol. (2017) 2:230–43. doi: 10.1136/svn-2017-000101
16. Zu, W, Huang, X, Xu, T, Du, L, Wang, Y, Wang, L, et al. Machine learning in predicting outcomes for stroke patients following rehabilitation treatment: a systematic review. PLoS One. (2023) 18:e0287308. doi: 10.1371/journal.pone.0287308
17. Wang, X, Yang, F, Zhu, M, Cui, HS, Wei, J, Li, J, et al. Development and assessment of assisted diagnosis models using machine learning for identifying elderly patients with malnutrition: cohort study. J Med Internet Res. (2023) 25:e42435. doi: 10.2196/42435
18. Ma, W, Cai, B, Wang, Y, Wang, L, Sun, MW, Lu, CD, et al. Artificial intelligence driven malnutrition diagnostic model for patients with acute abdomen based on GLIM criteria: a cross-sectional research protocol. BMJ Open. (2024) 14:e077734. doi: 10.1136/bmjopen-2023-077734
19. Liu, Y, Xu, Y, Guo, L, Chen, Z, Xia, X, Chen, F, et al. Development and external validation of machine learning models for the early prediction of malnutrition in critically ill patients: a prospective observational study. BMC Med Inform Decis Mak. (2025) 25:248. doi: 10.1186/s12911-025-03082-9
20. Cederholm, T, Jensen, GL, Correia, MITD, Gonzalez, MC, Fukushima, R, Higashiguchi, T, et al. GLIM criteria for the diagnosis of malnutrition - a consensus report from the global clinical nutrition community. Clin Nutr. (2019) 38:1–9. doi: 10.1016/j.clnu.2018.08.002
21. Kondrup, J, Rasmussen, HH, Hamberg, O, and Stanga, Z. Nutritional risk screening (NRS 2002): a new method based on an analysis of controlled clinical trials. Clin Nutr. (2003) 22:321–36. doi: 10.1016/s0261-5614(02)00214-5
22. Chen, LK, Woo, J, Assantachai, P, Auyeung, TW, Chou, MY, Iijima, K, et al. Asian Working Group for Sarcopenia: 2019 consensus update on sarcopenia diagnosis and treatment. J Am Med Dir Assoc. (2020) 21:300–307.e302. doi: 10.1016/j.jamda.2019.12.012
23. Cederholm, T, Jensen, GL, Correia, MITD, Gonzalez, MC, Fukushima, R, Higashiguchi, T, et al. GLIM criteria for the diagnosis of malnutrition - a consensus report from the global clinical nutrition community. J Cachexia Sarcopenia Muscle. (2019) 10:207–17. doi: 10.1002/jcsm.12383
24. Emmanuel, T, Maupong, T, Mpoeleng, D, Semong, T, Mphago, B, and Tabona, O. A survey on missing data in machine learning. J Big Data. (2021) 8:140. doi: 10.1186/s40537-021-00516-9
25. Kang, J, Choi, YJ, Kim, IK, Lee, HS, Kim, H, Baik, SH, et al. LASSO-based machine learning algorithm for prediction of lymph node metastasis in T1 colorectal cancer. Cancer Res Treat. (2021) 53:773–83. doi: 10.4143/crt.2020.974
26. Sun, T, Liu, J, Yuan, H, Li, X, and Yan, H. Construction of a risk prediction model for lung infection after chemotherapy in lung cancer patients based on the machine learning algorithm. Front Oncol. (2024) 14:1403392. doi: 10.3389/fonc.2024.1403392
27. Chen, J, Shen, C, Xue, H, Yuan, B, Zheng, B, Shen, L, et al. Development of an early prediction model for vomiting during hemodialysis using LASSO regression and Boruta feature selection. Sci Rep. (2025) 15:10434. doi: 10.1038/s41598-025-95287-1
28. Shen, C, Chen, G, Chen, Z, You, J, and Zheng, B. The risk prediction model for acute urine retention after perineal prostate biopsy based on the LASSO approach and Boruta feature selection. Front Oncol. (2025) 15:1626529. doi: 10.3389/fonc.2025.1626529
29. Xiang, Y, Yang, F, Yuan, F, Gong, Y, Li, J, Wang, X, et al. Development and validation of a multimodal machine learning model for diagnosing and assessing risk of Crohn's disease in patients with perianal fistula. Aliment Pharmacol Ther. (2025) 61:824–34. doi: 10.1111/apt.18455
30. Tang, R, Guan, B, Xie, J, Xu, Y, Yan, S, Wang, J, et al. Prediction model of malnutrition in hospitalized patients with acute stroke. Top Stroke Rehabil. (2025) 32:173–87. doi: 10.1080/10749357.2024.2377521
31. Liu, L, He, C, Yang, J, Chen, W, Xie, Y, and Chen, X. Development and validation of a nomogram for predicting nutritional risk based on frailty scores in older stroke patients. Aging Clin Exp Res. (2024) 36:112. doi: 10.1007/s40520-023-02689-0
32. Turjo, EA, and Rahman, MH. Assessing risk factors for malnutrition among women in Bangladesh and forecasting malnutrition using machine learning approaches. BMC Nutr. (2024) 10:22. doi: 10.1186/s40795-023-00808-8
33. Lin, SY, Law, KM, Yeh, YC, Wu, KC, Lai, JH, Lin, CH, et al. Applying machine learning to carotid sonographic features for recurrent stroke in patients with acute stroke. Front Cardiovasc Med. (2022) 9:804410. doi: 10.3389/fcvm.2022.804410
34. Hu, J, Xu, J, Li, M, Jiang, Z, Mao, J, Feng, L, et al. Identification and validation of an explainable prediction model of acute kidney injury with prognostic implications in critically ill children: a prospective multicenter cohort study. EClinicalMedicine. (2024) 68:102409. doi: 10.1016/j.eclinm.2023.102409
35. Guo, K, Zhu, B, Zha, L, Shao, Y, Liu, Z, Gu, N, et al. Interpretable prediction of stroke prognosis: SHAP for SVM and nomogram for logistic regression. Front Neurol. (2025) 16:1522868. doi: 10.3389/fneur.2025.1522868
36. Dent, E, Wright, ORL, Woo, J, and Hoogendijk, EO. Malnutrition in older adults. Lancet. (2023) 401:951–66. doi: 10.1016/S0140-6736(22)02612-5
37. Warabi, T, Ito, T, Kato, M, Takei, H, Kobayashi, N, and Chiba, S. Effects of stroke-induced damage to swallow-related areas in the brain on swallowing mechanics of elderly patients. Geriatr Gerontol Int. (2008) 8:234–42. doi: 10.1111/j.1447-0594.2008.00473.x
38. Wu, C, Zhu, X, Zhou, X, Li, C, Zhang, Y, Zhang, H, et al. Intermittent tube feeding for stroke patients with dysphagia: a meta-analysis and systematic review. Ann Palliat Med. (2021) 10:7406–15. doi: 10.21037/apm-21-736
39. Souza, JT, Ribeiro, PW, de Paiva, SAR, Tanni, SE, Minicucci, MF, Zornoff, LAM, et al. Dysphagia and tube feeding after stroke are associated with poorer functional and mortality outcomes. Clin Nutr. (2020) 39:2786–92. doi: 10.1016/j.clnu.2019.11.042
40. Foley, NC, Martin, RE, Salter, KL, and Teasell, RW. A review of the relationship between dysphagia and malnutrition following stroke. J Rehabil Med. (2009) 41:707–13. doi: 10.2340/16501977-0415
41. Wong, HJ, Harith, S, Lua, PL, and Ibrahim, KA. Prevalence and predictors of malnutrition risk among post-stroke patients in outpatient setting: a cross-sectional study. Malays J Med Sci. (2020) 27:72–84. doi: 10.21315/mjms2020.27.4.7
42. Chen, N, Li, Y, Fang, J, Lu, Q, and He, L. Risk factors for malnutrition in stroke patients: a meta-analysis. Clin Nutr. (2019) 38:127–35. doi: 10.1016/j.clnu.2017.12.014
43. Sato, M, Ido, Y, Yoshimura, Y, and Mutai, H. Relationship of malnutrition during hospitalization with functional recovery and postdischarge destination in elderly stroke patients. J Stroke Cerebrovasc Dis. (2019) 28:1866–72. doi: 10.1016/j.jstrokecerebrovasdis.2019.04.012
44. Nozoe, M, Inoue, T, Ogino, T, Okuda, K, and Yamamoto, K. Association between undernutrition on admission and stroke severity in patients with acute stroke. Nutr Neurosci. (2025) 28:1523–31. doi: 10.1080/1028415X.2025.2531344
45. Nozoe, M, Yamamoto, M, Masuya, R, Inoue, T, Kubo, H, and Shimada, S. Prevalence of malnutrition diagnosed with GLIM criteria and association with activities of daily living in patients with acute stroke. J Stroke Cerebrovasc Dis. (2021) 30:105989. doi: 10.1016/j.jstrokecerebrovasdis.2021.105989
46. Lee, LC, Tsai, AC, and Wang, JY. Need-based nutritional intervention is effective in improving handgrip strength and Barthel index scores of older people living in a nursing home: a randomized controlled trial. Int J Nurs Stud. (2015) 52:904–12. doi: 10.1016/j.ijnurstu.2015.01.008
47. Meyer, F, and Valentini, L. Disease-related malnutrition and sarcopenia as determinants of clinical outcome. Visc Med. (2019) 35:282–91. doi: 10.1159/000502867
Keywords: CAT, machine learning, multicenter study, predictive model, risk factors, subacute stroke
Citation: Sun P, Luan J, Duan G, Sun Q and Liu G (2026) Interpretable machine learning-based predictive model for malnutrition in subacute post-stroke patients: an internal and external validation study. Front. Nutr. 12:1692020. doi: 10.3389/fnut.2025.1692020
Edited by:
Mahmoud M. Abulmeaty, King Saud University, Saudi Arabia
Reviewed by:
Laura Avila-Jimenez, Mexican Social Security Institute, Mexico
Batoul Khoundabi, Iranian Red Crescent Society, Iran
Norah Alshammari, King Saud University, Saudi Arabia
Copyright © 2026 Sun, Luan, Duan, Sun and Liu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Genli Liu, liugenli@hljucm.edu.cn