
ORIGINAL RESEARCH article

Front. Oncol., 09 February 2026

Sec. Thoracic Oncology

Volume 16 - 2026 | https://doi.org/10.3389/fonc.2026.1727595

Machine learning model for predicting malnutrition risk in lung cancer patients after thoracoscopic resection: a multi-center study

Tianfeng Chen1†, Ruilan Pan1†, Ling Liang2, Limei Xu3, Mingyue Yang3, Xiujuan Deng2, Ping Wang1*
  • 1School of Nursing, Guangdong Medical University, Dongguan, Guangdong, China
  • 2Department of Thoracic Surgery, Affiliated Hospital of Guangdong Medical University, Zhanjiang, Guangdong, China
  • 3Dongguan School of Clinical Medicine, Guangdong Medical University (Dongguan People’s Hospital), Dongguan, Guangdong, China

Background: Early detection of malnutrition is critical for timely intervention in lung cancer patients undergoing thoracoscopic resection. Existing black-box prediction models lack clinical interpretability, limiting trust and application. The present study was conducted to predict malnutrition risk by establishing an explainable machine learning (ML) model and evaluate the model performance across several sites, so as to develop a web-based application to aid clinical decision-making.

Methods: A retrospective analysis was conducted on 1,134 lung cancer patients who underwent thoracoscopic resection at Dongguan People’s Hospital between October 2021 and October 2024, consisting of a training set (n = 795) and a testing set (n = 339). In addition, an external validation cohort (n = 273) was prospectively enrolled at the Affiliated Hospital of Guangdong Medical University from March to June 2025. Univariate and multivariate analyses were employed to identify independent risk factors for post-operative malnutrition. Eight ML models were constructed using Gradient Boosting Machine (GBM), Neural Network, Logistic Regression, Extreme Gradient Boosting (XGBoost), Random Forest, K-Nearest Neighbors (KNN), Adaptive Boosting (AdaBoost), and Support Vector Machine (SVM). Model performance was assessed by decision curve analysis (DCA) and receiver operating characteristic (ROC) curves. Feature contributions were quantified and model outputs visualized using the SHapley Additive exPlanations (SHAP) method to enhance clinical interpretability. Finally, a web-based risk calculator was created to assist in personalized forecasting.

Results: Among 1,407 total patients, post-operative malnutrition incidence was 11.3% (159/1,407). Multivariate analysis identified seven independent risk factors: albumin (ALB), Nutritional Risk Screening 2002 score, age, intraoperative blood loss, total drainage volume, Basic Activities of Daily Living (BADL) score, and serum potassium (K). The XGBoost model outperformed others, with AUC 0.845 (95% CI: 0.771–0.919) in the testing set and 0.886 (95% CI: 0.841–0.932) in external validation. SHAP analysis clarified the relative importance of risk factors, improving interpretability.

Conclusion: The XGBoost-based explainable ML model effectively predicts malnutrition risk in lung cancer patients after thoracoscopic resection. Integrating high predictive performance with interpretability, it supports clinical risk stratification and personalized nutritional interventions to improve post-operative outcomes. A publicly available web-based calculator facilitates easy clinical application.

1 Introduction

Malnutrition (1) refers to poor nutrition resulting from inadequate energy/nutrient intake or utilization disorders, characterized by abnormal body composition and impaired physiological function, and is an independent risk factor for cancer prognosis (2). Lung cancer remains the leading cause of cancer-related death worldwide (with approximately 2.5 million new cases and 1.8 million deaths annually); China, a country with a high incidence of lung cancer, recorded over 1.06 million new cases and 730,000 deaths in 2022 (3, 4). The challenges are exacerbated by China’s large population, high smoking prevalence, and industrialization. Advances in diagnostics (5, 6) mean that more early-stage patients can undergo surgery. Video-assisted thoracoscopic resection has become the primary treatment modality due to fewer complications and faster recovery (7, 8). However, surgical stress induces hypermetabolism, activates immune-endocrine-metabolic cascades, and superimposes gastrointestinal dysfunction, leading to negative nitrogen balance and malnutrition (9). Postoperative malnutrition affects 17.5%–54.2% of lung cancer patients (10, 11), prolonging mechanical ventilation, increasing complication risk, extending hospital stay, and elevating mortality, while impairing quality of life and increasing family financial burdens (12, 13). Thus, identifying high-risk individuals is crucial for optimizing clinical decision-making.

Existing malnutrition risk prediction models exhibit several key limitations (14, 15): they predominantly employ traditional modeling approaches, rely on single-center datasets, lack external validation, and demonstrate suboptimal predictive performance. Critically, no dedicated models exist for thoracoscopic lung cancer patients, leaving a significant clinical gap for precise screening. Advances in artificial intelligence have enabled machine learning (ML) models to exhibit robust data processing and predictive capabilities, with great application potential in healthcare (16). Unlike traditional statistical models, explainable ML (e.g., SHapley Additive exPlanations (SHAP)) efficiently processes complex clinical data, quantifies each feature’s predictive importance, and converts “black box” decision-making into clinician-comprehensible logic, addressing the core limitation of traditional models in guiding clinical intervention priorities (17).

This study aimed to develop an interpretable ML model for predicting postoperative malnutrition in lung cancer patients undergoing thoracoscopic surgery. Distinct from existing malnutrition risk models and previous ML-based nutritional prediction studies, our work innovatively integrates a lung cancer-specific population undergoing thoracoscopic surgery with SHAP and prospective external validation, addressing the lack of dedicated and interpretable predictive tools for this specific cohort. We developed and validated eight ML models, selected the optimal model, quantified feature contributions via SHAP, and created a user-friendly online calculator. This work is expected to guide clinicians in prioritizing interventions and improving postoperative care quality.

2 Materials and methods

2.1 Design and participants

This study used a mixed design incorporating both retrospective and prospective data, with simple random sampling applied for participant selection. Retrospective data were collected from lung cancer patients who underwent thoracoscopic resection at the Department of Thoracic Surgery, Dongguan People’s Hospital, Guangdong Province, from October 2021 to October 2024. These patients were randomly assigned to a training set and a testing set at a 7:3 ratio. Patients undergoing video-assisted thoracoscopic surgery (VATS) for lung cancer at the Affiliated Hospital of Guangdong Medical University between March and June 2025 were prospectively enrolled to form an external validation set. Class imbalance was not addressed during the data splitting process. Standardized perioperative management protocols were implemented at both centers, including routine preoperative education, standardized analgesia schemes, and unified postoperative dietary guidance, ensuring consistency in patient care across the study sites.

2.1.1 Inclusion criteria

1. Participants aged ≥18 years;

2. Patients with pathologically confirmed lung cancer who underwent thoracoscopic resection (18);

3. Patients with complete clinical and laboratory data available;

4. Participants with signed informed consent provided by themselves or their family members.

2.1.2 Exclusion criteria

1. Patients with concurrent primary malignancies;

2. Patients with preoperative malnutrition, diagnosed per the 2019 Global Leadership Initiative on Malnutrition (GLIM) criteria (19);

3. Patients receiving postoperative radiotherapy, chemotherapy, neoadjuvant therapy, or targeted therapy;

4. Patients converted to open thoracotomy during surgery.

2.1.3 Sample size calculation

In our study, the minimum sample size was determined using the following equation: $n = (Z_{\alpha/2})^2 P(1-P)/\delta^2$, where $Z_{\alpha/2} = 1.96$ [95% confidence interval (95% CI)], $P = 0.273$ [malnutrition incidence, referenced from Nakyeyune et al. (20)], and $\delta = 3\%$ (tolerance error). The required sample size was 942 given a 10% attrition rate. Finally, a total of 1,407 eligible patients were included, comprising 795 cases in the retrospective training set, 339 cases in the retrospective testing set, and 273 cases in the prospective external validation set. Figure 1 illustrates the detailed process of patient enrollment and dataset partitioning.
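As a quick arithmetic check, the calculation can be sketched in a few lines of Python. The function name and the way the attrition rate is applied (dividing by 1 − 0.10) are assumptions made for illustration; the text states only the formula, the inputs, and the result of 942.

```python
import math

def min_sample_size(p, delta, z=1.96, attrition=0.10):
    """Single-proportion sample size, then inflated for expected attrition.

    Assumption: attrition is handled by dividing by (1 - attrition);
    the article does not spell out this step.
    """
    n_raw = (z ** 2) * p * (1 - p) / (delta ** 2)
    n_adjusted = n_raw / (1 - attrition)
    return n_raw, n_adjusted

# Inputs as reported in the text: P = 0.273, delta = 3%, Z = 1.96
n_raw, n_adjusted = min_sample_size(p=0.273, delta=0.03)
print(f"before attrition: {n_raw:.1f}")
print(f"after 10% attrition: {math.ceil(n_adjusted)}")
```

Under this assumption the unadjusted requirement is about 847 patients, and the adjusted figure rounds up to the 942 reported above.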

Figure 1
Flowchart showing the selection and modeling process for patient data analysis. Patients from Dongguan People's Hospital and Guangdong Medical University were divided into development and validation cohorts. Exclusions included preoperative malnutrition, complications, chemotherapy, and incomplete data. The development cohort underwent analysis, selecting seven variables for machine learning models. An optimal model was chosen, followed by performance comparison and interpretation using SHAP.

Figure 1. Flow diagram of patient selection workflow and model development methodology.

2.1.4 Ethics approval

The study was officially approved by the Ethics Committee of Dongguan People’s Hospital, Guangdong Province (Approval No.: KYKT2025-011). It was also filed with the Ethics Committee of the Affiliated Hospital of Guangdong Medical University. With written informed consent provided by each patient or family member, this study was conducted strictly in accordance with the Declaration of Helsinki.

2.2 Malnutrition screening and diagnostic criteria

2.2.1 Nutritional risk screening 2002 (NRS2002)

This study utilized the NRS2002 tool (21) to evaluate the nutritional risk of each patient. The tool comprises three components (impaired nutritional status, disease severity, and age), with a total score ranging from 0 to 7 points; a total score of ≥ 3 indicates nutritional risk.

2.2.2 Malnutrition diagnosis (GLIM consensus, 2019)

In this study, malnutrition was diagnosed according to the two-step method recommended by the GLIM consensus (2019) (19). The first step confirmed the NRS2002 screening result, and the second step combined at least one phenotypic criterion with at least one etiological criterion for the definitive diagnosis. The specific phenotypic and etiological criteria and the decision pathway are detailed in Figure 2. Postoperative nutritional assessments were stratified by clinical status: full NRS2002 screening was conducted weekly for patients in stable condition, and daily for those with clinical changes (e.g., reduced food intake, postoperative complications, or abnormal laboratory results). A mandatory final nutritional assessment was performed for all patients before discharge. The endpoint of postoperative malnutrition was defined as the first confirmed diagnosis of malnutrition via the GLIM criteria during hospitalization; patients without a malnutrition diagnosis by discharge were classified into the non-malnutrition group.

Figure 2
Flowchart for nutritional risk screening and diagnosis. It begins with patient admission and NRS2002 screening within 24-48 hours. If the total is less than three, re-screen weekly. If the total is three or more, evaluate GLIM diagnosis using phenotypic criteria (reduced BMI, weight loss, decreased muscle mass) and etiological criteria (impaired intake, disease burden). If GLIM criteria are met, develop a nutrition plan. Phenotypic details include BMI less than 18.5 kg/m², weight loss over 5% in 6 months, and calf circumference less than 30.5 cm for males. Etiological details include intake less than 50% of requirements.

Figure 2. Flowchart of two-step malnutrition assessment: NRS2002 screening and GLIM diagnostic criteria (2019 consensus).
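The two-step pathway can be expressed as a small decision function. The sketch below is a simplification for illustration only: the class and field names are hypothetical, the cut-offs (NRS2002 ≥ 3, BMI < 18.5 kg/m², weight loss > 5% within 6 months, intake < 50% of requirements) are those listed in the Figure 2 description, and reduced muscle mass is passed in as a flag because only the male calf-circumference cut-off (< 30.5 cm) is given there.

```python
from dataclasses import dataclass

@dataclass
class Assessment:
    """One nutritional assessment (field names are illustrative)."""
    nrs2002: int                      # total NRS2002 score, 0-7
    bmi: float                        # kg/m^2
    weight_loss_pct_6mo: float        # % body weight lost in the last 6 months
    low_muscle_mass: bool             # judged by the assessor (e.g. calf circumference)
    intake_pct_of_requirement: float  # % of energy requirement actually consumed
    disease_burden: bool              # disease burden / inflammation present

def glim_malnutrition(a: Assessment) -> bool:
    """Two-step rule: NRS2002 screening, then GLIM phenotypic + etiological criteria."""
    # Step 1: patients screening below 3 are re-screened later, not diagnosed.
    if a.nrs2002 < 3:
        return False
    # Step 2: at least one phenotypic AND at least one etiological criterion.
    phenotypic = (
        a.bmi < 18.5
        or a.weight_loss_pct_6mo > 5.0
        or a.low_muscle_mass
    )
    etiological = a.intake_pct_of_requirement < 50.0 or a.disease_burden
    return phenotypic and etiological

# Hypothetical patient: at nutritional risk, low BMI, poor intake
patient = Assessment(nrs2002=3, bmi=17.9, weight_loss_pct_6mo=0.0,
                     low_muscle_mass=False, intake_pct_of_requirement=40.0,
                     disease_burden=False)
print(glim_malnutrition(patient))
```

A patient who meets a phenotypic criterion but screens below 3 on NRS2002 is not diagnosed at that visit, matching the "re-screen weekly" branch of the flowchart.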

2.3 Predictor variables

Based on literature review, group discussions, and clinical practice experience, this study collected eligible data using a self-designed “Clinical Data Survey Form for Patients After Thoracoscopic Lung Cancer Resection”. The survey covered the following three aspects:

2.3.1 Demographic and baseline data

Specifically, the demographic and baseline data included patients’ name; gender; age; education level; smoking index (years smoked × cigarettes smoked per day); underlying conditions (e.g., chronic obstructive pulmonary disease, hypertension, diabetes mellitus, and tuberculosis); anthropometric measurements (e.g., height, weight, body mass index [BMI = weight/height²], and calf circumference at 10 cm below the patella); functional status [Barthel Index for activities of daily living (ADL) (22), with a maximum of 100 points and higher scores indicating better functional status]; and NRS2002 score.

2.3.2 Disease- and surgery-related clinical data

Disease- and surgery-related clinical data encompassed tumour characteristics [pathological stage based on the 8th edition of the AJCC lung cancer staging system (18), tumour type (e.g., adenocarcinoma, squamous cell carcinoma, small cell carcinoma, etc.), and the number of lymph nodes dissected]; surgical data [surgical approach (e.g., lobectomy, segmentectomy, pneumonectomy), anaesthesia duration, and intraoperative blood loss]; total chest tube drainage volume; pain score (Numeric Rating Scale, 0–10 points) (23); complications (24); and constipation status (<1 bowel movement within 72 h postoperatively).

2.3.3 Laboratory parameters

The following parameters were tested after the collection of fasting venous blood samples from the included patients within 24 h preoperatively.

a. Biochemical markers: Total protein (TP), albumin (ALB), globulin (GLB), total cholesterol (TCH), blood glucose (GLU), and lactate dehydrogenase (LDH);

b. Complete blood count parameters: Hemoglobin (Hb), white blood cell count (WBC), lymphocyte count (LYM), neutrophil count (NEUT), and platelet count (PLT);

c. Electrolytes: Serum potassium (K), sodium (Na), chloride (Cl), and calcium (Ca).

2.4 Data collection and quality control

2.4.1 Training and testing dataset collection

By sourcing the electronic medical record system and nursing records within the hospital, the demographic and sociological data, disease-related data, and laboratory test parameters of the enrolled patients were collected and organized by two nursing staff members from the Department of Thoracic Surgery who had undergone standardized training and passed the assessment. Among these, demographic and sociological data, as well as laboratory test-related data, were selected from the testing results of patients within 24 h of admission. Disease-related data included intraoperative and postoperative outcomes. According to the GLIM criteria for malnutrition assessment, patients were grouped into malnutrition and non-malnutrition categories.

2.4.2 External validation dataset collection

Two researchers were responsible for follow-up assessments of patients after admission, using the NRS2002 and GLIM to evaluate patients until discharge. All other methods for data collection were consistent with those of the training and testing sets.

2.4.3 Data quality control

A double-entry verification was performed using EpiData after the entry of all data into Excel by two researchers independently. Discrepancies were corrected by consulting the original records.

2.5 Statistical analyses

Data cleaning and database construction were performed using Excel. Statistical analysis and model development were carried out with SPSS 26 and R 4.4.2 (packages including caret, pROC, shapviz, and shiny). Categorical data were expressed as frequencies (n) and percentages (%), with inter-group comparisons performed using Fisher’s exact test or the chi-squared test. Normally distributed continuous data were expressed as mean ± standard deviation and compared between groups by an independent samples t-test. Non-normally distributed continuous data were expressed as median (interquartile range) [M (P25, P75)] and compared between groups using the Mann–Whitney U test. The significance level was set at α = 0.05, with P < 0.05 denoting a statistically significant difference. To verify the robustness of the selected features, Least Absolute Shrinkage and Selection Operator (LASSO) regression was performed. Continuous features were standardized via z-score normalization (R’s scale function) for model training stability. Hyperparameters were optimized by grid search with 10-fold internal cross-validation (caret package); optimal settings (selected by maximizing validation AUC) are detailed in Supplementary Table. Furthermore, calibration curves were plotted to assess the agreement between the predicted probabilities and the observed frequencies. To validate the model’s stability across different populations, stratified analysis was conducted in the external validation cohort based on age (< 65 vs. ≥ 65 years) and NRS2002 scores (< 3 vs. ≥ 3).
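The z-score step can be illustrated with the Python standard library. This is a sketch rather than the study's R code: the function names and albumin values are made up, and fitting the scaler on the training set only and then reusing it for other sets is a common convention assumed here, not something the text states. The sample standard deviation (n − 1 denominator) is used because that is what R's scale function applies by default.

```python
from statistics import mean, stdev

def fit_scaler(values):
    """Return the centre and spread used for z-scoring (sample SD, n - 1)."""
    return mean(values), stdev(values)

def transform(values, mu, sd):
    """Apply z = (x - mu) / sd with previously fitted parameters."""
    return [(v - mu) / sd for v in values]

# Made-up albumin values (g/L), for illustration only
train_alb = [38.0, 41.5, 35.2, 44.1, 39.8]
test_alb = [36.0, 42.0]

mu, sd = fit_scaler(train_alb)            # parameters learned from training data
train_z = transform(train_alb, mu, sd)
test_z = transform(test_alb, mu, sd)      # same parameters reused, no refitting

print([round(z, 3) for z in train_z])
print([round(z, 3) for z in test_z])
```

After the transform the training column has mean 0 and standard deviation 1, while the test column generally does not, which is expected when the training parameters are reused.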

2.6 Development and evaluation of ML prediction models

(1) Feature selection. The training dataset was subjected to univariate and multivariate regression analyses. Variables with statistical significance in the univariate analysis were entered into a multivariate logistic regression to identify independent factors affecting malnutrition in lung cancer patients after thoracoscopic resection. (2) Model development. Eight ML models were developed using Random Forest, Logistic Regression, Gradient Boosting Machine (GBM), Neural Network, XGBoost, K-Nearest Neighbors (KNN), Adaptive Boosting (AdaBoost), and Support Vector Machine (SVM). The predictive performance of the constructed models was assessed based on ROC curves, AUC, sensitivity, specificity, precision, and F1-score, and the model with the best performance was selected for further analysis. (3) Model validation. Internal validation was performed using 10-fold cross-validation, and independent external validation was performed on the prospective cohort, to assess the robustness of the proposed model (25). (4) Model interpretation. Variable importance was assessed using the SHAP method. Finally, based on the optimal model, an interactive web calculator was developed using ShinyApps to facilitate clinical translation.
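The evaluation metrics named in step (2) can be computed directly from predicted probabilities and observed outcomes. The following standard-library sketch uses made-up labels and probabilities, hypothetical function names, and an arbitrary 0.5 cut-off; AUC is computed through its rank interpretation, the probability that a randomly chosen positive case scores higher than a randomly chosen negative one.

```python
def classification_metrics(y_true, y_prob, threshold=0.5):
    """Sensitivity, specificity, precision and F1 at a given probability cut-off."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity,
            "precision": precision, "f1": f1}

def auc(y_true, y_prob):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if pp > pn else 0.5 if pp == pn else 0.0
               for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))

# Made-up outcomes (1 = malnutrition) and predicted risks
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1]

metrics = classification_metrics(y_true, y_prob)
print(metrics)
print(auc(y_true, y_prob))
```

Sensitivity, specificity, precision, and F1 all depend on the chosen cut-off, whereas AUC summarizes ranking quality across all cut-offs, which is why the two kinds of metric are reported together.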

3 Results

3.1 Baseline characteristics

The study cohort consisted of 1,407 eligible patients selected according to strict inclusion and exclusion criteria. Of these, 1,134 cases, divided into a training set (n = 795) and a testing set (n = 339), were sourced from Dongguan People’s Hospital, Guangdong Province. The independent external validation set consisted of an additional 273 cases from the Affiliated Hospital of Guangdong Medical University in Guangdong Province. As shown in Table 1, males accounted for 39.6% of the training set, 39.2% of the testing set, and 46.9% of the external validation set. The malnutrition rate was 10.94% in the training set, 10.62% in the testing set, and slightly higher, at 13.2%, in the external validation set. The external validation cohort had a significantly higher mean age than the training and testing cohorts (61.20 ± 10.50 years vs. 56.96 ± 11.75 years and 57.78 ± 11.50 years, P < 0.001). It also had a higher proportion of patients with an NRS2002 score ≥ 3 (23.4% vs. 12.1% and 12.7%; P < 0.001).

Table 1

Table 1. Baseline characteristics of patients with lung cancer in the training, testing, and external validation cohorts.

3.2 Feature selection

In our study, malnutrition status (binary outcome variable defined by the GLIM criteria) was the dependent variable, and thirty-four clinical features were included as independent variables. In univariate analysis (1 sample excluded), 16 variables differed significantly between the malnutrition group and the non-malnutrition group (all P < 0.05). Furthermore, multivariate logistic regression analysis identified NRS2002, age, blood loss, total drainage volume, BADL, and serum K level as independent risk factors (OR > 1, P < 0.05), while ALB was an independent protective factor (OR < 1, P < 0.05) (Table 2).

Table 2

Table 2. Univariate and multivariate evaluations of malnutrition-related factors.

The robustness of these seven features was further confirmed by LASSO regression analysis (Figure 3), which showed consistent results with the multivariate logistic regression.

Figure 3
Two plots labeled A and B depict LASSO coefficient profiles. Plot A shows binomial deviance against negative log lambda with red dots decreasing to a minimum around 4.5 before increasing. Plot B displays the trajectories of various coefficients against negative log lambda, with lines diverging as lambda decreases, illustrating coefficient changes.

Figure 3. LASSO binary logistic regression for feature selection. (A) Binomial deviance curve of the LASSO model. The dashed line indicates the optimal λ value selected by 10-fold cross-validation. (B) Coefficient path of the LASSO model. Colored lines represent feature coefficients, with non-zero values at optimal λ indicating selected features.

3.3 Model construction and validation

Based on the 7 selected clinical features in multivariate analysis, eight typical machine learning methods were used to construct predictive models. Model fitting evaluations and stability screening were implemented to eliminate overfitted models, followed by a comprehensive assessment using calibration curve, ROC, DCA, AUC, sensitivity, specificity, precision, and F1-score.
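For readers less familiar with DCA, the quantity plotted on a decision curve is the net benefit at a threshold probability p_t, conventionally defined as TP/n − (FP/n) × p_t/(1 − p_t) and compared against a "treat all" strategy. The sketch below applies this standard definition to made-up data; the function names are hypothetical and nothing here reproduces the study's curves.

```python
def net_benefit(y_true, y_prob, pt):
    """Net benefit of treating patients whose predicted risk is >= pt."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= pt and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * (pt / (1 - pt))

def net_benefit_treat_all(y_true, pt):
    """Reference strategy: intervene on every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * (pt / (1 - pt))

# Made-up outcomes (1 = malnutrition) and predicted risks
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1]

nb_model = net_benefit(y_true, y_prob, pt=0.5)
nb_all = net_benefit_treat_all(y_true, pt=0.5)
print(nb_model, nb_all)
```

A model is considered clinically useful over the range of thresholds where its net benefit exceeds both the "treat all" value and zero (the "treat none" strategy).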

Calibration analysis demonstrated good agreement between the predicted risk and actual malnutrition incidence. As shown in Figure 4, the calibration curve of the XGBoost model aligned closely with the ideal diagonal line, indicating high reliability in risk estimation.

Figure 4
Line chart comparing various models in terms of observed event percentage against bin midpoint. Models include Logistic, SVM, GBM, Neural Network, Random Forest, Xgboost, KNN, and Adaboost. The diagonal line represents perfect calibration. Models are color-coded for differentiation.

Figure 4. Calibration curves of multiple machine learning models on the testing set.

In the comparison of AUC values, the Random Forest model achieved AUCs of 0.994 and 0.802 on the training and testing sets, respectively, a difference of 0.192. Furthermore, its sensitivity on the testing set was markedly lower than on the training set (72.2% vs. 98.9%), indicating overfitting. Attempts to mitigate this, such as pruning and increasing the sample size, did not yield satisfactory results, so Random Forest was not selected as the optimal model. Consequently, XGBoost, which showed excellent generalization ability, was determined to be the optimal model.

In the independent external validation cohort, the XGBoost model maintained stable performance (AUC = 0.886, 95% CI: 0.841–0.932), validating its robustness across different clinical settings and potential for translation in real-world practice (Figure 5 and Table 3). Given the significant differences in baseline characteristics (older age and higher nutritional risk) between the training and external validation cohorts, a stratified analysis was further conducted to evaluate the model’s robustness in specific subgroups. The XGBoost model demonstrated excellent discriminative ability across age groups, with an AUC of 0.943 (95% CI: 0.902–0.984) for patients < 65 years and 0.822 (95% CI: 0.737–0.906) for those ≥ 65 years. Regarding nutritional status, the model performed well in the majority of patients with low-to-moderate risk (NRS2002 < 3; AUC: 0.829, 95% CI: 0.617–1.000). However, attenuated performance was observed in the high-risk subgroup (NRS2002 ≥ 3; AUC: 0.600, 95% CI: 0.460–0.740), likely reflecting the increased clinical complexity and limited sample size (n = 64) in this specific subset (Table 4).

Figure 5
Composite image showing multiple panels of ROC and decision curve analyses. Panels A, C, and E display ROC curves with varying models such as Logistic, SVM, and RandomForest, each with corresponding AUC values. Panels B, D, and F present decision curves illustrating the net benefit against high-risk thresholds for the same models. Legends indicate color coding for model identification across all panels.

Figure 5. ROC and DCA curves across training, testing, and external validation sets. (A) ROC curves for the eight models in the training set. (C) ROC curves for the eight models in the testing set. (E) ROC curves for the eight models in the external validation set. (B) DCA for the eight models in the training set. (D) DCA for the eight models in the testing set. (F) The DCA of the eight models in the external validation set.

Table 3

Table 3. Comparative performance metrics of ML algorithms in predicting malnutrition risk in lung cancer patients undergoing thoracoscopic resection.

Table 4

Table 4. Stratified validation of model robustness: AUC across age and NRS2002 subgroups.

3.4 SHAP analysis and model interpretability of the XGBoost ML model

The SHAP algorithm was used to measure the importance and contribution of each predictor variable. As depicted in Figure 6A, features were ranked in descending order of mean absolute SHAP value as follows: ALB > age > NRS2002 score > intraoperative blood loss > total thoracic drainage volume > K > BADL score. ALB was thus identified as the core variable with the strongest predictive capability in the model.

Figure 6
Panel A displays a bar chart demonstrating the mean SHAP values for different features using Xgboost. Key features include ALB, Age, NRS2002, Blood_loss, Total_drainage_volume, K, and BADL. Panel B shows a SHAP summary plot with dot distributions for each feature, colored by feature value from low (purple) to high (yellow). Both panels illustrate the importance and impact of the features on the model's predictions.

Figure 6. XGBoost model interpretability: (A) Feature Importance Ranking, and (B) SHAP Value Distribution.
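The ranking in panel A is based on the mean absolute SHAP value per feature, which can be computed from a matrix of per-patient SHAP values as sketched below. The numbers are made up purely to show the mechanics (they are not the study's SHAP values), only three of the seven features are included, and the function name is hypothetical.

```python
def rank_by_mean_abs_shap(shap_rows, feature_names):
    """Rank features by mean |SHAP| across patients, largest first."""
    n = len(shap_rows)
    importance = {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(feature_names)
    }
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)

# Made-up SHAP values: one row per patient, one column per feature
features = ["ALB", "Age", "NRS2002"]
toy_shap = [
    [-0.40, 0.10, 0.05],
    [0.30, -0.20, 0.10],
    [-0.50, 0.15, -0.05],
]

ranking = rank_by_mean_abs_shap(toy_shap, features)
for name, value in ranking:
    print(f"{name}: {value:.3f}")
```

Taking absolute values before averaging matters: a feature that pushes risk up for some patients and down for others would otherwise appear unimportant because its contributions cancel out.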

The direction of association between each variable and the predicted outcome was then examined using the SHAP beeswarm plot (Figure 6B). Specifically, higher NRS2002 scores and older age were both associated with higher SHAP values, indicating that both were risk factors for malnutrition. Conversely, higher ALB levels were associated with lower SHAP values, suggesting that ALB acts as a protective factor against malnutrition, consistent with clinical nutritional assessment logic.

Based on the aforementioned XGBoost model, this study developed an online malnutrition risk assessment tool (https://chentianfeng630077.shinyapps.io/make_web). The tool supports real-time input of key clinical indicators and personalized risk prediction, providing a convenient approach for rapid clinical screening. The optimal threshold of this model for predicting malnutrition risk was identified as 0.4086.

To illustrate the practical use of the tool, this study employed simple random sampling to select a lung cancer inpatient with the following characteristics: age, 70 years; ALB, 35 g/L; NRS2002 score, 3 points; intraoperative blood loss, 100 mL; total thoracic drainage volume, 1,000 mL; BADL score, 80 points; and K, 4 mmol/L. Entering these variables into the XGBoost-based online prediction tool yielded a postoperative malnutrition risk of 46.52% (Figure 7), above the optimal threshold of 0.4086. On follow-up, the patient developed postoperative malnutrition, consistent with the model’s prediction.

Figure 7
Form with inputs and sliders for NRS2002 score, total drainage volume, blood loss, age, BADL, potassium, and albumin levels. Adjacent to it is a donut chart showing 46.52% risk, with red for risk and blue for non-risk.

Figure 7. Computational results of online website for postoperative malnutrition in lung cancer patients undergoing thoracoscopic resection.

4 Discussion

4.1 Favorable performance of the model for accurately predicting malnutrition risk in lung cancer patients undergoing thoracoscopic resection

This study analyzed the medical data of patients with lung cancer who underwent thoracoscopic surgery at two medical centers. We employed multivariate logistic regression analysis to determine key feature variables, followed by the construction and evaluation of prediction models using eight mainstream ML algorithms. Comprehensive evaluation revealed that the XGBoost model exhibited the best overall performance, with superior discrimination and stability across the training set (AUC = 0.868), testing set (AUC = 0.845), and external validation set (AUC = 0.886). SHAP analysis ranked the influence of the seven core features as follows: ALB > age > NRS2002 > blood loss > total drainage volume > K > BADL.

Jiang et al. (26) developed three risk prediction models for malignant pleural effusion in lung cancer patients and visualized the results with a nomogram, but did not perform internal validation. In contrast, our model integrates the SHAP method to evaluate multiple factors comprehensively, which improves prediction accuracy while enhancing model transparency and clinical guidance. Similarly, Li et al. (27) constructed a prediction model for postoperative exercise phobia, with an AUC of 0.893. However, their study primarily incorporated subjective variables, lacked external validation, and did not provide a web-based assessment tool to support clinical applicability.

Leveraging the advantages outlined above, our XGBoost model demonstrated broad applicability. Notably, our stratified analysis confirmed the model’s robustness in elderly patients (age ≥ 65), with an AUC exceeding 0.80, which addresses concerns regarding model applicability in the aging lung cancer population. However, we observed reduced predictive accuracy in patients with high pre-existing nutritional risk (NRS2002 ≥ 3). This ‘ceiling effect’ suggests that distinguishing outcomes within an already high-risk cohort is inherently challenging using standard clinical variables. For this specific subgroup, malnutrition may be driven by more complex metabolic or immune factors not fully captured in the current model, so closer clinical monitoring is warranted as a complement to the model’s risk stratification. Clinically, healthcare providers can input the seven key indicators identified in our study into the online prediction tool to generate real-time postoperative malnutrition risk scores. Based on clinical practice needs, we recommend classifying patients with a predicted risk of ≥ 40.86% as high risk, triggering daily nutritional assessments and individualized interventions. Patients with a predicted risk of < 40.86% may undergo routine weekly nutritional assessments, enabling precision stratified care and optimizing the allocation of healthcare resources.
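The stratification rule described above amounts to a single threshold comparison on the predicted risk. A minimal sketch follows, assuming the risk is supplied as a probability between 0 and 1; the function name and the returned labels are illustrative and are not part of the published calculator.

```python
RISK_THRESHOLD = 0.4086  # optimal cut-off reported for the XGBoost model

def assessment_schedule(predicted_risk, threshold=RISK_THRESHOLD):
    """Map a predicted malnutrition risk (0-1) to the suggested follow-up pathway."""
    if not 0.0 <= predicted_risk <= 1.0:
        raise ValueError("predicted_risk must be a probability between 0 and 1")
    if predicted_risk >= threshold:
        return "high risk: daily nutritional assessment and individualized intervention"
    return "standard risk: routine weekly nutritional assessment"

# The example patient from Section 3.4 had a predicted risk of 46.52%
print(assessment_schedule(0.4652))
print(assessment_schedule(0.25))
```

Because the model's discrimination was weaker in patients with NRS2002 ≥ 3, a rule like this would sit alongside clinical judgment for that subgroup rather than replace it.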

4.2 Risk factor analysis for malnutrition prediction models in lung cancer patients undergoing thoracoscopic resection

Preoperative ALB levels serve as a core indicator for assessing malnutrition risk and guiding nutritional interventions. A preoperative ALB level <30 g/L is recommended as an indication for preoperative nutritional support by the 2021 perioperative nutrition guidelines of the European Society for Parenteral and Enteral Nutrition (ESPEN) (28). This metric has also been adopted as a key screening criterion in the Perioperative Nutrition Screen published by the Enhanced Recovery After Surgery Society (29), underscoring the value of preoperative ALB levels as an indicator of nutritional status. In this study, preoperative ALB level <35 g/L was an independent risk factor for postoperative malnutrition [OR = 0.90, 95% CI (0.80–1.00), P = 0.046], further confirming its clinical value in predicting outcomes for lung cancer patients after thoracoscopic resection. In some lung cancer patients, hypoalbuminemia is already present preoperatively, which further increases the probability of postoperative hypoalbuminemia. The reported incidence of postoperative hypoalbuminemia ranges from 14% to 49.60% (30, 31). Consistent with previous data (32), this study recorded a preoperative incidence of 7.97% and a postoperative incidence as high as 61.40%. Critically, hypoalbuminemia can synergistically worsen respiratory dysfunction, given the already compromised postoperative pulmonary function in lung cancer patients, so its prevention and management deserve clinical emphasis. Perioperative hypoalbuminemia has also been recognized as a risk factor for postoperative complications and even mortality (33). To a certain extent, hypoalbuminemia can indicate potential malnutrition in some patients (34), possibly because surgical stress activates the hypothalamic-pituitary-adrenal axis and may induce a hypercatabolic state. Increased release of catabolic hormones can accelerate protein breakdown, while inflammatory cytokines (e.g., tumor necrosis factor and interleukin-6) may suppress ALB synthesis and disrupt the endothelial glycocalyx layer, thereby increasing vascular permeability and causing ALB leakage. Moreover, during the acute-phase response, the liver prioritizes acute-phase protein synthesis over ALB production, a shift compounded by preoperative gastrointestinal dysfunction and insufficient postoperative intake. Protein depletion may be further exacerbated by intraoperative blood loss and postoperative exudation. Collectively, these mechanisms may induce negative nitrogen balance, depletion of protein reserves, and deficiency of nutritional substrates. Concurrently, reduced plasma colloid osmotic pressure may trigger tissue edema, which may impair digestive absorption and metabolic efficiency, ultimately resulting in malnutrition.

Consistent with the findings reported by Cruz-Jentoft and Volkert (35), advanced age was an independent risk factor for postoperative malnutrition in lung cancer patients following thoracoscopic surgery. Primarily, elderly patients experience gradual deterioration in health status and physical function; tooth loosening and loss, together with weakened chewing, swallowing, and digestive function, may impair their food intake and nutrient absorption. As reported previously, only 9.1% of elderly patients met their daily energy requirements, while over half had energy intakes below 50% of estimated needs, and nearly 50% achieved only 75% of their protein requirements (36). Consequently, low nutrient intake, compounded by dietary monotony, may increase the risk of nutritional deficiency. Furthermore, these patients may have exacerbated cardiopulmonary burden given age-related changes in T-cell-mediated immune responses and inflammatory reactions. The risk of malnutrition can be further increased by greater susceptibility to infections, which results from impaired airway mucosal function due to repeated inflammatory stimulation. Socioeconomic factors (e.g., social isolation and limited income) may also contribute to heightened malnutrition risk in elderly patients (37). Collectively, these observations underscore the clinical necessity of prioritizing nutritional management for older patients with lung cancer, including developing personalized nutritional plans, ensuring adequate nutrient intake, and preventing malnutrition induced by dietary insufficiency.

The NRS2002 is an internationally recommended tool for nutritional screening. Beyond identifying patients with malnutrition or nutritional risk, preoperative nutritional screening can also predict clinical outcomes and thus guide preoperative nutritional therapy. Approximately 25% to 75% of cancer patients have been reported to experience nutritional risk, with approximately 30% of deaths directly attributed to malnutrition (38–40). In our study, compared with the nutritional risk group, the non-nutritional risk group had a significantly shorter duration of postoperative thoracic drainage tube retention, a notably lower rate of postoperative complications, and fewer patients needing adjuvant therapy. This may be explained by pre-existing nutritional deficiencies in lung cancer patients. Functional recovery may be impaired by the combined effects of major surgical trauma, significant intraoperative blood loss, postoperative wound pain, and the high metabolic demands induced by surgical stress. Moreover, patients may have limited early postoperative ambulation, poor appetite, or anorexia, and therefore fail to meet their nutritional requirements. In addition, nutritional risk screening and intervention for lung cancer patients are generally delayed in China, leading to poor surgical tolerance. These factors, coupled with compromised immunity, may increase the risk of infection, severely impact clinical outcomes (41), and further exacerbate malnutrition. We recommend that patients undergo NRS2002 nutritional screening upon admission. For patients with a score of ≥3 points, clinicians should initiate nutritional intervention immediately, prioritizing oral nutritional supplementation, adding enteral or parenteral nutrition when necessary, and dynamically adjusting the intervention plan to reduce complications and improve prognosis.

The ADL score is strongly linked to the incidence of malnutrition and is a significant determinant of nutritional status. Consistent with Duan et al. (42), this study observed that lower BADL scores correlated with poorer nutritional status. Preoperative BADL scores serve as a vital indicator of patients’ functional capacity. Patients with low scores may have diminished physiological reserve and functional status, resulting in reduced tolerance to surgery and recovery potential (43). Conversely, patients with high scores may be less likely to develop malnutrition, given their greater physical strength and mobility as well as their higher nutritional requirements and intake capacity. For instance, moderate physical activity has been shown to reduce inflammation and promote beneficial nutritional metabolism (44). The ESPEN guidelines on nutrition and cancer (45) also incorporate physical activity into nutritional interventions. Moreover, patients with lung cancer who adhere to exercise experience less fatigue, better quality of life, improved lung function, and increased muscle mass (46). Therefore, clinical healthcare providers should implement perioperative BADL assessment for elderly lung cancer patients, combined with guidance on appropriate exercise and increased daily activity, to reduce the risk of malnutrition.

Furthermore, although closed thoracic drainage is a crucial therapeutic intervention for intrathoracic diseases, it inherently carries the risk of exacerbating malnutrition. Specifically, drainage may cause continuous loss of key nutrients (e.g., ALB, immunoglobulins, and electrolytes) from protein-rich pleural effusions, particularly bloody or exudative effusions. Prolonged or excessive drainage can directly deplete the body’s protein reserves (47). Moreover, lung cancer resection may trigger chylothorax owing to damage to the thoracic duct. Chyle contains substantial ALB, and the loss of this protein, coupled with dietary restrictions [e.g., low-fat or medium-chain triglyceride (MCT) diets] implemented to reduce chyle secretion, may lead to severe malnutrition and energy deficiency (48). In addition, traumatic stress from tube placement, local pain stimuli, and mechanical irritation may dampen patients’ appetite, induce gastrointestinal symptoms such as nausea and abdominal distension, and impair the capacity for nutrient digestion and absorption. Ultimately, these interconnected mechanisms drive the development of malnutrition.

Intraoperative blood loss is a significant contributor to malnutrition in lung cancer patients, and the risk rises with the volume of blood lost. In this type of surgery, major hemorrhage may occur for several reasons: (1) anatomical challenges (e.g., fragile hilar vascular walls, a restricted surgical field, dense adhesions between pulmonary arteries and bronchi rendering lymph nodes susceptible to damage, and anatomical variations in pulmonary artery branches); and (2) technical difficulties (e.g., difficult dissection of tumor-invaded vessels and challenges in instrument handling) (49). Through direct loss of plasma proteins, red blood cells, and trace elements, massive blood loss may cause hypoalbuminemia in patients undergoing surgery. It may also activate stress pathways and result in hypercatabolism. Gastrointestinal mucosal ischemia and postoperative anemia may further impair digestive and absorptive capacity, and blood transfusions and volume resuscitation may trigger dilutional hypoalbuminemia. In addition, as observed by Jiang Q et al. (50), decreased hemoglobin may induce systemic hypoxia and gastrointestinal hypoperfusion, further compromising the efficiency of nutrient absorption.

In addition, serum K is a significant predictor of postoperative nutritional deficiencies in lung cancer patients undergoing thoracoscopic surgery. Serum K regulates cellular metabolism, neuromuscular excitability, and gastrointestinal motility. Hypokalemia can cause muscle weakness, gastrointestinal smooth muscle paralysis (constipation and paralytic ileus), and reduced appetite, resulting in reduced activity levels in affected patients (51). It can also suppress metabolic enzyme activity, impairing glycolysis and protein synthesis and exacerbating negative nitrogen balance. Preoperatively, owing to tumor catabolism, treatment-related side effects, and dietary restrictions, lung cancer patients often have inadequate potassium intake and depleted nutritional reserves. Intraoperative stress can also induce sympathetic activation and fluid loss, further disrupting potassium homeostasis. After surgery, patients may have gastrointestinal dysfunction, impaired energy metabolism, and heightened metabolic demands induced by complications, which further worsen malnutrition. Clinically, serum K levels should be monitored dynamically during the perioperative period. Simultaneous potassium supplementation and nutritional support are critical to break this vicious cycle and improve patient outcomes.

4.3 Clinical implications of the malnutrition risk predictive model for nursing practice in lung cancer patients undergoing thoracoscopic resection

Based on the aforementioned findings, nursing practice should prioritize monitoring of the following high-risk groups: elderly patients, those with hypoalbuminemia (ALB < 35 g/L), those with preoperative BADL scores < 90 or preoperative NRS2002 scores ≥ 3, and patients with significant intraoperative blood loss or markedly increased postoperative drainage volume. This study further developed an online prediction model for risk assessment. Patients with a score > 40.68 have a high malnutrition risk, warranting immediate intervention, while patients with scores near this threshold can undergo more frequent dynamic screening. In the future, complementary intervention measures will be developed and delivered to healthcare professionals and individual patients via digital platforms, with real-time monitoring of intervention outcomes.
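As a rough illustration of how the monitoring priorities listed above could be encoded as a bedside checklist, the following Python sketch flags the criteria that have explicit numeric thresholds in the text. The class `PreoperativeProfile` and function `priority_flags` are hypothetical names. The age cut-off of 65 years is an assumption borrowed from the elderly subgroup analysis, and intraoperative blood loss and drainage volume are omitted because no numeric thresholds are stated for them.

```python
from dataclasses import dataclass


@dataclass
class PreoperativeProfile:
    """Hypothetical container for the preoperative screening variables."""
    age_years: float
    albumin_g_per_l: float
    badl_score: float
    nrs2002_score: int


def priority_flags(p: PreoperativeProfile) -> list:
    """Return the monitoring criteria that this patient meets."""
    flags = []
    if p.age_years >= 65:          # assumption: 'elderly' taken as age >= 65
        flags.append("elderly")
    if p.albumin_g_per_l < 35:     # hypoalbuminemia, ALB < 35 g/L
        flags.append("hypoalbuminemia")
    if p.badl_score < 90:          # preoperative BADL score < 90
        flags.append("low BADL")
    if p.nrs2002_score >= 3:       # preoperative NRS2002 score >= 3
        flags.append("nutritional risk")
    return flags
```

A patient meeting any of these criteria would be a candidate for closer monitoring; the list does not replace the model's predicted risk.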

In view of our findings, we recommend comprehensive perioperative nutritional interventions. Patients with preoperative ALB < 35 g/L should start oral nutritional supplementation one week before surgery [e.g., 1.2–2.0 g/(kg·d) of protein, administered in 2–3 divided doses]. Moreover, serum ALB levels should be monitored postoperatively; if they remain persistently below 30 g/L, nurses should assess the need for enteral or parenteral nutritional support in collaboration with physicians.
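The dosing example above reduces to simple arithmetic on body weight. The sketch below works it through; the function name `protein_plan` and the rounding to one decimal place are illustrative choices, and the output is an arithmetic aid rather than a prescription.

```python
def protein_plan(weight_kg: float, doses_per_day: int = 3) -> dict:
    """Compute the 1.2-2.0 g/(kg*d) protein range and the amount per dose.

    Hypothetical helper: the range and the 2-3 divided doses come from the
    text; everything else is an illustrative assumption.
    """
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    if doses_per_day not in (2, 3):
        raise ValueError("the text describes 2-3 divided doses")
    low, high = 1.2 * weight_kg, 2.0 * weight_kg
    return {
        "daily_g": (round(low, 1), round(high, 1)),
        "per_dose_g": (round(low / doses_per_day, 1),
                       round(high / doses_per_day, 1)),
    }
```

For a 60 kg patient taking three doses a day, this gives 72–120 g of protein daily, or 24–40 g per dose.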

Simultaneously, nurses should develop preoperative rehabilitation plans for patients with preoperative BADL scores < 90, including: (1) aerobic exercise: brisk walking or cycling at Borg Scale intensity 13–16, 30–60 minutes per session, 3–5 times weekly, to improve cardiopulmonary endurance; (2) resistance training: 6 exercises (e.g., seated knee raises, resistance knee extensions, chest presses), 8–12 repetitions per set, 2–3 sets per session, 2–3 times weekly, to strengthen limb muscles; and (3) inspiratory muscle training: the “rapid inhalation, slow exhalation” technique with initial resistance set to 30%–50% of maximum inspiratory pressure (30 repetitions per set, 1–2 sets daily), combined with the first two training components. Patients should be instructed to use these structured exercise regimens to gradually enhance functional capacity and improve nutritional intake. Postoperatively, drainage output (volume and characteristics), serum K, ALB, and hemoglobin concentrations should be monitored daily for dynamic recovery assessment. Patients with chylothorax should follow a low-fat/MCT diet, while those with major hemorrhage should receive more meticulous procedural care and hemostatic measures. In addition, a multidisciplinary intervention plan should be established (52), including strengthening health education and psychological support for patients and families, creating malnutrition risk early warning systems, providing regular training for nursing staff, and integrating nutritional management strategies into routine care protocols. Altogether, multidimensional measures are needed for early prevention and intervention to reduce postoperative risks and enhance overall nursing quality.

5 Conclusion

This study developed a straightforward, user-friendly, and easily implementable predictive model, providing an effective tool for nursing staff to assess the risk of postoperative malnutrition in patients undergoing thoracoscopic resection for lung cancer. All indicators in the model are easily accessible in clinical settings, supporting its practical utility. The model is expected to aid the screening of high-risk patients and thereby prompt early interventions by nursing staff. In addition, its applicability was supported by external validation. Notably, the model’s predictive accuracy is reduced in high-nutritional-risk patients, who require closer clinical monitoring.

6 Limitations

This study has four major limitations. (1) Model complexity and risk of overfitting. The XGBoost model performs well in processing complex data and non-linear interactions. However, additional parameter tuning is still necessary to ensure the model’s stability and its ability to generalize across different datasets. Future research may continue to improve the model and tools to better meet practical clinical needs, thereby expanding their potential in clinical decision support. (2) Limited complication assessment. This study did not grade the severity of complications using the Clavien-Dindo classification system. To increase predictive accuracy, complication severity grading should be included as a model variable in future research to examine the impact of different severity levels on the likelihood of malnutrition. (3) Performance variability in high-risk subgroups. While the model showed robust generalizability in the elderly and general population, its discriminative power was limited in the subgroup with high nutritional risk scores (NRS2002 ≥ 3) during external validation. This limitation may be attributed to the relatively small sample size of this subgroup and the high heterogeneity of patients with severe nutritional risk. Future studies should consider incorporating specific metabolic biomarkers to enhance prediction precision for this high-risk population. Furthermore, integrating dynamic postoperative changes (i.e., time-series data) will be a key priority in our future research to achieve continuous forecasting. (4) Unaddressed class imbalance. Postoperative malnutrition accounted for only 11.3% of total samples, an inherent class imbalance that was not addressed during data splitting. This may have biased model training toward the majority class and contributed to low precision in the validation sets, compromising predictive accuracy for clinically critical malnourished patients. Future studies will use class weight adjustment or Borderline-SMOTE to balance the datasets and improve model robustness.
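To make the class-imbalance point concrete, the sketch below shows two small calculations. The first derives a positive-class weight from the 11.3% prevalence, using the negative-to-positive ratio that is commonly suggested as a starting value for XGBoost's `scale_pos_weight` parameter. The second shows how low prevalence depresses precision even when sensitivity and specificity are reasonable. The function names are hypothetical, and the 0.80 sensitivity and specificity in the usage note are illustrative values, not results from this study.

```python
def positive_class_weight(prevalence: float) -> float:
    """Ratio of negative to positive cases implied by a given prevalence."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must lie strictly between 0 and 1")
    return (1.0 - prevalence) / prevalence


def expected_precision(sensitivity: float, specificity: float,
                       prevalence: float) -> float:
    """Precision (positive predictive value) via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0.0:
        raise ValueError("no positive predictions under these inputs")
    return true_pos / (true_pos + false_pos)
```

With a prevalence of 0.113, `positive_class_weight` returns roughly 7.85, meaning each malnourished case would be weighted about eight times as heavily as a non-malnourished one. Under the illustrative 0.80 sensitivity and specificity, `expected_precision` is only about 0.34, which shows why precision can stay low in an imbalanced validation set.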

Data availability statement

The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author.

Ethics statement

The studies involving humans were approved by the Ethics Committee of Dongguan People’s Hospital, Guangdong Province (Approval No.: KYKT2025-011). The studies were conducted in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.

Author contributions

TC: Conceptualization, Data curation, Project administration, Writing – original draft, Writing – review & editing. PW: Methodology, Project administration, Software, Supervision, Writing – original draft, Writing – review & editing. RP: Formal Analysis, Investigation, Project administration, Writing – original draft, Writing – review & editing. LL: Investigation, Validation, Visualization, Writing – original draft. LX: Data curation, Investigation, Software, Writing – review & editing. MY: Data curation, Investigation, Software, Writing – review & editing. XD: Investigation, Validation, Visualization, Writing – original draft.

Funding

The author(s) declared that financial support was not received for this work and/or its publication.

Conflict of interest

The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The author(s) declared that generative AI was not used in the creation of this manuscript.

Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fonc.2026.1727595/full#supplementary-material

Supplementary Table | Hyperparameter settings for the eight machine learning models.

References

1. Diagnostic and application guidelines for malnutrition in adult patients (2025 edition). Zhonghua Yi Xue Za Zhi. (2025) 105:953–80. doi: 10.3760/cma.j.cn112137-20241212-02810

2. Werblinska A, Zielinska D, Szlanga L, Skrzypczak P, Bryl M, Piwkowski C, et al. The impact of nutritional support on outcomes of lung cancer surgery-narrative review. J Clin Med. (2025) 14(9):3197. doi: 10.3390/jcm14093197

3. Bray F, Laversanne M, Sung H, Ferlay J, Siegel RL, Soerjomataram I, et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. (2024) 74:229–63. doi: 10.3322/caac.21834

4. Zheng RS, Chen R, Han BF, Wang SM, Li L, Sun KX, et al. Cancer incidence and mortality in China, 2022. Zhonghua Zhong Liu Za Zhi. (2024) 46:221–31. doi: 10.3760/cma.j.cn112152-20240119-00035

5. Zhang Y, Jiang B, Zhang L, Greuter MJW, de Bock GH, Zhang H, et al. Lung nodule detectability of artificial intelligence-assisted CT image reading in lung cancer screening. Curr Med Imaging. (2022) 18:327–34. doi: 10.2174/1573405617666210806125953

6. Usanase N, Uzun B, Ozsahin DU, and Ozsahin I. A look at radiation detectors and their applications in medical imaging. Jpn J Radiol. (2024) 42:145–57. doi: 10.1007/s11604-023-01486-z

7. Hallet J, Rousseau M, Gupta V, Hirpara D, Zhao H, Coburn N, et al. Long-term functional outcomes among older adults undergoing video-assisted versus open surgery for lung cancer: a population-based cohort study. Ann Surg. (2023) 277:e1348–54. doi: 10.1097/SLA.0000000000005387

8. Ye B and Wang M. Video-assisted thoracoscopic surgery versus thoracotomy for non-small cell lung cancer: a meta-analysis. Comb Chem High Throughput Screen. (2019) 22:187–93. doi: 10.2174/1386207322666190415103030

9. Kiss N and Curtis A. Current insights in nutrition assessment and intervention for malnutrition or muscle loss in people with lung cancer: a narrative review. Adv Nutr. (2022) 13:2420–32. doi: 10.1093/advances/nmac070

10. Polanski J, Tanski W, Dudek K, and Jankowska-Polanska B. Pain and coping strategies as determinants of malnutrition risk in lung cancer patients: a cross-sectional study. Nutrients. (2024) 16(14):2193. doi: 10.3390/nu16142193

11. Huo Z, Chong F, Yin L, Li N, Liu J, Zhang M, et al. Comparison of the performance of the GLIM criteria, PG-SGA and mPG-SGA in diagnosing malnutrition and predicting survival among lung cancer patients: a multicenter study. Clin Nutr. (2023) 42:1048–58. doi: 10.1016/j.clnu.2023.04.021

12. Crestani MS, Stefani GP, Scott LM, and Steemburgo T. Accuracy of the GLIM criteria and SGA compared to PG-SGA for the diagnosis of malnutrition and its impact on prolonged hospitalization: a prospective study in patients with cancer. Nutr Cancer. (2023) 75:1177–88. doi: 10.1080/01635581.2023.2184748

13. Landgrebe M, Tobberup R, Carus A, and Rasmussen HH. GLIM diagnosed malnutrition predicts clinical outcomes and quality of life in patients with non-small cell lung cancer. Clin Nutr. (2023) 42:190–98. doi: 10.1016/j.clnu.2022.12.011

14. Wang X, Chu J, Wei C, Xu J, He Y, and Chen C. Construction and validation of a predictive model for the risk of malnutrition in hospitalized patients over 65 years of age with malignant tumours: a single-centre retrospective cross-sectional study. PeerJ. (2024) 12:e18685. doi: 10.7717/peerj.18685

15. Ran Q, Zhao X, Tian J, Gong S, and Zhang X. A nomogram model for predicting malnutrition among older hospitalized patients with type 2 diabetes: a cross-sectional study in China. BMC Geriatr. (2023) 23:565. doi: 10.1186/s12877-023-04284-4

16. Handelman GS, Kok HK, Chandra RV, Razavi AH, Lee MJ, and Asadi H. eDoctor: machine learning and the future of medicine. J Intern Med. (2018) 284:603–19. doi: 10.1111/joim.12822

17. Almisned FA, Usanase N, Ozsahin DU, and Ozsahin I. Incorporation of explainable artificial intelligence in ensemble machine learning-driven pancreatic cancer diagnosis. Sci Rep. (2025) 15:14038. doi: 10.1038/s41598-025-98298-0

18. Amin MB, Greene FL, Edge SB, Compton CC, Gershenwald JE, Brookland RK, et al. The eighth edition AJCC cancer staging manual: continuing to build a bridge from a population-based to a more “personalized” approach to cancer staging. CA Cancer J Clin. (2017) 67:93–9. doi: 10.3322/caac.21388

19. Cederholm T, Jensen GL, Correia MITD, Gonzalez MC, Fukushima R, Higashiguchi T, et al. GLIM criteria for the diagnosis of malnutrition - a consensus report from the global clinical nutrition community. Clin Nutr. (2019) 38:1–9. doi: 10.1016/j.clnu.2018.08.002

20. Nakyeyune R, Ruan X, Wang X, Zhang Q, Shao Y, Shen Y, et al. Comparative analysis of malnutrition diagnosis methods in lung cancer patients using a Bayesian latent class model. Asia Pac J Clin Nutr. (2022) 31:181–90. doi: 10.6133/apjcn.202206_31(2).0003

21. Kondrup J, Allison SP, Elia M, Vellas B, and Plauth M. ESPEN guidelines for nutrition screening 2002. Clin Nutr. (2003) 22:415–21. doi: 10.1016/s0261-5614(03)00098-0

22. Katz S, Ford AB, Moskowitz RW, Jackson BA, and Jaffe MW. Studies of illness in the aged. The index of ADL: a standardized measure of biological and psychosocial function. JAMA. (1963) 185:914–19. doi: 10.1001/jama.1963.03060120024016

23. Bjornholdt KT and Andersen CWG. Measurement of acute postoperative pain intensity in orthopedic trials: a qualitative concept elicitation study. Acta Orthop. (2024) 95:625–32. doi: 10.2340/17453674.2024.42182

24. Okada S, Shimomura M, Ishihara S, Ikebe S, Furuya T, and Inoue M. Clinical significance of postoperative pulmonary complications in elderly patients with lung cancer. Interact Cardiovasc Thorac Surg. (2022) 35(2):ivac153. doi: 10.1093/icvts/ivac153

25. Poldrack RA, Huckins G, and Varoquaux G. Establishment of best practices for evidence for prediction: a review. JAMA Psychiatry. (2020) 77:534–40. doi: 10.1001/jamapsychiatry.2019.3671

26. Jiang Y, Hu X, Heibi Y, Wu H, Deng T, and Jiang L. Exploring prognostic precision: a nomogram approach for malignant pleural effusion in lung cancer. BMC Cancer. (2025) 25:227. doi: 10.1186/s12885-025-13632-z

27. Li C, Lin Y, Xiao X, Guo X, Fei J, Lu Y, et al. Development and validation of a risk prediction model for kinesiophobia in postoperative lung cancer patients: an interpretable machine learning algorithm study. Sci Rep. (2025) 15:19412. doi: 10.1038/s41598-025-03575-7

28. Weimann A, Braga M, Carli F, Higashiguchi T, Hubner M, Klek S, et al. ESPEN practical guideline: clinical nutrition in surgery. Clin Nutr. (2021) 40:4745–61. doi: 10.1016/j.clnu.2021.03.031

29. Zhang F, He S, Zhang Y, Mu D, and Wang D. Comparison of two malnutrition assessment scales in predicting postoperative complications in elderly patients undergoing noncardiac surgery. Front Public Health. (2021) 9:694368. doi: 10.3389/fpubh.2021.694368

30. Chen Y, Liu T, Feng H, Liu T, Zhang J, Wang J, et al. The prognostic role of albumin levels in lung cancer patients receiving third-line or advanced immunotherapy: a retrospective study. Transl Lung Cancer Res. (2024) 13:1307–17. doi: 10.21037/tlcr-24-378

31. Yang J, Xu J, Chen G, Yu N, Yang J, Zeng D, et al. Post-diagnostic C-reactive protein and albumin predict survival in Chinese patients with non-small cell lung cancer: a prospective cohort study. Sci Rep. (2019) 9:8143. doi: 10.1038/s41598-019-44653-x

32. Kinoshita F, Tagawa T, Yamashita T, Takenaka T, Matsubara T, Toyokawa G, et al. Prognostic value of postoperative decrease in serum albumin on surgically resected early-stage non-small cell lung carcinoma: a multicenter retrospective study. PLoS One. (2021) 16:e0256894. doi: 10.1371/journal.pone.0256894

33. He Z, Zhou K, Tang K, Quan Z, Liu S, and Su B. Perioperative hypoalbuminemia is a risk factor for wound complications following posterior lumbar interbody fusion. J Orthop Surg Res. (2020) 15:538. doi: 10.1186/s13018-020-02051-4

34. Pass B, Malek F, Rommelmann M, Aigner R, Knauf T, Eschbach D, et al. The influence of malnutrition measured by hypalbuminemia and body mass index on the outcome of geriatric patients with a fracture of the proximal femur. Med (Kaunas). (2022) 58(11):1610. doi: 10.3390/medicina58111610

35. Cruz-Jentoft AJ and Volkert D. Malnutrition in older adults. N Engl J Med. (2025) 392:2244–55. doi: 10.1056/NEJMra2412275

36. Meure CM, Steer B, and Porter J. Nutritional intake, hospital readmissions and length of stay in hospitalised oncology patients. Cancers (Basel). (2023) 15(5):1488. doi: 10.3390/cancers15051488

37. Bellanti F, Lo Buglio A, Quiete S, and Vendemiale G. Malnutrition in hospitalized old patients: screening and diagnosis, clinical outcomes, and management. Nutrients. (2022) 14(4):910. doi: 10.3390/nu14040910

38. Trujillo EB, Kadakia KC, Thomson C, Zhang FF, Livinski A, Pollard K, et al. Malnutrition risk screening in adult oncology outpatients: an ASPEN systematic review and clinical recommendations. JPEN J Parenter Enteral Nutr. (2024) 48:874–94. doi: 10.1002/jpen.2688

39. Impact of malnutrition on early outcomes after cancer surgery: an international, multicentre, prospective cohort study. Lancet Glob Health. (2023) 11:e341–49. doi: 10.1016/S2214-109X(22)00550-2

40. Hersberger L, Bargetzi L, Bargetzi A, Tribolet P, Fehr R, Baechli V, et al. Nutritional risk screening (NRS 2002) is a strong and modifiable predictor risk score for short-term and long-term clinical outcomes: secondary analysis of a prospective randomised trial. Clin Nutr. (2020) 39:2720–29. doi: 10.1016/j.clnu.2019.11.041

41. Vella R, Pizzocaro E, Bannone E, Gualtieri P, Frank G, Giardino A, et al. Nutritional intervention for the elderly during chemotherapy: a systematic review. Cancers (Basel). (2024) 16(16):2809. doi: 10.3390/cancers16162809

42. Duan R, Li Q, Yuan QX, Hu J, Feng T, and Ren T. Predictive model for assessing malnutrition in elderly hospitalized cancer patients: a machine learning approach. Geriatr Nurs. (2024) 58:388–98. doi: 10.1016/j.gerinurse.2024.06.012

43. Couderc A, Tomasini P, Greillier L, Nouguerede E, Rey D, Montegut C, et al. Functional status in older patients with lung cancer: an observational cohort study. Support Care Cancer. (2022) 30:3817–27. doi: 10.1007/s00520-021-06752-2

44. Ferreira V, Lawson C, Ekmekjian T, Carli F, Scheede-Bergdahl C, and Chevalier S. Effects of preoperative nutrition and multimodal prehabilitation on functional capacity and postoperative complications in surgical lung cancer patients: a systematic review. Support Care Cancer. (2021) 29:5597–610. doi: 10.1007/s00520-021-06161-5

45. Muscaritoli M, Arends J, Bachmann P, Baracos V, Barthelemy N, Bertz H, et al. ESPEN practical guideline: clinical nutrition in cancer. Clin Nutr. (2021) 40:2898–913. doi: 10.1016/j.clnu.2021.02.005

46. Voorn MJJ, Driessen EJM, Reinders RJEF, van Kampen-van Den Boogaart VEM, Bongers BC, and Janssen-Heijnen MLG. Effects of exercise prehabilitation and/or rehabilitation on health-related quality of life and fatigue in patients with non-small cell lung cancer undergoing surgery: a systematic review. Eur J Surg Oncol. (2023) 49:106909. doi: 10.1016/j.ejso.2023.04.008

47. Kim CH, Park JE, Cha JG, Park J, Choi SH, Seo H, et al. Clinical predictors and outcomes of non-expandable lung following percutaneous catheter drainage in lung cancer patients with malignant pleural effusion. Medicine (Baltimore). (2023) 102:e34134. doi: 10.1097/MD.0000000000034134

48. Agrawal A, Chaddha U, Kaul V, Desai A, Gillaspie E, and Maldonado F. Multidisciplinary management of chylothorax. Chest. (2022) 162:1402–12. doi: 10.1016/j.chest.2022.06.012

49. Tomoyasu M, Deguchi H, Kudo S, Shigeeda W, Kaneko Y, Yoshimura R, et al. Evaluation of pulmonary artery bleeding during thoracoscopic pulmonary resection for lung cancer. Thorac Cancer. (2022) 13:3001–06. doi: 10.1111/1759-7714.14649

50. Jiang Q, Li F, Xu G, Ma L, Ni X, Wang Q, et al. A nomogram for predicting the risk of malnutrition in hospitalized older adults: a retrospective study. BMC Geriatr. (2025) 25:345. doi: 10.1186/s12877-025-05990-x

51. Kim MJ, Valerio C, and Knobloch GK. Potassium disorders: hypokalemia and hyperkalemia. Am Fam Physician. (2023) 107:59–70.

52. Dore I, Piche A, Montiel C, Lambert SD, Gillis C, Dufresne SS, et al. Multimodal group-based tele-prehabilitation for cancer patients and caregivers: a pragmatic multicentre hybrid implementation-effectiveness study protocol. Front Oncol. (2025) 15:15664. doi: 10.3389/fonc.2025.15664

Keywords: lung cancer, machine learning, malnutrition, SHAP, thoracoscopic resection

Citation: Chen T, Pan R, Liang L, Xu L, Yang M, Deng X and Wang P (2026) Machine learning model for predicting malnutrition risk in lung cancer patients after thoracoscopic resection: a multi-center study. Front. Oncol. 16:1727595. doi: 10.3389/fonc.2026.1727595

Received: 18 October 2025; Revised: 15 January 2026; Accepted: 23 January 2026; Published: 09 February 2026.

Edited by:

Xiangkui Li, Harbin University of Science and Technology, China

Reviewed by:

Guolong Zhang, The First Affiliated Hospital of Guangzhou Medical University, China
Natacha Usanase, Near East University, Cyprus

Copyright © 2026 Chen, Pan, Liang, Xu, Yang, Deng and Wang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Ping Wang, wangp@gdmu.edu.cn

†These authors have contributed equally to this work and share first authorship

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.