- Department of Hepatobiliary Surgery, The First Affiliated Hospital of Zhengzhou University, Zhengzhou University, Zhengzhou, China
Objective: Compare the performance of the Multivariable logistic regression (LR) model based on traditional statistical methods and the Random Forest (RF) model in machine learning for predicting clinically relevant postoperative pancreatic fistula (CR-POPF) after pancreatoduodenectomy (PD).
Background: CR-POPF is a common and severe complication following PD. Traditional statistical models are widely used to predict it, but the rise of machine learning has garnered attention for its potential in predictive medicine. Comparing the performance of traditional statistical methods and machine learning models provides insight into the optimal approach for CR-POPF prediction.
Methods: Clinical data from patients undergoing PD were collected. CR-POPF prediction models were developed using Multivariable LR and RF, and their predictive performance was compared using Calibration curves, ROC curves and DCA curves.
Results: In the calibration curve analysis, the Multivariable LR model showed better calibration than the RF model. The Multivariable LR model achieved an AUC of 0.96, while the RF model achieved an AUC of 0.90, indicating better discrimination by the Multivariable LR model. Decision curve analysis demonstrated that the Multivariable LR model provided higher net benefit than the RF model across most threshold ranges.
Conclusion: The Multivariable LR model outperformed the RF model in predicting CR-POPF after PD and can be considered the preferred method for CR-POPF risk assessment.
Introduction
PD is a commonly used surgical approach for treating benign and malignant diseases of the pancreatic head, distal common bile duct, and periampullary region (1, 2). One of its most dreaded complications is CR-POPF, which occurs in 10%–20% of patients and is associated with higher mortality, delayed gastric emptying, infections and bleeding, prolonged hospital stays, increased costs, and unplanned readmissions (3–5). Accurate prediction of CR-POPF remains a pressing challenge for clinicians. Precise prediction of CR-POPF facilitates risk stratification and the development of personalized treatment strategies for patients undergoing PD.
Machine-learning methods, especially ensemble algorithms such as the Random Forest, have recently been applied to high-throughput omics data because they can model complex nonlinear interactions that conventional LR may overlook (6–11). However, their performance on conventional clinical data has not been thoroughly evaluated. For CR-POPF, no study has quantified whether RF outperforms the widely used multivariable LR scores (11, 12). The original Fistula Risk Score, its alternative version, and several single-centre nomograms built with multivariable LR are widely used to predict CR-POPF; however, their discrimination (AUC = 0.70–0.85) declines in heterogeneous cohorts (13, 14). Other studies that developed CR-POPF prediction models using logistic regression incorporated fewer variables and assessed the models only superficially, so the clinical benefit remains uncertain (15, 16). Here, we construct both models in a 289-patient East-Asian cohort and compare their discrimination, calibration and decision-curve performance to determine the more clinically useful strategy for CR-POPF risk stratification.
Methods
This study was conducted in accordance with the Declaration of Helsinki and approved by the Ethics Committee of the First Affiliated Hospital of Zhengzhou University (Approval No.: 2024-KY-0532-001), with a waiver of informed consent. A retrospective analysis was conducted on the clinical data of patients who underwent PD at any campus of the First Affiliated Hospital of Zhengzhou University from January 2019 to March 2024; all patients received routine postoperative follow-up. The surgical approaches included OPD (Open Pancreaticoduodenectomy), LPD (Laparoscopic Pancreaticoduodenectomy), and RPD (Robotic Pancreaticoduodenectomy).
Inclusion criteria: (1) Patients undergoing PD for benign or malignant tumors around the ampulla. (2) Generally good physical health, specifically defined as: left ventricular ejection fraction ≥50%, with no myocardial infarction or unstable angina within 6 months; no stroke or transient ischemic attack within 6 months, and no uncontrolled epilepsy or cognitive impairment; no severe chronic obstructive pulmonary disease (GOLD stage III–IV), with forced expiratory volume in 1 s (FEV1) ≥60% of predicted; estimated glomerular filtration rate ≥60 ml/min/1.73 m², with no requirement for dialysis. (3) No systemic anti-tumor treatment before surgery, including but not limited to chemotherapy, radiotherapy, or immunotherapy. (4) Complete clinical and follow-up data available.
Exclusion criteria: (1) Vascular involvement. (2) Distant metastasis of the tumor. (3) Special intraoperative situations: Patients converted from laparoscopic or robotic surgery to open surgery. Patients undergoing combined resection of other complex organs (e.g., spleen, major vascular reconstruction). (4) Special populations: Pregnant or lactating women. Children (<18 years) or elderly patients (>80 years).
Included patients: After applying the inclusion and exclusion criteria and considering dataset balance, 289 patients were included. Of these, 137 patients were in the CR-POPF group and 152 patients were in the non-CR-POPF group.
The clinical variables included in our prediction model were selected based on previous literature and clinical experience (17, 18). Observation indicators: (1) Preoperative demographic characteristics of patients: age, gender, body mass index (BMI), smoking history, drinking history, preoperative jaundice status, history of heart disease, hypertension, diabetes, upper abdominal surgery, and ECOG score. (2) Preoperative laboratory test-related variables: preoperative blood sample levels (i.e., hemoglobin level, white blood cell count, neutrophil/lymphocyte ratio), plasma total bilirubin level, and related tumor markers, namely carbohydrate antigen 19-9 (CA19-9), cancer antigen 125 (CA125), and carcinoembryonic antigen (CEA). (3) Perioperative-related data: preoperative bile drainage, ASA score (American Society of Anesthesiologists), total operative time, surgical approach, estimated blood loss, intraoperative plasma transfusion volume, intraoperative red blood cell transfusion volume, and number of lymph nodes dissected. (4) Intraoperative evaluation of tumor location, pancreatic texture, and pancreatic duct diameter.
Selection of included variables: This study focused on preoperative and intraoperative factors affecting the occurrence of pancreatic fistula to construct a risk prediction model applicable during or before surgery, aiming to guide early clinical interventions. Therefore, only preoperative variables (e.g., age, gender, BMI) and intraoperative variables (e.g., pancreatic duct diameter, pancreatic texture, intraoperative blood loss, surgical duration) were included in the analysis. Postoperative variables (e.g., duration of intravenous analgesic use, amylase levels in drainage fluid) were not included in this study, as they are only available postoperatively, making them unsuitable for preoperative risk assessment and potentially a result rather than a cause of pancreatic fistula.
The classification of pancreatic fistula is based on the criteria proposed by the International Study Group of Pancreatic Surgery (ISGPS) (19). Based on clinical impact and required interventions, postoperative pancreatic fistula is classified into three grades. Biochemical leak: no clinical significance; observation only is required. Grade B fistula: requires additional treatment but poses no life-threatening risk. Grade C fistula: severe; requires urgent intervention or surgery and is life-threatening. Clinically relevant POPF (CR-POPF) includes Grade B and Grade C fistulas, while biochemical leak or the absence of a fistula is classified as non-clinically relevant pancreatic fistula (non-CR-POPF). Pancreatic texture was classified retrospectively from the original operative notes recorded by the attending surgeons. In these operative records, surgeons explicitly described the pancreatic texture as either soft or firm based on direct intraoperative palpation and subjective surgical judgement. No standardized instrument or quantitative measure was routinely employed for this assessment. Cases lacking clear documentation of pancreatic texture were excluded from analysis for consistency. Pancreatic duct diameter was defined as a binary variable: positive (pancreatic duct diameter ≥3 mm) indicates a relatively wide duct, while negative (pancreatic duct diameter <3 mm) indicates a narrower duct. This threshold is based on relevant literature and clinical experience (20) and is commonly used for assessing postoperative pancreatic fistula risk.
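The grading rule and the duct-diameter cut-off described above can be written as two small helper functions. This is a minimal Python sketch for illustration only; the function names and the grade labels ("BL", "NONE", "B", "C") are assumptions introduced here and are not taken from the study's data dictionary.

```python
def is_cr_popf(grade: str) -> bool:
    """Map a fistula grade label to the binary CR-POPF outcome.

    Labels are hypothetical: "B" and "C" count as CR-POPF,
    "BL" (biochemical leak) and "NONE" (no fistula) do not.
    """
    label = grade.strip().upper()
    if label in {"B", "C"}:
        return True
    if label in {"BL", "NONE"}:
        return False
    raise ValueError(f"unknown fistula grade: {grade!r}")


def duct_is_wide(diameter_mm: float) -> bool:
    """Binary duct variable: positive when the diameter is at least 3 mm."""
    return diameter_mm >= 3.0
```

Raising an error on an unrecognized label, rather than silently returning False, keeps undocumented grades from being counted as non-CR-POPF.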
Missing data were handled using multiple imputation. All imputation procedures were conducted using the mice package in R software (version 4.3.1, R Core Team, Vienna, Austria). Descriptive statistics were performed using SPSS software (Version 22, IBM Corp., Armonk, NY, USA) to compare the baseline characteristics of the CR-POPF group (137 cases) and the non-CR-POPF group (152 cases). Normally distributed continuous variables were expressed as mean ± standard deviation (SD) and compared using the independent-samples t-test. Non-normally distributed continuous variables were expressed as median (interquartile range, IQR), denoted as M (Q1, Q3), and compared using the Mann–Whitney U test. Categorical variables were expressed as frequencies and percentages and compared using the chi-square test or Fisher's exact test where appropriate. Univariate LR analysis was performed to identify potential risk factors for CR-POPF, with CR-POPF as the dependent variable and preoperative and intraoperative factors (e.g., age, sex, BMI, pancreatic duct diameter, and pancreatic texture) as independent variables. Results were reported as odds ratios (ORs) with corresponding 95% confidence intervals (CIs). A two-tailed P-value <0.05 was considered statistically significant, and variables demonstrating statistical significance were chosen for subsequent multivariable analysis.
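For a binary predictor, the odds ratio from a univariate logistic regression equals the cross-product ratio of the 2 × 2 table, and its Wald interval matches the log-based interval shown here. The following Python sketch illustrates that calculation; the function name and the cell counts in the usage note are hypothetical and are not data from this study.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI from a 2 x 2 table.

    a: exposed with event      b: exposed without event
    c: unexposed with event    d: unexposed without event
    All four cells must be non-zero for this formula.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR): square root of the summed reciprocals.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

With the made-up counts `odds_ratio_ci(30, 20, 15, 35)`, the point estimate is 3.5 and the interval excludes 1, which corresponds to P < 0.05 in a two-tailed Wald test.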
Multicollinearity test: To evaluate potential multicollinearity, the Variance Inflation Factor (VIF) was employed to analyze the selected variables (21). VIF values were calculated using a linear regression model (lm function), with a VIF value <10 indicating no severe multicollinearity between variables. The analysis was conducted in R software, primarily using the car package. Correlation analysis (22): Correlation analysis was conducted on the selected variables. The correlation matrix was calculated using Pearson correlation coefficients, and the analysis was also conducted in R software, with visualization primarily using the corrplot package.
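As a generic illustration of how VIF relates to the Pearson correlation matrix, the VIF of predictor j equals the j-th diagonal element of the inverse of the predictors' correlation matrix. The Python sketch below uses only the standard library; it is not the car-based procedure used in the study, and all function names are assumptions made for this example.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular (perfect collinearity)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """Return one VIF per predictor column."""
    k = len(columns)
    corr = [[pearson(columns[i], columns[j]) for j in range(k)]
            for i in range(k)]
    inverse = invert(corr)
    return [inverse[j][j] for j in range(k)]
```

With two predictors this reduces to 1 / (1 − r²), so a pairwise correlation of 0.8 gives a VIF of about 2.78, still well below the threshold of 10 mentioned above.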
Model Construction and Evaluation: To reduce overfitting and enhance the robustness of the results, a 10-fold cross-validation approach was employed for training and evaluating the models. Specifically, the data were randomly divided into 10 folds; in each iteration, one fold was designated as the validation set, while the remaining nine folds served as the training set. This procedure was repeated 10 times, and the average metric was reported as the final evaluation. Continuous variables were log-transformed if skewed and then standardized, categorical variables were one-hot encoded, and all preprocessing was executed within each cross-validation fold to avoid data leakage.
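The fold structure and the within-fold preprocessing described above can be sketched as follows. This is a standard-library Python illustration rather than the study's R code; the function names and the seed are assumptions, and only standardization of a single continuous variable is shown.

```python
import random
from statistics import mean, stdev


def kfold_indices(n, k=10, seed=0):
    """Shuffle 0..n-1 and deal the indices into k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def standardize_fold(values, train_idx, val_idx):
    """Scale both parts using the mean and SD of the training part only."""
    train = [values[i] for i in train_idx]
    mu, sd = mean(train), stdev(train)
    return ([(values[i] - mu) / sd for i in train_idx],
            [(values[i] - mu) / sd for i in val_idx])


def cv_splits(values, k=10, seed=0):
    """Yield (scaled_train, scaled_validation) for each of the k iterations."""
    folds = kfold_indices(len(values), k, seed)
    for held_out, val_idx in enumerate(folds):
        train_idx = [i for f, fold in enumerate(folds)
                     if f != held_out for i in fold]
        yield standardize_fold(values, train_idx, val_idx)
```

Because the mean and SD come from the nine training folds only, the held-out fold never influences its own scaling, which is the data-leakage safeguard the text refers to. With 289 patients and k = 10, nine folds hold 29 patients and one holds 28.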
Multivariable LR (23) was conducted to investigate the independent effects of the selected variables on the occurrence of CR-POPF. CR-POPF occurrence was set as the dependent variable, and variables with a P-value <0.05 in univariate analysis were used as independent variables. Results were expressed as ORs and their 95% CIs. To illustrate the results of the Multivariable LR analysis, this study used the forestplot package in R software to generate a forest plot. The plot showed each variable's OR and 95% CI, helping readers understand their independent roles in the occurrence of postoperative pancreatic fistula. To improve the interpretability of the forest plot, the X-axis was drawn on a base-10 logarithmic scale. This transformation compresses the plotted range for better visualization while leaving the reported OR and CI values unchanged.
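The relationship between a fitted logistic coefficient, its OR, and the log-scaled forest-plot axis can be illustrated in a few lines. This is a Python sketch under stated assumptions: the coefficient and standard error in the usage note are hypothetical values chosen for the example, not estimates from this study.

```python
import math


def coef_to_or(beta, se, z=1.96):
    """Convert a logistic coefficient and its SE to (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def or_per_k_units(or_per_unit, k):
    """Rescale a per-unit OR of a continuous predictor to a k-unit change."""
    return or_per_unit ** k


def forest_axis_position(value):
    """Position of an OR or CI bound on a base-10 logarithmic axis."""
    return math.log10(value)
```

For example, `coef_to_or(math.log(2), 0.3)` gives an OR of 2 with bounds that are asymmetric on the raw scale but symmetric around the point estimate after `forest_axis_position`, which is why a log axis is a natural choice for forest plots. The helper `or_per_k_units` shows why a per-unit OR close to 1 for a continuous marker can still correspond to a sizeable effect over a wide range of values.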
The RF (24) was constructed using 500 decision trees (ntree = 500), with 3 variables randomly selected (mtry = 3) for node splitting, to evaluate the importance of each variable. Variable importance was measured by the Mean Decrease Accuracy (MDA), reflecting the contribution of each variable to the model's predictive performance. The analysis was performed using the randomForest package in R software. To visually present feature importance, bar charts were created using the ggplot2 package, showing the ranking of variable importance.
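The idea behind Mean Decrease Accuracy can be illustrated with a simplified permutation test: shuffle one predictor, re-score the model, and record how much accuracy drops. The Python sketch below is an assumption-laden simplification; the randomForest package computes this on out-of-bag samples tree by tree, whereas this version permutes a column across one evaluation set for any prediction function. All names are hypothetical.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted class matches the label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(labels)


def permutation_importance(predict, rows, labels, col, n_repeats=20, seed=0):
    """Mean drop in accuracy after shuffling column `col` across rows."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    drops = []
    for _ in range(n_repeats):
        shuffled = [r[col] for r in rows]
        rng.shuffle(shuffled)
        # Rebuild each row with only the chosen column replaced.
        permuted = [r[:col] + [v] + r[col + 1:]
                    for r, v in zip(rows, shuffled)]
        drops.append(baseline - accuracy(predict, permuted, labels))
    return sum(drops) / n_repeats
```

A predictor the model never uses yields an importance of exactly zero, because shuffling it cannot change any prediction; a predictor the model relies on yields a positive drop.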
To assess the agreement between predicted probabilities and actual event occurrence, this study employed calibration curve analysis (25) for both the Multivariable LR and RF models. Specifically, predicted probabilities were grouped into equal-sized bins, and each bin's mean predicted probability was plotted against the corresponding observed proportion of events. A diagonal line (y = x) represented perfect calibration, enabling direct visual comparison of how closely each model's predictions matched reality. The plots were generated in R using ggplot2, with larger deviations from the diagonal indicating poorer calibration performance.
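The binning step described above can be sketched as follows. This is a minimal Python illustration, assuming equal-sized bins formed after sorting by predicted probability; the function name and the default of 10 bins are assumptions, since the number of bins is not stated in the text.

```python
def calibration_bins(probs, outcomes, n_bins=10):
    """Return (mean predicted probability, observed event rate) per bin.

    Predictions are sorted and cut into n_bins groups of near-equal size;
    outcomes are coded 1 for an event and 0 otherwise.
    """
    pairs = sorted(zip(probs, outcomes))
    n = len(pairs)
    points = []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue  # more bins than observations
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        points.append((mean_pred, observed))
    return points
```

Each returned point is plotted against the diagonal y = x; points below the line indicate over-prediction of risk in that bin and points above it indicate under-prediction.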
To evaluate the predictive performance of the models, Receiver Operating Characteristic (ROC) curve analysis (26) was used to compare the classification accuracy of the Multivariable LR model and the RF model. The ROC curves were drawn by matching predicted probabilities with actual pancreatic fistula outcomes, and the area under the curve (AUC) was calculated using the pROC package in R software to measure the discriminatory power of the models. The ROC curves were visualized using the ggplot2 package, with the AUC displayed as annotations on the plot. Model performance was compared based on AUC values, with higher values closer to 1 indicating better predictive accuracy.
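The AUC has a direct probabilistic reading: it is the proportion of (event, non-event) pairs in which the event case received the higher predicted probability, with ties counted as one half. The Python sketch below computes it that way; it is an illustration of the quantity reported by pROC, not a reproduction of that package, and the function name is an assumption.

```python
def auc(probs, outcomes):
    """Area under the ROC curve via pairwise comparison of predictions."""
    pos = [p for p, y in zip(probs, outcomes) if y == 1]
    neg = [p for p, y in zip(probs, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

The pairwise loop is quadratic in the sample size, which is unproblematic for a cohort of a few hundred patients.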
To further evaluate the clinical utility of the Multivariable LR model and the RF model in predicting CR-POPF, this study employed the Decision Curve Analysis (DCA) (27) method to calculate the Net Benefit of the two models across different threshold probabilities. DCA was performed using the dcurves package with a threshold probability range of 0–1. DCA evaluated the clinical utility of the predictive models by comparing the net benefits of different strategies, such as Treat None and Treat All.
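Net benefit at a threshold probability pt is the true-positive rate per patient minus the false-positive rate per patient weighted by pt / (1 − pt). The Python sketch below shows the calculation for a model and for the Treat All strategy (Treat None is zero by definition). Function names are assumptions, and treating a prediction equal to the threshold as positive is a convention chosen for this example.

```python
def net_benefit(probs, outcomes, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    if not 0 <= threshold < 1:
        # At threshold = 1 the weight pt / (1 - pt) is undefined.
        raise ValueError("threshold must be in [0, 1)")
    n = len(outcomes)
    tp = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, outcomes) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(outcomes, threshold):
    """Net benefit when every patient is treated regardless of prediction."""
    if not 0 <= threshold < 1:
        raise ValueError("threshold must be in [0, 1)")
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)
```

Evaluating both functions over a grid of thresholds produces the curves that are compared in a decision curve plot; a model is useful at a given threshold only if its net benefit exceeds both Treat All and zero.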
Results
Baseline characteristics of patients
A total of 289 patients were included in this study, with 137 cases in the CR-POPF group and 152 cases in the non-CR-POPF group. The baseline characteristics of the two groups were compared (Table 1). Demographics and Preoperative Variables: There was no significant difference in gender distribution (P = 0.21), age distribution (P = 0.94), or surgical approach (P = 0.84) between the CR-POPF and non-CR-POPF groups. However, patients in the CR-POPF group had a significantly higher BMI compared to the non-CR-POPF group (P = 0.02). The prevalence of preoperative jaundice was significantly higher in the CR-POPF group (45.3%) compared to the non-CR-POPF group (32.9%) (P = 0.03). Smoking and drinking histories showed no statistical differences between the two groups (P > 0.05). Comorbidities, including hypertension, diabetes, and cardiovascular disease, were similarly distributed between the groups (P > 0.05). Laboratory and Tumor-Related Variables: Serum CA19-9 levels were significantly higher in the CR-POPF group compared to the non-CR-POPF group (P = 0.01), suggesting an association with an increased risk of CR-POPF. However, levels of CA125, CEA, and Plasma total bilirubin did not differ significantly between the groups (P > 0.05). Intraoperative Characteristics: Key intraoperative variables, including estimated blood loss and the number of lymph nodes dissected, were comparable between the two groups (P > 0.05). However, pancreatic texture and pancreatic duct diameter showed significant differences. Patients with a soft pancreatic texture were more likely to develop CR-POPF (P < 0.001). Similarly, a smaller pancreatic duct diameter (<3 mm) was strongly associated with CR-POPF (P < 0.001).

Table 1. Comparison of baseline, clinical, and intraoperative characteristics between CR-POPF and non-CR-POPF groups.
Univariate LR analysis
Univariate LR analysis was performed to evaluate the relationship between various preoperative and intraoperative factors and the occurrence of CR-POPF. The dependent variable was defined as the presence of CR-POPF, while each individual variable was analyzed as an independent variable. The results are summarized in Table 2. Higher BMI was significantly associated with an increased risk of CR-POPF (OR = 1.68, 95% CI: 1.28–2.19, P < 0.001). Patients with preoperative jaundice had a higher likelihood of developing CR-POPF (OR = 3.06, 95% CI: 1.03–9.07, P = 0.044). Soft pancreatic texture was strongly correlated with an elevated risk of CR-POPF (OR = 0.03, 95% CI: 0.01–0.08, P < 0.001). Smaller pancreatic duct diameter was associated with a significantly higher risk of CR-POPF (OR = 0.06, 95% CI: 0.016–0.19, P < 0.001). Elevated CA19-9 levels were significantly linked to an increased risk of CR-POPF (OR = 1.01, 95% CI: 1.01–1.01, P = 0.001).
Multicollinearity assessment
To evaluate potential multicollinearity among the selected independent variables (jaundice, CA19-9, pancreatic texture, and pancreatic duct diameter), the VIF was calculated using a linear regression model. The results indicated that all VIF values were below the commonly accepted threshold of 10, suggesting no severe multicollinearity in the model. The specific VIF values were as follows: Jaundice: VIF = 1.28, CA19-9: VIF = 1.27, Pancreatic Texture: VIF = 1.01, Pancreatic Duct Diameter: VIF = 1.01. It should be noted that BMI was used as the dependent variable in this linear regression model to compute the VIF values for the remaining independent variables (jaundice, CA19-9, pancreatic texture, and pancreatic duct diameter). Therefore, BMI itself does not have a corresponding VIF value in this analysis. Additionally, the correlation matrix analysis revealed that the correlation coefficients among variables were generally low, with absolute values ranging from 0.001 to 0.459. The results of the correlation matrix analysis are presented in Figure 1. Together, the VIF and correlation matrix analyses indicated that the five selected variables (BMI, jaundice, CA19-9, pancreatic texture, and pancreatic duct diameter) had low levels of multicollinearity, as all VIF values were less than 1.5 and all correlation coefficients were under 0.8. These variables were therefore sufficiently independent of one another to be entered as independent variables in the Multivariable model for further analysis.

Figure 1. Correlation heatmap of risk factors for clinically relevant postoperative pancreatic fistula.
Multivariable LR
Multivariable LR analysis identified pancreatic duct diameter, pancreatic texture, CA19-9 levels, and BMI as significant independent predictors of CR-POPF (Figure 2). A smaller pancreatic duct diameter (OR = 0.02, 95% CI: 0.01–0.05, P = 0.004) and soft pancreatic texture (OR = 0.03, 95% CI: 0.02–0.10, P = 0.005) were strongly associated with an increased risk of CR-POPF. Elevated CA19-9 levels (OR = 1.03, 95% CI: 1.01–1.05, P = 0.010) and higher BMI (OR = 2.21, 95% CI: 1.21–4.03, P = 0.010) also significantly increased the risk. However, preoperative jaundice was not found to be a significant predictor (OR = 2.24, 95% CI: 0.21–9.56, P = 0.508). These findings suggest that pancreatic duct and texture characteristics, along with CA19-9 levels and BMI, are critical factors for predicting CR-POPF.

Figure 2. Multivariable LR analysis of factors associated with clinically relevant postoperative pancreatic fistula.
RF
The RF model identified pancreatic duct diameter, pancreatic texture, and CA19-9 levels as the most important predictors of CR-POPF (Figure 3), with pancreatic duct diameter showing the highest importance. BMI also demonstrated moderate predictive importance, while jaundice contributed the least to the model's performance. These results underscore the critical role of pancreatic anatomy and tumor markers in accurately predicting CR-POPF risk.

Figure 3. Variable importance plot from the RF model for predicting clinically relevant postoperative pancreatic fistula.
Model performance evaluation
The model performance evaluation of the multivariable LR model and the RF model is shown in Table 3. Although the RF model demonstrated reasonable discrimination and calibration, the LR model outperformed it across most metrics, suggesting a preferable balance between discrimination and calibration for CR-POPF prediction in this dataset.
Calibration curve
The blue solid line represents the original calibration curve of the Multivariable LR model (Figure 4). Overall, it is close to the diagonal, indicating that the predicted probabilities from multivariable LR align well with the actual incidence. There is a slight deviation in the range where the actual occurrence probability is below 0.2, but the overall trend remains stable. In the high-probability region (greater than 0.5), it closely matches the diagonal. The red solid line represents the calibration curve of the RF model. The RF shows large fluctuations in its predictions when the actual occurrence probability is below 0.2. Within the 0.2–0.5 probability band, the RF typically underestimates the actual incidence. In contrast, for probabilities above 0.5, the RF's calibration is close to the diagonal. Overall, Multivariable LR demonstrates superior calibration compared to RF, especially in the medium-to-high probability range, where its predictions align more closely with actual incidence. Meanwhile, the RF is unstable in the low-probability range.

Figure 4. Comparison of the calibration curves for the predictive performance of the multivariable LR and RF models.
Receiver operating characteristic
The predictive performance of the Multivariable LR model and the RF model was evaluated using ROC curve analysis (Figure 5). The Multivariable LR model achieved an AUC of 0.96 (95% CI: 0.93–0.99) and an accuracy of 0.87, indicating excellent discriminatory power. The RF model achieved an AUC of 0.90 (95% CI: 0.79–0.99) and an accuracy of 0.81, also reflecting strong predictive ability. The LR model therefore had a higher AUC and accuracy than the RF model, although the two confidence intervals overlap. Both models displayed robust performance, supporting their utility in clinical risk prediction.

Figure 5. Receiver operating characteristic (ROC) curves comparing the predictive performance of LR and RF models.
Decision curve analysis
DCA was performed to evaluate the clinical utility of the Multivariable LR model and the RF model across a range of threshold probabilities (Figure 6). Both models demonstrated net benefit across clinically relevant thresholds compared to the “Treat All” and “Treat None” strategies. The Multivariable LR model consistently provided a higher net benefit compared to the RF model, particularly within the threshold probability range of 0.1–0.6. Beyond this range, both models showed comparable performance, maintaining clinical utility. These findings indicate that the Multivariable LR model offers better practical value for risk prediction and clinical decision-making in the context of CR-POPF.

Figure 6. Comparison of the decision curve analysis (DCA) curves for the clinical utility of multivariable LR and RF models.
Discussion
This study identified pancreatic duct diameter, pancreatic texture, CA19-9 levels, and BMI as key independent risk factors for CR-POPF. Among these, pancreatic duct diameter and pancreatic texture were found to have the most significant influence on CR-POPF risk, consistent with prior research emphasizing the anatomical and physiological characteristics of the pancreas as critical determinants (28, 29). Elevated CA19-9 levels and higher BMI also showed a strong correlation with CR-POPF, reinforcing the role of tumor markers and patient-related factors in surgical outcomes.
The Multivariable LR model demonstrated superior predictive performance compared to the RF model, achieving better calibration, a higher area under the curve (AUC) and greater clinical utility in decision curve analysis (DCA). While RF, as a machine learning algorithm, offers the advantage of handling complex variable interactions and high-dimensional datasets, its application to conventional clinical data involving a limited number of variables appears less optimal. LR, on the other hand, benefits from its simplicity, interpretability, and suitability for smaller datasets, making it more practical for clinical applications.
Several factors may explain why the LR model outperformed the RF model in our study. First, the sample size was relatively small, which may have limited the ability of the RF to capture complex interactions between variables. RF algorithms generally require large datasets to fully utilize their ability to model complex nonlinear relationships. Second, the predictor variables selected in our study primarily exhibited strong linear or near-linear associations with the outcome, as supported by both univariate and Multivariable LR analyses. The lack of complex non-linear patterns within the data may have favored the performance of LR. Third, the relatively limited number of predictor variables, coupled with low multicollinearity as confirmed by VIF analysis, likely reduced the necessity for complex ensemble methods, further enhancing the suitability of LR in this context. These data characteristics are well-aligned with the strengths of LR, explaining its superior performance in this study.
A smaller pancreatic duct diameter may lead to impaired pancreatic juice drainage, elevated intrapancreatic pressure, and an increased risk of anastomotic leakage (30). Furthermore, a smaller pancreatic duct diameter increases the complexity of the anastomosis, potentially resulting in loose sutures or incomplete duct-to-intestine anastomosis, causing pancreatic juice leakage (31). Soft pancreatic tissue is relatively loose and lacks fibrous support, making it difficult to secure during suturing (19), which compromises the stability of the anastomosis and increases the likelihood of postoperative pancreatic fluid leakage. Related research indicates that patients with soft pancreatic texture have a higher volume of pancreatic fluid secretion and increased enzyme activity (32). If these fluids leak, they can cause enzymatic autodigestion of the surrounding tissues, resulting in pancreatic fistula. Clinically, pancreatic texture is influenced by underlying pathological conditions. A soft pancreatic parenchyma is commonly observed in patients with a normal pancreas or in those with periampullary tumors without significant pancreatic duct obstruction. In contrast, a firm pancreatic texture is typically associated with chronic pancreatitis or pancreatic ductal adenocarcinoma, where longstanding inflammation or tumor-induced desmoplastic reaction results in extensive fibrosis and parenchymal atrophy (33, 34). Therefore, the underlying disease processes may significantly affect the risk of postoperative CR-POPF by altering the mechanical properties of the pancreas. High levels of CA19-9 are usually indicative of more severe pancreatic inflammation or tumor burden, such as pancreatic cancer or biliary obstruction (35–37). These pathological conditions may weaken pancreatic tissue, making postoperative pancreatic fistula more likely. BMI is recognized as a reliable indicator of protein-calorie malnutrition and obesity.
One study found that BMI is correlated with CR-POPF, possibly because a higher BMI is associated with increased fat deposition in the pancreas (38), which softens and weakens the pancreatic texture, increasing the difficulty of pancreatoenteric anastomosis and the risk of CR-POPF. Elis (39) also demonstrated that BMI ≥30 kg/m² is a risk factor for CR-POPF. Studies by Le Bian (40) and Zou (41) found that BMI ≥25 kg/m² is a risk factor for CR-POPF. Although preoperative jaundice status was not statistically significant in the Multivariable LR and RF analyses, related studies indicate a strong link between preoperative jaundice and CR-POPF. Research has shown that a preoperative total bilirubin (TB) level >250 μmol/L warrants biliary drainage to reduce bilirubin levels, thereby significantly decreasing the occurrence of CR-POPF. Chen et al. (42), through a multicenter retrospective analysis of 1,465 patients undergoing PD, concluded that preoperative biliary drainage decreases the risk of CR-POPF, underscoring the impact of preoperative TB levels on CR-POPF incidence. Research conducted by Xi Yiqing and Shen et al. (43, 44) similarly concluded that elevated preoperative serum bilirubin levels are a risk factor for postoperative pancreatic fistula.
This study presents several novel contributions to the field. First, it systematically compares the predictive performance of a multivariable LR model and an RF model for CR-POPF in an East Asian cohort. Second, it provides a multidimensional evaluation of model performance, covering discrimination, calibration, and decision-curve analysis. Third, it employs internal validation by 10-fold cross-validation to enhance the reliability of the results. Finally, by focusing on routine preoperative and intraoperative variables, our study developed a clinically applicable model that can support real-time risk stratification and personalized perioperative management of patients undergoing PD. These features contribute to the methodological rigor and practical significance of our findings. Using the LR-based risk score, patients can be stratified into low and high CR-POPF-risk categories, allowing peri-operative management to be individualized. Preoperatively, patients predicted to be at high risk of CR-POPF may receive biliary drainage and metabolic optimisation. Intraoperatively, surgical modifications such as reinforced suturing can be considered for patients predicted to be at high risk. Postoperatively, extending the duration of drainage and providing anti-infective therapy can be considered for high-risk patients.
While this study provides valuable insights into the risk factors for CR-POPF, the relatively small sample size and limited scope of analyzed variables may constrain the generalizability of the findings. External validation was not performed in this study because of the limited sample size available. Therefore, the generalizability and real-world predictive performance of our models remain to be confirmed by future external validation studies. Additionally, the retrospective design may introduce unmeasured confounding, which could affect the internal validity of the findings. Further large-scale, prospective studies are warranted to validate these results. Future research should explore the integration of novel biomarkers and advanced imaging techniques for more precise preoperative risk assessment. The application of artificial intelligence in the prediction and management of CR-POPF also warrants further investigation.
Conclusion
This study identified pancreatic duct diameter, pancreatic texture, CA19-9 levels, and BMI as key risk factors for CR-POPF. The Multivariable LR model demonstrated better predictive performance and greater clinical utility compared to the RF model, as confirmed by Calibration curve analysis, Receiver Operating Characteristic and Decision Curve Analysis. These findings highlight the importance of incorporating anatomical, biochemical, and clinical factors into risk assessments to enhance surgical outcomes.
Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.
Ethics statement
The studies involving humans were approved by the Ethics Committee of the First Affiliated Hospital of Zhengzhou University. The studies were conducted in accordance with the local legislation and institutional requirements. The ethics committee/institutional review board waived the requirement of written informed consent for participation from the participants or the participants’ legal guardians/next of kin because this research was conducted retrospectively.
Author contributions
KZ: Visualization, Writing – original draft, Methodology, Investigation, Data curation, Software, Writing – review & editing, Conceptualization. KC: Resources, Funding acquisition, Writing – review & editing.
Funding
The author(s) declare that financial support was received for the research and/or publication of this article. The project was supported by the Henan Medical Science and Technology Research Project (SBGJ202103071).
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declare that no Generative AI was used in the creation of this manuscript.
Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Abbreviations
CR-POPF, clinically relevant postoperative pancreatic fistula; PD, pancreatoduodenectomy; ROC, receiver operating characteristic; AUC, area under the curve; DCA, decision curve analysis; BMI, body mass index; ASA, American Society of Anesthesiologists; ECOG, Eastern Cooperative Oncology Group; VIF, variance inflation factor; CA19-9, carbohydrate antigen 19-9; CEA, carcinoembryonic antigen; ISGPS, International Study Group of Pancreatic Surgery; OPD, open pancreaticoduodenectomy; LPD, laparoscopic pancreaticoduodenectomy; RPD, robotic pancreaticoduodenectomy; SD, standard deviation; IQR, interquartile range; MDA, mean decrease accuracy.
References
1. Wang M, Peng B, Liu J, Yin X, Tan Z, Liu R, et al. Practice patterns and perioperative outcomes of laparoscopic pancreaticoduodenectomy in China: a retrospective multicenter analysis of 1029 patients. Ann Surg. (2021) 273(1):145–53. doi: 10.1097/SLA.0000000000003190
2. Clancy TE, Ashley SW. Pancreaticoduodenectomy (Whipple operation). Surg Oncol Clin N Am. (2005) 14(3):533–52, vii. doi: 10.1016/j.soc.2005.05.006
3. Torres O, Moraes-Junior J, Fernandes E, Hackert T. Surgical management of postoperative grade C pancreatic fistula following pancreatoduodenectomy. Visc Med. (2022) 38(4):233–42. doi: 10.1159/000521727
4. Vollmer CM Jr, Sanchez N, Gondek S, McAuliffe J, Kent TS, Christein JD, et al. A root-cause analysis of mortality following major pancreatectomy. J Gastrointest Surg. (2012) 16(1):89–102; discussion 102–3. doi: 10.1007/s11605-011-1753-x
5. Casciani F, Bassi C, Vollmer CM. Decision points in pancreatoduodenectomy: insights from the contemporary experts on prevention, mitigation, and management of postoperative pancreatic fistula. Surgery. (2021) 170(3):889–909. doi: 10.1016/j.surg.2021.02.064
7. Enzer NA, Chiles J, Mason S, Shirahata T, Castro V, Regan E, et al. Proteomics and machine learning in the prediction and explanation of low pectoralis muscle area. Sci Rep. (2024) 14(1):17981. doi: 10.1038/s41598-024-68447-y
8. Hornung R, Wright MN. Block forests: random forests for blocks of clinical and omics covariate data. BMC Bioinformatics. (2019) 20(1):358. doi: 10.1186/s12859-019-2942-y
9. Shehab M, Abualigah L, Shambour Q, Abu-Hashem MA, Shambour MK, Alsalibi AI, et al. Machine learning in medical applications: a review of state-of-the-art methods. Comput Biol Med. (2022) 145:105458. doi: 10.1016/j.compbiomed.2022.105458
10. Ahsan MM, Luna SA, Siddique Z. Machine-learning-based disease diagnosis: a comprehensive review. Healthcare. (2022) 10(3):541. doi: 10.3390/healthcare10030541
11. Mungroop TH, van Rijssen LB, van Klaveren D, Smits FJ, Van Woerden V, Linnemann RJ, et al. Alternative fistula risk score for pancreatoduodenectomy (a-FRS): design and international external validation. Ann Surg. (2019) 269(5):937–43. doi: 10.1097/SLA.0000000000002620
12. Li Y, Zhou F, Zhu DM, Yang J, Yao J, Wei YJ, et al. Novel risk scoring system for prediction of pancreatic fistula after pancreaticoduodenectomy. World J Gastroenterol. (2019) 25(21):2650–64. doi: 10.3748/wjg.v25.i21.2650
13. He C, Zhang Y, Li L, Zhao M, Wang C, Tang Y. Risk factor analysis and prediction of postoperative clinically relevant pancreatic fistula after distal pancreatectomy. BMC Surg. (2023) 23:5. doi: 10.1186/s12893-023-01907-w
14. Li B, Pu N, Chen Q, Mei Y, Wang D, Jin D, et al. Comprehensive diagnostic nomogram for predicting clinically relevant postoperative pancreatic fistula after pancreatoduodenectomy. Front Oncol. (2021) 11:717087. doi: 10.3389/fonc.2021.717087
15. Ouyang L, Liu RD, Ren YW, Nie G, He TL, Li G, et al. Nomogram predicts CR-POPF in open central pancreatectomy patients with benign or low-grade malignant pancreatic neoplasms. Front Oncol. (2022) 12:1030080. doi: 10.3389/fonc.2022.1030080
16. Wang GQ, Yadav DK, Jiang W, Hua YF, Lu C. Risk factors for clinically relevant postoperative pancreatic fistula (CR-POPF) after distal pancreatectomy: a single center retrospective study. Can J Gastroenterol Hepatol. (2021) 2021:8874504. doi: 10.1155/2021/8874504
17. Yang F, Windsor JA, Fu DL. Optimizing prediction models for pancreatic fistula after pancreatectomy: current status and future perspectives. World J Gastroenterol. (2024) 30:1329–45. doi: 10.3748/wjg.v30.i10.1329
18. Ashraf Ganjouei A, Romero-Hernandez F, Wang JJ, Casey M, Frye W, Hoffman D, et al. A machine learning approach to predict postoperative pancreatic fistula after pancreaticoduodenectomy using only preoperatively known data. Ann Surg Oncol. (2023) 30:7738–47. doi: 10.1245/s10434-023-14041-x
19. Bassi C, Marchegiani G, Dervenis C, Sarr M, Hilal MA, Adham M, et al. The 2016 update of the international study group (ISGPS) definition and grading of postoperative pancreatic fistula: 11 years after. Surgery. (2017) 161(3):584. doi: 10.1016/j.surg.2016.11.014
20. Shen J, Guo F, Sun Y, Zhao J, Hu J, Ke Z, et al. Predictive nomogram for postoperative pancreatic fistula following pancreaticoduodenectomy: a retrospective study. BMC Cancer. (2021) 21(1):550. doi: 10.1186/s12885-021-08201-z
21. James G, Witten D, Hastie T, Tibshirani R. An Introduction to Statistical Learning. New York: Springer (2013).
22. Franzese M, Iuliano A. Correlation analysis. In: Ranganathan S, Gribskov M, Nakai K, editors. Encyclopedia of Bioinformatics and Computational Biology. Amsterdam: Elsevier (2019). p. 706–21.
23. Glonek GFV, McCullagh P. Multivariate logistic models. J R Stat Soc Series B Stat Methodol. (1995) 57(3):533. doi: 10.1111/j.2517-6161.1995.tb02046.x
25. Huang Y, Li W, Macheret F, Gabriel RA, Ohno-Machado L. A tutorial on calibration measurements and calibration models for clinical prediction models. J Am Med Inform Assoc. (2020) 27(4):621–33. doi: 10.1093/jamia/ocz228
26. Eng J. Receiver operating characteristic analysis: a primer. Acad Radiol. (2005) 12(7):909. doi: 10.1016/j.acra.2005.04.005
27. Fitzgerald M, Saville BR, Lewis RJ. Decision curve analysis. J Am Med Assoc. (2015) 313(4):409. doi: 10.1001/jama.2015.37
28. Zhang B, Yuan Q, Li S, Xu Z, Chen X, Li L, et al. Risk factors of clinically relevant postoperative pancreatic fistula after pancreaticoduodenectomy: a systematic review and meta-analysis. Medicine. (2022) 101(26):e29757. doi: 10.1097/MD.0000000000029757
29. Schuh F, Mihaljevic AL, Probst P, Trudeau MT, Müller PC, Marchegiani G, et al. A simple classification of pancreatic duct size and texture predicts postoperative pancreatic fistula: a classification of the international study group of pancreatic surgery. Ann Surg. (2023) 277(3):e597–608. doi: 10.1097/SLA.0000000000004855
30. Yang JX, Ye SY, Dai D. Risk factors and preventive measures for postoperative pancreatic fistula after pancreaticoduodenectomy. World Chin J Dig. (2020) 28:914–9. doi: 10.11569/wcjd.v28.i18.914
31. Barreto SG, Shukla PJ. Different types of pancreatico-enteric anastomosis. Transl Gastroenterol Hepatol. (2017) 2:89. doi: 10.21037/tgh.2017.11.02
32. Wellner UF, Kayser G, Lapshyn H, Sick O, Makowiec F, Höppner J, et al. A simple scoring system based on clinical factors related to pancreatic texture predicts postoperative pancreatic fistula preoperatively. HPB. (2010) 12(10):696. doi: 10.1111/j.1477-2574.2010.00239.x
33. Nikolaidis P, Hammond NA, Day K, Yaghmai V, Wood CG, Mosbach DS, et al. Imaging features of benign and malignant ampullary and periampullary lesions. Radiographics. (2014) 34:624–41. doi: 10.1148/rg.343125191
34. Kalayarasan R, Himaja M, Ramesh A, Kokila K. Radiological parameters to predict pancreatic texture: current evidence and future perspectives. World J Radiol. (2023) 15:170–81. doi: 10.4329/wjr.v15.i6.170
35. Bhandare MS, Varty GP, Reddy Obili RC, Chopde A, Pawar A, Krishnakumar K, et al. CA 19-9 surveillance detects recurrences early and contributes to improvement in survival in resected ampullary cancers: analysis of 572 cases. Ann Surg. (2024). doi: 10.1097/SLA.0000000000006419
36. Yasui K, Yoshida R, Umeda Y, Kuise T, Yoshida K, Takagi K, et al. Sustained elevation of CA19-9 after resection is a strong prognostic factor for resectable pancreatic cancer. HPB. (2021) 23:S254. doi: 10.1016/j.hpb.2020.11.640
37. Izumo W, Higuchi R, Furukawa T, Yazawa T, Uemura S, Shiihara M, et al. Evaluation of preoperative prognostic factors in patients with resectable pancreatic ductal adenocarcinoma. Scand J Gastroenterol. (2019) 54(6):780. doi: 10.1080/00365521.2019.1624816
38. Nishida Y, Kato Y, Kudo M, Aizawa H, Okubo S, Takahashi D, et al. Preoperative sarcopenia strongly influences the risk of postoperative pancreatic fistula formation after pancreaticoduodenectomy. J Gastrointest Surg. (2016) 20(9):1586. doi: 10.1007/s11605-016-3146-7
39. Ellis RJ, Brock Hewitt D, Liu JB, Cohen ME, Merkow RP, Bentrem DJ, et al. Preoperative risk evaluation for pancreatic fistula after pancreaticoduodenectomy. J Surg Oncol. (2019) 119(8):1128. doi: 10.1002/jso.25464
40. Zarzavadjian Le Bian A, Fuks D, Montali F, Cesaretti M, Costi R, Wind P, et al. Predicting the severity of pancreatic fistula after pancreaticoduodenectomy: overweight and blood loss as independent risk factors: retrospective analysis of 277 patients. Surg Infect. (2019) 20(6):486. doi: 10.1089/sur.2019.027
41. Zou SY, Wang WS, Zhan Q, Deng XX, Shen BY. Higher body mass index deteriorates postoperative outcomes of pancreaticoduodenectomy. Hepatobiliary Pancreat Dis Int. (2020) 19(2):163. doi: 10.1016/j.hbpd.2019.11.007
42. Chen H, Wang W, Ying X, Deng X, Peng C, Cheng D, et al. Predictive factors for postoperative pancreatitis after pancreaticoduodenectomy: a single-center retrospective analysis of 1465 patients. Pancreatology. (2020) 20(2):211. doi: 10.1016/j.pan.2019.11.014
43. Xi YQ, Wei X, Yang ZS, Wang HQ, Wang X, Yang TC, et al. A meta-analysis of the risk factor of pancreatic fistula after pancreaticoduodenectomy. Chin J Exp Surg. (2019) 36:1857–60. doi: 10.3760/cma.j.issn.1001-9030.2019.10.037
44. Shen Z, Zhang J, Zhao S, Zhou Y, Wang W, Shen B. Preoperative biliary drainage of severely obstructive jaundiced patients decreases overall postoperative complications after pancreaticoduodenectomy: a retrospective and propensity score-matched analysis. Pancreatology. (2020) 20(3):529. doi: 10.1016/j.pan.2020.02.002
Keywords: postoperative pancreatic fistula, pancreatoduodenectomy, surgical complications, random forest, machine learning
Citation: Zhang K and Chen K (2025) Risk prediction and clinical utility analysis of postoperative pancreatic fistula: a comparative study of multivariable logistic regression and random forest models. Front. Surg. 12:1596224. doi: 10.3389/fsurg.2025.1596224
Received: 19 March 2025; Accepted: 28 May 2025; Published: 13 June 2025.
Edited by:
Shuying Chen, Brigham and Women’s Hospital and Harvard Medical School, United States
Reviewed by:
Lixia Wang, Cedars Sinai Medical Center, United States
Christian Cotsoglou, IRCCS San Gerardo dei Tintori Foundation, Italy
Copyright: © 2025 Zhang and Chen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Kunlun Chen