The value of machine learning in preoperative identification of lymph node metastasis status in endometrial cancer: a systematic review and meta-analysis

Background The early identification of lymph node metastasis status in endometrial cancer (EC) is a serious challenge in clinical practice. Some investigators have introduced machine learning into the early identification of lymph node metastasis in EC patients. However, the predictive value of machine learning is controversial due to the diversity of models and modeling variables. To this end, we carried out this systematic review and meta-analysis to systematically discuss the value of machine learning for the early identification of lymph node metastasis in EC patients.
Methods A systematic search was conducted in PubMed, Cochrane, Embase, and Web of Science until March 12, 2023. PROBAST was used to assess the risk of bias in the included studies. In the process of meta-analysis, subgroup analysis was performed according to modeling variables (clinical features, radiomic features, and radiomic features combined with clinical features) and different types of models in various variables.
Results This systematic review included 50 primary studies with a total of 103,752 EC patients, 12,579 of whom had positive lymph node metastasis. Meta-analysis showed that among the machine learning models constructed by the three categories of modeling variables, the best model was constructed by combining radiomic features with clinical features, with a pooled c-index of 0.907 (95%CI: 0.886-0.928) in the training set and 0.823 (95%CI: 0.757-0.890) in the validation set, and good sensitivity and specificity. The c-index of the machine learning model constructed based on clinical features alone was not inferior to that based on radiomic features only. In addition, logistic regression was found to be the main modeling method and has ideal predictive performance with different categories of modeling variables.
Conclusion Although the model based on radiomic features combined with clinical features has the best predictive efficiency, there is no recognized specification for the application of radiomics at present. In addition, the logistic regression constructed by clinical features shows good sensitivity and specificity. In this context, large-sample studies covering different races are warranted to develop predictive nomograms based on clinical features, which can be widely applied in clinical practice.
Systematic review registration https://www.crd.york.ac.uk/PROSPERO, identifier CRD42023420774.


Introduction
Endometrial cancer (EC) is the most common gynecological cancer in high-income countries. In 2020, 417,367 women were diagnosed with EC worldwide. Compared with low-income and middle-income countries, EC is more common in high-income regions. The regions with the highest EC diagnosis are North America and Western Europe, and the incidence rate of EC seems to be increasing rapidly (1, 2). EC is a serious threat to women's lives. As of 2018, the incidence and mortality of women with EC in Europe were 19.2-20.2/100,000 and 2.0-3.7/100,000 (3, 4), respectively.
Surgery is the main treatment for patients with localized EC. However, whether lymphadenectomy is necessary during surgery is controversial, and whether para-aortic lymphadenectomy should be added to pelvic lymphadenectomy has been disputed (2, 5). Previously, all patients were advised to undergo complete standard lymphadenectomy (i.e., dissection and evaluation of pelvic and para-aortic lymph nodes), but this was associated with more side effects (6). Therefore, the effective preoperative assessment of lymph node metastasis is of profound significance in clinical practice. Unfortunately, there is a lack of efficient preoperative assessment methods. The Mayo criteria are widely applied in clinical practice for predicting the risk of lymph node metastasis in EC (7). However, their true prediction accuracy still needs to be improved.
With the gradual improvement of statistical theory, researchers have gradually applied machine learning methods (especially supervised machine learning methods) to clinical practice, mainly for the diagnosis of disease status (8, 9), the prediction of disease occurrence (10, 11), or the prediction of prognosis (12, 13). In some fields, the accuracy of machine learning in screening or diagnosing diseases is not inferior to human clinical practice (14, 15). In this context, some investigators have tried to apply machine learning methods to the preoperative identification of lymph node metastasis in EC. However, machine learning includes diversified mathematical modeling methods (such as logistic regression, random forest, support vector machine, and artificial neural network), and machine learning models also involve a wide range of modeling variables (such as radiomic features, clinical features, and pathological imaging). As modeling methods and modeling variables are diversified, there is a lack of comprehensive and systematic understanding of the preoperative diagnostic value of machine learning for lymph node metastasis status in EC patients (16). Therefore, this systematic review and meta-analysis was conducted to explore the predictive value of machine learning for lymph node metastasis in EC patients. We also comprehensively summarized the effective predictive variables and compared the predictive values of clinical and radiomic features for lymph node metastasis in EC patients.
(4) Studies constructing a corresponding machine learning model but lacking external validation or independent validation set; (5) Studies on different types of machine learning constructed from the same dataset; (6) Studies reported in English.

Exclusion criteria
(1) Study types were meta-analysis, review, guidelines, expert opinions, etc.; (2) Studies that only analyzed risk factors without constructing a complete risk model; (3) Studies lacking outcome indicators (ROC, c-statistic, c-index, sensitivity, specificity, accuracy, recall, precision, confusion matrix, diagnostic fourfold table, F1 score, calibration curve) for the prediction accuracy of the risk model; (4) Studies validating mature scales; (5) Studies assessing the predictive accuracy of a single factor.

Data sources and search strategy
A systematic search was conducted in PubMed, Embase, Web of Science, and Cochrane until March 12, 2023, using subject terms and free terms. No restrictions were imposed on publication regions. The complete search strategy is shown in Table S1.

Study selection and data extraction
The retrieved studies were imported into EndNote X9, and duplicate publications were removed automatically and manually. Then, the titles or abstracts were checked to obtain primary studies that were initially eligible. Finally, the full texts of the remaining studies were read to include primary studies that were eligible for this systematic review.
Before data extraction, a standardized data extraction form was developed. Extracted data encompassed title, first author, publication year, study type (case-control, retrospective/prospective cohort study, nested cohort study, case-cohort study), patient source (single-center, multi-center, registry database), FIGO stage for tumor, number of cases with lymph node metastasis (training set, validation set), total number of cases, generation method of validation set (internal validation: random sampling, k-fold cross-validation, leave-one-out; external validation: prospective, multi-center; over-fitting method: k-fold cross-validation, bootstrap), missing value handling method, variable screening/feature selection method, types of mathematical models, and modeling variables.
Two investigators (RZL and YJY) independently screened the literature, extracted data, and cross-checked the data. In case of disagreement, a third investigator (LYQ) participated in discussions and decisions.

Risk of bias in the included studies
PROBAST was used to assess the risk of bias in the included studies. It comprises signaling questions in four domains (participants, predictors, outcome, and analysis), which together reflect the overall risk of bias and overall applicability (17). The four domains contain 2, 3, 6, and 9 signaling questions, respectively, each of which has three possible answers (yes/probably yes, no/probably no, and no available information). A domain was rated as high risk of bias if at least one question in the domain was answered "no/probably no". A domain was rated as low risk of bias if all questions in the domain were answered "yes/probably yes". A domain was rated as unclear risk of bias if no question was answered "no/probably no" but at least one was answered "no available information". PROBAST was applied to each machine learning model in the included literature.
Two investigators (RZL and YJY) independently assessed and cross-checked the risk of bias. In case of disagreement, a third investigator (LYQ) participated in discussions and decisions.
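The domain-level rating rule can be sketched in a few lines of Python. This is only an illustration, not the reviewers' actual tooling: the function name `rate_domain` and the answer codes ("Y", "PY", "N", "PN", "NI") are assumptions, and the "unclear" branch follows the usual PROBAST convention (no "no/probably no" answer, but at least one question without information).

```python
from typing import Iterable

HIGH, LOW, UNCLEAR = "high", "low", "unclear"
VALID_ANSWERS = {"Y", "PY", "N", "PN", "NI"}  # hypothetical answer codes


def rate_domain(answers: Iterable[str]) -> str:
    """Rate one PROBAST domain from its signaling-question answers."""
    codes = [a.upper() for a in answers]
    if not codes:
        raise ValueError("a domain needs at least one answered question")
    unknown = set(codes) - VALID_ANSWERS
    if unknown:
        raise ValueError(f"unknown answer codes: {sorted(unknown)}")
    # Any "no/probably no" answer makes the whole domain high risk.
    if any(c in {"N", "PN"} for c in codes):
        return HIGH
    # All "yes/probably yes" answers make the domain low risk.
    if all(c in {"Y", "PY"} for c in codes):
        return LOW
    # Otherwise at least one question lacked information.
    return UNCLEAR
```

For example, a participants domain answered `["Y", "NI"]` would be rated unclear under this sketch, while `["Y", "PN"]` would be rated high.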

Outcomes
The primary outcome indicator in this systematic review was the c-index, which reflects the overall accuracy of the model. However, the c-index alone cannot fully reflect the accuracy of the model in predicting lymph node metastasis, especially when there is a serious imbalance between the numbers of metastasis and non-metastasis samples. Even a high c-index may be driven by the high accuracy of the model in predicting negative events (no lymph node metastasis). Therefore, the outcome indicators of this systematic review also included the sensitivity and specificity of machine learning models in predicting lymph node metastasis. In addition, we summarized the modeling variables. As the machine learning models constructed in clinical studies are mainly logistic regression models, and in order to support the construction of a logistic regression risk equation for lymph node metastasis, the outcome indicators also included the odds ratio (OR) of each modeling variable used in logistic regression.
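For orientation, the relationship between the OR of each modeling variable and a logistic regression risk equation can be written out as follows. This is the standard textbook form rather than an equation reported by the included studies; the symbols $\beta_0$, $\beta_i$, $x_i$, and $k$ are introduced here for illustration, and the intercept $\beta_0$ cannot be recovered from pooled ORs alone.

```latex
\[
\operatorname{logit}(p) = \ln\frac{p}{1-p} = \beta_0 + \sum_{i=1}^{k} \beta_i x_i,
\qquad
\beta_i = \ln(\mathrm{OR}_i),
\qquad
p = \frac{1}{1 + e^{-\left(\beta_0 + \sum_{i=1}^{k} \beta_i x_i\right)}}
\]
```

Here $p$ is the predicted probability of lymph node metastasis and $x_i$ is the value of the $i$-th modeling variable.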

Synthesis methods
Meta-analysis of the c-index requires the c-index together with its standard error (SE) or 95% confidence interval (95%CI). However, since many included studies lacked the SE and 95%CI of the c-index, the SE of the c-index was estimated with reference to the study conducted by Debray TP et al. (18). This study also performed meta-analysis on sensitivity and specificity, for which a diagnostic fourfold table was required from the included studies. However, the included studies only provided outcome indicators such as sensitivity, specificity, precision, and accuracy, so we derived the diagnostic fourfold table by combining these with the number of cases with lymph node metastasis and the total number of cases. In addition, some original studies only provided the receiver operating characteristic (ROC) curve of the machine learning model. In this case, we extracted the sensitivity and specificity from the ROC curve by using Origin 2021, selected the sensitivity and specificity corresponding to the best Youden's index, and then derived the diagnostic fourfold table by combining them with the number of cases with lymph node metastasis and the total number of cases. Moreover, when reporting the OR values of modeling variables in logistic regression, the included studies either converted continuous variables into categorical variables or kept them in their original continuous form, so we conducted the meta-analysis on continuous variables.
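The three data-preparation steps described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the SE formula shown is the widely used Hanley-McNeil approximation, which may differ from the exact approximation recommended in (18).

```python
import math


def cindex_se_hanley_mcneil(c, n_pos, n_neg):
    """Approximate SE of a c-index (AUC) using the Hanley-McNeil formula.

    Assumption: this stands in for the approximation cited in the text.
    """
    q1 = c / (2 - c)
    q2 = 2 * c ** 2 / (1 + c)
    var = (
        c * (1 - c)
        + (n_pos - 1) * (q1 - c ** 2)
        + (n_neg - 1) * (q2 - c ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(var)


def fourfold_table(sensitivity, specificity, n_pos, n_total):
    """Rebuild (TP, FP, FN, TN) from reported sensitivity and specificity.

    n_pos is the number of cases with lymph node metastasis.
    Counts are rounded to integers, so they are approximations.
    """
    n_neg = n_total - n_pos
    tp = round(sensitivity * n_pos)
    tn = round(specificity * n_neg)
    return tp, n_neg - tn, n_pos - tp, tn


def best_youden(points):
    """Pick the (sensitivity, specificity) pair maximizing Youden's index J."""
    return max(points, key=lambda p: p[0] + p[1] - 1)
```

As a usage example, `fourfold_table(0.8, 0.9, 50, 250)` reconstructs a table for 50 metastasis cases among 250 patients, and `best_youden` can be applied to the points digitized from a published ROC curve.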
In meta-analyses of the c-index and the OR values of modeling variables, a random-effects model was used when the heterogeneity index I² ≥ 50%, and a fixed-effects model was used when I² < 50%. The meta-analysis of sensitivity and specificity was performed using a bivariate mixed-effects model.
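The model-selection rule can be illustrated with a short inverse-variance pooling sketch. The function name `pool_inverse_variance` is hypothetical, and the DerSimonian-Laird estimator of between-study variance is an assumption, since the text does not state which random-effects estimator was used. ORs would be pooled on the log scale and back-transformed.

```python
import math


def pool_inverse_variance(estimates, ses, i2_threshold=50.0):
    """Pool study estimates, choosing the model from I² as described above."""
    weights = [1 / se ** 2 for se in ses]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, estimates)) / sum_w

    # Cochran's Q and the I² heterogeneity index (in percent).
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
    df = len(estimates) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    if i2 < i2_threshold:
        return {"model": "fixed", "estimate": fixed,
                "se": math.sqrt(1 / sum_w), "i2": i2}

    # DerSimonian-Laird between-study variance (assumed estimator).
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c)
    re_weights = [1 / (se ** 2 + tau2) for se in ses]
    sum_re = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, estimates)) / sum_re
    return {"model": "random", "estimate": pooled,
            "se": math.sqrt(1 / sum_re), "i2": i2}
```

A 95% confidence interval then follows as the pooled estimate plus or minus 1.96 times the returned SE.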
In addition, the modeling variables consisted of clinical features, radiomic features, and radiomic + clinical features, and there were also diversified machine learning models. Therefore, subgroup analyses were conducted according to modeling variables and model types. This meta-analysis was performed in R 4.2.0 (R Development Core Team, Vienna, http://www.R-project.org), with a P value less than 0.05 indicating statistical significance.

Study selection
A total of 3,033 studies were retrieved, including 782 duplicate studies marked by EndNote. EndNote can only mark records whose title and author formatting are completely identical. However, a large number of duplicate studies differed slightly in these respects, making it difficult to mark them automatically. Therefore, a further 356 duplicate studies were identified manually. Then, after reading the titles or abstracts of the remaining literature, 62 primary studies were initially eligible, and their full texts were downloaded. After reading the full texts, 50 studies were finally included in this systematic review. The literature screening process is shown in Figure 1.

Risk of bias in the included studies
The included studies comprised 39 case-control studies; the machine learning models built on them were rated as high risk of bias in the participants domain by PROBAST. In a large number of single-center case-control studies, it was also unclear whether the predictors had been assessed with knowledge of lymph node metastasis status, so the corresponding models were rated as high risk of bias in the predictors domain. In contrast, the outcome in these studies was clearly determined from lymph node metastasis status confirmed by biopsy, so very few models were rated as high risk of bias in the outcome domain. In the analysis domain, the training set needed to satisfy events per variable (EPV) ≥ 20, an independent validation set was required, and the validation set needed more than 100 cases; these requirements accounted for most of the high risk of bias ratings. The results of the risk of bias assessment are provided in Figure 2.

Meta-analysis

Mayo criteria
Eight datasets from the included studies were used to validate the accuracy of the Mayo criteria for predicting lymph node metastasis in EC patients. The results of meta-analysis showed that the c-index in the training set was 0.690 (95%CI: 0.640-0.740), the sensitivity was 0.81 (95%CI: 0.66-0.90), and the specificity was 0.59 (95%CI: 0.38-0.77). The detailed results are shown in Tables 2, 3.

Machine learning model based on clinical features alone for lymph node metastasis status
In the included studies, there were a total of 41

Machine learning model based on radiomic features alone for lymph node metastasis status
In the included studies, there were a total of 16 machine learning models constructed based on radiomic features alone. The pooled c-index was 0.798 (95%CI: 0.758-0.837) in the training set and 0.810 (95%CI: 0.770-0.850) in the validation set.

Machine learning model based on radiomic features combined with clinical features for lymph node metastasis status
In the included studies, there were a total of 11 machine learning models constructed based on radiomic features combined with clinical features. The pooled c-index was 0.907 (95%CI: 0.886-0.928) in the training set and 0.823 (95%CI: 0.757-0.890) in the validation set. The pooled sensitivity and specificity were 0.88 (95%CI: 0.84-0.92) and 0.83 (95%CI: 0.79-0.87) in the training set, and 0.77 (95%CI: 0.64-0.87) and 0.84 (95%CI: 0.74-0.91) in the validation set, respectively. The detailed results are shown in Tables 2, 3.

FIGURE 1
Literature screening process.

FIGURE 2
Results of risk of bias assessment of included machine-learning models by PROBAST.

Subgroup analysis
Subgroup analyses were performed by the type of machine learning models constructed based on clinical features, radiomic features, and radiomic features combined with clinical features. The models for different modeling variables were mainly logistic regression, and most of the studies also constructed visual nomograms. The results of meta-analysis showed that logistic regression had a good predictive value not inferior to that of other machine learning models for the same modeling variable.

Modeling variables in logistic regression
Among the machine learning models constructed by the same type of modeling variables, logistic regression was not inferior to other models in predictive value. We summarized the modeling variables included in logistic regression, and the results of meta-analysis showed that Grade, Histological type, Myometrial invasion, Cervical stromal invasion, LVSI, CA125, CA153, CA199, Ki67, P53, Tumor size, ER, Enlarged lymph nodes, Mitosis and SII were effective predictive variables (P<0.05) of lymph node metastasis status in EC, as shown in Table 4 and Figures S7-S12.

Clinical importance of preoperative assessment of lymph node metastasis
Preoperative identification of the status of lymph node metastases in EC patients is of profound clinical significance. For EC patients, some postoperative complications can seriously affect the quality of life of surviving patients. Among them, lymphoedema is one of the adverse complications that deserves attention (69). Lymphadenectomy increases the risk of lymphoedema (70). Although sentinel lymph node (SLN) biopsy is used to infer the surgical staging of EC (71), researchers are still actively exploring artificial intelligence-based tools for detecting lymph node metastasis.

Summary of the main findings
This study showed that the modeling variables for predicting lymph node metastasis status in EC patients included clinical features, radiomic features, and radiomic combined with clinical features. Among all types of modeling methods, logistic regression was mostly used to construct a nomogram, and it seems to have a c-index not inferior to that of other models in the training set and validation set. In addition, the c-index of the machine learning model constructed based on clinical features alone was not inferior to that of the machine learning model constructed based on radiomic features alone. In terms of the nomogram based on logistic regression, the c-index of the nomogram based on clinical features alone was close to that of the nomogram based on radiomic features alone. The machine learning method with the best predictive value was the one constructed based on radiomic features combined with clinical features, which was also applied to the nomogram.

Comparison with previous studies
Previous clinical research explored the accuracy of preoperative detection of lymph node metastasis in EC patients by using CT, MRI, PET/CT, ultrasound and other imaging approaches, mainly MRI and PET/CT. Bollineni VR et al. (72) systematically reviewed 13 original studies and reported that the sensitivity and specificity of 18F-FDG PET/CT in preoperative detection of lymph node metastasis in EC patients were 0.72 (95%CI: 0.55~0.98) and 0.92 (95%CI: 0.84~0.97), respectively. A recent study showed that the sensitivity of 18F-FDG PET and PET/CT in preoperative detection of lymph node metastasis in EC patients was 0.68 (95%CI: 0.63~0.73) and 0.96 (95%CI: 0.96~0.97), respectively (73). Qiu et al. (74) systematically reviewed 14 studies and found that the sensitivity and specificity of MRI for preoperative prediction of pelvic or/and para-aortic lymph node metastasis in EC patients were 0.59 (95%CI: 0.48~0.69) and 0.95 (95%CI: 0.93~0.96), while those of MRI for preoperative prediction of pelvic lymph node metastasis were 0.65 (95%CI: 0.51~0.77) and 0.95 (95%CI: 0.93~0.96). A systematic review by Luomaranta A et al. (75) on the preoperative detection of EC patients by MRI showed similar sensitivity and specificity to those reported by Qiu et al. The detection rate of lymph node metastasis in EC patients by ultrasound seems to be unsatisfactory (76). Thus, the preoperative detection of lymph node metastasis in EC patients by imaging approaches had a good specificity, but a seriously insufficient sensitivity. Our study showed that the machine learning method had a better sensitivity (>0.8), and the machine learning model constructed based on clinical features had a higher sensitivity but a lower specificity to some extent.
In addition, this study showed that the Mayo criteria currently used in clinical practice had a high sensitivity, but their specificity was worrying. However, this finding was based on a small number of studies, and the identification value of the Mayo criteria for lymph node metastasis in EC patients requires further verification.
Among diversified machine learning models, some had better prediction performance, such as convolutional neural network, support vector machine, and XGBoost (77, 78), but it seemed that the most popular machine learning model in clinical practice was still logistic regression. This is mainly because the nomogram can be
Modeling variables are of critical importance for improving the accuracy of machine learning. However, only a few studies summarized the evidence in this regard. The systematic review by Reijnen C et al. (80) showed that CA-125 and thrombocytosis were associated with the risk of lymph node metastasis in EC patients, and the systematic review by Fu et al. (81) reported that tumor diameter was also related to lymph node metastasis. Therefore, the lack of comprehensive independent predictors for lymph node metastasis in EC patients has posed a challenge to the early identification of lymph node metastasis status in EC patients. In this study, we summarized the modeling variables included in machine learning. Since the risk model constructed based on clinical features alone also had good sensitivity (>0.8), risk equations or predictive nomograms for preoperative prediction of lymph node metastasis in EC patients can be constructed based on this study.
The FIGO 2023 staging system (82) classifies lymph node metastases into micrometastasis and macrometastasis: IIIC1 is metastasis to the pelvic lymph nodes (IIIC1i: micrometastasis, IIIC1ii: macrometastasis), and IIIC2 is metastasis to para-aortic lymph nodes up to the renal vessels, with or without metastasis to the pelvic lymph nodes (IIIC2i: micrometastasis, IIIC2ii: macrometastasis). SLN biopsy is an appropriate alternative to systematic lymphadenectomy, and ultrastaging provides more sensitive and accurate identification of lymphatic disease than standard lymph node dissection. SLN biopsy may also be considered for low/low intermediate-risk patients to rule out occult lymph node metastases and to identify disease that is truly confined to the uterus. Therefore, the ESGO-ESTRO-ESP guidelines allow an SLN approach for all EC patients, which is recognized by FIGO. Although the value of machine learning for the identification of lymph node metastatic status in EC patients was systematically described in our study, the detection of lymph node metastatic site and extent is also necessary. Future studies could explore the identification of metastatic status for SLN.

Strengths and limitations of the study
The strengths of this study lie in that it was the first systematic review on the preoperative diagnostic value of machine learning for lymph node metastasis in EC patients, and it summarized the existing main modeling variables (clinical features, radiomic features), so as to provide guidance and references for the development of clinical risk tools in the future. However, there are still some limitations in this study. Firstly, most of the included studies focused on logistic regression, with less exploration of other models, making it difficult to summarize their applied value. Secondly, in the included studies, the models were mainly validated internally with random sampling, which likely restricts the generalization of the models to other settings. This poses a serious challenge especially for models based on radiomic features, since radiomic features are seriously affected by the experience of radiologists and the configuration of radiation devices. Thirdly, the included studies were mainly case-control studies, some of which had a small sample size, raising a concern about the stability of the model. Fourthly, The Cancer Genome Atlas (TCGA) classifies EC into four distinct molecular categories: POLE ultramutated (POLEmut), high microsatellite instability (MSI-H) or mismatch repair defective (MSI-H or MMRd), copy number low or no specific molecular profiling (CNL or NSMP), and copy number high or p53 abnormal (CNH or p53abn). However, the original studies included did not strictly differentiate between molecular subtypes (POLEmut, MMRd, NSMP, and p53abn), which resulted in our systematic review failing to provide corresponding evidence. Finally, we only included studies that constructed machine learning models for detecting lymph node metastasis and aggregated interpretable clinical features and their associations with lymph node metastasis. However, we did not include studies that only analyzed risk factors. Thus, the pooled results may have missed a small number of other clinical features.

Conclusions
The machine learning model is feasible for preoperative prediction of the lymph node metastasis status of EC patients, and the visual nomogram of logistic regression constructed based on clinical features has favorable sensitivity and specificity. In addition, models based on radiomic features combined with clinical features have a better predictive value. Large-sample studies covering different races are warranted to develop predictive nomograms based on clinical features, which can be widely applied in clinical practice. In view of the excellent predictive performance of machine learning models constructed based on radiomic features combined with clinical features, we also look forward to accelerating the development and application of radiomic features and proposing standardized criteria for their application, so as to develop intelligent diagnosis of complex disease status and intelligent prediction of disease prognosis based on radiomic features.

TABLE 1
Basic characteristics of the included literature.

TABLE 2
Meta-analysis results of c-index for predicting lymph node metastasis in EC patients using machine learning.

TABLE 3
Meta-analysis results of sensitivity and specificity of machine learning in predicting lymph node metastasis in EC patients.

TABLE 4
Meta-analysis results of the OR of modeling variables used to construct a Logistic regression model for predicting lymph node metastasis in EC.

The nomogram features a simple application method and good performance in the visualization of results, which is very important for predicting lymph node metastasis in tumors, such as the Briganti nomogram for prostate cancer (79). As shown in this study, logistic regression seemed to have a relatively good predictive value. Therefore, follow-up studies can try to develop more general nomograms for predicting lymph node metastasis in EC patients.