Developing and Validating Novel Nomograms for Predicting the Overall Survival and Cancer-Specific Survival of Patients With Primary Vulvar Squamous Cell Cancer

Background: This study aimed to develop and validate novel nomograms for better predicting the overall survival (OS) and cancer-specific survival (CSS) of patients with vulvar squamous cell cancer (VSCC). Methods: A retrospective analysis of a population-based database covering 2004 to 2016 was carried out. A 10-fold cross-validation with 200 repetitions was used to choose the best fit multivariate Cox model based on the net benefit of decision curve analysis. Net benefit, Harrell's C concordance statistic (C-statistic) of the calibration plot, and area under the receiver operating characteristic curve (AUC) were used to evaluate the model prediction accuracy. Nomograms of the OS and CSS were generated based on the best fit model. Results: Of the 6,792 patients with VSCC, 5,094 (75%) and 1,698 (25%) were allocated to the training and validation cohorts, respectively. All the variables were balanced between the training and validation cohorts. Age, insurance, tumor size, pathological grade, radiotherapy, chemotherapy, invasion depth, lymphadenectomy, sentinel lymph node biopsy, surgery, N stage, and M stage were in the best fit model for generating nomograms. The decision curve analysis, calibration plot, and receiver operating characteristic (ROC) curve show better prediction performance of the model compared with previous studies. The C-statistics of our model for OS prediction are 0.80, 0.83, and 0.81 in the training, validation, and overall cohorts, respectively, and those for CSS prediction are 0.83, 0.85, and 0.84. The AUCs for 3- and 5-year OS are the same: 0.81, 0.83, and 0.81 in the training, validation, and overall cohorts, respectively. The AUCs for 3- and 5-year CSS are 0.78 and 0.80, 0.79 and 0.80, and 0.79 and 0.80 in those three cohorts. Conclusions: Our model shows the best prediction accuracy of the OS and CSS for patients with vulvar cancer (VC), which is of significant clinical practice value.


INTRODUCTION
Primary vulvar cancer (VC) is a rare malignancy that accounts for about 5% of all gynecologic cancer cases, with more than 6,100 newly diagnosed cases yearly and a rising trend in deaths in the United States, from more than 1,400 in 2020 to 1,500 in 2021 (1,2). Furthermore, 90% of VC is squamous cell carcinoma (VSCC) (3).
The primary therapy for VSCC is surgical resection and radiotherapy with or without chemotherapy (3,4). VC frequently spreads to the regional lymph nodes, and patients with regional lymph node involvement have worse survival (5). For VSCC with lymph node metastasis, lymphadenectomy and sentinel lymph node biopsy (SLNB) are both used. However, lymphadenectomy is associated with a high probability of complications (66-85%), such as wound breakdown, infection, lymphoceles, lymphedema, cellulitis, and erysipelas, which are the fundamental cause of death after surgery (6). Several new surgical techniques of lymphadenectomy introduced in recent years have decreased the morbidity after lymphadenectomy, but it remains high (7). SLNB is less aggressive and has a lower complication rate and thus could prolong the survival of patients with VSCC, so it is preferred over lymphadenectomy for well-selected patients with VSCC. Moreover, SLNB has a sensitivity of more than 95% for indicating lymph node involvement and a specificity of nearly 100% (8). SLNB should therefore be included as a predictor in nomograms for survival prediction. However, no nomograms for predicting the survival of patients with VC have taken SLNB into account.
Nomograms for predicting the cancer-specific survival (CSS) of patients with VC have been developed. For example, a nomogram based on age, American Joint Committee on Cancer (AJCC) T stage, invasion depth, margin status, and lymph node status had a Harrell's C concordance statistic (C-statistic) of 0.78 and 0.83 in the validation study and training study, respectively (9,10). However, determining the margin status and the number of lymph nodes involved is difficult and highly influenced by the experience and imaging techniques of clinicians. Some studies even found that margin status was not associated with survival, probably because of the difficulty of identifying margin status (11-13). A recently developed nomogram comprising age, tumor size, pathological stage, metastasis, radiotherapy, chemotherapy, and surgery had a C-statistic of 0.81 for CSS prediction, without considering the invasion depth, SLNB, and N stage (14). Moreover, no nomograms have been developed to predict the overall survival (OS) of patients with VC.
Therefore, we aimed to develop novel nomograms for precisely predicting the OS and CSS of patients with VC using a population-based database.

Data Source and Study Population
Patients with the International Classification of Diseases for Oncology, 3rd Edition (ICD-O-3) site codes C51.0, C51.1, C51.2, C51.8, and C51.9 and the ICD-O-3 histology codes 8050-8084 (squamous cell carcinoma) were selected from the Surveillance, Epidemiology, and End Results (SEER) Program database of the National Cancer Institute from 2004 to 2016 (15). Patients were excluded under the following conditions: (1) not squamous cell carcinoma; (2) not the only first primary tumor; (3) age at diagnosis younger than 18 or older than 100 years; and (4) not confirmed by positive histology.
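The selection criteria above can be expressed as a single eligibility check. The following is a minimal sketch only: the `Case` record and its field names are hypothetical and do not correspond to actual SEER variable names, and the analysis in this study was not performed with this code.

```python
from dataclasses import dataclass

# ICD-O-3 site codes for the vulva, as listed in the text.
VULVA_SITES = {"C51.0", "C51.1", "C51.2", "C51.8", "C51.9"}


@dataclass
class Case:
    """Hypothetical patient record; field names are illustrative assumptions."""
    site: str                  # ICD-O-3 site code
    histology: int             # ICD-O-3 histology code
    only_first_primary: bool   # True if VSCC is the only, first primary tumor
    age: int                   # age at diagnosis in years
    histology_confirmed: bool  # True if confirmed by positive histology


def eligible(case: Case) -> bool:
    """Return True if the case meets every inclusion criterion."""
    return (
        case.site in VULVA_SITES
        and 8050 <= case.histology <= 8084   # squamous cell carcinoma
        and case.only_first_primary
        and 18 <= case.age <= 100
        and case.histology_confirmed
    )
```

Writing the criteria as one conjunction makes it easy to see that a case is dropped as soon as any single exclusion condition applies.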

Outcomes
Overall survival was the primary outcome. CSS was the secondary outcome; for CSS, only deaths attributable to vulvar cancer were counted as events, while patients who died of causes other than VC were treated as censored.
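The two outcome definitions differ only in which deaths count as events. A minimal sketch of that coding follows; the function and argument names are hypothetical and chosen for illustration.

```python
from typing import Tuple


def event_indicators(dead: bool, died_of_vulvar_cancer: bool) -> Tuple[int, int]:
    """Return (os_event, css_event), where 1 marks an event and 0 a censored case.

    OS counts death from any cause as an event. CSS counts only deaths
    attributable to vulvar cancer; deaths from other causes are censored.
    """
    os_event = 1 if dead else 0
    css_event = 1 if (dead and died_of_vulvar_cancer) else 0
    return os_event, css_event
```

A patient who died of another cause therefore contributes an event to the OS analysis but is censored at the time of death in the CSS analysis.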

Statistical Analysis
The overall sample was randomly split into training (75%) and validation (25%) cohorts, with the constraint of keeping the proportion of death events similar between the two cohorts, following the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) guideline (16). The chi-square test was applied to test the balance of all the available variables between the training and validation groups. Within the training cohort, a 10-fold cross-validation with 200 repetitions was carried out to identify the best fit model based on the net benefit and Harrell's C-statistic. The model with the largest average net benefit was considered the best fit model. If models had the same net benefit and C-statistic, the one with fewer variables was chosen.
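The split and resampling scheme described above can be sketched as follows. This is an illustration under stated assumptions, not the Stata procedure used in the study: the function names, the seed handling, and the use of the event indicator as the only stratification variable are assumptions.

```python
import random


def stratified_split(events, train_frac=0.75, seed=0):
    """Split indices into training/validation, keeping the event proportion similar."""
    rng = random.Random(seed)
    train, valid = [], []
    for status in (0, 1):  # split censored cases and events separately
        idx = [i for i, e in enumerate(events) if e == status]
        rng.shuffle(idx)
        cut = round(train_frac * len(idx))
        train.extend(idx[:cut])
        valid.extend(idx[cut:])
    return sorted(train), sorted(valid)


def repeated_kfold(indices, k=10, repeats=200, seed=0):
    """Yield (fit_on, held_out) index lists for k-fold CV repeated `repeats` times."""
    rng = random.Random(seed)
    for _ in range(repeats):
        shuffled = list(indices)
        rng.shuffle(shuffled)
        folds = [shuffled[j::k] for j in range(k)]
        for j in range(k):
            fit_on = [i for m, fold in enumerate(folds) if m != j for i in fold]
            yield fit_on, folds[j]
```

Each candidate model would be fitted on `fit_on` and scored on `held_out`, and the scores averaged over all 10 x 200 splits before the candidates are compared.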
After the best fit model had been identified, the model was refitted on the overall training cohort and validated on the validation cohort. The net benefit from the decision curve analysis (DCA), the C-statistic of the calibration plot, and the area under the receiver operating characteristic curve (AUC) were used to measure the prediction performance of the models. The 95% CIs of the C-statistic and AUC were calculated by bootstrap with 1,000 repetitions. The nomograms based on the best fit model refitted on the overall sample (combination of the training and validation cohorts) were generated for 3- and 5-year OS and CSS. Several multivariate Cox models including all or a subset of the following variables were fitted and compared: year of diagnosis, insurance status, age, race, marital status, primary site, historical stage, pathological grade, tumor size, invasion depth, surgery, radiotherapy, chemotherapy, FIGO stage, and AJCC T, N, and M stage. Hazard ratios (HRs) and their corresponding 95% CIs were calculated.
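For readers unfamiliar with the concordance measure, the following is a minimal sketch of Harrell's C-statistic with a percentile bootstrap interval. It is not the Stata routine used in this study. It makes simplifying assumptions: pairs with tied survival times are ignored, and the percentile method is assumed for the interval because the text does not specify the bootstrap variant.

```python
import random


def harrell_c(times, events, risk):
    """Harrell's C: share of usable pairs in which the earlier event has the higher risk."""
    concordant, usable = 0.0, 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is usable only if the shorter time is an observed event
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied predictions count as half
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


def bootstrap_ci(times, events, risk, reps=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for Harrell's C (resamples without usable pairs are redrawn)."""
    rng = random.Random(seed)
    n = len(times)
    stats = []
    while len(stats) < reps:
        s = [rng.randrange(n) for _ in range(n)]
        try:
            stats.append(harrell_c([times[i] for i in s],
                                   [events[i] for i in s],
                                   [risk[i] for i in s]))
        except ValueError:
            continue
    stats.sort()
    lower = stats[round(alpha / 2 * reps)]
    upper = stats[round((1 - alpha / 2) * reps) - 1]
    return lower, upper
```

Here `risk` would be the linear predictor of the fitted Cox model, so that a higher value means a higher predicted hazard.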
A two-tailed p-value of <0.05 was considered statistically significant. All statistical analyses were performed in STATA 16.0 software (StataCorp, College Station, TX, USA).

RESULTS
Figure 1 shows the sample selection procedure. Of the 6,792 patients in this study, 5,094 (75%) and 1,698 (25%) were randomly allocated into the training and validation groups, respectively. The primary site, year of diagnosis, insurance type, age, marital status, tumor size, pathological grade, radiotherapy, chemotherapy, historical stage, invasion depth, lymphadenectomy, sentinel lymph node biopsy, surgery, FIGO stage, AJCC stage, and AJCC T, N, and M stage were all balanced between the training and validation groups (all chi-square p > 0.05, Table 1).

Multivariate Cox Proportional Hazards Model Selection
Within the training cohort, a 10-fold cross-validation with 200 repetitions was carried out to choose the best fit model with the largest average net benefit. If two models had a similar average net benefit, the one with fewer variables was selected. Finally, the model comprising age, insurance, tumor size, pathological grade, radiotherapy, chemotherapy, invasion depth, lymphadenectomy, SLNB, surgery, N stage, and M stage was chosen for predicting both OS and CSS. As shown in Tables 2, 3, within the training cohort, age, insurance, tumor size, pathological grade, chemotherapy, invasion depth, SLNB, surgery, N stage, and M stage were the factors significantly associated with OS and CSS (all p < 0.001). However, radiotherapy was not significantly associated with either outcome.
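The net benefit used as the selection criterion weighs true positives against false positives at a chosen threshold probability. The sketch below uses the standard decision curve formula for a binary outcome at a fixed time horizon and ignores censoring. That is a simplifying assumption: the survival-specific calculation used in this study is not reproduced here, and the function and argument names are hypothetical.

```python
def net_benefit(predicted_risk, died_by_horizon, threshold):
    """Net benefit = TP/n - FP/n * threshold / (1 - threshold).

    predicted_risk: model-predicted probabilities of death by the horizon
    died_by_horizon: observed binary outcomes (1 = died by the horizon)
    threshold: probability at or above which a patient is classed as high risk
    """
    n = len(predicted_risk)
    pairs = list(zip(predicted_risk, died_by_horizon))
    true_pos = sum(1 for p, y in pairs if p >= threshold and y)
    false_pos = sum(1 for p, y in pairs if p >= threshold and not y)
    return true_pos / n - false_pos / n * threshold / (1 - threshold)
```

A decision curve plots this quantity over a range of thresholds. Passing a predicted risk of 1 for every patient gives the usual "treat all" comparator, and a net benefit of 0 corresponds to "treat none".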

Model Prediction Accuracy
The DCA plots are shown in Figures 2, 3. Compared with the previously published nomogram with the best accuracy for CSS prediction (14), our model has a larger net benefit.
The calibration plot was displayed in

Nomogram for Predicting 3- and 5-Year Survival
The best fit model was refitted on the overall cohort (combination of the training and validation cohorts), and the result of the model was used to generate nomograms for predicting 3- and 5-year OS (Figure 8) and CSS (Figure 9).
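As background on how a nomogram is derived from a fitted Cox model, the sketch below rescales each predictor's log hazard ratios to a 0-100 point scale and converts a linear predictor into a survival probability. It is a generic illustration, not the procedure used to draw Figures 8 and 9. The function names and the baseline survival argument are assumptions, and any coefficients passed in are placeholders rather than estimates from this study.

```python
import math


def nomogram_points(coefs):
    """Map log hazard ratios to points; the widest-ranging predictor spans 0 to 100.

    coefs: {variable: {level: log hazard ratio relative to the reference level}}
    """
    spans = {v: max(lv.values()) - min(lv.values()) for v, lv in coefs.items()}
    widest = max(spans.values())
    points = {}
    for variable, levels in coefs.items():
        lowest = min(levels.values())  # lowest-risk level of a variable gets 0 points
        points[variable] = {
            level: 100.0 * (beta - lowest) / widest
            for level, beta in levels.items()
        }
    return points


def predicted_survival(baseline_survival, linear_predictor):
    """Cox model survival at a fixed time: S(t | x) = S0(t) ** exp(linear predictor)."""
    return baseline_survival ** math.exp(linear_predictor)
```

Summing a patient's points across predictors gives a total that is a linear rescaling of the Cox linear predictor, which is why a single total-points axis can be mapped to both the 3-year and the 5-year survival probability.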

DISCUSSION
This study developed novel nomograms to predict the 3- and 5-year OS and CSS of patients with VSCC aged 18-100 years, based on a cohort of 6,792 cases from a population-based multicenter database. To our knowledge, the novel nomograms in our study have the best prediction accuracy, which is of considerable value for clinical practice.
Compared with the previously developed nomograms for predicting the CSS of patients with VC, our model has a better net benefit and the largest C-statistics for CSS prediction: 0.83, 0.85, and 0.84 in the training, validation, and overall cohorts, respectively. Our model comprises factors that are commonly assessed and easy to obtain in clinical practice. Moreover, we did not exclude cases with missing or unknown values for the variables, which broadens the applicability of our model.
In line with previous studies, our model comprises age, tumor size, pathologic grade, radiotherapy, chemotherapy, surgery, and M stage, which were significant factors associated with CSS and were included in the previously generated nomograms (9,10,14). Similar to two of those studies (9,10), our final model did not include the FIGO stage, although it is the most prevalent staging system for gynecological cancers. The invasion depth and N stage, which have been argued to be essential prognostic factors of VSCC (9,10) but were not included in a more recently published nomogram (14), were also included in the final model. Radiotherapy tended to be associated with improved survival; although it was not significant in the final model, adding it increased the prediction accuracy of the model. This is contrary to a recent study in which radiotherapy was negatively associated with survival in patients with VSCC (14).
The first unique characteristic of this study is that, to our knowledge, it is the first to generate a nomogram for predicting the 3- and 5-year OS of patients with VSCC. The nomogram for OS prediction had good prediction accuracy, with C-statistics of 0.80, 0.83, and 0.81 in the training, validation, and overall cohorts, respectively. In our study, the models for predicting OS and CSS include the same variables. Accordingly, once the variables for predicting CSS have been determined, OS can also be predicted, which broadens the application of our nomograms. The second unique characteristic is that we included SLNB and lymphadenectomy in the novel nomograms, and those two variables were statistically significant in the best fit model, which led to the precise prediction of survival and adaptation to the development of modern surgical techniques. SLNB and lymphadenectomy have never been integrated as predictors in nomograms for predicting the survival of patients with VSCC. However, the beneficial role of SLNB in improving survival has been shown in previous studies (17-19). The inclusion of SLNB in the model improved the prediction accuracy considerably. Age and N stage were the strongest predictors of OS and CSS in our model, followed by surgery, M stage, and tumor size.
Before those nomograms are applied in clinical practice, several points need to be clarified. First, we only included VC patients with squamous cell carcinoma in the training and validation procedures; accordingly, those nomograms should only be used for patients with VSCC. Applying them to patients with other histological types of VC is not suggested. Second, the models were built on patients with VC aged between 18 and 100 years. Whether they can be extended to patients older than 100 years is unclear; thus, any such extension should be made with caution. Third, VC patients with other malignancies, or in whom VSCC was not the first tumor, were not included in the training and validation samples; accordingly, the novel nomograms should not be applied to those patients and are best suited to patients whose only malignancy is VSCC. Fourth, the nomograms are not suitable for patients whose VSCC was not confirmed by positive histology, because those patients were excluded from the sample.
This study has some limitations. First, we could not obtain detailed information on radiotherapy and chemotherapy, for example, the drug agent and dose of chemotherapy and the intensity of radiotherapy. Thus, we could not control for the impact of those factors on survival. Second, owing to the retrospective nature of the study, essential factors for predicting survival might be missing, which would lead to bias. Third, the usefulness of those nomograms may be limited to the United States because we used the SEER database, which only includes the United States population. Fourth, we could not carry out external validation because no patients with VSCC from a different population or from a single center were available, as a result of the extreme rarity of VC.
Although our study had some limitations, it generated the nomograms with the best prediction accuracy and is the first to predict the OS of patients with VSCC. This study provides novel nomograms with which clinicians can accurately predict the OS and CSS of patients with VSCC and, consequently, carry out more targeted therapy procedures.

CONCLUSIONS
The novel nomograms for predicting OS and CSS of the patients with VC have the best prediction accuracy, which is of significant clinical practice value.

DATA AVAILABILITY STATEMENT
Publicly available datasets were analyzed in this study. This data can be found here: https://seer.cancer.gov/.

ETHICS STATEMENT
Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

AUTHOR CONTRIBUTIONS
WZ: conceptualization, methodology, data curation, formal analyses, supervision, writing original draft preparation, writing, reviewing, and editing. YY: formal analyses, methodology, software, supervision, visualization, writing original draft preparation, writing, reviewing, and editing. All authors contributed to the article and approved the submitted version.