Construction and validation of log odds of positive lymph nodes (LODDS)-based nomograms for predicting overall survival and cancer-specific survival in ovarian clear cell carcinoma patients

Background: Ovarian clear cell carcinoma (OCCC) is one of the special histologic subtypes of ovarian cancer. This study aimed to construct and validate log odds of positive lymph nodes (LODDS)-based nomograms for predicting overall survival (OS) and cancer-specific survival (CSS) in patients with OCCC.

Methods: Patients who underwent surgical treatment between 2010 and 2016 were extracted from the Surveillance, Epidemiology, and End Results (SEER) database, and data on OCCC patients from the First Affiliated Hospital of Dalian Medical University were used as the external validation group to test the validity of the prognostic model. The best-fitting models were selected by stepwise Cox regression analysis. Survival probability was calculated by the Kaplan–Meier method, and differences in survival time between subgroups were compared using the log-rank test. Each nomogram’s performance was assessed by calibration plots, decision curve analysis (DCA), and receiver operating characteristic (ROC) curves.

Results: T stage, distant metastasis, marital status, and LODDS were identified as significant risk factors for OS. A model with four risk factors (age, T stage, stage, and LODDS value) was obtained for CSS. Nomograms incorporating these prognostic factors were constructed to predict 1-, 3- and 5-year OS and CSS for OCCC patients. In the training cohort, the area under the curve (AUC) of the nomogram models ranged from 0.738 to 0.771 for OS and from 0.769 to 0.794 for CSS. The performance of the models was verified in the internal and external validation cohorts. Calibration plots showed that the nomograms have good prognostic reliability.

Conclusion: Predictive nomograms were constructed and validated to evaluate the OS and CSS of OCCC patients. These nomograms may provide valuable prognostic information and guide postoperative personalized care in OCCC.


Introduction
Ovarian cancer is one of the most common malignancies of the female reproductive tract, of which 90% are epithelial ovarian cancer (EOC) (1). Approximately 230,000 people are diagnosed with EOC each year, resulting in 150,000 deaths annually (2). Ovarian clear cell carcinoma (OCCC) is one of the special histologic subtypes of EOC, accounting for about 5% of EOC in western countries and approximately 20% in Asian countries (3). Compared with other EOC subtypes, OCCC is more refractory to platinum-based first-line chemotherapy, with the response rate in OCCC being 11.56% (4, 5). Although early-stage OCCC has a relatively good prognosis, with a 5-year survival rate of 90%, the median overall survival time in advanced-stage OCCC is significantly shorter than that in high-grade serous ovarian cancer (HGSOC) (6, 7). Lymph node (LN) metastasis is one of the main routes of metastasis in OCCC (8). The status of regional lymph nodes (LNs) retrieved during surgery appears to be not only an independent prognostic factor but also an essential factor in assessing the risk of recurrence in patients with OCCC (9). The American Joint Committee on Cancer/International Union Against Cancer (AJCC/UICC) tumor-node-metastasis (TNM) classification is widely used to predict prognosis but may underestimate the N stage because it is based only on the absolute number of positive LNs. Therefore, many novel LN staging systems have been proposed to improve the assessment of prognosis in OCCC.
Log odds of positive lymph nodes (LODDS) jointly considers the effect of the number of positive lymph nodes (PLNs) and the number of resected lymph nodes (RLNs) on patient prognosis, and has been widely validated as an effective prognostic tool and a novel lymph node staging system in various malignancies (10). LODDS is calculated with the following expression:

$$\mathrm{LODDS} = \log\frac{\mathrm{PLNs} + 0.5}{\mathrm{RLNs} - \mathrm{PLNs} + 0.5}$$

In addition, compared with the AJCC N stage, LODDS showed better discrimination and model fit in predicting survival in patients with stage IV rectal cancer (11).
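As an illustration, the LODDS expression above can be sketched in a few lines of Python. This is a minimal sketch rather than the authors' code: the function name `lodds` and the input checks are hypothetical, and the natural logarithm is assumed because the text does not state the base.

```python
import math


def lodds(plns: int, rlns: int) -> float:
    """Log odds of positive lymph nodes.

    plns: number of positive lymph nodes
    rlns: number of resected lymph nodes
    Assumption: natural logarithm (the base is not stated in the text).
    """
    if plns < 0 or rlns < 0 or plns > rlns:
        raise ValueError("require 0 <= plns <= rlns")
    # The 0.5 terms keep the ratio finite when plns == 0 or plns == rlns.
    return math.log((plns + 0.5) / (rlns - plns + 0.5))


if __name__ == "__main__":
    # A node-negative patient with 10 resected nodes and a patient with 3 of 10 positive.
    print(round(lodds(0, 10), 3))
    print(round(lodds(3, 10), 3))
```

Because of the 0.5 correction, the value stays defined for node-negative patients and for patients in whom every resected node is positive.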
Based on entropy, the Akaike Information Criterion (AIC) quantifies the trade-off between overfitting and underfitting, and takes into account the number of parameters that the model estimates in order to select the more parsimonious model (12, 13). The corrected Akaike Information Criterion (AICc) is a modified version of the AIC that includes a correction term for small sample sizes and is calculated as follows:

$$\mathrm{AICc} = \mathrm{AIC} + \frac{2k(k+1)}{n-k-1}, \qquad \mathrm{AIC} = 2k - 2\ln\hat{L}$$

Here k denotes the number of free parameters, n is the number of observations, and $\hat{L}$ is the maximized likelihood of the model (14, 15). In this study, we aimed to use AICc to build prognostic models of overall survival (OS) and cancer-specific survival (CSS) for OCCC. Finally, a nomogram was used to integrate multiple prognostic factors, which enables a patient's survival to be predicted with relative accuracy (16).
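The small-sample correction can be made concrete with a short sketch. It uses the standard definitions AIC = 2k - 2 ln(L) and AICc = AIC + 2k(k+1)/(n - k - 1); the function names and the example log-likelihood values are hypothetical, and how n is counted for a Cox model (patients or events) is an assumption left to the analyst.

```python
def aic(log_likelihood: float, k: int) -> float:
    """Akaike Information Criterion from a maximized log-likelihood."""
    return 2 * k - 2 * log_likelihood


def aicc(log_likelihood: float, k: int, n: int) -> float:
    """AIC with the small-sample correction term 2k(k+1)/(n-k-1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    return aic(log_likelihood, k) + (2 * k * (k + 1)) / (n - k - 1)


if __name__ == "__main__":
    # Hypothetical candidate models: (label, log-likelihood, number of parameters).
    candidates = [("T + LODDS", -101.0, 2), ("T + LODDS + age + MS", -100.0, 4)]
    n = 50
    for label, ll, k in candidates:
        print(label, round(aicc(ll, k, n), 2))
    # The model with the lowest AICc would be retained.
```

In this toy comparison the larger model improves the log-likelihood only slightly, so its penalty outweighs the gain and the smaller model has the lower AICc.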

Variables collected
The following variables were extracted for this study: age, race, marital status, grade (G1, well differentiated; G2, moderately differentiated; G3, poorly differentiated; G4, undifferentiated), 7th AJCC stage, 7th AJCC TNM stage, tumor size, chemotherapy record, RLNs, PLNs, and organ metastasis. OS and CSS were considered the primary endpoints. The cut-off values were established with the X-tile program (3.5.1) (17).

Statistical analysis
All OCCC patients from the SEER database were assigned as the training group, and 30% of them were selected by random sampling as the internal validation group. All 50 OCCC patients collected from the First Affiliated Hospital of Dalian Medical University were used as the external validation group. Baseline differences in demographic variables between the training cohort and the validation cohort were investigated using chi-square tests and independent-sample t tests. Survival probability was calculated by the Kaplan-Meier method, and differences in survival time between subgroups were compared using the log-rank test. To identify significant univariate results, the univariate results were visually inspected in R by comparing the cumulative incidence function (CIF) based on the Turnbull estimator with the CIF based on the normal distribution. The Akaike Information Criterion corrected for small sample size (AICc) was then determined; a smaller AICc indicates a better fit, and models are penalized for carrying excess parameters (18, 19). The best-fitting model was therefore chosen by selecting the lowest AICc. Then, nomograms were constructed and used to predict 1-, 3- and 5-year OS and CSS for OCCC patients. The predictive performance of the nomograms was verified internally for discrimination and calibration through the C-statistic, the area under the curve (AUC) and calibration curves (20, 21). Finally, decision curve analysis (DCA), which evaluates model performance by weighing the clinical consequences of true positives and false positives, was used to compare the net benefit of the nomogram model and the multivariate Cox regression model across a range of threshold probabilities, so that the better predictive model could be selected for clinical decision making.
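The net benefit that DCA compares can be sketched as follows. This is a simplified illustration that assumes a binary outcome at a fixed horizon (for example death within 5 years) with no censoring, which is not how the survival-based DCA in this study was necessarily computed; the function names and the toy data are hypothetical.

```python
from typing import Sequence


def net_benefit(risks: Sequence[float], events: Sequence[int], threshold: float) -> float:
    """Net benefit of treating patients whose predicted risk is >= threshold.

    net benefit = TP/n - FP/n * threshold / (1 - threshold)
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(risks)
    tp = sum(1 for r, e in zip(risks, events) if r >= threshold and e)
    fp = sum(1 for r, e in zip(risks, events) if r >= threshold and not e)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(events: Sequence[int], threshold: float) -> float:
    """Reference strategy: every patient is treated regardless of predicted risk."""
    prevalence = sum(events) / len(events)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


if __name__ == "__main__":
    risks = [0.9, 0.8, 0.3, 0.2]   # hypothetical predicted 5-year risks
    events = [1, 1, 0, 0]          # 1 = event observed by the horizon
    for pt in (0.25, 0.5, 0.75):
        print(pt, net_benefit(risks, events, pt), net_benefit_treat_all(events, pt))
```

A decision curve plots these quantities over a range of thresholds; the model with the higher curve over the clinically relevant thresholds is preferred.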
All statistical analyses were performed with R version 4.2.1 (www.R-project.org). A P-value of < 0.05 was considered statistically significant.
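The analyses in this study were run in R; purely for illustration, the Kaplan-Meier product-limit estimator named in the methods can be sketched in plain Python. The function name `kaplan_meier` and the toy follow-up data are hypothetical, and the sketch omits confidence intervals and the log-rank comparison.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    times:  follow-up time for each patient
    events: 1 if the event (death) was observed, 0 if the patient was censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    survival = 1.0
    curve = []
    for t in sorted(deaths):
        # Patients still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        survival *= 1 - deaths[t] / at_risk
        curve.append((t, survival))
    return curve


if __name__ == "__main__":
    times = [5, 8, 12, 12, 20]    # hypothetical follow-up in months
    events = [1, 0, 1, 1, 0]
    for t, s in kaplan_meier(times, events):
        print(t, round(s, 3))
```

Censored patients contribute to the number at risk until their last follow-up but never trigger a drop in the curve.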

Patient characteristics
A total of 766 patients with primary OCCC from the SEER database were included in the study, and data on 50 patients with primary OCCC were collected as an external validation group (Figure 1). The characteristics of the patients from the SEER database are listed in Table 1. There were no significant differences between the training group and the validation group with regard to demographic and clinicopathological characteristics, implying that the two groups were comparable. OCCC was more common in older patients, with 86.5% of patients older than 45 years. The largest racial group was white (72.1%). Although most patients were diagnosed at a limited stage (64.8%), 53.4% had poorly differentiated tumors, 36.6% had undifferentiated tumors and 82.1% received chemotherapy during treatment in the training cohort.

Construction and validation of the prognostic nomograms
Nomograms were constructed by incorporating the prognostic factors to predict 1-, 3- and 5-year OS (Figure 4A) and CSS (Figure 4B) for OCCC patients. The C-statistic ranges from 0.5, which indicates the absence of discrimination, to 1.0, which indicates perfect discrimination. Generally speaking, if the C-statistic is greater than 0.7, the model has very good predictive value (23, 24). The C-statistic values of our nomogram models for OS and CSS prediction were 0.756 (95% CI: 0.728-0.764) and 0.746 (95% CI: 0.744-0.748), respectively, which denoted the good performance of the nomogram models. The actual survival rates of OCCC patients showed good agreement with the optimal bootstrap-predicted values, indicating good prognostic reliability (Supplementary Figures 4-7).
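To make the C-statistic concrete, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is an illustration under stated assumptions, not the routine used in this study: the function name is hypothetical, ties in risk score count as half-concordant, and pairs with tied times are ignored.

```python
from typing import Sequence


def harrell_c(times: Sequence[float], events: Sequence[int], risk_scores: Sequence[float]) -> float:
    """Fraction of comparable patient pairs in which the higher-risk patient fails first."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had the event strictly before time j.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    times = [2, 4, 6, 8]            # hypothetical survival times
    events = [1, 1, 0, 1]           # 0 = censored
    scores = [0.9, 0.3, 0.5, 0.1]   # hypothetical nomogram risk scores
    print(harrell_c(times, events, scores))
```

A value of 0.5 corresponds to random ordering and 1.0 to a model that ranks every comparable pair correctly, matching the range described above.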
The AUC values also indicated that the nomograms had favorable sensitivity and specificity in predicting OS (Figures 5A, B) and CSS (Figures 5C, D) in OCCC patients. Additionally, the DCA curves indicated that the nomogram models had better prediction performance than the multivariate Cox regression model (Supplementary Figures 8, 9). Similar results were observed in the internal validation cohort. Finally, real-world data were used for external validation. The 1-, 3- and 5-year AUC values were 0.691, 0.724 and 0.749 for OS and 0.558, 0.667 and 0.716 for CSS, respectively (Figures 6A, B), suggesting that the prognostic model in this study could effectively predict OS and CSS in patients with OCCC.
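As a simplified illustration of an AUC at a fixed time point, the sketch below computes the probability that a patient who had the event by the horizon received a higher risk score than a patient who did not. It assumes that every patient's status at the horizon is known, so it ignores the censoring that time-dependent ROC methods handle; the function name and data are hypothetical.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    scores = [0.9, 0.3, 0.5, 0.1]   # hypothetical predicted risks
    died_by_horizon = [1, 1, 0, 0]  # hypothetical status at the chosen horizon
    print(auc(scores, died_by_horizon))
```

On this scale a value near 0.5, such as the 1-year CSS value in the external cohort, indicates discrimination close to chance, whereas values above 0.7 indicate useful separation.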

Discussion
In the current study, we used stepwise Cox regression analyses to screen out risk factors related to the OS and CSS of OCCC patients separately. By comparing AICc scores, nomograms were constructed to assess the 1-, 3- and 5-year CSS and OS based on the identified prognostic factors (25). AUC, calibration curves and DCA curves in both the training and validation sets showed favorable discrimination and calibration, indicating good predictive performance of our nomograms. Each risk factor included in the nomograms was assigned a risk score, and these scores were used to build a risk stratification system for predicting the OS and CSS of OCCC patients. Generally, younger age implies a better prognosis in EOC patients due to a stronger immune response and better physical fitness (26, 27). However, our results indicated that OCCC patients younger than 45 years tended to have a poorer prognosis. This result is in line with previous studies (28) and suggests that the effect of age in OCCC may differ from that in other EOC subtypes. Moreover, we found a significant difference in the prognosis of OCCC patients with different marital statuses. Specifically, the prognosis of unmarried OCCC patients was worse than that of married patients, which is consistent with the finding of Kravdal et al. (29). We offer the following explanations. First, the companionship needs of married patients are met, and previous studies have shown that patients tend to be more emotionally positive when their emotional needs are met. Therefore, marital status (MS) may influence the prognosis of OCCC patients through emotions (30, 31). Second, Nayeri and colleagues found that married individuals tend to be diagnosed with cancer at an early stage (32).
The AJCC N-staging, a two-category system (N0: no regional lymph node metastasis; N1: histologically confirmed retroperitoneal lymph node metastasis), is the most basic and widely used cancer staging system and plays a vital role as a key prognostic factor in the development of postoperative treatment plans as well as in follow-up (33-35). However, this LN staging system does not account for the prognostic impact of the number of PLNs and the number of RLNs. In fact, Nie et al. found that an increase in the number of PLNs is associated with lower DFS as well as OS (36). There is increasing evidence that the extent of LN dissection is also associated with the prognosis of patients with EOC (37). Therefore, the current LN staging appears inadequate in providing physicians with sufficient valuable information. Both LNR (the ratio of PLNs to RLNs) and LODDS take into account the number of PLNs and RLNs, and both are more accurate than the pN staging system in predicting prognosis in several tumors (38, 39), but which one is superior remains controversial (40, 41). Several drawbacks of LNR led us to choose LODDS as the LN staging tool for this study. First, when the value of LNR is 0 or 1, its discriminatory ability is limited (e.g., 1/1 vs. 30/30 both give an LNR of 1). As the number of RLNs increases, the risk of postoperative complications such as infection, vascular or nerve injury, lymphatic leakage and lymphoedema increases, thus affecting patient prognosis (42). Second, the prognosis of patients may be significantly different despite having the same LNR (e.g., 1/2 vs. 15/30). Third, as mentioned, the majority of OCCC patients are still in stage I at the time of diagnosis (6). The probability of LN metastasis in early OCCC is relatively low, with only 3.6% in pT1aM0 and pT2aM0, compared with 71.6% in HGSOC (43). Compared with LNR, LODDS also has a unique value in the prognostic assessment of LN-negative patients (44). The value of LODDS increases as the number of RLNs decreases. Additionally, there is an active debate about systematic lymphadenectomy in early-stage OCCC (45, 46). However, given the calculation method of LODDS described above, the clinician only needs the number of RLNs and the number of PLNs to obtain an accurate LODDS value. Therefore, LODDS acquisition does not depend on systematic lymphadenectomy. This could greatly reduce the difficulty of the surgery and the postoperative complication rate.
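The contrast between LNR and LODDS discussed above can be checked numerically. The sketch below is illustrative only: it reuses the LODDS expression from the Introduction with an assumed natural logarithm, and the function names and the node counts beyond those quoted in the text are hypothetical.

```python
import math


def lnr(plns: int, rlns: int) -> float:
    """Lymph node ratio: positive nodes divided by resected nodes."""
    return plns / rlns


def lodds(plns: int, rlns: int) -> float:
    """Log odds of positive lymph nodes (natural log assumed)."""
    return math.log((plns + 0.5) / (rlns - plns + 0.5))


if __name__ == "__main__":
    # (PLNs, RLNs): two fully positive cases and two node-negative cases.
    for plns, rlns in [(1, 1), (30, 30), (0, 5), (0, 30)]:
        print(f"{plns}/{rlns}: LNR={lnr(plns, rlns):.2f}  LODDS={lodds(plns, rlns):.2f}")
```

The LNR is 1 for both 1/1 and 30/30 and 0 for both 0/5 and 0/30, whereas LODDS separates each pair (about 1.10 vs. 4.11 and about -2.40 vs. -4.11), and for node-negative patients it rises as fewer nodes are resected.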
Several studies have found that nomogram models have better prognostic performance than staging systems alone in several tumors (47-49). With these nomograms, doctors can calculate the risk score for each patient, allowing for individualized prognostic assessment and guiding personalized postoperative treatment. The AUC of the nomograms developed in our study was over 0.7 in the training and validation cohorts, and the calibration points were distributed on both sides of the ideal line. This means that a more reasonable and more accurate follow-up schedule can be devised. Based on the results of the DCA curves, we believe that our model offers a greater net benefit than the traditional multivariate Cox regression.
It should be noted that there are several limitations in this study. First, while the SEER database has a larger volume of data than prior case-series reports, it lacks records of some key prognostic variables, such as specific chemotherapy protocols, preoperative comorbidities, and postoperative complications. Second, we used part of the training set as the internal validation set, which runs the risk of producing an overly optimistic assessment of the predictive model's performance. Although real-world data supported our results, we will seek to re-evaluate our model in the future using completely independent data sets with larger sample sizes. Third, selection bias was inevitable due to the study's retrospective nature. Fourth, statistical analyses were performed without correction for multiple testing, which may lead to potential false positives in the survival analysis.

Data availability statement
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author.

FIGURE 4 Nomograms for predicting 1-, 3- and 5-year OS (A) and CSS (B) in OCCC patients.
FIGURE 5 ROC analysis for OS and CSS. OS nomogram ROC curve for training cohort (A) and internal validation cohort (B); CSS nomogram ROC curve for training cohort (C) and internal validation cohort (D). OS, overall survival; CSS, cancer-specific survival; ROC, receiver operating characteristics.

TABLE 1
Patients' demographics and clinicopathological characteristics.

£: P-value with Bonferroni adjustment.

FIGURE 2 Set of models created with forward-stepwise selection for OS (A) and CSS (B), ranked by AICc. Shaded boxes signify the factors included within the model. Forest plots of independent risk factors in stepwise Cox regression analysis of OS (C) and CSS (D). T, 7th AJCC T Stage; M, 7th AJCC M Stage; MS, Marital status; N, 7th AJCC N Stage; TS, Tumor size; CHE, Chemotherapy; GRD, grade; PLN, positive lymph node; RLN, resected lymph node.