Development and Validation of a Prognostic Model to Predict the Prognosis of Patients With Retroperitoneal Liposarcoma: A Large International Population-Based Cohort Study

Background Retroperitoneal liposarcomas (RPLs), sarcomas of mesenchymal origin, are the most common soft tissue sarcomas (STS) of the retroperitoneum. Given the rarity of RPLs, the prognostic value of clinicopathological features in these patients remains unclear. A nomogram provides a visual interface for calculating the predicted probability that a patient will reach a particular clinical endpoint and aids communication with patients. Methods We included a total of 1,392 patients with RPLs diagnosed between 2004 and 2015 from the Surveillance, Epidemiology, and End Results (SEER) database. For nomogram construction and validation, patients in the SEER database were divided randomly into a training cohort and an internal validation cohort at a ratio of 7:3, while 65 patients with RPLs treated at our center between 2010 and 2016 served as the external validation cohort. Overall survival (OS) curves were drawn using the Kaplan–Meier method and compared using the log-rank test. Fine and Gray's competing-risk regression models were used to assess cancer-specific survival (CSS). Univariate and multivariate analyses were performed to select prognostic factors for survival time, and a predictive nomogram was constructed from the results of the multivariate analyses. Results Univariate and multivariate analyses showed that age, histological grade, classification, SEER stage, and surgery were significant risk factors for OS, and that age, classification, SEER stage, AJCC M stage, surgery, and tumor size were risk factors for CSS. 
We found that the nomogram provided a good assessment of OS and CSS at 1, 3, and 5 years in patients with RPLs (1-year OS: (training cohort: AUC = 0.755 (95% CI, 0.714, 0.796); internal validation cohort: AUC = 0.754 (95% CI, 0.681, 0.827); external validation cohort: AUC = 0.793 (95% CI, 0.651, 0.935)); 3-year OS: (training cohort: AUC = 0.782 (95% CI, 0.752, 0.811); internal validation cohort: AUC = 0.788 (95% CI, 0.736, 0.841); external validation cohort: AUC = 0.863 (95% CI, 0.773, 0.954)); 5-year OS: (training cohort: AUC = 0.780 (95% CI, 0.752, 0.808); internal validation cohort: AUC = 0.783 (95% CI, 0.732, 0.834); external validation cohort: AUC = 0.854 (95% CI, 0.762, 0.945)); 1-year CSS: (training cohort: AUC = 0.769 (95% CI, 0.717, 0.821); internal validation cohort: AUC = 0.753 (95% CI, 0.668, 0.838); external validation cohort: AUC = 0.799 (95% CI, 0.616, 0.981)); 3-year CSS: (training cohort: AUC = 0.777 (95% CI, 0.742, 0.811); internal validation cohort: AUC = 0.787 (95% CI, 0.726, 0.849); external validation cohort: AUC = 0.808 (95% CI, 0.673, 0.943)); 5-year CSS: (training cohort: AUC = 0.773 (95% CI, 0.741, 0.805); internal validation cohort: AUC = 0.768 (95% CI, 0.709, 0.827); external validation cohort: AUC = 0.829 (95% CI, 0.712, 0.945))). The calibration plots for the training, internal validation, and external validation cohorts at 1-, 3-, and 5-year OS and CSS indicated that the predicted survival rates closely correspond to the actual survival rates. Conclusion We constructed and externally validated an unprecedented nomogram prognostic model for patients with RPLs. The nomogram can be used as a potential, objective, and supplementary tool for clinicians to predict the prognosis of RPLs patients around the world.


INTRODUCTION
Liposarcoma, which arises from fat cells, is a relatively uncommon and heterogeneous group of neoplasms, accounting for about 20% of all adult soft tissue sarcomas (STS) (1)(2)(3). Although the incidence of liposarcoma is low, liposarcomas overall affect a significant proportion of cancer patients. Liposarcoma has been neglected by the pharmaceutical industry as well as by epidemiological, clinical, translational, and laboratory-based investigators, and this neglect has a great influence on overall outcomes (4). Compared with well-studied cancers such as colorectal cancer, the limited therapeutic options, high cost of treatment, and frequent misdiagnosis of liposarcoma place a high burden on the health system (5)(6)(7)(8). The retroperitoneum is the second most common location of liposarcoma after the extremities (9)(10)(11). Retroperitoneal liposarcomas (RPLs), which have a higher recurrence rate and worse prognosis than extremity liposarcomas (ELs), are among the most challenging problems facing surgeons. RPLs respond poorly to most chemotherapeutic agents, and toxicity significantly limits the adequate dosing of radiation therapy (12)(13)(14)(15). For most patients with RPLs, complete surgical resection represents the most effective treatment modality. Given the rarity of RPLs, the prognostic value of clinicopathological features in these patients remains unclear. Several studies have therefore attempted to identify factors influencing RPL prognosis, including histologic subtype, tumor grade, treatment strategy, tumor size, completeness of resection, and organ invasion (16)(17)(18)(19). However, these risk factors alone cannot readily answer the question that both clinicians and patients ask about survival rates, especially the expected survival time of each individual, and these studies were limited by very small sample sizes and a lack of independent validation cohorts. 
A nomogram is a simple pictorial representation of a statistical prediction model that assists clinical decision-making. It combines the important prognostic factors into an individualized prediction of the conditional risk of a disease outcome for each patient.
In this analysis, we aimed to analyze and compare the prognostic features of RPLs using a relatively large number of cases obtained from the Surveillance, Epidemiology, and End Results (SEER) database and to develop a nomogram to predict 1-, 3-, and 5-year overall survival (OS) and cancer-specific survival (CSS) based on significant prognostic factors. We then verified the prognostic value of the prediction model using an external validation set from our hospital database.

Data Source and Population Selection
This study used data from two sources. The first was the SEER database, accessed with the National Cancer Institute's SEER*Stat software version 8.3.9.2 (https://seer.cancer.gov/datasoftware/). Patients with RPLs were screened as follows: 1) patients were drawn from the "SEER 18 Regs Custom Data (with additional treatment fields), Nov 2018 Sub (1975-2016 varying)" database; 2) the International Classification of Diseases for Oncology (ICD-O) site code C48.0 (retroperitoneum) was used to identify patients; 3) according to "Histologic Type ICD-O-3," the following pathological types were included: liposarcoma (8850 to 8858); and 4) "Year of diagnosis" was set to 2004-2015. We included only patients with a histologically confirmed diagnosis and excluded patients with incomplete information, including demographic or survival information. Because the SEER database is publicly available and de-identified and the authors had no access to any participant-identifying information, informed consent from the study population and review by the local institutional review board were not deemed necessary. The second source comprised patients with RPLs who were diagnosed and treated at Xijing Hospital from 2010 to 2016. Inclusion of patients from our center was approved by the Institutional Review Board of Xijing Hospital, with oral informed consent. The study was performed in accordance with the ethical standards laid down in the 1964 Declaration of Helsinki and its later amendments.
We extracted demographic information (age, sex, race), clinicopathological characteristics (histological grade (hereafter, grade), morphology/pathological classification, tumor size, SEER stage, AJCC stage, AJCC T stage, AJCC N stage, AJCC M stage), primary treatment modality (surgery and radiotherapy), survival time, vital status, and cause-specific death classification at the last follow-up from the selected cases. The primary endpoints of this study were OS and CSS, and effects were expressed as hazard ratios (HRs) with 95% confidence intervals (CIs). OS was defined as the time to death from any cause, and CSS was defined as the time to death from RPL. X-Tile software (version 3.6.1) was used to determine the optimal cutoff point for age (65 years) (20). RPL histology was categorized as well-differentiated (WDLS), myxoid (MLS), pleomorphic (PLS), dedifferentiated (DDLS), and other (round cell, mixed, angiomyoliposarcoma, fibroblastic, and not otherwise specified) liposarcomas according to the WHO classification (1). Race was categorized as white, black, or other (American Indian/Alaska Native, Asian or Pacific Islander, Hispanic).
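The screening criteria and endpoint definitions above can be illustrated with a minimal sketch. The record layout, field names, and the `eligible` and `code_endpoints` helpers are hypothetical and do not correspond to SEER*Stat column names; treating exactly 65 years as the older group is also an assumption, since the text gives only the cutoff value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    """Hypothetical patient record; field names are illustrative only."""
    site: str                        # ICD-O topography code, e.g. "C48.0"
    histology: int                   # ICD-O-3 histologic type code
    year_dx: int                     # year of diagnosis
    histologically_confirmed: bool   # diagnosis confirmed by histology
    age: Optional[int]               # age at diagnosis in years
    months: Optional[int]            # survival time in months
    dead: Optional[bool]             # vital status at last follow-up
    cancer_death: Optional[bool]     # True if death was attributed to RPL


def eligible(r: Record) -> bool:
    """Apply the four screening criteria plus the completeness requirement."""
    complete = all(v is not None for v in (r.age, r.months, r.dead, r.cancer_death))
    return (
        r.site == "C48.0"                  # retroperitoneum
        and 8850 <= r.histology <= 8858    # liposarcoma histologies
        and 2004 <= r.year_dx <= 2015      # diagnosis window
        and r.histologically_confirmed
        and complete
    )


def code_endpoints(r: Record) -> dict:
    """Derive OS and CSS indicators and the dichotomized age group."""
    if not r.dead:
        css_status = 0   # alive at last follow-up: censored
    elif r.cancer_death:
        css_status = 1   # death from RPL: event of interest
    else:
        css_status = 2   # death from another cause: competing event
    return {
        "os_event": 1 if r.dead else 0,   # death from any cause
        "css_status": css_status,
        # Assumption: patients aged exactly 65 fall in the older group.
        "age_group": ">=65" if r.age >= 65 else "<65",
    }
```

Coding non-cancer death as a separate status (2) rather than folding it into censoring keeps the same data usable for both a Kaplan-Meier analysis of OS and a competing-risk analysis of CSS.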

Statistical Analysis
Statistical analyses were performed using SPSS 20.0 (SPSS, Inc., Chicago, IL) and R software (version 4.1.2). Descriptive statistics are presented as frequencies (n) and proportions (%). To estimate cancer-specific survival probabilities, we considered cancer death as the event of interest and non-cancer death as the censored observation. We used Fine and Gray's competing risk analysis (21) to estimate the cumulative incidence function (CIF) and to explore, for each single variable, the incidence of each competing event. Moreover, we used the proportional subdistribution hazard model to identify the significant variables associated with CSS, and the competing-risk nomogram was constructed based on these factors to assess the association between predictor variables and the outcomes. The OS curves were drawn using the Kaplan-Meier method and compared using the log-rank test. Significant prognostic variables (P < 0.05) were selected with the univariate Cox proportional hazard model and then entered into the multivariate Cox proportional hazard model to estimate their covariate-adjusted effects on survival time. Multivariate analyses were performed with backward stepwise regression to identify independent risk factors, and the nomogram for OS was constructed based on these factors. Variables were chosen for inclusion carefully to ensure parsimony of the final models. The proportional hazard assumption was checked by examining scaled Schoenfeld residuals (22), and no violation was observed.
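To make the two nonparametric quantities named above concrete, the following sketch computes the Kaplan-Meier estimate of overall survival together with the cumulative incidence function for one cause in the presence of a competing cause. It is a plain-Python illustration, not the SPSS or R code used in the study, and it does not implement Fine and Gray's regression model. The function name `km_and_cif` and the status coding (0 = censored, 1 = event of interest, 2 = competing event) are assumptions.

```python
def km_and_cif(times, status, cause=1):
    """Kaplan-Meier overall survival and cumulative incidence for `cause`.

    times  : follow-up times
    status : 0 = censored, 1 = event of interest, 2 = competing event
    Returns a list of (time, overall_survival, cumulative_incidence)
    with one entry per distinct time at which any event occurred.
    """
    data = sorted(zip(times, status))
    at_risk = len(data)
    surv = 1.0   # overall survival just before the current time, S(t-)
    cif = 0.0    # cumulative incidence of `cause`
    out = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_at_t = d_any = d_cause = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            s = data[i][1]
            n_at_t += 1
            if s != 0:
                d_any += 1
            if s == cause:
                d_cause += 1
            i += 1
        if d_any > 0:
            # CIF increment: S(t-) times the cause-specific hazard at t.
            cif += surv * d_cause / at_risk
            # Kaplan-Meier step uses deaths from any cause.
            surv *= 1.0 - d_any / at_risk
            out.append((t, surv, cif))
        at_risk -= n_at_t
    return out
```

For example, `km_and_cif([1, 2, 3, 4, 5], [1, 2, 0, 1, 0])` steps through event times 1, 2, and 4. At any time point, the overall survival plus the cumulative incidences of both causes sum to 1, which is a useful sanity check on an implementation.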
For nomogram construction and validation, patients in the SEER database were divided randomly into a training cohort and an internal validation cohort at a ratio of 7:3, while the patients in our hospital data set served as the external validation cohort. All prognostic variables identified in the training cohort were incorporated to build the nomogram for predicting the probability of a patient's survival at 1, 3, or 5 years. Each level of each factor on the nomogram corresponds to a point value on the "Point" scale. The points for each variable are summed to generate a total-point score. The total-point score projected onto the bottom scales indicates the probability of 1-, 3-, and 5-year OS or CSS. Validation of each nomogram included three procedures in the training, internal validation, and external validation cohorts. First, the discrimination performance of the nomogram was evaluated using the area under the curve (AUC) of the receiver operating characteristic (ROC) curve, which ranges from 0.5 (chance discrimination) to 1.0 (perfect discrimination, equivalent to the standard). Second, calibration plots were generated using a bootstrap method with 1,000 resamples to compare the consistency between observed and predicted survival rates. Intuitively, the closer the simple regression line between the actual and predicted survival rates is to the diagonal line, the closer the predicted survival rates are to the actual survival rates. All statistical tests were two-tailed with a significance level of 5%.
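The point-scoring mechanism described above can be sketched as follows. All coefficients and baseline survival values below are hypothetical placeholders chosen for illustration; they are not the fitted values from this study, and only three of the model's factors are shown. The sketch assumes a Cox-type model in which survival at a given year equals the baseline survival raised to the exponential of the linear predictor.

```python
import math

# Hypothetical log-hazard coefficients per factor level (NOT study estimates).
COEFS = {
    "age": {"<65": 0.0, ">=65": 0.6},
    "surgery": {"yes": 0.0, "no": 1.2},
    "seer_stage": {"localized": 0.0, "regional": 0.3, "distant": 0.9},
}

# Hypothetical baseline survival at 1, 3, and 5 years for the reference patient.
BASELINE_SURVIVAL = {1: 0.95, 3: 0.85, 5: 0.75}


def points_table(coefs):
    """Rescale coefficients so the factor with the widest range spans 0-100."""
    widest = max(max(lv.values()) - min(lv.values()) for lv in coefs.values())
    table = {}
    for var, levels in coefs.items():
        lowest = min(levels.values())
        table[var] = {
            lvl: 100.0 * (beta - lowest) / widest for lvl, beta in levels.items()
        }
    return table


def total_points(patient, table):
    """Sum the points of the patient's level on every factor."""
    return sum(table[var][patient[var]] for var in table)


def predicted_survival(patient, coefs, baseline, year):
    """Cox-type prediction: S(t | x) = S0(t) ** exp(linear predictor)."""
    lp = sum(coefs[var][patient[var]] for var in coefs)
    return baseline[year] ** math.exp(lp)


if __name__ == "__main__":
    table = points_table(COEFS)
    patient = {"age": ">=65", "surgery": "yes", "seer_stage": "distant"}
    print(total_points(patient, table))
    print(predicted_survival(patient, COEFS, BASELINE_SURVIVAL, 5))
```

Because the points are a linear rescaling of the coefficients, the total-point score maps one-to-one onto the linear predictor, which is why a single bottom axis on the printed nomogram can translate total points into a survival probability.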

Clinical Characteristics
After excluding patients with missing follow-up data, a total of 1,392 patients with RPLs diagnosed between 2004 and 2015 were selected from the SEER database and assigned to the training cohort (n = 974) and the internal validation cohort (n = 418). Using the same inclusion and exclusion criteria as in the SEER cohort, 65 patients with RPLs from Xijing Hospital were included in the external validation cohort. Within the SEER group, the demographic and clinical characteristics did not differ between the training and internal validation cohorts. The general demographic and clinicopathological features of patients from the SEER database are summarized in Table 1. In the external validation cohort, WDLS and DDLS were the most prevalent histologic subtypes (33.85% and 35.38%, respectively). Regarding treatment strategy, most of these patients (90.77%) underwent surgery, and only 6 (9.23%) did not. These findings are consistent with the SEER database.
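As an illustration of the 7:3 random split, the sketch below shuffles patient identifiers with a fixed seed and cuts the list at 70%. The function name `split_cohort`, the seed value, and the choice to round the training size down are assumptions; rounding down happens to reproduce the reported cohort sizes of 974 and 418 from 1,392 patients, but the study's actual procedure is not described at this level of detail.

```python
import random


def split_cohort(ids, train_frac=0.7, seed=2022):
    """Randomly split identifiers into training and validation sets.

    The seed is arbitrary and only makes the split reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)  # rounds down
    return shuffled[:n_train], shuffled[n_train:]


if __name__ == "__main__":
    train, internal = split_cohort(range(1392))
    print(len(train), len(internal))
```

After such a split, it is common to compare baseline characteristics between the two cohorts, as reported above, to confirm that randomization did not produce an imbalance by chance.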

Survival Analysis
The Kaplan-Meier survival curves of OS for patients by age, sex, race, grade, classification, SEER stage, AJCC stage, AJCC T stage, AJCC N stage, AJCC M stage, surgical options, radiation recode, and tumor size are depicted in Supplementary Figure 1. The curves indicated that older age, male sex, higher grade, higher AJCC stage, AJCC N1 stage, AJCC M1 stage, and more advanced SEER stage were associated with relatively poor OS, whereas patients who underwent surgery had better OS than those who did not. As for histologic classification, patients with WDLS had significantly longer OS than patients with MLS (P = 0.001), PLS (P < 0.001), DDLS (P < 0.001), and other liposarcomas (P < 0.001).
The cumulative incidence function curves of CSS for patients by age, sex, race, grade, classification, SEER stage, AJCC stage, AJCC T stage, AJCC N stage, AJCC M stage, surgical options, radiation recode, and tumor size are depicted in Supplementary Figure 2. Older age, male sex, higher grade, higher AJCC stage, AJCC N1 stage, AJCC M1 stage, larger tumor size, and more advanced SEER stage were associated with relatively poor CSS, whereas patients who underwent surgery had better CSS than those who did not. As for histologic classification, patients with WDLS had significantly longer CSS than patients with MLS (P < 0.001), PLS (P < 0.001), DDLS (P < 0.001), and other liposarcomas (P < 0.001).
Univariate analyses of variables potentially influencing OS and CSS are summarized in Table 2. Age, sex, grade, classification, SEER stage, AJCC stage, AJCC N stage, AJCC M stage, surgery, and radiation recode were significantly related to OS in univariate Cox proportional hazard regression analysis, while age, sex, grade, classification, SEER stage, AJCC stage, AJCC N stage, AJCC M stage, surgery, and tumor size were significantly associated with CSS in univariate competing-risk analysis. The variables that were significant in univariate analysis were entered into the subsequent multivariate analysis (Table 3). After adjustment for possible confounders, age, grade, classification, SEER stage, and surgery were significant risk factors for OS, and age, classification, SEER stage, AJCC M stage, surgery, and tumor size were significant risk factors for CSS in the multivariable analysis.
Figure 1 shows the nomogram for predicting 1-, 3-, and 5-year OS and CSS. Our nomogram showed good discrimination and prediction capabilities. 
The predictive performance of the nomogram for 1-, 3-, and 5-year OS and CSS in the training, internal validation, and external validation cohorts was evaluated with ROC curves. The nomogram provided a good assessment of OS and CSS at 1, 3, and 5 years in patients with RPLs. The results are illustrated in Figure 2 and Table 4. Figure 3 shows the time-dependent AUC at each time point. The results revealed that the model maintained a high AUC at all time points. Figure 4 shows the calibration plots for the training, internal validation, and external validation cohorts at 1-, 3-, and 5-year OS and CSS. The predicted survival rates at 1, 3, and 5 years closely corresponded to the actual survival rates.
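One simple way to read an AUC at a fixed time horizon is sketched below. Patients who experienced the event by the horizon are treated as cases, patients still under follow-up beyond the horizon are treated as controls, and the AUC is the proportion of case-control pairs in which the case has the higher predicted risk. This is a deliberately naive illustration: it drops patients censored before the horizon, whereas dedicated time-dependent ROC methods account for censoring, and it is not the procedure used to produce Figures 2 and 3. The function name `auc_at_horizon` is an assumption.

```python
def auc_at_horizon(risk, time, event, horizon):
    """Naive AUC for predicting an event by `horizon`.

    risk    : predicted risk score (higher means worse prognosis)
    time    : follow-up time
    event   : 1 if the event was observed, 0 if censored
    Patients censored at or before the horizon are excluded (simplification).
    """
    cases = [r for r, t, e in zip(risk, time, event) if e == 1 and t <= horizon]
    controls = [r for r, t in zip(risk, time) if t > horizon]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    # Count concordant pairs; ties in predicted risk count as half.
    score = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                score += 1.0
            elif c == k:
                score += 0.5
    return score / (len(cases) * len(controls))
```

A value of 0.5 corresponds to chance discrimination and 1.0 to perfect discrimination, matching the scale described in the Statistical Analysis section.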

DISCUSSION
RPLs, sarcomas of mesenchymal origin, are the most common STS of the retroperitoneum (23)(24)(25). RPLs account for approximately 40% of cases of STS in the retroperitoneum (26,27). Several studies have suggested that surgical management and histologic subtype are associated with prognosis in patients with RPLs (28)(29)(30)(31)(32). Given the rarity of RPLs, the prognostic value of clinicopathological features in these patients remains unclear, and predicting the clinical behavior of RPLs remains a challenge. Previous studies have evaluated prognostic factors in patients with RPLs, but their sample sizes were very small, and none included more than a few hundred patients (32)(33)(34)(35). Although these findings advance our understanding of RPLs, they may be especially vulnerable to institutional bias, and additional information on this uncommon malignancy is still required. The strengths of the present population-based study of prognosis in patients with RPLs include its large size and its generalizability beyond a few institutions. With 1,392 patients with RPLs at baseline with follow-up data, we identified the prognostic factors of patients with RPLs. OS at 5 years was 51% and CSS at 5 years was 63% in this study, which is consistent with the results of other recent studies (25,28,36,37). Our work analyzes and compares the prognostic features of RPLs using a relatively large number of cases and develops a nomogram to predict 1-, 3-, and 5-year OS and CSS based on significant prognostic factors. Our finding for histologic grade is consistent with previous research showing that longer survival is directly associated with better histologic grade (35,37,38,39,40). Furthermore, previous analyses showed that completeness of resection is another important prognostic factor (17,40,41,42,43). 
We also found that the prognosis of patients who underwent resection was significantly better than that of patients without surgical resection, especially for patients who underwent radical surgery. Additionally, patients with distant or regional SEER stage had poorer OS and CSS than those with localized SEER stage. As other authors have suggested, patients with tumors invading adjacent structures may be more likely to have residual microscopic or gross disease after resection (17,31,41,44). In this study, histological subtype was also an important indicator of prognosis. WDLS and DDLS were the most prevalent histologic subtypes in the present series. Patients with the WDLS subtype had the best prognosis, while patients with the DDLS subtype had the worst prognosis (32,45). DDLS may be difficult to identify at the time of presentation because it exhibits a variable histologic picture (46,47). Thus, careful and extensive sampling is mandatory in each patient. Other subtypes, such as MLS and PLS, were rare, as reported previously (46,48). Also of interest is the finding that tumor size is an independent risk factor for CSS but not for OS. The 5-cm threshold used in the AJCC staging system is of limited value for RPLs because such small RPLs are uncommon in the present study and in other international studies (17,25,35,43). Several studies have proposed that the optimal threshold for tumor size should be revised upward to 10 cm (25,36). In this study, we rigorously evaluated the relationship between tumor size and prognosis in a large cohort of patients with RPLs, using several different cut points for dichotomization. In all of these analyses, tumor size was not associated with OS. However, when the cut points were set at 10 and 20 cm, larger tumor size was associated with poor CSS. Therefore, the role of the AJCC T-classification system in predicting the prognosis of RPL patients should be interpreted with caution. 
Although the multivariate analysis confirmed the risk factors related to patients' prognosis, these variables alone cannot produce an accurate and discriminatory prediction for RPLs, especially an estimate of the survival rate of each individual. Thus, a specially designed prognostic prediction model is needed to answer this question. The nomogram generates a prediction based on the evaluation of important factors and provides individualized risk predictions for estimating the conditional risk of disease outcomes. Although one article reported that a nomogram can accurately estimate the recurrence-free survival (RFS) of patients with RPLs (35), the sample size was relatively small and patients were taken from a single institution. Studies with larger sample sizes from multiple centers were needed. Furthermore, no previous study has constructed a nomogram to estimate the OS and CSS of patients with RPLs. Our study fills this gap at least partially by creating nomogram models to estimate the OS and CSS of patients with RPLs based on a large database. We constructed and validated a nomogram for predicting the 1-, 3-, and 5-year OS and CSS in patients with RPLs. For nomogram construction and validation, patients in the SEER database were divided randomly into a training cohort and an internal validation cohort at a ratio of 7:3, while the patients in our hospital data set served as the external validation cohort. Through univariate and multivariate analyses, we found that age, grade, classification, SEER stage, and surgery were significant risk factors for OS, and that age, classification, SEER stage, AJCC M stage, surgery, and tumor size were risk factors for CSS. The nomogram provided a good assessment of OS and CSS at 1, 3, and 5 years in patients with RPLs, and the model maintained a high AUC at all time points. 
In addition, the calibration plots indicated that the predicted survival rates at 1, 3, and 5 years closely corresponded to the actual survival rates. Therefore, the nomogram can be used to better assess an individual's clinical outcome. Moreover, the nomogram can be easily applied in many settings, such as in the clinic, at the bedside, or at home, without depending on a computer. With simple training, healthcare professionals, patients, and the public can quickly grasp the nomogram to
There are several limitations that should be considered when interpreting our results. First, it is difficult to avoid selection bias because this was a retrospective study using public databases. Second, our nomogram provided individual predictions of OS with five clinicopathological factors and of CSS with six clinicopathological factors, but lacked additional variables such as PD-1, vimentin, and Ki-67. The levels of PD-1, vimentin, and Ki-67 expression have been found to increase in patients with RPLs and were associated with poor CSS and RFS (27b; 31,49,50). However, the SEER database lacks these variables, and future studies are warranted to incorporate them into the analysis. Third, when we included treatment as a prognostic factor, we considered only the effects of surgery and radiotherapy on prognosis, neglecting adjuvant chemotherapy and other medical therapies, such as targeted therapies, even though most research indicates that adjuvant chemotherapy has little to offer patients with RPLs (17,48,51,52). Fourth, SEER provided no information regarding the margin of resection for patients with RPLs. The margin of resection might have prognostic value in RPL patients, and we will focus on it in future research. Fifth, the sample size of the external validation cohort was small and patients were taken from a single center. 
Although the verification results were good, the AUC values might change with a larger sample size and more centers. Future studies need to include validation cohorts with larger sample sizes from multiple centers. Sixth, DDLS may be difficult to identify at the time of presentation, especially in institutions that lack specialist expertise in treating RPLs. However, misclassifications would affect study results and tend to obscure differences rather than exaggerate them (25). Therefore, further studies are needed to examine the effects of these factors on prognosis to provide guidelines for the treatment of RPLs.

CONCLUSION
In conclusion, age, grade, classification, SEER stage, and surgery are significant risk factors for OS in patients with RPLs, and age, classification, SEER stage, AJCC M stage, surgery, and tumor size are risk factors for CSS. We constructed and validated a nomogram for predicting OS and CSS in patients with RPLs. This nomogram, which provides individual predictions of OS from five clinicopathological factors and of CSS from six clinicopathological factors, can be used as a potential, objective, and supplementary tool for clinicians to predict the prognosis of RPL patients around the world.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Institutional Review Board of Xijing Hospital. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

AUTHOR CONTRIBUTIONS
YL and GW participated in the design of this study and wrote the manuscript. LH and DF conceived the original idea and supervised the overall direction and planning of the project. YZ, WY, XW, LD, LN, JC, WZ, JL, and HZ contributed to the acquisition of the data, analysis, and interpretation of the data. All authors contributed to the article and approved the submitted version.

ACKNOWLEDGMENTS
The authors would like to thank Jing Lou, PhD, an employee of the School of Aerospace Medicine, for developing the first draft and analyzing data.