A web-based predictive model for overall survival of patients with cutaneous Merkel cell carcinoma: A population-based study

Background Merkel cell carcinoma (MCC) is an aggressive neuroendocrine carcinoma with a high mortality rate, so models that predict overall survival (OS) in MCC are needed. We developed an easy-to-use web-based calculator, built on a nomogram, to predict the OS of MCC patients. Methods MCC patients diagnosed between 2004 and 2015 were collected from the Surveillance, Epidemiology, and End Results (SEER) database and randomly assigned to training and validation cohorts. Patients diagnosed in 2016-2017 served as the external validation cohort. Relevant risk factors were identified by univariate and multivariate Cox proportional hazards regression and combined to produce nomograms for OS and cancer-specific survival (CSS). The concordance index (C-index), area under the receiver operating characteristic curve (AUC), and calibration plots were used to assess the predictive power of the nomograms. Decision curve analysis (DCA) was used to evaluate the clinical utility of the nomograms. Patients were divided into three risk groups according to their nomogram scores. Results A total of 3480 patients were randomly assigned to the training and validation cohorts. Significant prognostic factors were used to construct the nomograms. The C-index for OS was 0.725 (95% CI: 0.706-0.741) in the training cohort, 0.710 (95% CI: 0.683-0.737) in the validation cohort, and 0.763 (95% CI: 0.734-0.792) in the external validation cohort. The C-index for CSS in the training, validation, and external validation cohorts was 0.743 (95% CI: 0.725-0.761), 0.739 (95% CI: 0.712-0.766), and 0.774 (95% CI: 0.735-0.813), respectively. The AUC and calibration plots for 1-, 3-, and 5-year OS showed that the nomogram had good predictive power. DCA demonstrated that the nomogram constructed in this study could provide a clinical net benefit. Our calculator showed good predictive capability and improved the risk grouping of MCC patients.
Conclusion We created novel nomograms of prognostic factors for MCC, which more accurately and comprehensively predicted 1-, 3-, and 5-year OS/CSS in MCC patients. We established a calculator which can easily and quickly calculate the risk grouping of MCC patients by inputting clinically relevant characteristics. This can help clinicians identify high-risk patients as early as possible, carry out personalized treatment, follow-up, and monitoring, and improve the survival rate of MCC patients.


Introduction
Merkel cell carcinoma (MCC) is a rare, aggressive neuroendocrine carcinoma (1). MCC has been hypothesized to originate from Merkel cell precursors (potentially derived from epidermal stem cells or hair follicle stem cells), pre-B cells, pro-B cells, or dermal fibroblasts (1). Most MCCs present as rapidly growing red or violaceous firm nodules on the sun-exposed skin of elderly patients (2). MCC carcinogenesis can be initiated by the clonal integration of the Merkel cell polyomavirus (MCPyV) genome or by UV-mediated DNA damage caused by chronic exposure to sunlight (1). In addition, immunosuppression (1), chronic arsenic exposure (3), and chronic inflammation (4) have been identified as risk factors for MCC. Therapeutic approaches include surgery, radiotherapy, chemotherapy, immunotherapy, and targeted molecular therapy (5). In general, surgery is the first-line treatment for primary MCC. For metastatic MCC, the effectiveness of immunotherapy has been validated, and it has become the current first choice (5). The incidence of MCC increases exponentially with age, from 0.1 per 100,000 person-years in individuals aged 40 to 44, to 1.0 in those aged 60 to 64, and to 9.8 in those older than 85 (6). Overall survival at 5 years is approximately 51% for local disease, 35% for nodal disease, and 14% for distant disease (2). Given the high mortality rate of MCC, it is necessary to summarize its prognostic factors. Reported prognostic risk factors include male gender, advanced age, immunosuppressed status, MCPyV negativity, low CD8+ T cell levels, lymphovascular invasion, tumor growth pattern, lymph node number, and stage (7)(8)(9)(10)(11)(12)(13)(14). Risk stratification of MCC patients allows for better monitoring and management, which has important implications for improving the prognosis of MCC.
A nomogram is a user-friendly graphical model, here based on the Cox proportional hazards regression model, that uses known clinical and pathological characteristics to predict the probability of an event. It retains the accuracy of the underlying regression model while being easy to use.

Data source and selection of variables
The original data in this study were extracted from the SEER database, one of the largest oncology databases available to the public, covering approximately 28% of the US population. We signed the SEER data use agreement and were granted access to SEER information (accession username: 12906-Nov2021), which allowed us to obtain patient demographics, tumor characteristics, and survival status. Because the SEER database is publicly accessible, neither institutional review board approval nor informed consent was required for our study. All methods used in our study comply with the rules of the SEER database.
All patients with cutaneous Merkel cell carcinoma diagnosed between 2004 and 2017 were considered for this study. The exclusion criteria were: (1) age <18 years; (2) primary site unknown or not skin; (3) marital status unknown; (4) tumor size unknown; (5) lymph node involvement unknown; (6) AJCC stage unknown; (7) cause of death unknown; (8) survival time unknown or <1 month. In total, 4317 patients were included in this study. Using SEER*Stat 8.4.0 (http://seer.cancer.gov//seerstat/), we obtained the demographic and clinical characteristics of these patients, including age, gender, marital status, primary site, multiple primary tumors, tumor size, lymph node involvement, AJCC stage, surgery, radiotherapy, chemotherapy, cause of death, survival status, and survival time. The screening flow chart is shown in Figure 1.
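The eight exclusion criteria listed above can be expressed as a simple record filter. The following Python sketch is only an illustration: the study itself used SEER*Stat and R, the field names are hypothetical, and the handling of unknown age (treated as excluded) is an assumption not stated in the text.

```python
from typing import Any, Dict, List

# Hypothetical field names for illustration; real SEER*Stat exports label
# their columns differently. None stands for "unknown".
Record = Dict[str, Any]

# Fields that must be known (criteria 3 to 7).
REQUIRED_KNOWN = (
    "marital_status",
    "tumor_size_mm",
    "nodes_involved",
    "ajcc_stage",
    "cause_of_death",
)


def is_eligible(record: Record) -> bool:
    """Return True if the record passes all eight exclusion criteria."""
    age = record.get("age")
    # Criterion 1: age < 18. Excluding unknown age is an assumption.
    if age is None or age < 18:
        return False
    # Criterion 2: primary site unknown or not skin.
    if record.get("primary_site") != "skin":
        return False
    # Criteria 3 to 7: any unknown value excludes the patient.
    if any(record.get(field) is None for field in REQUIRED_KNOWN):
        return False
    # Criterion 8: survival time unknown or shorter than 1 month.
    months = record.get("survival_months")
    if months is None or months < 1:
        return False
    return True


def apply_exclusions(records: List[Record]) -> List[Record]:
    """Keep only the eligible records."""
    return [r for r in records if is_eligible(r)]
```

Keeping each criterion as a separate check makes it straightforward to count how many patients are removed at each step, which is the information a screening flow chart reports.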

Statistical analysis
All statistical analyses in our study were performed with R software version 4.1.3 (https://www.r-project.org/). The R packages "survival", "rms", "ROCR", "ggDCA", "DynNom", and "shiny" were used to construct and validate the nomograms, plot the ROC curves, draw the calibration curves, and perform DCA. A two-sided P value below 0.05 was considered statistically significant.
Using R software, patients diagnosed in 2004-2015 were randomly divided into training and validation groups in a 7:3 ratio, and the distribution of characteristics between the two groups was compared using a chi-square test. Patients diagnosed in 2016-2017 served as the external validation cohort. The Cox proportional hazards model was used for univariate and multivariate analysis to identify independent risk factors for MCC. The independent predictors were then used to create a nomogram that predicts the 1-, 3-, and 5-year overall survival rates of MCC patients. The receiver operating characteristic (ROC) curve, the area under the curve (AUC), calibration curves (bootstrap validation with 1,000 resamples), and the C-index were used to assess the predictive power of the model. Decision curve analysis (DCA) was used to evaluate the clinical value of the nomogram; DCA assesses clinical utility by estimating the net benefit at each risk threshold. The nomogram for CSS was subsequently created and validated in the same way. Finally, patients were divided into three groups according to their nomogram scores (low risk, intermediate risk, and high risk), and Kaplan-Meier curves and the log-rank test were used to compare the differences between the three groups.
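To make the C-index concrete, the following is a minimal Python sketch of Harrell's concordance index. It is not the code used in this study (which relied on R packages); the function and argument names are illustrative, and pairs with tied survival times are simply skipped, which is a simplification.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[int],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's C: share of comparable pairs ranked correctly by risk.

    times       follow-up time for each patient
    events      1 if death was observed, 0 if censored
    risk_scores higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a pair is anchored on the patient with an observed event
        for j in range(n):
            # Comparable only if patient i died strictly before time j.
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported C-index values should be read.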

Demographic and clinicopathological characteristics
After screening, 4317 patients with MCC were included, of whom those diagnosed in 2004-2015 (n=3480) were used to establish and internally validate the prediction model. The demographic and clinical characteristics of the training group and validation group are shown in the baseline characteristics table.

FIGURE 1
Flow chart for inclusion and exclusion of patients.

Univariate and multivariate Cox regression analysis
We used univariate regression analysis to identify risk factors associated with OS, including age, sex, race, marital status, primary site, multiple primary tumors, tumor size, lymph node involvement, AJCC stage, surgery, radiotherapy, and chemotherapy. We then performed a multivariate regression analysis on these clinical characteristics to identify the independent risk factors for OS (Table 2). The results showed that age, sex, race, primary site, multiple primary tumors, tumor size, lymph node involvement, AJCC stage, surgery, and radiotherapy were independent risk factors predicting OS in MCC patients. In the same way, we found that age, sex, primary site, tumor size, lymph node involvement, and AJCC stage were independent risk factors associated with CSS in MCC patients (Table 3).

Construction and validation of the nomogram
Significant independent risk factors from the multivariate analysis were used to construct the nomograms to predict 1-, 3-, and 5-year OS (Figure 2A) and CSS (Figure 2B). The scale at the top of the nomogram provides a score for each prognostic variable, and the sum of all scores is located on the total-score scale at the bottom of the nomogram to read off the predicted OS/CSS. The nomograms were then validated by C-index, calibration curves, and ROC curves.
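The way a Cox-based nomogram turns variables into points and then into a survival probability can be sketched as follows. This Python example is illustrative only: the coefficients and baseline survival values are hypothetical placeholders, not the fitted values of this study, and all coefficients are assumed positive with 0/1 indicator variables so that points can be scaled against the largest effect.

```python
import math
from typing import Dict

# Hypothetical log hazard ratios (NOT the fitted values from this study).
COEFS: Dict[str, float] = {
    "age_ge_75": 0.60,
    "male": 0.25,
    "nodes_involved": 0.50,
}

# Hypothetical baseline survival S0(t) at 12, 36 and 60 months.
BASELINE_SURVIVAL: Dict[int, float] = {12: 0.92, 36: 0.80, 60: 0.70}


def linear_predictor(patient: Dict[str, int]) -> float:
    """Cox linear predictor: sum of coefficient times covariate value."""
    return sum(COEFS[name] * patient[name] for name in COEFS)


def nomogram_points(patient: Dict[str, int]) -> Dict[str, float]:
    """Scale each contribution so the largest single effect scores 100."""
    largest = max(COEFS.values())
    return {name: 100.0 * COEFS[name] * patient[name] / largest for name in COEFS}


def total_points(patient: Dict[str, int]) -> float:
    """Sum of the per-variable points (the total-score axis)."""
    return sum(nomogram_points(patient).values())


def predicted_survival(patient: Dict[str, int], months: int) -> float:
    """Cox prediction S(t | x) = S0(t) ** exp(linear predictor)."""
    return BASELINE_SURVIVAL[months] ** math.exp(linear_predictor(patient))
```

Because the total score is a rescaled linear predictor, reading survival from the bottom axis of a nomogram is equivalent to evaluating the Cox formula in `predicted_survival`.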

Risk stratification and online application for predicting OS
Based on the total score of each patient derived from the nomogram, we created a risk stratification system. Patients were divided into three groups: low-risk, intermediate-risk, and high-risk. Kaplan-Meier curves showed that the low-risk group had the best prognosis, the intermediate-risk group the second best, and the high-risk group the worst (Figure 9). Finally, we developed an easy-to-use web-based calculator to predict the OS of MCC patients based on the nomogram, which can be accessed at https://yijunle.shinyapps.io/DynNomapp/. The probability of survival at the predicted time can be obtained by entering the patient's characteristics on the web page, which makes the calculator convenient for clinical use.
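The stratification and survival comparison described above can be sketched in a few lines of Python. This is an illustrative example rather than the study's R code: the cutoff values are supplied by the caller because the text does not report them, and the names are hypothetical.

```python
from typing import List, Sequence, Tuple


def risk_group(total_points: float, low_cutoff: float, high_cutoff: float) -> str:
    """Assign a risk group from a nomogram total score.

    The cutoffs are hypothetical inputs; the article does not state them.
    """
    if total_points < low_cutoff:
        return "low"
    if total_points < high_cutoff:
        return "intermediate"
    return "high"


def kaplan_meier(
    times: Sequence[float], events: Sequence[int]
) -> List[Tuple[float, float]]:
    """Kaplan-Meier estimate as (time, survival) at each observed death time.

    events[i] is 1 for an observed death and 0 for a censored patient.
    """
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        at_risk = sum(1 for ti in times if ti >= t)
        if deaths > 0:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
    return curve
```

In practice one curve would be computed per risk group, and the separation between the three curves is what the log-rank test evaluates.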

Discussion
Merkel cell carcinoma (MCC) is a rare, neuroendocrine, cutaneous malignancy that was first described in 1972 (15). Sixty-five percent of MCCs present with local disease and no clinical or pathologic evidence of metastasis to regional lymph nodes (LNs) or distant sites (2). Approximately 5% of MCCs are found in the LNs without a primary tumor on the skin (2). Five-year OS ranges from 51% for patients with local disease to 14% for patients with distant metastases (2). MCC recurs most often within the first 2 years after diagnosis (16), so monitoring during this period is particularly important. To better manage and monitor MCC patients, we believe that summarizing prognostic factors and stratifying risk is necessary.
In this study, based on univariate and multivariate Cox proportional hazards regression analysis, we found that age, gender, race, primary site, multiple tumors, tumor size, lymph node involvement, stage, surgery, and radiotherapy were all prognostic factors for OS in MCC patients, and that age, gender, primary site, tumor size, lymph node involvement, and stage were all prognostic factors for CSS. We also created nomograms to quantitatively predict the 1-, 3-, and 5-year OS and CSS rates of individual patients from their characteristics. Additionally, MCC patients have an increased risk of hematologic malignancies and of developing secondary malignancies, both of which affect OS (17). Therefore, in this study, OS was used to define the risk groups. Using the OS nomogram, the risk scores of all MCC patients were calculated, and the patients were divided into low-risk, intermediate-risk, and high-risk groups. We built a calculator that takes a patient's basic information as input and quickly calculates the patient's risk score. With our calculator, high-risk patients with MCC can be better identified, which is helpful for follow-up management and monitoring and may help improve the survival rate of MCC patients.
Previous studies have confirmed that advanced age affects the survival rate of MCC patients, and this was also verified in our study (8). The incidence of MCC increases with age; in patients older than 85 years, a peak incidence rate of 17.6 has been reported (17). This may be related to long-term UV exposure and increased complications in old age. Morbidity and mortality have been reported to be higher in males than in females, which is consistent with our study (18,19). In the present study, race was a prognostic factor for OS but not for CSS. Skin pigmentation seems to protect against MCC, as black individuals have a considerably lower risk of MCC than white populations (1). The majority of the population included in this study was Caucasian, which reflects the limited population covered by the SEER database. Our study showed that black patients had a worse prognosis, but the scarcity of data from other races may bias the results. Larger populations and more comprehensive ethnic data are still needed to analyze the relationship between MCC and ethnicity. MCC is located mostly on sun-exposed areas, particularly the head and neck and, less frequently, the extremities and buttocks (1). The primary site has always been considered an important factor affecting prognosis. It has been previously reported that tumor localization in the head and neck has an adverse effect on survival (20). In this study, for OS, we found that MCC of the trunk had the best prognosis, followed by the extremities and then the head and neck, while other sites had the worst prognosis. However, for CSS, MCC of the extremities had the best prognosis, followed by the head and neck and then the trunk, while other sites had the worst prognosis. This is slightly different from previous research.
Multiple lesions are rare in MCC, and previous reports rarely mention the impact of multiple tumors on prognosis (1). We found that multiple tumors may lead to a worse prognosis. Tumor size is also an important factor affecting prognosis. The 5-year survival rate has been reported to decrease gradually with increasing tumor size (2). Our results agree, and we found that tumor size had a greater impact on CSS than on OS. Our study showed that lymph node involvement is an important factor affecting the prognosis of MCC. A previous study demonstrated that pathological nodal staging predicts survival more precisely than clinical nodal staging (2). Meanwhile, patients with occult nodal disease appear to have a better prognosis than those with clinically detected nodal disease (13). This likely reflects active immune clearance of the primary tumor prior to diagnosis (21). Patients with primary skin tumor lesions and lymph node involvement have a worse prognosis. Consistent with the results of a previous study, the higher the SEER stage, the worse the prognosis (22). That is, surrounding tissue invasion, lymph node metastasis, and distant metastasis were all poor prognostic factors for MCC.

FIGURE 8
Decision curves of the nomogram predicting OS in external validation cohort (A), the nomogram predicting CSS in external validation cohort (B). we did not find chemotherapy to significantly improve outcomes. After 2018, with the introduction of immunotherapy and targeted molecular therapy, the survival rate of MCC patients has improved. At present, there are still more immunotherapy and targeted molecular therapy in research. Since the 5-year survival rate is set as the observation endpoint in this study, we only obtained population samples before 2018 for research, and only used surgery, radiotherapy and chemotherapy as prognostic risk factors for analysis. In this study, we lack data on immunotherapy and targeted molecular therapy, which is related to the limitations of the database and our population selection. This is one of the limitations of this study.
In this study, we included prognosis-related characteristics, such as age, gender, race, marital status, primary site, multiple tumors, tumor size, lymph node involvement, AJCC stage, and treatments, drawn from the large population data of the SEER database. These factors are readily available in the clinic and allow a better assessment of the risk of MCC patients. In the present study, the internal C-index was above 0.7 and the external C-index was above 0.73, showing good discriminative ability to provide patients with prognostic information in a personalized manner. Likewise, the AUC indicated good discriminative ability. The calibration curves showed that the values predicted by the nomogram agreed well with the observed outcomes. In addition, DCA was performed to assess the clinical net benefit of the predictive model. The DCA curves for the 1-, 3-, and 5-year OS/CSS rates all indicated that the new model yielded a net clinical benefit. Our calculator can easily and quickly calculate the risk grouping of MCC patients from clinically relevant characteristics.
This study still had some limitations. Firstly, the population data provided by the SEER database come largely from the Caucasian population, which leads to racial limitations. As mentioned above, more complete ethnic data are needed for further research. Secondly, our prognostic risk factors were still insufficient. Due to database limitations, we lacked pathologically relevant features of MCC. If vitamin D deficiency, immunosuppressed status, MCPyV infection, CD8+ T cell levels, lymphovascular invasion, tumor growth pattern, and other information could be incorporated into the nomogram, its predictions would be more accurate and more individualized (9)(10)(11)(12). Thirdly, as mentioned above, we lack population data on new treatments and only included classical treatment methods in the analysis. At present, there is a lack of large-sample data to test the long-term effectiveness of immunotherapy. If immunotherapy and targeted molecular therapy can be included in prognostic analyses in future research, it will be of great benefit to progress in the treatment of MCC. Finally, although we performed external validation, we did not analyze 5-year survival in the external cohort because of the limited follow-up time.
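The net benefit that a decision curve plots can be illustrated with a short Python sketch. This is a simplified binary-outcome version (event by a fixed time point, no censoring), not the survival-based computation of the "ggDCA" package used in the study; the function names are illustrative.

```python
from typing import Sequence


def net_benefit(
    predicted_risk: Sequence[float],
    outcome: Sequence[int],
    threshold: float,
) -> float:
    """Net benefit of treating patients whose predicted risk >= threshold.

    Simplification: outcome is binary (1 = event), with no censoring.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(outcome)
    true_pos = sum(
        1 for p, y in zip(predicted_risk, outcome) if p >= threshold and y == 1
    )
    false_pos = sum(
        1 for p, y in zip(predicted_risk, outcome) if p >= threshold and y == 0
    )
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(outcome: Sequence[int], threshold: float) -> float:
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(outcome) / len(outcome)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)
```

A model is considered clinically useful at a given threshold when its net benefit exceeds both the treat-all value and zero (the treat-none strategy), which is how the decision curves in this kind of analysis are typically read.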

Conclusion
In conclusion, we combined demographic and clinicopathological characteristics from the SEER database to build nomograms that predict the prognosis of MCC patients. Among these characteristics, advanced age, male sex, black race, head and neck primary site, lymph node involvement, advanced AJCC stage, no surgery, and no radiotherapy were all associated with poor outcomes. The nomograms we established combine the relevant risk factors to predict the 1-, 3-, and 5-year OS/CSS rates of MCC patients. We established a calculator that can easily and quickly calculate the risk grouping of MCC patients from clinically relevant characteristics. For patients in the high-risk group, it is recommended to shorten the follow-up interval and to watch closely for recurrence, lymph node metastasis, and distant metastasis, which is of great significance for improving the prognosis of patients.

FIGURE 9
Kaplan-Meier curves of OS for patients in the low-, medium-, and high-risk groups in the training cohort (A) and validation cohort (B).

Data availability statement
The original contributions presented in the study are included in the article/supplementary material. Further inquiries can be directed to the corresponding author.

Ethics statement
This study was undertaken without institutional review board approval or informed consent since the SEER database is publicly accessible.

Author contributions
WX and JZ designed the study. YL was in charge of data collection and processing. The manuscript was written by WX and YL and was evaluated and modified by JZ. All authors contributed to the article and approved the submitted version.