Increases prognostic value of clinical-pathological nomogram in patients with esophageal squamous cell carcinoma

Background This study was intended to construct a brand new prognostic nomogram after combine clinical and pathological characteristics to increases prognostic value in patients with esophageal squamous cell carcinoma. Methods A total of 1,634 patients were included. Subsequently, the tumor tissues of all patients were prepared into tissue microarrays. AIPATHWELL software was employed to explore tissue microarrays and calculate the tumor-stroma ratio. X-tile was adopted to find the optimal cut-off value. Univariate and multivariate Cox analyses were used to screen out remarkable characteristics for constructing the nomogram in the total populations. A novel prognostic nomogram with clinical and pathological characteristics was constructed on the basis of the training cohort (n=1,144). What’s more performance was validated in the validation cohort (n=490). Clinical-pathological nomogram were assessed by concordance index, time-dependent receiver operating characteristic, calibration curve and decision curve analysis. Results The patients can divide into two groups with cut-off value of 69.78 for the tumor-stroma ratio. It is noteworthy that the survival difference was noticeable (P<0.001). A clinical-pathological nomogram was constructed by combining clinical and pathological characteristics to predict the overall survival. In comparison with TNM stage, the concordance index and time-dependent receiver operating characteristic of the clinical-pathological nomogram showed better predictive value (P<0.001). High quality of calibration plots in overall survival was noticed. As demonstrated by the decision curve analysis, the nomogram has better value than the TNM stage. Conclusions As evidently revealed by the research findings, tumor-stroma ratio is an independent prognostic factor in patients with esophageal squamous cell carcinoma. The clinical-pathological nomogram has an incremental value compared TNM stage in predicting overall survival.


Introduction
Esophageal cancer (EC) is a highly malignant tumor with a poor prognosis. Apart from that, the 5-year survival rate of the EC patients with advanced stage is nothing more than about 10% (1). The morbidity and mortality of EC are considerably high in China. Worse still, about 200,000 people die of EC every year (2). The principal pathological types of EC are two types of esophageal squamous cell carcinoma (ESCC) and esophageal adenocarcinoma (EAC). ESCC is the predominant pathological type with 90% of the EC in China (3). So far, the determination of the prognosis of patients with ESCC primarily rests with the TNM stage, but the current research shows that the TNM staging method cannot predict the prognosis of patients well (4). What's more, the accuracy of the prognosis of patients can be heightened by adding clinical and pathological factors (5,6). As a result, new indicators need to be explored to evaluate the prognosis of patients.
As persuasively demonstrated by a multitude studies (7)(8)(9), the interstitial components of tumor tissue affect the prognosis of tumor patients. Nevertheless, owing to the lack of accurate and reliable evaluation methods for tumor stroma, the association between tumor stroma and patient prognosis is still unclear. This retrospective study explored the correlation between the tumor stroma and the overall survival (OS) as well as clinical characteristics of patients with ESCC by detecting the tumor-stromal ratio (the ratio of the area of stromal to the cancerous, TSR), to establish a classification criteria for TSR and to propose a brand new prognostic indicator. Aside from that, this study combined the TSR with the clinical and pathological characteristics of patients to construct a clinical-pathological nomogram, aiming to provide a better prediction model for the survival of ESCC.

Study populations and follow-up
This study included 1,634 patients with pathologically diagnosed ESCC who were treated with radical surgery. The information of these patients comes from the tissue bank, clinical diagnosis, treatment, pathology, follow-up information database of about 500,000 cases of esophageal and gastric cardia carcinoma , established by the State Key Laboratory of Esophageal Cancer Prevention & Treatment and Henan Key Laboratory for Esophageal Cancer Research of The First Affiliated Hospital, Zhengzhou University. The clinical information of patients collected from the database, including gender, age at diagnosis, smoking history, drinking history, family history, high and low incidence area [the mortality, and mortality of esophageal cancer greater than 50/100,000 is defined as high incidence area (10)], and pathological information, including the degree of differentiation, T stage, N stage, M stage, as well as the patient's H&E tissue microarrays. All patients were staged in accordance with to the 2002 American Joint Committee on Cancer (AJCC) esophageal cancer staging system. The documentation of informed consent was waived owing to anonymity of the participants. This study was approved by the Ethical Committee of The First Affiliated Hospital of Zhengzhou University. All procedures involving human participants were performed in accordance with the ethical committee's standards and the Helsinki declaration. The follow-up of this study is principally through letters, telephone calls, home visits, and direct contact between village doctors and patients/their families, or through the new cooperative medical database, the medical security bureau database, and citizen death information registration management systems. It is carried out in an annual follow-up to record the time of death and the predominant cause of death to January 27, 2022. The primary clinical outcome was overall survival (OS).

Detection of tumor-stroma ratio
Tissue microarrays (TMAs) is a brand new method for highthroughput analysis of diagnostic or predictive markers in cancer specimens (11). What's more, it has excellent applications in immunohistochemical analysis and observation of pathological features (12,13). A total of 1,634 samples were histologically screened by H&E staining, and the sections were observed under a microscope. The 40 magnification was adopted to find the most infiltrating area of tumor tissue, and the representative area was marked in the paraffin block. Two cores (a cylinder with a diameter of 1.5 mm and a height of 3-5 mm) were taken out from the paraffin block to prepare a tissue chip. Each TMAs contains 150 cores (TMAs were constructed by Wuhan Servicebio Technology Co., Wuhan, China). AIPATHWELL software (developed by Wuhan Servicebio Technology Co.) was utilized to scan and probe deep into the tissue microarray. The software distinguishes cancerous and stroma in accordance with tissue characteristics such as cell morphology and nucleus morphology. Apart from that, the output results are data such as cancerous area, stroma area. The tumor-stroma ratio (TSR) was expressed as the area of stroma/total area (stroma area + cancerous area). To ensure the accuracy of the results, we selected two cores for each patient and placed them on the tissue microarray. Subsequently, we took the mean as the final TSR value. The X-tile software was adopted to determine the cut-off value for the TSR. The X-tile software provides a convenient and comprehensive method to stratify the population by finding the optimal cut-off value based upon the patient's overall survival time, when the difference in survival between these two groups is most striking (14).

Construction of the nomogram
The clinical and pathological information of patients were first considered as candidate variables for the prognostic model. It is noteworthy that all variables are categorical variables. The factors which affecting OS in univariate analysis (P < 0.05) were then included in Cox analyses. Subsequently independent prognostic risk factors (P < 0.05) were included as the final variables to construct the nomogram model. In the training set data, these characteristics were used to construct nomogram to predict the probability of patient survival time up to 1-year, 3-year, and 5-year by employing the rms package.

Calibration, validation and risk stratification of nomogram
Predictive performance was quantified with reference to accuracy (calibration), discrimination, and clinical application. The agreement between the predicted survival probability calculated by the nomogram and the patient's actual survival probability was assessed by plotting 1-, 3-, and 5-year calibration curves for a bootstrapped with 1,000 sample. The discrimination of the nomogram model was assessed by calculating the concordance index (C-index), NRI and time-dependent receiver operating characteristic (ROC) (15). The clinical application and net benefit at diverse threshold probabilities of the nomogram model was assessed by decision curve analysis (DCA) (16). Patients in the study were divided into high-risk or lowrisk groups in accordance with the optimal cutoff point for the total prognostic score (TPS) decided by the X-Tile software. Survival curves were drawn by adopting Kaplan-Meier analysis. The potential correlations between risk scores and OS among dissimilar subgroups were investigated in the overall patient population (14).

Statistical analysis
All statistical analyses were performed by employing the R software (version 4.2.0). T test was adopted to probe into continuous variables, categorical variables by adopting c2 test and Fisher's exact two-sided test. Survival analysis was drawn by employing Cox analysis and the Kaplan-Meier method. The nomogram and calibration plot were constructed using the "rms" package; the time-dependent receiver operating characteristic (ROC) was established by adopting the "timeROC" package; the DCA was established by employing the "rmda" and "ggDCA" package. P < 0.05 (two-sided) was considered statistically significant.

Patient characteristics
The characteristics information of the 1,634 patients is shown in Table 1. The male to female ratio is 1.9:1; the mean age is 61 years, and the age range is 33-85. The median survival time of patients with ESCC was 2.0 years, and the 1-, 3-and 5-years OS rates were 77.2%, 35.1% and 23.7% respectively.
Patients were randomly divided into training and validation sets in a ratio of 7:3. The male to female ratio in the training set is 1.9:1; the mean age is 62 years, and the age range is 33-85. The median survival time of patients with ESCC was 2.0 years, and the 1-, 3-and 5-years OS rates were 76.5%, 35.1% and 24.1% respectively. The ratio of male to female in the validation set is 1.9:1; the average age is 61 years, and the age range is 38-82. The median survival time of patients with ESCC was 2.0 years, and the 1-, 3-and 5-years OS rates were 77.6%, 34.9% and 22.9% respectively.

Association analysis of TSR and clinical characteristics
All patients successfully collected the detection value of TSR. The classification criteria for TSR were explored by comparing the relationship between TSR and survival for all patients. And the optimal cut-off value will be found when the survival difference was greatest. The optimal cut-off value was found by X-tile software. The result showed that patients were divided into two groups when the TSR cutoff value was 69.78, at which time the survival difference between the two groups was the greatest (P<0.001), and 69.78 was the optimal cutoff value for TSR. According to the optimal cut-off value, patients with TSR more than 69.78 were divided into stromarich group and patients with TSR less than 69.78 were divided into stroma-poor group (Figure 1). And a total of 1324 patients were divided into stroma-poor group and 310 patients were divided into stroma-rich group. The results showed that the abundance of the stroma in tumor tissue significantly affects the prognosis of patients, and patients survived significantly better when the stroma was abundant, and the TSR is an independent prognostic factor in patients with ESCC. The survival curves of the two groups are shown in Figure 2G. We applied the c2 test to assess associations between TSR and clinical characteristics. We found that the TSR was significantly associated with the differentiation (P = 0.004) and depth of invasion (P = 0.002). In the stroma-rich group, the proportion of well-differentiated patients was relatively high, while, patients in the stroma-rich group had a relatively higher depth of infiltration. The relationship between TSR and differentiation and T stage may be the reason why it affects the prognosis of patients. The relationship between the clinical characteristics and the TSR is shown in Table 2.

Prognostic roles of clinical and pathological characteristics
To evaluate the potential association of clinical factors with overall survival, we performed univariate and multivariate Cox analysis in the total population, the prognostic role of each characteristic in OS was tested in the overall populations ( Figure 2). And the univariate analysis results showed that age, drinking history, family history, area, TSR, differentiation, T stage and N stage were significantly associated with OS (P < 0.05) (Figure 2A), and multivariate Cox analysis results showed that age, drinking history, family history, area, TSR, T stage and N stage were independent predictors of OS (P < 0.05) ( Figure 2B). Data are also represented using Kaplan-Meier curves (Figures 2C-J).

Construction of nomogram
To address the patient's prognosis, we constructed a clinicalpathological nomogram ( Figure 3A), and age, drinking history, family history, area, TSR, differentiation, T stage, and N stage were finally included into the model. Age was the age of patients at the time of diagnosis of the disease and was divided into two age groups, <60 and ≥60; drinking history was divided into positive and negative groups according to whether patients drank alcohol; family history was divided into positive and negative groups according to whether patients had tumor patients in their immediate family; area was divided into two groups, high and low incidence areas, according to the mortality rate of patients with ESCC in the area where patients lived (10); TSR was divided into stroma rich and stroma poor groups according to the results of this study; according to the AJCC stage, differentiation was divided into well differentiated, moderately differentiated, and poor differentiated, and T-stage was divided into T1, T2, T3 and T4; and N-stage was divided into N0 and N1. Among them, the differentiation was not an independent predictor for prognosis in the multivariate analysis, because it was included in the latest version of AJCC stage, differentiation was also included in the final nomogram. Since there was no difference in survival between moderate differentiation and poor differentiation patients in this study, thus combining them into one category. To investigate whether the clinical-pathological nomogram has incremental prognostic value for individualized OS prediction, a TNM stage nomogram model was also constructed, including only differentiation, T stage, N stage ( Figure 3B). In the nomogram, each characteristic has its corresponding points, and total points is obtained by summing the scores corresponding to each characteristic. The scale at the bottom of the total points indicates the survival rates at 1, 3 and 5 years. The probability of survival at   prediction of the clinical-pathological nomogram was better than that of the traditional TNM stage in 1-, 3-, and 5-year OS prediction. Figure 5 shows the time-dependent ROC curves of the clinicalpathological nomogram and the TNM stage in the training and validation cohort. The results showed that in the training cohort, the area of under curve (AUC) of the 3 years and 5 years of the clinicalpathological nomogram model was significantly larger than that of the TNM stage nomogram model, while the AUC of the 1 year of the two models was similar, thus the clinical-pathological nomogram was better than the TNM stage at 3 years and 5 years (P < 0.001), and the two models were similar at 1 year (P = 0.414); in the validation cohort the results were similar to the training cohort, the clinical-pathological nomogram model was better than the TNM stage nomogram model at 3 years and 5 years (P < 0.001), and the two models were similar at 1 year (P = 0.306).

Clinical benefits of clinical-pathological nomogram model
The clinical benefit of the clinical-pathological nomogram model was compared with the TNM stage nomogram model. The Decision curve analysis (DCA) curve showed that when the threshold probability was > 0.5, the area under the decision curve of the clinical-pathological nomogram model was larger than that of the TNM stage nomogram model in the 3-year and 5-year, thus the clinical-pathological nomogram model could better predict the 3-year and 5-year OS of patients. And the clinical-pathological nomogram model yielded a larger net benefit than TNM stage nomogram model when the threshold probability was > 0.5 ( Figure 6).

Performance of prognostic nomogram model
For further revealing the prognostic prediction value of the clinical-pathological nomogram model, the X-tile software was used to categorized into high-risk and low-risk groups according to the optimal cut-off value of the total predictive scores in the training cohort. The relationships among risk scores and survival status of patients in training cohort and validation cohort are shown in Figures 7A, B. Kaplan-Meier survival curves revealed that patients with higher risk scores had significantly poorer OS (P<0.001, Figures 7C, D). Subgroup analyses were performed in the overall populations, and the clinical-pathological model also maintained a good and stable prediction performance (Figure 8).

Discussion
Notwithstanding the fact that great progress has been made in the prevention and treatment of ESCC in recent decades, the prognosis of ESCC is still poor with the high mortality rate (17, 18). The TNM stage system is adopted to predict the prognosis of numerous cancers, but TNM stage system is difficult to effectively predict the prognosis of individual patients due to multitudinous factors affecting the prognosis. For instance, patients with the same pathological stage, and treated by the same treatment plan, have completely dissimilar prognosis, indicating that there are biological differences between distinct patients. Furthermore, it is not enough to judge the prognosis on the basis of the TNM stage alone. As a consequence, there is a need to add new factors that affect survival and establish more reliable prognostic tools that can be used for risk stratification and further clinical decision-making, thereby which is advantageous for patients to ameliorate their prognosis.
Tumor is composed of cancerous cells and their surrounding complex stroma, stroma play a paramount role in tumor genesis and tumor metastasis (19). The occurrence and development of tumor is a complex process in which cancerous cells and stroma continuously interact and influence each other (20). There are complex stroma surrounding the tumor, including blood and lymphatic vessels, fibroblasts, endothelial cells, immune cells, cytokines, extra-cellular vesicles, and extra-cellular matrix (21). The extra-cellular matrix is the primary component of stroma. Apart from that, its function is not only limited to connect, support and protect cells, but also play a crucial role in cell proliferation and differentiation, cell migration, cell adhesion, angiogenesis, tissue development and repair (22)(23)(24)(25). Aside from that, the accumulation and quantity of extra-cellular matrix components can result in tumor tissue fibrosis, which is required for wound healing and tissue repair to fight tissue infection, inflammation, autoimmune diseases, degenerative diseases, and tumors, resulting in prolonged survival (26). The stroma also contains a multitude of immune cells, such as NK cells, CD8+ cytotoxic T cells and antigen presenting cells (APCs), which can inhibit tumor growth and the immune response generated by the immune cells has anti-tumor properties (27). The stroma affects tumor growth and biological behavior. What's more, the value of stroma in tumor prognosis prediction has been paid an heightening amount of attention (28). Tumor-stromal ratio (TSR) is a common stroma statistical method by calculation the ratio of area occupation of the stroma to tumor tissue. For the time being, a myriad of studies (29-31) have used the TSR in the prediction of postoperative prognosis of tumor patients. Furthermore, the results show that the TSR is independent predictor of prognosis in cancer patients. In the practice of Chinese doctors, those tumor with rich stroma are called hard cancers, and with poor stroma are called soft cancers. Aside from that, they found that the prognosis of hard cancers is better than that of soft cancers. In this investigation, we found a novel classification criteria by comparing the correlation between the richness of the stroma (tumor-stromal ratio) and survival in patients with ESCC. This classification criteria can be adopted to group patients by adopting TSR, which provides a basis for subsequent studies on the role of stroma in tumor development. In line with the classification criteria in this research, the stroma-rich group survived remarkably better than stroma-poor group. Apart from that, we also found that in the stromal rich group, the differentiation of patients was better than that in the stromal poor group, nevertheless, the depth of invasion was higher than that in the stromal poor group. With this classification criteria, the survival and clinical characteristics of patients in the two groups differed strikingly, which demonstrated that the classification criteria had great value. As a consequence, the TSR was incorporated into the construction of the prognostic model.
Esophageal cancer in China has the characteristics of conspicuous familial aggregation phenomenon and marked regional distribution differences (forming apparent high and low incidence areas) (1,32). As already demonstrated by the epidemiological studies, individuals with a family history of esophageal cancer have a higher probability of developing esophageal cancer and a worse prognosis than individuals without a family history (33,34). The reasons of higher risk of individuals with a positive family history may be on account of genetic inheritance. Apart from that, patients with a family history of esophageal cancer are more likely to have a genetic predisposition than those without a family history of cancer (34).
Since family history is a reflection of an individual genetic risk factor, and it is believed that family history can be used as an indicator to be included in the predictive model for identification of high-risk groups (35). What's more, the environment is also a pivotal factor affecting the occurrence of esophageal cancer. In China, there are striking geographical differences in the incidence of esophageal cancer, with pronounced high-incidence areas and low-incidence areas, the natural environment and living customs in diverse regions are crucial factors affecting the incidence of esophageal cancer (1,32). As persuasively demonstrated by relevant studies (34,36), environmental and genetic risk factors may exert collective influence on individual morbidity risk and prognosis survival. Shared genetic susceptibility and environmental exposure, or the possibility of their interaction, may be paramount factors in the development and progression of esophageal cancer. In this exploration, we found that patients with negative family history and patients in low-incidence areas had markedly better survival than patients with positive family history and patients in high-incidence areas (P < 0.05), so high and low incidence areas were incorporated into the model construction. Furthermore, age, drinking history, differentiation, T stage, and N stage are evidently prognostic factors that affect the survival of patients with esophageal cancer (37,38). What's more, they were also included in the construction of the prognostic model in this research.
In this investigation, we constructed a nomogram model to predict the prognosis of patients with ESCC. The nomogram has been extensively employed as a predictive method in oncology (39-41), it meets the requirements of an ensemble model and plays a role in driving personalized medicine for clinicians to use for prognosis prediction (42). Traditionally, tumor staging on the basis of the AJCC/UICC criteria has been the initial choice for predicting the prognosis of ESCC. Nevertheless, diverse prognoses were observed in patients at the same stage. This prognostic heterogeneity can be explained by age, family history, high and low incidence regions, and other factors that are not considered in tumor staging based upon AJCC/UICC criteria. This study included the largest sample size of patients with ESCC and the longest follow-up time to explored the association between TSR and clinical characteristics and prognosis of patients with ESCC, clarified that TSR is an crucial factor affecting the prognosis of patients with ESCC, and developed a novel criteria for the classification of TSR and include TSR in the nomogram model for patients with ESCC for the first time. A clinical-pathological nomogram model was constructed by combining the clinical information (age, drinking history, family history, and high-low risk area) and pathological information (TSR, differentiation, T stage and N stage) of patients. The C-index and NRI of the internally validated clinical-pathological model constructed in this exploration were remarkably higher than those of the TNM staging. In the training set and validation set, the 3-year and 5-year predictive ability of the clinical-pathological nomogram model was found to be strikingly better than that of the TNM staging (P < 0.001), indicating that the predictive performance of nomogram model constructed in this investigation was superior to that of TNM analysis. The results of decision curve analysis (DCA) indicated that the clinical-pathological model had a better net benefit than TNM staging in both the training and validation sets for 3-year and 5-year OS prediction. Based upon the risk scores of patients predicted by the nomogram model, patients were divided into two groups (high-risk and low-risk). Aside from that, the survival of patients in the high-risk group was conspicuously worse than that in the low-risk group (P < 0.001); in the subgroup analysis, the survival of patients in the high-risk group was noticeably worse than that in the low-risk group (P < 0.001). As evidently displayed by all these the research findings, the prognostic prediction model constructed in this research has better predictive and clinical application value than TNM staging.
There were some limitations to our research. First and foremost, our analysis may have been affected by bias owing to the retrospective of the study. Aside from that, in this exploration, whether the patients received chemoradiotherapy before surgery was considered, but may receive dissimilar treatments during the long postoperative course, but this factor was not addressed detailedly in our present study. Furthermore, this study used TMAs to observe TSR, which has the limitation of under-representation compared to whole sections.

Conclusion
This study confirmed that the TSR was an independent prognostic factor in patients with ESCC. Combined with tumor-stromal ratio and clinical and pathological factors, a prognostic nomogram is constructed, which has higher accuracy and better clinical value than traditional TNM stage. Furthermore, it provides a method for clinicians to predict ESCC prognosis and to adjust the treatment strategy.

Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement
The studies involving human participants were reviewed and approved by Ethical Committee of The First Affiliated Hospital of Zhengzhou University. The patients/participants provided their written informed consent to participate in this study.

Author contributions
LDW, JFH and XS conceptualized and designed the study. FYZ, JLL, XZW, XML, RW, ZMF, XNH, BL, HJY, FBC, JLR and SLZ provided tumor samples collection and assembly. XKZ, RHX, LLL, YC, LYL JJJ, YZY and MMY provided patients' information collection and follow-up. JFH, XS and KZ provided analyzed and interpreted the data. JFH, PPW and MXW drafted the manuscript. All authors reviewed the manuscript. All authors contributed to the article and approved the submitted version.

Funding
This work was supported by the National Natural Science Foundation of China (No. 81872032 and No. Ul804262).