A Nomogram Based on Serum Biomarkers and Clinical Characteristics to Predict Survival in Patients With Non-Metastatic Nasopharyngeal Carcinoma

Objective: This study aimed to develop an effective nomogram to improve prognostication for patients with primary nasopharyngeal carcinoma (NPC) restaged according to the eighth edition of the AJCC/UICC TNM staging system. Methods: Using data from 5,903 patients with non-metastatic NPC (primary cohort), we performed Cox regression analysis to identify survival risk factors and constructed a nomogram. We used the nomogram to predict overall survival (OS), distant metastasis-free survival (DMFS), and disease-free survival (DFS) in the primary cohort and in an independent validation cohort (3,437 patients). We also compared the prognostic accuracy of the nomogram with that of the 8th edition TNM system. Results: The nomogram included gender, age, T stage, N stage, Epstein–Barr virus DNA, hemoglobin, C-reactive protein, lactate dehydrogenase, and radiotherapy with/without induction or concurrent chemotherapy. In predicting OS, DMFS, and DFS, the nomogram had a significantly higher concordance index (C-index) and area under the ROC curve (AUC) than the TNM system alone. Calibration curves showed satisfactory agreement between nomogram-predicted and observed survival. Stratification into risk groups clearly separated the Kaplan–Meier curves for OS, DMFS, and DFS. Conclusion: The nomogram provided more precise prognostic prediction for NPC patients than the 8th edition TNM system and could therefore facilitate individualized patient counseling and care.


INTRODUCTION
Nasopharyngeal carcinoma (NPC), which arises from the nasopharyngeal epithelium, is the most common head and neck malignancy in Southeast Asia and southern China (1). Risk factors for NPC include genetic susceptibility, diet, and Epstein-Barr virus infection (2,3). In 2018, there were 129,079 new cases of NPC and 72,987 deaths from the disease worldwide (4). Radiotherapy is the mainstay of treatment for NPC, and combined chemoradiotherapy is more effective in advanced-stage disease (5). Nevertheless, the survival of most NPC patients remains poor. Furthermore, even among patients with the same TNM stage who received similar treatment, more than 20% had poor outcomes (6), which suggests that treatment failure is partly attributable to the limited prognostic accuracy of the TNM staging system. Therefore, in addition to improving therapies for NPC, more precise prognostic evaluation is needed to assess the malignancy of the disease and to optimize treatment. The AJCC/UICC TNM staging system is the most commonly used prognostic tool. However, previous studies have shown that it sometimes fails to predict prognosis satisfactorily (7-9). Identifying additional prognostic factors could therefore complement the TNM staging system in predicting the survival of NPC patients. In recent years, a growing number of readily available serum markers have been recognized as prognostic markers for NPC, including Epstein-Barr virus DNA (EBV-DNA) (10), hemoglobin (HGB) (11), albumin (ALB) (12), C-reactive protein (CRP) (13), and lactate dehydrogenase (LDH) (14). These markers are measured in routine clinical testing.
Nomograms have recently emerged as reliable tools for predicting prognosis in carcinomas (15-17). A nomogram combines several variables, weighting each by its effect on survival, into a convenient model for predicting survival (18). Therefore, using data from 9,340 patients with non-metastatic NPC, we analyzed the prognostic effects of serum markers in NPC. In addition to these hematological features, we incorporated the TNM staging system and clinical factors to establish a nomogram that predicts overall survival (OS), distant metastasis-free survival (DMFS), and disease-free survival (DFS) in NPC patients, with the aim of supporting clinical decision making and improving treatment outcomes.

Patients
NPC patients were divided into a primary cohort (5,903 patients, about 60% of all patients in this study) and a validation cohort (3,437 patients, the remaining roughly 40%) according to the chronological order in which they received initial treatment. The primary cohort comprised 5,903 patients with primary NPC treated at Sun Yat-sen University Cancer Center from January 2009 to June 2014. The inclusion criteria were as follows: [1] non-metastatic NPC confirmed by histopathology; [2] adequate clinical data and examination information; [3] no distant metastasis before or during treatment; [4] no evidence of other tumors or other serious diseases. Using the same criteria, we screened 3,437 patients with primary NPC treated at the same institution from July 2014 to April 2016 and regarded them as an independent validation cohort.
All NPC patients received radiotherapy with or without induction or concurrent chemotherapy. For patients receiving induction chemotherapy (IC), one of the following regimens was administered every 3 weeks for three cycles before radiotherapy: docetaxel plus cisplatin/nedaplatin plus 5-fluorouracil; docetaxel plus cisplatin/nedaplatin; gemcitabine plus cisplatin/nedaplatin; or cisplatin/nedaplatin plus 5-fluorouracil. Concurrent chemotherapy (CC) consisted of cisplatin administered either every 3 weeks for 2-3 cycles (100 mg/m²) or weekly until the completion of radiotherapy (40 mg/m²). For patients with a contraindication to cisplatin, nedaplatin or carboplatin was substituted.
The patients' gender, age, smoking or drinking history, family history of tumor, treatment (radiotherapy with or without IC or CC), and serological data, including pretreatment (pre-) EBV-DNA, pre-HGB, pre-ALB, pre-CRP, and pre-LDH levels, were obtained from the clinical records. All patients were restaged according to the eighth edition of the AJCC/UICC TNM staging system. Serum biomarkers and clinical characteristics were measured and collected within the two weeks before the start of treatment. The Hospital Ethics Committee of Sun Yat-sen University Cancer Center, China, approved the study, which analyzed anonymized information, and waived the requirement for informed consent.

Follow-Up
The primary endpoint was overall survival (OS), and the secondary endpoints were distant metastasis-free survival (DMFS) and disease-free survival (DFS). Patients were followed up every three months during the first two years, every six months during the next three years, and annually thereafter until death.

Statistical Analysis
We transformed continuous variables into categorical variables. Age was grouped into <40, 40-49, 50-59, and ≥60 years. According to the standard definition of anemia, pre-HGB was grouped into <120 g/L and ≥120 g/L. The optimal cut-off values for the other continuous variables were determined in the primary cohort by maximizing Youden's index, that is, sensitivity + specificity - 1 (equivalently, the difference between sensitivity and 1 - specificity on the receiver operating characteristic (ROC) curve). Based on the maximum Youden's index for OS, the cut-off values were as follows: pre-EBV-DNA (4,000 copies/ml), pre-ALB (45 g/L), pre-CRP (2 mg/L), and pre-LDH (180 U/L).
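The cut-off search described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `youden_cutoff`, the rule that a value at or above the cut-off counts as "positive", the treatment of the outcome as a simple binary event indicator, and the toy values are all assumptions made for illustration.

```python
def youden_cutoff(values, events):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    Assumption: a patient is classified 'positive' when value >= cutoff,
    and `events` holds 1 for an event (e.g. death) and 0 otherwise.
    """
    n_pos = sum(events)
    n_neg = len(events) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one event and one non-event")
    best_cutoff, best_j = None, float("-inf")
    # Every distinct observed value is tried as a candidate cut-off.
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, e in zip(values, events) if e and v >= cutoff)
        tn = sum(1 for v, e in zip(values, events) if not e and v < cutoff)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:  # ties keep the lowest cut-off
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Made-up marker values and event indicators, for illustration only.
marker = [150, 160, 170, 175, 185, 190, 200, 220]
died = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, j = youden_cutoff(marker, died)
print(cutoff, j)
```

For a marker where low values indicate worse outcome (such as HGB or ALB in this study), the same search would be run with the comparison reversed.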
Abbreviations: AIC, Akaike information criterion; ALB, albumin; AUC, area under curve; CRP, C-reactive protein; C-index, concordance index; CI, confidence interval; CC, concurrent chemotherapy; DFS, disease-free survival; DMFS, distant metastasis-free survival; EBV-DNA, Epstein-Barr virus DNA; HGB, hemoglobin; IC, induction chemotherapy; LDH, lactate dehydrogenase; NPC, nasopharyngeal carcinoma; OS, overall survival; pre-, pretreatment; ROC, receiver operating characteristic.
Variables with P < 0.05 in univariate Cox regression analyses were entered into multivariable analysis, and variables with P < 0.05 in multivariable Cox regression analysis were considered independent prognostic factors for survival. The TNM staging system and treatment items were regarded as necessary prognostic variables in this study. All independent or necessary prognostic factors were used to build a predictive nomogram with the rms package in R.
The Akaike information criterion (AIC) and the concordance index (C-index) with 95% confidence intervals (CIs) were calculated to assess the accuracy of the nomogram in the primary and validation cohorts. Calibration plots for OS, DMFS, and DFS at three and five years were generated by comparing nomogram-predicted survival with observed survival. To compare the nomogram with the TNM staging system, the predictive precision and discrimination of both were assessed using AIC, C-index (95% CI), the area under the ROC curve (AUC), and decision curves.
Survival curves for OS, DMFS, and DFS were plotted using the Kaplan-Meier method, and survival among the three risk groups was compared using the log-rank test.
Statistical analyses were performed with R version 3.6.1 (http://www.r-project.org) and IBM SPSS version 25.0 (IBM, Armonk, NY, USA). All statistical tests were two-sided, and P < 0.05 was considered statistically significant.
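As an illustration of the discrimination measure used here, the following Python sketch computes Harrell's C-index for right-censored data: among comparable patient pairs, the fraction in which the patient with the higher predicted risk has the earlier observed event. The study itself used R; the function name `harrell_c_index`, the half-credit handling of tied risk scores, and the toy data are assumptions for illustration and do not reproduce the authors' computation or their confidence intervals.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    A pair (i, j) is comparable when patient i has an observed event
    and a strictly shorter follow-up time than patient j.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # higher risk failed first
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied predictions get half credit
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Made-up follow-up times (months), event flags, and model risk scores.
times = [5, 10, 15, 20, 25]
events = [1, 0, 1, 1, 0]
risk = [0.9, 0.3, 0.7, 0.2, 0.4]
c_index = harrell_c_index(times, events, risk)
print(round(c_index, 3))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the nomogram and the TNM system are compared.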
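For readers unfamiliar with the Kaplan-Meier method, a minimal product-limit estimator can be sketched as follows. The function name `kaplan_meier` and the toy data are assumptions for illustration; the log-rank comparison between groups is not shown.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    `events` holds 1 for an observed event and 0 for censoring.
    """
    order = sorted(zip(times, events))
    at_risk = len(order)
    surv = 1.0
    curve = []
    i = 0
    while i < len(order):
        t = order[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(order) and order[i][0] == t:
            deaths += order[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk  # product-limit step
            curve.append((t, surv))
        at_risk -= removed  # events and censored patients leave the risk set
    return curve


# Made-up follow-up times and event flags, for illustration only.
km_curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
for t, s in km_curve:
    print(t, round(s, 3))
```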

RESULTS
Patient Characteristics and Follow-Up
A total of 5,903 (primary cohort) and 3,437 (validation cohort) patients with NPC were eligible for this study. The median age was 45 years (range, 7-80) in the primary cohort and 45 years (range, 6-85) in the independent validation cohort. The male-to-female ratio was 2.86:1 in the primary cohort and 2.57:1 in the validation cohort. Table 1 compares the two cohorts: patients in the validation cohort had more advanced N stage and lower levels of pre-EBV-DNA, pre-HGB, pre-CRP, and pre-LDH.
The median follow-up for OS, DMFS, and DFS and the 3- and 5-year OS, DMFS, and DFS rates are shown in Table 2.

Univariate and Multivariate Cox Regression Analyses
In univariate Cox regression analysis, the variables significantly associated with poorer OS were male gender; advanced age, T stage, and N stage; smoking history; higher plasma pre-EBV-DNA (≥4,000 copies/ml), pre-CRP (≥2 mg/L), and pre-LDH (≥180 U/L); lower pre-HGB (<120 g/L) and pre-ALB (<45 g/L); and radiotherapy with induction chemotherapy (IC) (Table 3). Phase III randomized trials have shown that radiotherapy with concurrent chemotherapy (CC) is the standard treatment for advanced NPC and markedly improves survival (19,20). In this study, radiotherapy with/without CC was an independent prognostic factor for DFS (Figure 1C). Thus, although radiotherapy with/without CC had a non-significant P-value of 0.076 for OS, we regarded it as a necessary prognostic variable and included it in the multivariate Cox regression analysis and in the nomogram. All of the factors above were entered into the multivariate Cox regression analysis. Finally, gender, age, T stage, N stage, plasma pre-EBV-DNA, pre-HGB, pre-CRP, pre-LDH, and radiotherapy with/without IC or CC were the significant prognostic factors. Detailed results of the univariate and multivariate Cox analyses for OS, DMFS, and DFS are shown in Table 3 and Figure 1.

Establishing and Validating a Nomogram
To provide a quantitative clinical tool for predicting the probability of OS, DMFS, and DFS, we built a nomogram from the prognostic factors identified above: gender, age, 8th edition T stage, 8th edition N stage, plasma pre-EBV-DNA, pre-HGB, pre-CRP, pre-LDH, and radiotherapy with/without IC or CC. The score for each variable is summed, and the total is located on the total-points scale to estimate 3- and 5-year OS, DMFS, and DFS; the nomograms were constructed in the primary cohort (Figures 2A-C).
The C-index of the nomogram for predicting OS and DMFS exceeded 0.7 in all cohorts, indicating satisfactory discrimination (Table 4). In the calibration plots, the x-axis shows the OS, DMFS, or DFS predicted by the nomogram, and the y-axis shows the observed OS, DMFS, or DFS calculated by the Kaplan-Meier method. The solid line is the ideal reference line representing perfect agreement between predicted and observed survival. The calibration plots for three- and five-year OS, DMFS, and DFS showed close agreement between prediction and observation in all cohorts (Figure 3).

Comparison of the Eighth Edition of the UICC/AJCC TNM System and the Nomogram in Patients With NPC
The prognostic accuracy of the eighth edition of the TNM system and of the nomogram for OS, DMFS, and DFS was compared in all cohorts. The nomogram had a lower Akaike information criterion (AIC) value and a higher C-index than the 8th edition TNM system when predicting OS, DMFS, and DFS in NPC patients (Table 4), indicating markedly higher predictive precision and discrimination than the TNM staging system. The ROC curves for 3- and 5-year OS, DMFS, and DFS also demonstrated the better predictive performance of the nomogram (Figure 4). Further, decision curve analysis indicated that the nomogram had a higher net benefit than the 8th edition TNM stage across a broader range of threshold probabilities for OS, DMFS, and DFS in both the primary (Figures 5A, C, E) and validation (Figures 5B, D, F) cohorts.
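The net benefit plotted in a decision curve can be sketched in Python as follows. This is a simplified illustration, not the authors' analysis: it assumes a binary outcome at a fixed time point with no censoring, uses the standard definition net benefit = TP/n - (FP/n) × pt/(1 - pt) at threshold probability pt, and the function name `net_benefit` and the toy data are hypothetical.

```python
def net_benefit(pred_probs, outcomes, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold.

    Assumption: `outcomes` are binary (1 = event), with no censoring.
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(outcomes)
    pairs = list(zip(pred_probs, outcomes))
    tp = sum(1 for p, y in pairs if p >= threshold and y)
    fp = sum(1 for p, y in pairs if p >= threshold and not y)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1 - threshold)


# Made-up predicted risks and observed outcomes, for illustration only.
probs = [0.1, 0.2, 0.6, 0.7, 0.8]
observed = [0, 0, 1, 0, 1]
nb_model = net_benefit(probs, observed, 0.5)
# 'Treat all' reference strategy: every patient is classified as high risk.
nb_treat_all = net_benefit([1.0] * len(observed), observed, 0.5)
print(nb_model, nb_treat_all)
```

Evaluating such a function over a grid of thresholds for each model, alongside the "treat all" and "treat none" (net benefit 0) strategies, yields curves of the kind compared in a decision curve analysis.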

Nomograms for Risk Stratification
Because the nomogram predicted OS, DMFS, and DFS better than the 8th edition TNM staging system, stratification was performed according to the nomogram, dividing patients into low-, intermediate-, and high-risk groups (Figure 6).

DISCUSSION
The 8th edition of the UICC/AJCC TNM staging system is the most commonly used prognostic tool; it classifies NPC patients according to T (primary tumor), N (regional lymph node involvement), and M (distant metastasis). Nevertheless, survival differs significantly among NPC patients with the same TNM stage (21,22), which may be partly because the TNM system cannot fully reflect patients' prognosis. A more reliable prognostic tool is therefore needed to identify which patients may benefit from more intensive therapy and which can be spared overtreatment.
In this study, we established and validated a nomogram based on serum biomarkers and clinical characteristics for predicting OS, DMFS, and DFS in NPC patients. The nomogram outperformed the 8th edition TNM system in predicting 3- and 5-year OS, DMFS, and DFS, which could help clinicians identify high-risk NPC patients and select suitable therapies.
Several serum markers are potential predictors of prognosis in patients with NPC. For instance, previous studies have shown that an increased EBV-DNA level is related to local recurrence and distant metastasis. It is closely associated with tumor extent and serves as a tumor biomarker for predicting the survival of NPC patients (23-25). HGB is an important marker of nutritional status, and its level reflects hypoxia in tumor tissue. Some studies have indicated that decreased HGB is significantly related to poorer prognosis in patients with NPC (26,27). ALB is also an important indicator of nutritional status and has been used for prognostic assessment of patients with NPC (28). CRP, an acute-phase protein, rises rapidly in response to inflammation or infection (29). A high serum CRP level in NPC patients is associated with poor prognosis (30). LDH is also a prognostic marker in NPC patients; a high level is associated with worse 5-year OS, DMFS, and DFS (31).
Based on these studies, we evaluated the levels of pre-EBV-DNA, pre-HGB, pre-ALB, pre-CRP, and pre-LDH together with gender, age, T stage, N stage, smoking history, drinking history, family history of tumor, and radiotherapy with/without IC or CC. Through univariate and multivariate Cox analyses, we identified the significant prognostic factors for OS, DMFS, and DFS: gender, age, T stage, N stage, pre-EBV-DNA, pre-HGB, pre-CRP, pre-LDH, and radiotherapy with/without IC or CC. The nomogram was built from these predictive factors. It showed a significant improvement in the prediction of OS, DMFS, and DFS in NPC patients compared with the TNM staging system. The model was also tested in the independent validation cohort, which supported its reliability and reproducibility. In addition, using the nomogram, we divided patients into high-, intermediate-, and low-risk groups; the high-risk group had markedly poorer OS, DMFS, and DFS.
This study has four main strengths. First, the relatively large number of patients (9,340) makes the conclusions more convincing. Second, by integrating clinical features, serum markers, and treatment selection, the nomogram can predict the survival of NPC patients more comprehensively than the TNM staging system. Third, all variables included in the nomogram are easily obtained in most medical institutions, so the nomogram is widely applicable. Finally, the nomogram is a visual prediction tool that may help doctors rapidly estimate a patient's expected survival through simple calculation in clinical practice. Classifying patients by disease severity helps in choosing appropriate therapies.
Our study also has some limitations. First, it was a retrospective study, which inevitably carries selection bias. Nonetheless, retrospective studies of this kind are worth performing because they inform the design of prospective studies. Second, all cohorts comprised patients from a single hospital, which may limit the applicability of our findings to patients from other geographical regions and institutions. However, the large primary cohort (5,903 patients) and the independent validation cohort strengthen the credibility of the results.
In summary, we established and validated a nomogram to predict OS, DMFS, and DFS in NPC patients, which included gender, age, T stage, N stage, pre-EBV-DNA, pre-HGB, pre-CRP, pre-LDH, and radiotherapy with/without IC or CC. The nomogram showed good discriminative ability and satisfactory calibration and classified patients with NPC into low-, intermediate-, and high-risk groups, which can help doctors identify high-risk NPC patients and select suitable treatments.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by The Hospital Ethics Committee at Sun Yat-sen University Cancer Center. Written informed consent from the participants' legal guardian/next of kin was not required to participate in this study in accordance with the national legislation and the institutional requirements.