Integration of MRI-Based Radiomics Features, Clinicopathological Characteristics, and Blood Parameters: A Nomogram Model for Predicting Clinical Outcome in Nasopharyngeal Carcinoma

Purpose: This study aimed to develop a nomogram model based on multiparametric magnetic resonance imaging (MRI) radiomics features, clinicopathological characteristics, and blood parameters to predict the progression-free survival (PFS) of patients with nasopharyngeal carcinoma (NPC). Methods: A total of 462 patients with pathologically confirmed nonkeratinizing NPC treated at Sichuan Cancer Hospital from 2015 to 2019 were recruited and divided into training and validation cohorts at a ratio of 7:3. The least absolute shrinkage and selection operator (LASSO) algorithm was used for radiomics feature dimension reduction and screening in the training cohort. Rad-score, age, sex, smoking and drinking habits, Ki-67, monocytes, monocyte ratio, and mean corpuscular volume were incorporated into a multivariate Cox proportional hazards regression model to build a multifactorial nomogram. The concordance index (C-index) and decision curve analysis (DCA) were applied to estimate its efficacy. Results: Nine significant features associated with PFS were selected by LASSO and used to calculate the rad-score of each patient. The rad-score was verified as an independent prognostic factor for PFS in NPC. The survival analysis showed that patients with lower rad-scores had longer PFS in both cohorts (p < 0.05). Compared with the tumor-node-metastasis staging system, the multifactorial nomogram had higher C-indexes (training cohort: 0.819 vs. 0.610; validation cohort: 0.820 vs. 0.602). Moreover, the DCA curve showed that this model predicted progression better at threshold probabilities up to 50%. Conclusion: A nomogram that combined MRI-based radiomics with clinicopathological characteristics and blood parameters improved the ability to predict progression in patients with NPC.


INTRODUCTION
Nasopharyngeal carcinoma (NPC) is a malignant tumor arising in the mucous membrane of the nasopharynx. The incidence and mortality of NPC vary by region and are particularly high in Southeast Asia (1)(2)(3). Although intensity-modulated radiotherapy (IMRT) has significantly improved the prognosis of NPC, some patients still experience progression (4,5). At present, the risk assessment of NPC is mainly based on the tumor-node-metastasis (TNM) staging system, which has only 61% accuracy for predicting the local recurrence of NPC (6). While it incorporates local tumor invasion, positive lymph nodes, and distant metastases, TNM staging cannot account for the temporal and spatial heterogeneity of tumors or for changes in the internal and external environments of tumor cells. Plasma Epstein-Barr virus (EBV) DNA, which may affect the growth and apoptosis of NPC cell lines, has been used as an independent prognostic marker in endemic areas, but the detection rate of EBV is low in nonendemic areas (7,8). Therefore, more representative and comprehensive biomarkers are urgently needed to predict NPC prognosis.
Many studies have reported that clinical biomarkers such as monocytes (MONO), mean corpuscular volume (MCV), and Ki-67 expression are associated with the tumor microenvironment and tumor immune escape (9)(10)(11). There are no regional differences in the expression of these markers. Beyond these biomarkers, the emerging field of radiomics is regarded as a bridge between medical imaging and clinical medicine (12). Radiomics features are used for tumor diagnosis, phenotyping, and prognosis (13)(14)(15). Extracting large numbers of quantitative imaging features may help explain differences in tumor heterogeneity and microenvironment. Some recent studies showed that magnetic resonance imaging (MRI) radiomics features were significantly associated with NPC prognosis (16)(17)(18). However, no published study has integrated blood parameters, Ki-67, and MRI radiomics to predict progression-free survival (PFS) in patients with NPC.
We built and validated a nomogram prediction model based on MRI, clinicopathological parameters, and blood parameters to visually demonstrate the PFS of NPC and guide clinical diagnosis and treatment.

MATERIALS AND METHODS

Patients
Data from patients treated in Sichuan Cancer Hospital from January 2015 to December 2019 were reviewed. The inclusion and exclusion criteria are presented in the Supplementary Data. The study workflow is displayed in Figure 1. A total of 462 patients were included and randomly divided into a training cohort (n = 323) and validation cohort (n = 139) at a 7:3 ratio.
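The 7:3 random split described above can be sketched in a few lines of Python. This is only an illustration: the seed value, the function name `split_cohort`, and the use of integer patient identifiers are assumptions, and the source does not state how the randomization was actually implemented.

```python
import random

def split_cohort(patient_ids, train_fraction=0.7, seed=2022):
    """Randomly split patient IDs into training and validation cohorts."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed (an assumption) so the split is reproducible
    rng.shuffle(ids)
    n_train = round(train_fraction * len(ids))
    return ids[:n_train], ids[n_train:]

# 462 hypothetical patient identifiers
train_ids, valid_ids = split_cohort(range(462))
print(len(train_ids), len(valid_ids))  # 323 139
```

Rounding 0.7 x 462 = 323.4 to the nearest integer reproduces the reported cohort sizes of 323 and 139.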

Follow-Up
After completing all treatments, patients were followed up every 3 months in the first 2 years, every 6 months in years 3-5, and annually thereafter. The review items included blood parameters, nasopharyngeal MRI, chest computed tomography, abdominal ultrasonography, or isotope bone scanning, and each review item was chosen according to the specific situation of the patient. PFS was set as the primary endpoint.

MRI Acquisition and Image Preprocessing
The pretreatment MRI parameters are listed in the Supplementary Data. To reduce inhomogeneity arising from different MRI devices, two image preprocessing steps were applied. First, we used the N4ITK algorithm to remove bias field artifacts (20). Second, the intensity range was rescaled to 0-255. In addition to the original images, a Laplacian of Gaussian filter with sigma values of 4 and 5 mm was used to reconstruct the images, and features were extracted at these multiple scales (21,22). Preprocessing was performed with SimpleITK 2.0.2, an open-source toolkit, in Python 3.8.5 (www.python.org).

Abbreviations: MRI, multiparametric magnetic resonance imaging; PFS, progression-free survival; NPC, nasopharyngeal carcinoma; LASSO, least absolute shrinkage and selection operator algorithm; MONO, monocytes; MONO%, monocyte ratio; MCV, mean corpuscular volume; C-index, concordance index; DCA, decision curve analysis; IMRT, intensity-modulated radiotherapy; TNM, tumor-node-metastasis staging system; EBV, Epstein-Barr virus; CCRT, concurrent chemoradiotherapy; IC, induction chemotherapy; AC, adjuvant chemotherapy; GLCM, gray-level co-occurrence matrix; GLRLM, gray-level run length matrix; GLSZM, gray-level size zone matrix; GLDM, gray-level dependence matrix; NGTDM, neighborhood gray tone difference matrix; ROI, region of interest; ROC, receiver operating characteristic curve.
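The second preprocessing step, rescaling intensities to the 0-255 range, amounts to a linear min-max mapping. The sketch below illustrates it on a flat list of hypothetical voxel values in plain Python; the study itself worked on 3D images with SimpleITK, and the function name `rescale_intensity` and the sample values are assumptions made for illustration.

```python
def rescale_intensity(voxels, out_min=0.0, out_max=255.0):
    """Linearly map voxel intensities onto [out_min, out_max]."""
    lo, hi = min(voxels), max(voxels)
    if hi == lo:
        # Constant image: there is no contrast to stretch, so return the lower bound.
        return [out_min for _ in voxels]
    scale = (out_max - out_min) / (hi - lo)
    return [out_min + (v - lo) * scale for v in voxels]

# Hypothetical raw intensities from one scanner
rescaled = rescale_intensity([120.0, 480.0, 300.0, 840.0])
print([round(v, 2) for v in rescaled])
```

The smallest input maps to 0 and the largest to 255, so images from scanners with different raw intensity ranges end up on a shared scale before feature extraction.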

Image Segmentation
We used 3D Slicer 4.11 (open-source, multiplatform software; www.slicer.org) for manual segmentation (23). A radiologist with 20 years of experience delineated the region of interest (ROI), defined as the margin of the nasopharyngeal tumor on each slice of the axial contrast-enhanced T1-weighted (CET1-w) and T2-weighted (T2-w) images.

Postprocessing of Radiomics Features and Building of Radiomics Signature
To ensure the comparability of different features, Z-score normalization was performed to bring features measured on different scales onto a common scale. Feature selection was conducted in the training cohort (n = 323). We used the least absolute shrinkage and selection operator (LASSO) algorithm for feature dimension reduction and screening. LASSO shrinks some model coefficients and sets others to exactly zero; because the selection may still overfit, we applied 10-fold cross-validation. Nine features were retained. The rad-score of each patient was calculated as a linear combination of these features weighted by their LASSO coefficients. The rad-score was then used to build the radiomics signature.
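A minimal sketch of the two computations described above, Z-score normalization followed by a coefficient-weighted linear combination, is given below. The feature names, values, and weights are hypothetical placeholders rather than the nine selected features, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev

def fit_zscore(train_values):
    """Estimate the mean and standard deviation on the training cohort only."""
    return mean(train_values), pstdev(train_values)

def rad_score(features, scalers, coefficients, intercept=0.0):
    """Weighted sum of Z-scored features, using LASSO-style coefficients."""
    score = intercept
    for name, weight in coefficients.items():
        mu, sigma = scalers[name]
        score += weight * (features[name] - mu) / sigma
    return score

# Hypothetical training values for two features (not the study's features)
train = {"feat_a": [1.0, 2.0, 3.0], "feat_b": [10.0, 20.0, 30.0]}
scalers = {name: fit_zscore(values) for name, values in train.items()}

# Hypothetical nonzero coefficients; features shrunk to zero are simply absent
coefficients = {"feat_a": 0.5, "feat_b": -0.25}
patient = {"feat_a": 3.0, "feat_b": 10.0}
score = rad_score(patient, scalers, coefficients)
print(round(score, 4))
```

Fitting the scaling parameters on the training cohort and reusing them for validation patients keeps information from the validation cohort out of the model.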

Radiomics Survival Model Development and Validation
To find the rad-score cutoff with the best sensitivity and specificity, we generated a receiver operating characteristic (ROC) curve using data from the training cohort. To explore the potential association between radiomics features and PFS, we separated patients in both cohorts into high- and low-risk groups based on the rad-score cutoff (patients below the cutoff were considered low risk). Kaplan-Meier survival analysis was used to identify PFS differences in both cohorts.
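One common way to pick the cutoff with the best combined sensitivity and specificity is to maximize the Youden index (sensitivity + specificity - 1). The source does not name the criterion it used, so treating it as the Youden index is an assumption, as are the function name and the toy data in this sketch.

```python
def youden_cutoff(scores, events):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A patient is called high risk when score >= threshold.
    events: 1 if the patient progressed, 0 otherwise (both classes must be present).
    """
    n_pos = sum(events)
    n_neg = len(events) - n_pos
    best_cut, best_j = None, -1.0
    for cut in sorted(set(scores)):
        tp = sum(1 for s, e in zip(scores, events) if s >= cut and e == 1)
        tn = sum(1 for s, e in zip(scores, events) if s < cut and e == 0)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j

# Hypothetical rad-scores and progression indicators
scores = [-0.30, -0.10, -0.05, 0.02, 0.10, 0.25]
events = [0, 0, 0, 1, 0, 1]
cutoff, j = youden_cutoff(scores, events)

# Patients below the cutoff are low risk, as in the text
groups = ["low" if s < cutoff else "high" for s in scores]
print(cutoff, j, groups)
```

With these toy values the selected cutoff is 0.02, which captures both progressions (sensitivity 1.0) while misclassifying one of four non-progressors (specificity 0.75).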

Evaluation and Comparison of the Multifactorial Prognostic Nomogram Model
Four models were set up to compare prognostic efficacy (model 1: clinical stage; model 2: radiomics; model 3: clinical stage + rad-score; model 4: clinical data + rad-score). The concordance index (C-index) was used to evaluate the univariate and multivariate Cox models. A nomogram was built to visualize the best prediction model in the training cohort using the R software (version 4.1.0). We evaluated the calibration of the nomogram by plotting 3- and 5-year calibration curves. Decision curve analysis (DCA) was performed to compare the net benefit of the TNM staging system and this nomogram for predicting prognosis.
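To make the C-index concrete, the following sketch computes Harrell's concordance index for right-censored data in plain Python. It is a simplified illustration, not the implementation in the "survival" or "rms" R packages: pairs with tied event times are skipped, and the example data are hypothetical.

```python
def harrell_c_index(times, events, risks):
    """Fraction of comparable patient pairs ranked correctly by predicted risk.

    A pair (i, j) is comparable when patient i had an observed event
    before patient j's follow-up time. Tied risks count as half concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable

# Hypothetical follow-up (months), progression indicators, and model risk scores
times = [5, 10, 15, 20]
events = [1, 0, 1, 0]
risks = [0.9, 0.4, 0.1, 0.6]
c_index = harrell_c_index(times, events, risks)
print(c_index)  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the four models are compared.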

Statistical Analysis
Statistical analyses were performed with the R software (version 4.1.0; www.r-project.org), SPSS (SPSS version 20.0, IBM Corp, Armonk, NY, USA), and Python 3.8.5. Clinical data were compared between the training and validation cohorts with independent-samples t-tests, Mann-Whitney U tests, or chi-square tests. Missing data were imputed using the "miceforest" Python package. Several R packages were employed. LASSO in the "glmnet" package was used to select radiomics features. Kaplan-Meier survival analysis, Cox proportional hazards regression, and C-index calculation were performed with the "survival" and "rms" packages. DCA was performed with the "ggDCA" package. The "pROC" and "ggplot2" packages were used to generate the ROC curve and rad-score histogram, respectively. For all statistical tests, differences were considered significant at p < 0.05.

RESULTS

Clinical Parameters
This retrospective study included 462 patients with pathologically confirmed nonkeratinizing NPC who were treated at Sichuan Cancer Hospital between January 2015 and December 2019. The clinical parameters of all patients in the training and validation cohorts are listed in Table 1. The median age was 49 years (range: 12-82 years), with 329 men and 133 women. The numbers of patients with each clinical stage were 0, 23, 193, 226, and 20 for stages I, II, III, IVA, and IVB, respectively. The Ki-67 cutoff value from the ROC curve was 37.5% (range: 3%-90%). The cutoff value for classifying EBV infection status was 400 copies/ml (negative: <400 copies/ml; positive: ≥400 copies/ml). A total of 330 patients who met the inclusion criteria underwent plasma EBV DNA tests before treatment, and 112 were positive. Among them, there were 2 cases of stage II, 24 cases of stage III, and 86 cases of stage IV. Missing EBV DNA data were imputed using multiple imputation by chained equations (MICE) with random forests. The Supplementary Data detail the results after imputing EBV DNA. The median PFS was 33.15 months (0.6-76.2 months) for all patients; 45 patients progressed, including 23 deaths, 14 distant metastases, and 8 recurrences.

Blood Parameters
All blood parameters in the training and validation cohorts are shown in Table 2. The cutoff values identified with ROC curves are shown in the Supplementary Data, as are the areas under the curve (AUCs) for the blood parameters. The highest AUC values were found for MONO, MONO%, and MCV, which were 0.637, 0.626, and 0.568, respectively. These parameters were incorporated into model 4.

Radiomics Signature Development
In total, 2,074 features were obtained from each ROI. The final nine key features were selected by LASSO.

Model Predictions and Comparison
The C-indexes of the four models are listed in Table 3. The C-index of model 2 was significantly higher than that of model 1 in both cohorts, which suggested that the predictive performance of radiomics may surpass that of the TNM staging system. Moreover, when comparing models 1 and 3, we found that adding the rad-score to the clinical stage (model 3) markedly improved its prognostic performance. Model 4, which integrated clinical data and radiomics, showed the best agreement between predicted and observed outcomes (C-index in the training cohort: 0.823 (95% CI: 0.745-0.901); validation cohort: 0.812 (95% CI: 0.693-0.930)). The nomogram of model 4 is shown in Figure 3A. Notably, the 3- and 5-year calibration curves were very close to the diagonal line (Figures 3B, C). The DCA results for models 4 and 1 are presented in Figure 3D, confirming the effectiveness of model 4.
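For readers unfamiliar with DCA, the quantity plotted on the vertical axis of a decision curve is the net benefit at a threshold probability. In its standard binary-outcome form it can be written as below; the symbols are introduced here for illustration, and the survival-adapted calculation performed by the "ggDCA" package (which accounts for censoring) is not reproduced.

```latex
% N: number of patients; p_t: threshold probability
% TP(p_t), FP(p_t): true and false positives when patients with
% predicted risk >= p_t are classified as high risk
\mathrm{NB}(p_t) = \frac{\mathrm{TP}(p_t)}{N} - \frac{\mathrm{FP}(p_t)}{N} \cdot \frac{p_t}{1 - p_t}
```

A model whose curve lies above that of a competing model over a range of threshold probabilities offers a higher net benefit for decisions made at those thresholds.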

Kaplan-Meier Survival Analysis
Kaplan-Meier survival curves were drawn based on rad-scores. The cutoff value from the ROC curve was −0.021. A rad-score below this cutoff was considered low risk. In both cohorts, the low-risk group had significantly longer PFS (p < 0.05) (Figure 4).
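The Kaplan-Meier curves compared here rest on the product-limit estimator, which multiplies the conditional survival fractions at each observed event time. A minimal plain-Python sketch follows; the follow-up data are hypothetical, and the study itself used the "survival" R package.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: [(time, survival probability)] at each event time."""
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Events and total departures (events + censorings) at time t
        d = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        removed = sum(1 for ti in times if ti == t)
        if d > 0:
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

# Hypothetical follow-up in months; 1 = progression, 0 = censored
times = [6, 12, 12, 20, 30]
events = [1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
print([(t, round(s, 3)) for t, s in curve])
```

Censored patients leave the risk set without lowering the curve, which is why the estimate steps down only at the times when a progression was observed.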

DISCUSSION
We designed this study to build and validate a model that combines MRI-based radiomics with other clinical information as an effective way to estimate PFS in patients with NPC. Our findings suggested that the multidimensional nomogram combining clinicopathological characteristics, blood parameters, and rad-score outperformed the TNM staging system in predicting PFS. Moreover, using the cutoff value of the rad-score, patients could be stratified into high- and low-risk groups, and the latter had longer PFS.
In recent years, a growing number of studies have reported that MRI radiomics features can better reflect prognostic information for NPC because they may explain the inherent temporal or spatial heterogeneity of tumors on imaging (24)(25)(26) (28). In our study, the model that integrated clinical data with radiomics features also performed best; the C-index values of model 4 were 0.823 (95% CI: 0.745-0.901) in the training cohort and 0.812 (95% CI: 0.693-0.930) in the validation cohort. The C-index of our validation cohort was lower than that reported in the study by Shen. A possible explanation may be that we included metastatic NPC patients and had a larger sample size, which may have improved the generalizability of the prediction model. Compared with other research, the parameters included in our model are more universal, without regional differences, so the model has a higher degree of applicability. Based on our nomogram, the probabilities of 3- and 5-year PFS of a given patient can be visually and easily estimated by using the corresponding parameters measured before treatment. If patients with short PFS are identified as early as possible, clinicians can enhance treatment without increasing side effects (e.g., by combining targeted treatment or immunotherapy), pay attention to adverse prognostic factors, and ensure an adequate follow-up period to reduce the risk of disease progression. Conversely, for patients with a high probability of 3- and 5-year PFS predicted by the model, it may be possible to reduce the drug dose and mitigate side effects.
However, several key aspects need to be considered when developing a clinical radiomics predictive model. First, since plasma EBV DNA is used as an independent prognostic marker in endemic areas, many studies have incorporated it in nomogram construction. Because EBV DNA results were missing for some of our patients, we imputed them with the MICE algorithm, which is fast and memory-efficient and helps make the results reliable (29). One study reported that EBV DNA can induce monocytes to produce interleukin-10, which leads to immune escape (30). Based on this, we collected easily obtainable blood parameters from patients with NPC, expecting to find stable markers and incorporate them into the radiomics nomogram. After drawing the ROC curves for blood parameters, we found that monocytes had the best sensitivity and specificity. Two retrospective studies validated age, gender, Ki-67, and smoking and drinking habits as independent prognostic factors for NPC (11,31). Our results showed that the model integrating clinical data and the rad-score was more useful than the model using radiomics features alone.
Although we successfully demonstrated the utility of radiomics data for predicting PFS in patients with NPC, this study has three major limitations. First, this was a single-center retrospective study, so the results may not be readily applicable to other settings, and prospective multicenter studies are needed to confirm our findings. Second, we selected patients according to strict inclusion criteria, which may have introduced selection bias. Third, our study only focused on PFS at 3 and 5 years. In the future, we will investigate the long-term overall survival of patients with NPC and pay more attention to predicting long-term quality of life using imaging radiomics.
In conclusion, we established an effective clinical-radiomics nomogram based on MRI findings and several clinical, pathological, and blood factors. This approach is noninvasive, visualizable, and individualized and has great potential for predicting NPC prognosis and guiding treatment. Moreover, we further confirmed that radiomics features were independent prognostic factors for NPC.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the ethics committee of Sichuan Cancer Hospital. Written informed consent from the participants' legal guardian/next of kin was not required to participate in this study in accordance with the national legislation and the institutional requirements.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fonc.2022.815952/full#supplementary-material

FIGURE 4 | The Kaplan-Meier survival curves of high-risk and low-risk groups in the training cohort and validation cohort. In both cohorts, the low-risk group had longer PFS (P < 0.05).