Predictive Value of a Combined Model Based on Pre-Treatment and Mid-Treatment MRI-Radiomics for Disease Progression or Death in Locally Advanced Nasopharyngeal Carcinoma

Purpose A combined model based on pre- and mid-treatment MRI-radiomics was established to assess the risk of disease progression or death in locally advanced nasopharyngeal carcinoma (LA-NPC). Materials and Methods A total of 243 patients were analyzed. We extracted 10,400 radiomics features from the primary nasopharyngeal tumors and the largest metastatic lymph nodes on axial contrast-enhanced T1-weighted and T2-weighted images from pre- and mid-treatment MRI, respectively. We used the SMOTE algorithm, centering and scaling with Box-Cox transformation, the Pearson correlation coefficient, and LASSO regression to construct separate pre- and mid-treatment MRI-radiomics prediction models, and calculated the corresponding risk scores, named the P score and M score. Finally, univariate and multivariate analyses of the P score, M score, and clinical data were used to build the combined model and to group the patients into two risk levels, high and low. Result The combined model of pre- and mid-treatment MRI-radiomics successfully categorized patients into high- and low-risk groups. The log-rank test showed that the high-/low-risk grouping had good prognostic performance for progression-free survival (PFS) (P<0.0001, HR: 19.71, 95% CI: 12.77–30.41), better than that of TNM stage (P=0.004, HR: 1.913, 95% CI: 1.250–2.926), and also had an excellent predictive effect for LRFS, DMFS, and OS. Conclusion Risk grouping of LA-NPC using a combined model of pre- and mid-treatment MRI-radiomics can better predict disease progression or death.


INTRODUCTION
Nasopharyngeal carcinoma (NPC) is an epithelial carcinoma originating from the inner layer of the nasopharyngeal mucosa. In 2018, there were 129,000 new cases of NPC worldwide (1). The TNM staging system is widely used for risk stratification and therapeutic decision-making in NPC, and about 70% of patients are diagnosed at a locally advanced stage (2). Concurrent chemoradiotherapy with or without induction chemotherapy is the standard treatment for locally advanced nasopharyngeal carcinoma (LA-NPC). However, clinical outcomes still differ significantly among LA-NPC patients with the same TNM stage and similar treatment; metastasis and recurrence, especially the former, are major causes of treatment failure (3). The 5-year progression-free survival (PFS) rates for stage III and IVa NPC were 68.7-87% and 50.4-68%, and the 5-year overall survival (OS) rates were 75.5-91.4% and 58.3-75%, respectively (4)(5)(6). Therefore, individualized methods to predict treatment outcome in LA-NPC are needed.
Radiomics is a method that automatically extracts high-dimensional quantitative features from medical images. These features are extracted from the whole tumor in different ways and can provide comprehensive information about tumor phenotype, tumor microenvironment, and response to treatment, thereby characterizing tumor heterogeneity (7,8). Magnetic resonance imaging (MRI) is the preferred imaging modality for the diagnosis and local staging of NPC (9). Previous studies have shown that MRI-radiomics is an independent risk factor for distant metastasis, local recurrence, and PFS in NPC (10)(11)(12). Most of these studies focused on primary tumors of the nasopharynx. A recent study showed that primary tumors and metastatic lymph nodes have different biological characteristics (13). Therefore, it is necessary to consider adding metastatic lymph node information to radiomics based on primary nasopharyngeal tumors.
Because of individual differences, NPCs respond differently to chemoradiotherapy, leading to differences in tumor cell populations (i.e., differences in tumor heterogeneity). Currently, no published report describes an MRI-radiomics model constructed during chemoradiotherapy to predict outcome in LA-NPC. This study aims to screen features associated with the PFS label in pre- and mid-treatment MRI-radiomics, respectively, and to construct a model to predict disease progression or death in LA-NPC (stage III-IVa).

MATERIAL AND METHOD
Patient
This retrospective study was approved by the institutional review board of our institution. Informed consent was waived because of the retrospective nature of the study. The study included patients with newly diagnosed LA-NPC (stage III-IVa) treated at Sichuan Cancer Hospital from January 2015 to December 2016. The inclusion criteria were as follows: (1) histologically confirmed LA-NPC (restaged according to the AJCC 8th edition) with at least one metastatic lymph node; (2) pre- and mid-treatment (after 20 fractions of radiotherapy) MRI examination of the nasopharynx and neck, with sequences including axial contrast-enhanced T1-weighted imaging (CET1WI) and axial T2-weighted imaging (T2WI); (3) completion of radical chemoradiotherapy; and (4) available clinical data. Lymph nodes were considered because previous studies in head and neck cancer have shown that radiomics features obtained by adding the lymph-node region of interest (ROI) provide better predictive power than those from primary tumors alone (14,15). According to the definition of Ho et al. (16), the diagnostic criteria for N+ include central necrosis, extracapsular spread, a shortest diameter of cervical lymph nodes >10 mm, and a shortest diameter of retropharyngeal lymph nodes >5 mm. The exclusion criteria were (1) motion artifacts, blurring, or discontinuity in MRI images; (2) history of anticancer therapy before the baseline MRI scan, such as radiotherapy, chemotherapy, immunotherapy, or surgery; (3) distant metastasis; (4) recurrent disease or concurrent other malignant tumors; and (5) incomplete radiotherapy planning records. Finally, a total of 243 patients were included in further analysis.

Treatment
The treatment regimen was concurrent chemoradiotherapy ± induction chemotherapy. Chemotherapy was platinum-based, with a single or dual agent (cisplatin ± paclitaxel), beginning on the first day of radiotherapy. Gross tumor volume (GTV) included both the primary nasopharyngeal tumor (GTVnx) and metastatic lymph nodes (GTVln), as demonstrated by clinical, endoscopic, and imaging data. All ROI segmentations were first performed manually by a radiation oncologist with 3 years of experience in NPC radiotherapy and then validated by a senior radiation oncologist with 10 years of experience. The GTV was planned to receive a total dose of 66-76 Gy with conventional fractionation (2.1-2.25 Gy per fraction, five fractions per week). Some patients also received anti-EGFR monoclonal antibodies during radiotherapy. Nasopharynx and neck MRI was repeated after 20 fractions of radiotherapy.

Follow-Up and Survival Endpoint
Local recurrence was suspected when MRI showed soft-tissue swelling or a space-occupying lesion and was then confirmed by histopathology. Distant metastasis was diagnosed by combining clinical symptoms, physical examination, imaging data, and histopathology. The primary endpoint was PFS; loco-recurrence-free survival (LRFS), distant metastasis-free survival (DMFS), and OS were secondary endpoints. PFS was defined as the time from the first MRI scan to tumor progression (of any kind) or death (from any cause). LRFS was defined as the time from the first MRI scan to the first local recurrence. DMFS was defined as the time from the first MRI scan to the first distant metastasis. OS was defined as the time from the first MRI scan to death from any cause.
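The PFS definition can be made concrete with a small calculation. The following Python sketch is illustrative only and is not the study's code: the function name, the month-length constant, the example dates, and the rule of censoring event-free patients at last follow-up are all assumptions introduced here.

```python
from datetime import date
from typing import Optional, Tuple

# Assumption: average month length used to convert days to months.
DAYS_PER_MONTH = 30.4375


def pfs_months(first_mri: date,
               progression: Optional[date],
               death: Optional[date],
               last_follow_up: date) -> Tuple[float, bool]:
    """Return (time in months, event observed?) for progression-free survival."""
    # The PFS event is whichever of progression or death occurred first.
    event_dates = [d for d in (progression, death) if d is not None]
    if event_dates:
        end, observed = min(event_dates), True
    else:
        # Assumption: patients without an event are censored at last follow-up.
        end, observed = last_follow_up, False
    return (end - first_mri).days / DAYS_PER_MONTH, observed


# Hypothetical patients; the dates are invented for illustration.
progressed = pfs_months(date(2015, 3, 1), date(2016, 9, 1),
                        date(2017, 5, 1), date(2017, 5, 1))
censored = pfs_months(date(2015, 3, 1), None, None, date(2019, 3, 1))
print(progressed, censored)
```

LRFS, DMFS, and OS follow the same pattern with a different event date substituted for the progression-or-death date.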

Image Acquisition and Segmentation
MRI images were exported from the PACS and saved in DICOM format. The saved images were then imported into the MIM planning system for ROI delineation. To ensure accurate delineation, we manually outlined the primary nasopharyngeal tumor and metastatic lymph nodes on the CET1WI and T2WI sequences in pre- and mid-treatment MRI (as shown in Figure 1). The resulting 3D mass volume was the ROI. In this study, the metastatic lymph node with the largest short diameter was selected as the target lesion for GTVln, which is consistent with the study of Bologna (17).

Image Preprocessing
The uAI Research Portal (Version: 430 sp1) was used for image preprocessing. We processed the images with several filters: Box Mean, Additive Gaussian Noise, Binomial Blur, Curvature Flow, Box Sigma, Laplacian of Gaussian (LoG), Wavelet, Normalize, Laplacian Sharpening, Discrete Gaussian, Mean, Speckle Noise, Recursive Gaussian, and Shot Noise/Poisson Noise. Four different LoG-filtered images were obtained through different combinations. After three wavelet decompositions, wavelet images of eight frequency bands were obtained. The Normalize filter adjusted all MRI images to 255 gray levels to standardize the differences in scanning parameters and scanners reflected in the images.

Radiomics Feature Extraction
The uAI Research Portal was also used for feature extraction. Features of different categories were considered: 14 shape features, 18 first-order statistics features, 21 features computed on the gray level co-occurrence matrix (GLCM), 16 features computed on the gray level run-length matrix (GLRLM), 16 features computed on the gray level size zone matrix (GLSZM), 14 features computed on the gray level dependence matrix (GLDM), and 5 features computed on the neighboring gray tone difference matrix (NGTDM), for a total of 104 radiomics features. The original and filtered images formed 25 groups, so a total of 2,600 features were extracted from each ROI. Thus, for each image type (CET1WI or T2WI), 2,600 radiomics features were extracted from both the primary tumor and the largest affected lymph node at pre- and mid-treatment, for a total of 20,800 features, namely, 10,400 each for pre- and mid-treatment. Excel S1 and S2 of the Supplementary Materials list all radiomics features for pre- and mid-treatment.
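The feature counts in this section can be checked with a few lines of arithmetic. The Python sketch below is only a bookkeeping aid, not part of the extraction software; the dictionary keys and variable names are chosen here for illustration.

```python
# Feature counts per category, as listed in the text.
features_per_category = {
    "shape": 14,
    "first_order": 18,
    "GLCM": 21,
    "GLRLM": 16,
    "GLSZM": 16,
    "GLDM": 14,
    "last_texture_category": 5,  # the 5-feature texture category listed last
}

features_per_image = sum(features_per_category.values())  # per image group
image_groups = 25    # original image plus filtered images
sequences = 2        # CET1WI and T2WI
rois = 2             # primary tumor and largest metastatic lymph node
time_points = 2      # pre- and mid-treatment

features_per_roi = features_per_image * image_groups
features_per_time_point = features_per_roi * sequences * rois
features_total = features_per_time_point * time_points

print(features_per_image, features_per_roi,
      features_per_time_point, features_total)
```

Running it gives 104 features per image group, 2,600 per ROI, 10,400 per time point, and 20,800 in total, which matches the figures in the text.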

Radiomics Feature Selection, Model Building, and Validation
To avoid the influence of class imbalance (85 cases of progression/death vs 158 cases of disease-free survival) on model building, we used the SMOTE algorithm to oversample the original pre- and mid-treatment datasets, respectively. After oversampling, each dataset was randomly divided into a training dataset (476/595) and a test dataset (119/595) at a ratio of 4:1. Model building was based on the training dataset. After preprocessing by centering and scaling and Box-Cox transformation, features that showed no difference between categories were removed. The Pearson correlation coefficient was used to remove redundant features. LASSO regression was used for further selection of the remaining features, which is consistent with most previous studies (10-12, 18, 19). Then, the 20 MRI-radiomics features most closely related to the PFS label were selected, and the features were ranked by their importance in the model. Finally, we selected the top five features for each time point to create the pre- and mid-treatment radiomics models. The predictive ability of each model was tested in the training, test, and original datasets using ROC curves and confusion matrices. In this way, we obtained the pre-treatment radiomics risk score, named the P score, and the mid-treatment radiomics risk score, named the M score.
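The oversampling step can be illustrated with a minimal SMOTE-style sketch in plain Python. It shows the core idea only (a synthetic sample is interpolated between a minority-class sample and one of its nearest minority-class neighbors) and is not the implementation used in the study; the function name, the toy feature vectors, and the parameters `k` and `seed` are assumptions made for this example.

```python
import math
import random
from typing import List, Sequence


def smote(minority: List[Sequence[float]], n_synthetic: int,
          k: int = 5, seed: int = 0) -> List[List[float]]:
    """Create synthetic minority samples by interpolation (needs >= 2 samples)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority-class neighbors of the base sample (itself excluded).
        neighbors = sorted((s for s in minority if s is not base),
                           key=lambda s: math.dist(base, s))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Toy two-feature minority class; the values are hypothetical.
minority_samples = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_samples = smote(minority_samples, n_synthetic=6, k=2)

# The 4:1 split of the 595 oversampled cases reported in the text.
n_total = 595
n_train = n_total * 4 // 5
n_test = n_total - n_train
print(len(new_samples), n_train, n_test)
```

Because each synthetic point lies on a segment between two real minority samples, it stays inside the region already occupied by that class.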

Final Model Development and Risk Stratification
The clinical information, P score, and M score were analyzed by univariate Cox analysis, and variables with P < 0.05 (two-sided test) were entered into multivariate Cox analysis. Based on the multivariate results, variables with P < 0.05 (two-sided test) were used to train a multivariate Cox proportional hazards regression model, from which the linear predictor values for PFS were obtained. The higher the predicted value, the greater the risk of progression/death. The median predicted value was used as the threshold for risk stratification. Finally, we compared the Kaplan-Meier survival curves between the risk groups and between TNM stages at different clinical endpoints.
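The median-split stratification can be sketched as follows. This is a hedged illustration rather than the fitted model: the fitted coefficients are not given in the text, so the sketch assumes coefficients equal to the natural logarithms of the hazard ratios reported for the P score and M score in the Results, and the patient scores are invented. Assigning values equal to the median to the low-risk group is also an assumption.

```python
import math
from statistics import median
from typing import List, Tuple

# Assumption: illustrative coefficients taken as ln(HR) of the reported
# multivariate hazard ratios (13.515 for the P score, 17.604 for the M score).
BETA_P = math.log(13.515)
BETA_M = math.log(17.604)


def linear_predictor(p_score: float, m_score: float) -> float:
    """Cox linear predictor: a weighted sum of the two radiomics scores."""
    return BETA_P * p_score + BETA_M * m_score


def stratify(scores: List[Tuple[float, float]]) -> Tuple[float, List[str]]:
    """Split patients into high/low risk at the median linear predictor."""
    predictors = [linear_predictor(p, m) for p, m in scores]
    cutoff = median(predictors)
    # Assumption: values equal to the cutoff are assigned to the low-risk group.
    groups = ["high" if lp > cutoff else "low" for lp in predictors]
    return cutoff, groups


# Hypothetical (P score, M score) pairs for four patients.
patients = [(0.2, 0.1), (0.8, 0.9), (0.4, 0.3), (0.6, 0.7)]
cutoff, groups = stratify(patients)
print(round(cutoff, 3), groups)
```

A median cutoff yields two groups of nearly equal size by construction, which keeps both Kaplan-Meier curves well populated.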

Statistical Analysis
All statistical analyses were conducted using SPSS (version 26.0), GraphPad Prism (version 8), and R software (version 3.5.2). LASSO logistic regression was performed with the "glmnet" package. Kaplan-Meier survival analyses were plotted in GraphPad Prism. P < 0.05 was considered statistically significant.

RESULT
A total of 243 patients were included in the final analysis. The median follow-up period was 52.7 months (range 10.6-72 months). The specific clinical data are shown in Table 1.

Establishment and Validation of Pre-Treatment MRI-Radiomics Prediction Model
In the pre-treatment prediction model, the original dataset contained 243 samples, which were expanded to 595 samples by the SMOTE algorithm. After random grouping at a ratio of 4:1, there were 476 samples in the training dataset and 119 samples in the test dataset. The top five of the 20 radiomics features were selected, including three from primary nasopharyngeal tumors and two from metastatic lymph nodes. Supplementary Figure S1 shows the 20 pre-treatment radiomics features. Model performance is shown in Figure 2.
The confusion-matrix results (Figure 3) for the three datasets (training, original, and test) were as follows. In the training dataset, the accuracy, precision, sensitivity, specificity, and F1 value were 0.725, 0.704, 0.618, 0.805, and 0.658, respectively. In the original dataset, they were 0.728, 0.614, 0.600, 0.797, and 0.607, respectively. In the test dataset, they were 0.790, 0.795, 0.686, 0.868, and 0.737, respectively. Finally, according to the weighted coefficients of the logistic regression analysis, we obtained a formula for calculating the risk value of each LA-NPC patient:

Establishment and Validation of Mid-Treatment MRI-Radiomics Prediction Model
In the mid-treatment prediction model, the original dataset after oversampling and grouping showed 476 samples in the training dataset and 119 samples in the test dataset. Five radiomics features were selected, including three from primary nasopharyngeal tumor and two from metastatic lymph nodes.
Supplementary Figure S2 shows the 20 mid-treatment radiomics features.

Final Model Development and Risk Stratification
In univariate Cox analysis, age, alkaline phosphatase, T stage, TNM stage, P score, and M score were significantly correlated with PFS. Subsequent multivariate Cox analysis showed that the P score (HR: 13.515, 95% CI: 5.185-35.230) and M score (HR: 17.604, 95% CI: 8.113-38.195) were independent risk factors for PFS, as shown in Table 2.
We entered the P score and M score into a multivariate Cox regression model and obtained the linear predictor values for PFS. The median predicted value was used as the threshold to classify patients as high or low risk. In terms of prognostic power for PFS, the high-/low-risk grouping (P<0.0001, HR: 19.17, 95% CI: 12.77-30.41) was significantly more prognostic than TNM stage (P=0.004, HR: 1.913, 95% CI: 1.250-2.926). Similar results were seen in the Kaplan-Meier curves for LRFS, DMFS, and OS of the high-/low-risk groups and TNM stage (Figure 6). For LRFS, the log-rank test showed P < 0.0001 (HR: 44.61).

DISCUSSION
In recent years, radiomics has developed rapidly in medicine and has achieved good results in predicting tumor treatment outcomes. MRI is a standard imaging method in NPC and has unique advantages. First, MRI provides superior anatomical information (such as spatial location) and good soft-tissue contrast. Second, different MRI sequences may be sensitive to critical components of tumor physiology, such as blood flow and cell density, and MRI can also distinguish regions within the tumor that contain different environments that may affect local cell phenotypes and genotypes, such as changes in blood flow. Finally, MRI is non-invasive and repeatable, so the treatment response can be evaluated and integrated into the treatment strategy. We therefore used MRI images to establish the LA-NPC prediction model through radiomics. This study explored the value of pre- and mid-treatment MRI-radiomics features in predicting outcome in LA-NPC. The results showed that the M score and P score were independent prognostic indicators of PFS. We entered them into the multivariate Cox model to calculate the risk score and successfully stratified LA-NPC patients by risk. The log-rank test showed that MRI-radiomics had good predictive ability for PFS, LRFS, DMFS, and OS.
By screening the pre-treatment MRI-radiomics features, we obtained 20 radiomics features related to PFS in LA-NPC. For logistic regression, the ratio between the amount of data and the number of features that the model can accommodate should be considered (20). We therefore selected the top five features to establish the pre-treatment prediction model and calculated the risk score, named the P score (21). In previous studies, MRI-based models of primary nasopharyngeal tumors proved to be significant prognostic biomarkers for PFS in LA-NPC (22,23). Furthermore, Yang et al. indicated that an MRI-based model of metastatic lymph nodes is a significant risk factor for PFS in LA-NPC (24). Thus, MRI-radiomics features from both metastatic lymph nodes and primary nasopharyngeal tumors contribute to PFS prediction in LA-NPC, which is consistent with our research. As far as we know, there is no related research on mid-treatment radiomics features. We similarly calculated the mid-treatment risk score, named the M score. The pre- and mid-treatment MRI-radiomics models were internally validated by 10-fold cross-validation in the training dataset. The average AUC values were 0.7905 (95% CI: 0.7506-0.8304) and 0.9205 (95% CI: 0.8967-0.9442), respectively, which indicates that the models have good repeatability. In addition, the two models had high AUC values in both the original and test datasets (Figures 2, 4), which suggests good generalization ability and portability. The performance of the two models in the confusion matrices for the different datasets (Figures 3, 5) was also good.
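The confusion-matrix metrics referred to here follow directly from four cell counts. The Python sketch below shows the calculation; the function name is ours, and the counts are not taken from the article. They are a reconstruction, assumed here because they are consistent with the 85 events and 158 non-events of the original dataset and with the values reported for the pre-treatment model on that dataset (0.728, 0.614, 0.600, 0.797, and 0.607).

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Accuracy, precision, sensitivity, specificity, and F1 from cell counts."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)  # also called recall
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


# Assumption: counts reconstructed from the reported metrics, not read
# from Figure 3 (85 events = tp + fn, 158 non-events = tn + fp).
metrics = classification_metrics(tp=51, fp=32, fn=34, tn=126)
rounded = {name: round(value, 3) for name, value in metrics.items()}
print(rounded)
```

With these counts, the rounded values are 0.728, 0.614, 0.6, 0.797, and 0.607, matching the figures quoted for the original dataset.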
Comparing the radiomics features included in the two models, the pre-treatment prediction model had two first-order features (average eigenvalue and maximum eigenvalue) and three texture features (GLCM, GLDM, GLSZM); the mid-treatment prediction model had one shape feature (surface area/volume ratio), one first-order feature (average eigenvalue), and three texture features (GLDM, GLSZM). Shape features reflect the volume, sphericity, and surface area/volume ratio of the tumor. Previous studies found that primary tumor volume is closely related to local control, distant metastasis, and OS in NPC (25). Zhang et al. developed and validated an MRI-based model (including surface area/volume ratio) for predicting distant metastasis of NPC; the model performed well in the validation cohort (C-index: 0.74, 95% CI: 0.58-0.85) (11). First-order statistical features are the simplest statistical descriptors, including the gray-level mean, maximum, minimum, variance, and percentiles (24). GLCM can reveal the spatial complexity of tumors and may provide information about central necrosis or tumor metastasis-dependent factors, such as Yes-associated protein (13). Several studies have shown that GLCM is closely related to the recurrence, metastasis, and OS of NPC (10-12, 17, 18, 24, 26). Zhang et al. demonstrated that GLSZM is associated with the risk of distant metastasis of NPC (10). Farhan et al. found significant differences between recurrent and non-recurrent regions in seven features (including GLSZM) in a radiomics analysis of intratumoral spatial heterogeneity in LA-NPC (19). GLDM quantifies the dependence between the gray values of adjacent pixels and the gray value of the central pixel within a certain distance, and its predictive value in NPC was confirmed by Zhang et al. (10).
We also found that three of the features in the pre-treatment prediction model came from CET1WI and two from T2WI, while all the features of the mid-treatment prediction model came from CET1WI. Comparing the accuracy, precision, sensitivity, specificity, F1 value, and AUC value of the two models, we noticed that the mid-treatment prediction model performed better than the pre-treatment model in the training and original datasets. This may be because T2WI mainly reflects the density and boundary of the tumor, whereas CET1WI reflects the heterogeneity and structure within the tumor (such as tumor angiogenesis) (27), which is crucial for judging prognosis. Zhang et al. also found that the contribution of CET1WI to the model is greater than that of T2WI (11), which is consistent with the results of another of their studies (radiomics prediction based on the CET1WI sequence was better than that based on the T2WI sequence or on the combined CET1WI and T2WI sequences) (28). Jiang et al. also proposed that building a model with CET1WI produces better results than with T2WI (29).
The inconsistency of features between the pre- and mid-treatment prediction models is attributed to LASSO regression. When screening radiomics features, LASSO regression shrinks relatively unimportant features, sets the coefficients of insignificant parameters to zero, and ranks the importance of the features. For example, "wavelet_firstorder_wavelet_LHH-Mean_GTVnxT1" ranks thirteenth in the pre-treatment prediction model and sixteenth in the mid-treatment model, showing that the features included in the pre-treatment model are not entirely useless; only their importance has changed. It also indicates that the tumor cell population changed after chemoradiotherapy, leading to changes in heterogeneity within the tumor. We compared the Kaplan-Meier survival curves between the risk groups and between TNM stages at different clinical endpoints. The results showed that the high-/low-risk grouping had an excellent ability to predict PFS (P<0.0001, HR: 19.17, 95% CI: 12.77-30.41), better than that of TNM stage (P=0.004, HR: 1.913, 95% CI: 1.250-2.926). That MRI-radiomics models predict LA-NPC outcome better than TNM stage has been confirmed in several studies, consistent with our study (12,18,26). Interestingly, we tested the high- and low-risk groups at other endpoints and found that they performed well for LRFS, DMFS, and OS. This is similar to some of the results of Marco Bologna (26), who used OS as the label for radiomics feature screening and whose final prediction model also had good predictive ability for LRFS. In our study, the radiomics features were labeled with PFS, which by definition covers patients with recurrence, metastasis, and death, so the features we screened have predictive value for different endpoints.
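The shrinkage behavior attributed to LASSO regression in this paragraph can be illustrated with the soft-thresholding operator, which gives the LASSO solution for a single coefficient under a standardized, orthogonal design and is the update used inside coordinate-descent solvers. The Python sketch below is a simplified illustration, not the glmnet fit used in the study; the feature names, unpenalized coefficients, and penalty value are hypothetical.

```python
from typing import Dict


def soft_threshold(z: float, lam: float) -> float:
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# Hypothetical unpenalized coefficients of three standardized features.
unpenalized: Dict[str, float] = {
    "feature_a": 0.90,
    "feature_b": -0.40,
    "feature_c": 0.05,
}
penalty = 0.10  # hypothetical LASSO penalty (lambda)

shrunken = {name: soft_threshold(beta, penalty)
            for name, beta in unpenalized.items()}

# Rank the surviving features by absolute coefficient size.
ranking = sorted((name for name, beta in shrunken.items() if beta != 0.0),
                 key=lambda name: abs(shrunken[name]), reverse=True)
print(shrunken, ranking)
```

In this toy case the weakest feature is removed entirely while the other two are kept with smaller coefficients, which is why a feature can drop in rank or disappear between two separately fitted models.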
Marius suggested several considerations when conducting radiomics studies (30). First, outside randomized clinical trials, class imbalance is common, especially in retrospective studies using routine clinical data: events of interest and non-events are rarely evenly represented in a cohort. For example, in our study, about 35% of patients had the event of interest (progression/death). When evaluating MRI-radiomics features to predict PFS in NPC, we must take this imbalance between patients with and without the event of interest into account. A classifier that assigns all cases in the sample to the "no event of interest" group would appear to have a 65% correct rate, yet it makes no clinical sense because it cannot actually distinguish by MRI whether the event of interest has occurred in LA-NPC. Therefore, sensitivity, specificity, and AUC value should be reported together with overall accuracy. Our study also used the SMOTE algorithm to reduce the impact of class imbalance (31). Second, overfitting occurs when a model with many input parameters or too many degrees of freedom "memorizes" the data: in addition to features related to disease, the model then also contains features reflecting image noise and random fluctuations. Generally, there are two remedies: reducing the number of features or applying regularization. Here we used Pearson correlation coefficients to check for and avoid collinearity between variables and used LASSO regression for feature selection to avoid overfitting. In addition, the SMOTE algorithm balances the class distribution by synthesizing minority-class samples, which reduces the possibility of overfitting.
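The point about accuracy under class imbalance can be verified with the cohort counts given in the Methods (85 events and 158 non-events). The Python sketch below evaluates a trivial classifier that predicts "no event" for every patient; the variable names are ours and the classifier is purely illustrative.

```python
# Cohort counts from the Methods section.
n_events = 85       # progression/death
n_non_events = 158  # disease-free survival
n_total = n_events + n_non_events

# A trivial classifier that labels every patient as "no event of interest".
true_positives, false_negatives = 0, n_events
true_negatives, false_positives = n_non_events, 0

event_rate = n_events / n_total
accuracy = (true_positives + true_negatives) / n_total
sensitivity = true_positives / (true_positives + false_negatives)
specificity = true_negatives / (true_negatives + false_positives)

print(round(event_rate, 2), round(accuracy, 2), sensitivity, specificity)
```

The classifier reaches about 65% accuracy and perfect specificity while detecting none of the events, which is why accuracy alone is not informative here.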
This study has two main advantages. First, our research is the only one that demonstrates the predictive effect of mid-treatment radiomics features on PFS in LA-NPC. We found that the use of mid-treatment radiomics information can more comprehensively evaluate the response of LA-NPC to treatment and better evaluate prognosis. We also indirectly confirmed that tumor heterogeneity changes during chemoradiotherapy. The Cox model combined the pre- and mid-treatment radiomics features for risk stratification and showed an excellent predictive effect across different clinical endpoints. Second, it has been shown that the existence of populations with different genomes is one reason for the clinical heterogeneity of radiotherapy efficacy (32), and radiomics is assumed to represent the histological heterogeneity of solid tumors (33). Although more than 90% of LA-NPC patients had positive lymph nodes, previous studies ignored metastatic lymph nodes (22,23). We collected the radiomics features of both primary nasopharyngeal tumors and metastatic lymph nodes to better describe tumor biological characteristics.
This study also has some limitations. First, it is a retrospective study conducted at a single institution in a non-endemic area of NPC and lacks external validation. Large-sample, multicenter, prospective validation in NPC endemic and non-endemic regions is necessary to obtain strong evidence for clinical application. Second, heterogeneity in treatment plans may also affect the predictive performance of the model. Finally, MRI-radiomics models and statistical analysis algorithms are unfamiliar and complex for clinicians. To address this, a website or application could be set up to which doctors upload images and clinical variables to obtain results.

CONCLUSION
The MRI-radiomics model (pre- and mid-treatment) is a powerful tool for predicting disease progression/death in LA-NPC. By combining pre- and mid-treatment radiomics characteristics, we calculated the risk score of disease progression/death in LA-NPC and stratified patients into high- and low-risk groups; this stratification predicts not only PFS but also LRFS, DMFS, and OS in LA-NPC.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding authors.

AUTHOR CONTRIBUTIONS
PZ, GY, LK, and YN designed this study. LK, YN, RH, SL, QT, AC, and YF conducted the study, analyzed the results, developed the model, and drafted the manuscript under the supervision of JL, PZ, and GY. LK took part in target delineation, data extraction, and development of the model; YN took part in the general research design, data extraction, and development of the model. LK and YN carried out the main part of the study, contributed equally to this work, and share first authorship. PZ and GY contributed equally to this work and share corresponding authorship. The remaining authors are ranked by their contribution to the research. All authors contributed to the article and approved the submitted version.