MRI-based multiregional radiomics for predicting lymph node status and prognosis in patients with resectable rectal cancer

Purpose To establish and evaluate a multiregional T2-weighted imaging (T2WI)-based clinical-radiomics model for predicting lymph node metastasis (LNM) and prognosis in patients with resectable rectal cancer.
Methods A total of 346 patients with pathologically confirmed rectal cancer from two hospitals between January 2019 and December 2021 were prospectively enrolled. Intra- and peritumoral features were extracted separately, and least absolute shrinkage and selection operator regression was applied for feature selection. Radiomics signatures were built using the selected features from different regions. The clinical-radiomic nomogram was developed by combining the intratumoral and peritumoral radiomics signature score (radscore) with the most predictive clinical parameters. The diagnostic performances of the nomogram and clinical model were evaluated using the area under the receiver operating characteristic curve (AUC). The prognostic model for 3-year recurrence-free survival (RFS) was constructed using univariate and multivariate Cox analysis.
Results The intratumoral radscore (radscore 1) included four features, the peritumoral radscore (radscore 2) included five features, and the combined intratumoral and peritumoral radscore (radscore 3) included ten features. The AUC of radscore 3 was higher than that of radscore 1 in the training cohort (0.77 vs. 0.71, P=0.182) and the internal validation cohort (0.76 vs. 0.64, P=0.041). The AUC of radscore 3 was also higher than that of radscore 2 in the training cohort (0.77 vs. 0.74, P=0.215) and the internal validation cohort (0.76 vs. 0.68, P=0.083). The clinical-radiomic nomogram showed a higher AUC than the clinical model in the training cohort (0.84 vs. 0.67, P<0.001) and the internal validation cohort (0.78 vs. 0.64, P=0.038) but not in the external validation cohort (0.72 vs. 0.76, P=0.164). Multivariate Cox analysis showed that MRI-reported extramural vascular invasion (EMVI) (HR=1.099, 95%CI: 0.462-2.616; P=0.031) and clinical-radiomic nomogram-based LNM (HR=2.232, 95%CI: 1.238-7.439; P=0.017) were independent risk factors for 3-year RFS. The combination of clinical-radiomic nomogram-based LNM and MRI-reported EMVI showed good performance in the training cohort (AUC=0.748), the internal validation cohort (AUC=0.706) and the external validation cohort (AUC=0.688) for predicting 3-year RFS.
Conclusion The clinical-radiomics nomogram exhibits good performance for predicting preoperative LNM. The combination of clinical-radiomic nomogram-based LNM and MRI-reported EMVI showed clinical potential for assessing 3-year RFS.


Introduction
Rectal cancer ranks eighth among all cancers worldwide (1). Lymph node metastasis (LNM) has been confirmed to be a poor prognostic factor in rectal cancer (2,3). Preoperative prediction of LNM can provide useful information for determining the need for adjuvant therapy or surgical resection. Therefore, an accurate prediction of LNM plays an important role in clinical decision-making and improved prognosis (2,4). Traditional imaging methods mainly focus on the size, shape and edge of lymph nodes to determine the lymph node status. However, these morphological features alone are not sufficient to reliably identify LNM in rectal cancer because reactive or inflammatory lymph nodes can be enlarged, whereas normal-sized or even small lymph nodes account for a significant proportion of malignant nodes (5-7). An alternative technical approach is needed to complement the routine imaging tools used in the assessment of LNM.
Radiomics is a noninvasive method that allows the extraction of quantitative features from medical images (8). Several studies reported that CT- or MRI-based radiomics features could predict LNM in other malignant tumors (9)(10)(11). For rectal cancer, some studies reported that CT or MRI radiomics signature-based nomograms of the primary tumor were able to discriminate colorectal cancer patients with LNM from those without (12)(13)(14). However, these previous reports only measured intratumoral regions, and the peritumoral region, which may contain valuable information about the tumor, was excluded. Tumor heterogeneity is not solely limited to cancer cells but also relates to the nonmalignant and infiltrating cells surrounding the tumor, commonly referred to as the microenvironment. It is the interaction between tumor cells and the surrounding microenvironment that influences tumor evolution and progression (15). Several studies have shown that radiomics based on peritumoral regions improves the diagnostic performance for identifying LNM in other cancers (16)(17)(18). Therefore, we can presume that radiomics derived from intratumoral and peritumoral regions could also predict LNM in rectal cancer. Furthermore, previous studies had small sample sizes and lacked complete external validation cohorts. To the best of our knowledge, no study has investigated the associations of preoperative MRI radiomics signatures with LNM and 3-year recurrence-free survival (RFS). A recent study reported that, because radiomics offers better disease characterization, radiomics models based on T2WI alone, that is, without diffusion weighted imaging (DWI), might achieve better performance (19).
Therefore, the purpose of this study was to develop and validate a T2WI-based clinical-radiomics model from intratumoral and peritumoral tissues for the preoperative prediction of LNM and prognosis in patients with resectable rectal cancer using a multicenter database.

Study population
This prospective study was approved by the institutional review board of our institution, and written informed consent was obtained from the patients. From January 2019 to December 2021, we prospectively recruited 431 patients with rectal cancer from two hospitals who underwent radical surgery. We included the following patients: (1) patients who underwent MRI examination two weeks before surgery; (2) rectal adenocarcinoma diagnosis based on pathology of surgical specimens; and (3) 12 or more regional lymph nodes in the surgical specimen that needed to  (5) incomplete clinical data, such as lack of presurgical carcinoembryonic antigen [CEA] data (n=8). The remaining patients were divided into three groups: the training cohort (n=134) and the internal validation cohort (n=56), both from hospital 1 and split at a ratio of 7:3 based on the scanning date, and an external validation cohort (n=156) from hospital 2. A flowchart of the study population is shown in Figure 1.

MRI evaluation
Two radiologists (the first author and second author, with 5 years and 12 years of experience in reporting rectal cancer MRI, respectively), blinded to the histopathology results, reviewed the MR images in consensus. The tumor length and tumor thickness were measured on the sagittal and oblique axial T2WIs, respectively. Extramural vascular invasion (EMVI) positivity on MRI was defined as follows: (1) tumor signal intensity in a vascular structure, (2) dilated vessels, and (3) tumoral extension through the vessel wall invading the vessel border. Qualitative criteria for MRI-reported lymph node metastasis were based on the 2016 European Society of Gastrointestinal and Abdominal Radiology consensus meeting (7). Disagreements between the two radiologists in the assessment of these features were resolved through discussion.

Tumor segmentation and feature extraction
A flowchart of the radiomics process is shown in Figure 2. One radiologist (the first author), blinded to the histopathology results, segmented the volumes of interest of the tumors on T2WI with the AK software (Artificial Intelligence Kit, version 3.3.0, GE Healthcare), and a senior author (the last author) with 20 years' experience reviewed the segmentations. To acquire information at the invasive margin, peritumoral regions were obtained with automated dilation of the tumor boundaries by 2 mm on the outside and shrinkage of the tumor boundaries by 1 mm on the inside, resulting in a ring with a thickness of 3 mm (20). We carefully excluded obvious vessels, peritumoral organs, and air cavities. Intraclass correlation coefficients (ICCs) were calculated to assess the interobserver reproducibility of the radiomic feature extraction. The reproducibility of radiomic features between two observers (the first author and second author) was evaluated with the ICC based on the first 30 patients' data. The subsequent feature extraction was performed by a radiomic module (backed by Pyradiomics) embedded in the open-source software package 3D Slicer (version 4.9, http://www.slicer.org). The gray levels of the T2WI were quantized to 25 gray levels. Seven radiomic feature categories were included: 14 first-order statistical features, 18 shape-based features, 22 gray level co-occurrence matrix features, 16 gray level size zone matrix features, 16 gray level run length matrix features, 14 gray level dependence matrix features, and 5 neighboring gray tone difference matrix features. Moreover, two image filters, wavelet and Laplacian of Gaussian, were applied to the original images. Before feature extraction, z score normalization was applied to the MRI signal intensities of the T2WI. Consequently, 1409 features were obtained for each of the intratumoral and peritumoral regions.
Segmentation time was kept to 300 seconds for the senior radiologist and 600 seconds for the junior radiologist.
Flowchart of patient selection.
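The construction of the peritumoral ring described above can be illustrated with a minimal sketch. It assumes a 2D mask with isotropic 1 mm pixels and a square structuring element, none of which is stated in the text; the AK software's actual dilation and shrinkage algorithm is not described here, and the function names are hypothetical.

```python
from itertools import product


def dilate(mask, radius):
    """Binary dilation of a pixel set with a (2*radius+1) x (2*radius+1) square."""
    offsets = list(product(range(-radius, radius + 1), repeat=2))
    return {(r + dr, c + dc) for r, c in mask for dr, dc in offsets}


def erode(mask, radius):
    """Binary erosion: keep pixels whose whole square neighbourhood lies in the mask."""
    offsets = list(product(range(-radius, radius + 1), repeat=2))
    return {(r, c) for r, c in mask
            if all((r + dr, c + dc) in mask for dr, dc in offsets)}


def peritumoral_ring(tumor, outward_mm=2, inward_mm=1, pixel_mm=1.0):
    """Ring between the tumor dilated outward and the tumor shrunk inward.

    pixel_mm is an assumed isotropic pixel size, not a value from the study.
    """
    out_px = round(outward_mm / pixel_mm)
    in_px = round(inward_mm / pixel_mm)
    return dilate(tumor, out_px) - erode(tumor, in_px)


# Toy tumor: a 5 x 5 pixel square (hypothetical data).
tumor = {(r, c) for r in range(5) for c in range(5)}
ring = peritumoral_ring(tumor)

# Ring pixels on the middle row, to the right of the tumor centre.
right_side = sorted(c for r, c in ring if r == 2 and c > 2)
print(len(ring), right_side)
```

On this toy mask the ring is three pixels wide on each side (one pixel inside the tumor boundary and two outside), which mirrors the 1 mm + 2 mm = 3 mm thickness given in the text. Exclusion of vessels, adjacent organs, and air would be a separate manual step.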

Feature selection and model building
Features with ICC >0.75 were included for subsequent analysis. ComBat harmonization was first used to remove batch effects, caused by the handling of samples at different centers or by different scanners/protocols, that can obscure individual variations (21). Feature selection and model building were performed using R software (version 2.15.3, www.r-project.org).
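As an illustration of the ICC >0.75 filter, the sketch below computes a two-way random-effects, absolute-agreement, single-rater ICC, often written ICC(2,1), for each feature and keeps the reproducible ones. The ICC form is an assumption, since the text does not say which variant was used, and the feature names and values are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a table with one row per patient and one column per observer."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ms_rows = ss_rows / (n - 1)                      # between-patient mean square
    ms_cols = ss_cols / (k - 1)                      # between-observer mean square
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


def reproducible_features(feature_ratings, threshold=0.75):
    """Names of features whose interobserver ICC exceeds the threshold."""
    return [name for name, ratings in feature_ratings.items()
            if icc_2_1(ratings) > threshold]


# Hypothetical feature values from two observers on four patients.
feature_ratings = {
    "feat_stable": [[1.0, 1.1], [2.0, 2.1], [3.0, 2.9], [4.0, 4.0]],
    "feat_noisy": [[1.0, 3.0], [2.0, 1.0], [3.0, 2.0], [4.0, 1.5]],
}
kept = reproducible_features(feature_ratings)
print(kept)
```

In the study this comparison was made on the first 30 patients; the four-patient table here is only for readability.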
The radiomics features were initially screened by maximum relevance and minimum redundancy, and then least absolute shrinkage and selection operator regression was used to select the most useful predictive features from the training cohort. A radiomics signature score (radscore) was calculated for each patient as a linear combination of the selected features weighted by their respective coefficients. The predictive accuracy of each radscore was evaluated by the area under the curve (AUC) in the training and validation cohorts. The radscore with the highest AUC was included in the subsequent analysis.
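The radscore calculation amounts to a weighted sum, sketched below. The feature names, coefficients, and values are hypothetical placeholders rather than the fitted values of this study, and the optional intercept is an assumption (the text mentions only the weighted features).

```python
def radscore(features, coefficients, intercept=0.0):
    """Linear combination of the selected features weighted by their coefficients.

    Features absent from `coefficients` (i.e. not selected) are ignored.
    """
    return intercept + sum(
        weight * features[name] for name, weight in coefficients.items()
    )


# Hypothetical coefficients for three selected, standardized features.
coefficients = {
    "peri_interquartile_range": 0.4,
    "peri_skewness": 0.3,
    "intra_wavelet_feature": -0.2,
}
# Hypothetical standardized feature values for one patient.
patient = {
    "peri_interquartile_range": 0.5,
    "peri_skewness": 1.0,
    "intra_wavelet_feature": 2.0,
    "unselected_feature": 7.0,
}
score = radscore(patient, coefficients)
print(round(score, 3))
```

A positive coefficient, as reported later for the peritumoral interquartile range and skewness, means that larger feature values push the radscore upward.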
The Wilcoxon test was first applied to all clinical risk factors and the radscore; univariate logistic regression was then performed, and the factors with P<0.05 were chosen as the independent predictors. Multivariate logistic regression analysis was performed to construct the combined model. The nomogram and clinical model for predicting LNM were constructed using the selected predictors. The Hosmer-Lemeshow test was performed to assess the goodness-of-fit of the nomogram. Calibration curves were generated to evaluate the calibration of the nomogram. The AUC was calculated to assess the discrimination performances of the clinical model and the nomogram for predicting LNM. The clinical utility of the nomogram was evaluated by decision curve analysis (DCA).
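Decision curve analysis rests on the net benefit at a threshold probability p_t, commonly defined as TP/n - (FP/n) x p_t/(1 - p_t). The sketch below evaluates this quantity for a model and for the "treat all" strategy on hypothetical predictions; it is an illustration of the standard definition, not the R routine used in this study.

```python
def net_benefit(probabilities, labels, threshold):
    """Net benefit of treating every patient whose predicted risk >= threshold."""
    n = len(labels)
    pairs = list(zip(probabilities, labels))
    true_pos = sum(1 for p, y in pairs if p >= threshold and y == 1)
    false_pos = sum(1 for p, y in pairs if p >= threshold and y == 0)
    return true_pos / n - false_pos / n * threshold / (1 - threshold)


# Hypothetical predicted LNM probabilities and pathological labels (1 = LNM+).
probabilities = [0.9, 0.7, 0.4, 0.1]
labels = [1, 1, 0, 0]

model_nb = net_benefit(probabilities, labels, threshold=0.5)
# "Treat all" corresponds to assigning every patient a risk of 1.0.
treat_all_nb = net_benefit([1.0] * len(labels), labels, threshold=0.5)
print(model_nb, treat_all_nb)
```

A decision curve plots this net benefit over a range of thresholds; a model is clinically useful where its curve lies above both the "treat all" and "treat none" (net benefit 0) strategies.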

Outcome
Patients with pathological T1-2N0M0 rectal cancer after surgery received a "follow-up and watch" strategy without any adjuvant treatment. Patients with pathological T3-4N0M0 or T1-4N1-2M0 disease after surgery received 5-fluorouracil-based adjuvant therapy. Relapse was assessed every 3-6 months based on clinical or radiological locoregional or distant progression after surgery. The primary endpoint was 3-year RFS.

Statistical analysis
SPSS 23.0 (IBM) and R software were used for statistical analysis. The baseline characteristics of patients with rectal cancer were compared using Student's t test, nonparametric test, chi-squared test, and Fisher's exact test (where appropriate). The diagnostic performance was compared by ROC analysis, and the difference in AUCs between these models was compared using DeLong's test. The prognostic model for 3-year RFS was constructed using univariate and multivariate Cox analysis.
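For readers less familiar with ROC analysis, the AUC equals the probability that a randomly chosen LNM-positive patient receives a higher score than a randomly chosen LNM-negative patient, with ties counted as one half. The sketch below computes it this way on hypothetical scores; DeLong's test for comparing two AUCs is not implemented here.

```python
def auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in positives
        for q in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical model scores and pathological labels (1 = LNM+).
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
print(auc(scores, labels))
```

Here three of the four positive/negative pairs are ordered correctly, giving an AUC of 0.75.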

Patient characteristics
A total of 346 patients (mean age 61.86 years, age range 26-88 years) were included in this study population. Among the 346 patients with rectal cancer, 134 patients were in the training cohort (66 pathologically reported LNM+ and 68 LNM-), 56 patients were in the internal validation cohort (27 pathologically reported LNM+ and 29 LNM-), and 156 patients were in the external validation cohort (88 pathologically reported LNM+ and 68 LNM-). Among these three cohorts, significant differences were found in MRI-reported EMVI (P=0.031) and tumor length (P=0.030), as shown in Table 1.
The workflow of a typical radiomics process in our study included tumor segmentation, feature selection, and model construction and evaluation.

MR-reported LNM correlation with pathologic results
The correlation of MR-reported LNM with pathologic results is summarized in Supplementary Table 1. The agreement between MR-reported LNM and pathologic results had a Kappa of 0.248, with a sensitivity of 62.4% and a specificity of 62.4%. Because the Kappa value was less than 0.4, the consistency between MR-reported LNM and pathologic results was poor.
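The relationship between Kappa, sensitivity, and specificity can be checked from a 2 x 2 table, as in the sketch below. The counts are hypothetical, because Supplementary Table 1 is not reproduced here, so the output does not correspond to the reported values.

```python
def agreement_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity and Cohen's kappa for MRI report vs. pathology."""
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    observed = (tp + tn) / n
    # Agreement expected by chance from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return sensitivity, specificity, kappa


# Hypothetical counts: 100 LNM+ and 100 LNM- patients.
sens, spec, kappa = agreement_metrics(tp=60, fp=40, fn=40, tn=60)
print(round(sens, 3), round(spec, 3), round(kappa, 3))
```

When LNM-positive and LNM-negative patients are equally frequent, Kappa reduces to sensitivity + specificity - 1. The reported figures follow this pattern closely (0.624 + 0.624 - 1 = 0.248), which is in line with the roughly balanced cohorts.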

Development and evaluation of the clinical-radiomic nomogram
A clinical model was constructed using three factors, including cT stage, MRI-reported EMVI, and CEA. The clinical-radiomic combined model was constructed by adding radscore 3 to the clinical model [odds ratio (OR)=1.566 for radscore 3, 1.841 for cT, 8.340 for EMVI, and 1.020 for CEA], as summarized in Table 2. The nomogram was constructed for visualizing the combined model, as shown in Figure 4. The calibration curves and DCA results of the clinical-radiomics nomogram are shown in Figure 5. Good calibration in training cohort and validation cohort was identified using the Hosmer-Lemeshow test (all P>0.05).
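To show how a logistic model of this kind turns predictors into a predicted LNM probability, the sketch below converts the reported odds ratios into coefficients (the natural logarithm of each OR). The intercept, the coding of cT as an ordinal number, EMVI as 0/1, and the units of CEA and radscore are assumptions made for illustration; they are not given in the text, so the absolute probabilities are not those of the published nomogram.

```python
import math

# Odds ratios reported for the combined model.
ODDS_RATIOS = {"radscore3": 1.566, "cT": 1.841, "EMVI": 8.340, "CEA": 1.020}


def predicted_probability(patient, intercept):
    """Logistic model: coefficients are log odds ratios; intercept is hypothetical."""
    logit = intercept + sum(
        math.log(odds_ratio) * patient[name]
        for name, odds_ratio in ODDS_RATIOS.items()
    )
    return 1.0 / (1.0 + math.exp(-logit))


def odds(probability):
    return probability / (1.0 - probability)


# Two hypothetical patients who differ only in MRI-reported EMVI status.
emvi_negative = {"radscore3": 0.2, "cT": 3, "EMVI": 0, "CEA": 4.0}
emvi_positive = dict(emvi_negative, EMVI=1)

p_neg = predicted_probability(emvi_negative, intercept=-2.0)
p_pos = predicted_probability(emvi_positive, intercept=-2.0)
print(round(p_neg, 3), round(p_pos, 3), round(odds(p_pos) / odds(p_neg), 3))
```

Whatever intercept is assumed, the ratio of the odds between the two patients equals the EMVI odds ratio of 8.340, which is the sense in which each OR summarizes a predictor's contribution.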

Subgroup analyses
Subgroup analyses of the nomogram are shown in Figure 6. Extranodal extension (ENE), which is defined as the extension of tumor cells through the nodal capsule into the perinodal fatty tissue, is an adverse prognostic factor in rectal cancer (22-24). Pathological specimens of 93 patients with LNM were reviewed by a pathologist to determine the ENE status. In total, 38 patients were ENE positive. The nomogram had a higher AUC than the clinical model for identifying ENE (0.837 vs. 0.715, P=0.004). Patients with lateral lymph node metastasis (LLNM) have a significantly higher risk of lateral pelvic recurrence than those without LLNM. LLNM is considered distant metastasis and is treated with neoadjuvant chemoradiotherapy (nCRT) followed by surgery. Forty patients underwent dissection. There were 16 patients with LLNM confirmed by pathological specimens. Although the nomogram had a higher AUC than the clinical model for identifying LLNM, the difference was not significant (0.752 vs. 0.715, P=0.538). Patients with N2 stage had a worse prognosis than patients with N0 and N1 stage. In total, 56 patients with N2 stage were confirmed by pathological specimen. The nomogram had a higher AUC than the clinical model for differentiating N2 stage from LNM-negative patients, but the difference was not significant (0.
Receiver operating characteristic curves of three radiomics models for predicting lymph node metastasis in training cohort (A) and internal validation cohort (B).

Survival analysis
The median follow-up in the event-free population was 26 months (range, 5-36 months) in the training cohort, 26 months (range, 6-36 months) in the internal validation cohort, and 36 months (range, 5-36 months) in the external validation cohort. The rate of recurrence in patients with LNM was higher than that of those without LNM (31.8% vs. 16

Discussion
In the current study, radscore 3 outperformed radscore 1 and radscore 2 for identifying LNM. After radscore 3 was added to the clinical model, the clinical-radiomics nomogram significantly improved diagnostic performance compared with the clinical model in the training cohort and the internal validation cohort. In the external validation cohort, the clinical-radiomic nomogram did not outperform the clinical model, although the difference was not significant. Moreover, the prognostic model constructed from MRI-reported EMVI and clinical-radiomic nomogram-based LNM indicated good performance for predicting 3-year RFS.
Radscore 3, consisting of 10 radiomics features, could predict LNM with acceptable performance in the training cohort (AUC of 0.77) and the internal validation cohort (AUC of 0.76). Of the 10 radiomics features, peritumoral features accounted for most of the features in radscore 3 (6/10, 60%). In this study, the two important radiomics features with positive coefficients were the interquartile range and skewness extracted from the peritumoral region. The interquartile range is the difference between the 75th and 25th percentiles of the image array. A large interquartile range indicates a wider spread of gray values in the region of interest, which implies inhomogeneous intensity. Therefore, these two features with positive coefficients, which indicate tumor heterogeneity, suggested that the peritumoral region around the rectum is important in the formation of LNM. Our study found that the radscore 3 model showed minor improvements in diagnostic efficacy compared with the radscore 1 and radscore 2 models. Wavelet features are extracted from the images transformed by a wavelet filter. Consistent with a previous study (25), the selected radiomic signature in this study was mainly constructed from wavelet features (6/10, 60%). Another study also reported the effectiveness of wavelet features on T2WI in predicting lymph node status (26). Therefore, these results suggested that wavelet features better reflected tumor heterogeneity. Some studies have reported that some clinical characteristics are related to LNM (27)(28)(29). Compared with these studies, the AUC of the nomogram in our study was slightly higher. This finding could be explained by the multiregional radiomics feature extraction in our study. Other previous studies reported that a multiparametric MRI-based radiomics nomogram for the tumor region alone showed a slightly improved diagnostic performance compared with that noted in our study (14,25).
However, the sample size in these studies was relatively small, and these retrospective studies lacked independent external validation. Moreover, multiparametric MRI-based radiomics, especially when it incorporates DWI-based radiomics features that could be influenced by MRI systems or b-values, is not stable and typically exhibits variable diagnostic performance. Total mesorectal excision was introduced to reduce local recurrence by completely removing probable microtumors around the cancer. Therefore, the perirectal tissue may contain crucial biological information, including potential predictive markers. Liu et al. demonstrated that clinical data combined with multiregional-based MRI radiomics can improve the diagnostic efficacy in predicting LNM (30). Jayaprakasam et al. reported that radiomics features of mesorectal fat can predict tumor response after neoadjuvant chemoradiation therapy (31). However, peritumoral regions in this study were defined as the region along the mesorectal fascia and the outer edge of the tumor and rectal wall. Several previous studies in other tumors defined peritumoral regions as the area immediately surrounding the tumor (18,20,32,33). Some studies also reported metabolic changes in the peritumoral region, including increased uptake of FDG by the tissues adjacent to the tumor compared with distant tissues (34,35). Therefore, we chose a 3-mm area around the tumor boundary as the peritumoral region according to previous studies (18,20).
Regarding subgroup analyses, patients with LNM≥4 (stage N2) had at least stage III rectal cancer (36). Different treatments and outcomes are noted between stage I-II and stage III rectal cancer patients. Our results showed that the nomogram had moderate value for differentiating N2 stage from LNM-negative patients. ENE was associated with a poorer prognosis in colorectal cancer patients (37). The nomogram had good diagnostic performance (AUC, 0.837) for differentiating ENE-positive from LNM-negative patients. Our study showed that the nomogram had moderate diagnostic performance (AUC, 0.752) for differentiating LLNM-positive from LNM-negative patients. In the T1-T2 stage subgroup analysis, we found that the nomogram had good diagnostic performance for identifying LNM (AUC, 0.813). In the T3 stage subgroup analysis, the nomogram had moderate diagnostic performance for identifying LNM (AUC, 0.739). Although the sample size in these subgroup analyses was small, our study provided preliminary evidence that the nomogram could potentially assess these subgroups. In addition, we found that the rate of recurrence in patients with LNM was higher than that in those without LNM (51.9% vs. 8.3%). Patients with a low clinical-radiomic nomogram score had better 3-year RFS than those with high scores. MRI-reported EMVI has been confirmed to be strongly associated with distant recurrence (38). In this study, multivariate Cox analysis showed that MRI-reported EMVI and clinical-radiomic nomogram-based LNM were adverse prognostic factors for 3-year RFS. The prognostic model constructed from these two indicators indicated good performance for predicting 3-year RFS. These results may indicate that T stage and N stage alone are not sufficient for classifying patients, and that a combination of more indicators, such as clinical-radiomic nomogram-based LNM and MRI-reported EMVI, is more sensible.
Our study has several limitations. First, the sample size was relatively small, especially in the subgroup analyses. It is still necessary to expand the sample size in further studies. Second, although a distance of 3 mm around the tumor boundary was defined as the peritumoral region in this study, we did not compare different dilation distances. Third, the data were obtained from two different centers with different scanning devices. However, ComBat harmonization was used to remove the scanner/protocol effect. Fourth, even though DWI is routinely included in rectal MRI protocols and offers several benefits in various applications, it also has multiple possible shortcomings. Manual drawing of ROIs onto the tumor for quantitative or qualitative assessment may result in interobserver variation. Furthermore, image distortion due to artifacts is common on DWI, particularly around air-tissue interfaces. These shortcomings may interfere with radiologists drawing the tumor ROI. Finally, our findings are applicable to resectable rectal cancer, whereas patients who had a contraindication for surgery were excluded. Therefore, a selection bias might exist.
In conclusion, our study confirmed that the clinical-radiomics nomogram exhibits good clinical potential for predicting preoperative LNM. The prognostic model constructed from MRI-reported EMVI and clinical-radiomic nomogram-based LNM indicated good performance for predicting 3-year RFS. These results can assist in predicting preoperative LNM and in identifying high-risk patients with rectal cancer when assessing 3-year RFS.

Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement
This prospective study was approved by the institutional review board of Sichuan Provincial People's Hospital. The patients/participants provided their written informed consent to participate in this study.

Author contributions
ZL-L and TL directed the project and revised the paper. HaL and X-LC conceptualized and designed the study, analyzed the data, and wrote the paper. HuL wrote sections of the manuscript. HL analyzed the data. All authors contributed to the article and approved the submitted version.

Funding
This study received funding from the Sichuan Science and Technology Program (grant number 2020YFH0166, to X-LC) and the Sichuan Science and Technology Program (grant number 2022YFS0249, to HaL). The second grant (grant number 2022YFS0249) will pay the open access publication fees.