Computed Tomography-Based Radiomics for Preoperative Prediction of Tumor Deposits in Rectal Cancer

Objective: To develop and validate a computed tomography (CT)-based radiomics model for predicting tumor deposits (TDs) preoperatively in patients with rectal cancer (RC).
Methods: This retrospective study enrolled 254 patients with pathologically confirmed RC between December 2017 and December 2019. Patients were divided into a training set (n = 203) and a validation set (n = 51). A large number of radiomics features were extracted from the portal venous phase CT images. After features were selected with an L1-based method, a Rad-score was established by logistic regression analysis. A combined model incorporating the Rad-score and clinical factors was then developed and visualized as a nomogram. The models were evaluated by receiver operating characteristic (ROC) curve analysis and the area under the ROC curve (AUC).
Results: One hundred and seventeen of the 254 patients were eventually found to be TDs+. The Rad-score and the clinical factors carbohydrate antigen (CA) 19-9, CT-reported T stage (cT), and CT-reported peritumoral nodules (+/-) differed significantly between the TDs+ and TDs- groups (all P < 0.001). These factors were all included in the combined model by logistic regression analysis (odds ratio = 2.378 for Rad-score, 2.253 for CA19-9, 2.281 for cT, and 4.485 for peritumoral nodules). This model showed good performance in predicting TDs in the training and validation cohorts (AUC = 0.830 and 0.832, respectively). Furthermore, the combined model outperformed the clinical model incorporating CA19-9, cT, and peritumoral nodules (+/-) in both the training and validation cohorts for predicting TDs preoperatively (AUC = 0.773 and 0.718, P = 0.008 and 0.039).
Conclusions: The combined model incorporating the Rad-score and clinical factors could provide a preoperative prediction of TDs and help clinicians guide individualized treatment for RC patients.


INTRODUCTION
Rectal cancer (RC) is one of the most common cancers and a leading cause of cancer-related death worldwide (1,2). Tumor deposits (TDs) in RC have been shown to be an important marker of poor prognosis (3-5). This adverse association persists even in patients with lymph node metastasis (LNM), strongly suggesting that the effect of TDs on prognosis is separate and additive (3). Detecting TDs before surgery is therefore important for assessing the prognosis of RC patients.
TDs, also called extranodal TDs, peritumoral deposits, or satellite nodules, are defined as discrete tumor foci in the pericolic or perirectal fat, without histological evidence of residual lymph node or identifiable vascular or neural structures (6,7). According to the eighth edition of the American Joint Committee on Cancer (AJCC) TNM staging system, any T lesion with negative regional LNM and positive TDs is classified as N1c (8). Positive TDs can therefore raise the clinical stage of RC patients. For example, a stage I patient (T1-2N0) with TDs should be reclassified and treated as stage III (T1-2N1c). The early identification of TDs is thus important for staging and for treatment planning.
Rectal magnetic resonance imaging (MRI), computed tomography (CT), and endorectal ultrasound are the first-line examinations in RC. However, no imaging modality has been proven to predict TDs reliably (9-12). Currently, the diagnosis of TDs still depends on postoperative pathology, which is not conducive to the early evaluation of tumor characteristics (9). In recent years, radiomics has made it possible to process medical images and extract information invisible to the human eye, and it has been widely used in tumor research. Chen et al. (11) and Yang et al. (12) established radiomics models based on ultrasound or MRI for predicting TDs. However, the sample sizes in these studies were small (TDs+: 23-40). At present, there is still a lack of CT-based radiomics research in this field. Therefore, we aimed to evaluate the predictive value of CT-based radiomics for TDs in a larger cohort of RC patients.

MATERIALS AND METHODS
Patients
This study was approved by the local Institutional Review Boards (No. 2019-1159, Date: 2019/12/26), and the need for written informed consent was waived.
The institutional database of medical records was searched for suitable patients between December 2017 and December 2019. A total of 254 patients with pathologically confirmed RC (mean age 59.2 years, age range 32-86 years) were finally enrolled according to the following criteria. The inclusion criteria were: (1) pathologically confirmed RC; (2) sufficient clinical data [e.g., carcinoembryonic antigen (CEA), carbohydrate antigen (CA) 19-9, and CA125]; and (3) no therapy before surgery. The exclusion criteria were: (1) CT scanning not performed (n = 234); (2) poor image quality (n = 2); (3) missing tumor markers (n = 8); (4) other malignant tumors besides RC; and (5) neoadjuvant chemoradiotherapy (nCRT) (n = 73). The flowchart of patient recruitment is shown in Figure 1. The baseline characteristics and pathological data of the patients are listed in Table 1. The patients were divided into a training set (n = 203) and a validation set (n = 51) at a ratio of 8:2 according to the scanning date.

CT Examination
In our hospital, the chest-abdomen-pelvis contrast-enhanced CT is routinely used in patients with clinically suspected RC for evaluating the primary tumor and metastasis. In this study, CT scanning was performed on a 128-MDCT scanner (Somatom Definition AS+, Siemens Healthcare Sector, Forchheim, Germany) and a dual-source CT system (Somatom Definition Flash, Siemens Healthcare Sector, Forchheim, Germany). Both CT scanners used the same main parameters, as shown in Supplementary Material. The radiomics features were extracted from the portal venous phase images.

Reference Standard for Pathology
TDs were pathologically proven based on surgical specimens. Pathological confirmatory reports were acquired from the medical records of the Department of Pathology. The numbers of lymph nodes (LNs) and TDs were counted and recorded in the pathological reports.

CT Evaluation
Two experienced radiologists (with 10 and 5 years of experience in the diagnosis of RC) reviewed the CT images, blinded to patient identification and clinicopathological information. Because of the limited ability of CT to distinguish T1 from T2 lesions, T1 and T2 lesions were classified as one group (T1-2 group). Nodules with a diameter > 3 mm within the lymphatic drainage space of RC on CT images were defined as peritumoral nodules. The interobserver reliability of CT-reported T stage (cT) and peritumoral nodules (+/-) was evaluated with the weighted kappa statistic. Any disagreement between the two readers was then resolved by discussion during the image interpretation. The results of cT and peritumoral nodules (+/-) are shown in Table 1.
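As an illustration of the interobserver agreement measure mentioned above, the following is a minimal sketch of a weighted kappa for two readers. It assumes linear disagreement weights and ordered categories; the weighting scheme used in the study is not stated, and the function name and example labels are hypothetical.

```python
from collections import Counter


def weighted_kappa(reader_a, reader_b, categories):
    """Linearly weighted Cohen's kappa for two readers (illustrative sketch).

    `categories` lists the ordered rating levels, e.g. ["T1-2", "T3", "T4"].
    Linear weights are an assumption; the study does not state its weights.
    """
    k = len(categories)
    n = len(reader_a)
    if k < 2 or n == 0 or n != len(reader_b):
        raise ValueError("need >= 2 categories and two equal-length, non-empty rating lists")
    index = {c: i for i, c in enumerate(categories)}

    # Observed joint counts and per-reader marginal counts.
    joint = Counter((index[a], index[b]) for a, b in zip(reader_a, reader_b))
    marg_a = Counter(index[a] for a in reader_a)
    marg_b = Counter(index[b] for b in reader_b)

    observed = 0.0  # weighted observed disagreement
    expected = 0.0  # weighted disagreement expected by chance
    for i in range(k):
        for j in range(k):
            weight = abs(i - j) / (k - 1)
            observed += weight * joint[(i, j)] / n
            expected += weight * (marg_a[i] / n) * (marg_b[j] / n)

    if expected == 0.0:
        raise ValueError("kappa is undefined when both readers use a single category")
    return 1.0 - observed / expected
```

With two categories, such as peritumoral nodules (+/-), the linear weights reduce to the ordinary unweighted kappa.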

Feature Extraction and Model Building
The tumoral and peritumoral regions in all patients were drawn separately, slice by slice, to obtain intra- and peritumoral features (Figure 2). The radiologists randomly selected 20 patients for evaluating feature stability. For the intra-observer analysis, one radiologist drew the volumes of interest (VOIs) twice, one month apart. The inter-observer agreement was assessed by comparing the VOIs of radiologist 1 (first time) with those of radiologist 2. It is commonly accepted that an intra- or inter-observer intraclass correlation coefficient (ICC) < 0.5 indicates poor reliability, 0.5-0.75 moderate reliability, and > 0.75 good or excellent reliability (13). Thus, features with ICC ≤ 0.75 were excluded.
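The stability filter described above can be sketched as follows. The text does not say which ICC form was used, so this example assumes the two-way random-effects, absolute-agreement, single-measurement form, often written ICC(2,1). The function names and the input layout (one row per patient, one column per delineation) are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1) from a table with one row per subject and one column per rater.

    Assumption: two-way random effects, absolute agreement, single measurement.
    The study does not specify which ICC variant was applied.
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def stable_features(feature_ratings, threshold=0.75):
    """Keep the names of features whose ICC exceeds the threshold (> 0.75 in the text)."""
    return [name for name, table in feature_ratings.items() if icc_2_1(table) > threshold]
```

In this layout, each feature would have one table for the intra-observer comparison and one for the inter-observer comparison, and a feature would be retained only if it passes both.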
The CT images were resampled to a pixel spacing of 1.0 mm in the three anatomical directions. High-pass and low-pass wavelet filters, Laplacian of Gaussian (LoG) filters with different σ parameters, and other image transformations (square, square root, logarithm, exponential, gradient, lbp2d, and lbp3d) were applied to the original images. Then, we extracted radiomics features (i.e., first-order, shape, and texture features) by using PyRadiomics (14). The texture features included the following types: the gray-level co-occurrence matrix (GLCM), the gray-level run-length matrix (GLRLM), the gray-level size zone matrix (GLSZM), and the gray-level dependence matrix (GLDM). Finally, a total of 2107 features were extracted from the original and filtered images.
To eliminate differences in value scales, all features were normalized by z-score. Redundant features were randomly removed by correlation analysis with a threshold of 0.5. Different feature-selection and machine-learning methods were then combined to form 84 classifiers, as shown in Supplementary Material. The radiomics parameters were tuned, and the output of the best classifier was taken as the Rad-score.
The Rad-score and clinical factors were assessed by univariate logistic regression analysis. The statistically significant factors were then entered into the multivariate logistic regression analysis for constructing the combined model. A nomogram was generated for model visualization, graphical evaluation of variable importance, and calculation of predictive accuracy. The Hosmer-Lemeshow test was performed to assess the goodness-of-fit of the nomogram. A calibration curve, obtained by plotting the actual TDs+ probability against the nomogram-predicted probability of TDs+, was used to assess the calibration of the nomogram (15). Decision curve analysis, first introduced in 2006 by Vickers et al. (16), was used to evaluate the clinical utility of the nomogram.
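The normalization and redundancy-removal steps described above can be illustrated with a small sketch. It is one possible reading of "randomly removed by correlation analysis with a threshold of 0.5": features are visited in random order, and a feature is dropped if its absolute Pearson correlation with an already kept feature exceeds the threshold. The function names, the seed, and the greedy strategy are assumptions, not the study's implementation.

```python
import random
from statistics import mean, pstdev


def zscore(column):
    """Standardize one feature column to zero mean and unit (population) SD."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        return [0.0 for _ in column]  # constant feature carries no information
    return [(x - mu) / sigma for x in column]


def pearson(x, y):
    """Pearson correlation, computed as the mean product of z-scores."""
    zx, zy = zscore(x), zscore(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)


def prune_correlated(features, threshold=0.5, seed=0):
    """Greedy random pruning: keep a feature only if |r| <= threshold with all kept ones.

    `features` maps a feature name to its list of values across patients.
    """
    names = list(features)
    random.Random(seed).shuffle(names)  # random visiting order (assumed)
    kept = []
    for name in names:
        if all(abs(pearson(features[name], features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept
```

Because the visiting order is random, different seeds can retain different members of a correlated pair, so fixing the seed keeps the selection reproducible.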
The receiver operating characteristic curve (ROC) analysis was performed to assess the predictive performance of the models.
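For reference, the AUC used to summarize a ROC curve can be computed directly from model scores without drawing the curve. The sketch below uses the rank-based (Mann-Whitney) formulation: the AUC is the fraction of TDs+/TDs- patient pairs in which the TDs+ patient has the higher score, with ties counted as one half. The function name and the 0/1 label coding are assumptions for illustration.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    labels: 1 for TDs+ and 0 for TDs- (assumed coding); scores: model outputs.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")

    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a concordant pair
    return wins / (len(positives) * len(negatives))
```

The pairwise loop is quadratic in the number of patients, which is acceptable for cohorts of a few hundred cases.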

Statistical Analysis
Student's t test, non-parametric test, chi-squared test, and Fisher's exact test (where appropriate) were used to analyze differences of baseline characteristics in

RESULTS
Feature Selection and Model Building
In the consistency test of VOIs, 1490 tumoral and 1605 peritumoral features had good reliability with ICC > 0.75. A Rad-score involving 10 peritumoral and 3 tumoral features was finally established by logistic regression analysis. The 13 features and their coefficients are shown in Supplementary Material. The Rad-score differed significantly between the TDs+ and TDs- groups (0.60 ± 0.19 vs 0.42 ± 0.20, P < 0.001).
A clinical model was composed of three factors selected by logistic regression analysis, namely CA19-9, cT, and peritumoral nodules (+/-). The combined model was built by adding the Rad-score to the clinical model [odds ratio (OR) = 2.378 for Rad-score, 2.281 for cT, 4.485 for peritumoral nodules (+/-), and 2.253 for CA19-9], as summarized in Table 2. Although volume and CEA were significantly different between the TDs+ and TDs- groups, they were both excluded by the multivariate logistic regression analysis (Table 2).
A nomogram was generated for visualizing the combined model (Figure 3). In the nomogram, the points for each variable on the corresponding axis are added to determine the risk of TDs+. A higher total score was associated with a greater risk of TDs+. The combined model had a good fit according to the Hosmer-Lemeshow test (P = 0.642 > 0.05). The calibration curve of the nomogram demonstrated good agreement between the predicted and actual observed probabilities (Figure 4A), as the solid line lay close to the reference (dotted) line. However, the model underestimated the actual risk of TDs+ in the threshold probability range of 30%-75% and overestimated the risk when the threshold probability was > 75%. Decision curve analysis was performed to assess the clinical usefulness of the combined model (Figure 4B), showing that the combined model provided more net benefit than "treat all", "treat none", the Rad-score, and the clinical model when the threshold probability was between 18% and 70%.
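To show how the reported odds ratios enter a logistic model, the sketch below converts each OR to a coefficient (the natural logarithm of the OR) and computes a predicted probability. The intercept and the coding of each predictor (for example, 0/1 for binary factors, or the unit in which the Rad-score is expressed) are not given in this section, so both are hypothetical inputs here and the resulting probabilities are illustrative only.

```python
import math

# Odds ratios reported for the combined model in the text.
ODDS_RATIOS = {
    "rad_score": 2.378,
    "ca19_9": 2.253,
    "ct_stage": 2.281,
    "peritumoral_nodules": 4.485,
}


def predicted_probability(predictors, intercept):
    """Probability of TDs+ from a logistic model built on the reported ORs.

    `predictors` maps each key of ODDS_RATIOS to a numeric value.
    `intercept` is NOT reported in this section; any value passed is hypothetical.
    """
    logit = intercept
    for name, odds_ratio in ODDS_RATIOS.items():
        logit += math.log(odds_ratio) * predictors[name]  # coefficient = ln(OR)
    return 1.0 / (1.0 + math.exp(-logit))
```

With every other predictor held fixed, a one-unit increase in a predictor multiplies the odds of TDs+ by that predictor's OR, which is the relationship a nomogram encodes graphically as points.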
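The quantity plotted on a decision curve is the net benefit at each threshold probability. A minimal sketch of that calculation follows, using the standard definition: true positives per patient minus false positives per patient weighted by the odds of the threshold. The function names and the 0/1 label coding are assumptions for illustration.

```python
def net_benefit(labels, probabilities, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold.

    labels: 1 for TDs+ and 0 for TDs- (assumed coding).
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    true_pos = sum(1 for y, p in zip(labels, probabilities) if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(labels, probabilities) if p >= threshold and y == 0)
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit of the 'treat all' strategy; 'treat none' is always 0."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)
```

Evaluating both functions over a grid of thresholds and plotting the results reproduces the layout of a decision curve, in which a model is useful wherever its curve lies above both reference strategies.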

Model Comparisons
The model comparisons are shown in Table 3 and Figure 5.

Subgroup Analyses
The subgroup analyses gave the following results. In the TDs+ group, there were 77 patients with 1-2 TDs and 40 patients with ≥ 3 TDs. The group with ≥ 3 TDs had higher values of both the combined and clinical models than the 1-2 TDs group (P = 0.015 for the combined model and 0.08 for the clinical model). Moreover, the combined model outperformed the clinical model in both the 1-2 and ≥ 3 TDs+ groups when differentiating TDs+ from TDs- patients (both P = 0.005).
The patients with peritumoral nodules on imaging were all classified as clinical stage III in this study. The combined model had moderate diagnostic performance (AUC = 0.771, 95% CI: 0.701-0.831) in the stage III patients. For patients without peritumoral nodules on imaging, the combined model also showed moderate diagnostic performance, with an AUC of 0.751. For patients with different pathological T stages, the combined model had similar AUCs in the T1-2 and T3-4 groups (0.740 and 0.789).

DISCUSSION
In this study, a combined model incorporating Rad-score, CA19-9, cT, and peritumoral nodules (+/-) was established based on CT in a larger cohort than those of the previous studies.
TDs are an important prognostic factor in RC. A meta-analysis reported that all 21 included studies found a significantly worse prognosis in patients with TDs (3). Goldstein et al. (18) found that when patients with differing numbers of LNM were assessed separately, those with TDs still demonstrated a worse prognosis. For example, with one positive node, 5-year survival was 62% with no TDs detected versus 44% with TDs. When six or more LNs were involved, 5-year survival was 16% without TDs versus 3% with TDs. This result strongly suggests that the effect of TDs on prognosis is separate from that of LNM. Thus, preoperative prediction of TDs is of great significance for assessing the prognosis of patients with or without LNM (N1c). The selection of treatment strategies mostly depends on cancer staging. According to the eighth edition of the AJCC TNM staging system, the presence of TDs without LNM causes patients to be classified as N1c, and these patients are staged as III. That is, once TDs are present, nCRT is recommended. If the TD status is unknown before therapy, the treatment plan may be misguided.
Traditional imaging techniques that depend on the naked eye, such as CT, MRI, and ultrasound, cannot reliably assess TDs. Recently, radiomics has emerged as a potent tool for constructing decision-support models, and researchers have started to use it to predict TDs in RC. Chen et al. (11) developed an ultrasound radiomics model with an AUC of 0.795 in a cohort of 127 patients (TDs+: n = 40). Yang et al. (12) established an MRI-based radiomics model in 139 RC patients (TDs+: n = 23), which had an AUC of 0.820. Our results showed an AUC comparable with those of the previous studies in a larger cohort (254 patients; TDs+: n = 117). We included T stage in the combined model, which was consistent with Yang et al. (12). Unlike Yang et al. (12), who used a two-dimensional (2D) region of interest (ROI), we established the model based on a 3D ROI, namely the VOI. A 2D ROI does not cover the whole lesion, and thus some information on tumor heterogeneity may be lost.
In this study, "peritumoral nodule" was defined as any nodule (diameter > 3 mm) within the lymphatic drainage space of RC, including both LNM and TDs. The CT-reported factors (i.e., cT and peritumoral nodule) were reviewed by two experienced radiologists, and thus reliable data were acquired. The tumor volume in the TDs+ group was larger than that in the TDs- group (median: 15.1 vs 12.0 cm³), which was consistent with the conclusion of Wei et al. (19). Although elevated CEA was found in the TDs+ group, CEA was not included in the combined model. Peritumoral features accounted for the majority of features in the Rad-score (10/13, 76.9%), suggesting an important role of the environment around the rectum in the formation of TDs (20).
Although the AJCC has not linked the number of TDs to staging, unlike the number of LNs (e.g., N1: 1 to 3 regional LNs, N2: ≥ 4 regional LNs) (8), several authors have found a significant relationship between an increasing number of TDs and worsening prognosis (18,21,22). For example, among patients with ≥ 3 TDs, none was alive at 5-year follow-up. It is noteworthy that this is significantly worse than in patients who had a similar number of LNM (in fact, even those with ≥ 6 positive LNs had a 5-year survival of 11%). In our study, the group with ≥ 3 TDs had a higher value of the combined model than the 1-2 TDs group (P = 0.015), indicating that the combined model was helpful for predicting the number of TDs. Moreover, the N1c group had a lower value of the combined model than the remaining TDs+ patients (P = 0.002), suggesting that the combined model might be able to predict N1c. In the future, a large multicenter study is needed to confirm these observations.
The patients with peritumoral nodules on imaging were all classified as clinical stage III in this study. The combined model had moderate diagnostic performance in the stage II and III patients, indicating the good stability of the model. There were 78 patients without peritumoral nodules on imaging, of whom 14 were TDs+. Because of the small sample size, the
Our study had several limitations. First, selection bias existed owing to the retrospective design. Second, prospective and external validation was not performed. Third, because it was impossible to achieve one-to-one correspondence between pathological and radiological peritumoral nodules in this study, we delineated the whole peritumoral area in the lymphatic drainage space of RC. Finally, we excluded nodules with diameter < 3 mm on imaging, although there was still a risk of TDs in these small nodules (23).
In conclusion, the CT-based radiomics model is helpful for the preoperative prediction of TDs in RC patients.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding authors.