Delta Radiomics Can Predict Distant Metastasis in Locally Advanced Rectal Cancer: The Challenge to Personalize the Cure

Purpose: Distant metastases are currently the main cause of treatment failure in locally advanced rectal cancer (LARC) patients. The aim of this research is to investigate the correlation between the variation of radiomics features between pre- and post-neoadjuvant chemoradiation (nCRT) magnetic resonance imaging (MRI) and the 2-year distant metastasis (2yDM) rate in LARC patients.

Methods and Materials: Diagnostic pre- and post-nCRT MRI of LARC patients, treated in a single institution from May 2008 to June 2015 with an adequate follow-up time, were retrospectively collected. Gross tumor volumes (GTV) were contoured by an abdominal radiologist and blindly reviewed by a radiation oncologist expert in rectal cancer. The dataset was first randomly split into 90% training data, for feature selection, and 10% testing data, for validation. The final set of features after the selection was used to train 15 different classifiers using accuracy as the target metric. The models' performance was then assessed on the testing data, and the best performing classifier was selected by maximising the confusion matrix balanced accuracy (BA).

Results: Data regarding 213 LARC patients (36% female, 64% male) were collected. The overall 2yDM rate was 17%. A total of 2,606 features extracted from the pre- and post-nCRT GTV were tested, and 4 features were retained after the feature selection process. Among the 15 tested classifiers, logistic regression performed best, with a testing set BA, sensitivity and specificity of 78.5%, 71.4% and 85.7%, respectively.

Conclusions: This study supports a possible role of delta radiomics in predicting the subsequent occurrence of distant metastasis. Further studies including a consistent external validation are needed to confirm these results and to allow the translation of radiomics models into clinical practice. Future integration with clinical and molecular data will be mandatory to fully personalize treatment and follow-up approaches.


INTRODUCTION
Colorectal cancer is the third most commonly diagnosed malignancy and the fourth leading cause of cancer-related death, being more prevalent in regions with a high human development index (1).
The combination of nCRT and surgery has improved local control (LC) of the disease in LARC patients, but it has not affected disease-free survival (DFS) or overall survival (OS) (3).
Recurrence in the form of distant metastases (mainly affecting the liver) is the main cause of treatment failure, and nearly 25% of treated LARC patients develop metastases within 5 years (4,5). Early development of metastases (within 2 years) identifies biologically aggressive tumors and is considered a strong predictor of OS (3). Identifying patients at higher risk of developing distant metastasis within 2 years (2yDM) is therefore a topic of great interest for the clinical community, as it could allow more accurate personalized management, with stricter clinical and imaging surveillance or even more intensive treatments.
Mesorectal fascia involvement, depth of invasion, lymphovascular invasion and lymph node involvement currently represent key features that imply worse prognosis (6,7).
Similarly, pathological response predicts patient prognosis and outcome, regarding local or distant recurrence and OS (3).
Rectal cancer is a rather heterogeneous disease, both inter- and intratumoral, in space and time, regarding histology, immunochemistry and genetic profiles. This heterogeneity in the tumor cell populations may explain the variability of biological behavior and response to therapy existing in rectal cancer (8,9). Tumor heterogeneity can be reflected in imaging, raising the opportunity of identifying imaging biomarkers that correlate with the tumor's biological behavior (10).
Magnetic resonance imaging (MRI) is the standard imaging technique for local staging and re-evaluation after nCRT in rectal cancer (11), although it still has some limitations in predicting clinical and pathological stage (12,13).
Existing data show that active oncological treatments can modify radiomics features. Analysing these changes is an approach known as "delta radiomics", and their evaluation may successfully predict tumor behavior in terms of synchronous or metachronous distant metastasis (DM), DFS and OS (4, 14-16, 21, 24-26).
The purpose of this study is to assess the ability of the delta radiomics approach to predict 2yDM in LARC, combining radiomics features extracted from staging and post-treatment MRI (24,27).

MATERIALS AND METHODS
Study Population
The target population comprised LARC patients treated with nCRT and subsequently referred for surgery.
We retrospectively and consecutively selected patients from our institution, a national reference centre for rectal cancer treatment, between May 2008 and June 2015, who met the following inclusion criteria: (a) age over 18 years; (b) pathologically proven rectal adenocarcinoma (including the mucinous variant, which was regarded as separate); (c) clinical stage T3-4 N0, T1-4 N1-2 or mesorectal fascia involvement (MRF+) according to the AJCC TNM 7th edition; (d) nCRT followed by surgery at our centre; (e) both pre-treatment (staging) and post-treatment (re-evaluation) MRI performed in our institution; (f) a maximum interval of 3 months between the end of nCRT and post-treatment MRI (14); (g) clinical and imaging follow-up of at least 3 years from surgery.
All patients underwent radiotherapy with a prescribed total dose of 45 Gy (1.8 Gy/day) delivered to the whole mesorectum and the draining nodal stations, with a boost to the tumor plus corresponding mesorectum up to 55 Gy with the simultaneous integrated boost (SIB) technique (2.2 Gy/day) or up to 50.4 Gy in case of sequential boost.
The neoadjuvant chemotherapy regimens considered were: CapOx (60 mg/m² of intravenous oxaliplatin on day 1 plus 1300 mg/m²/day of oral capecitabine on days 1 to 7, q7), capecitabine alone (1300 mg/m² on days 1 to 7 or 1 to 5, q7, during radiotherapy), or 5-fluorouracil (225 mg/m²/day on days 1 to 7, q7, during radiotherapy), depending on clinical stage and patient compliance.
Surgery was performed 8 to 12 weeks after the end of nCRT and included anterior resection (AR), abdominoperineal resection (APR), or transanal endoscopic microsurgery (TEM).
Adjuvant chemotherapy was based on 5-fluorouracil or capecitabine with or without oxaliplatin.

MRI Protocol
All MRI images were acquired using 1.5 T scanners (Signa Excite, GE Medical Systems, Milwaukee, Wisconsin, USA), with a pelvic phased-array surface coil.
All patients were scanned in the supine position. An enema of ultrasound gel (63 cm³), to distend the rectal lumen and limit luminal air, and 20 mg of intramuscular hyoscine-N-butylbromide (Buscopan; Boehringer Ingelheim Italia, Florence, Italy), as an antiperistaltic agent, were administered to reduce artefacts. All MRI examinations followed the standard protocol of our centre for rectal cancer (T2-weighted FSE images in axial, coronal and sagittal planes, T2-weighted FSE 3D high-resolution images perpendicular to the tumor, and axial DWI using b values of 0 and 1000 s/mm²) (12).
For radiomics analysis, T2-weighted fast spin-echo 3D high-resolution images acquired in a plane orthogonal to the tumor longitudinal axis were used, because this sequence is the main staging modality and its use for radiomics has been widely explored in rectal cancer (11-16, 25).
Pixel spacing of these images did not exceed 0.8 mm and slice thickness did not exceed 3 mm.
For each patient, pre- and post-nCRT MRI were analyzed. MRI images were then uploaded to a radiotherapy delineation console (Eclipse, Varian Medical Systems™, Palo Alto, California, USA) for gross tumor volume (GTV) segmentation.
GTVs were delineated by an abdominal radiologist and blindly reviewed by a radiation oncologist (28).
Contouring and revision of MRI images were blinded with respect to all clinical data including the histology of the tumor, treatment received, surgical results and clinical evolution.
In case of disagreement between the two experts, a final GTV was reached by consensus.
Tumor response on MRI after nCRT was classified as "complete", "partial" or "stable". Response was considered complete when tumoral tissue had completely disappeared on the analyzed T2-weighted images, in the absence of any suspicious residual tissue of intermediate signal and of any residual hyperintense signal on DWI sequences (12,13). In these cases of apparent complete response at MRI, the former tumor bed was contoured.

Radiomic Analysis
Radiomics features were extracted from both pre-nCRT and post-nCRT MR images using an in-house developed radiomics software, called Moddicom (29,30). Different families of features were extracted: statistical, morphological, textural grey level co-occurrence matrix (GLCM), textural grey level run length matrix (GLRLM), textural grey level size zone matrix (GLSZM), and fractals. GTV extraction and filter application are shown in Figure 1.
Before extracting the statistical and textural radiomics features, a Laplacian of Gaussian (LOG) filter was applied to the MR images, considering 13 different sigma values in the range of 0.4-1.6 mm. Fractal features were calculated on the processed MR images, as described by Cusumano et al. (16).
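As a rough illustration of the filter bank described above, the following sketch builds 13 sigma values over 0.4-1.6 mm and evaluates the textbook 2D Laplacian of Gaussian. The even 0.1 mm spacing, the names `SIGMAS_MM` and `log_2d`, and the use of the continuous 2D form are assumptions made for illustration; the text does not state how Moddicom implements the filter.

```python
import math

# Assumption: the 13 sigma values are evenly spaced (0.1 mm step) over 0.4-1.6 mm.
SIGMAS_MM = [round(0.4 + 0.1 * i, 1) for i in range(13)]


def log_2d(x_mm, y_mm, sigma_mm):
    """Textbook 2D Laplacian of Gaussian evaluated at offset (x, y), in mm."""
    r2 = x_mm ** 2 + y_mm ** 2
    s2 = sigma_mm ** 2
    scale = -1.0 / (math.pi * s2 ** 2)
    return scale * (1.0 - r2 / (2.0 * s2)) * math.exp(-r2 / (2.0 * s2))


if __name__ == "__main__":
    for sigma in SIGMAS_MM:
        # The kernel is negative at the centre and crosses zero at r = sigma * sqrt(2).
        print(sigma, log_2d(0.0, 0.0, sigma))
```

Smaller sigma values emphasise fine texture and larger values emphasise coarser structures, which is the usual motivation for scanning a range of sigmas.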
Finally, pre-nCRT features were combined with the post-nCRT features to define the delta features as the ratio of the latter to the former, so that a value smaller (bigger) than 1 implies that the post-nCRT feature value has decreased (increased) with respect to the pre-nCRT value.
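The ratio definition above can be sketched in a few lines. The feature names and values below are hypothetical, and returning `None` for a zero pre-nCRT value is an assumption, since the text does not say how such cases were handled.

```python
def delta_features(pre, post):
    """Return post/pre ratios for features present in both dictionaries."""
    deltas = {}
    for name, pre_value in pre.items():
        if name not in post:
            continue
        if pre_value == 0:
            # Assumption: a zero baseline is flagged as undefined, not imputed.
            deltas[name] = None
        else:
            deltas[name] = post[name] / pre_value
    return deltas


if __name__ == "__main__":
    # Hypothetical feature names and values, for illustration only.
    pre = {"feature_a": 2.0, "feature_b": 4.0}
    post = {"feature_a": 1.0, "feature_b": 6.0}
    print(delta_features(pre, post))
```

Here `feature_a` has a delta below 1 (the value decreased after nCRT) and `feature_b` a delta above 1 (the value increased).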

Statistical Analysis
For radiomics feature analysis, the dataset was randomly split into 90% training and cross-validation data and 10% testing data. Time to distant metastasis (DM) was calculated as the interval between the surgery date and the date of the metastatic event or, failing that, the last follow-up date. The analyzed outcome was the 2yDM rate, defined as the occurrence of DM within 2 years from the date of surgery.
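A minimal sketch of how the binary 2yDM outcome could be derived from dates is given below. The function name `is_2ydm`, the 730-day window and the treatment of patients without a recorded DM as negative are assumptions for illustration; the latter relies on the inclusion criterion of at least 3 years of follow-up.

```python
from datetime import date


def is_2ydm(surgery, dm_date, window_days=730):
    """True if a DM event was recorded within `window_days` after surgery."""
    if dm_date is None:
        # Assumption: no recorded DM means a negative label, since every
        # included patient had at least 3 years of follow-up.
        return False
    return 0 <= (dm_date - surgery).days <= window_days


if __name__ == "__main__":
    # Hypothetical dates, for illustration only.
    print(is_2ydm(date(2010, 1, 1), date(2011, 6, 1)))
    print(is_2ydm(date(2010, 1, 1), date(2013, 1, 1)))
    print(is_2ydm(date(2010, 1, 1), None))
```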
Feature selection was performed using a 5-fold cross-validation method: the training set was divided into 5 folds, and in each of the 5 configurations 4 folds were used for the analysis while the remaining fold (a unique 20% of the training data set) was used for cross-validation. For each configuration, a univariate analysis was performed using the Wilcoxon Mann-Whitney test if the variable did not show a normal distribution, and the t-test if a normal distribution was observed. Features showing statistical significance (p<0.05) in at least three configurations were selected.
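The "significant in at least three configurations" rule can be sketched as a simple vote over per-configuration p-values. The sketch assumes the univariate p-values have already been computed; the function name, feature names and p-values are hypothetical.

```python
def select_stable_features(p_values_by_config, alpha=0.05, min_configs=3):
    """Keep features with p < alpha in at least `min_configs` configurations."""
    counts = {}
    for config_p in p_values_by_config:
        for name, p in config_p.items():
            if p < alpha:
                counts[name] = counts.get(name, 0) + 1
    return sorted(name for name, n in counts.items() if n >= min_configs)


if __name__ == "__main__":
    # Hypothetical p-values for three features over five configurations.
    p_values = [
        {"f1": 0.01, "f2": 0.20, "f3": 0.04},
        {"f1": 0.03, "f2": 0.04, "f3": 0.30},
        {"f1": 0.20, "f2": 0.60, "f3": 0.01},
        {"f1": 0.02, "f2": 0.07, "f3": 0.09},
        {"f1": 0.04, "f2": 0.03, "f3": 0.02},
    ]
    print(select_stable_features(p_values))
```

In this toy input `f1` is significant four times, `f3` three times and `f2` only twice, so only `f1` and `f3` are kept.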
Correlation analysis among the selected features was then performed in terms of the Pearson correlation coefficient, retaining only those with a correlation below 30%.
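One possible reading of this correlation filter is a greedy pass that keeps a feature only if its Pearson correlation with every feature already kept is below the threshold. The greedy ordering and the use of the absolute value of the coefficient are assumptions, as the text does not specify them; the feature names and values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)  # raises ZeroDivisionError for constant input


def drop_correlated(features, threshold=0.3):
    """Greedy filter (assumed): keep a feature if |r| < threshold vs all kept."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


if __name__ == "__main__":
    # Hypothetical feature vectors, for illustration only.
    features = {
        "f1": [1.0, 2.0, 3.0, 4.0, 5.0],
        "f2": [2.0, 4.0, 6.0, 8.0, 10.5],  # nearly collinear with f1
        "f3": [1.0, -1.0, 0.0, -1.0, 1.0],  # uncorrelated with f1
    }
    print(drop_correlated(features))
```

The result of a greedy filter depends on the order in which features are visited, so a different ordering could retain a different non-collinear subset.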
The final set of features was used to train 15 different classifiers on the 5-fold partitioned training set, repeating the cross-validation 3 times and using accuracy as the target metric. Up-sampling was used to handle the outcome class imbalance. The predictive performance of the trained models was then assessed on the testing data, and the best performing classifier was chosen by maximising the confusion matrix balanced accuracy. R statistical software version 3.4.4 was used for statistical analysis (R Core Team, 2018. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/).
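Up-sampling is commonly implemented as random sampling of the minority class with replacement until the classes are balanced. The sketch below illustrates that idea in Python rather than R; it is an assumption about the general technique, not a reproduction of the routine used in the study, and the function name `upsample` is hypothetical.

```python
import random


def upsample(rows, labels, seed=0):
    """Randomly resample smaller classes with replacement to the largest class size."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Draw the missing number of samples from this class with replacement.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


if __name__ == "__main__":
    # Hypothetical toy data: 2 positive and 8 negative cases.
    rows = list(range(10))
    labels = [1, 1] + [0] * 8
    new_rows, new_labels = upsample(rows, labels)
    print(new_labels.count(1), new_labels.count(0))
```

Resampling is usually applied to the training folds only, so that duplicated cases do not leak into the data used for evaluation.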
Features selection, models training, and validation processes are shown in Figure 2.

RESULTS
From May 2008 to June 2015, out of 580 LARC patients consecutively treated, 213 patients (37%) met the enrolment criteria. Of the others, 186 patients (32%) were excluded because the staging and/or restaging MRI was performed in another institution; 105 patients (18%) because they underwent surgery in another institution; and 76 patients (13%) because of an inadequate follow-up period.
Details of patient characteristics, clinical and treatment features are summarized in Table 1. At a median follow-up of 61 (14-119) months, the 2yDM rate was 17% (Figure 3) and the median OS was 65 months (13.7-118.8).
The Kaplan-Meier estimate for DM is shown in Figure 3.

Feature and Model Selection
For the delta radiomics analysis, the described feature selection strategy selected 216 features in the first of the five cross-validation configurations, 390 in the second, 168 in the third, 190 in the fourth, and 275 in the fifth (9, 16, 7, 8, and 11% of the total number of features, respectively). Overall, 110 features were selected, equal to 4% of the total. The correlation analysis identified 4 non-collinear features (Table 2), which were used to train the 15 classifiers; their results on the testing set are reported in Table 3.

DISCUSSION
Our previous experience confirmed that radiomics models based on staging MRI provide a relevant predictive tool to identify tumor behavior in terms of pCR after nCRT (14-16, 24).
Despite the considerable effort devoted to treatment response prediction, few studies have reported on the relation between radiomics predictors and early distant recurrence (14-16).
In the framework of personalized medicine, a new radiomics approach called delta radiomics is spreading in the scientific literature. It aims to build predictive models by analysing the variation of radiomics features extracted from images acquired before and after treatment, thereby including information on the individual patient's response to treatment.
In this context, some studies have tried to predict tumor behavior considering clinical features (4), or by merging texture analysis features with morphological MRI and histopathological parameters for both staging and post-nCRT MRI (21) or for the staging MRI alone (25).
Liu et al. (26) investigated the role of pre-nCRT MRI radiomics parameters in predicting synchronous DM in 177 rectal cancer patients, reporting an area under the receiver operating characteristic curve (AUC) of 0.827.
Liang et al. (31) analyzed the differences between metastatic and non-metastatic patients using a support vector machine, identifying MRI radiomics features able to predict metachronous liver metastasis in a cohort of 108 patients with an AUC of 0.87. Jeon et al. (32) proposed a nomogram to predict LR, DM and DFS using delta radiomics signatures in 101 patients (67 for model training, 34 for internal validation).
Our study focused on 2yDM prediction in LARC patients, based on a larger retrospective cohort than that reported by Jeon et al.
According to a clinical nomogram based on a pooled analysis (27), ypN stage, ypT stage, surgical procedure and adjuvant chemotherapy (CT) seem to contribute to DM prediction. Furthermore, the role of adjuvant CT is still controversial, with only a small benefit in the high-risk group (33,34).
A stronger predictive model of early systemic disease in LARC patients could help to identify the subset of patients with a higher risk of DM, allowing specific adjuvant treatment to be tailored.
This personalized approach may avoid unnecessary systemic toxicity for patients with a low risk of DM, considering the small contribution of CT in this subset of patients.
On the other hand, treatment intensification, based on a multidrug combination or personalized approaches, could be designed for patients with a high risk of early development of DM.
Using a delta radiomics approach, with a logistic regression classifier, we built a model with a balanced accuracy, accuracy, specificity and sensitivity of 0.785, 0.809, 0.857, and 0.714, respectively.
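For a binary outcome, balanced accuracy is the mean of sensitivity and specificity, so the reported figures can be cross-checked directly. The short sketch below does this with the sensitivity and specificity quoted above; the function name is hypothetical.

```python
def balanced_accuracy(sensitivity, specificity):
    """Balanced accuracy for a binary outcome: mean of the two class recalls."""
    return (sensitivity + specificity) / 2.0


if __name__ == "__main__":
    # Sensitivity and specificity as reported for the logistic regression model.
    ba = balanced_accuracy(0.714, 0.857)
    print(round(ba, 4))  # 0.7855
```

The result, 0.7855, is consistent with the reported balanced accuracy of 0.785. Unlike plain accuracy, this metric weights the DM and non-DM classes equally, which matters when only 17% of patients develop 2yDM.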
There are several limitations in this study. The first is the lack of external validation with an independent dataset of patients, which is mandatory to confirm the applicability of the model to cohorts from other institutions (35). The known variability in MRI acquisition parameters and in the signal obtained from different patients, scanners and protocols poses an additional challenge to the reproducibility of radiomic signatures and represents a source of uncertainty. In fact, although all MRI in this study were acquired using the same protocol and the same MRI scanner, the applicability of this model is tightly linked to the opportunity to conduct an external validation with an independent dataset to confirm the MRI vendor independence of delta radiomics features, as previously confirmed for radiomics features (15). Another limitation is the lack of other prognostic, clinical, histological and genetic endpoints in the analysis, which would allow a multivariate analysis to be performed and a more robust hybrid predictive model to be built. Despite these limitations, this paper shows the relevance of the delta radiomics approach in identifying the subset of patients with a higher risk of 2yDM in a large single-institution cohort.
In conclusion, delta radiomics is a promising imaging biomarker that can estimate the disease's behavior in LARC, predicting the risk of early systemic recurrence. Early diagnosis of aggressive tumors may represent a significant added value in order to offer innovative personalized and tailored treatments, allowing physicians to guide their choices avoiding unjustified toxicity or preferring an intensified treatment when necessary.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

AUTHOR CONTRIBUTIONS
GC, PR-C, JL, CC, CM, DCu, DCa, and LB participated in developing the concept of this manuscript, imaging segmentation, data analysis, researching and writing, manuscript preparation, and approval of the final manuscript draft. ND, EM, AD, BB, RM, VV, and MG participated in researching and writing this manuscript, manuscript preparation, and approval of the final manuscript draft. All authors contributed to the article and approved the submitted version.

FUNDING
The participation of one of the authors was supported by an ESOR-Bracco Research Grant.