MRI Radiomics Signature as a Potential Biomarker for Predicting KRAS Status in Locally Advanced Rectal Cancer Patients

Background and Purpose: Locally advanced rectal cancer (LARC) is a heterogeneous disease, and little is known about the relationship between KRAS status and imaging features. The purpose of this study was to analyze the association between T2 magnetic resonance imaging (MRI) radiomics features and KRAS status in LARC patients.

Material and Methods: Eighty-three patients with KRAS status information and T2 MRI images acquired between May 2012 and September 2019 were included. Least absolute shrinkage and selection operator (LASSO) regression was performed to assess the associations between features and gene status. The patients were divided 7:3 into training and validation sets. The C-index and the average area under the receiver operating characteristic curve (AUC) were used for performance evaluation.

Results: The clinical characteristics of the 83 patients in the KRAS mutant and wild-type cohorts were balanced. Forty-two (50.6%) patients had KRAS mutations, and 41 (49.4%) patients had wild-type KRAS. A total of 253 radiomics features were extracted from the T2-MRI images of LARC patients. One radiomic feature named X.LL_scaled_std, the standard deviation of the scaled wavelet-transformed low-pass channel filter, was selected from the 253 features (P=0.019). The radiomics-based C-index values were 0.801 (95% CI: 0.772-0.830) and 0.703 (95% CI: 0.620-0.786) in the training and validation sets, respectively.

Conclusion: Radiomics features could differentiate KRAS status in LARC patients based on T2-MRI images. Further validation in a larger dataset is necessary.


INTRODUCTION
Colorectal cancer (CRC) is one of the most prevalent cancers worldwide, and locally advanced rectal cancer (LARC) shows strong heterogeneity in real-world medical practice. The best treatment strategy for LARC patients still depends on the findings of further clinical trials.
KRAS mutation status has a strong relationship with the prognosis of CRC patients. In rectal cancer, KRAS mutant (KRAS-mut) patients have a worse prognosis (1), which emphasizes the importance of determining KRAS status for prognostic evaluation and treatment strategy selection. Among metastatic CRC patients, RAS mutation is a negative predictive biomarker for treatment with epidermal growth factor receptor (EGFR) antibody therapies such as cetuximab and panitumumab (2). The role of KRAS status in stage III CRC patients is still being investigated. Earlier studies held that KRAS status was not associated with worse overall survival (OS) or disease-free survival (DFS) (3). As follow-up data have matured and treatments have evolved, more studies are challenging this view, finding that KRAS-mut patients have worse OS and DFS (4,5). Notably, most of these studies were conducted in CRC patients, and the number of patients with KRAS mutations was limited because their main research objective was immune-related biomarkers. As a result, the effect of targeted therapy in LARC patients remains unclear. In the limited clinical trials available, KRAS status was a significant predictor in multivariate analysis, and KRAS-mut patients had a worse response to neoadjuvant radiochemotherapy and worse OS than KRAS wild-type (KRAS-wild) patients (1,6-8). Hence, knowledge of KRAS mutation status is valuable to physicians for predicting patient response to neoadjuvant chemotherapy and prognosis in clinical practice.
Because physicians choose targeted treatment strategies for metastatic CRC patients according to KRAS status, efforts to obtain KRAS status from radiological images have been ongoing for years. To avoid invasive procedures, an increasing number of studies on the relationship between KRAS status and radiological image characteristics have been reported. Over the years, studies of texture features based on computed tomography (CT) (9), positron emission tomography-CT (PET-CT) (10-17) and magnetic resonance imaging (MRI) (18) have assessed the relationships between genetic mutations and imaging findings in CRC and metastatic rectal cancer patients (19). However, the results remain inconsistent and conflicting, and the relative performance of the various radiological technologies remains unknown. Moreover, LARC patients are quite different from metastatic CRC patients in terms of treatment strategies and biological characteristics, especially KRAS status. Therefore, specific studies on LARC patients deserve more attention.
Radiomics is a rapidly developing image acquisition and analysis technology that is used in various kinds of medical evaluations, especially in the diagnosis and prognosis of patients as well as the classification of different genotypes (20-22). As the first study focused on LARC patients, this study aimed to investigate whether MRI radiomics can predict KRAS status in LARC patients.

Patient Profiles
A retrospective study of 83 LARC patients was performed. All patients had undergone an MRI examination of the primary tumor and RAS mutation analysis at our center. The inclusion criteria were as follows: (1) the primary tumor was proven to be rectal adenocarcinoma by biopsy; (2) MRI images could be acquired from our image database; and (3) clinical and treatment information could be acquired from our database. This study was approved by the Institutional Review Board of Fudan University Shanghai Cancer Center.

MRI Image Acquisition
The primary tumor was imaged on a 3.0 Tesla (T) MRI scanner (Signa Horizon, GE Medical Systems, Milwaukee, WI) using a phased-array body coil. The standard imaging protocol consisted of a sagittal T2-weighted (T2W) fast spin-echo image and an oblique axial thin-section T2W image, which was used for contouring the primary tumor.

Radiomic Feature Extraction
Regions of interest (ROIs) were identified on axial thin-section T2WI images and segmented by two experienced radiation oncologists (4 and 7 years of experience) in MIM software. The gross tumor was included in the delineation, and the air inside the rectum was carefully excluded.
The DICOM images and structures were sent to MATLAB (MathWorks Inc.) for radiomics feature calculation and analysis. A total of 253 features were extracted from the ROI images. The features included grey features, texture features, shape features, fractal dimension features, and wavelet features. The detailed algorithms for these features are described in an updated quantitative radiomics standard from Alex (23).
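To make the wavelet feature family concrete, the following minimal Python sketch computes a feature in the spirit of X.LL_scaled_std, the standard deviation of the scaled low-pass (LL) wavelet channel. The study computed its features in MATLAB; the one-level Haar wavelet, the scaling by the largest coefficient magnitude, the function names `haar_ll` and `ll_scaled_std`, and the toy 4x4 patches are all assumptions made for illustration and are not the authors' definition.

```python
from statistics import pstdev


def haar_ll(image):
    """One-level 2D Haar low-pass (LL) channel.

    Each non-overlapping 2x2 block is summed and divided by 2, which is the
    orthonormal Haar low-pass filter applied along rows and then columns.
    """
    rows, cols = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 2.0
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


def ll_scaled_std(image):
    """Population standard deviation of the LL coefficients after scaling.

    Scaling divides by the largest coefficient magnitude (an assumed choice).
    """
    coeffs = [v for row in haar_ll(image) for v in row]
    peak = max(abs(v) for v in coeffs)
    if peak == 0:
        return 0.0
    return pstdev([v / peak for v in coeffs])


# Hypothetical intensity patches: one nearly uniform, one strongly patterned.
homogeneous = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 12, 12],
    [10, 10, 12, 12],
]
heterogeneous = [
    [10, 10, 30, 30],
    [10, 10, 30, 30],
    [30, 30, 10, 10],
    [30, 30, 10, 10],
]

std_homogeneous = ll_scaled_std(homogeneous)
std_heterogeneous = ll_scaled_std(heterogeneous)
print(std_homogeneous, std_heterogeneous)
```

In this toy setting the patterned patch gives the larger value, which matches the general reading of such a feature as a measure of intensity heterogeneity within the ROI.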

Feature Selection and Model Building
Clinical and radiomics features were extracted from the clinical database and the DICOM images of the patients, respectively. For clinical features, the chi-square test was performed to compare the two cohorts defined by KRAS status. For features from T2WI images, least absolute shrinkage and selection operator (LASSO) regression was performed for predictive feature selection and model establishment. LASSO is a widely used method for the dimensionality reduction of high-dimensional data in artificial intelligence research and radiomics studies. The selected radiomics features were combined into a radiomics score (rad-score) by linear regression in the training cohort, and the same formula was applied to the validation cohort for rad-score calculation.
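The selection step described above can be sketched in a few lines of Python. This is a minimal illustration of LASSO by coordinate descent with a squared-error loss, followed by a linear rad-score; it is not the authors' implementation. The toy feature columns, the labels, the penalty `lam = 0.2`, and all function names are hypothetical.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values within [-gamma, gamma] become 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def standardize(columns):
    """Center each feature column and scale it to unit (population) variance."""
    out = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
        out.append([(v - mean) / sd for v in col])
    return out


def lasso_coordinate_descent(columns, y, lam, n_iter=100):
    """Minimize (1/2n)*||y - b0 - X b||^2 + lam*||b||_1 for standardized columns."""
    n = len(y)
    intercept = sum(y) / n
    beta = [0.0] * len(columns)
    residual = [yi - intercept for yi in y]
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, residual)) / n
            new_b = soft_threshold(rho, lam)  # unit variance, so no rescaling
            if new_b != beta[j]:
                delta = new_b - beta[j]
                residual = [r - c * delta for c, r in zip(col, residual)]
                beta[j] = new_b
    return intercept, beta


def rad_score(intercept, beta, sample):
    """Linear combination of the selected features for one patient."""
    return intercept + sum(b * x for b, x in zip(beta, sample))


# Hypothetical data: 8 patients, 3 candidate features, 1 = KRAS-mut, 0 = KRAS-wild.
raw_columns = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, 1, -1, -1, 1, 1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
]
labels = [1, 1, 1, 1, 0, 0, 0, 1]

columns = standardize(raw_columns)
intercept, beta = lasso_coordinate_descent(columns, labels, lam=0.2)
scores = [rad_score(intercept, beta, [col[i] for col in columns])
          for i in range(len(labels))]
print(beta)
print(scores)
```

With this penalty only the first toy feature keeps a non-zero coefficient, mirroring how a sufficiently large penalty can reduce many candidates to a single predictor.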

Statistical Analysis
The distribution of continuous numeric data was assessed by the Shapiro-Wilk test. Continuous numeric data were compared by the Kolmogorov-Smirnov test, and categorical data were compared by the chi-square test. The area under the curve (AUC) was used to describe the predictive accuracy of the model. The patients were divided into a training set and a validation set at a 7:3 ratio, and the concordance index (C-index) was reported for each set. The C-index measures the concordance between the model prediction and the actual condition; for a binary outcome, its value equals the AUC of the receiver operating characteristic (ROC) curve. Decision curve analysis (DCA) was also applied. The best cut-off value was determined by Youden's index. A p-value <0.05 (z-value of 1.96) was considered statistically significant.
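The relationship between the C-index, the AUC, and Youden's index can be checked on a small example. The Python sketch below is an illustration under stated assumptions: the scores and labels are hypothetical, a patient is called positive when the score is at or above the cut-off, and the function names are ours rather than those of any package used in the study.

```python
def c_index_binary(scores, labels):
    """Share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    concordant = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                concordant += 1.0
            elif p == q:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


def roc_points(scores, labels):
    """(cut-off, sensitivity, specificity) for each observed score, high to low."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        points.append((thr, tp / n_pos, 1 - fp / n_neg))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule, starting at (0, 0)."""
    area, prev_fpr, prev_tpr = 0.0, 0.0, 0.0
    for _, sens, spec in points:
        fpr = 1 - spec
        area += (fpr - prev_fpr) * (sens + prev_tpr) / 2
        prev_fpr, prev_tpr = fpr, sens
    return area


def youden_cutoff(points):
    """Cut-off maximizing J = sensitivity + specificity - 1."""
    return max(points, key=lambda p: p[1] + p[2] - 1)


# Hypothetical rad-scores and KRAS labels (1 = mutant, 0 = wild-type).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 1, 0, 0]

points = roc_points(scores, labels)
c_index = c_index_binary(scores, labels)
auc = auc_trapezoid(points)
best_cutoff, best_sens, best_spec = youden_cutoff(points)
print(c_index, auc, best_cutoff, best_sens, best_spec)
```

Both routes give the same number on this toy data, which is why the C-index and the AUC can be reported interchangeably for a binary outcome.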
The packages used in our research were as follows: tableone and MASS for table creation; caret, lattice, dplyr, and glmnet for data analysis and model building; and ggplot2, pROC, and rmda for result visualization and DCA.
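For readers unfamiliar with DCA, the quantity it plots is the net benefit at a chosen threshold probability. The sketch below uses the standard net benefit formula, TP/n - (FP/n) x pt/(1 - pt), in plain Python rather than the rmda package; the predicted probabilities, labels, thresholds, and function names are hypothetical.

```python
def net_benefit(probs, labels, pt):
    """Net benefit of acting on patients whose predicted probability is >= pt."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= pt and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= pt and y == 0)
    return tp / n - fp / n * pt / (1 - pt)


def net_benefit_treat_all(labels, pt):
    """Net benefit of the reference strategy that treats every patient as positive."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * pt / (1 - pt)


# Hypothetical predicted probabilities of KRAS mutation and true labels.
probs = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 1, 0, 0]

nb_model_050 = net_benefit(probs, labels, 0.5)
nb_all_050 = net_benefit_treat_all(labels, 0.5)
nb_model_025 = net_benefit(probs, labels, 0.25)
nb_all_025 = net_benefit_treat_all(labels, 0.25)
print(nb_model_050, nb_all_050, nb_model_025, nb_all_025)
```

A decision curve repeats this calculation over a range of thresholds and compares the model with the treat-all and treat-none (net benefit 0) strategies.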

RESULTS

Patient Characteristics
The summary profile of this research is shown in Figure 1. A total of 83 LARC patients were included in this study. Fifty-one (61.4%) of these patients were male, and the median age was 55 years (range, 29 to 87 years). Among all the patients, 74 (89.2%) were in stage III, and 7 (8.4%) were managed with a watch and wait (W&W) strategy. Seventy-six (91.6%) patients received neoadjuvant chemoradiation therapy, and 71 (87.7%) patients underwent surgery. Regarding mutation status, 41 (49.4%) patients had mutations in the KRAS gene, and 2 (2.4%) patients had mutations in the NRAS and BRAF genes. The detailed characteristics are displayed in Table 1.
The patients were divided into two categories based on KRAS status. For the overall clinical features, no obvious baseline differences were observed between the two cohorts (the details are displayed in Tables 1 and 2).

MR Radiomic Analysis
After LASSO regression, one radiomic predictor was selected from 253 texture features. This feature is listed in Table 3. Figure 2 presents the tuning parameter (λ) and the coefficients of the LASSO regression, as well as the distribution of the selected parameter, X.LL_scaled_std, which is the standard deviation of the scaled wavelet-transformed low-pass channel filter.
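Choosing the tuning parameter λ from cross-validation results usually follows one of two rules: the value with the minimum cross-validated error, or the largest value whose error is within one standard error of that minimum (the 1-SE rule). The Python sketch below illustrates both rules on a hypothetical λ grid; the error values, standard errors, and the function name `select_lambda` are assumptions and do not come from the study.

```python
def select_lambda(lambdas, cv_mean, cv_se):
    """Return (lambda at minimum CV error, largest lambda within 1 SE of it)."""
    i_min = min(range(len(lambdas)), key=lambda i: cv_mean[i])
    limit = cv_mean[i_min] + cv_se[i_min]
    lam_1se = max(lam for lam, err in zip(lambdas, cv_mean) if err <= limit)
    return lambdas[i_min], lam_1se


# Hypothetical cross-validation summary over a small grid of penalties.
lambdas = [0.30, 0.20, 0.10, 0.05, 0.01]
cv_mean = [0.250, 0.236, 0.230, 0.228, 0.235]  # mean CV error per lambda
cv_se = [0.012, 0.011, 0.010, 0.010, 0.011]    # standard error per lambda

lam_min, lam_1se = select_lambda(lambdas, cv_mean, cv_se)
print(lam_min, lam_1se)
```

The 1-SE choice is never smaller than the minimum-error choice, so it yields an equally or more sparse model, which is one common reason for ending up with very few selected features.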

Characteristics of the Patients in the Training and Validation Sets
Based on random selection among the KRAS-mut and KRAS-wild patients, 59 (70%) patients were assigned to the training set, and 24 (30%) patients were assigned to the validation set. In the training set, there was no significant difference in baseline information between the KRAS status cohorts, but some differences related to the curative effect, such as the ypTNM stage, appeared after neoadjuvant chemoradiation therapy. In the
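A 7:3 random split of 83 patients can be sketched as follows. This Python example is only an illustration: the seed, the patient identifiers, and the function name `split_70_30` are hypothetical, and rounding the training size up is an assumption chosen because it reproduces the reported set sizes of 59 and 24. Whether the original split was stratified by KRAS status is not stated, so no stratification is applied here.

```python
import math
import random


def split_70_30(ids, seed=2020):
    """Shuffle the identifiers reproducibly and cut them at 70%, rounded up."""
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_train = math.ceil(0.7 * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


patient_ids = list(range(1, 84))  # 83 hypothetical patient identifiers
train_ids, valid_ids = split_70_30(patient_ids)
print(len(train_ids), len(valid_ids))
```

Fixing the seed keeps the assignment reproducible, so the rad-score formula fitted on the training set is always evaluated on the same held-out patients.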

Model Efficacy in the Training Set and Validation Set
In the training set, the predictive model achieved a C-index of 0.801 (95% confidence interval (CI) 0.772-0.830) based on the radiomic image data of 59 patients. The sensitivity and specificity for differentiating tumors with mutant KRAS status from those with wild-type status were 64% and 85.3%, respectively, at a cut-off value of 0.452. In the validation set, the model achieved a C-index of 0.703 (95% CI 0.620-0.786), as shown in Figure 3. The sensitivity and specificity for differentiation were 43.8% and 100%, respectively, at a cut-off value of 0.365. The detailed information is listed in Table 5. Figure 4 shows that patients with high prediction values had KRAS-mut status based on our prediction.

DISCUSSION
With years of development, the targeted therapy strategy based on KRAS status has changed substantially. According to the treatment recommendation of the European    (27). This finding suggests that the determination of KRAS status is still important in LARC patients. Although the crucial role of KRAS has been reported for years, in current medical practice gene status can be determined only from biopsy samples obtained by colonoscopy or surgery. Our research aims to detect KRAS status using radiomics, providing earlier information on gene status as a noninvasive option for patients.
To explore the value of radiomic features, we chose T2-MRI images for radiomic feature selection. As treatments have evolved, MRI has become a necessary tool for cancer staging. Because MRI has an excellent ability to identify lymph nodes, LARC patients are recommended to undergo MRI examination at first diagnosis to guide neoadjuvant treatment selection (28). In addition to its great accessibility, MRI provides distinct tissue contrast for biological information and tumor border delineation compared with other radiological tools.
We found that X.LL_scaled_std differentiated KRAS status with the best performance. This value describes the standard deviation of the scaled wavelet-transformed low-pass channel filter, and higher values were observed in the KRAS mutant cohort. This deviation, which cannot be detected visually, reflects the heterogeneity of the ROI images. Previous research also revealed higher heterogeneity in KRAS mutant tumor images and additionally identified values describing the shape of the tumor, which were not selected in our research (29). We believe that morphological heterogeneity is strongly related to the image reader and closely related to tumor stage; more research is needed to establish a delineation standard for ROIs, after which the role of shape will become clearer. The performance of our model is comparable to that of other studies based on T2 images in rectal cancer. Our model yielded a C-index of 0.703 (95% CI 0.620-0.786); Cui and colleagues obtained AUCs of 0.682 (95% CI 0.569-0.794) and 0.714 (95% CI 0.602-0.827) in their validation sets (29), and Oh and colleagues reported 0.886 in one dataset (30). Studies based on T2-MRI images have thus shown a similar ability to predict KRAS status, and some other studies have also focused on the same topic.

FIGURE 2 | (A) Texture features were selected by the LASSO regression model. The performance of the radiomics signature was assessed by the ROC curve and C-index. Tuning parameter (λ) selection used ten-fold cross-validation via the minimum criteria. The optimal value was calculated by the minimum criteria and the 1 standard error of the minimum criteria (the 1-SE criteria). A λ of 0.1782 with log(λ) -1.75562 was chosen. (B) A LASSO coefficient profile plot was produced against the log(λ) sequence. In addition, one radiomics feature was selected.
Regarding PET-CT, Pierre et al. assessed the standardized uptake value (SUV), maximum SUV (SUVmax), mean SUV, skewness, SUV standard deviation, and SUV coefficient of variation (SUVcov). Both SUVcov and SUVmax showed an AUC of 0.65 (17). PET-CT is a valuable instrument for demonstrating metabolism, and some studies have reported a relationship between glucose metabolism and RAS status (31). In Pierre's research, SUVmax was the most distinct parameter for KRAS status; in patients with KRAS mutations, SUVmax was more elevated. However, other data did not reveal the same correlation between SUVmax and KRAS status (12,13). SUVcov was also a potential parameter for KRAS recognition in the PET-CT results. Even though baseline SUVcov has shown predictive efficacy for treatment response in neoadjuvant rectal cancer treatment (32), PET-CT parameters as a whole show a low sensitivity and specificity of 0.66 (95% CI 0.60-0.73) and 0.67 (95% CI 0.62-0.72), respectively (14). In summary, PET-CT directly demonstrates tumor metabolism but, based on the current evidence, has not yet revealed a strong relationship between SUV parameters and KRAS status.
In addition to studies on PET-CT, some researchers have also focused on CT images and gene characteristics. Yang et al. (9) used CT-based radiomics signatures to predict gene mutations. In their study, five feature sets were extracted from the primary set that was established for model building: the shape set, the grey-level histogram feature set, the grey-level co-occurrence matrix feature set, the grey-level run-length matrix feature set, and the overall feature set. For the CT-based model, the accuracy in the validation cohort was 0.750 (95% CI, 0.623-0.845), with a sensitivity of 0.686 and a specificity of 0.857. The radiomics value was highly related to genetic mutations, with P<0.001 and an odds ratio (OR) of 11.18 (95% CI, 2.88-43.46) in the validation cohort.
Most of these studies focused on CRC patients, and some focused specifically on rectal cancer. Yang differentiated KRAS status using CT-based radiomics signatures, with an AUC of 0.829 in the validation set (9). Xu used T2-MRI radiomics to predict KRAS status and summarized the KRAS-related features in rectal cancer. The mean values of six texture parameters were significantly higher in the KRAS-mut group than in the KRAS-wild group; the AUC values of the texture features ranged from 0.703 to 0.813, and the decision tree achieved an accuracy of 81.7% (18). However, that study included only 60 patients, and 12% of them were stage IV (M1), so it is limited in sample size and cohort consistency.
LARC patients have specific clinical characteristics, and T2-MRI radiomics features deserve more exploration given the limited number of studies focused on this technology.
Our study also has some limitations. First, external validation needs to be performed in the future to consolidate the results. Second, in addition to radiomics, deep learning and other artificial intelligence technologies could be used in image data analysis and model establishment, which may further improve the results. Third, additional MRI sequences carrying latent biological information, for example, enhanced sequences and DWI, could be acquired for further exploration of KRAS status, which may increase the predictive precision.
To summarize, our study explored the relationship between T2-MRI radiomics and KRAS status in LARC patients. Our results show the value of radiomics in predicting KRAS status before neoadjuvant chemoradiation therapy and provide a noninvasive method to support further targeted therapy strategy selection.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material. Further inquiries can be directed to the corresponding author.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the ethics committee of the Shanghai Cancer Center.