CT Radiomics and Machine-Learning Models for Predicting Tumor-Stroma Ratio in Patients With Pancreatic Ductal Adenocarcinoma

Purpose To develop and validate a machine learning classifier based on multidetector computed tomography (MDCT), for the preoperative prediction of tumor–stroma ratio (TSR) expression in patients with pancreatic ductal adenocarcinoma (PDAC). Materials and Methods In this retrospective study, 227 patients with PDAC underwent an MDCT scan and surgical resection. We quantified the TSR by using hematoxylin and eosin staining and extracted 1409 arterial and portal venous phase radiomics features for each patient, respectively. Moreover, we used the least absolute shrinkage and selection operator logistic regression algorithm to reduce the features. The extreme gradient boosting (XGBoost) was developed using a training set consisting of 167 consecutive patients, admitted between December 2016 and December 2017. The model was validated in 60 consecutive patients, admitted between January 2018 and April 2018. We determined the XGBoost classifier performance based on its discriminative ability, calibration, and clinical utility. Results We observed low and high TSR in 91 (40.09%) and 136 (59.91%) patients, respectively. A log-rank test revealed significantly longer survival for patients in the TSR-low group than those in the TSR-high group. The prediction model revealed good discrimination in the training (area under the curve [AUC]= 0.93) and moderate discrimination in the validation set (AUC= 0.63). While the sensitivity, specificity, accuracy, positive predictive value, and negative predictive value for the training set were 94.06%, 81.82%, 0.89, 0.89, and 0.90, respectively, those for the validation set were 85.71%, 48.00%, 0.70, 0.70, and 0.71, respectively. Conclusions The CT radiomics-based XGBoost classifier provides a potentially valuable noninvasive tool to predict TSR in patients with PDAC and optimize risk stratification.


INTRODUCTION
Pancreatic ductal adenocarcinoma (PDAC) is a challenging disease. Considering all stages of this disease, it has the worst prognosis of all major tumor types in humans, with a five-year survival rate of 9% (1). Surgical resection combined with systemic chemotherapy facilitates the only chance of long-term survival. Moreover, decisions on surgery and adjuvant treatment should be based on an assessment of the tumor stage and surgery-related risks (2,3). However, patients with similar tumor stages based on the TNM categories have extremely different clinical outcomes (4). This necessitates better biomarkers and tools to predict the treatment response and prognosis, optimize risk stratification, and assist clinicians during decision-making.
The interaction between tumor cells and their microenvironment has gained attention in the past decade. The tumor microenvironment (TME) comprises complex mixtures of nontumor cells, which play an important role in tumorigenesis, development, metastasis, and drug resistance (5,6). The tumor stroma promotes tumor progression by producing various nutrients, growth factors, chemokines, and cytokines. Tumorstroma ratio (TSR) refers to the ratio of tumor cells to the surrounding stroma. Furthermore, it is the most popular macroscopic index that evaluates the TME (7). TSR is reportedly an independent prognostic factor for various solid tumors, including breast cancer (8), lung adenocarcinoma (9), gastric cancer (10), colorectal cancer (11), and pancreatic cancer (PC) (12). Therefore, an evaluation of TSR before decision-making contributes to an accurate risk stratification and facilitates accurate individualized treatment (13).
The evaluation of TSR is usually performed on sections of surgical specimens stained with hematoxylin and eosin (H&E). It is determined by the area with the highest proportion of stroma in the most invasive site (14). Therefore, it is difficult to determine the interstitial state without surgery. Clinicians are unable to accurately evaluate TSR through needle biopsies in patients with advanced PC (15). This can be attributed to the small amount of tissue obtained and the spatial heterogeneity of the tumor. Imaging examination can be an effective and noninvasive method to evaluate the microenvironment of PC. Most related studies have explored the correlation between TSR and conventional imaging parameters in patients with PDAC; however, they did not explore their diagnostic performance (16,17). Hence, a non-invasive and repeatable method for preoperative TSR evaluation of PDAC is needed.
Medical images can not only reflect the macroscopic characteristics but also the cellular and molecular characteristics of the tissue. In clinical practice, only one-or two-dimensional information that reflects macroscopic characteristics, such as tumor size, location, and attenuation, can be obtained. Radiomics can transform the imaging data into a high-dimensional feature space and use it to describe the tumor phenotype in depth (18,19). In this study, we used computed tomography (CT) images to extract high-dimensional radiomics features from PDAC, evaluate their relation to PC TSR, and their diagnostic efficacy in patients with PDAC.

Patients
This retrospective single-center cross-sectional study was reviewed and approved by the Biomedical Research Ethics Committee of our institution. The requirement of informed consent from patients was waived by the Institutional Review Board. We obtained the data from consecutive patients who had been treated for PC at our institution between December 2016 and April 2018 ( Figure 1).
The inclusion criteria were as follows (1): patients who had undergone a surgical treatment and (2) patients in whom PDAC had been pathologically confirmed. In contrast, the exclusion criteria were as follows (1): patients who underwent treatment of any type (radiotherapy, chemotherapy, or chemoradiotherapy) before the imaging studies (2), patients who had not been evaluated by contrast-enhanced multidetector computed tomography (MDCT) within a week before the surgery, or (3)

CT Scanning
We performed multiphasic CT with a pancreas-specific protocol using 320-slice multidetector-row CT scanners (Aquilion ONE, Canon Medical Systems, Tokyo, Japan). The CT parameters were as follows: 120 kV; effective mAs, 150; beam collimation, 160×0.5 mm; matrix, 350×350; and gantry rotation time, 0.5 s. We conducted a non-enhanced CT, followed by a dynamic contrast-enhanced CT scan. The scan delay time was determined according to the test bolus. We injected the contrast agent (90-95 mL of 355 mgI/mL iopromide; Ultravist 370, Bayer Schering Pharma, Berlin, Germany) at a rate of 5.5 mL/s with a power injector (Medrad Mark V plus, Bayer, Leverkusen, Germany) via the forearm vein, followed by an injection of 98 mL of normal saline to irrigate the tube. Following the injection, we performed contrast-enhanced CT in the arterial (20-25 s), portal venous (60-70 s), and delayed (110-130 s) phases. The slice thickness/intervals of CT were 0.8/1.0 mm, respectively. The scanning range extended from the level of the diaphragm to that of the pelvis.

Pathological Image Analysis
We standardized the pathological examination and analysis as described previously (20). We sliced the entire specimen into 5-mm thick sections, resulting in 10-35 (average, 24.5 ± 6.7) formalin-fixed paraffin-embedded (FFPE) blocks for each specimen. Subsequently, we cut each FFPE block into 4-µm-thick sections on whole-tissue glass slides measuring 7.6×5.2 cm 2 . Slides stained with H&E were scanned using a Hamamatsu whole slide scanner (NanoZoomer S60, Hamamatsu Healthcare, Japanese) to obtain digitalized whole-mount slide images (DWMSIs), with an average file size of 6.47 GB. Moreover, we could observe DWMSIs using NanoZoomer Digital Pathology view2 software version 2.7.25. The TSR was determined in all patients with available DWMSIs. We semiquantitatively assessed the percentages of epithelial and stromal components using the mean value of medium power fields at 100× magnification of the entire tumor scope on DWMSIs (range, 2-3) with a tumor identified at 200× magnification. The TSR was estimated at 5/5, 6/4, 7/3, 8/2, 9/1. Two senior pathologists with 30 and 20 years of experience in pancreatic pathology independently scored the TSR. They resolved any disagreement by discussion. We had determined "5/5 (1)" as the best cut-off value of TSR for prognosis discrimination. Hence, TSR>1 denoted a low stromal component. In contrast, TSR ≤ 1 indicated a high stromal component.

Radiological Imaging Analysis
We used original cross-sectional arterial and portal venous phase images for the analysis. All images were analyzed by two abdominal radiologists with 30 and 10 years of experience, respectively. They were blinded to the clinical and pathological details. Moreover, the final results were determined by a consensus.
All tumors were evaluated for the following characteristics (1): CT-reported tumor size [i.e., the maximum cross-sectional diameter of the tumor (22)] (2); tumor location: pancreatic head, body and tail (3); pancreatitis identified by the stranding of the peripancreatic fat tissue, ill-defined parenchymal contours, and fluid collections in the peripancreatic region (4); pancreatic duct cut-off and dilation (>3 mm) (5); common bile duct cut-off and dilation (>10 mm) (6); parenchymal atrophy (7); contour abnormality (8); cyst: the presence of pseudocysts and retention cysts; and (9) vascular invasion: an invasion of the common hepatic artery, splenic artery and vein, celiac artery trunk, gastroduodenal artery, superior mesenteric artery and vein, and portal venous vein. The criteria included vessel occlusion, stenosis, or more than half of the perimeter being in contact with the tumor.

Radiomics Workflow
The radiomics workflow included the following stages (1): image segmentation (2), feature extraction, and (3) feature reduction and selection. The detailed method has been described in a previous study ( Figure 2) (23).
We used the draw tool, available in the Editor module of 3D Slicer version 4.8.1 (open-source software; https://www.slicer.org/), to delineate the tumors in multiple slices. We extracted the volume of interest for each patient by stacking the corresponding regions of interest (ROIs), delineated slice-by-slice. Radiomics feature extraction was performed using the open-source Python package Pyradiomics 1.2.0 (http://www.radiomics.io/pyradiomics.html) (24). We used the following two classes of feature extraction methods: original feature and filter class. The latter included the following seven categories: logarithm, exponential, gradient, square, square root, lbp-2D, and wavelet. We extracted a total of 1,409 two and three-dimensional features from primary tumors in the arterial and portal venous phase and classified them into seven groups as follows: (a) first-order statistics, (b) shape features, (c) gray-level cooccurrence matrix features, (d) gray-level dependence matrix features, (e) gray-level run-length matrix features, (f) gray-level size-zone matrix features, and (g) neighborhood gray-zone difference matrix features. Feature selection comprised the following three steps: variance analysis, Spearman correlation analysis, and LASSO logistic regression algorithm. Finally, a radiomics score (rad-score) was calculated for each patient via a linear combination of selected features that were weighted by their respective coefficients.
Two radiologists (readers 1 and 2) performed the ROI segmentation in a blinded fashion to assess the interobserver reliability. Reader 1 repeated the feature extraction twice during a one-week period to evaluate the intraobserver reliability. Moreover, the reader completed the remaining image segmentations. The readout sessions were conducted over two weeks. The inter-and intraobserver reliability were assessed by obtaining the intraclass correlation coefficient (ICC). ICC values >0.75 were selected for the subsequent investigation.

Statistical Analyses
We conducted normal distribution and variance homogeneity tests on all continuous variables. While those with a normal distribution are expressed as the mean and standard deviation, those with nonnormal distributions are expressed as medians and ranges. We evaluated the overall survival (OS). While deaths were set as events, deaths attributed to other causes were set as censored observations. We calculated survival times from the date of diagnosis to the time of death or the end of follow-up (August 1, 2020). Initially, we classified all patients into two groups, namely TSR-low and TSRhigh group. We examined the differences in all variables between the groups. We conducted the student's t-test (normal distribution), Kruskal-Wallis H test (skewed distribution), and chi-square test (categorical variables) to determine the intergroup statistical differences. The rad-scores were subsequently constructed by the least absolute shrinkage and selection operator (Lasso) regression. Moreover, we constructed the prediction model by extreme gradient boosting (XGBoost). XGBoost was performed using R software supplemented with the XGBoost package. The discrimination of the models was evaluated by the receiver operating characteristic (ROC) curves, and the area under the curve (AUC) was calculated concurrently. We assessed the calibration of the model using the calibration curves and Hosmer-Lemeshow test. Furthermore, we grouped the patients according to the prediction results of the XGBoost classifier. Kaplan-Meier estimates were applied to plot the survival curves, and the log-rank test was performed to analyze the differences between the curves. Moreover, we determined the clinical usefulness of the model with a decision-curve analysis by quantifying the net benefit at different threshold probabilities.
A two-tailed p-value <0.05 was considered statistically significant. All analyses were performed using R software (version 3.3.3, The R Foundation for Statistical Computing, Vienna, Austria).

Clinical Characteristics
There were 91 (40.09%) and 136 patients (59.91%) in the TSRlow and TSR-high groups, respectively. However, 44 and 83 patients had died in the TSR-low and TSR-high groups, respectively. The Kaplan-Meier curves of the two groups were significantly distinct (p=0.002) (Figure 3) Among the clinical, pathological, and imaging characteristics, there were significant between-group differences in the T category in the training and validation set, and bile invasion in the training set. Table 1 summarizes the patient characteristics.

Radiomics Analysis
In total, 1,409 radiomics features were extracted from portal-phase CT scans. The interobserver ICCs were good, ranging from 0.79 to 0.89. Likewise, the intraobserver ICCs were also good, ranging from 0.80 to 0.91. However, we excluded the radiomics features that did not significantly differ between the groups or did not show significant correlations with TSR expression. The remaining 25 radiomics features were further reduced using a Lasso logistic regression model. We eventually reduced the radiomics characteristics to 12 features (Figures 4A, B). Moreover, we used the Lasso logistic regression formula to obtain the rad-score ( Table 2). The rad-score was significantly lower (p<0.001) in the TSR-high group (median: 0.24; range: -0.72-1.30) than in the TSRlow group (median: 0.52; range: -0.15-2.16) ( Figure 4C).

Apparent Performance of the XGBoost Classifier
We developed the XGBoost classifier using the rad-score and tumor size. Figure 5 depicts the performance of the prediction model. Forty-nine patients were accurately predicted among 66 patients (74.24%, 49/66) in the TSR-low group, whereas 98 patients (97.03%, 98/101) were accurately predicted among 101 patients in the TSR-high group using the XGBoost classifier in the training set ( Figure 6A). In contrast, 12 patients were accurately predicted among 25 patients (48.00%, 12/25) in the TSR-low group, and 29 patients (82.85%, 29/35) were accurately predicted among 35 patients in the TSR-high group using the XGBoost classifier in the validation set. (Figure 6B). The XGBoost classifier predicted an association between TSR and OS (training set: p=0.03, validation set: p=0.04) (Figures 6C, D).

Clinical Utility of the XGBoost Classifier
Figures 7C, D outline the decision curves of the prediction model. The prediction model offered greater benefit than the treat-all-patients as high TSR expression scheme or the treatnone as low TSR expression scheme, with a threshold probability >0.06 in the training set. Moreover, the prediction model offered greater benefits than the aforementioned expression schemes, with a threshold probability between 0.29 and 0.63 in the validation set.

DISCUSSION
This is the first study wherein CT radiomics features were used to evaluate the TSR content in the tumors of patients with PDAC. The TSR in this study was more accurate than the traditional method (1). The area of whole-tissue glass slide was 7.6×5.2 cm 2 , which was different from the traditional slide area (7.6 ×2.6 cm 2 ) (2). We quantified TSR of the whole slide using DWMSIs, which was different with the microscope at a 100x magnification by the pathologists. The survival duration in the TSR-low group was significantly longer than that in the TSR-high group. Furthermore, we predicted the tumor TSR using the XGBoost classifier that incorporates 12 CT radiomics features and the  tumor size. The XGBoost classifier demonstrated favorable discrimination in the training set, but decreased in the validation set.
With the revelation of additional TME mechanisms in determining tumor invasiveness, TSR, as a complete morphological feature of the TME, has been widely confirmed as an independent prognostic factor for various solid cancers. Numerous studies have mentioned that low TSR in gastric cancer, breast cancer, colorectal cancer, and lung cancer, among others, indicate cancer metastasis and poor prognosis (8)(9)(10)(11). However, the effect of stromal content on the prognosis of patients with PC is controversial. According to Shi et al. (15) the median OS in patients with stromal ratio >60% was shorter than that in patients with a relatively lower stromal ratio, thus suggesting that stroma is an adverse factor for patients with PC. Joni et al. (25) believed that TSR has no value in evaluating the prognosis of PDAC; however, more studies have reported on high interstitial content (low TSR) being a protective factor for patients with PC. According to the aforementioned studies (12,17,26) patients with high tumor interstitial density have longer non-recurrence survival and OS than those with low tumor interstitial density. In this study, the low TSR group had longer survival than the high TSR group. Our results supposedly contradict the tumor-promoting effect of tumor stroma. Therefore, an additional evaluation of the stroma components of the entire tumor can clearly clarify the effect of stroma on the prognosis of patients with PDAC (12). In addition, the TSR-high group was associated with higher T categories, consistent with the study by Li et al. (27). The latter reported on larger tumors with poor stroma than those with rich stroma in patients with breast cancer.
Imaging provides a comprehensive view of the entire tumor and can continuously monitor the development of the disease or its response to treatment (18). Therefore, imaging is a better choice than puncture biopsy for the preoperative evaluation of TSR. Eugene J et al. (28) proposed the delta value (defined as the peak change of the peritumor CT in the parenchymal phase of pancreatic enhancement) in an imaging study of PC. Moreover, they observed that the lower the delta value, the more abundant is the stroma content in the tumor. Shi et al. (15) reported a positive correlation between the high strain ratio (SR) obtained by endoscopic ultrasound elastography and the stromal ratio of PC. Philipp et al. (29) used diffusion-weighted magnetic resonance imaging to evaluate PDAC lesions. The diffusion coefficient was negatively correlated with the percentage of tumor stroma. The aforementioned studies only explored the correlation between TSR and conventional imaging parameters in PC. However, they failed to develop the predicted model. In this study, we developed the XGBoost classifier and determined its discrimination ability and  Hence, further large-scale multicenter studies are needed to obtain high-level evidence for the clinical application of the prediction model. In addition, the incorporation of biochemical markers and genetic marker panels into our prediction model could improve its ability to predict TSR in patients with PDAC. Our research had several limitations. First, we obtained all images using the same CT scanner and imaging scheme in this retrospective study. This, in turn, may limit the generalizability of our study findings. Second, we evaluated TSR in the entire section of tumor specimens, which may overestimate TSR compared to the traditional evaluation method. However, the procedure was supposedly more accurate than the traditional method. Third, to obtain the initial results independent of clinical intervention, we excluded patients who received prior chemoradiotherapy. A previous study showed that preoperative neoadjuvant chemotherapy can affect tumor microenvironment (30). In the future, we will include these patients to further explore the effect of chemotherapy on TSR.

CONCLUSION
The CT radiomics-based XGBoost classifier provides a potentially valuable noninvasive tool to predict TSR in patients with PDAC and optimize risk stratification.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Changhai hospital. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.