Predictive Power of a Radiomic Signature Based on 18F-FDG PET/CT Images for EGFR Mutational Status in NSCLC

Radiomics has become an area of interest for tumor characterization in 18F-fluorodeoxyglucose positron emission tomography/computed tomography (18F-FDG PET/CT) imaging. The aim of the present study was to demonstrate how imaging phenotypes are connected to somatic mutations through an integrated analysis of 115 non-small cell lung cancer (NSCLC) patients with somatic mutation testing and engineered computed PET/CT image analytics. A total of 38 radiomic features quantifying tumor morphological, grayscale statistic, and texture features were extracted from the segmented entire-tumor region of interest (ROI) of the primary PET/CT images. An ensemble boosting machine learning scheme was employed for classification, and the least absolute shrinkage and selection operator (LASSO) method was used to select the most predictive radiomic features for the classifiers. A radiomic signature based on both PET and CT radiomic features outperformed individual radiomic features, the PET or CT radiomic signature alone, and the conventional PET parameters, including the maximum standardized uptake value (SUVmax), SUVmean, SUVpeak, metabolic tumor volume (MTV), and total lesion glycolysis (TLG), in discriminating between mutant-type and wild-type epidermal growth factor receptor (EGFR) cases, with an area under the curve (AUC) of 0.805, an accuracy of 80.798%, a sensitivity of 0.826 and a specificity of 0.783. Consistently, a radiomic signature combined with clinical factors exhibited further improved performance in EGFR mutation differentiation in NSCLC. In conclusion, tumor imaging phenotypes that are driven by somatic mutations may be predicted by radiomics based on PET/CT images.


INTRODUCTION
Lung cancer is one of the most frequently diagnosed malignancies worldwide and is the leading cause of cancer-related death, with a 5-year survival rate of only 15% (1). Non-small cell lung cancer (NSCLC) accounts for more than 80% of all primary lung cancers (2). From a genetic perspective, NSCLC is significantly driven by somatic mutations in critical oncogenes such as epidermal growth factor receptor (EGFR) (3). Accordingly, several EGFR tyrosine kinase inhibitors (TKIs) have been developed as small-molecule targeted therapeutic agents for the treatment of NSCLC (4)(5)(6). However, only some groups of patients harboring an EGFR mutation have benefited from EGFR-TKI therapy, even with a high percentage of EGFR expression in NSCLC (7,8). Given the predictive role of EGFR mutational status in the efficacy of EGFR-TKI treatment, identifying EGFR mutational status in advance is crucial for selecting the most effective therapeutic strategy and achieving precision medicine (9). Currently, EGFR mutational status is assessed on tumor tissue obtained by biopsy or surgical resection (10,11). Molecular testing to identify the mutational status may therefore be limited by the invasive procedure, long processing time, tissue sample availability, and sampling error due to tumor heterogeneity. Thus, a non-invasive, direct radiographic method for the early detection of EGFR mutational status is needed.
As a functional imaging modality, non-invasive 18F-fluorodeoxyglucose positron emission tomography/computed tomography (18F-FDG PET/CT) is widely used for diagnosis and staging in oncology, and plays an increasingly important role in the evaluation and management of cancer (12). Concurrently, 18F-FDG PET/CT imaging has been suggested as part of the standard initial regimen for NSCLC patients (13). Different metabolic phenotypes captured in 18F-FDG PET/CT images represent different patterns of glucose metabolism associated with somatic mutations (14,15). As previously reported, activating mutations in EGFR could activate relevant intracellular signaling pathways to enhance tumor glycolysis, resulting in intense 18F-FDG uptake in PET images (15). Previous studies by other groups have also demonstrated a positive correlation between oncogene mutational status and the maximum standardized uptake value (SUVmax) in PET images (16)(17)(18). Nevertheless, there have been conflicting results (19). Although it is a widely accepted semi-quantitative imaging parameter derived from 18F-FDG PET/CT, SUVmax was the main cause of the controversy in these investigations. As a single-pixel value, SUVmax cannot reflect the glucose metabolism of the whole tumor. Metabolic tumor volume (MTV), defined as the volume of tumor tissue with high glycolytic activity, and total lesion glycolysis (TLG) are increasingly recommended as volumetric, quantitative measurements of the tumor that overcome the partial volume effects and statistical bias induced by the use of SUVmax (20,21). Beyond these traditional PET imaging parameters, standard medical images capture more useful information than can be seen with the naked eye (22). In particular, accurate quantification of the spatial relationships between image voxels is helpful for describing the degree of tumor heterogeneity (23).
Fortunately, radiomics, an advanced mathematical approach for quantifying the spatial relationships between image voxels, is now an emerging area of interest in medical imaging (24). The high-throughput extraction of radiomic features from medical images allows for a quantitative assessment of tumor imaging phenotypes to achieve individualized therapy and precision medicine (25). An emerging field that is closely related to radiomics is radiogenomics, which integrates imaging and genomic data to attain a biological interpretation of imaging phenotypes (25). Somatic mutations affect the ability of cells to grow in otherwise non-permissive conditions. For example, a mutation in EGFR (26,27) and/or KRAS (28,29) may induce increased glycolysis and promote glucose consumption via the activation of the Akt signaling pathway, and these alterations in glucose metabolism may be captured and reflected in PET imaging. Based on the assumption that the radiomic metrics extracted from medical images are linked to the molecular profile of the tumor lesion, such as the mutational status of key oncogenes, an increasing number of studies aim to investigate the association between somatic mutations and radiomic features in NSCLC (30)(31)(32)(33). However, the majority of these investigations have focused on texture analysis in CT (34)(35)(36) and/or magnetic resonance imaging (MRI) (37), whereas few PET/CT radiogenomics studies have been conducted (38)(39)(40).
In the present study, we established a radiomic signature based on PET/CT images to reveal the predictive role of PET/CT radiomics in EGFR mutation status. The radiomic signature based on PET/CT images, especially when combined with a clinical model, is believed to be capable of predicting EGFR mutational status as a minimally invasive or non-invasive imaging biomarker that complements molecular testing in the identification of somatic mutational status.

MATERIALS AND METHODS
Study Design and Patient Selection
This retrospective investigation was conducted with the approval of the Tianjin Medical University Cancer Hospital Institutional Ethics Committee. NSCLC patients with a single pulmonary lesion (diameter > 1 cm) who underwent somatic mutation testing and diagnostic 18F-FDG PET/CT imaging prior to any treatment between June 2016 and July 2017 were included in this study. Histological diagnosis of primary NSCLC was confirmed by pathological examination of the pulmonary surgical resection specimen. Patients were excluded if they were pregnant, lactating, or had any previous malignancies. In addition, a total of 25 patients with NSCLC were excluded because the radioactivity accumulated in their lesions, which were mainly composed of ground-glass density components, was too weak to be automatically measured by PET VCAR software (GE Healthcare, USA). Written informed consent was obtained from all patients in this study, and the general clinicopathological characteristics of the eligible patients were collected and are summarized in Table 1. This study was performed in compliance with the Declaration of Helsinki and the relevant ethical guidelines.

Patient Imaging
Briefly, appropriate patient preparation (fasting for at least 6 h) and adequate blood glucose levels (<140 mg/dL) were required before an intravenous injection of 4 MBq/kg of 18F-FDG was administered to all of the included patients. Whole-body PET/CT imaging on a GE Discovery Elite (GE Healthcare, Waukesha, WI, USA) was then performed 60 min after the 18F-FDG injection. Prior to the PET scan, a low-dose CT scan (helical pitch 0.75:1, 5 mm slice thickness, 120 kV and 50-80 mAs) was acquired for anatomical correlation and attenuation correction. A PET emission scan of 2 min per bed position in three-dimensional mode was performed and integrated with the corresponding CT images. After reconstruction via an iterative algorithm, all PET imaging data were converted into SUV units. The SUV was calculated using the formula: [region of interest activity (mCi/mL)]/[injected dose (mCi)/body weight (g)]. All PET images were reviewed in consensus by two experienced PET/CT imaging specialists. The volume of interest (VOI) was determined using an SUV-based isocontour threshold method in commercial software (PET VCAR; GE Healthcare, USA) on a GE Advantage Workstation 4.6 (AW 4.6). SUVmax, SUVmean and SUVpeak were calculated automatically within the VOI. MTV was defined as the tumor volume with 18F-FDG uptake segmented above a threshold SUV of 2.5. If the SUVmax of the primary tumor was lower than the threshold SUV of 2.5, we regarded the MTV of the lesion as 0. TLG was calculated by multiplying MTV by SUVmean.
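The SUV formula and the MTV and TLG definitions above can be illustrated with a minimal Python sketch. It is not the PET VCAR implementation: the function names, the voxel list, and the voxel volume are hypothetical, decay correction is ignored, and SUVmean is assumed here to be averaged over the voxels above the 2.5 threshold. SUVpeak is omitted because its definition is not given in the text.

```python
SUV_THRESHOLD = 2.5  # fixed SUV cutoff used for MTV in the text


def suv(roi_activity_mci_per_ml, injected_dose_mci, body_weight_g):
    """SUV = ROI activity / (injected dose / body weight)."""
    return roi_activity_mci_per_ml / (injected_dose_mci / body_weight_g)


def conventional_parameters(voxel_suvs, voxel_volume_ml, threshold=SUV_THRESHOLD):
    """Return SUVmax, SUVmean, MTV (mL) and TLG for a list of voxel SUVs.

    Assumption: SUVmean is taken over the voxels strictly above the threshold.
    """
    suv_max = max(voxel_suvs)
    above = [v for v in voxel_suvs if v > threshold]
    if not above:
        # No voxel exceeds the threshold (e.g. SUVmax < 2.5): MTV is 0.
        return {"SUVmax": suv_max, "SUVmean": 0.0, "MTV": 0.0, "TLG": 0.0}
    mtv = len(above) * voxel_volume_ml
    suv_mean = sum(above) / len(above)
    return {"SUVmax": suv_max, "SUVmean": suv_mean, "MTV": mtv, "TLG": mtv * suv_mean}


# Hypothetical values, for illustration only.
example_suv = suv(0.0005, injected_dose_mci=10.0, body_weight_g=70000.0)
lesion = conventional_parameters([1.0, 2.0, 3.0, 4.0, 5.0], voxel_volume_ml=0.5)
faint_lesion = conventional_parameters([1.0, 2.0], voxel_volume_ml=0.5)
```

In this toy example three voxels exceed the threshold, so MTV is 1.5 mL, SUVmean is 4.0 and TLG is 6.0, while the faint lesion receives an MTV of 0.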

Somatic Mutation Assessment
Tissue samples submitted for mutational analysis were obtained through biopsy or surgical resection. Genomic DNA of the tumor specimen was extracted using a microdissection method based on the manufacturer's protocols. The nucleotide sequences encoding the kinase domain (exons 18-24) of EGFR were amplified via a quantitative real-time polymerase chain reaction (qPCR)-based method. The presence of an appropriate PCR product was confirmed by resolving the PCR products on a 2% agarose gel. After purification, the corresponding fragments on the gel were sequenced in both sense and antisense directions using an ABI PRISM 9700 and ABI PRISM 310 Genetic Analyzer (Applied Biosystems, USA). The sequence data were analyzed using SeqScape (Applied Biosystems) and compared with the archived human sequence of EGFR (GenBank accession no. NG_007726.1) to identify the mutation. Of the 115 patients who were tested for somatic mutations, 64 patients were mutant-type of EGFR, whereas 51 patients tested negative for the EGFR mutation (wild-type of EGFR).

Measurement and Extraction of PET/CT Based Radiomic Features
All segmentation was performed by two experienced PET/CT imaging specialists using ImageJ 1.50i software (National Institutes of Health, USA) to manually outline the contour of the region of interest (ROI), which was delineated on PET images using a 42% threshold of SUVmax. Any disagreement was resolved by consensus. All radiomic features were calculated with an existing automated computer program (MATLAB, The MathWorks Inc., USA). Over the segmented tumor region, a set of 38 quantitative radiomic features was extracted for each patient. The 38 features included: (1) morphological features (41,42), such as area, perimeter, diameter, and concavity. Area was defined as the number of pixels in the tumor region; perimeter was determined by counting the number of pixels in the tumor boundary; diameter was determined by counting the maximum number of pixels between any two points; and concavity rate was defined as where δ = (x, y), and their mean values were taken (45,46). All formulas used to calculate the GSS, GLCM, GGCM, and GLDS features are provided in Supplemental Document S1.
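As an illustration of the 42% SUVmax threshold and the pixel-counting definitions of area, perimeter, and diameter given above, the following Python sketch operates on a single 2D slice. It is an assumption-laden toy, not the ImageJ/MATLAB workflow used in the study: the function names and the SUV grid are hypothetical, boundary pixels are taken to be ROI pixels with at least one 4-connected neighbor outside the ROI, diameter is approximated by the largest Euclidean distance between two ROI pixels, and concavity is omitted because its definition is not recoverable from the text.

```python
from itertools import combinations
from math import hypot


def threshold_mask(suv_slice, fraction=0.42):
    """Boolean mask of pixels at or above `fraction` of the slice SUVmax."""
    suv_max = max(max(row) for row in suv_slice)
    cutoff = fraction * suv_max
    return [[value >= cutoff for value in row] for row in suv_slice]


def morphological_features(mask):
    """Area, perimeter and diameter (in pixel units) of a binary ROI mask."""
    rows, cols = len(mask), len(mask[0])
    pixels = [(r, c) for r in range(rows) for c in range(cols) if mask[r][c]]

    def inside(r, c):
        return 0 <= r < rows and 0 <= c < cols and mask[r][c]

    # A boundary pixel has at least one 4-connected neighbor outside the ROI.
    neighbors = ((1, 0), (-1, 0), (0, 1), (0, -1))
    boundary = [
        (r, c)
        for r, c in pixels
        if not all(inside(r + dr, c + dc) for dr, dc in neighbors)
    ]
    # Assumption: diameter is the largest Euclidean distance between ROI pixels.
    diameter = max(
        (hypot(r1 - r2, c1 - c2) for (r1, c1), (r2, c2) in combinations(pixels, 2)),
        default=0.0,
    )
    return {"area": len(pixels), "perimeter": len(boundary), "diameter": diameter}


# Hypothetical 5 x 5 SUV slice with a hot 3 x 3 core.
suv_slice = [
    [1, 1, 1, 1, 1],
    [1, 5, 6, 5, 1],
    [1, 6, 10, 6, 1],
    [1, 5, 6, 5, 1],
    [1, 1, 1, 1, 1],
]
features = morphological_features(threshold_mask(suv_slice))
```

With an SUVmax of 10 the cutoff is 4.2, so the 3 x 3 core forms the ROI: the area is 9 pixels, the perimeter is 8 pixels (every ROI pixel except the center), and the diameter is the diagonal of the core.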
An ensemble boosting machine learning scheme was employed for classification, and the least absolute shrinkage and selection operator (LASSO) method (47) was used to select the most predictive features for the classifiers. LASSO is a regression method that alters the model fitting process by penalizing the absolute size of the coefficients, so that only the top-ranked, most predictive features are retained and the prediction error of the outcome is minimized. Three multivariate radiomic signatures based on PET alone, CT alone and combined PET/CT radiomic features were developed in the present study. A subset of 7 PET radiomic features and 2 CT radiomic features was finally identified and included in the PET/CT radiomic signature. More importantly, several clinical factors, including age, gender, smoking status, clinical stage and lesion location, were also combined with these three radiomic signatures to create the corresponding integrated signatures. The radiomic workflow used in this investigation is presented in Figure 1.
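How LASSO drives the coefficients of uninformative features to exactly zero can be shown with a small coordinate-descent sketch in pure Python. This is not the study's MATLAB code: the function names, the penalty value, and the toy data are hypothetical, and the binary mutation label is treated as a regression target, which is an assumption because the text does not state which LASSO variant was used. The boosting classifier trained on the selected features is not shown.

```python
def standardize(X):
    """Center each column and scale it to unit variance."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    stds = [
        (sum((row[j] - means[j]) ** 2 for row in X) / n) ** 0.5 or 1.0
        for j in range(p)
    ]
    return [[(row[j] - means[j]) / stds[j] for j in range(p)] for row in X]


def soft_threshold(value, penalty):
    """Shrink `value` toward zero by `penalty`; small values become exactly 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, penalty, n_iter=200):
    """Minimize (1/2n)||y - Xb||^2 + penalty * ||b||_1 for centered columns of X."""
    n, p = len(X), len(X[0])
    y_mean = sum(y) / n
    residual = [yi - y_mean for yi in y]  # all coefficients start at zero
    beta = [0.0] * p
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new_beta = soft_threshold(rho, penalty) / col_sq[j]
            delta = new_beta - beta[j]
            if delta:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta


# Hypothetical toy data: column 0 tracks the label, column 1 does not.
raw_features = [[1, 1], [2, 2], [1, 2], [2, 1], [5, 1], [6, 2], [5, 2], [6, 1]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
coefficients = lasso_coordinate_descent(standardize(raw_features), labels, penalty=0.1)
selected = [j for j, b in enumerate(coefficients) if b != 0.0]
```

Here only the first column keeps a non-zero coefficient, so `selected` contains index 0 alone. In practice the penalty would be tuned, for example by cross-validation, rather than fixed by hand.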

Statistical Analyses
Results were expressed as the mean ± standard deviation for quantitative variables, whereas numbers and percentages were used for categorical variables. The Wilcoxon rank-sum test was used to determine whether there was a significant difference in the feature values between EGFR-mutated cases and cases without the EGFR mutation. The correlations between conventional PET-derived parameters and radiomic features based on PET images were evaluated using Spearman's coefficient r. The predictive performance of each feature in classifying patients according to their EGFR mutation status was evaluated and quantified using the area under the curve (AUC) in receiver operating characteristic (ROC) curve analysis. The value of AUC ranged from 0.5 to 1.0, where a value of 0.5 corresponds to a random guess and a value of 1.0 indicates perfect classification. We used Noether's test to determine whether the value of AUC was significantly greater than a random guess (AUC = 0.5). Considering the relatively small sample size included in this study, a 10-fold cross validation was utilized and repeated 10 times.
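The evaluation metrics named in this section can be sketched in a few lines of Python. The AUC is computed through its rank-based (Mann-Whitney) form, which is the same statistic that underlies the Wilcoxon rank-sum test. The function names, scores, and 0.5 decision cutoff are hypothetical; the fold generator is not stratified by mutation status, which is an assumption because the text does not describe how folds were drawn, and Noether's test is not implemented.

```python
import random


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion_metrics(predicted, labels):
    """Sensitivity, specificity and accuracy for binary predictions."""
    pairs = list(zip(predicted, labels))
    tp = sum(1 for p, l in pairs if p == 1 and l == 1)
    tn = sum(1 for p, l in pairs if p == 0 and l == 0)
    fp = sum(1 for p, l in pairs if p == 1 and l == 0)
    fn = sum(1 for p, l in pairs if p == 0 and l == 1)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


def repeated_kfold_indices(n_samples, n_folds=10, n_repeats=10, seed=0):
    """Yield (train, test) index lists for repeated, unstratified k-fold CV."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        for fold in range(n_folds):
            test = order[fold::n_folds]
            held_out = set(test)
            train = [i for i in order if i not in held_out]
            yield train, test


# Hypothetical classifier scores for three EGFR+ (1) and three EGFR- (0) cases.
scores = [0.9, 0.8, 0.4, 0.35, 0.7, 0.2]
labels = [1, 1, 1, 0, 0, 0]
example_auc = auc(scores, labels)
metrics = confusion_metrics([1 if s >= 0.5 else 0 for s in scores], labels)
splits = list(repeated_kfold_indices(115))
```

For the toy scores, 8 of the 9 positive-negative pairs are ordered correctly, giving an AUC of 8/9. With 115 patients, 10 folds repeated 10 times yield 100 train/test splits of 11 or 12 test cases each.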

RESULTS
Comparison of Conventional PET Parameters by EGFR Mutational Status
Of the 115 NSCLC patients with EGFR mutational status results included in the present study, 56% (64/115) harbored an EGFR mutation (EGFR+), whereas 44% (51/115) tested negative for the EGFR mutation (EGFR-; Table 1). To assess the association between conventional PET parameters (SUVmax, SUVmean, SUVpeak, MTV, and TLG) and EGFR mutational status, we first compared the conventional PET values between the mutant-type of EGFR and wild-type of EGFR subgroups, and then conducted ROC analyses to evaluate their performance in distinguishing the EGFR mutation. For EGFR-mutated NSCLC patients, SUVmax, SUVmean, SUVpeak, and TLG were lower than in the EGFR- subgroup, whereas no significant difference in MTV existed between the EGFR+ and EGFR- subgroups (Figure 2A). Using ROC analyses, AUCs were assessed to evaluate the ability of the four significant conventional PET parameters to predict EGFR mutational status in NSCLC. As illustrated in Figure 2B, all three SUV parameters significantly discriminated the mutant-type of EGFR subgroup from the wild-type of EGFR subgroup [AUC = 0.621 (P = 0.026), 0.624 (P = 0.023), and 0.615 (P = 0.035) for SUVmax, SUVmean, and SUVpeak, respectively], but TLG did not exhibit significant predictive power for EGFR mutation status in NSCLC [AUC = 0.597 (P = 0.074)].

Correlation Between Radiomic Features Derived From PET Images and Conventional PET Parameters
To reduce the potential redundancy among the radiomic features extracted in this study, the LASSO feature selection method was adopted to select only the subset of radiomic features that minimized the prediction error of the outcome. The correlations between the identified PET-derived radiomic features and the conventional PET parameters were then determined. As illustrated in Figure 3, among all of the identified PET-derived radiomic features, homogeneity, entropy and contrast were significantly correlated (P < 0.05) with all five of the conventional PET quantitative parameters. Spearman's coefficients between these three radiomic features and the conventional parameters ranged from 0.336 to 0.500 (homogeneity), 0.252 to 0.388 (entropy), and −0.262 to −0.338 (contrast). Gray mean, concavity and ASM had poor or no significant correlations (P > 0.05) with SUVmax, SUVmean, and SUVpeak, whereas the correlations between these three radiomic features and MTV and TLG were stronger (P < 0.05), with Spearman's coefficients from 0.196 to 0.474. By contrast, the radiomic feature correlation was less correlated with MTV and TLG (P > 0.05) than with SUVmax, SUVmean, and SUVpeak (P < 0.05).
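Spearman's coefficient, used for the correlations reported above, is the Pearson correlation of the ranks of the two variables, with tied values sharing their average rank. A minimal Python sketch follows; the function names and the numbers are hypothetical and do not come from the study data, and the P value calculation is not included.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of ranks start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical feature values versus a conventional PET parameter.
feature_values = [1, 2, 3, 4, 5]
pet_parameter = [2, 4, 5, 4, 5]
rho = spearman(feature_values, pet_parameter)
```

Because only ranks enter the calculation, any strictly increasing relationship gives a coefficient of 1 regardless of its shape, which makes the measure suitable for skewed quantities such as MTV and TLG.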

Predictive Power of the PET/CT-Derived Radiomic Signature for EGFR Mutational Status
To evaluate more comprehensively the value of radiomic analysis for predicting EGFR mutational status in NSCLC, we developed three radiomic signatures derived from PET images alone, CT images alone and combined PET/CT images. As shown in Table 2, among the three radiomic signatures, the one based on PET/CT images had the greatest predictive power in discriminating between mutant-type of EGFR and wild-type of EGFR cases, with an AUC of 0.805, an accuracy of 80.798%, a sensitivity of 0.826 and a specificity of 0.783. Meanwhile, the radiomic signature based on the identified PET radiomic features alone (AUC = 0.789) significantly outperformed the radiomic signature based on CT radiomic features alone (AUC = 0.667) in EGFR mutation discrimination in NSCLC. As several clinical parameters, such as gender, smoking status, and histopathological type, have been reported to be informative for EGFR mutational status in NSCLC (49), we combined several clinical parameters (including age, gender, smoking status, clinical stage, and lesion location) with the developed signatures to create integrated models and evaluate how these clinical factors affect the performance of the radiomic signatures. Consistently, the radiomic signatures combined with clinical factors exhibited improved performance, especially the PET/CT radiomic signature, with an AUC of 0.822, an accuracy of 82.652%, a sensitivity of 0.821 and a specificity of 0.823 (Table 3).
As represented in the box plots in Figure 4A, marked differences existed between the EGFR+ subgroup and the EGFR- subgroup for all of the identified radiomic features included in the PET/CT radiomic signature [a total of 7 PET-derived radiomic features (concavity, gray mean, homogeneity, ASM, entropy, contrast, and correlation) and a CT-based radiomic feature called gray span] except for a CT-based radiomic feature called gray mean. The ability of each of these identified radiomic features to predict EGFR mutational status in NSCLC was evaluated by assessing the AUC (Figure 4B). All of the identified PET-derived radiomic features and the CT-based radiomic feature gray span were significantly predictive of EGFR mutational status (P < 0.05). The discriminative power ranged from AUC = 0.609 for concavity to AUC = 0.776 for homogeneity for the radiomic features based on PET images, and from AUC = 0.590 for gray mean (P > 0.05) to AUC = 0.665 (P < 0.05) for gray span for the CT-derived radiomic features. As presented in Figure 4B, all of the PET radiomic features significantly outperformed the three significant conventional PET parameters (SUVmean, SUVmax, and SUVpeak).

DISCUSSION
Due to the high incidence and mortality associated with NSCLC, the early and precise determination of some of the most common somatic mutations, such as EGFR mutational status, will be beneficial in improving lesion differentiation, response prediction and evaluation, and prognostication (7)(8)(9). A growing body of evidence has illustrated that radiomic assessment of the tumor imaging phenotype captured in integrated PET/CT images markedly facilitates tumor management, including differential diagnosis, tumor staging, response evaluation, and survival prediction (25,34,35,38-40). Unfortunately, few studies have focused on evaluating the performance of radiomics derived from PET/CT in somatic mutation prediction for NSCLC patients (30,31,36).
In the present study, the aim was to reveal the association between PET/CT radiomic features and EGFR mutational status and to evaluate their ability to predict mutational status in NSCLC. In general, radiomic signatures based on PET/CT images showed a stronger predictive power for the EGFR mutation than the CT radiomic signature and conventional PET parameters. The results revealed that tumors with EGFR mutations tended to have a more irregular boundary (higher concavity), an overall lower randomness and complexity in the gray-level distribution (higher gray mean, higher ASM and lower entropy) and a higher heterogeneity (lower homogeneity, lower correlation, and higher contrast) in comparison with tumors without an EGFR mutation.
Consistent with the widely accepted notion that tumors with an EGFR mutation are more indolent than tumors without an EGFR mutation (50), the metabolic parameter measurements in our study, such as SUVmax, SUVmean, and SUVpeak, were notably decreased in tumors bearing EGFR mutations compared with those observed in tumors without EGFR mutations. Results from Mak et al. and Na et al. also indicated a decreased SUVmax in NSCLC patients with an EGFR mutation compared with those without an EGFR mutation (51,52), which is in agreement with our results. Conversely, reports from Huang et al. and Ko et al. suggested that a higher SUVmax was more likely to predict the presence of an EGFR mutation (50,53), whereas other previous results found no evident correlation between EGFR mutational status and PET metabolic parameters (54)(55)(56). The discrepancies among these studies may be attributed to patient demographics and ethnicity. Further analyses of larger cohorts in different countries are needed to resolve this issue. The overall unsatisfactory performance of conventional CT and/or conventional PET features in the prediction of EGFR mutation status inspired us to develop superior radiomic indices to ascertain the somatic mutational status. Depending on the imaging modality, texture analysis of heterogeneity through radiomics conveys different meanings. PET-based radiomic analysis reflects the variability of the metabolic phenotype, while CT-based radiomic analysis reflects the distribution pattern of tissue density. Although previous studies of the genotype-phenotype interaction in NSCLC had larger sample sizes and an external validation design (30,31,36), these investigations performed only CT-based radiomics (31,36) or PET-based radiomics alone (26) to discriminate between patients with and without somatic mutations.
To address this gap, we performed a comprehensive radiomic analysis based on combined PET/CT images to evaluate their performance in EGFR mutation prediction in NSCLC. In contrast to previous reports that chose individual radiomic biomarkers to identify somatic mutations in NSCLC (30,36), a radiomic signature that combined multiple radiomic features was established in the present study, as a single radiomic parameter is not sufficient to capture the gross heterogeneity in tumor lesions. The identification of a radiomic signature predictive of EGFR mutational status would be helpful in precision medicine for NSCLC. It was assumed that radiomic features based on PET and CT images were complementary to each other, and that a radiomic signature combining PET and CT radiomic features could substantially improve the predictive power for EGFR mutational status. In our investigation, the PET/CT-derived radiomic signature exhibited a predictive value comparable to that of the radiomic signature based on PET images alone in differentiating the mutant-type of EGFR and wild-type of EGFR subgroups, whereas the radiomic signature based on PET images alone significantly outperformed the radiomic signature based on CT images alone. A further study involving a larger sample size and more extracted radiomic features is required to ascertain whether the PET/CT-derived radiomic signature outperforms the PET-only radiomic signature in EGFR mutation prediction in NSCLC.
Figure 4. (A) Box plots for all 7 of the identified PET-derived radiomic features and the 2 CT-based radiomic features in the mutant-type of EGFR and wild-type of EGFR subgroups. Except for a CT-derived radiomic feature called gray mean, all of the identified radiomic features were significantly different between the mutant-type of EGFR and wild-type of EGFR subgroups. (B) Evaluation of the predictive value of individual identified radiomic features for EGFR mutational status by receiver operating characteristic (ROC) analysis. * indicates that the value of the area under the curve (AUC) was significantly greater than a random guess (AUC = 0.5). As presented, all of the identified individual radiomic features were capable of discriminating EGFR-mutated cases from cases without an EGFR mutation, except for a CT-based radiomic feature called gray mean. In general, the PET-derived individual radiomic features outperformed the conventional PET parameters in distinguishing the mutant-type of EGFR and wild-type of EGFR subgroups. EGFR, epidermal growth factor receptor; PET, positron emission tomography; CT, computed tomography; EGFR+, mutant-type of EGFR; EGFR-, wild-type of EGFR.
Despite the valuable results described above, the present study has several limitations. First, owing to the retrospective nature of the study, the acquisition, reconstruction and delineation settings were not standardized or optimized for the patients included in this investigation. As reported previously, the repeatability of radiomics can be markedly influenced by all of these parameters (33,57-60). Second, partial volume effects resulting from the limited spatial resolution of PET may lead to an underestimation of metabolic measurements in PET images (61), and probably affect the PET-based radiomics for NSCLC patients with relatively small tumor volumes. Furthermore, the lack of respiratory-gated PET/CT imaging (62) may induce image blurring, which consequently leads to relatively poor performance in the quantification of the imaging phenotype. Finally, due to the small sample size of this study, we did not perform a robust external validation using a strict statistical design with independent training and validation cohorts in a large number of patients.

CONCLUSIONS
Tumor imaging phenotypes that are driven by somatic mutations may be quantitatively measured by radiomic features extracted from PET/CT images for NSCLC patients.
Radiomic features outperformed conventional PET parameters in the prediction of EGFR mutational status. PET/CT radiomic signatures combined with clinical factors exhibited further improved performance. More importantly, future investigations should recognize that intra-tumor heterogeneity is a major challenge for studies correlating imaging with genetics, and one that is hard to overcome with current radiogenomic study designs. The recently proposed habitat imaging approach may have the potential to shed light on this issue (63,64).

DATA AVAILABILITY STATEMENT
All datasets generated for this study are included in the manuscript/Supplementary Files.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Tianjin Medical University Cancer Hospital Institutional Ethics Committee. The patients/participants provided their written informed consent to participate in this study.