CT-based radiomic phenotypes of lung adenocarcinoma: a preliminary comparative analysis with targeted next-generation sequencing

Objectives: This study aimed to explore the relationship between computed tomography (CT)-based radiomic phenotypes and genomic profiles, including expression of programmed cell death-ligand 1 (PD-L1) and 10 major genes, such as epidermal growth factor receptor (EGFR), tumor protein 53 (TP53), and Kirsten rat sarcoma viral oncogene (KRAS), in patients with lung adenocarcinoma (LUAD).
Methods: In total, 288 consecutive patients with pathologically confirmed LUAD were enrolled in this retrospective study. Radiomic features were extracted from preoperative CT images, and targeted genomic data were profiled through next-generation sequencing. PD-L1 expression was assessed by immunohistochemistry staining. A total of 1,013 radiomic features were obtained from each patient's CT images, and consensus clustering was used to cluster patients on the basis of these features. Clusters were compared using the chi-square test or Fisher's exact test for categorical data and the Kruskal–Wallis test for continuous data.
Results: The 288 patients were classified by consensus clustering into four radiomic phenotypes: Cluster 1 (n = 11), involving mainly large solid masses with a maximum diameter of 5.1 ± 2.0 cm; Clusters 2 and 3, involving mainly part-solid and solid masses with maximum diameters of 2.1 ± 1.4 cm and 2.1 ± 0.9 cm, respectively; and Cluster 4, involving mostly small ground-glass opacity lesions with a maximum diameter of 1.0 ± 0.9 cm. Differences in maximum diameter, PD-L1 expression, and TP53, EGFR, BRAF, ROS1, and ERBB2 mutations among the four clusters were statistically significant. Regarding targeted therapy and immunotherapy, the EGFR mutation rate was highest in Cluster 2 (73.1%), and the rate of PD-L1-positive expression was highest in Cluster 1 (45.5%).
Conclusion: Our findings provide evidence that CT-based radiomic phenotypes could non-invasively identify LUADs with different molecular characteristics, showing the potential to support personalized treatment decision-making for LUAD patients.


1. Introduction
Lung cancer is the most common malignancy worldwide and the leading cause of cancer-related death (1,2). Non-small cell lung cancer (NSCLC) is the main type of lung cancer, accounting for ∼80-90% of all lung cancers, and lung adenocarcinoma (LUAD) is its primary histologic subtype (3). When LUAD progresses to an inoperable, advanced-stage tumor, systemic chemotherapy has traditionally been the only option. Unfortunately, response rates for platinum-based chemotherapy range only between 20% and 40% (4). Targeted therapy for specific molecular abnormalities and immunotherapy eliciting T-cell immunoreactivity have dramatically improved the survival of some LUAD patients and altered management regimens (5,6). However, only a small proportion of patients with particular molecular characteristics or tumor-immune microenvironments (TIMEs) respond to these therapies (7,8). Therefore, knowledge of these metrics is needed to select patients who would benefit from targeted therapy or immunotherapy. Nevertheless, all of these metrics require invasively obtained tissue specimens and expensive, time-consuming, and labor-intensive laboratory and clinical testing, and because tumor molecular profiles or the TIME can evolve during treatment, this process may need to be repeated. In some clinical scenarios, obtaining tissue specimens is difficult. Moreover, tissue-based biomarkers are subject to sampling error owing to the heterogeneity of LUAD, especially in specimens obtained by biopsy (9). Therefore, it is necessary to find non-invasive surrogate biomarkers.
Radiomics, which extracts a large number of quantitative features from medical images in a high-throughput manner to translate digital images into mineable data, is a promising discipline that bridges imaging and precision medicine (10,11). Previous studies have shown that computed tomography (CT)-based radiomics can decode the molecular or immune characteristics of LUADs (12-14). To the best of our knowledge, only one study has probed the relationship between radiomics and genomic profiles of LUADs (15). The purpose of this study was to develop CT-based radiomic phenotypes using consensus clustering to predict the molecular characteristics and TIME of LUADs and thereby facilitate patient selection for targeted therapies and immunotherapies.
2. Materials and methods

2.1. Patient population
This study was approved by the Institutional Review Board, which waived the informed consent requirement due to the retrospective nature of the study. From January 2018 to December 2021, a total of 378 consecutive patients with LUAD confirmed by surgical pathology were enrolled. The inclusion criteria were as follows: (1) successful retrieval of CT images from the picture archiving and communication system (PACS); ( 2 1). Demographic and clinical data included age, sex, family history, smoking status, carcinoembryonic antigen (CEA), carbohydrate antigen 125 (CA125), carbohydrate antigen 19-9 (CA19-9), clinical stage, and tumor mutational burden (TMB).

FIGURE
Flowchart showing the radiomic image analysis process.

FIGURE
Based on the change in area under the consensus cumulative distribution function (CDF) curve, we observed that clustering separation was optimal at a k-value of 4. This value corresponded to a sharp decrease in the relative change in area under the CDF curve, which suggested that beyond this k-value, further improvements in separability were negligible.

2.2. Immunohistochemistry
Formalin-fixed paraffin-embedded (FFPE) samples from LUADs were sliced at a thickness of 3-4 µm, and immunohistochemistry (IHC) was used to detect the expression of programmed cell death ligand 1 (PD-L1) in the FFPE samples. PD-L1 was tested with the 22C3 pharmDx kit (Dako). With this antibody, only staining of the tumor cell membrane was considered, whereas cytoplasmic staining was ignored. Tumor cells showing any partial or complete linear or granular membrane staining were counted as positive. The tumor proportion score (TPS) was defined as the percentage of tumor cells with PD-L1 membrane staining at any intensity. PD-L1 expression was dichotomized according to the TPS level. The widespread consensus is that a TPS <1% is negative for expression and a TPS ≥1% is positive for expression, with the latter being appropriate for treatment with PD-L1 antibodies (16).
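As a minimal illustration of the dichotomization described above, the following Python sketch computes a TPS from cell counts and applies the 1% cutoff. The function names and the count-based inputs are hypothetical and are not part of the study's workflow; in practice, the TPS is estimated by a pathologist rather than from exact counts.

```python
def tumor_proportion_score(stained_tumor_cells: int, total_tumor_cells: int) -> float:
    """Return the TPS in percent: membrane-stained tumor cells / all tumor cells * 100.

    Both arguments are hypothetical cell counts used only for illustration.
    """
    if total_tumor_cells <= 0:
        raise ValueError("total_tumor_cells must be positive")
    if not 0 <= stained_tumor_cells <= total_tumor_cells:
        raise ValueError("stained_tumor_cells must lie between 0 and total_tumor_cells")
    return 100.0 * stained_tumor_cells / total_tumor_cells


def pd_l1_status(tps_percent: float, cutoff_percent: float = 1.0) -> str:
    """Dichotomize PD-L1 expression: TPS >= cutoff is positive, otherwise negative."""
    return "positive" if tps_percent >= cutoff_percent else "negative"


if __name__ == "__main__":
    tps = tumor_proportion_score(stained_tumor_cells=10, total_tumor_cells=1000)
    print(tps, pd_l1_status(tps))
```

The cutoff is kept as a parameter because other thresholds are also used clinically; the default simply mirrors the 1% rule stated in the text.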
2.3. Targeted NGS and data processing
NGS was performed as previously described (17,18). Tumor DNA and corresponding patient-matched blood DNA were extracted. TMB was defined as the total number of nonsynonymous single-nucleotide or insertion/deletion mutations divided by the length in Mb of the coding region sequenced by each panel (0.98, 1.06, and 1.22 Mb in the 341-, 410-, and 468-gene panels, respectively) (19). The fraction of the genome altered (FGA) was defined as the length of the genome with a log2 copy number variation (gain or loss) >0.2 divided by the size of the genome for which the copy number was profiled (20). A total of 520 genes closely related to cancer mechanisms and targeted therapies were assessed, covering the full exonic regions of 310 genes and hotspot mutation regions (exons, introns, or promoter regions) of 210 genes.
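The two definitions above reduce to simple ratios. The sketch below is illustrative only: the function names, the segment representation (length in base pairs paired with a log2 copy-number ratio), and the example values are assumptions, not the pipeline used in this study.

```python
def tumor_mutational_burden(nonsynonymous_mutations: int, panel_size_mb: float) -> float:
    """TMB in mutations per megabase of sequenced coding region."""
    if panel_size_mb <= 0:
        raise ValueError("panel_size_mb must be positive")
    return nonsynonymous_mutations / panel_size_mb


def fraction_genome_altered(segments, threshold: float = 0.2) -> float:
    """FGA: length of segments with |log2 ratio| > threshold over total profiled length.

    `segments` is a hypothetical list of (length_bp, log2_ratio) tuples.
    """
    total = sum(length for length, _ in segments)
    if total <= 0:
        raise ValueError("segments must cover a positive length")
    altered = sum(length for length, ratio in segments if abs(ratio) > threshold)
    return altered / total


if __name__ == "__main__":
    # Hypothetical tumor: 11 nonsynonymous mutations on a 1.22-Mb panel.
    print(round(tumor_mutational_burden(11, 1.22), 2))
    # Hypothetical copy-number segments: one gain, one neutral, one loss.
    print(fraction_genome_altered([(100, 0.5), (300, 0.1), (100, -0.3)]))
```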

2.4. Non-contrast CT image acquisition
CT examinations were performed using a 128-detector CT scanner (Philips Brilliance iCT, Philips Medical Systems, Best, the Netherlands). The CT parameters were as follows: collimation, 0.625 mm × 128; tube voltage, 120 kV; and tube current, automatically adjusted. All CT images were reconstructed with a slice thickness of 1.0 mm and a gap of 0.5 mm using a lung kernel.

2.5. Assessment of non-contrast CT morphological features
Two radiologists with 5 and 12 years of experience in thoracic radiology reviewed the CT images on our PACS and determined the nodule type [solid, part-solid, or ground-glass opacity (GGO)] in consensus. Both were blinded to the identity and clinical data of each subject. Consensus on the assessment criteria was reached before the CT morphological features were evaluated. A total of 15 CT imaging features were evaluated, including CT location, tumor size, type of nodule, necrosis, vacuole sign, cavity sign, thickened pleura, pleural traction sign, pleural effusion, lymph node enlargement, vascular cluster sign, lobulation, spiculation, calcification, and air bronchograms.
The CT location was categorized as the left upper lobe, left lower lobe, right upper lobe, right middle lobe, or right lower lobe. Nodule types were categorized as GGO (GGO = 100%), part-solid (0% < GGO < 100%), and solid (GGO = 0%) according to the proportion of ground glass. GGO was defined as a hazy increase in attenuation in the lung window setting with preservation of bronchial and vascular markings (21). The tumor size was assessed by the maximum diameter of the nodule. The vacuole sign was defined as a radiolucent area <5 mm in diameter within the tumor, and the cavity sign was defined as a thick-walled cavity with a wall thicker than 3 mm. The two features of pleural invasion were pleural thickening and traction. Enlarged lymph nodes were defined as lymph nodes in the mediastinum with a short axis >10 mm. Lobulation was defined as a shallow wavy contour of the tumor's surface, with the exception of the portion adjacent to the pleura (22). Spiculation was defined as sharp linear projections from the targeted tumor lesion. Calcification on CT images was defined as the presence of high-density material in the tumor. Air bronchogram signs on CT images were defined as small foci or branches of air attenuation within the solid part of the tumor (23).
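Several of the definitions above are threshold rules and can be written down directly. The following sketch encodes the nodule-type rule and the lymph node criterion; the function names and the idea of passing a measured ground-glass fraction are illustrative assumptions, since in this study the features were assessed visually by radiologists.

```python
def nodule_type(ggo_fraction: float) -> str:
    """Classify a nodule from its ground-glass proportion (0.0 to 1.0).

    GGO = 100% -> "GGO"; 0% < GGO < 100% -> "part-solid"; GGO = 0% -> "solid".
    """
    if not 0.0 <= ggo_fraction <= 1.0:
        raise ValueError("ggo_fraction must be between 0 and 1")
    if ggo_fraction == 1.0:
        return "GGO"
    if ggo_fraction == 0.0:
        return "solid"
    return "part-solid"


def is_enlarged_lymph_node(short_axis_mm: float) -> bool:
    """Mediastinal lymph node with a short axis > 10 mm counts as enlarged."""
    return short_axis_mm > 10.0


if __name__ == "__main__":
    for fraction in (1.0, 0.4, 0.0):
        print(fraction, nodule_type(fraction))
    print(is_enlarged_lymph_node(12.0))
```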

2.6. Radiomic feature extraction and consensus clustering
Digital Imaging and Communications in Medicine (DICOM) images were downloaded from PACS and transferred to a personal computer (PC) with ITK-SNAP version 3.6.0-beta (http://www.itksnap.org/) installed. The two radiologists were blinded to all clinical and gene information and used ITK-SNAP to manually delineate LUAD lesions slice by slice, obtaining regions of interest (ROIs) covering the whole tumor in lung window settings. The open-source software automatically reconstructed the three-dimensional volumes of interest (VOIs) of whole tumors. Radiomic features were extracted from the VOIs using the software Pyradiomics. Intraclass correlation coefficients (ICCs) were used to exclude features with low reliability (ICC < 0.75), and the averages of the retained feature values from the two radiologists' segmentations were used for further analysis.
In this study, we utilized a consensus clustering approach to discover intrinsic radiomic subtypes of LUAD (Figure 2). Consensus clustering was performed using the "ConsensusClusterPlus" R package, which carried out unsupervised clustering analysis to identify LUAD subgroups from 1,013 radiomic features without human intervention.
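The reliability filter described in this section can be sketched as follows. The text does not state which ICC form was used, so the example assumes ICC(2,1) (two-way random effects, absolute agreement, single rater); the function names and the toy feature values are likewise hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1), an assumed choice: two-way random effects, absolute agreement.

    `ratings` holds one row per lesion and one column per reader.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    denom = ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    # A zero denominator means the feature is constant; treat it as unreliable.
    return (ms_rows - ms_err) / denom if denom else 0.0


def reliable_feature_means(reader_a, reader_b, threshold=0.75):
    """Keep features with ICC >= threshold and average the two readers' values."""
    kept = {}
    for name in reader_a:
        pairs = list(zip(reader_a[name], reader_b[name]))
        if icc_2_1(pairs) >= threshold:
            kept[name] = [(a + b) / 2 for a, b in pairs]
    return kept


if __name__ == "__main__":
    # Hypothetical feature values for three lesions segmented by two readers.
    reader_a = {"stable": [1.0, 2.0, 3.0], "noisy": [1.0, 2.0, 3.0]}
    reader_b = {"stable": [1.0, 2.0, 3.0], "noisy": [3.0, 1.0, 2.0]}
    print(reliable_feature_means(reader_a, reader_b))
```

Averaging only after filtering keeps unreliable features from entering the clustering step at all, which matches the order of operations described in the text.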
During the clustering process, 80% of the samples were drawn in each of 1,000 resampling iterations. Distances between samples were calculated as Euclidean distances, and k-means was used as the clustering algorithm for reliable subgroup classification (Figure 3). The optimal k-value, which corresponded to the most well-separated and stable clustering, was determined by a sharp decrease in the relative change in area under the consensus cumulative distribution function (CDF) curve. Further improvements in separability beyond this k-value were deemed insignificant. LUADs were grouped into subgroups based on this k-value.
3. Results

3.1. Patient characteristics
Patient characteristics are listed in Table 1. Among the 288 eligible patients, the median age was 58 years (IQR, 48-67 years); 123 (42.7%) were men, and 165 (57.3%) were women. Most patients did not have a positive family history of lung cancer. The majority of patients were current (n = 185 [64.2%]) smokers. The mean TMB value was 3.0 ± 3.7 mutations per megabase (range: 0-31.9). In addition, most of the patients (87.2%) were in an early clinical stage (I-II). Differences in age, sex, and smoking status among the four clusters were statistically significant, as were differences in CEA, clinical stage, and TMB (P < 0.05).

3.2. CT findings of the LUADs
The CT findings of LUADs are listed in Table 2. There were no statistically significant differences in CT location, cavity sign, pleural effusion, or calcification characteristics among the four clusters (P > 0.05) (Figure 4).

Consensus clustering analysis based on radiomic features showed the most significant relative change in area under the consensus cumulative distribution function (CDF) curve at a k-value of 4. The clinicopathological characteristics, genetic profiles, and CT findings of each cluster are listed in Tables 1-3. Cluster 1 (n = 11) mainly comprised large solid masses with a maximum diameter of 5.1 ± 2.0 cm; these cases frequently showed tumor necrosis, thickened pleura, pleural traction, and lymph node enlargement. Almost all nodules in Cluster 1 were accompanied by lobulation. Clusters 2 and 3 were dominated by part-solid and solid masses with maximum diameters of 2.1 ± 1.4 cm and 2.1 ± 0.9 cm, respectively; these tumors were often associated with vacuole signs, vascular cluster signs, and spiculation on CT images. Cluster 4 mostly comprised small ground-glass opacity lesions with a maximum diameter of 1.0 ± 0.9 cm (P < 0.001). The majority of patients in Cluster 1 were in clinical stages III-IV; in Clusters 2-4, patients were mostly in early clinical stages. Differences in TMB, PD-L1 expression, and mutations in EGFR, TP53, ERBB2, BRAF, and ROS1 among the four clusters were statistically significant. Regarding targeted therapies and immunotherapies, the EGFR mutation rate was highest in Cluster 2 (73.1%, 95/130), followed by Cluster 3 (67.8%, 59/87). The rate of PD-L1-positive expression was highest in Cluster 1 (45.5%, 5/11), followed by Cluster 3 (28.7%, 25/87) (Figure 6). The highest TMB was in Cluster 1 (9.1 ± 4.8 mut/Mb, range: 1.99 to 14.96) (Figure 7). Representative cases are shown in Figure 8.
Frontiers in Medicine frontiersin.org

4. Discussion
In the last decade, the treatment of LUADs has shifted from the empirical application of cytotoxic therapy to personalized treatments based on genetic alterations and the TIME. For these treatment strategies, knowledge of targeted genomics and TIME status is needed for patient selection. As both genomic sequencing and IHC require tissue specimens to be obtained through invasive processes, there is a need to find non-invasive surrogate biomarkers to facilitate the clinical translation of personalized medicine for patients with LUAD. In this study, we established imaging phenotypes of LUADs through CT-based radiomic consensus clustering and compared them with clinicopathological metrics and targeted genomic data to guide patient selection. LUADs were grouped into four clusters according to CT-based radiomic features. When all patients were analyzed, Cluster 1 mainly consisted of large solid masses associated with advanced clinical staging (III-IV, 63.6%) and a high frequency of PD-L1 expression and TP53 mutation. On CT, these tumors were more likely to be accompanied by tumor necrosis, thickened pleura, pleural traction, and lymph node enlargement. Demographically, patients in Cluster 1 were mainly men who smoked. Clusters 2 and 3 were dominated by part-solid and solid masses associated with EGFR mutation. These cases were likely to be associated with the vacuole sign, vascular cluster sign, and spiculation on CT images. Cluster 4 mostly consisted of GGOs. Therefore, patients in Cluster 1 might be more responsive to immune checkpoint inhibitor (ICI) treatment, whereas tyrosine kinase inhibitors may be recommended for patients in Clusters 2 and 3.
Although TMB does not correlate with PD-L1 expression in NSCLC, elevated TMB has been associated with a greater likelihood of benefit from immunotherapy (19). Therefore, TMB has the potential to serve as a biomarker to predict response to ICI therapy in NSCLC (19, 24-26). A recent clinical trial identified a TMB of at least 10 mut/Mb as an effective cutoff for predicting the efficacy of immunotherapy, irrespective of the tumor PD-L1 expression level (27). In this study, we observed that the difference in TMB among CT-based radiomic phenotypes was statistically significant. The highest TMB was in Cluster 1 (9.1 ± 4.8 mut/Mb, range: 1.99 to 14.96). This further supports the conclusion that Cluster 1 patients might be more suitable for immunotherapy.
Taking advantage of computer science and artificial intelligence, radiomics has made great progress in oncology, improving diagnosis, staging, prognosis, and treatment response prediction (10,28,29). Previous studies have shown that CT-based radiomics can predict EGFR mutation, the microenvironment, and the response to targeted therapies and immunotherapies in LUAD (13, 30-32). These studies each focused on a particular molecular characteristic to develop an algorithm for a single surrogate biomarker. In clinical scenarios, comprehensive molecular characteristics are required for personalized treatment decision-making. Integration of multiple molecular characteristics might improve the capacity to predict treatment response.
Recently, Perez-Johnston et al. implemented unsupervised learning to build imaging phenotypes of LUADs based on CT radiomics and showed an association between imaging phenotype and genomics (15). Similarly, we identified four phenotypes using CT-based radiomic features of LUADs that correlated with genomic profiles and PD-L1 expression. In the study of Perez-Johnston et al., differences in EGFR and STK11 mutations among clusters were statistically significant, and the EGFR mutation rate was highest in a cluster consisting mainly of sub-solid masses with solid components <10%. We also found that the difference in EGFR mutation prevalence among our clusters was statistically significant. Cluster 2 had the highest EGFR mutation rate (73.1%), followed by Cluster 3 (67.8%). We also noted that the TP53 mutation rate was highest in Cluster 1, which mainly comprised solid masses. These findings are consistent with a previous study reporting that the TP53 mutation rate increased with the growth of the solid component (33). Assoun et al. found that TP53 mutations reflected TMB and were associated with immunotherapy benefits in advanced NSCLC (34). Furthermore, we found that PD-L1 expression was significantly different among clusters, with the highest expression in Cluster 1. Zu et al. analyzed TIME-related indicators in a series of TIME studies. They identified emerging key biomarkers of the TIME, providing new biomarkers to guide precision therapy (35-38). Therefore, the results of this study may provide guidance for targeted therapies and immunotherapies that integrate genomic profiles and the TIME of LUADs.
To the best of our knowledge, this is the first comprehensive study to explore the association of radiomic phenotypes with both the genomic profile and the immune microenvironment of LUADs. Theoretically, targeted therapy might induce rapid tumor death, leading to the release of neoantigens, which in turn affect immune pathways and improve the efficacy of immunotherapy (39, 40). Thus, our findings suggest a possible rationale for combining targeted therapy with immunotherapy. In addition, our findings may inform decisions across various treatment settings.
We also observed that differences in the prevalence of BRAF, ROS1, MET, and ERBB2 mutations among the clusters were statistically significant. Although targeted options for tumors with these alterations remain more limited than those for EGFR-mutant tumors, multiple clinical trials are underway to evaluate the efficacy of targeting these genes in cancers, and this finding might provide guidance for future targeted medicine research.
There were several limitations in this study. First, this was a retrospective study performed at a single institution, and the small sample size might limit the generalizability of the findings. Cluster 1 in particular included only 11 cases. Therefore, a multicenter study including more patients and prospective validation is warranted to improve the model's robustness. Second, our findings and their clinical implications may be difficult to reproduce in a more diverse clinical context, as our study only included patients with lung cancer who underwent tumor resection. The clinical utility of radiomics needs to be further established through rigorous clinical validation studies. Third, only a few patients received targeted therapy or immunotherapy, so the association between the imaging phenotypes and treatment response was not probed. Finally, ROI drawing was manual rather than semi-automatic or automatic, which might be operator-dependent.
In conclusion, CT-based radiomic phenotypes were able to non-invasively identify LUADs with different molecular characteristics, showing the potential to provide treatment decision-making support for clinicians treating patients with LUAD.
Radiogenomics is still at an early stage of research, and future efforts are needed to optimize its methods and standardize its processes.In future clinical environments, integrating radiogenomics into existing workflows may add value to conventional imaging to facilitate personalized medicine in patients with LUAD.

FIGURE
Flowchart of the patient selection process.

FIGURE
Distribution of CT morphological features in the four clusters.

FIGURE
Several gene expression profiles in the four clusters.

Statistical analysis was performed in R version 4.2.2 (R Foundation for Statistical Computing) and SPSS version 23.0. A P-value of <0.05 indicated statistical significance. We varied the number of clusters from 2 to 8 and selected the optimal number of clusters in the training cohort for unsupervised clustering. Clinical metrics, imaging characteristics, and genomic profiles of the final clusters were compared using the chi-square test or Fisher's exact test for categorical data and the Kruskal-Wallis test for continuous data. Continuous data are expressed as the mean ± SD or median (lower and upper quartiles), and categorical data are expressed as frequencies and percentages.
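The search over cluster numbers can be illustrated with a small, pure-Python sketch of resampling-based consensus clustering. It is not the ConsensusClusterPlus implementation used in this study: the function names are hypothetical, the k-means routine uses farthest-point seeding instead of random starts to keep the toy example stable, and the area under the consensus cumulative distribution function (CDF) is obtained from the identity that, for values in [0, 1], it equals one minus their mean.

```python
import math
import random


def kmeans(points, k, rng, n_iter=25):
    """Lloyd's k-means with Euclidean distance; returns one label per point."""
    # Farthest-point seeding (a simplification of the usual random starts).
    centroids = [points[rng.randrange(len(points))]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the previous centroid if a cluster empties
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def consensus_matrix(points, k, n_resamples=200, frac=0.8, seed=0):
    """Fraction of resamples in which each co-sampled pair shares a cluster."""
    rng = random.Random(seed)
    n = len(points)
    m = max(k, round(frac * n))
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    for _ in range(n_resamples):
        idx = rng.sample(range(n), m)
        labels = kmeans([points[i] for i in idx], k, rng)
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                together[i][j] += labels[a] == labels[b]
    return [[together[i][j] / sampled[i][j] if sampled[i][j] else 0.0
             for j in range(n)] for i in range(n)]


def cdf_area(consensus):
    """Area under the empirical CDF of pairwise consensus values on [0, 1]."""
    n = len(consensus)
    values = [consensus[i][j] for i in range(n) for j in range(i + 1, n)]
    return 1.0 - sum(values) / len(values)


def delta_areas(areas):
    """First entry: the area itself; later entries: relative change from k-1 to k."""
    return [areas[0]] + [(b - a) / a if a > 0 else float("inf")
                         for a, b in zip(areas, areas[1:])]


if __name__ == "__main__":
    # Hypothetical two-feature data with two well-separated groups.
    blob = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
    data = blob + [(x + 10.0, y + 10.0) for x, y in blob]
    areas = [cdf_area(consensus_matrix(data, k)) for k in range(2, 6)]
    for k, delta in zip(range(2, 6), delta_areas(areas)):
        print(k, round(delta, 3))
```

In this scheme, the k at which the relative change in area drops sharply is taken as the point beyond which additional clusters add little stability, which is the selection rule described for the study's k-value.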

FIGURE
Clinical pathologic and genomic data for all LUADs. Cluster characteristics were compared using the chi-square test or Fisher's exact test for categorical data and the Kruskal-Wallis test for continuous data. LVI = lymphovascular invasion, VPI = visceral pleural invasion.

FIGURE
Violin plot of TMB among four clusters.

FIGURE
CT images of lesions for radiomic cluster analysis. (A) Cluster . Solid nodule in the right upper lobe (arrow) with a maximum diameter of . cm, predominantly solid histologic subtype, TP53 positive, and a TMB of . mut/Mb. (B) Cluster . Part-solid nodule in the left lower lobe (arrow) with a maximum diameter of . cm, acinar-predominant histologic subtype, EGFR positive, and a TMB of . mut/Mb. (C) Cluster . Part-solid nodule in the right upper lobe (arrow) with a maximum diameter of . cm, solid and micropapillary histologic subtype, ALK positive, and a TMB of . mut/Mb. (D) Cluster . Ground-glass nodule in the left upper lobe (arrow) with a maximum diameter of . cm, ERBB2 and PD-L1 positive, and a TMB of . mut/Mb.
TABLE CT morphological features of patients in the clusters.
TABLE Tumor pathologic characteristics and gene expression.