Pre-operative Prediction of Ki-67 Expression in Various Histological Subtypes of Lung Adenocarcinoma Based on CT Radiomic Features

Purpose: The aims of this study were to combine CT images with Ki-67 expression to distinguish various subtypes of lung adenocarcinoma and to pre-operatively predict the Ki-67 expression level based on CT radiomic features. Methods: Data from 215 patients with 237 pathologically proven lung adenocarcinoma lesions who underwent CT and immunohistochemical Ki-67 examination from January 2019 to April 2021 were retrospectively analyzed. The receiver operating characteristic (ROC) curve was used to identify the Ki-67 cut-off value for differentiating subtypes of lung adenocarcinoma. A chi-square test or t-test was used to analyze the differences in the CT images between the negative expression group (n = 132) and the positive expression group (n = 105), and the risk factors affecting the expression level of Ki-67 were then evaluated. Patients were randomly divided into a training dataset (n = 165) and a validation dataset (n = 72) at a ratio of 7:3. A total of 1,316 quantitative radiomic features were extracted using the Analysis Kinetics (A.K.) software. Radiomic feature selection and construction of the radiomic classifier were performed using least absolute shrinkage and selection operator (LASSO) regression and a logistic regression model. The predictive capacity of the radiomic classifier for the Ki-67 levels was assessed using ROC curves in the training and testing groups. Results: The cut-off value of Ki-67 for distinguishing subtypes of lung adenocarcinoma was 5%. A comparison of clinical data and imaging features between the two groups showed that histopathological subtype and air bronchogram could be used as risk factors to evaluate the expression of Ki-67 in lung adenocarcinoma (p = 0.005 and p = 0.045, respectively). After radiomic feature selection, eight top-class features were used to construct the radiomic model for pre-operative prediction of Ki-67 expression, and the areas under the ROC curves of the training group and the testing group were 0.871 and 0.8, respectively.
Conclusion: Ki-67 expression level with a cut-off value of 5% could be used to differentiate non-invasive lung adenocarcinomas from invasive lung adenocarcinomas. It is feasible and reliable to pre-operatively predict the expression level of Ki-67 in lung adenocarcinomas based on CT radiomic features, as a non-invasive biomarker to predict the degree of malignant invasion of lung adenocarcinoma, and to evaluate the prognosis of the tumor.


INTRODUCTION
Lung adenocarcinoma is the most commonly diagnosed histological subtype of non-small-cell lung cancer (NSCLC), which is the leading cause of cancer-related deaths worldwide (1). In 2011, a new classification system for lung adenocarcinomas was put forward by the International Association for the Study of Lung Cancer (IASLC), American Thoracic Society (ATS), and European Respiratory Society (ERS), in which lung adenocarcinomas are mainly classified as atypical adenomatous hyperplasia (AAH), adenocarcinoma in situ (AIS), minimally invasive adenocarcinoma (MIA), and invasive adenocarcinoma (IAC). Among them, AAH and AIS are pre-invasive lesions (2). An increasing number of treatment methods are available for lung cancer. However, many patients, even those with resectable lung cancer, still have poor prognoses (3). For lung adenocarcinoma, studies have found that, even for patients with complete surgical resection and pathologic stage T1 (pT1) disease, the treatment effects and prognoses may differ significantly (4). There is an urgent need to determine reliable prognostic factors that can predict clinical outcomes and more precisely stratify the group of patients susceptible to poorer outcomes.
Currently, Ki-67 is commonly regarded in clinical practice as a prognostic biomarker of tumor cell proliferation and aggressiveness. It can be used for quantitative analysis of the tumor growth fraction, for the classification of tumors, and for assisting in early diagnosis and the evaluation of therapeutic effects (5). Ki-67 is expressed at all stages of the cell cycle except G0, with the highest expression levels in the G2/M phase. It has been reported that the overall survival (OS) and disease-free survival (DFS) of patients with high Ki-67 expression are shorter than those of patients with low Ki-67 expression (5)(6)(7). Previous studies have identified the Ki-67 labeling index as a strong prognostic biomarker for lung adenocarcinoma (8,9). Yamashita et al. found that Ki-67 can be used as an indicator of recurrence of lung cancer after resection (10), and the level of its positive expression is closely related to the degree of differentiation, lymph node metastasis, and other factors of lung cancer (8,11). The most commonly used method to quantify Ki-67 expression is immunohistochemistry (IHC), which is impractical for the dynamic monitoring of Ki-67 during lung cancer treatment because it is invasive and time-consuming (12). Because of tumor heterogeneity, Ki-67 values vary across regions of a tumor sample, and traditional invasive immunohistochemical methods evaluate only a small biopsy specimen of the tissue and cannot reflect the overall heterogeneity of the tumor (13,14). Therefore, a non-invasive, cost-effective, and comprehensive method for clinical assessment of the Ki-67 expression level is needed.
Radiomics is a recently emerging technique in computational medical imaging. It involves the extraction and analyses of a large number of quantitative imaging features from medical images (15,16). It is different from traditional methods because it converts medical images into mineable high-dimensional data. Radiomics can help support patient diagnosis, prognosis, treatment, and prediction in clinical practice. The relationship between Ki-67 expression level and radiomic features has always been a hot topic. Studies have shown that radiomics can be used to pre-operatively predict the expression level of Ki-67 in breast cancer (17) and adrenal cancer (18). In addition, several studies have shown that the quantitative imaging features from CT can predict Ki-67 levels and subtypes in patients with lung cancer (19,20), but the role of Ki-67 in distinguishing the pathological stages of lung adenocarcinoma remains unclear (12,21). To our knowledge, there have been no studies on the use of CT-based radiomic features to predict Ki-67 expression levels in subtypes of lung adenocarcinoma.
This study aimed to investigate the correlation between Ki-67 expression level and the subtypes of lung adenocarcinoma and to assess whether CT-based radiomic features could serve as non-invasive predictors of the Ki-67 levels in patients with lung adenocarcinoma.

Patient Characteristics
We retrospectively collected data from patients who underwent a chest CT scan, Ki-67 expression level detection, and postoperative pathological confirmation of lung adenocarcinoma at our institute from January 2019 to April 2021. The inclusion criteria were: (1) lung adenocarcinoma confirmed by surgical resection, (2) maximum tumor diameter ≤3 cm, (3) complete clinicopathological data, (4) IHC examination of Ki-67 expression levels, and (5) complete CT images. The exclusion criteria were: (1) maximum tumor diameter >3 cm, (2) radiotherapy, chemotherapy, or both performed before surgery, and (3) incomplete or poor-quality CT images. Our institutional review board approved this retrospective study, and the requirement for informed consent was waived.

Computed Tomography Examination
All patients in this study underwent chest scans on a 64-slice CT scanner (Discovery CT 750 HD, GE Healthcare, Chicago, IL, USA). All CT scans were obtained with the patients in the supine position and holding their breath at the end of a full inspiration.
The scan ranged from above the apex of the lungs to below the level of the diaphragm. The scanning parameters were as follows: tube voltage of 120 or 140 kV, tube current of 200-340 mA, beam pitch of 1.2, pixel resolution of 512 × 512, field of view (FOV) of 360 mm, thickness of 5 mm, and reconstructed slice thickness and slice increment of 1 mm. Afterward, the CT scans were reviewed as lung window images (window width = 1,200 HU; window level = −700 HU) and mediastinal window images (window width = 350 HU; window level = 50 HU). All images were exported in a DICOM format for image feature extraction after scanning.
The CT imaging signs included (22): (1) lesion location, (2) maximum diameter of the tumor on axial images, (3) tumor-lung interface: clear or unclear, (4) density: pure ground-glass opacity, mixed ground-glass opacity, or solid nodule, (5) spiculation, (6) lobulation, (7) bubble-like lucency, (8) air bronchogram sign, (9) vascular sign, and (10) pleural traction. Two diagnostic radiologists with 3 and 9 years of experience reviewed the CT images of each patient and identified positive and negative findings by consensus. The entire process was performed without the radiologists having knowledge of the pathological results.

Immunohistochemical Analysis
Lung tissues were fixed with a 10% buffered formaldehyde solution by transbronchial or transpleural perfusion for ∼48 h and embedded in paraffin wax. Tissue sections were stained with HE. A mouse anti-human Ki-67 monoclonal antibody (Beijing Zhongshan Jinqiao Biotechnology Co., Ltd., Beijing, China) was used to perform the immunohistochemical detection according to the kit instructions. Positive and negative controls were set up. Ki-67 positivity was defined as brown-yellow granules in the nucleus. The number of Ki-67-positive tumor cells was counted in five high-power fields (×400) under a light microscope. The percentage of Ki-67-positive tumor cells in each field = the number of positive tumor cells in the field/total tumor cells in the field × 100%. The Ki-67 indices of the five fields were averaged. Histological and cytological subtypes were assessed according to the WHO classification system for lung cancer (5th version) (23). A threshold on the Ki-67 expression level was used to separate the tumor samples into positive and negative groups: a Ki-67 expression level ≤5% was negative, and >5% was positive (24).
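The per-field percentage, the five-field average, and the 5% threshold described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function names and the cell counts are hypothetical and are not taken from the study.

```python
from statistics import mean


def field_percentage(positive_cells, total_cells):
    """Percentage of Ki-67-positive tumor cells in one high-power field."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    return positive_cells / total_cells * 100.0


def ki67_index(fields):
    """Average the per-field percentages over all counted fields."""
    return mean(field_percentage(p, t) for p, t in fields)


def ki67_group(index, cutoff=5.0):
    """An index <= cutoff is negative; an index > cutoff is positive."""
    return "positive" if index > cutoff else "negative"


# Hypothetical counts (positive cells, total tumor cells) for five fields.
example_fields = [(12, 200), (8, 180), (15, 220), (10, 210), (9, 190)]
example_index = ki67_index(example_fields)
example_group = ki67_group(example_index)
```

With these made-up counts the averaged index is slightly above 5%, so the lesion would fall in the positive group; a lesion averaging exactly 5% would be negative under the stated rule.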

Radiomics Analysis
Image Pre-processing and Image Segmentation
First, the CT images of all patients were exported in DICOM format from the PACS workstations, and the AK software (Analysis Kinetics, V3.2.0, Workbench2014, GE Healthcare) was used to resample the images to 0.5 × 0.5 × 0.5 along the X, Y, and Z axes, respectively. Three-dimensional segmentation of the tumor regions of interest (ROIs) was performed using the ITK-SNAP software (version 3.8, Philadelphia, PA, USA) with the window width and window level set to 1,200 HU and −700 HU, respectively. The ROIs were then outlined, and the outlined image was saved as "Merge.nii." The delineation included tumor necrosis, cystic components, and cavities, and excluded burrs, thickened pleura, and surrounding signs. The delineation was continuous and covered the whole lesion. If the delineations were contradictory, other senior radiologists re-evaluated the tumor mask to reach an agreement.

Radiomic Feature Selection and Classifier Construction
Features with an interobserver intraclass correlation coefficient (ICC) >0.75 were retained. Stratified sampling was used to divide all the patients into a training cohort (n = 165) and a validation cohort (n = 72) at a ratio of 7:3. First, an ANOVA was performed to remove features with p > 0.05, and the most relevant of the remaining radiomic features were selected using recursive feature elimination (RFE). Next, the least absolute shrinkage and selection operator (LASSO) model, which can improve prediction accuracy and interpretability, was used to further select the features. The top-class features were then screened using the Mann-Whitney U test and used to build the final logistic regression classifier; feature selection was performed in the training dataset. Classification performance was evaluated using the area under the receiver operating characteristic curve (AUC). Finally, a radiomic score (Rad score) was developed using the logistic regression model and calculated for the training and validation datasets. A simplified flowchart of the study is given in Figure 1.
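The stratified 7:3 division can be illustrated with a short pure-Python sketch. The text does not say which tool, random seed, or rounding rule was used, so the function `stratified_split` and its round-down-per-class rule are assumptions made for illustration. The class sizes (132 negative, 105 positive) are those reported for the two Ki-67 groups.

```python
import random


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_idx, valid_idx), splitting each class separately."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train_idx, valid_idx = [], []
    for label in sorted(by_class):
        members = by_class[label][:]
        rng.shuffle(members)
        # Assumed rule: round the training share down within each class.
        n_train = int(train_fraction * len(members))
        train_idx.extend(members[:n_train])
        valid_idx.extend(members[n_train:])
    return sorted(train_idx), sorted(valid_idx)


# 0 = Ki-67 negative (n = 132), 1 = Ki-67 positive (n = 105).
labels = [0] * 132 + [1] * 105
train_idx, valid_idx = stratified_split(labels)
n_train_positive = sum(labels[i] for i in train_idx)
```

Under this assumed rounding rule the split yields 165 training and 72 validation cases, which matches the cohort sizes stated above, and it keeps the positive-to-negative proportion nearly identical in both cohorts.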

Clinicopathologic Characteristics of Patients
All statistical analyses were performed with SPSS version 21.0 (IBM Corporation, Armonk, NY, USA). Results are given as mean ± SD or median and range. The chi-square test or Fisher's exact test was used to compare the distributions of categorical variables. Student's t-test or one-way ANOVA was used to compare continuous variables. The Mann-Whitney U test was used for non-parametric data. Binary logistic regression was used to analyze the potential risk factors affecting the Ki-67 expression level. The cut-off value of the labeling index was obtained from the receiver operating characteristic (ROC) curve using the Youden index. Results were considered statistically significant when the p-value was <0.05.
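To make the Youden-index step concrete, the sketch below scans candidate cut-offs and keeps the one that maximizes J = sensitivity + specificity − 1. The study used SPSS; this pure-Python function, its name, and the toy Ki-67 values are illustrative assumptions only.

```python
def youden_cutoff(values, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    A case is called positive when its value is strictly above the cutoff.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes are required")

    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if y == 1 and v > cutoff)
        tn = sum(1 for v, y in zip(values, labels) if y == 0 and v <= cutoff)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical Ki-67 indices (%); 1 = invasive, 0 = non-invasive.
ki67 = [1, 2, 3, 5, 5, 4, 8, 10, 15, 6, 3, 20]
invasive = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
cutoff, j = youden_cutoff(ki67, invasive)
```

The "positive when strictly above the cut-off" convention mirrors the grouping rule used in this study (≤5% negative, >5% positive); statistical packages may instead report a midpoint between adjacent observed values.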

Performance of the Radiomic Prediction Model
To evaluate the performance of the proposed radiomic prediction model, we adopted accuracy, sensitivity, specificity, positive predictive value, and negative predictive value as the evaluation indexes. Furthermore, the ROC curves and the AUCs were calculated to quantitatively assess the predictive capacity of the radiomic classifiers in the training and validation datasets. A p-value of <0.05 was considered statistically significant.
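As a reminder of what the AUC measures, the sketch below computes it as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. This is equivalent to the Mann-Whitney U statistic divided by the number of positive-negative pairs. The function name and the scores are hypothetical and do not come from the study.

```python
def auc_from_scores(scores, labels):
    """AUC as P(score of a random positive > score of a random negative).

    Ties contribute one half, so the result equals the Mann-Whitney U
    statistic divided by (n_pos * n_neg).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities for four positive and four negative cases.
scores = [0.9, 0.8, 0.35, 0.6, 0.4, 0.2, 0.35, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
auc = auc_from_scores(scores, labels)
```

The pairwise loop is quadratic in the number of cases, which is acceptable for cohorts of a few hundred lesions; a rank-based formulation gives the same value more efficiently.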

Characteristics of Study Subjects
The clinicopathologic characteristics of the patients with lung adenocarcinoma are summarized in Table 1. After screening, a total of 215 patients met the requirements. Among them, 19 patients had multiple lesions. There were 68 males (31.63%) and 147 females (68.37%), with a median age of 56 years. One hundred ninety-six patients (91.16%) were non-smokers. Seventeen patients (7.91%) had a history of non-pulmonary tumors: 9 had thyroid cancer, 4 had breast cancer, 1 had cervical cancer, 1 had endometrial squamous cell carcinoma, 1 had colon cancer, and 1 had vocal cord squamous cell carcinoma. In the end, 237 lung adenocarcinoma lesions were selected for our study.

Ki-67 Expression Levels and Histological Subtypes
See Tables 2 and 3 and Figures 3A-C.

Comparison of CT Imaging Signs and Clinical Data of Lung Adenocarcinoma in Ki-67 Negative and Positive Expression Group
In our study, the median expression of Ki-67 was 5%. In addition, the lung adenocarcinomas were divided into a non-invasive adenocarcinoma group (AAH/AIS/MIA group) and an invasive adenocarcinoma group according to the prognosis of the lesions. Lesions classified as MIA were grouped with AAH/AIS because of their good prognosis. Therefore, 5% was selected as the cut-off value for grouping different stages of lung adenocarcinoma. Patients were divided into two groups: 132 (55.7%) patients had negative Ki-67 expression, and 105 (44.3%) patients exhibited positive Ki-67 expression. We first explored whether the imaging signs and clinical data could distinguish between the Ki-67 negative expression group and the Ki-67 positive expression group. The results showed that CT imaging signs (maximum diameter, density, shape, lobulation, spiculation, air bronchogram, vascular sign, and pleural traction) could be used to discriminate between the two groups (p < 0.001). The percentage of lymph node metastasis was higher in the Ki-67 positive expression group than in the Ki-67 negative expression group (p < 0.001) (Table 4). Air bronchogram and histopathological subtype had moderate predictive values, with AUC values of 0.711 and 0.809, respectively (Table 5). The Ki-67 positive expression group was more likely to have air bronchograms than the Ki-67 negative expression group. The histopathological subtype of the Ki-67 positive expression group was more likely to be IAC, while that of the Ki-67 negative expression group was more likely to be a pre-invasive lesion (AAH and AIS) or MIA (Figure 4).

Development and Validation of the Radiomic Prediction Model
In the training set, the expression level of Ki-67 was taken as the dependent variable, and the CT radiomic features of lung adenocarcinoma were used as the independent variables to establish a pre-operative prediction model of the Ki-67 expression level. In the training dataset, the AUC was 0.871, the sensitivity was 76.7%, and the specificity was 83.7%, with a positive predictive value of 0.789. In the testing set, the classifier had an AUC of 0.8, a sensitivity of 68.8%, and a specificity of 80%, with a positive predictive value of 0.733 (Table 7, Figures 6A,B). The calibration curve of the radiomic features also showed that the predicted probability was in good agreement with the actual probability in the training cohort (Figures 6C,D).
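The reported percentages can be cross-checked against confusion-matrix counts. The counts below are not given in the text; they are back-calculated under the assumption that the stratified split placed 73 positive and 92 negative lesions in the training set and 32 positive and 40 negative lesions in the testing set. They are one set of integers consistent with the reported sensitivity, specificity, and positive predictive value, and Table 7 remains the authoritative source.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard diagnostic metrics from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Back-calculated counts (an assumption, not reported in the text).
training = diagnostic_metrics(tp=56, fn=17, tn=77, fp=15)  # 165 lesions
testing = diagnostic_metrics(tp=22, fn=10, tn=32, fp=8)    # 72 lesions

for name, result in (("training", training), ("testing", testing)):
    summary = ", ".join(f"{key}={value:.3f}" for key, value in result.items())
    print(f"{name}: {summary}")
```

Under these assumed counts the accuracies come out at about 80.6% and 75%, in line with the values quoted in the Discussion.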

DISCUSSION
Ki-67 is considered to represent the proliferative state of tumors and is a prognostic biomarker in multiple malignant tumors, such as breast, prostate, and lung cancer (14). Ki-67 has broad prospects in the study of lung cancer, especially the occurrence, development, early diagnosis, and prognosis of ground-glass opacity (GGO) in early lung cancer under low-dose CT scans (25,26). Ki-67 has been widely introduced into clinical practice to differentiate lung cancer subtypes and predict oncological outcomes (26)(27)(28). In our study, we systematically evaluated the expression level of Ki-67 according to the histological subtypes of lung adenocarcinoma and revealed the prognostic role of Ki-67 in lung adenocarcinoma. Strikingly, we found that Ki-67 expression differed across lung adenocarcinoma histological subtypes, with IAC harboring the highest expression level, followed by the MIA, AIS, and AAH subtypes, which is consistent with the findings of Ishida et al. and Yan et al. (29,30). Ki-67 expression levels demonstrated good performance in our study, with AUCs of 0.851, 0.787, and 0.872 for differentiating between MIA and IAC, AAH/AIS and MIA/IAC, and AAH/AIS/MIA and IAC, respectively. The cut-off values derived from the Youden index for the paired groups of pathological subtypes were 5.5, 6.5, and 5.5, respectively. This showed that the Ki-67 values were below 5% for non-invasive adenocarcinomas (AAH/AIS/MIA) and above 5% for invasive adenocarcinomas. Notably, this suggests that Ki-67 expression could be identified as an independent prognostic factor of lung adenocarcinoma (28). The overexpression of Ki-67 implies poor differentiation and prognosis. Hence, accurate pre-operative evaluation of the Ki-67 level may help distinguish the different subtypes in patients with lung adenocarcinoma (31).
For some patients with suspicious lesions who are under follow-up observation and have no indication for surgery or needle biopsy, Ki-67 could serve as a useful predictive biomarker to select suspicious lesions with high proliferation. The early detection of this cancer could improve the cure rate and even prolong overall survival. Several previous studies demonstrated that conventional CT images could provide a non-invasive means of predicting the Ki-67 index in lung adenocarcinoma. Our results showed that the degree of Ki-67 expression was related to nodule diameter, density, spiculation, lobulation, and air bronchogram sign. Moreover, air bronchogram was an independent factor influencing the Ki-67 expression level, and the AUC in the ROC analysis for distinguishing different Ki-67 expression levels was 0.711. This suggests that the CT imaging features of lung adenocarcinoma are related to the expression of Ki-67 (30). Our results are consistent with previous findings (32)(33)(34)(35)(36). Thus, conventional CT examination might indirectly reflect the proliferative activity of lung adenocarcinoma, which is of high value for the early differentiation of positive from negative Ki-67 expression and for facilitating early diagnosis and individualized treatment, thereby improving the survival rate. However, the ability of CT images to predict the Ki-67 index is controversial. Conventional CT provides limited information regarding lung adenocarcinoma grading and cannot replace biopsy and surgery in obtaining specimens for a definitive diagnosis (37).
Radiomics can extract information-rich imaging features with high throughput, which distinguishes it from traditional subjective image interpretation, and can quantify imaging information that the human eye cannot detect (15,38,39). Mathematically, radiomic features have different functions and definitions. Radiomics therefore has a clear advantage in measuring the heterogeneity of tumor texture features (40). Several studies have shown that radiomics is effective in predicting the Ki-67 index in multiple tumors (41). In this study, we established a pre-operative Ki-67 classification model in patients with lung adenocarcinoma using CT-based radiomic features. The results show that eight radiomic features were significantly different between the negative Ki-67 group and the positive Ki-67 group (p < 0.001) (42). The CT-based radiomic predictive model demonstrated a stable and reliable performance, reaching AUCs of 0.871 and 0.8 and accuracies of 80.6% and 75% in the training and testing cohorts, respectively. Therefore, the analysis revealed that CT-based radiomic features could pre-operatively predict Ki-67 levels in patients with lung adenocarcinoma, especially for patients with suspicious lesions under conservative treatment or patients who have lost the opportunity for a biopsy. A preliminary judgment of tumor proliferative activity through radiomic features can improve the accuracy and effectiveness of treatment and could avoid the disease progression and economic loss caused by ineffective treatment, which could have implications for future patient management and aid in the implementation of precision medicine.
Choosing an appropriate Ki-67 cut-off value makes it easier for clinicians to treat and manage patients. However, the published studies reached no consensus on the prognostic value of the Ki-67 expression level, either by disease stage or by histological subtype. In previous studies on lung cancer, the cut-off values used for Ki-67 prediction of prognosis were mostly 25, 30, 40, and 50% (43-45). For stage I lung adenocarcinoma, a cut-off value of 0.1 was commonly used. Ishida suggested that a Ki-67 index of 2.8% might be used as a marker to distinguish between MIA and AIS (29). A cut-off value is often determined based on the median value. In this study, 5% was selected as the classification threshold based on our data characteristics and previous similar studies, and relatively good results were obtained, which indicated that proliferative activity with a Ki-67 expression level of 5% may be a crucial turning point for progression from non-invasive adenocarcinomas (AAH/AIS/MIA) to invasive adenocarcinomas.
Our study had some limitations. First, this study had a small sample size, which may raise concerns regarding selection bias. Moreover, this was a single-center retrospective study. Further studies involving multiple centers and a large number of patients are necessary. Second, the manual outlining of ROIs is time-consuming and labor-intensive, and there are no standardized outlining procedures or rules, which may lead to poor consistency among different radiologists. The automatic recognition of tumor lesions and the characterization of ROIs for feature extraction are among the future research directions. Third, the largest diameter of the lung adenocarcinoma lesions included in this study was <3 cm, so there is a bias in the selection of study subjects, which may affect the results of the study. In the future, how to better mine information to assist clinical decision-making, so that patients can receive more accurate individualized treatments, is both an opportunity and a challenge in the development of radiomics.
In conclusion, Ki-67 expression levels with a cut-off value of 5% could be used to differentiate non-invasive lung adenocarcinomas (AAH/AIS/MIA) from invasive lung adenocarcinomas. The radiomic characteristics of CT have potential as non-invasive biomarkers for predicting Ki-67 levels in patients with lung adenocarcinoma, which might allow for a precise evaluation of tumor biological behavior, aid in clinical treatment decision making for the precise management of patients with lung adenocarcinoma, as well as provide supplemental information for depicting the heterogeneity of lung adenocarcinoma in different histological subtypes.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author/s.

ETHICS STATEMENT
Written informed consent was obtained from the individual(s) for the publication of any potentially identifiable images or data included in this article.