Development and Validation of a Nomogram for Preoperative Prediction of Lymph Node Metastasis in Lung Adenocarcinoma Based on Radiomics Signature and Deep Learning Signature

Background and Purpose The preoperative LN (lymph node) status of patients with LUAD (lung adenocarcinoma) is a key factor in determining whether systematic nodal dissection is required, but it is usually confirmed only after surgery. This study aimed to develop and validate a nomogram for preoperative prediction of LN metastasis in LUAD based on a radiomics signature and deep learning signature.
Materials and Methods This retrospective study included a training cohort of 200 patients, an internal validation cohort of 40 patients, and an external validation cohort of 60 patients. Radiomics features were extracted from conventional CT (computed tomography) images. T-test and Extra-trees were used for feature selection, and the selected features were combined using logistic regression to build the radiomics signature. The features and weights of the last fully connected layer of a CNN (convolutional neural network) were combined to obtain a deep learning signature. By incorporating clinical risk factors, the prediction model was developed using a multivariable logistic regression analysis, based on which the nomogram was developed. The calibration, discrimination, and clinical value of the nomogram were evaluated.
Results Multivariate logistic regression analysis showed that the radiomics signature, deep learning signature, and CT-reported LN status were independent predictors. The prediction model built from all the independent predictors showed good discrimination (C-index, 0.820; 95% CI, 0.762 to 0.879) and calibration (Hosmer-Lemeshow test, P=0.193) for the training cohort. Additionally, the model achieved satisfactory discrimination (C-index, 0.861; 95% CI, 0.769 to 0.954) and calibration (Hosmer-Lemeshow test, P=0.775) when applied to the external validation cohort. Decision curve analysis showed that the nomogram had potential for clinical application.
Conclusions This study presents a prediction model based on radiomics signature, deep learning signature, and CT-reported LN status that can be used to predict preoperative LN metastasis in patients with LUAD.


INTRODUCTION
Lung cancer is the most common cancer worldwide and the leading cause of cancer-related death (1). NSCLC (non-small cell lung cancer) is the most common type of lung cancer, and adenocarcinoma is the most common subtype of NSCLC (2,3). Studies have shown that most cancer patients die of metastasis (4). In lung cancer, lymph node metastasis is the most common route of metastasis (5). In recent decades, SND (systematic nodal dissection), a core method for evaluating nodal involvement at the mediastinal and hilar levels, has been accepted by the IASLC (International Association for the Study of Lung Cancer) as a key component of intrathoracic staging (6). However, for patients without LN metastasis, SND offers no benefit beyond confirming that their pathological state is N0, and therefore amounts to unnecessary invasive treatment. In addition, SND prevents lymphatic fluid in the affected area from draining, which can result in lymphedema. It is therefore important to develop a preoperative, non-invasive, and effective method to predict the extent of LN involvement.
Imaging methods, such as CT and PET (positron emission tomography), are commonly used in clinical LN diagnosis. CT assesses lymph nodes based on their size, but it cannot detect small LN metastases. In PET imaging, LN metastasis usually shows increased FDG (fludeoxyglucose) uptake, but inflammation and infection can also increase uptake. Compared with imaging methods, imaging-guided biopsy has better sensitivity and specificity in identifying LN metastasis, but it may lead to complications such as pneumothorax and bleeding (7-10). In recent years, radiomics has provided alternative approaches to the diagnosis and prognosis of cancer (11-13). Some studies have successfully used radiomics features to predict LN metastasis in lung cancer (14,15). In addition, thanks to advances in computer hardware and algorithms, deep learning has achieved great success in the field of computer vision (16). Deep learning models have been successfully applied to the detection of skin cancer, diabetic retinopathy, breast cancer, and other conditions (17-20). Deep learning has also been studied for the diagnosis of lymph nodes in lung cancer (21,22). However, few studies have used both radiomics and deep learning to predict LN metastasis.
Therefore, the purpose of this study is to develop and validate the effectiveness of a nomogram (23) with a radiomics signature, deep learning signature, and clinical risk factors for the preoperative prediction of LN metastasis in patients with LUAD.

Patients and Data Acquisition
We retrospectively collected the data of 300 patients with LUAD from the Liaoning Cancer Hospital over the period of April 2015 to July 2019. We randomly divided the 300 patients into a training cohort, an internal validation cohort, and an external validation cohort, keeping the proportion of patients with LN metastasis equal across cohorts. The training cohort included 200 patients (99 males and 101 females; mean age, 63.21 ± 6.82 years). The internal validation cohort included 40 patients (18 males and 22 females; mean age, 64.35 ± 6.69 years). The external validation cohort included 60 patients (27 males and 33 females; mean age, 63.18 ± 6.94 years). The baseline clinicopathological data included age, sex, CT-reported LN status, and CEA (carcinoembryonic antigen). However, because CEA data were missing for more than half of the patients, CEA was excluded from the analysis. The inclusion criteria were as follows: (a) the LN status was confirmed by operation and pathology reports, (b) the focus was of the single nodal mass type, (c) the time interval between the CT scan and the operation was no more than 1 month, and (d) the slice thickness of the plain CT scan image was 5 mm. The exclusion criteria were as follows: (a) preoperative radiotherapy or chemotherapy, (b) central lesions in the lung, (c) atelectasis and consolidation, and (d) a history of other tumors. The workflow of the study is illustrated in Figure 1.
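The reported cohort sizes (200, 40, 60) and the 50% LN-metastasis rate in every cohort are consistent with a random split stratified by LN status. The following is a minimal sketch of such a split, not the study's own code; the function name, the seed, and the synthetic label list (150 positive and 150 negative patients, inferred from the 50% rate) are assumptions made for illustration.

```python
import random


def stratified_split(labels, sizes, seed=0):
    """Split patient indices into cohorts of the given sizes while
    keeping the fraction of positive labels the same in every cohort."""
    rng = random.Random(seed)  # the seed is an arbitrary illustrative choice
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    rng.shuffle(pos)
    rng.shuffle(neg)
    pos_rate = len(pos) / len(labels)
    cohorts, p, n = [], 0, 0
    for size in sizes:
        n_pos = round(size * pos_rate)
        n_neg = size - n_pos
        cohorts.append(pos[p:p + n_pos] + neg[n:n + n_neg])
        p += n_pos
        n += n_neg
    return cohorts


# Synthetic labels: 1 = LN metastasis, 0 = no LN metastasis (assumed 150/150).
labels = [1] * 150 + [0] * 150
train, internal, external = stratified_split(labels, sizes=(200, 40, 60))
```

Each returned cohort is a list of patient indices; with the sizes above, the three cohorts contain 100, 20, and 30 positive patients, respectively.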
Before CT scanning, metal foreign bodies were removed from the patient's upper body to avoid artifacts. The patient lay supine with both arms raised above the head and was asked to hold this position. Scanning was performed from the thoracic inlet to the diaphragm while the patient was relaxed. The scanner was a Philips iCT 256 (Netherlands) with the following parameters: tube voltage, 120 kVp; 3D tube current, 110-325 mAs; slice thickness, 5.0 mm; acquisition matrix, 512 × 512. The FOV (field of view) was adjusted according to the patient's body size. The CT-reported LN status was determined by the radiologists based on the clinical radiological report of the preoperative CT. The presence of a regional LN of >1 cm and/or a cluster of ≥3 lymph nodes was scored as LN-positive, and otherwise as LN-negative. The ROI (region of interest) was delineated by the radiologist according to the maximum cross-sectional area of the tumor boundary. CEA was obtained by a routine blood test and laboratory analysis within one week before the operation. A CEA <5 ng/mL was recorded as normal, and otherwise as abnormal.
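The two scoring rules stated above can be written as simple predicates. This is an illustrative sketch only; the function and parameter names are hypothetical, and the text does not specify which nodal diameter is measured, so the size argument is left generic.

```python
def ct_reported_ln_positive(max_ln_size_cm: float, largest_cluster_count: int) -> bool:
    """LN-positive if any regional node exceeds 1 cm or a cluster has 3 or more nodes."""
    return max_ln_size_cm > 1.0 or largest_cluster_count >= 3


def cea_is_abnormal(cea_ng_per_ml: float) -> bool:
    """CEA below 5 ng/mL is recorded as normal; anything else as abnormal."""
    return cea_ng_per_ml >= 5.0
```

Note that a node of exactly 1 cm is not positive under the ">1 cm" rule, whereas a CEA of exactly 5 ng/mL is abnormal under the "<5 ng/mL is normal" rule.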

Statistical Analysis
For determining the differences in the distribution of variables between cohorts, we used Kruskal-Wallis rank sum test to analyze the continuous variables (age, radiomics signature, deep learning signature) and chi-square test to analyze the discrete variables (sex, CT-reported LN status). Furthermore, for determining the correlation between variables and LN status within the cohort, we used Wilcoxon rank sum test to analyze the continuous variables and chi-square test or Fisher exact test to analyze the discrete variables. All the statistical tests in the study were two-sided with a significance level of 0.05.

Building the Radiomics Signature and Deep Learning Signature
Pyradiomics (24) was used to extract features from the ROI. The T-test was used to select features with P <0.05, and Extra-trees was then used to select the most informative of these features in the training cohort. Logistic regression (25) was then used to weight and combine the selected features into the radiomics signature. The selected features and their weighting coefficients were applied unchanged to the internal and external validation cohorts.
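Once the coefficients have been fitted on the training cohort, computing the radiomics signature for a new patient reduces to a weighted sum of the selected feature values. The sketch below assumes the signature is the linear predictor of the logistic regression; the intercept argument, the function name, and the example numbers are hypothetical and are not the study's coefficients.

```python
def radiomics_signature(features, weights, intercept=0.0):
    """Weighted sum of the selected feature values (linear predictor).

    `weights` and `intercept` are assumed to come from a logistic
    regression fitted on the training cohort only.
    """
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return intercept + sum(w * x for w, x in zip(weights, features))


# Made-up values for three features, purely for illustration.
example = radiomics_signature([1.0, 2.0, 0.5], weights=[0.5, -1.0, 2.0])
```

Reusing the same weights for the validation cohorts, rather than refitting, is what keeps those cohorts independent of model fitting.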
VGG-16 was used to build the deep learning signature. We first used each tumor slice and its two adjacent slices as the R, G, and B channels, respectively, and combined them to obtain a three-channel image. An 80 × 80 pixel area containing the tumor was then cropped out as the final input image to VGG-16. Because the amount of data collected was small, we used data augmentation to increase the amount of data and transfer learning to help the model converge. Data augmentation included rotation, horizontal and vertical shifting, horizontal and vertical flipping, cropping, and scaling of the image. Transfer learning involved taking the weights of VGG-16 pretrained on the ImageNet dataset as the initial weights of the model, and then fine-tuning the model using the data of our training and internal validation cohorts. Next, the features of the last fully connected layer of VGG-16 were combined with the corresponding weights and biases to give the feature of each tumor image. The average of this feature over a patient's tumor images was taken as the deep learning signature of the patient.
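The slice grouping and the per-patient averaging described above can be sketched without any deep learning framework. The code below is an illustration under stated assumptions: "adjacent two slices" is read as one slice on each side of the current slice, boundary slices are skipped, and the weights and bias are taken to be those of a single output unit. All names and numbers are hypothetical.

```python
from statistics import mean


def three_channel_triplets(n_slices):
    """Slice indices (previous, current, next) used as (R, G, B).

    Assumption: boundary slices without two neighbours are skipped.
    """
    return [(i - 1, i, i + 1) for i in range(1, n_slices - 1)]


def slice_score(fc_features, fc_weights, fc_bias):
    """Combine the last fully connected layer's features with weights and a bias."""
    return sum(w * f for w, f in zip(fc_weights, fc_features)) + fc_bias


def deep_learning_signature(per_slice_features, fc_weights, fc_bias):
    """Average the per-image score over all tumor images of one patient."""
    return mean(slice_score(f, fc_weights, fc_bias) for f in per_slice_features)


# Two tumor images with two made-up features each.
signature = deep_learning_signature([[1.0, 2.0], [3.0, 4.0]], [0.5, -0.25], 0.1)
```

Collapsing the fully connected features into one scalar per patient is what allows the deep learning output to enter the regression model as a single predictor.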

Development of the Prediction Model
The candidate features of the multivariate logistic regression analysis included age, gender, CT-reported LN status, radiomics signature, and deep learning signature. The Akaike information criterion (26) was used as the stop criterion to determine the best features using a stepwise backward method. Additionally, the prediction model developed using logistic regression for the training cohort was also applied to the internal and external validation cohorts. Then, we developed a nomogram based on the developed prediction model.
FIGURE 1 | The workflow. (A) Traditional radiomics was used to extract artificial pre-defined features from the ROI region, and then the extracted features were selected and weighted to obtain the radiomics signature. (B) CNN was used to extract the automatic learning features from the slice where the ROI was located and then weighted to obtain deep learning signature. (C) Radiomics signature and deep learning signature were used to build the prediction model.
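Backward stepwise selection with the Akaike information criterion can be sketched as follows. This is a generic illustration rather than the study's implementation: `fit_log_likelihood` is a hypothetical callable that would fit a logistic regression on the given predictors and return its maximised log-likelihood, and the toy likelihood used in the example is invented solely so that the sketch runs.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def backward_stepwise(candidates, fit_log_likelihood):
    """Drop one predictor at a time while doing so lowers the AIC."""
    selected = list(candidates)
    best = aic(fit_log_likelihood(selected), len(selected) + 1)  # +1 for the intercept
    while selected:
        trials = []
        for name in selected:
            reduced = [f for f in selected if f != name]
            trials.append((aic(fit_log_likelihood(reduced), len(reduced) + 1), reduced))
        trial_aic, trial_set = min(trials, key=lambda t: t[0])
        if trial_aic >= best:  # no removal improves the AIC: stop
            break
        best, selected = trial_aic, trial_set
    return selected, best


# Invented log-likelihood gains per predictor; NOT the study's data.
toy_gain = {"age": 0.2, "gender": 0.1, "ct_ln_status": 5.0,
            "radiomics_signature": 10.0, "deep_learning_signature": 8.0}


def toy_log_likelihood(features):
    return -100.0 + sum(toy_gain[f] for f in features)


kept, kept_aic = backward_stepwise(list(toy_gain), toy_log_likelihood)
```

With these invented gains, the two weak predictors are removed because the two-point AIC penalty per parameter outweighs their small contribution to the log-likelihood.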

Performance of the Prediction Model
The calibration curve and Hosmer-Lemeshow test (27) were used to assess model calibration. To quantify discrimination, we calculated the C-index of the prediction model, and to compare the performance of the multi-factor model with that of the single-factor models, the NRI (net reclassification improvement) was calculated. In addition, we calculated the NRI under 5-fold cross-validation to obtain more reliable results. Decision curve analysis (28) was used to quantify the net benefit at different threshold probabilities in the external validation cohort to determine the clinical value of the prediction model.
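For a binary outcome, the C-index and the Hosmer-Lemeshow statistic can both be computed directly from the predicted probabilities. The sketch below is a plain-Python illustration under stated assumptions: groups are formed from equal-sized quantiles of predicted risk, the number of groups is even (so that the chi-square tail probability with `groups - 2` degrees of freedom has a closed form), there are at least as many patients as groups, and probabilities lie strictly between 0 and 1. The grouping actually used in the study is not stated.

```python
import math


def c_index(y_true, y_prob):
    """Fraction of (event, non-event) pairs ranked correctly; ties count 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    score = 0.0
    for pp in pos:
        for pn in neg:
            if pp > pn:
                score += 1.0
            elif pp == pn:
                score += 0.5
    return score / (len(pos) * len(neg))


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid only for even df."""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


def hosmer_lemeshow(y_true, y_prob, groups=10):
    """Hosmer-Lemeshow statistic and P value over risk-ordered groups."""
    pairs = sorted(zip(y_prob, y_true), key=lambda t: t[0])
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        n_g = len(chunk)
        observed = sum(y for _, y in chunk)
        expected = sum(p for p, _ in chunk)
        mean_p = expected / n_g
        stat += (observed - expected) ** 2 / (n_g * mean_p * (1 - mean_p))
    return stat, chi2_sf_even_df(stat, groups - 2)


auc = c_index([1, 1, 0, 0], [0.9, 0.6, 0.7, 0.2])  # 3 of 4 pairs concordant
```

A large Hosmer-Lemeshow P value indicates no detectable departure from good calibration, which is how the P values reported for the training and external validation cohorts are read.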

Clinical Characteristics
The characteristics of all the cohorts are listed in Table 1.
Because the data were divided so that each cohort had the same proportion of LN-positive patients, the proportion of patients with LN metastasis was 50% in every cohort (P = 1.000). There were no significant differences among the cohorts in gender (P = 0.763), CT-reported LN status (P = 0.475), or age (P = 0.551), indicating that the cohorts were well balanced. The radiomics signature (P = 0.996) and deep learning signature (P = 0.869) were also similarly distributed across the cohorts (Supplementary Material and Table S1).
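As a worked check, the chi-square comparison of gender across the three cohorts can be recomputed from the male and female counts given in the patient description. The sketch below computes the Pearson statistic for a 2 × 3 table by hand; the helper name is hypothetical, and the closed-form P value used here holds only because this table has 2 degrees of freedom.

```python
import math


def chi_square_2xk(row_a, row_b):
    """Pearson chi-square statistic and degrees of freedom for a 2 x k table."""
    total_a, total_b = sum(row_a), sum(row_b)
    total = total_a + total_b
    stat = 0.0
    for a, b in zip(row_a, row_b):
        col = a + b
        exp_a = total_a * col / total
        exp_b = total_b * col / total
        stat += (a - exp_a) ** 2 / exp_a + (b - exp_b) ** 2 / exp_b
    return stat, len(row_a) - 1


males = [99, 18, 27]     # training, internal validation, external validation
females = [101, 22, 33]
stat, df = chi_square_2xk(males, females)
p_value = math.exp(-stat / 2)  # chi-square upper tail, valid only for df == 2
print(round(stat, 3), df, round(p_value, 3))  # 0.541 2 0.763
```

The result agrees with the reported P = 0.763 for gender.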

Building the Radiomics Signature and Deep Learning Signature
A total of 1288 radiomics features were extracted from the CT images, and the T-test was used to select the 528 features that were statistically significant (P <0.05). Next, Extra-trees was used to further select the 8 most informative features. The first eight features were selected because the eighth feature marked a breakpoint: its importance was markedly higher than that of the ninth feature, and from the ninth feature onward the importance changed only slightly (Figure 2). For the eight selected features, we used logistic regression to perform a weighted summation and obtain the radiomics signature. The distribution of the radiomics signature showed good separability between the metastasis and non-metastasis categories (Table 1).
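The breakpoint rule can be formalised in several ways. One possible reading, sketched below, ranks the features by importance and cuts at the largest drop between consecutive scores. This is an assumption about how the breakpoint might be located automatically, not a description of how it was identified in the study, and the importance values are invented.

```python
def select_until_breakpoint(importances):
    """Return the features ranked above the largest drop in importance.

    `importances` maps feature name -> importance score.
    """
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    drops = [ranked[i][1] - ranked[i + 1][1] for i in range(len(ranked) - 1)]
    cut = max(range(len(drops)), key=drops.__getitem__) + 1
    return [name for name, _ in ranked[:cut]]


# Invented scores: a clear drop separates the third and fourth features.
toy_scores = {"f1": 0.30, "f2": 0.28, "f3": 0.26, "f4": 0.10, "f5": 0.09}
selected = select_until_breakpoint(toy_scores)
```

In practice such a rule would usually be confirmed visually on the importance plot, since the largest drop can also occur after the very first feature.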
The features of the last fully connected layer of VGG-16 were weighted to obtain the deep learning signature. To help users build trust in the VGG-16 predictions, the Grad-CAM (29) method was used to generate heat maps. A heat map shows where in the image the features underlying a prediction are located, and uses color depth to represent their importance: the deeper the color, the more important the features in that region. In this study, the heat maps of two of the filters in the last convolution layer of VGG-16 were plotted. The heat maps suggested that the positive filter focused on the features of metastatic LN and ignored those of non-metastatic LN, whereas the negative filter focused on the features of non-metastatic LN and ignored those of metastatic LN (Figure 3). This indicated that the features extracted by VGG-16 can distinguish metastatic from non-metastatic LN, and the distribution of the deep learning signature further supported this finding (Table 1).

Development of the Prediction Model
Multivariate logistic regression analysis confirmed that radiomics signature, deep learning signature, and CT-reported LN status were independent predictors (Table 2). The model combining the above-mentioned independent predictors was developed and presented in the form of a nomogram (Figure 4). The specific method for estimating the LN metastasis probability is explained in the Supplementary Material (Equation S1).
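Equation S1 is not reproduced in this text. For orientation only, a logistic regression model with these three predictors has the generic form below, where RS is the radiomics signature, DLS is the deep learning signature, CT_LN is the CT-reported LN status coded as 0 or 1, and the coefficients are placeholders whose fitted values are given in the study's tables and supplement rather than here.

```latex
\[
P(\text{LN metastasis}) =
\frac{1}{1 + \exp\!\bigl(-(\beta_0 + \beta_1\,\mathrm{RS} + \beta_2\,\mathrm{DLS} + \beta_3\,\mathrm{CT_{LN}})\bigr)}
\]
```

A nomogram is a graphical rendering of this relationship: each predictor's contribution is converted to points, and the total points are mapped to the predicted probability.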

Performance of the Prediction Model
The calibration curve (  Table S2). This proved that the deep learning signature was helpful for improving the performance of the prediction model.
The decision curve showed that, when the threshold probability for determining the presence of LN metastasis was greater than 0.18, using the nomogram to predict LN metastasis provided a greater net benefit than either the treat-all or the treat-none strategy (Figure 6).
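The net benefit plotted on a decision curve follows the standard definition: true positives minus false positives weighted by the odds of the threshold probability, both divided by the cohort size. The sketch below illustrates that definition with hypothetical function names; the 0.5 prevalence in the example is taken from the 50% LN-metastasis rate reported for the cohorts.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(prevalence, threshold):
    """Net benefit of treating every patient; treat-none is always 0."""
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Treat-all reference at the 0.18 threshold with 50% prevalence.
treat_all_at_018 = net_benefit_treat_all(0.5, 0.18)
```

The model is clinically useful at a given threshold only if its net benefit exceeds both this treat-all value and the treat-none value of zero.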

DISCUSSION
In this study, we developed and validated a nomogram based on a radiomics signature, deep learning signature, and CT-reported LN status for the preoperative prediction of LN metastasis in patients with LUAD.
For constructing the radiomics signature, the total number of features (1288) was reduced to 528 using the T-test, and then the 8 most informative features were selected using Extra-trees. Extra-trees is an ensemble learning method composed of multiple decision trees, which reduces the risk of overfitting relative to a single model. Therefore, the features selected by Extra-trees are expected to be robust. Then, we used logistic regression to combine the features to obtain the radiomics signature.
Considering that deep learning achieves better results than traditional machine learning methods in the ImageNet Large Scale Visual Recognition Challenge, this study also used a deep learning method. The difference between radiomics method and deep
FIGURE 2 | Feature selection by Extra-trees. Feature importance was obtained by averaging the results of multiple decision trees in Extra-trees. The larger the feature score, the more important the feature.
both deep learning features and radiomics features were used to predict LN metastasis of lung adenocarcinoma. However, unlike some studies, we did not directly output the thousands of features of the last fully connected layer, but combined the features with their weights and biases to obtain a deep learning signature (30,31). This made it possible to draw the nomogram and to analyze the individual influence of the deep learning features on LN metastasis. Because the amount of data is relatively small, this study used data augmentation and transfer learning. Data augmentation expanded the amount of data, and transfer learning made the training of VGG-16 easier. Specifically, we took the weights of VGG-16 pretrained on ImageNet as the initial weights of the model, and then used our data to fine-tune the model.
Multivariate logistic regression analysis showed that the radiomics signature, deep learning signature, and CT-reported LN status were independent and effective predictors. The C-index of the nomogram constructed with these three predictors was 0.820 (95% CI, 0.762 to 0.879) in the training cohort, 0.830 (95% CI, 0.694 to 0.996) in the internal validation cohort, and 0.861 (95% CI, 0.769 to 0.954) in the external validation cohort, which was better than that of any single-factor model. The NRI results showed that the nomogram was a significant improvement over the single-factor models.
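The text does not state which form of the NRI was used. As one common variant, the category-free (continuous) NRI compares, patient by patient, whether the new model moves the predicted risk in the right direction relative to the old model. The sketch below implements that variant as an assumption; the function name and the example probabilities are invented.

```python
def continuous_nri(y_true, p_old, p_new):
    """Category-free NRI: net upward moves in events plus net downward moves in non-events."""
    up_e = down_e = n_e = 0
    up_n = down_n = n_n = 0
    for y, old, new in zip(y_true, p_old, p_new):
        if y == 1:
            n_e += 1
            up_e += new > old
            down_e += new < old
        else:
            n_n += 1
            up_n += new > old
            down_n += new < old
    return (up_e - down_e) / n_e + (down_n - up_n) / n_n


# Invented predictions from a single-factor model and a combined model.
nri = continuous_nri([1, 1, 0, 0], [0.6, 0.5, 0.4, 0.5], [0.8, 0.4, 0.2, 0.3])
```

A positive value means the combined model reclassifies patients in the correct direction more often than in the wrong one; the value ranges from -2 to 2.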
The limitations of this study mainly include the following: (a) Insufficient clinical information was available. Smoking history and CEA have been shown to be effective predictors of LN metastasis (14,15), but neither was included in the model; (b) No genetic information was used. Some studies have shown that in the primary tumor, miR-31, miR-34b/c, miR-148 and miR-9-325 were significantly correlated with LN status (32,33). Incorporating genetic features may improve the performance of the radiomics model, which may be a future research direction; (c) The amount of data is relatively small. With more data, the features learned by the deep learning method could better represent the data; (d) This is a single-center retrospective study. A prospective multicenter clinical trial is needed to validate our model. In summary, this study proposes a nomogram based on radiomics signature, deep learning signature, and CT-reported LN status that can be conveniently used to predict preoperative LN metastasis in patients with LUAD.

DATA AVAILABILITY STATEMENT
The data analyzed in this study is subject to the following licenses/restrictions: The datasets are privately owned by Liaoning Cancer Hospital and are not made public. Requests to access these datasets should be directed to DZ, zhaodan777@126.com.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Medical Ethics Committee of Cancer Hospital of Liaoning Province, China. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.