CT Morphological Features Integrated With Whole-Lesion Histogram Parameters to Predict Lung Metastasis for Colorectal Cancer Patients With Pulmonary Nodules

Purpose: To retrospectively identify the relationships of CT morphological features and histogram parameters with pulmonary metastasis in patients with colorectal cancer (CRC) and to compare the efficacy of single-slice and whole-lesion histogram analysis. Methods: Our study enrolled 196 CRC patients with pulmonary nodules (136 in the training dataset and 60 in the validation dataset). Twenty morphological features of contrast-enhanced chest CT were evaluated. The regions of interest were delineated in single-slice and whole-tumor lesions, and 22 histogram parameters were extracted. Stepwise logistic regression analyses were applied to select the independent predictors of lung metastasis in the morphological features model, the single-slice histogram model, and the whole-lesion histogram model. The area under the curve (AUC) was used to quantify the predictive accuracy of each model. Finally, we built a morphological-histogram nomogram for pulmonary metastasis prediction. Results: The whole-lesion histogram analysis (AUC of 0.888 and 0.865 in the training and validation datasets, respectively) outperformed the single-slice histogram analysis (AUC of 0.872 and 0.819 in the training and validation datasets, respectively) and the CT morphological features model (AUC of 0.869 and 0.845 in the training and validation datasets, respectively). The morphological-histogram model, developed with significant morphological features and whole-lesion histogram parameters, achieved favorable discrimination in both the training dataset (AUC = 0.919) and validation dataset (AUC = 0.895), and good calibration. Conclusions: CT morphological features in combination with whole-lesion histogram parameters can be used to predict pulmonary metastasis in patients with colorectal cancer.


INTRODUCTION
Colorectal cancer (CRC) is the third most common cause of morbidity and mortality worldwide (1,2). The lung is the most common extra-abdominal site of metastasis in CRC, with 5-10% of CRC patients developing pulmonary metastasis (PM) (3,4). The 5-year survival rates after initial colorectal surgery in patients with and without resection for pulmonary metastasis are 68 and 13%, respectively (3). The strong survival benefits of pulmonary metastasectomy make it the generally accepted treatment for patients to achieve long-term survival when there is a definite and clear diagnosis (5,6). Furthermore, if pulmonary metastasis is diagnosed early and resected aggressively, the survival rate is further improved (7).
However, with chest CT applied as part of preoperative routine examination, an increasing number of CRC patients are being diagnosed with indeterminate pulmonary nodules (IPNs) of unknown nature (8). The reported incidence of IPNs in CRC patients is 25-45.5% (8)(9)(10). Further diagnostic tests can also be problematic as nodules <10 mm in diameter may fall below the threshold of detection for positron emission tomography (PET) (11), and fine-needle aspiration cytology may not be feasible for thoracoscopic localization (12). Therefore, in CRC patients with IPNs, the accurate diagnosis of metastatic disease at an early and surgically treatable stage remains a challenge.
Although early-stage metastatic nodules and benign lesions have a similar appearance on images, the importance of morphology should not be underestimated (13). CT imaging allows detailed observation of the morphological features of nodules and lesions, such as their internal density, shape, margin, and other typical characteristics. In recent years, texture analysis has emerged as a valuable methodology for facilitating diagnosis through the deep mining of information from medical images (14,15). It has achieved great utility in evaluating many kinds of pulmonary diseases, including pulmonary embolisms (16), interstitial lung disease (17), and pulmonary nodules (18,19). By extracting features of subtle pixel distributions and spatial variations of the gray levels of lesions that are imperceptible to the naked eye, texture analysis provides a method that complements the evaluation of subjective, macroscopic morphological features.
To date, studies concentrating on the morphological and textural features of IPNs 5-20 mm in diameter on contrast-enhanced CT in CRC patients remain limited. This study sought to determine the morphological characteristics and histogram parameters derived from texture analysis for CRC patients with IPNs and to construct a risk model with a combination of independent predictors to facilitate the accurate diagnosis of pulmonary metastasis.

Patients
This retrospective analysis obtained ethical approval, and the informed consent requirement was waived. Our study enrolled 196 consecutive colorectal cancer patients (88F/108M; age range, 32-80 years; mean age, 58.49 ± 10.80 years) with lung nodules admitted to our institution between January 2010 and December 2017. The inclusion criteria were as follows: (i) histopathologically confirmed colorectal cancer; (ii) at least one lung nodule measuring 5-20 mm detected on contrast-enhanced chest CT; (iii) available pathology reports with a diagnosis of pulmonary metastasis or primary lung cancer for the malignant nodules and at least 2 years of follow-up for the benign nodules; and (iv) complete medical history. The exclusion criteria were as follows: (i) pretreatment 6 months before the initial CT examination (including chemotherapy or pneumonectomy); (ii) old nodules detected 6 months before colorectal cancer was detected; (iii) obviously benign nodules with typical imaging characteristics (such as cysts, tuberculosis, or inflammatory nodules); and (iv) adjuvant therapy (including radiation therapy or chemotherapy) applied to non-progressing lesions during follow-up. When there were multiple nodules, we chose the largest nodule for morphological and radiomics analysis. Of the 196 patients included in the study, 194 were also included in our previously published research (20).
Nodules were divided into two groups: (i) a pathologically confirmed lung metastasis group (95 PMs; 42F/53M; mean age, 57.46 ± 10.58 years), and (ii) a non-metastasis (NM) group (101 NMs; 46F/55M; mean age, 59.47 ± 10.91 years), including benign nodules (90 cases; 88 with at least 2 years of follow-up and 2 with pathological confirmation) and primary lung cancer confirmed by pathology (11 cases). We used a computer algorithm to randomly divide the patients into a training dataset and a validation dataset at a ratio of 7:3. Figure 1 shows the patient recruitment process.
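The random 7:3 division can be illustrated with a short sketch. The text does not state which algorithm was used or whether the split was stratified by group, so the stratification, the floor rounding, the seed, and the function name `stratified_split` are all assumptions made for illustration.

```python
import random

def stratified_split(labels, train_num=7, train_den=10, seed=0):
    """Return (train_idx, val_idx), sampling each class separately."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train_idx, val_idx = [], []
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        # Integer arithmetic gives the floor of 70% of the class size.
        n_train = len(members) * train_num // train_den
        train_idx += members[:n_train]
        val_idx += members[n_train:]
    return sorted(train_idx), sorted(val_idx)

# 95 metastatic (1) and 101 non-metastatic (0) nodules, as in the cohort.
labels = [1] * 95 + [0] * 101
train_idx, val_idx = stratified_split(labels)
print(len(train_idx), len(val_idx))
```

With per-group floor rounding this yields 136 training and 60 validation cases, which matches the reported dataset sizes, although that agreement alone does not confirm the procedure actually used.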

CT Scanning Protocol
Chest CT examinations were performed at our institution with the Sensation 64 scanner (Siemens Healthcare) or the Somatom Definition AS scanner (Siemens Healthcare). The contrast-enhanced CT scan parameters were as follows: contrast medium, iohexol; tube voltage, 120 kVp; tube current, 250-350 mA; slice thickness, 1.5 mm; slice interval, 1.5 mm; matrix, 512 × 512; field of view (FOV), 35-50 cm; pitch, 1.078; reconstruction algorithm, standard. The arterial-phase images of the target nodule, which was either pathologically confirmed or under follow-up, were selected for reconstruction.

CT Image Interpretation
The interpretations of CT features are listed in Supplementary Table 1. The CT morphological features were independently evaluated by two operators (SW and TH, with 20 and 3 years of experience in chest CT, respectively). In cases of disagreement, a third radiologist (TT, with 20 years of experience in CT imaging) was consulted, and the majority value was used. Mean values were calculated for continuous variables. The CT images were read with both mediastinal and lung window settings. All of the operators were blinded to the clinical and histologic findings.
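The consensus rule described above can be sketched as follows. This is only an illustration: the function names are hypothetical, and averaging the two primary readers' measurements for continuous variables is an assumption about how the mean values were formed.

```python
from collections import Counter
from statistics import fmean

def consensus_categorical(reader1, reader2, reader3=None):
    """Agreed value of two readers, or the majority once a third reads."""
    if reader1 == reader2:
        return reader1
    if reader3 is None:
        raise ValueError("readers disagree; a third reading is required")
    value, count = Counter([reader1, reader2, reader3]).most_common(1)[0]
    if count < 2:
        raise ValueError("no majority among the three readings")
    return value

def consensus_continuous(reader1, reader2):
    """Mean of the two readers' measurements (assumed rule)."""
    return fmean([reader1, reader2])

# Hypothetical readings for one nodule.
density = consensus_categorical("solid", "part-solid", "part-solid")
diameter_mm = consensus_continuous(12.0, 13.0)
print(density, diameter_mm)
```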

Histogram Analysis
Reconstructed images were transferred to the MIM software (v6.6.3; MIM Software Inc.) for histogram analysis. For each patient, regions of interest (ROIs) were first semi-automatically contoured in the largest cross-sectional area of the tumor outline and then manually delineated by an operator and verified by an expert radiologist. Each ROI was propagated to include the entire tumor volume in each consecutive slice using the same contouring method. In the process of delineation, we excluded the border of the lesion and any other irrelevant tissues or regions, such as pleura, normal tissue, air, peripheral vessels, and surrounding organs. Supplementary Figure 1 shows an example of ROI delineation.
The histogram parameters were automatically measured by the software using a volumetric approach on the ROI of the nodule. Single-slice and whole-lesion histogram parameters were extracted and analyzed. From each segmented tumor, we extracted 11 single-slice histogram parameters and 11 whole-lesion histogram features. More information about the methodology used to extract histogram features can be found in Supplementary Material.
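To make the distinction between single-slice and whole-lesion histograms concrete, the sketch below computes a few generic first-order statistics from ROI attenuation values. The 11 parameters actually used are defined in the Supplementary Material and are not reproduced here; the statistics chosen, the function name `histogram_parameters`, and the HU values are illustrative assumptions, and the MIM software's own computation may differ.

```python
import math
import statistics

def histogram_parameters(values):
    """First-order statistics of CT attenuation values (HU) in an ROI."""
    n = len(values)
    mean = statistics.fmean(values)
    # Central moments (population form) used for skewness and kurtosis.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
        "std": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
        "kurtosis": m4 / m2 ** 2 if m2 > 0 else 0.0,
    }

# Hypothetical ROI: one list of HU values per slice.
roi_slices = [[20, 35, 40], [10, 30, 45, 50, 60], [25, 30]]
single_slice = max(roi_slices, key=len)            # largest cross-section
whole_lesion = [v for s in roi_slices for v in s]  # all slices pooled

single = histogram_parameters(single_slice)
whole = histogram_parameters(whole_lesion)
print(single["mean"], whole["mean"])
```

The single-slice histogram sees only the voxels of the largest cross-section, whereas the whole-lesion histogram pools every delineated slice, so the two can differ even for the same nodule.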

Statistical Analysis
R software (version 3.3) was used for statistical analysis. To measure the agreement of CT morphological features between the two readers, intraclass correlation coefficients (ICCs) were calculated (poor: 0.00-0.20; fair: 0.21-0.40; moderate: 0.41-0.60; good: 0.61-0.80; excellent: 0.81-1.00). To compare the proportional differences between the training dataset and the validation dataset, chi-square tests were applied for the categorical variables, and two-sample t-tests were used for the continuous variables. To compare the differences between the PM and NM groups, chi-square and two-sample t-tests were applied as appropriate for both the training and validation datasets. Two-sided p < 0.05 was considered significant.
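As an illustration of the chi-square comparison for a categorical variable, the sketch below tests the sex distribution between the PM and NM groups using the counts given in the Patients section. It is a plain Pearson test without continuity correction written for this example; the analysis in R may have used different settings (R's `chisq.test` applies a continuity correction to 2 × 2 tables by default), so the numbers need not match the published tables.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row = (a + b, c + d)
    col = (a + c, b + d)
    chi2 = 0.0
    for i, obs_row in enumerate(table):
        for j, obs in enumerate(obs_row):
            expected = row[i] * col[j] / n
            chi2 += (obs - expected) ** 2 / expected
    # With 1 degree of freedom, the p-value equals P(|Z| > sqrt(chi2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Sex by group from the cohort description: PM 42F/53M, NM 46F/55M.
chi2, p = chi_square_2x2([[42, 53], [46, 55]])
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```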

Model Selection
The significant factors were introduced into the stepwise logistic regression to select the independent features for the CT morphological model, the single-slice histogram model, and the whole-lesion histogram model. The Akaike information criterion (AIC) was employed as the stopping rule. The validation dataset was used to test the diagnostic performance of the models by applying the multivariable regression formula derived from the training dataset to the patients of the validation dataset, and the probability of metastasis was calculated for each. The area under the receiver operating characteristic curve (AUC) was calculated to quantify the predictive accuracy of the three models in the training and validation datasets. We also calculated the accuracy, sensitivity, specificity, positive predictive value, and negative predictive value for each model. We compared the relative strengths of the single-slice and whole-lesion histogram models and then used the more efficient model in combination with the morphological features to construct the morphological-histogram model. A morphological-histogram nomogram was then constructed for clinical application. A receiver operating characteristic (ROC) curve was used to describe the discrimination abilities of the nomogram. An AUC above 0.75 is considered good (21). Nomogram performance was graphically demonstrated by calibration plots in both the training and validation datasets. Finally, decision curve analysis (DCA) was applied to assess the clinical usefulness of the nomogram.
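The performance measures named above can be computed from predicted probabilities in a few lines. The sketch below is a generic illustration, not the R code used in the study: the AUC is obtained as the proportion of metastatic/non-metastatic pairs ranked correctly, and the 0.5 cut-off, the function names, and the example scores are assumptions.

```python
def auc(scores, labels):
    """AUC as the probability that a positive case outranks a negative one."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count pairwise wins; ties count as half a win.
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def diagnostic_metrics(scores, labels, threshold=0.5):
    """Confusion-matrix metrics when score >= threshold predicts metastasis."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for p, y in zip(pred, labels) if p == 1 and y == 1)
    tn = sum(1 for p, y in zip(pred, labels) if p == 0 and y == 0)
    fp = sum(1 for p, y in zip(pred, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(pred, labels) if p == 0 and y == 1)
    return {
        "accuracy": (tp + tn) / len(labels),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Hypothetical predicted probabilities (1 = metastasis, 0 = non-metastasis).
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
model_auc = auc(scores, labels)
metrics = diagnostic_metrics(scores, labels)
print(round(model_auc, 3), metrics)
```

The sketch assumes both classes are present and that at least one case falls on each side of the threshold; otherwise some ratios are undefined.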

Patient Characteristics
The patient characteristics and statistically significant CT morphological features are shown in Table 1 (Supplementary Table 3 contains the complete morphological feature comparison), and the histogram parameters are presented in Table 2. There were no significant differences between the training and validation datasets except in pleural attachment (Supplementary Table 2). The agreement between the two operators was excellent for most characteristics and good for several features (Supplementary Table 4).

Comparison of Single-Slice and Whole-Lesion Histogram Analyses
The morphological-histogram nomogram was successfully constructed, with good discrimination, based on the morphological-histogram model (Figure 4A). The calibration plots also presented good accordance between the nomogram prediction and actual outcome for PM and NM in both the training and validation datasets (Figures 4B,C). The decision curve analysis demonstrated that given a threshold probability ranging from 0 to 100%, the morphological-histogram model was superior to the treat-all and treat-none schemes in predicting lung metastasis (Figure 4D).
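For readers unfamiliar with decision curve analysis, the quantity plotted is the net benefit at each threshold probability pt, commonly defined as TP/n − (FP/n) × pt/(1 − pt), with the treat-none scheme fixed at zero. The sketch below applies this standard definition to hypothetical predictions; the function names and data are illustrative assumptions and do not reproduce Figure 4D.

```python
def net_benefit(probs, labels, pt):
    """Net benefit of treating patients whose predicted risk is >= pt."""
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= pt and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= pt and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - fp / n * pt / (1 - pt)

def net_benefit_treat_all(labels, pt):
    """Net benefit when every patient is treated as metastatic."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * pt / (1 - pt)

# Hypothetical predicted risks; pt must lie strictly between 0 and 1.
probs = [0.9, 0.8, 0.2, 0.1]
labels = [1, 1, 0, 0]
pt = 0.5
nb_model = net_benefit(probs, labels, pt)
nb_all = net_benefit_treat_all(labels, pt)
nb_none = 0.0  # treating no one yields no benefit and no harm
print(nb_model, nb_all, nb_none)
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all and treat-none values, as in this toy example.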

DISCUSSION
In the present study, we investigated the imaging characteristics of IPNs 5-20 mm in diameter on initial CT in CRC patients and compared the predictive accuracy of whole-lesion and single-slice histogram parameters. We then constructed a morphological-histogram nomogram using a combination of morphological features and whole-lesion histogram parameters for IPNs. This nomogram may be clinically useful for discriminating CRC patients who might benefit from early and curable metastasectomy for metastatic lesions or an appropriate surveillance program.
CT offers direct visualization of lesions and potentially allows a detailed characterization of the morphologic extent of lesions. The careful evaluation of morphologic features is an essential step in pulmonary nodule assessment (13). Although several studies (22)(23)(24) have sought to identify significant image features for metastatic nodules, there is no consensus regarding the definition of IPNs, which led to slight differences between our results and previously published ones. In our study, we found that significant morphological features associated with pulmonary metastasis were long-axis diameter, density, and contour.
As reported by many other studies, nodule diameter is a reliable indicator of malignant potential (22,23,25). We found that solid nodules are more likely to be metastatic lesions. As more than 95% of nodules that originate from colorectal cancer are adenocarcinomas (4), metastatic lesions tend to appear as solid pulmonary nodules (SPN) on CT scans, whereas benign lesions, such as inflammatory lesions or organizing pneumonia/fibrosis, consistently present as patchy consolidations or mixed-density regions surrounded by ground-glass opacity (GGO) owing to inflammatory cell infiltration (26). Primary lung cancer consistently evolves from pre-invasive lesions (AIS/AAH) that manifest as pure GGO (27) at the early stage. A post-hoc analysis (24) found that a solid consistency and increasing size were statistically associated with malignancy. In addition, our study found that metastatic nodules tended to be round or oval, consistent with previous research (28). We speculate that as metastatic nodules often exhibit a largely uniform growth rate and homogeneous invasion in all directions, these features contribute to a round or quasi-circular contour, whereas non-metastatic lesions, including benign lesions and primary lung cancer, have irregular shapes due to uneven growth rates at various sites (26). Thus, short-interval CT follow-up is highly recommended for IPNs larger than 5 mm in diameter with solid components and approximately regular margins detected on preoperative chest CT.
In addition to the identification of morphological features, the use of texture analysis is a strength of our study. Previous studies have demonstrated that texture analysis can not only distinguish malignant nodules from benign ones (18) but also differentiate in situ and minimally invasive lung adenocarcinoma subtypes (19). These studies have shown that texture parameters can reveal the underlying histological changes in tissue below the resolution of the given modality or protocol. In this study, we found that the W-average ratio, W-mean, and W-median, which represent the zone of CT attenuation within the ROI, were substantially higher in the metastasis group than in the non-metastasis group. This is in line with another finding of our study that vascular convergence was more common and the enhancement degree was higher in the metastasis group than in the non-metastasis group. However, as texture analysis is a mathematical method, the biological mechanisms underlying the textural features are complex and not completely understood (29). In cases where vascular convergence or the enhancement degree is insufficient to differentiate metastatic lesions, the values from the CT attenuation zone might exhibit local variation and more sensitive preservation of spatial information (30).
Another finding of our study was that the whole-lesion texture analysis outperformed the single-slice analysis in evaluating pulmonary nodules, consistent with a previous study (31). Whole-lesion analysis may provide a more comprehensive understanding of the three-dimensional structure of the whole lesion and thereby reflect the overall heterogeneity better than single-slice analysis can. Although contouring the whole lesion is time-consuming, the method seems more cost-efficient, as it provides improved prediction relative to single-slice analysis and a more definite diagnosis, allowing timely treatment and maximizing the benefits to the patient.
For clinical use, we constructed a risk stratification nomogram for the clinician to predict the risk of PM for an individual CRC patient. As the early and accurate diagnosis of pulmonary metastasis has been recognized as one of the most important steps in treating potential curable lesions with surgery, we propose that patients with a high risk of PM be considered candidates for thoracotomy for resectable lesions to enhance local control and improve the survival rate. We also hope this model can help low-risk patients avoid aggressive follow-up and reduce the burden of radiation exposure. We believe that the clinical use of the nomogram can contribute to reliable diagnoses and help clinicians optimize therapeutic plans for IPNs at an early stage after detection.
Our study has several limitations. First, as a retrospective study, thin-slice contrast-enhanced CT images from our database were used, which limited the number of cases for analysis. The inclusion and exclusion criteria also limit the implementation of the study in clinical practice. Second, only histogram parameters were extracted in this study. In our previous research (20), 203 radiomic features, including first- and second-order parameters, attained a prognostic value in the differentiation of pulmonary metastasis with an AUC of 0.888, which is slightly higher than that obtained using the histogram parameters (AUC = 0.887). However, the process of extracting radiomic features through MATLAB is intricate and demanding for radiologists and clinicians, which constrains its clinical utilization. The volume histogram analysis performed here allowed the simple, efficient, and automatic acquisition of a density histogram and achieved an accuracy comparable to that of the radiomics analysis. Thus, volume histogram analysis may be more appropriate for urgent clinical decisions, and radiomics analysis can be used as a supplementary method when needed. Another limitation is that the development and validation were performed in a single institution. External validation and multi-center clinical trials are therefore needed for further generalization.
In conclusion, the results of our study demonstrated that histogram parameters may serve as non-invasive imaging biomarkers for differentiating pulmonary metastasis from non-metastatic lesions. When complemented with morphological features, the morphological-histogram nomogram can greatly benefit the diagnosis of pulmonary metastasis in CRC patients.

DATA AVAILABILITY STATEMENT
The datasets generated for this study are available on request to the corresponding author.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Medical Ethics Committee of Fudan University Shanghai Cancer Center. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

AUTHOR CONTRIBUTIONS
TT and SW carried out the concept and design of the study. DS provided the patient information. YL confirmed the pathology results. XE, YY, HL, and JW provided assistance with data acquisition and statistical analysis. WP provided permission for imaging acquisition. SW and TH carried out the literature research and manuscript editing. TH and SW contributed equally to this work. All authors have reviewed the final version of the manuscript and approved it for publication.