Development of a Novel, Multi-Parametric, MRI-Based Radiomic Nomogram for Differentiating Between Clinically Significant and Insignificant Prostate Cancer

Objectives: To develop and validate a predictive model for discriminating clinically significant prostate cancer (csPCa) from clinically insignificant prostate cancer (ciPCa). Methods: This retrospective study was performed with 159 consecutively enrolled pathologically confirmed PCa patients from two medical centers. The dataset was allocated to a training group (n = 54) and an internal validation group (n = 22) from one center along with an external independent validation group (n = 83) from another center. A total of 1,188 radiomic features were extracted from T2WI, diffusion-weighted imaging (DWI), and apparent diffusion coefficient (ADC) images derived from DWI for each patient. Multivariable logistic regression analysis was performed to develop the model, incorporating the radiomic signature, ADC value, and independent clinical risk factors. This was presented using a radiomic nomogram. The receiver operating characteristic (ROC) curve was utilized to assess the predictive efficacy of the radiomic nomogram in both the training and validation groups. The decision curve analysis was used to evaluate which model achieved the most net benefit. Results: The radiomic signature, which was made up of 10 selected features, was significantly associated with csPCa (P < 0.001 for both training and internal validation groups). The area under the curve (AUC) values of discriminating csPCa for the radiomics signature were 0.95 (training group), 0.86 (internal validation group), and 0.81 (external validation group). Multivariate logistic analysis identified the radiomic signature and ADC value as independent parameters of predicting csPCa. Then, the combination nomogram incorporating the radiomic signature and ADC value demonstrated a favorable classification capability with the AUC of 0.95 (training group), 0.93 (internal validation group), and 0.84 (external validation group). 
Appreciable clinical utility of this model was illustrated using the decision curve analysis for the nomogram. Conclusions: The nomogram, incorporating radiomic signature and ADC value, provided an individualized, potential approach for discriminating csPCa from ciPCa.


INTRODUCTION
Prostate cancer (PCa) is the second most frequently diagnosed cancer in men worldwide (1). Serum prostate-specific antigen (PSA) testing and digital rectal examination are the most widely used PCa screening methods in clinical practice (2). If a patient presents with an elevated PSA, transrectal ultrasound (TRUS)-guided biopsy is the conventional diagnostic approach. However, over 30% of men experience side effects from TRUS-guided biopsy, including pain, bleeding, infection, and hematuria, and ∼1% need to be hospitalized for observation (3). Furthermore, some patients undergo unnecessary biopsies, as clinically insignificant PCa (ciPCa), defined as a Gleason score (GS) <3+4 or a maximum cancer core length of <6 mm, may be detected (4). Clinically significant PCa (csPCa) is defined as a GS ≥ 3+4 in at least one biopsy core (4)(5)(6). The principal management of ciPCa is active surveillance rather than radical prostatectomy, which is the routine treatment for localized csPCa. In addition, the detection of ciPCa by TRUS-guided biopsy may lead to overtreatment in a few patients.
Multi-parametric MRI (mp-MRI), comprising anatomical sequences (T1- and T2-weighted imaging; T1WI and T2WI) and functional sequences [diffusion-weighted imaging (DWI) and dynamic contrast-enhanced (DCE) imaging], is regarded as an advanced imaging modality for the identification of PCa (7,8). Mp-MRI plays an important role in reducing the overdiagnosis and overtreatment of ciPCa and in guiding targeted biopsy, tumor staging, and treatment for csPCa patients. However, its diagnostic performance varies among radiologists. The overall inter-reader consistency of multiple reports ranges from poor (0.5) to moderate (0.71), depending mainly on the experience and training level of the radiologists (9,10).
Radiomics is regarded as a noninvasive, efficient, and reliable approach that applies advanced image-processing techniques to extract a variety of quantitative features from imaging data (11). Radiomics has mainly been used in oncology, for instance in lung cancer, brain astrocytoma, and breast carcinoma, where it is used for tumor staging, evaluation of curative effect, prognosis assessment, and genetic analysis (12)(13)(14). Radiomics has also been extended to PCa, mainly focusing on PCa diagnosis and differentiation (15)(16)(17)(18). Min et al. investigated an mp-MRI-based radiomic signature for predicting patients with csPCa (18). The results showed that the radiomic signature had the potential to discriminate csPCa from ciPCa, with an area under the curve (AUC) of 0.823 in the validation cohort. However, the diagnostic efficacy of an mp-MRI-based radiomic nomogram in the identification of csPCa has not been completely determined. Nomograms are widely accepted as a reliable method for quantifying the risk of clinical events (19). In this study, we hypothesized that a radiomic nomogram incorporating an mp-MRI-based radiomic signature and independent clinical risk factors can non-invasively discriminate csPCa from ciPCa in patients with suspected PCa. Therefore, we sought to develop and validate a radiomic nomogram that would incorporate a radiomic signature and clinical risk factors for the pre-biopsy prediction of csPCa.

Patient Cohort
This retrospective study was approved by the Institutional Ethical Committee of the Guangxing Hospital Affiliated to Zhejiang Chinese Medical University and the First Affiliated Hospital of Zhejiang Chinese Medical University, which waived the requirement for written informed consent. The study consecutively enrolled 159 patients with biopsy pathology-proven PCa who received mp-MRI examination from January 2016 to February 2020. All patients were scanned on the same model of scanner and did not receive TRUS-guided biopsy prior to MRI examination. Exclusion criteria were (1) prior therapy for PCa, including antihormonal therapy, radiation, cryotherapy, or prostatectomy; (2) incomplete information or severe imaging artifacts on the MRI images; (3) lesion diameter <5 mm on mp-MRI images; and (4) lack of serum PSA level (Figure 1). The enrolled patients from the Guangxing Hospital Affiliated to Zhejiang Chinese Medical University were randomly assigned to a training group (n = 54) and an internal validation group (n = 22), and the patients from the First Affiliated Hospital of Zhejiang Chinese Medical University formed an external independent validation group (n = 83).
Baseline clinical features were derived from medical records, including age and PSA level with the cutoff value of 10 ng/ml. The interval time between MRI and PSA testing was less than 1 month.

MRI Examination
All recruited patients were scanned using the same model of 3.0 T MRI scanner (Discovery 750W 3.0T, GE Healthcare, Milwaukee, USA) with a 32-channel pelvic coil. The protocol included transverse T1WI; transverse, sagittal, and coronal T2WI; transverse DWI; apparent diffusion coefficient (ADC) maps derived from DWI; and dynamic contrast-enhanced (DCE) imaging. DWI was acquired with b values of 0 and 1,000 s/mm². The details of the imaging

Lesion Segmentation on MR Images
Only T2WI, DWI, and ADC images were incorporated in this study because of their availability and the emphasis placed on them in the Prostate Imaging Reporting and Data System version 2 (PI-RADS v2) (7). The software package ITK-SNAP (version 3.4.0; www.itksnap.org) was used for manual segmentation of PCa lesions. The region of interest (ROI) was delineated along the boundaries of the lesion layer by layer in reference to the biopsy pathology results. Given the importance of heterogeneity analysis, the ROI was designed to contain regions of calcification, necrosis, bleeding, and cystic tissue, but not structures such as the urethra, seminal vesicle, and other normal anatomical structures. When lesions had differing pathological GSs, only the region with the highest biopsy GS was selected for delineation. If all lesions in multi-focal PCa demonstrated the same GS, the ROIs were drawn manually at each level until all lesions were incorporated.
A radiologist (W.C., with 3 years of experience in abdominal MRI) who was blinded to the GS of each PCa lesion measured the ADC value. The ROIs were placed on the ADC map to comprise as much of the inner aspect of the lesion as possible without encompassing surrounding normal structures. There were between one and three ROIs for each patient, with a mean area of 40 mm² (range, 10-80 mm²). Another abdominal radiologist (F.C., with 21 years of experience in abdominal MRI) who was blinded to the PCa lesion pathology evaluated the MR-T stage for each patient in reference to NCCN guidelines (20).

Intra- and Inter-observer Agreement
The intra- and inter-observer agreements for feature extraction were assessed by the intra-class correlation coefficient (ICC). Initially, the complete imaging data of 20 patients were randomly selected from the study group. All ROIs on T2WI, DWI, and ADC images were outlined independently by two experienced radiologists using the same criteria. Intra-observer ICC was analyzed by comparing two extractions by reader 1 (Y.Z., with 10 years' experience in abdominal MRI). Inter-observer ICC was evaluated by comparing the extraction by a second reader (F.C., with 21 years' experience in abdominal MRI) with the extraction by reader 1. An ICC >0.8 was regarded as good agreement, and the remaining image segmentation was performed by reader 1 (21).
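As an illustration of the agreement analysis described above, the following is a minimal sketch of an ICC computation for one feature measured twice per patient. The text does not state which ICC form was used; the two-way random-effects, absolute-agreement, single-measurement form, often written ICC(2,1), is an assumption here, and the function name `icc_2_1` and the example values are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    ratings: one row per patient, one column per extraction (reader or session).
    The choice of ICC form is an assumption; the study does not specify it.
    """
    n = len(ratings)          # number of patients
    k = len(ratings[0])       # number of extractions per patient
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares from a two-way ANOVA without replication
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                # between-patient mean square
    msc = ss_cols / (k - 1)                # between-extraction mean square
    mse = ss_error / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical values of one feature: column 0 = reader 1, column 1 = reader 2
feature_values = [[10.0, 10.5], [12.0, 11.5], [15.0, 15.5], [20.0, 19.0]]
icc = icc_2_1(feature_values)
good_agreement = icc > 0.8  # threshold used in this study
```

In practice such a function would be applied feature by feature, and the resulting range of ICC values reported.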

Radiomic Feature Extraction and Model Building
AK software (Artificial Intelligence Kit V3.0.0.R, GE Healthcare) was used to extract a total of 396 radiomic features per ROI from each MRI sequence, including histogram, second-order statistic, Gray-Level Co-occurrence Matrix (GLCM), Run length matrix (RLM), and form factor parameters (15). The histogram features, also called first-order statistics, represent the distribution of voxel values without regard to spatial relationships. The second-order statistics, commonly called texture features, describe the statistical relationships between voxels with similar (or dissimilar) contrast values. With three sequences (T2WI, DWI, and ADC), the overall number of radiomic features in this study was 1,188 (396 × 3). Before feature selection, the values of each feature across all patients were normalized with Z-scores ((x-µ)/σ), wherein x is the value of the feature, µ represents the mean value of this feature for all patients in the set, and σ is the corresponding standard deviation, so as to remove the unit dependence of each feature before it was passed to a machine learning model for classification (22).
As the imbalance between csPCa and ciPCa patients may impact the classification capability, the synthetic minority over-sampling technique (SMOTE) was implemented in the training and validation groups. Then, two feature selection methods, minimum-redundancy maximum-relevance (mRMR) and the least absolute shrinkage and selection operator (LASSO), were used to select features. First, mRMR was performed to eliminate redundant and irrelevant features; 20 features were retained. Then, LASSO was conducted to choose the optimized subset of features to construct the final model. Tenfold cross-validation was used to determine the optimal value of λ. Finally, only the 10 most predictive features were chosen and their corresponding coefficients were estimated. Predictive models were constructed by multivariable logistic regression with the 10 selected features. A radiomic signature (Rad-score) was then calculated for each patient via a linear combination of the selected features weighted by their respective coefficients in the predictive models. The radiomic workflow is demonstrated in Figure 2. The radiomics procedure is described in detail in Supplementary Material 2.
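The Z-score normalization described above can be sketched as follows. This is an illustrative example only: the function name `z_score_columns` and the toy feature matrix are hypothetical, and the use of the population standard deviation (rather than the sample standard deviation) is an assumption, since the text does not specify which was used.

```python
from statistics import mean, pstdev


def z_score_columns(rows):
    """Standardize each feature (column) as (x - mu) / sigma.

    rows: one list of feature values per patient.
    Returns the scaled matrix and the (mu, sigma) pair used for each feature.
    """
    n_features = len(rows[0])
    stats = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        # Population SD is an assumption; the study does not specify the estimator
        stats.append((mean(column), pstdev(column)))

    scaled = [
        [
            (row[j] - stats[j][0]) / stats[j][1] if stats[j][1] > 0 else 0.0
            for j in range(n_features)
        ]
        for row in rows
    ]
    return scaled, stats


# Two hypothetical features on very different scales for three patients
features = [[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]]
scaled, stats = z_score_columns(features)
```

After scaling, both columns hold identical values even though the raw features differ by a factor of 200, which is the point of removing unit dependence before feature selection.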
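To make the Rad-score calculation concrete, the sketch below computes a linear combination of selected features and, as is usual for a logistic model, converts it to a probability. The coefficients, intercept, and feature values are hypothetical placeholders (three features instead of the 10 used in the study); the actual coefficients are not given in this section.

```python
import math


def rad_score(features, coefficients, intercept=0.0):
    """Linear combination of selected (normalized) features and their coefficients."""
    if len(features) != len(coefficients):
        raise ValueError("features and coefficients must have the same length")
    return intercept + sum(c * x for c, x in zip(coefficients, features))


def logistic(score):
    """Map a linear predictor to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical coefficients and intercept, for illustration only
coefficients = [0.8, -0.5, 0.3]
intercept = 0.1

# Hypothetical Z-scored feature values for one patient
patient = [1.2, -0.4, 0.5]

score = rad_score(patient, coefficients, intercept)
probability = logistic(score)
```

With these placeholder numbers the score is 0.1 + 0.96 + 0.20 + 0.15 = 1.41, which corresponds to a predicted probability of about 0.80.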

Statistical Analysis
Categorical variables are presented as frequencies, whereas continuous variables are presented as the mean and standard deviation (SD). Fisher's exact test or the Chi-squared test was used to compare categorical variables, as appropriate. The Mann-Whitney U test was used to analyze non-normally distributed continuous variables. R software (v. 3.5.1, Vienna, Austria) and SPSS 22.0 (IBM, Armonk, NY) were used to perform statistical analysis. LASSO logistic regression was performed with the "glmnet" package. The receiver operating characteristic (ROC) plots were constructed with the "pROC" package. The DeLong test was used to compare AUCs among groups. Nomogram construction and calibration plotting were performed with the "rms" package. Decision curve analysis plots were generated using the "rmda" package. The diagnostic efficacy of the predictors was evaluated using accuracy, sensitivity, and specificity. A P < 0.05 in two-tailed analyses was used to define statistical significance.

[...] significant associations with the ADC value and PSA level, while other clinical factors were excluded (Table 2). The ADC value and PSA level were entered into multivariate logistic analysis. However, PSA was excluded because it did not reach statistical significance (p = 0.340). The ADC value was lower in csPCa than in ciPCa and was the only remaining independent clinical risk factor (p = 0.022).

Inter-observer and Intra-observer Agreement
The intra-observer ICC computed from the two extractions by reader 1 ranged from 0.827 to 0.934. The inter-observer ICC between the two readers ranged from 0.783 to 0.905. These results indicated high intra- and inter-observer agreement in feature extraction.

Radiomic Signature Development and Accuracy
A total of 1,188 radiomic features were extracted from T2WI, DWI, and ADC imaging. Through mRMR and LASSO processing, 10 radiomic features (5 from DWI, 4 from ADC imaging, and 1 from T2WI) were selected and used to build the radiomic signature (Figure 3). The values of the 10 selected features for each patient were entered into the formula, and the rad-score was then obtained to reflect the probability of csPCa. The rad-score showed high predictive efficacy, with an AUC of 0.95 [95% confidence interval (CI), 0.87 to 1.0] in the training group and 0.86 (95% CI, 0.70 to 1.0) in the internal validation group. Furthermore, the AUC in the external validation group was 0.81 (95% CI, 0.68 to 0.94).
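For readers less familiar with the AUC values reported above, the following sketch shows one standard way to compute an AUC: as the proportion of (csPCa, ciPCa) patient pairs in which the csPCa patient receives the higher score, with ties counted as one half. This is equivalent to the normalized Mann-Whitney U statistic. The study itself used the "pROC" package in R; the function name `roc_auc` and the labels and scores below are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative case.

    labels: 1 for csPCa, 0 for ciPCa.
    scores: model output (for example, a rad-score) for each patient.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")

    # Count pairwise wins for the positive class; ties contribute 0.5
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in positives
        for q in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical example: three csPCa and two ciPCa patients
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2]
auc = roc_auc(labels, scores)  # 5 of 6 pairs are ordered correctly
```

The pairwise form is quadratic in the number of patients, which is acceptable for cohorts of the size used here.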

Development and Performance of the Radiomic Nomogram
The rad-score and ADC value were identified as independent predictors for discriminating between csPCa and ciPCa, and a radiomic nomogram was then developed from them. Each independent predictor was allocated a weighted number of points. The total number of points for each patient was computed using the nomogram and was associated with the likelihood of csPCa. The sensitivity, specificity, and accuracy of the radiomic signature and radiomic nomogram are shown in Table 3.
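The point allocation described above can be illustrated with a small sketch of how a two-predictor logistic model is commonly converted into nomogram points: the predictor with the largest effect across its range is assigned 100 points, and the other predictors are scaled proportionally. All coefficients, the intercept, and the value ranges below are hypothetical and are not the fitted values from this study; the sketch does not reproduce the point values of the published nomogram.

```python
import math

# Hypothetical logistic model: log-odds = intercept + sum(beta * value)
coefs = {"rad_score": 1.0, "adc": -0.004}
intercept = 2.0
# Hypothetical axis ranges for each predictor (low, high)
ranges = {"rad_score": (-3.0, 3.0), "adc": (400.0, 1600.0)}


def points_per_logit(coefs, ranges):
    """Scale so that the predictor with the widest effect spans 100 points."""
    spans = [abs(coefs[k]) * (hi - lo) for k, (lo, hi) in ranges.items()]
    return 100.0 / max(spans)


def reference(name):
    """Value that earns zero points: the lowest-risk end of the axis."""
    lo, hi = ranges[name]
    return lo if coefs[name] >= 0 else hi


def points_for(name, value, scale):
    """Points contributed by one predictor value."""
    return coefs[name] * (value - reference(name)) * scale


def probability_from_points(total_points, scale):
    """Convert total points back to a predicted probability."""
    baseline = intercept + sum(coefs[k] * reference(k) for k in coefs)
    log_odds = baseline + total_points / scale
    return 1.0 / (1.0 + math.exp(-log_odds))


scale = points_per_logit(coefs, ranges)
patient = {"rad_score": 2.0, "adc": 800.0}
total = sum(points_for(k, v, scale) for k, v in patient.items())
risk = probability_from_points(total, scale)

# The same risk computed directly from the logistic model, as a consistency check
direct = 1.0 / (1.0 + math.exp(-(intercept + sum(coefs[k] * patient[k] for k in coefs))))
```

Because the points are only a rescaling of the linear predictor, the risk read from the total points matches the risk computed directly from the model.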
To compare the discrimination performance, the ROC curves were plotted for the radiomic nomogram, rad-score, and ADC value in the training group. The radiomic nomogram demonstrated a favorable classification capability, with an AUC of 0.95 (training group), 0.93 (internal validation group), and 0.84 (external validation group) (Figures 4A-C). Therefore, the nomogram was superior to the rad-score and ADC value alone in discriminating csPCa from ciPCa, especially in the internal and external validation groups. Details of the performance of the radiomic nomogram are shown in Figure 5. The DeLong test was performed to verify the statistical difference in AUC between the nomogram, rad-score, and ADC value. This result is presented in Supplementary Table 2.
Finally, a decision curve analysis was performed to evaluate whether this nomogram would assist in differentiating csPCa from ciPCa (Figure 6). When the threshold probability ranged from 0 to 1 according to the decision curve analysis, the nomogram obtained a greater net benefit than either a "treat all" or a "treat none" strategy.
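The net benefit plotted in a decision curve is conventionally defined as TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability, TP and FP are the numbers of true and false positives at that threshold, and n is the number of patients. The sketch below applies this standard definition to hypothetical data; the study itself used the "rmda" package in R, and the function names and values here are illustrative assumptions.

```python
def net_benefit(labels, probabilities, threshold):
    """Net benefit of acting on predictions at a given threshold probability."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(labels)
    true_pos = sum(1 for y, p in zip(labels, probabilities) if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(labels, probabilities) if p >= threshold and y == 0)
    weight = threshold / (1.0 - threshold)  # relative harm of a false positive
    return true_pos / n - (false_pos / n) * weight


def net_benefit_treat_all(labels, threshold):
    """Net benefit when every patient is treated as positive."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# "Treat none" has a net benefit of 0 at every threshold by definition
NET_BENEFIT_TREAT_NONE = 0.0

# Hypothetical predicted probabilities for three csPCa and two ciPCa patients
labels = [1, 1, 1, 0, 0]
probabilities = [0.9, 0.8, 0.4, 0.5, 0.2]

nb_model = net_benefit(labels, probabilities, threshold=0.3)
nb_all = net_benefit_treat_all(labels, threshold=0.3)
```

Evaluating these functions over a grid of thresholds and plotting the three curves gives a decision curve of the kind shown in Figure 6.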

DISCUSSION
This study developed and validated a radiomic nomogram for discriminating between csPCa and ciPCa. The nomogram was constructed from the rad-score derived from the radiomic method and the ADC value. The rad-score reflects the probability of csPCa computed from the radiomic signature, which was built from 10 selected radiomic features. The radiomic signature and the nomogram demonstrated the same capability to discriminate between csPCa and ciPCa in the training group (AUC = 0.95 vs. 0.95). However, the nomogram exceeded the radiomic model in the internal (AUC = 0.93 vs. 0.86) and external (AUC = 0.84 vs. 0.81) validation groups. Thus, the results shown herein indicate that the radiomic model may serve as a potential non-invasive method to differentiate between csPCa and ciPCa in clinical practice.
FIGURE 5 | Radiomic nomogram to discriminate clinically significant and clinically insignificant prostate cancer. The radiomic nomogram was built on the training group, with the rad-score and ADC value. For example, for a 74-year-old prostate cancer patient with an ADC value of 800 × 10⁻⁶ mm²/s and a radiomic signature score of 2, the total number of points was 100 (30 + 70), and the risk of clinically significant prostate cancer was determined to be 90%. ADC, apparent diffusion coefficient.
Recently, radiomics has been successfully applied in oncology and extended to PCa identification and evaluation (15,23-25). Chen et al. compared a radiomic-based model with PI-RADS v2 scores in differentiating and grading PCa (15). Their results suggested that radiomic models offered high diagnostic accuracy and outperformed the corresponding PI-RADS v2 scores. Min et al. investigated an mp-MRI-based radiomic signature for identifying csPCa with an AUC of 0.823 in the validation group (18). The AUC of the radiomic signature for predicting csPCa was 0.86 (internal validation group) and 0.81 (external validation group) in our study, which differed from the result reported by Min et al. The difference may be explained by differences in research populations and patient selection criteria. In addition, our study considered the ADC value, PSA level, MR-T stage, and age. These parameters were included as they are of great importance in differentiating csPCa in clinical settings. The nomogram constructed from the retained predictors may provide an individualized evaluation of csPCa. Our results suggested that the radiomic nomogram had high efficacy for predicting csPCa in the training group and in the internal and external validation groups (AUC = 0.95, 0.93, and 0.84, respectively).
In the present study, a total of 1,188 radiomic features were extracted from T2WI, DWI, and ADC imaging. In total, 10 radiomic features were selected. Of these, nine radiomic features were derived from DWI and ADC imaging, including six texture features, two form factor features, and one histogram feature. Most of the radiomic features selected in this study were texture features, which describe the statistical correlation between neighboring voxels with similar (or dissimilar) contrast values (26). This indicated that the radiomic signature has prebiopsy potential for differentiating between csPCa and ciPCa.
The ADC value was the only independent risk factor identified among all clinical risk factors. The performance of the radiomic signature and that of the ADC value were both high and comparable in the validation group in our study. This is consistent with a recent report on radiomic machine learning, which showed similar results (27). It may result from the fundamental nature of DWI and ADC, which can markedly reflect PCa pathological status in the peripheral zone. Indeed, most PCa lesions lay in the peripheral zone in our study. DWI, and more specifically ADC, has been regarded as the most powerful sequence of prostate MR, especially in the peripheral zone (28). ADC values have been suggested to be reproducible quantitative markers for evaluating PCa aggressiveness (29,30). It is worth noting that the PSA level, which is widely used in PCa detection, was not a significant factor in the differentiation of csPCa, which led to its exclusion from model development. A likely explanation is that the PSA level is specific to prostate tissue but not to PCa lesions. Another explanation may be related to nuances in the data set or confounding by other risk factors. The MR-T stage, which demonstrated the highest odds ratio, was also excluded from the predictive model in our study. This finding is probably associated with the degree of extension of csPCa lesions. When csPCa lesions did not invade extracapsular tissues, such as the neurovascular bundle, seminal vesicles, and distal sphincter, the MR-T stage was assigned as T2. Accordingly, the MR-T stage of all ciPCa patients was assigned as T2.
The numbers of csPCa and ciPCa patients differed (120 vs. 39) in the present study. This inter-group imbalance may introduce bias into the radiomic signature built in the training group, which would affect the prediction capability of the radiomic signature in the validation group. To reduce the effect of the imbalance, the SMOTE algorithm was applied when constructing the radiomic model. However, the performance in the training and validation groups was still in agreement with our original data and sample size. The quality assurance of the MRI scanner should also be addressed. The present material spanned up to 3 years, so consistent imaging quality of the MRI scanner was essential to maintaining rigor over the long duration of this study. Therefore, the quality assurance maintenance records of the MRI scanner were reviewed and approved.
Several limitations of the current study should be noted. First, the current study has a small sample size and is a retrospective study from two centers. Therefore, large sample sizes from multiple centers are necessary to validate our primary findings. Second, systematic biopsy, rather than the whole-mount pathological specimen, was used as the pathological standard. The experienced radiologists made every effort to match the MRI lesion to the pathological site. It is unrealistic to expect that all of our subjects would have a whole-mount pathological specimen, especially ciPCa patients. Moreover, patients with a lesion diameter of less than 5 mm on mp-MRI images were excluded because we could not outline the PCa region during MRI segmentation. This may cause patient selection bias. Although our methodological strategies have a few limitations, we hold the view that they supply ample support for the principal findings of our study.
In conclusion, this study presents a radiomic nomogram that incorporates both the radiomic signature and clinical risk factors for discriminating csPCa from ciPCa. The nomogram, incorporating radiomic signature and ADC value, provided an individualized, potential approach for discriminating csPCa from ciPCa. Further studies with large sample sizes from multiple centers are necessary to validate our primary results. With further investigation, it is possible that this radiomic nomogram may aid clinicians in determining prebiopsy and pre-treatment risk stratification for PCa.

DATA AVAILABILITY STATEMENT
The datasets generated for this study are available on request to the corresponding author.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by The Ethics Committee of the Guangxing Hospital Affiliated to Zhejiang Chinese Medical University. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.