
Original Research ARTICLE

Front. Oncol., 07 May 2020

Repeatability of Quantitative Imaging Features in Prostate Magnetic Resonance Imaging

Hong Lu1,2, Nestor A. Parra2, Jin Qi2, Kenneth Gage3, Qian Li1, Shuxuan Fan1,2, Sebastian Feuerlein3, Julio Pow-Sang4, Robert Gillies2,3, Jung W. Choi3* and Yoganand Balagurunathan3,4,5*
  • 1Department of Radiology, Tianjin Medical and Cancer Hospital, Tianjin, China
  • 2Departments of Cancer Physiology, H. Lee Moffitt Cancer Center, Tampa, FL, United States
  • 3Departments of Diagnostic Imaging, H. Lee Moffitt Cancer Center, Tampa, FL, United States
  • 4Departments of Genitourinary Oncology, H. Lee Moffitt Cancer Center, Tampa, FL, United States
  • 5Departments of Bioinformatics & Biostatistics, H. Lee Moffitt Cancer Center, Tampa, FL, United States

Background: Multiparametric magnetic resonance imaging (mpMRI) has emerged as a non-invasive modality to diagnose and monitor prostate cancer. Quantitative metrics on the regions of abnormality have been shown to be useful descriptors for discriminating clinically significant cancers. In this study, we evaluate the reproducibility of quantitative imaging features using repeated mpMRI on the same patients.

Methods: We retrospectively obtained the deidentified records of patients who underwent two mpMRI scans, the second within 2 weeks of the baseline scan. The deidentified records (including imaging) were obtained through The Cancer Imaging Archive (TCIA) repository and analyzed in our institution under an institutional review board–approved, Health Insurance Portability and Accountability Act–compliant retrospective study protocol. Indicated biopsied regions were used as a marker for our study radiologists to delineate the regions of interest. We extracted 307 quantitative features in each mpMRI modality [T2-weighted MR sequence image (T2w) and apparent diffusion coefficient (ADC) with b values of 0 and 1,400 s/mm²] across the two sequential scans. Concordance correlation coefficients (CCCs) were computed on the features extracted from sequential scans. Redundant features were removed by computing the coefficient of determination (R2) among them; each intercorrelated group was replaced with the single feature that had the highest dynamic range within the group.

Results: We assessed the reproducibility of quantitative imaging features among sequential scans and found that habitat region characterization improves repeatability in ADC maps. There were 19 T2w features and two ADC features in radiologist-drawn regions (native raw image), compared to 18 T2w and 15 ADC features in habitat regions (sphere), which were reproducible (CCC ≥0.65) and non-redundant (R2 ≥ 0.99). We also found that z-transformation of the images prior to feature extraction reduced the number of reproducible features with no detrimental effect.

Conclusion: We have shown that there are quantitative imaging features that are reproducible across sequential prostate mpMRI acquisition at a preset level of filters. We also found that a habitat approach improves feature repeatability in ADC. A validated set of reproducible image features in mpMRI will allow us to develop clinically useful disease risk stratification, enabling the possibility of using imaging as a surrogate to invasive biopsies.


Introduction

Multiparametric magnetic resonance imaging (mpMRI) has been gaining consensus in the community for prostate cancer detection because of its superior lesion sensitivity compared to transrectal ultrasound (TRUS) imaging (1, 2). Multiparametric MRI modalities have been useful in estimating size, volume, and relation to the underlying pathology of prostate cancer (3). Improvements in imaging technologies coupled with advances in mpMRI have led to its combined use with TRUS to guide prostate biopsies, which improves detection of clinically aggressive cancers (4). Most clinical diagnoses follow a consensus reporting standard with the adoption of the Prostate Imaging Reporting and Data System (PI-RADS v2) (5), which provides qualitative guidelines for clinical assessment. Variability in mpMRI scan interpretations among radiologists can in part be attributed to the steep learning curve required to interpret the scans (6). Quantitative imaging metrics, or radiomics, have been used to distinguish clinical abnormalities found in medical imaging (7, 8). For example, radiomics has been shown to be both reproducible in lung cancer computed tomography imaging and prognostic of lung cancer patient survival (9, 10). Recently, quantitative imaging features obtained from tumor regions on prostate mpMRI scans have been shown both to predict clinically aggressive disease (11) and to improve PI-RADS performance (12). In a recent survey on the role of imaging biomarkers in clinical decision making, the European Organization for Research and Treatment of Cancer and Cancer Research UK released a consensus statement with key recommendations to accelerate clinical biomarker translation (13). The key component of the consensus statement emphasizes the importance of validating the repeatability and reproducibility of these biomarkers for information extraction (as part of a technical assay) and for proper downstream clinical utilization.

Repeatability and reproducibility are necessary, but not sufficient, conditions for clinical usage of imaging biomarkers, as there is a higher relevance requirement such as accurate cancer prediction and prognosis (14, 15). As mpMRI has no biological reference for derived image intensity values, several studies have proposed standardizing these values (16–18). Recently, there have been efforts to find repeatable quantitative (radiomic) features in mpMRI scans of various cancers, such as rectal (19), cervix (20), lacrimal gland (21), and prostate (22, 23). Notably, in the prostate (23) and cervical (20) studies, enrolled patients were scanned in a test–retest setting. Quantification of regions of interest has been accomplished in various ways, either through the use of a few open source tools (24) or, more commonly, through custom implementation methods. Recently, there has been an initiative to standardize definitions of these quantitative metrics, as recommended by the Image Biomarker Standardization Initiative (IBSI) (25). In our study, we obtained test–retest deidentified prostate mpMRI studies from patients enrolled at the Brigham and Women's Hospital, shared in a public repository (26). Patients with pathologically verified lesions were scored by a clinical pathologist (Gleason score). Independently marked regions of interest were standardized and quantified using custom radiomic features that followed the IBSI consensus criteria (25). We investigated the feasibility of reproducing these features across the cohort for a diverse set of prostate lesions. We also propose a habitat-based approach that converges regions of interest, followed by lesion characterization to improve repeatability of image features. This work will provide the basis for using repeatable quantitative features in prognostic evaluation of prostate cancer patients.

Materials and Methods

We obtained deidentified mpMRI patient images along with segmentation masks (Dicom-Seg) through The Cancer Imaging Archive (TCIA) collection titled “QIN-PROSTATE-Repeatability”; the cohort is described in detail in (26).

The patients were accrued for a research study at Brigham and Women's Hospital, Harvard Medical School. The requirement for informed consent was waived, and the deidentified records were analyzed through our institutional review board–approved, Health Insurance Portability and Accountability Act–compliant retrospective study protocol. The original study collection had 15 treatment-naive men who had mpMRI scans and biopsy-confirmed pathology and were scanned again within 2 weeks of their baseline scan; patients did not receive any interim treatment during this period. The cohort had 11 patients with a standard template biopsy and four patients who had suspicion of prostate cancer based on their clinical record. The mpMRI scans had T2w axial images [repetition time (TR) 3,350–5,109 ms, time to echo (TE) 84–107 ms, field of view (FOV) 140–200 mm] and an ADC map derived from the diffusion-weighted MRI (b = 0, 1,400 s/mm², TR 2,500–8,250 ms, TE 76–80 ms, FOV 760–280 mm). Figure 1 illustrates sample lesions delineated on the test and retest T2w axial images.


Figure 1. Screen capture of mpMRI prostate scan (A) with radiologist-marked lesion shown for the baseline and follow-up scans in T2w, (B) habitat converged with a sphere (15 mm), and (C) habitat (≤ median) (in cyan) for a lesion (in red) shown for test and retest (along the rows) in T2w and ADC (along the columns).

Our study radiologist (H.L.) read the patient scans, localized lesions within the regions identified by the prior study, and provided consensus region segmentation in consultation with the second study radiologist (J.Q.). A third radiologist (K.G.) provided a random overread. Our radiologists agreed in consensus to use 13 of the 15 patient mpMRIs; scans from two patients were dropped because of disagreements in identifying the abnormality and suboptimal scan quality. Among the converged patients, our radiologists identified in consensus 15 tumor lesions that were anatomically matched, longitudinally, across the test and retest images. Of these, 11 lesion boundaries were verified to match the prior study at the same anatomical location that had been pathologically verified (Gleason score); four additional lesion abnormalities were identified by our study radiologists and matched longitudinally. Table 1 provides patient clinical details, including subject identifier, lesion anatomical location, prostate-specific antigen (PSA) value, and pathological diagnostic (Gleason) score. Newly identified lesions without corroborating pathological findings were marked as not available (NA) by our study radiologists.


Table 1. Summary of patient scans with clinical diagnosis for the biopsies.

Segmentation and Feature Extraction

Our study radiologists used MIM™ PACS [MIM Software Inc. (Cleveland, OH, USA)] to delineate regions of interest three-dimensionally (3D) on the prostate mpMRI scans, using T2w images as the reference sequence. Boundaries of lesions whose cancer status had been pathologically identified by the prior study were independently marked on the test and retest scans. Four additional abnormalities that appeared radiologically malignant were identified and anatomically matched in sequential time points by our radiologists, but these lesions did not have pathological assessment. All lesion boundary segmentation was carried out as consensus reads by the study radiologists. Independent boundary delineation of lesions in the test and retest scans not only reflects the real clinical situation but also introduces boundary variations, which can increase variability in the computed quantitative features.

Using the MIM libraries, T2w and ADC sequences were coregistered to avoid any motion artifacts in acquisition between the modalities. The registered multimodal image sequences were exported as 3D image matrices along with segmentation masks. We developed custom radiomic feature extraction tools, whose feature definition and formulation followed the IBSI consensus recommendations (25). We extracted 307 quantitative imaging features in the converged region of interest, which fall into three broad groups: C1: size and shape (45 features), C2: intensity, co-occurrence, run length (107 features), and C3: texture, Laws and wavelets (155 features); see Supplemental Tables S1–S3.

Standardization of Image Regions

To assess the role of standardization procedures on feature stability in test–retest imaging, we used conventional z-score standardization. We started by segmenting the prostate gland in 3D; the region voxels were then standardized by subtracting the mean and dividing by the standard deviation obtained at the gland level. The standardization was carried out independently for each modality (T2w and ADC) at the patient level. The lesion region of interest is thus standardized at the scan level, so its intensities are expressed relative to the entire gland for that patient scan.
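As an illustration of gland-level z-scoring, the following minimal Python sketch standardizes a flattened list of voxel intensities using the mean and standard deviation of the gland voxels only. The function name `zscore_by_gland`, the toy intensities, and the use of the population standard deviation are assumptions made for this example; the study's own extraction tools are not shown in the text.

```python
from statistics import mean, pstdev

def zscore_by_gland(voxels, gland_mask):
    """Standardize every voxel with the mean and SD of the gland voxels.

    voxels: flat list of intensities for one modality of one scan.
    gland_mask: flat list of booleans, True where the voxel lies in the gland.
    Assumption: population SD (pstdev); the text does not say which estimator was used.
    """
    gland_values = [v for v, inside in zip(voxels, gland_mask) if inside]
    mu = mean(gland_values)
    sigma = pstdev(gland_values)
    if sigma == 0:
        raise ValueError("gland region has zero intensity variance")
    return [(v - mu) / sigma for v in voxels]

# Toy example: four gland voxels and one bright voxel outside the gland.
t2w = [100.0, 120.0, 140.0, 160.0, 400.0]
gland = [True, True, True, True, False]
standardized = zscore_by_gland(t2w, gland)
```

After this step the gland voxels have zero mean and unit standard deviation, so lesion intensities are read relative to the surrounding gland of the same scan rather than on the scanner's arbitrary scale.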

Habitat Image Region

We intend to find aggressive tumor-like regions in a marked lesion boundary of interest, which we call a habitat region. We define this region as one with restricted diffusion, whose characteristics resemble malignancy. We converge on a habitat region in two different ways: (a) a sphere around the lesion and (b) a converged region within the lesion. To find such a region, we first consider the entire lesion in 3D on a colocalized volume across modalities. In the first case (a), we increase the search space to a 3D sphere with a fixed diameter of 15 mm and converge on a restricted diffusion region based on ADC values, using a threshold defined by the distributional deviation (27, 28) and constraining the region to lie within the prostate gland structure. In the second case (b), we find the most contiguous region at or below the median ADC value within the radiologist-marked lesion region of interest. The converged habitat region is mapped back to each modality of interest (T2w and ADC), and quantitative features are computed on the newly defined boundary. In the first case, it is possible to obtain a region larger than that marked by the radiologists. In the second case, the habitat region is always contained within the marked lesion.
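One possible reading of the within-lesion habitat (case b) is sketched below in Python: keep lesion voxels whose ADC is at or below the lesion median, then retain the largest connected component. The function name `habitat_within_lesion`, the dictionary-based voxel layout, the 6-connectivity rule, and the "largest component" interpretation of "most contiguous" are all assumptions for illustration, not the authors' implementation.

```python
from collections import deque
from statistics import median

def habitat_within_lesion(adc, lesion):
    """Return the largest 6-connected set of lesion voxels with ADC <= lesion median.

    adc: dict mapping (z, y, x) voxel coordinates to ADC values.
    lesion: set of (z, y, x) coordinates inside the radiologist-marked lesion.
    """
    cutoff = median(adc[v] for v in lesion)
    low = {v for v in lesion if adc[v] <= cutoff}
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    best, seen = set(), set()
    for start in sorted(low):
        if start in seen:
            continue
        # Breadth-first search collects one connected component of low-ADC voxels.
        component, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            z, y, x = queue.popleft()
            for dz, dy, dx in steps:
                nb = (z + dz, y + dy, x + dx)
                if nb in low and nb not in seen:
                    seen.add(nb)
                    component.add(nb)
                    queue.append(nb)
        if len(component) > len(best):
            best = component
    return best

# Toy single-slice lesion (3 x 3 voxels) with made-up ADC values.
rows = [[0.8, 0.9, 1.5],
        [0.7, 1.6, 1.4],
        [1.7, 1.8, 0.6]]
adc_map = {(0, y, x): rows[y][x] for y in range(3) for x in range(3)}
lesion_voxels = set(adc_map)
habitat = habitat_within_lesion(adc_map, lesion_voxels)
```

By construction the result is a subset of the marked lesion, which matches the property stated for case (b); the sphere-based case (a) would need the gland mask and the distribution-based threshold, which are not specified in enough detail here to sketch.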

Concordant Features

Quantitative features that are reproducible in repeated experiments and can describe differential physiology are a necessary step for consideration as biomarkers. We therefore evaluated which feature values are consistent between the test and retest experiments. For each image feature, the concordance correlation coefficient (CCC) was computed to quantify reproducibility between the two scans of each patient across the cohort, independently in each modality (T2w, ADC). The CCC measures deviation from the diagonal (identity) line averaged over samples in the cohort and is commonly used to measure fidelity in repeated experiments (25). On this set of highly reproducible features, the next step was to select the features with a large interpatient variability, measured using the dynamic range (DR) metric. The normalized DR for a feature was defined as one minus the ratio of the average test–retest difference to the observed interpatient variability, or biological range:
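For readers unfamiliar with the CCC, a minimal Python sketch of Lin's concordance correlation coefficient is given here. It uses the standard definition, 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), with 1/n moments; the function name `concordance_cc` and the toy values are assumptions, and the exact estimator used in the study is not stated.

```python
def concordance_cc(test, retest):
    """Lin's concordance correlation coefficient between paired measurements.

    Uses population (1/n) variance and covariance; this choice is an assumption.
    """
    n = len(test)
    if n != len(retest) or n < 2:
        raise ValueError("need two equally sized samples with at least 2 values")
    mean_t = sum(test) / n
    mean_r = sum(retest) / n
    var_t = sum((t - mean_t) ** 2 for t in test) / n
    var_r = sum((r - mean_r) ** 2 for r in retest) / n
    cov = sum((t - mean_t) * (r - mean_r) for t, r in zip(test, retest)) / n
    return 2 * cov / (var_t + var_r + (mean_t - mean_r) ** 2)

# Toy feature measured on four lesions; the retest is shifted by a constant 1.0.
feature_test = [1.0, 2.0, 3.0, 4.0]
feature_retest = [2.0, 3.0, 4.0, 5.0]
ccc_shifted = concordance_cc(feature_test, feature_retest)
ccc_identical = concordance_cc(feature_test, feature_test)
```

Unlike the Pearson correlation, which would be 1 for the shifted pair, the CCC penalizes the systematic offset between test and retest, which is why it suits repeatability screening.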

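$$\mathrm{DR} = 1 - \frac{1}{n}\sum_{i=1}^{n}\frac{\left|\mathrm{Test}(i) - \mathrm{Retest}(i)\right|}{\mathrm{Max} - \mathrm{Min}} \qquad (1)$$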

where n is the total number of patient cases; the DR varies from 0 to 1. Values close to 1 are preferred and imply that the feature has a large relative biological range, limited by the diversity in the cohort. As the variation between test and retest feature values increases, the DR decreases. Screening for a large DR eliminates features that show greater variability in the repeat scans compared to their range of coverage. It is critical that a clinically relevant feature have a large DR to adequately distinguish the variations between tumor types, while showing minimal variability in describing the same tumor type.
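Equation 1 can be evaluated with a few lines of Python, as in the sketch below. The function name `dynamic_range` is hypothetical, and taking Max and Min over the pooled test and retest values is an assumption, since the text only describes them as the observed biological range.

```python
def dynamic_range(test, retest):
    """Normalized dynamic range: 1 - mean(|test - retest|) / (Max - Min).

    Assumption: Max and Min are taken over all test and retest values pooled.
    """
    n = len(test)
    if n != len(retest) or n == 0:
        raise ValueError("need two equally sized, non-empty samples")
    pooled = list(test) + list(retest)
    span = max(pooled) - min(pooled)
    if span == 0:
        raise ValueError("feature is constant across the cohort")
    mean_abs_diff = sum(abs(t - r) for t, r in zip(test, retest)) / n
    return 1 - mean_abs_diff / span

# Toy feature: four lesions, retest shifted by 1.0 over an observed range of 4.0.
dr_value = dynamic_range([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
```

In this toy case the average test–retest difference is one quarter of the observed range, giving a DR of 0.75, which would pass the 0.65 level used in the study.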

Redundancy Reduction

We propose to eliminate redundancies in features that are found to be reproducible. We computed the coefficient of determination (R2) between the features that are considered to be reproducible, which measures the level of dependency between features. The R2 ranges from 0 to 1 and is the ratio of the variance explained by a linear model to the total variance between two variables or features, where one is the outcome and the other is used to form the predictor. Values close to 1 mean that the data points are close to the fitted line (i.e., closer to dependency) (24, 25). The coefficient of determination of a simple regression is equal to the square of the Pearson correlation coefficient (29, 30). The features were grouped based on the R2 values between them; within each group, the one representative with the highest DR was picked. The procedure was repeated recursively to cover all the features. The purpose of this filter step is to eliminate redundancies, not necessarily to identify independence. The test–retest values were averaged before computing the R2. Features whose R2 with another feature exceeded the cutoff were combined into one group, and we repeated this process for a range of cutoffs (0.95–0.99) in our study.
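The grouping procedure can be approximated by a simple greedy pass, sketched below in Python: features are visited in order of decreasing DR, each visited feature becomes a representative, and every remaining feature whose R2 with it reaches the cutoff is dropped. The function names, the toy feature names and values, and the greedy ordering are assumptions for illustration; the text does not specify how the groups were formed.

```python
def pearson_r2(a, b):
    """Squared Pearson correlation, equal to R2 of a simple linear regression."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    saa = sum((x - mean_a) ** 2 for x in a)
    sbb = sum((y - mean_b) ** 2 for y in b)
    sab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    if saa == 0 or sbb == 0:
        return 0.0  # a constant feature carries no linear dependence information
    return (sab * sab) / (saa * sbb)

def remove_redundant(features, dr, cutoff=0.99):
    """Keep one representative (highest DR) per group of intercorrelated features.

    features: dict of name -> per-lesion values (test and retest already averaged).
    dr: dict of name -> dynamic range for that feature.
    """
    remaining = sorted(features, key=lambda name: dr[name], reverse=True)
    kept = []
    while remaining:
        representative = remaining.pop(0)
        kept.append(representative)
        remaining = [name for name in remaining
                     if pearson_r2(features[representative], features[name]) < cutoff]
    return kept

# Toy data: "volume_scaled" is an exact multiple of "volume" and should be dropped.
toy_features = {
    "volume": [1.0, 2.0, 3.0, 4.0],
    "volume_scaled": [2.0, 4.0, 6.0, 8.0],
    "entropy": [4.0, 1.0, 3.0, 2.0],
}
toy_dr = {"volume": 0.9, "volume_scaled": 0.8, "entropy": 0.7}
selected = remove_redundant(toy_features, toy_dr, cutoff=0.99)
```

Lowering the cutoff (for example to 0.95) merges more features into each group and therefore leaves a shorter list, which is the trade-off explored across the 0.95–0.99 range.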


Results

As described in the Materials and Methods, the lesion was independently delineated in the test and retest mpMRI scans, with each delineation done in consensus between the study radiologists. Using the lesion boundary as reference, the habitat region was converged automatically. We define habitat as a contiguous region colocalized to a low diffusion region defined by the ADC map. We then standardized the image voxels using z-score prior to any computations; in addition, we contrasted our findings with a non-standardized (raw) image region. In total, four image regions were investigated (raw-radiologist, z-score radiologist, raw-habitat, z-score habitat) by computing 307 quantitative image features in each of the regions, independently in test and retest images. We computed the CCC to find repeatable image features, followed by application of a DR filter. Additionally, redundant features were removed based on the coefficient of determination between the feature sets, repeated at different cutoffs. The distribution of features at different level settings of concordance correlation (CCC) and DR across the patient cohort is shown in Table 2. The imaging features that were extracted for the respective modalities are listed in Table 3, obtained with R2 ≥ 0.99 (CCC and DR ≥0.65), and in Supplemental Tables 4,5, obtained with R2 ≥ 0.95 (CCC and DR ≥0.65). Figure 2 shows the distribution of CCC and DR for features extracted using different boundary regions: radiologist delineated (R), habitat converged (H), and habitat within the manually delineated region (H50).


Table 2. Distribution of quantitative imaging features at various levels of concordance and redundancy limits (Rsq at ≥0.99 and ≥0.95) for regions identified by (A) radiologist marked and (B) habitats converged (sphere), (C) habitat converged (≤ median, ADC map).


Table 3. Radiomic features that show concordance and non-redundancy in the test–retest cohort (CCC and DR ≥ 0.65;Rsq ≥ 0.99) for (A) radiologist-marked region, (B) habitat region (sphere, 15 mm), (C) habitat (≤ median, ADC).


Figure 2. Repeatability of quantitative features across different lesion boundaries. (A) Concordance coefficient, (B) dynamic range.

In our analysis, we find similar distributions of features between T2w raw (native intensity values) radiologist-marked regions (19 features, CCC ≥0.65) and T2w-habitat with sphere regions (18 features, CCC ≥0.65), with standardized T2w z-score habitat regions (23 features, CCC ≥0.65) showing more stable features compared to T2w z-score raw regions. There were 12 stable features in T2w and 10 in T2w z-normalized regions, both evaluated at CCC ≥0.65 and with redundancy R2 ≥ 0.99 (Tables 2B,C, 3B,C). Of the 19 features that are stable in T2w radiologist-marked regions, two are volume features that measure within a certain intensity range and the other 17 are texture features.

In ADC map images, there were two features found within radiologist-marked regions compared to three features in ADC z-score regions, both evaluated at CCC ≥0.65 and with redundancy R2 ≥ 0.99. Using ADC-sphere–based habitats, the number of stable features increased to 15 in the ADC-habitat, seen in both radiologist-marked and z-score normalized regions. With the habitat-within-lesion approach, the new region was restricted to lie within the lesion. Here we find one stable feature in each of the ADC and ADC z-normalized regions; in these regions, five features and one feature were concordant, respectively (see Supplemental Tables). It seems that z-score standardization moderately helps to improve the number of repeatable features in ADC maps.

Figures 3, 4 show the distribution of concordance coefficient and DR metric values, computed on features, respectively. They are grouped into the following broad categories: size and shape (C1), intensity and co-occurrence (C2), and laws and wavelets (C3). Texture features in the C2 intensity and co-occurrence category show higher concordance compared to other categories of features in T2w. The features computed in ADC map do not show any consistent trend. It is also interesting to note that features in size and shape categories show lower concordance values.


Figure 3. Concordance of quantitative features across feature subgroup. (A) T2w, (B) T2w z-normalized, (C) ADC, (D) ADC z-normalized.


Figure 4. Dynamic range of quantitative features across feature subgroup. (A) T2w, (B) T2w z-normalized, (C) ADC, (D) ADC z-normalized.

While the ADC map shows intensity statistics to be reproducible, the z-score region shows reproducible co-occurrence matrix features. In the ADC map, the habitat region using the sphere approach shows more than eight features related to fine texture (Laws) and two features from the shape category, whereas the T2-habitat (sphere) shows more features from the co-occurrence and neighborhood gray tone difference categories. In the region converged by habitat within lesions, the ADC map shows one feature related to gray level that is stable and non-redundant. The T2-habitat (within lesion) shows 12 features that are related to texture: neighborhood gray tone, co-occurrence, and wavelet based. The z-score standardization in the T2-habitat (within lesion) region shows features related to gray-level intensities, co-occurrence, and neighborhood gray tone that are reproducible and stable.


Discussion

Clinically relevant imaging biomarkers are expected to be repeatable in a test–retest patient cohort, reproducible across centers, and relevant to describing the tumor physiology across different conditions. For imaging features to be used as biomarkers, it is essential that they be repeatable at an acceptable level, which depends on the current imaging technology. In our study, we obtained prostate patient mpMRI scans acquired within 2 weeks of the baseline time point, and we believe that the cohort is a unique public data set in prostate cancer. While we understand that the cohort size may be small for drawing elaborate inferences, the methods applied in our study nonetheless allow us to assess feature stability and generate potential biomarkers in prostate mpMRI. We analyzed the repeatability in four different regions: (a) raw, radiologist-drawn; (b) z-score, radiologist-drawn; (c) raw, habitat; (d) z-score, habitat. The study allowed us to contrast the reproducible features under these constraints.

The sphere-based habitat tends to enlarge the capture region, which may provide a larger lesion boundary. This certainly helps to find stable and reproducible features in the ADC map and T2w image region. In comparison, the habitat region formed within the lesion seems to restrict the ADC gray-level intensity range, which helps to find stable features in T2w: more than 21 features show high concordance (CCC ≥0.75), of which 18 features are stable and reproducible (CCC ≥0.85, R2 > 0.99).

We believe that the habitat approach reduces variability in T2w and rather highly variable ADC map images, which typically have lower resolution. There are a number of automated and semiautomated segmentation procedures that could be used in mpMRI, but we restricted our approach to manual, expert radiologist–drawn boundaries to initially delineate the lesions. We used the manual segmented region as an initial seed point for habitat region delineation, which is automatically converged using multimodal sequences (T2w, ADC).

A prior study (23) used an intraclass correlation with a cutoff of 0.85 and reported features related to entropy and inverse difference moments to be highly repeatable. In our study, we find that co-occurrence and neighborhood gray tone difference matrices (NGTDMs) are two feature categories that are repeatable in T2w and T2w z. In ADC maps, intensity statistics features appear stable even in raw intensities (without any standardization), whereas average co-occurrence and short run length gray level emphasis-type features are stable in z-normalized ADC maps. We also find that the habitat (sphere) approach seems to improve the number of repeatable features in ADC maps and in T2w (Figure 2).

In the previously mentioned study, the authors claimed that neither standardization nor prefiltering improved repeatability of image features. In our study, we used the CCC with additional criteria such as DR and redundancy reduction to filter the features. We also find that most size and shape–based features show lower concordance in T2w/T2w z, but a larger spread on the ADC map, in comparison to the other two categories of features (Figure 3). This is probably due to the use of different regional convergence methods coupled with independently defined, delineated lesion boundaries in the test and retest scans. In comparison, the prior study (23) claimed high concordance for features in the size and shape–based category.

Because of scan quality limitations, some of the prior marked regions could not be ascertained by our radiologist, and additional regions of abnormality were located in consensus by the study radiologists. Additionally, prior studies (22, 23) restricted lesions to the peripheral zone, while our study radiologists identified lesions without any zonal restrictions. These differences have certainly increased the feature variability, which could be one cause for a lower number of repeatable features. Nevertheless, our cohort of patients provides a diverse set of lesions that are spread across the gland. The habitat approach proposed in the study shows promise in increasing the number of repeatable imaging features.

Study Limitations

This study provides a unique patient cohort with test–retest scans obtained within 2 weeks of each other; the cohort size is certainly a limiting factor for broader inference. The use of endorectal coils in the acquisition introduced artifacts that could have altered the voxel intensities and influenced the image feature reproducibility. We have made an effort to remove patient scans that show large artifacts and regions that could not be converged in a consensus read. Despite our efforts, there could be a certain level of variation in feature values due to voxel-level changes.


Conclusion

In the current study, we demonstrate that there are quantitative imaging features that can be obtained repeatedly in prostate mpMRI. We show that sublocalized regions, or habitats, can improve repeatability of imaging features, possibly by restricting the range of variations in the voxel intensity levels in these MRI scan modalities. We also find that z-score normalization of the image intensities had minimal effect on the feature reproducibility. Current findings allow us to obtain reproducible and non-redundant sets of image features that could be used for predictive and prognostic purposes.

Data Availability Statement

The dataset used in this study is the TCIA collection “QIN-PROSTATE-Repeatability” (26); additional analysis and study information are included in the article/Supplementary Material.

Ethics Statement

The studies involving human participants were reviewed and approved by USF IRB. The ethics committee waived the requirement of written informed consent for participation.

Author Contributions

HL and YB: Hypothesis, methods development. HL, JQ, QL, KG, and SFa: lesions identification and marking. YB, NP, and HL: results inference, implementation of methods. YB, HL, and JC: manuscript writing. HL, YB, RG, JP-S, JC, JQ, QL, KG, SFa, and SFe: manuscript/results proof read and approval.


Funding

This work was supported by research grants NIH/NCI 1R01CA190105-01 and U01-CA200464 and by the Cohen's Family Donation 2018-19 for Moffitt's Prostate Cancer Research Program.

Conflict of Interest

HL, QL, and SFa received research scholarships that partially supported their salaries from Tianjin Medical and Cancer Hospital, Tianjin, China during their research tenure at Moffitt Cancer Center. RG is an investor and consultant in Health Myne.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.


Acknowledgments

We would like to extend our gratitude to Dr. Federov and his team for reviewing the lesion markings in the shared patient data records. We would like to thank the Moffitt Radiomics team and IRAT core (Dr. Abdalah and his team) for the invaluable discussion and for extending support to this work.

Supplementary Material

The Supplementary Material for this article can be found online at:


References

1. Ahmed HU, El-Shater Bosaily A, Brown LC, Gabe R, Kaplan R, Parmar MK, et al. Diagnostic accuracy of multi-parametric MRI and TRUS biopsy in prostate cancer (PROMIS): a paired validating confirmatory study. Lancet. (2017) 389:815–22. doi: 10.1016/S0140-6736(16)32401-1

2. Brown LC, Ahmed HU, Faria R, El-Shater Bosaily A, Gabe R, Kaplan RS, et al. Multiparametric MRI to improve detection of prostate cancer compared with transrectal ultrasound-guided prostate biopsy alone: the PROMIS study. Health Technol Assess. (2018) 22:1–176. doi: 10.3310/hta22390

3. Kwak JT, Sankineni S, Xu S, Turkbey B, Choyke PL, Pinto PA, et al. Prostate cancer: a correlative study of multiparametric MR imaging and digital histopathology. Radiology. (2017) 285:147–56. doi: 10.1148/radiol.2017160906

4. Siddiqui MM, Rais-Bahrami S, Turkbey B, George AK, Rothwax J, Shakir N, et al. Comparison of MR/ultrasound fusion-guided biopsy with ultrasound-guided biopsy for the diagnosis of prostate cancer. JAMA. (2015) 313:390–7. doi: 10.1001/jama.2014.17942

5. American College of Radiology. Prostate Imaging Reporting and Data System (PI-RADS) version 2. (2015). Available online at: (accessed January 10, 2020).

6. Rosenkrantz AB, Ginocchio LA, Cornfeld D, Froemming AT, Gupta RT, Turkbey B, et al. Interobserver reproducibility of the PI-RADS Version 2 lexicon: a multicenter study of six experienced prostate radiologists. Radiology. (2016) 280:152542. doi: 10.1148/radiol.2016152542

7. Kumar V, Gu Y, Basu S, Berglund A, Eschrich SA, Schabath MB, et al. Radiomics: the process and the challenges. Magn Reson Imaging. (2012) 30:1234–48. doi: 10.1016/j.mri.2012.06.010

8. Lambin P, Rios-Velazquez E, Leijenaar R, Carvalho S, van Stiphout RG, Granton P, et al. Radiomics: extracting more information from medical images using advanced feature analysis. Eur J Cancer. (2012) 48:441–6. doi: 10.1016/j.ejca.2011.11.036

9. Zhao B, Tan Y, Tsai WY, Qi J, Xie C, Lu L, et al. Reproducibility of radiomics for deciphering tumor phenotype with imaging. Sci Rep. (2016) 6:23428. doi: 10.1038/srep23428

10. Balagurunathan Y, Gu Y, Wang H, Kumar V, Grove O, Hawkins S, et al. Reproducibility and prognosis of quantitative features extracted from CT images. Transl Oncol. (2014) 7:72–87. doi: 10.1593/tlo.13844

11. Algohary A, Viswanath S, Shiradkar R, Ghose S, Pahwa S, Moses D, et al. Radiomic features on MRI enable risk categorization of prostate cancer patients on active surveillance: preliminary findings. J Magn Reson Imaging. (2018) 48:818–28. doi: 10.1002/jmri.25983

12. Wang J, Wu CJ, Bao ML, Zhang J, Wang XN, Zhang YD. Machine learning-based analysis of MR radiomics can help to improve the diagnostic performance of PI-RADS v2 in clinically relevant prostate cancer. Eur Radiol. (2017) 27:4082–90. doi: 10.1007/s00330-017-4800-5

13. O'Connor JP, Aboagye EO, Adams JE, Aerts HJ, Barrington SF, Beer AJ, et al. Imaging biomarker roadmap for cancer studies. Nat Rev Clin Oncol. (2017) 14:169–86. doi: 10.1038/nrclinonc.2016.162

14. Liu Y, deSouza NM, Shankar LK, Kauczor H-U, Trattnig S, Collette S, et al. A risk management approach for imaging biomarker-driven clinical trials in oncology. Lancet Oncol. (2015) 16:e622–8. doi: 10.1016/S1470-2045(15)00164-3

15. European Society of Radiology (ESR). White paper on imaging biomarkers. Insights Imaging. (2010) 1:42–5. doi: 10.1007/s13244-010-0025-8

16. Nyul L, Udupa J. On standardizing the MR image intensity scale. Magn Reson Med. (1999) 42:1072–81. doi: 10.1002/(SICI)1522-2594(199912)42:6<1072::AID-MRM11>3.0.CO;2-M

17. De Nunzio G, Cataldo R, Carla A. Robust intensity standardization in brain magnetic resonance images. J Digital Imaging. (2015) 28:727–37. doi: 10.1007/s10278-015-9782-8

18. Robitaille N, Mouiha A, Crépeault B, Valdivia F, Duchesne S, Alzheimer's Disease Neuroimaging Initiative. Tissue-based MRI intensity standardization: application to multicentric datasets. Int J Biomed Imaging. (2012) 2012:347120. doi: 10.1155/2012/347120

19. Traverso A, Kazmierski M, Shi Z, Kalendralis P, Welch M, Nissen HD, et al. Stability of radiomic features of apparent diffusion coefficient (ADC) maps for locally advanced rectal cancer in response to image pre-processing. Physica Medica. (2019) 61:44–51. doi: 10.1016/j.ejmp.2019.04.009

20. Fiset S, Welch ML, Weiss J, Pintilie M, Conway JL, Milosevic M, et al. Repeatability and reproducibility of MRI-based radiomic features in cervical cancer. Radiother Oncol. (2019) 135:107–14. doi: 10.1016/j.radonc.2019.03.001

21. Duron L, Balvay D, Vande Perre S, Bouchouicha A, Savatovsky J, Sadik J-C, et al. Gray-level discretization impacts reproducible MRI radiomics texture features. PLoS ONE. (2019) 14:e0213459. doi: 10.1371/journal.pone.0213459

22. Fedorov A, Vangel MG, Tempany CM, Fennessy FM. Multiparametric magnetic resonance imaging of the prostate: repeatability of volume and apparent diffusion coefficient quantification. Invest Radiol. (2017) 52:538–46. doi: 10.1097/RLI.0000000000000382

23. Schwier M, van Griethuysen J, Vangel MG, Pieper S, Peled S, Tempany C, et al. Repeatability of multiparametric prostate MRI radiomics features. Sci Rep. (2019) 9:9441. doi: 10.1038/s41598-019-45766-z

24. van Griethuysen JJM, Fedorov A, Parmar C, Hosny A, Aucoin N, Narayan V, et al. Computational radiomics system to decode the radiographic phenotype. Cancer Res. (2017) 77:e104–e7. doi: 10.1158/0008-5472.CAN-17-0339

25. IBSI. Image Biomarker Standardisation Initiative. (2019). Available online at: (accessed January 10, 2020).

26. Fedorov A, Schwier M, Clunie D, Herz C, Pieper S, Kikinis R, et al. An annotated test-retest collection of prostate multiparametric MRI. Sci Data. (2018) 5:180281. doi: 10.1038/sdata.2018.281

27. Otsu N. A threshold selection method from gray-level histograms. IEEE Trans Syst Man Cybern. (1979) 9:62–6. doi: 10.1109/TSMC.1979.4310076

28. Sezgin M, Sankur B. Survey over image thresholding techniques and quantitative performance evaluation. J Electron Imaging. (2004) 13:146–65. doi: 10.1117/1.1631315

29. Steel RGD, Torrie JH. Principles and Procedures of Statistics. New York, NY: McGraw-Hill (1960).

30. Cameron AC, Windmeijer FAG. An R-squared measure of goodness of fit for some common nonlinear regression models. J Econom. (1997) 77:1790–2. doi: 10.1016/S0304-4076(96)01818-0

Keywords: radiomics, mpMRI, prostate cancer, test–retest in mpMRI, prostate TRUS-MRI, repeatable MRI features

Citation: Lu H, Parra NA, Qi J, Gage K, Li Q, Fan S, Feuerlein S, Pow-Sang J, Gillies R, Choi JW and Balagurunathan Y (2020) Repeatability of Quantitative Imaging Features in Prostate Magnetic Resonance Imaging. Front. Oncol. 10:551. doi: 10.3389/fonc.2020.00551

Received: 14 October 2019; Accepted: 27 March 2020;
Published: 07 May 2020.

Edited by:

Lei Deng, Jacobi Medical Center, United States

Reviewed by:

Naranamangalam Raghunathan Jagannathan, Chettinad University, India
Satish E. Viswanath, Case Western Reserve University, United States

Copyright © 2020 Lu, Parra, Qi, Gage, Li, Fan, Feuerlein, Pow-Sang, Gillies, Choi and Balagurunathan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Jung W. Choi; Yoganand Balagurunathan