Radiomic analysis of the proximal femur in osteoporosis women using 3T MRI

Introduction Osteoporosis (OP) results in weak bone and can ultimately lead to fracture. MRI assessment of bone structure and microarchitecture has been proposed as a method to assess bone quality and fracture risk in vivo. Radiomics provides a framework to analyze the textural information of MR images. The purpose of this study was to analyze radiomic features and their ability to differentiate between subjects with and without prior fragility fracture. Methods MRI acquisition was performed on n = 45 female OP subjects: 15 with fracture history (Fx) and 30 without fracture history (nFx), using a high-resolution 3D Fast Low Angle Shot (FLASH) sequence at 3T. First- and second-order radiomic features were calculated in the trabecular region of the proximal femur on the T1-weighted MRI signal of a matched dataset. The significance of each feature’s predictive ability was measured using the Wilcoxon test and area under the ROC curve (AUROC) analysis. The features were correlated with DXA and FRAX scores. Results A set of three independent radiomic features (Dependence Non-Uniformity (DNU), Low Gray Level Emphasis (LGLE) and Kurtosis) showed significant ability to predict fragility fracture (AUROC DNU = 0.751, p < 0.05; AUROC LGLE = 0.729, p < 0.05; AUROC Kurtosis = 0.718, p < 0.05) with low to moderate correlation with FRAX and DXA. Conclusion Radiomic features can measure bone health in MRI of the proximal femur and have the potential to predict fracture.


Introduction
Osteoporosis (OP) is a disease of increased fracture risk (Fx) caused by reduced bone mass and microarchitectural deterioration of bone tissue. The main consequence of OP is fragility fracture. In 2014, 432,000 hospital admissions, 2.5 million hospital visits and 180,000 nursing home admissions in the USA were attributed to osteoporosis (1). Hip fracture accounts for 72% of all osteoporosis-based fracture costs (1) and results in a mortality as high as 21% in the first year after fracture (2). Given the aging U.S. and world population and the prevalence of osteoporosis in older individuals, the cost of osteoporosis and fracture care is only expected to increase, further burdening societal healthcare systems (3).
To mitigate the effects of osteoporosis-related fracture, a major step is to improve the quantification of fragility fracture risk so that a relevant clinical or pharmaceutical intervention can be made (4). The most common standard-of-care methods to quantify fracture risk are areal bone mineral density (BMD) and trabecular bone score (TBS), both calculated using dual-energy x-ray absorptiometry (DXA), and the Fracture Risk Assessment Tool (FRAX), a clinical-outcome-based 10-year fracture risk predictor. BMD is the WHO reference standard method used to diagnose osteoporosis (5). However, more than 50% of all fragility fracture cases occur in subjects who do not meet the BMD criterion for osteoporosis (T-score < −2.5). This indicates that BMD has low sensitivity to diagnose osteoporosis and does not completely capture fracture risk. TBS was developed to estimate bone microarchitectural information. However, it is limited to the lumbar spine, is computed from a 2D projection of trabecular bone microarchitecture (TBA), and does not capture 3D information. FRAX, on the other hand, considers clinical factors (age, gender, BMI, parental fracture history) (6) in its prediction of fracture risk. It is combined with other bone health measures like BMD, which capture bone density but no microarchitectural properties of the bone (6,7).
Magnetic resonance imaging (MRI) allows quantitative assessment of TBA. It was first described two decades ago in the distal radius and calcaneus and more recently has been described in the proximal femur (8-12). MRI of TBA consists of depicting the unique 3-D network, size, and shape of individual submillimeter trabeculae. This requires resolution to be on the order of the size of trabeculae. The proximal femur is relatively deep in the human body, which makes high-resolution imaging more challenging because SNR decays quickly as the distance from the receive coil increases. Only recently, through pulse sequence and coil optimization, has the femur been made accessible for trabecular bone analysis.
In addition, MRI of TBA is not widely available since it requires specialized analysis software and intense computation.
Radiomics has been used extensively in cancer research (13-19) and pancreatitis detection (19) and provides a way to construct models based on image analysis. Radiomics image analysis software is widely available and may provide another means to quantify bone microarchitecture on MR images, specifically by analyzing texture, shape, and intensity distribution in the region of interest (20,21).
The purpose of our study was to use radiomics to measure textural features in the trabecular bone architecture of the proximal femur, determine their relationship with fracture status, and compare them with FRAX.

Subjects
This prospective, HIPAA compliant study was approved by our institutional review board, and written informed consent was obtained from all subjects. Forty-five postmenopausal women were recruited from our institution with total hip dual-energy x-ray absorptiometry (DXA, GE Lunar, Rahmay, NJ) results consistent with osteoporosis (femoral neck or total hip BMD T-scores of less than −2.5); 15 had radiographically confirmed fragility fractures and 30 did not have a fracture. Fragility fracture was defined as a low-energy fracture due to trauma from a fall from standing height or less. Some subjects had more than one fragility fracture. The types of fragility fractures included major osteoporotic fractures of the wrist (n = 3), spine (n = 2), elbow (n = 3), rib (n = 3), metatarsal (n = 2) and distal radius (n = 2). The median time since fragility fracture was 13 months. All subjects were able to ambulate without limitation. FRAX score was computed according to the standard method (https://www.sheffield.ac.uk/FRAX/tool), considering patient race and with/without BMD (total hip BMD T-score = −2.26 ± 0.65; femoral neck BMD T-score = −2.52 ± 0.64). Subjects were divided into two age-matched (age > 40 years) groups: with history of fragility fractures (Fx, n = 15) and without (nFx, n = 30).

Magnetic resonance imaging
The non-dominant proximal femur of each subject was scanned on a 3 T MRI scanner (SKYRA system, Siemens Healthcare) using a 26-element receive-coil setup (18 elements from a body matrix coil anteriorly and eight elements from a spine coil posteriorly). The coil was wrapped around the hip and secured by sandbags laterally and a velcro strap. We used a 3-dimensional (3D) fast low-angle shot (FLASH) sequence with the following scan parameters: repetition time (TR)/echo time (TE) = 37 ms/4.92 ms, in-plane voxel size = 0.234 mm × 0.234 mm, slice thickness = 1.5 mm, 60 coronal slices, bandwidth = 130 Hz/pixel, parallel acceleration [generalized autocalibrating partially parallel acquisition (GRAPPA)] factor = 2, and acquisition time = 15 min 18 s. The imaging parameters were chosen in order to have the smallest voxel size possible while maintaining high enough SNR to visualize trabeculae and, most importantly, perform the image analysis (minimum SNR of ∼10-15 required).

Segmentation
Figure 1 illustrates a typical acquisition of the proximal femur and the segmented trabecular region used for analysis. The segmentation of the proximal femur was conducted by an expert, who manually delineated the trabecular border of bone on MR images using the FireVoxel software package (NYU Center for Advanced Imaging Innovation and Research, New York, USA; https://wp.nyu.edu/firevoxel/downloads/). This expert operated under the direct supervision and guidance of a musculoskeletal radiologist. Subsequently, the region of interest was resampled to an isotropic resolution of 1 × 1 × 1 mm³ using 3rd order B-spline interpolation.
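The resampling step can be illustrated with a short sketch. It is not the pipeline used in the study: it assumes SciPy's `ndimage.zoom` as the cubic B-spline interpolator, an array axis order of (row, column, slice), and a random array standing in for a segmented region of interest. The function name `resample_isotropic` is hypothetical.

```python
import numpy as np
from scipy import ndimage

# Acquired voxel size in mm, taken from the scan parameters in the text.
# The (row, column, slice) axis order is an assumption of this sketch.
ACQUIRED_SPACING_MM = (0.234, 0.234, 1.5)
TARGET_SPACING_MM = (1.0, 1.0, 1.0)


def resample_isotropic(volume, spacing=ACQUIRED_SPACING_MM,
                       target=TARGET_SPACING_MM):
    """Resample a 3D volume towards the target spacing with a cubic B-spline."""
    # A zoom factor below 1 shrinks an axis, a factor above 1 stretches it.
    zoom_factors = [s / t for s, t in zip(spacing, target)]
    return ndimage.zoom(np.asarray(volume, dtype=float), zoom_factors, order=3)


# Hypothetical stand-in for an ROI: 100 x 100 pixels in-plane, 10 slices.
roi = np.random.default_rng(0).normal(size=(100, 100, 10))
resampled = resample_isotropic(roi)
print(resampled.shape)
```

Because the output grid is rounded to whole voxels, the resulting spacing is only approximately 1 mm. A production pipeline would also carry the segmentation mask through the same transform, typically with nearest-neighbour interpolation.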

Processing
Radiomic textural features of the trabecular region of the proximal femur were extracted using the PyRadiomics toolbox (22). The features encompassed: (1) first-order textural features such as average, contrast, variance, median, skewness, etc., and (2) second-order features such as Gray Level Co-occurrence Matrix (GLCM) features, Gray Level Run Length Matrix (GLRLM) features, and Gray Level Dependence Matrix (GLDM) features, which gauge relationships between adjacent pixels. For computation, parameters such as the neighborhood radius (d) and the intensity quantization interval (BW) were considered. The radiomic features were computed for d ranging from 1 to 5 and BW ranging from 2 to 16.
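To make the roles of d and BW concrete, the following is a minimal sketch of one first-order feature (kurtosis) and two GLDM features (DNU and LGLE) on a small 2D patch. It is an illustration written from the standard feature definitions, not the PyRadiomics implementation used in the study: it works in 2D on a full rectangle rather than a masked 3D region, and the function names and the dependence threshold `alpha` are assumptions of this sketch.

```python
import numpy as np


def discretize(image, bin_width):
    """Fixed-bin-width quantization (BW); gray levels start at 1."""
    image = np.asarray(image, dtype=float)
    bins = np.floor(image / bin_width) - np.floor(image.min() / bin_width)
    return bins.astype(int) + 1


def kurtosis(values):
    """Non-excess kurtosis: fourth central moment over squared variance."""
    x = np.asarray(values, dtype=float).ravel()
    centered = x - x.mean()
    return float(np.mean(centered ** 4) / np.mean(centered ** 2) ** 2)


def gldm(levels, d=1, alpha=0):
    """P[i-1, j] = number of pixels with gray level i that have j
    neighbours within radius d whose level differs by at most alpha."""
    levels = np.asarray(levels)
    n_rows, n_cols = levels.shape
    n_neighbours = (2 * d + 1) ** 2 - 1
    P = np.zeros((int(levels.max()), n_neighbours + 1), dtype=int)
    for r in range(n_rows):
        for c in range(n_cols):
            count = 0
            for dr in range(-d, d + 1):
                for dc in range(-d, d + 1):
                    rr, cc = r + dr, c + dc
                    if (dr, dc) == (0, 0):
                        continue  # skip the centre pixel itself
                    if not (0 <= rr < n_rows and 0 <= cc < n_cols):
                        continue  # neighbour falls outside the patch
                    if abs(int(levels[rr, cc]) - int(levels[r, c])) <= alpha:
                        count += 1
            P[levels[r, c] - 1, count] += 1
    return P


def dnu(P):
    """Dependence Non-Uniformity: squared column sums over the pixel count."""
    return float((P.sum(axis=0) ** 2).sum() / P.sum())


def lgle(P):
    """Low Gray Level Emphasis: each pixel weighted by 1 / level^2."""
    i = np.arange(1, P.shape[0] + 1, dtype=float)
    return float((P.sum(axis=1) / i ** 2).sum() / P.sum())


# Hypothetical intensities, for illustration only.
patch = np.array([[10, 12, 30], [11, 31, 33], [50, 52, 51]])
levels = discretize(patch, bin_width=8)
P = gldm(levels, d=1)
print(kurtosis(patch), dnu(P), lgle(P))
```

Sweeping `d` over 1 to 5 and `bin_width` over 2 to 16, as described above, amounts to repeating the last three lines for each parameter pair.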

Statistical analysis
The association between radiomic features and bone health was assessed using three analysis methods: 1. Wilcoxon test: Assesses feature separability between the Fx and nFx groups, considering features with p-value < 0.05 as significantly separable. 2. ROC analysis: Evaluates the capability of radiomic features in predicting fragility fractures, measuring the area under the ROC curve (AUROC). The significance of AUROC values was ascertained through the Wilcoxon test. Furthermore, inter-feature correlations among radiomic features were assessed to eliminate redundant features, utilizing Pearson correlation with a significance threshold of p-value < 0.001.
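The link between the Wilcoxon rank-sum test and the AUROC can be sketched in a few lines: the AUROC of a single feature equals the rank-sum U statistic divided by the number of Fx/nFx pairs. The sketch below uses only the standard library; the function names are hypothetical, the p-value uses a normal approximation without tie or continuity correction (which may differ from the exact procedure used in the study), and the feature values are made up.

```python
import math


def auroc(fx_values, nfx_values):
    """Fraction of (Fx, nFx) pairs in which the Fx value is higher.

    Ties count as half a win, so the result equals U / (n_fx * n_nfx).
    """
    wins = 0.0
    for a in fx_values:
        for b in nfx_values:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(fx_values) * len(nfx_values))


def rank_sum_p_value(fx_values, nfx_values):
    """Two-sided p-value from the normal approximation to U."""
    n1, n2 = len(fx_values), len(nfx_values)
    u = auroc(fx_values, nfx_values) * n1 * n2
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical feature values, for illustration only.
fx = [3, 4, 5]
nfx = [1, 2, 3]
print(auroc(fx, nfx), rank_sum_p_value(fx, nfx))
```

For a feature that is lower in the Fx group, the value computed this way falls below 0.5; reporting 1 − AUROC (or negating the feature) gives the equivalent discriminatory ability.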

Results

Subject demographics
Demographic data are presented in Table 1. Age, weight, height, BMI and T-scores did not significantly differ between Fx and nFx patients.

Feature selection
Features with significant relationships between radiomic textural features and bone health are presented with their associated abbreviations in Table 2. These features are discriminative of the Fx and nFx groups or are significantly correlated with FRAX and BMD. The parameters d and BW were optimized for each radiomic feature to produce the most predictive measure of bone health.

Feature discrimination
Table 3 shows the AUROC values of the radiomic parameters determined in Table 2, and of the clinical parameters, for discriminating between subjects with and without fragility fracture.
Radiomic features could discriminate between Fx and nFx with AUROC values ranging from 0.687 to 0.751 (p-value < 0.05). The non-uniformity features DNU, SZNU and RLNU showed high discriminatory ability (AUROC > 0.7; p-value < 0.014); first-order features such as Kurtosis showed significant discriminatory ability (AUROC = 0.718; p-value = 0.0183). Furthermore, GLDM and GLRLM features including LGLE, LDLGLE and LRLGLE, which measure the emphasis on low gray level pixels and the large accumulation of similar pixel intensities within a neighborhood in the trabecular ROI, showed significant discriminatory ability (AUROC > 0.7; p-value < 0.024) between the Fx and nFx groups.
FRAX scores could discriminate between the Fx and nFx groups with AUROC values ranging from 0.687 to 0.745. T-scores could not discriminate between the Fx and nFx groups, which was expected since there was no significant difference in T-scores between the groups. DNU and SZNU had AUROC values comparable to those of FRAX scores.

Relationship between radiomic features and clinical parameters
Correlations between the radiomic features and clinical parameters are presented in Table 4. Non-uniformity features such as DNU, SZNU and RLNU demonstrated a significant weak-to-moderate positive correlation with FRAX (ρ = 0.299-0.444). DNU, SZNU, RLNU and GLNU demonstrated a moderate positive correlation with the FRAX + BMD measure (ρ = 0.461, 0.484, 0.531 and 0.527, respectively). First-order features such as kurtosis showed no correlation with age, FRAX and T-scores. Moreover, GLDM and GLRLM features such as LGLE, LDLGLE and LRLGLE showed no significant correlation with any clinical metrics or FRAX scores.

Correlation between features
Figure 2 shows the Pearson correlation between radiomic features. Among the features that showed high ability to predict fragility fracture were SZNU, DNU, and RLNU (the coarseness features), and these were significantly and highly correlated with each other. LGLE, LDLGLE, MP and LRLGLE, which also showed significant ability to predict fragility fracture, were likewise significantly and highly correlated with each other.

Discussion/conclusion
In conclusion, MRI-based radiomics can discriminate between OP women with and without fragility fractures. In our analysis, we identified DNU, LGLE and Kurtosis as three features of interest since they have the highest AUROC, very low correlation with other radiomic features, and weak correlation with BMD and FRAX scores, suggesting that they could be used as novel imaging biomarkers for bone health that provide complementary information to each other and to DXA and FRAX.
In the T1-contrast MRI acquisitions of the trabecular region of the proximal femur used in this study, subjects with fragility fracture showed lower kurtosis and LGLE values and higher DNU values compared to subjects without fracture. DNU and LGLE are local second-order features. DNU measures nonuniformity in the interdependence of pixels within the trabecular region of interest in the proximal femur. LGLE measures the emphasis of low-intensity pixels within the trabecular region of interest. Kurtosis is a global first-order feature. High kurtosis values imply that in the region of interest there is a comparatively large number of pixel values towards the extremes, while a low kurtosis value implies that the distribution is concentrated in a peak near the mean. MRI of microarchitecture images the trabeculae indirectly, since it relies on the contrast between the signal from bone marrow fat, which has not fully relaxed, and the signal from trabeculae, which has fully relaxed. The GLDM features that we found of interest are defined on low-intensity voxels, notably their dependence (LGLE) and their uniformity (DNU). They may correspond to a measure of the trabecular bone network since in our images fat appears hyperintense and bone hypointense.
One of the technical considerations in our study was the MRI image resolution, specifically the slice-direction resolution being much lower than the in-plane resolution. Differences in resolution can potentially influence the granularity of the features extracted and might play a role in the robustness of the radiomic analysis. In the context of bone health and fracture risk assessment, where subtle variations in the trabecular structure can be critical, the resolution of the MRI images can be a determining factor. While our study utilized high-resolution MR images of the proximal femur to ensure detail preservation, it is imperative for future research to investigate the direct impact of varying resolutions, especially slice-direction resolution, on radiomic feature extraction and subsequent analyses. Such investigations would provide clearer insights into the optimal imaging parameters for robust radiomic analyses in osteoporosis assessments.
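For reference, the three features of interest discussed above can be written compactly. The expressions below are a sketch that assumes the standard definitions of first-order and GLDM features (to our understanding, the same ones documented for PyRadiomics); the source does not restate them, and the symbols are introduced here for illustration. X(k) is the intensity of voxel k among the N_p voxels of the region of interest, P(i, j) is the number of voxels with discretized gray level i that have j dependent neighbours, N_g is the number of gray levels, N_d is the number of dependence sizes, and N_z is the sum of P(i, j) over all i and j.

```latex
\[
\mathrm{Kurtosis} =
  \frac{\frac{1}{N_p}\sum_{k=1}^{N_p}\bigl(X(k)-\bar{X}\bigr)^{4}}
       {\Bigl(\frac{1}{N_p}\sum_{k=1}^{N_p}\bigl(X(k)-\bar{X}\bigr)^{2}\Bigr)^{2}}
\]
\[
\mathrm{LGLE} = \frac{1}{N_z}\sum_{i=1}^{N_g}\sum_{j=1}^{N_d}\frac{P(i,j)}{i^{2}}
\]
\[
\mathrm{DNU} = \frac{1}{N_z}\sum_{j=1}^{N_d}\Bigl(\sum_{i=1}^{N_g}P(i,j)\Bigr)^{2}
\]
```

Under these definitions, LGLE grows when more voxels fall in the lowest gray levels (the 1/i² weight), and DNU grows when voxels are concentrated in a few dependence sizes rather than spread evenly across them.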
In recent years, several studies have investigated the use of radiomics in oncology to analyse tumors. Few studies have investigated the use of radiomics for bone health assessment. In the lumbar spine, a multi-contrast approach was evaluated using both T1- and T2-weighted MRI (23) to detect osteoporosis compared to osteopenic and control subjects. They found similar AUROC values of 0.73 using T1-weighted images, 0.734 using T2-weighted images, and 0.769 when using both T1- and T2-weighted images. Another study used opportunistic abdominal CT to retrospectively compute 41 radiomic features in the proximal femur of 500 patients (24) and found an AUROC of 0.96 for predicting OP status. More recent studies combine radiomic features computed using CT and MRI scans of the lumbar spine (25). They notably used chemical shift encoded MRI methods to separate bone marrow from MR images. They showed that the use of additional radiomic features can provide a better differentiation between OP patients with and without vertebral fracture (47% of the variance in osteoporotic vertebral fracture was explained by the model when it was based on BMD and bone marrow measurement only, compared to 81% of the variance in fracture when adding textural features to the model).
To the best of our knowledge, our study is the first to analyze the relationship between high-resolution MR-based radiomic features of the proximal femur and osteoporotic fracture status and the relationship between MR-based radiomic features and standard-of-care measures of fracture risk such as FRAX and BMD.
In this study, we decided to perform a univariate analysis. Most studies use multiple combinations of feature selection and machine learning (ML) models over unmatched datasets to arrive at a model that shows high predictive ability. However, while advanced ML models have feature selection capabilities, our study's objective was to provide foundational insights into the underlying radiomic features critical for osteoporotic fracture risk. The distinct features we identified could guide and refine the feature selection process in subsequent ML models, ensuring they are both statistically robust and grounded in domain-specific knowledge. The limitation of such ML-based methods is that high predictivity might be influenced by confounding factors, such as age or BMI. Matching the test and control data has the effect of limiting the size of the dataset. In an earlier stage of this study we used the ML approach described above, but found that only one or two features were selected during feature selection and that individual features showed predictive ability similar to that of models with several features. Features showing significant predictive ability could be identified by curating the dataset, using high-quality MRI images with matched control and test groups. Measuring and analyzing the predictive accuracy of individual features could help to use the dataset more effectively than building a multiparametric machine learning model.
The correlation between radiomics parameters and traditional indicators such as FRAX/BMD is weak to moderate. This suggests that radiomics parameters provide information distinct from that captured by FRAX/BMD. If there were a strong correlation, it would mean the information from both sources overlaps, reducing the need to assess radiomics parameters. However, the observed correlation indicates that radiomics parameters capture different aspects of bone health.
Given this distinction, while FRAX and BMD remain primary tools for osteoporosis screening, radiomics analyses might be used for further risk stratification. This approach could be particularly beneficial when primary screening results are inconclusive.
This study is not without limitations. The first limitation is the relatively small size of the dataset. However, as an initial pilot study, we believe that this is sufficient in size and provides the foundation and evidence for a larger study with more fracture cases and controls. Second, with a larger dataset, we could build machine learning models combining multiple features to predict osteoporotic fracture risk. This would be possible now that we know which types of features are most important after doing the univariate analyses in this study. Third, we do not have clinical imaging or microarchitectural information on these subjects, and in the future it would be important to determine the correlation between radiomic information and microarchitectural parameters or information that could be derived from clinical scans. Finally, this study is limited in that we used a T1-weighted FLASH acquisition for the MRI data. In the future, it would be important to investigate the effect of different types of image acquisition, and to explore MR Fingerprinting (MRF) in more depth as a promising quantitative method to assess fracture risk alongside traditional imaging techniques.
In conclusion, we have shown that MR-based radiomics of the proximal femur, in particular the features DNU, LGLE, and Kurtosis, can discriminate osteoporotic fracture cases from controls and provide different information about fracture risk compared to DXA and FRAX. Larger, longitudinal studies are needed to help determine whether these radiomics parameters could have value in predicting future fracture.

3. Spearman correlation: Determines correlations between radiomic measurements and established clinical and imaging parameters, such as age, BMI, FRAX scores (both overall and specific to the hip, and with or without BMD consideration), and BMD values from DXA (T-scores for the hip and femoral neck regions). A p-value < 0.05 was considered indicative of significance.
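Spearman correlation is the Pearson correlation of the ranks of the two variables, which the following standard-library sketch makes explicit. The function names and the example values are hypothetical, tied values receive their average rank, and the significance test (p-value < 0.05) is left out for brevity.

```python
import math


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical feature values and clinical scores, for illustration only.
feature = [1, 2, 3, 4]
score = [1, 3, 2, 4]
print(spearman(feature, score))
```

Because only ranks enter the calculation, the coefficient is unchanged by any monotonic rescaling of a radiomic feature or clinical score.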

TABLE 1
Demographics and characteristics.

TABLE 3
Separability of feature evaluated through Wilcoxon test and area under the receiver operating characteristic curve (AUROC).

TABLE 4
Spearman correlation between radiomic features and clinical parameters.
Significant correlation values are highlighted in bold (p-value < 0.05).