Vertebral Bone Marrow Heterogeneity Using Texture Analysis of Chemical Shift Encoding-Based MRI: Variations in Age, Sex, and Anatomical Location

Objective: Vertebral bone marrow composition has been extensively studied in the past and shown potential as imaging biomarker for osteoporosis, hematopoietic, and metabolic disorders. However, beyond quantitative assessment of bone marrow fat, little is known about its heterogeneity. Therefore, we investigated bone marrow heterogeneity of the lumbar spine using texture analysis of chemical-shift-encoding (CSE-MRI) based proton density fat fraction (PDFF) maps and its association with age, sex, and anatomical location. Methods: One hundred and fifty-six healthy subjects were scanned (age range: 20–29 years, 12/30 males/females; 30–39, 15/9; 40–49, 5/13; 50–59, 9/27; ≥60: 9/27). A sagittal 8-echo 3D spoiled-gradient-echo sequence at 3T was used for CSE-MRI-based water-fat separation at the lumbar spine. Manual segmentation of vertebral bodies L1-4 was performed. Mean PDFF and texture features (global: variance, skewness, kurtosis; second-order: energy, entropy, contrast, homogeneity, correlation, sum-average, variance, dissimilarity) were extracted at each vertebral level and compared between age groups, sex, and anatomical location. Results: Mean PDFF significantly increased from L1 to L4 (35.89 ± 11.66 to 39.52 ± 11.18%, p = 0.017) and with age (females: 27.19 ± 6.01 to 49.34 ± 7.75%, p < 0.001; males: 31.97 ± 7.96 to 41.83 ± 7.03 %, p = 0.025), but showed no difference between females and males after adjustment for age and BMI (37.13 ± 11.63 vs. 37.17 ± 8.67%; p = 0.199). Bone marrow heterogeneity assessed by texture analysis, in contrast to PDFF, was significantly higher in females compared to males after adjustment for age and BMI (namely contrast and dissimilarity; p < 0.031), demonstrated age-dependent differences, in particular in females (p < 0.05), but showed no statistically significant dependence on vertebral location. Conclusion: Vertebral bone marrow heterogeneity, assessed by texture analysis of PDFF maps, is primarily dependent on sex and age but not on anatomical location. Future studies are needed to investigate bone marrow heterogeneity with regard to aging and disease.


INTRODUCTION
The non-mineralized bone compartment can be divided into red bone marrow which features higher vascularization and cellularity and is essential for hematopoiesis, and yellow bone marrow which has a higher fat content and important metabolic and endocrine functions (1,2). The two bone marrow components can be quantitatively assessed using chemical shift encoding-based water-fat magnetic resonance imaging (CSE-MRI) and magnetic resonance spectroscopy (MRS). These two techniques allow for quantification of the proton density fat fraction (PDFF) while MRS additionally has been shown to enable the quantification of marrow fat composition (3)(4)(5).
Variations in bone marrow composition, assessed by quantitative MRI, have been demonstrated to be associated with a variety of diseases, such as osteoporosis, hematopoietic disorders, and metabolic disorders, such as obesity and type 2 diabetes mellitus (6)(7)(8)(9). In osteoporosis, bone mineral density (BMD) is reduced and a negative correlation of BMD with bone marrow fat fraction as well as a positive correlation with bone marrow unsaturation has been established (10)(11)(12)(13)(14). Thus, MR-based quantification of vertebral bone marrow composition has been proposed as imaging biomarker for osteoporosis and fracture risk assessment (3,14,15). In addition, vertebral bone marrow composition has been recommended as an advanced surrogate marker for hematopoietic and metabolic disorders such as multiple myeloma and metabolic syndrome (16)(17)(18)(19)(20).
Previous studies showed that PDFF is a valid biomarker for fatty conversion of vertebral bone marrow and demonstrated a cranio-caudal increase in fat content for both adults and children (21,22). An accelerated age-dependent fatty conversion of vertebral bone marrow was shown in females compared to males, particularly pronounced after menopause (23).
However, there are methods that allow quantitative assessment of bone marrow beyond PDFF and fatty acid unsaturation. Burian et al. showed that CSE-MRI can be used for the assessment of vertebral bone marrow heterogeneity using texture analysis (24).
The texture of an image or three-dimensional imaging dataset is characterized by the spatial arrangement of voxels with different intensities. Texture analysis intends to quantify features Abbreviations: ANOVA, analysis of variance; BMD, bone mineral density; BMI, body mass index; CT, computed tomography; FOV, field of view; GLCM, graylevel co-occurrence matrix; L1, first lumbar vertebra; L4, fourth lumbar vertebra; MITK, Medical Imaging Interaction Toolkit; MRI, magnetic resonance imaging; MRS, magnetic resonance spectroscopy; PDFF, proton density fat fraction; PET, positron emission tomography; SD, standard deviation; T1, longitudinal relaxation time; T2, transverse relaxation time; T2 * , effective transverse relaxation time; TE, echo time; TE, echo time step; TR, repetition time.
reflecting repetitive patterns in the image as a function of the spatial gray-level variation. First-order, or global, features provide information related to the gray-level distribution of the image without carrying any information about the relative positions of the gray-levels within the image. This information is obtained from second-order features which can be extracted from the gray-level co-occurrence matrix (GLCM) as described by Haralick et al. (25). Texture features have previously been applied in different biomedical imaging modalities, e.g., for computed tomography (CT)-based analysis of bone microstructure, CTbased fracture risk assessment, and diagnostic support in mammography (26)(27)(28), in multi-modal oncologic imaging analysis (29)(30)(31) as well as for MRI-based computer-aided diagnosis and classification of vertebral compression fractures (32,33). In a recent preliminary study, CSE-MRI-based texture analysis demonstrated an increased vertebral bone marrow heterogeneity after menopause and the second-order features contrast and dissimilarity were shown to differentiate postfrom premenopausal equally well as PDFF (24). However, this previous study only included a relatively small number of female subjects and vertebral bone marrow heterogeneity has not been characterized in greater detail.
Therefore, the purpose of our study was to further investigate the heterogeneity of the lumbar vertebral bone marrow with regard to anatomical location, sex, and age in a larger cohort using texture analysis of CSE-MRI-based PDFF maps.

Subjects
Written informed consent was obtained from all subjects before participation in the study which was approved by the local institutional committee for human research. Healthy subjects older than 20 years of age were included in this study as outlined previously (23). The following exclusion criteria were applied: history of pathological bone changes such as hematological or metabolic bone disorders other than osteoporosis, history of diabetes, and contraindications for MRI. In total, 156 healthy subjects were recruited: age range 20-29 years (age group 1), 12/30 males/females; 30-39 years (age group 2), 15/9 males/females; 40-49 years (age group 3), 5/13 males/females; 50-59 years (age group 4), 9/27 males/females; 60 years and older (age group 5): 9/27 males/females.

MR Imaging
All subjects were scanned using the same 3T MRI system (Ingenia, Philips Healthcare, Best, The Netherlands). For chemical shift-encoding based water-fat separation at the lumbar spine, a sagittal eight-echo 3D spoiled gradient echo sequence was performed using the built-in-the-table posterior coil elements (12-channel array). The eight echoes of this sequence are acquired in a single repetition time (TR) using non-flyback (bipolar) read-out gradients and the following imaging parameters: TR/TE1/ TE = 11/1.4/1.1 ms, field of view (FOV) = 220 × 220 × 80 mm, acquisition matrix size = 224 × 224 × 20, acquisition voxel size = 0.98 mm × 0.98 mm × 4.00 mm, receiver bandwidth = 1,527 Hz/pixel, frequency direction = anterior-posterior (to minimize breathing artifacts), 1 average, scan time = 1 min and 17 s. A flip angle of 3 • was used to minimize T1-bias effects (34,35).

Vertebral Bone Marrow Fat Quantification
The fat quantification routine of the MRI system vendor was used for on-line processing of the gradient echo imaging data including the following steps: first, phase error correction; second, complex-based water-fat decomposition using a precalibrated seven-peak fat spectrum and a single T2 * to model the signal variation with echo time (36,37). The imaging-based PDFF map was computed as the ratio of the fat signal over the sum of fat and water signals. Figure 1 shows representative PDFF maps. The vertebral bodies L1 to L4 were included in the analysis and manually segmented by a radiologist with 3 years of experience in spine imaging (Figure 2). The segmentations did not include Schmorl nodes or cortical bone but comprised the basivertebral veins which could be considered as a potential source of quantification bias. However, the segmentations are consistent with previous publications (21,23). In order to avoid the inclusion of degenerative alterations, such as Modic changes, endplate-associated sclerotic changes were excluded. Segmentation was performed on the PDFF maps by using the free open-source software Medical Imaging Interaction Toolkit (MITK), developed by the Division of Medical and Biological Informatics, German Cancer Research Center, Heidelberg, Germany (www.mitk.org). PDFF values were calculated individually for each segmented vertebral level as well as for the entire segmentation from L1 to L4 to obtain average values for each subject.

Texture Analysis
Texture analysis was performed on the PDFF maps of the segmented vertebral bodies. Three global features (variance, skewness, kurtosis) and the following eight second-order features were extracted: Energy, entropy, contrast, homogeneity, and TABLE 1 | Mean and standard deviation (SD) of subject characteristics, proton density fat fraction (PDFF), and analyzed texture parameters, averaged from L1 to L4 and grouped by sex (0: male, n = 50; 1: female, n = 106). Differences of PDFF and texture parameters between male and female subjects, adjusted for age and BMI, were compared using a general linear model (*indicates p < 0.05).

Sex
Frontiers in Endocrinology | www.frontiersin.org correlation were calculated according to (25), variance and sumaverage according to (38) and dissimilarity according to (39). Global features were extracted from intensity histograms. The ideal number of bins used in the histogram analysis was calculated by a combination of different methods (40)(41)(42).
Second-order features were extracted using gray-level GLCM analysis (25). As a preprocessing step, gray level quantization of the PDFF maps was performed to prevent sparseness by normalizing the image intensities using 100 equally sized bins and the minimum and maximum gray levels present, corresponding to values of 0 and 100%, respectively.
GLCM is obtained by computing the joint probability of two adjacent voxel intensities at a given offset d = (dx, dy, dz) and angular directions θ = (0, 45, 90, and 135 • ). dx, dy, and dz denote the displacement along the x-, y-, and z-axis, respectively.
For 3D-GLCM analysis, the co-occurrence probabilities of voxel intensities were computed from the 26 neighbors, aligned in 13 directions taking into account discretization length differences. The mean value of the features computed from the 13 directions ensures the rotation invariance. Image preprocessing, including isotropic resampling, gray level uniform quantization and texture analysis were performed using MATLAB 2017 (MathWorks Inc., Natick, MA, USA) and a radiomics toolbox (https://github.com/mvallieres/ radiomics/) (29-31).

Statistical Analysis
The statistical analyses were performed with SPSS (SPSS Inc., Chicago, IL, USA). All tests were done using a two-sided 0.05 level of significance.
The Kolmogorov-Smirnov test indicated normally distributed data for the majority of parameters.
Mean and standard deviation (SD) of PDFF and texture features were calculated for L1 to L4 and a one-way analysis of variance (ANOVA) was performed to test for any significant differences between the four vertebral levels. Bonferroni-adjusted post-hoc analysis was used for pairwise comparisons between the four vertebral levels.
Furthermore, mean and SD of PDFF and texture features, averaged from L1 to L4, were calculated and compared with respect to sex, using a general linear model to adjust for age and body mass index (BMI).
Lastly, mean and SD of PDFF and texture features, averaged from L1 to L4, were calculated for the five age groups and a oneway ANOVA was performed to test for any significant differences between the age groups. This was done separately for female and male subjects. Bonferroni-adjusted post-hoc analysis was used for pairwise comparisons between the five age groups.

Study Population
The study population consisted of 100 female subjects with an age of 46.6 ± 17.1 years and 56 male subjects with an age of 42.9 ± 15.5 years. The age difference between females and males was not significant (p > 0.05). BMI was significantly (p < 0.05) different between females and males ( Table 1).

PDFF Measurements
Mean PDFF for all subjects combined showed a statistically significant difference depending on vertebral location (p < 0.05), increasing from 35.89 ± 11.66% in L1 to 39.52 ± 11.18% in L4 (Table 2, Figure 3). Significant differences between consecutive vertebral levels are indicated with * in Figure 3.

Texture Analysis
Combining all subjects, the global texture features variance, skewness, and kurtosis showed statistically significant anatomical variations from L1 to L4 (p < 0.05). However, only variance and kurtosis showed a consistent increasing trend from L1 to L4. None of the second-order texture features showed a significant anatomical variation from L1 to L4 (p > 0.05) (Table 2, Figure 3). Significant differences between consecutive vertebral levels are indicated with * in Figures 4A,B. Averaged from L1 to L4 and adjusted for age and BMI, the global features variance and skewness as well as the secondorder features contrast, dissimilarity and homogeneity showed statistically significant differences between female and male subjects ( Table 1).
In female subjects, all texture features, averaged from L1 to L4, showed statistically significant differences between age groups (p < 0.05). A consistent trend was shown for the global feature kurtosis (decreasing with age) and the second-order features energy (decreasing with age), entropy (increasing with age), correlation (increasing with age), and variance (increasing with age) (Table 3A, Figure 4A). In male subjects, all texture features, averaged from L1 to L4, except for the global feature skewness and the second-order feature variance showed statistically significant differences between age groups (p < 0.05). However, a consistent trend was shown only for the second-order feature correlation (increasing with age) (Table 3B, Figure 4B). Significant differences between consecutive age groups are indicated with * in Figures 4A,B.

DISCUSSION
In the present study, we observed that vertebral bone marrow heterogeneity, assessed by texture analysis of PDFF maps, exhibits sex-specific differences. Significant differences between age groups were found in a considerable subset of the analyzed texture features, particularly in women. In contrast to PDFF, bone marrow heterogeneity shows no gradient along the lumbar spine.
In 1973, Haralick et al. (25) described the computation of 14 textural features containing information about characteristics, such as homogeneity, contrast, linear structure, or complexity, on photo-micrographs and satellite images and applied them for image-classification. Since then, texture features have been increasingly used based on a variety of medical imaging modalities, with a strong focus on oncologic applications (27,(29)(30)(31)(43)(44)(45). Shape, texture and statistical features of T1weighted MR images of the lumbar spine have also been shown to improve diagnostic accuracy and classification results of vertebral compression fractures (32,33). Furthermore, texture analysis has been reported as a reproducible tool for the quantitative assessment of fatty infiltration of paraspinal muscles in patients suffering from lumbar spinal stenosis based on axial T2-weighted MR images (46). In this study, we performed texture analysis of vertebral bone marrow on CSE-MRI-based PDFF maps, an approach which has been shown to be feasible by Burian et al. (24). We took a texture feature processing tool which used positron emission tomography (PET) and non-quantitative MR images of the extremities as well as the head and neck region in the context of radiomics models and tailored it to be applied to PDFF maps of vertebral bone marrow which, by themselves  are already quantitative. The voxels of the PDFF maps assume gray-level values corresponding to a range between 0 and 100%. We normalized all image intensities to this same range using the same gray-level quantization for each dataset. Furthermore, the same MRI protocol and scan parameters and resolution were used for each subject ensuring comparable signal-to-noise ratios throughout the scans. As a result, these measures ensure good comparability of the extracted features among the, compared to previous studies, relatively large number of subjects (24,47).
In this larger-scale study, we found that bone marrow heterogeneity, assessed by the second-order textural features contrast and dissimilarity, is significantly higher in females compared to males, while PDFF itself showed no significant difference between females and males averaged over all age groups. We also found a sex-dependent statistical significant difference in the second-order textural feature homogeneity. However, it was only marginal and can be considered negligible. One potential pathophysiologic explanation for the revealed sexdependent difference may involve the conversion process from red to yellow marrow. According to Scheller et al. vertebral bone marrow adipose tissue is highly regulated and actively involved in hematopoiesis and skeletal remodeling (1). Several human and animal studies found sex-dependent differences regarding the relationship between marrow fat and bone which were potentially attributed to differences in sex hormone levels (6,13,(48)(49)(50). So it stands to reason that the revealed sex-dependent differences in bone marrow heterogeneity may also be due to the effects of higher estrogen and testosterone levels in males compared to females, particularly in older subjects. Our results could suggest that, beyond the known sex-dependent differences in bone marrow fat content, there are also differences in the conversion process from red to yellow marrow. One potential confounder could be degenerative alterations of the vertebral bodies. Females are known to have a higher prevalence of Modic changes than males which could result in higher heterogeneity measurements (51). However, more severe visible Modic changes were excluded in the segmentations of the vertebral bodies, minimizing this potentially confounding effect.
In addition, our results showed an age dependence of some textural features suggesting an increase in vertebral bone marrow heterogeneity with age. While textural features of bone marrow have not been investigated with regard to age before, heterogeneity shows a similar behavior to PDFF increasing with age in our study, confirming previous results in children and adults (22,23).
Previous studies have also demonstrated an increase in PDFF from cervical to lumbar vertebral bodies showing that conversion from red to yellow marrow is occurring first at lower vertebral levels (21)(22)(23). This could be confirmed in our study. However, in contrast to increasing PDFF from L1-4, textural features did not show a spatial gradient further supporting the hypothesis of Burian et al. of an anatomically homogeneous fatty conversion of bone marrow along the spine (24).
To the best of our knowledge, the present work is the first study investigating the association of bone marrow heterogeneity with sex and age. We found an association of bone marrow heterogeneity with sex and age, admittedly with varying degrees of statistical significance. The clinical relevance of these findings is still unclear. Therefore, an important and logical next step is to investigate these connections in subjects suffering from musculoskeletal and metabolic diseases affecting bone marrow structure and composition, such as osteoporosis and type 2 diabetes (3, 6-9, 15).
Our study is not without limitations. First, the number of female subjects was considerably higher than the number of male subjects, and some of the age groups comprised only a relatively small number of subjects. To more reliably characterize the revealed sex-and age-dependent differences in bone marrow textural features, the inclusion of additional subjects resulting in a more balanced sex and age distribution would be desirable.
Second, the present study has a cross-sectional design with the inherent limitations. However, it could be helpful in designing a future longitudinal study, in particular for the characterization of changes in bone marrow heterogeneity related to age and diseases of high clinical interest.

CONCLUSION
In the present study, we found that texture features of vertebral bone marrow, assessed by analysis of CSE-MRI-based PDFF maps of the lumbar spine, exhibit sex-, and age-specific differences. In contrast to PDFF, bone marrow shows no significant anatomical variation regarding textural features along the lumbar spine. Hence, vertebral bone marrow heterogeneity seems to be primarily sex-and age-dependent. Further studies are needed to investigate bone marrow heterogeneity during aging and disease progression in order to investigate whether texture analysis of bone marrow can improve the clinical impact of quantitative MRI of the spine.

DATA AVAILABILITY STATEMENT
The data supporting the conclusions of this article will be made available by the authors at reasonable request.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Ethikkommision der Technischen Universität München. The participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
MD: concept of study design, subject recruitment, data acquisition, data post-processing, statistical analysis, and drafting the manuscript. DJ: subject recruitment, data acquisition, and critical revision of manuscript. SR: data acquisition and critical revision of manuscript. MM and KS: data post-processing and critical revision of manuscript. EB: subject recruitment and critical revision of manuscript. NS: critical revision of manuscript. JK: critical revision of manuscript and project supervision. DK: subject recruitment, data post-processing, and critical revision of manuscript. TB: concept of study design, subject recruitment, data post-processing, statistical analysis, drafting the manuscript, and leading role in critical revision of manuscript.