Exploratory Analysis of Qualitative MR Imaging Features for the Differentiation of Glioblastoma and Brain Metastases

Objectives To identify qualitative VASARI (Visually AcceSIble Rembrandt Images) Magnetic Resonance (MR) Imaging features for differentiation of glioblastoma (GBM) and brain metastasis (BM) of different primary tumors. Materials and Methods T1-weighted pre- and post-contrast, T2-weighted, and T2-weighted, fluid attenuated inversion recovery (FLAIR) MR images of a total of 239 lesions from 109 patients with either GBM or BM (breast cancer, non-small cell (NSCLC) adenocarcinoma, NSCLC squamous cell carcinoma, small-cell lung cancer (SCLC)) were included. A set of adapted, qualitative VASARI MR features describing tumor appearance and location was scored (binary; 1 = presence of feature, 0 = absence of feature). Exploratory data analysis was performed on binary scores using a combination of descriptive statistics (proportions with 95% binomial confidence intervals), unsupervised methods and supervised methods including multivariate feature ranking using either repeated fitting or recursive feature elimination with Support Vector Machines (SVMs). Results GBMs were found to involve all lobes of the cerebrum with a fronto-occipital gradient, often affected the corpus callosum (32.4%, 95% CI 19.1–49.2), and showed a strong preference for the right hemisphere (79.4%, 95% CI 63.2–89.7). BMs occurred most frequently in the frontal lobe (35.1%, 95% CI 28.9–41.9) and cerebellum (28.3%, 95% CI 22.6–34.8). The appearance of GBMs was characterized by preference for well-defined non-enhancing tumor margin (100%, 89.8–100), ependymal extension (52.9%, 36.7–68.5) and substantially less enhancing foci than BMs (44.1%, 28.9–60.6 vs. 75.1%, 68.8–80.5). Unsupervised and supervised analyses showed that GBMs are distinctively different from BMs and that this difference is driven by definition of non-enhancing tumor margin, ependymal extension and features describing laterality. 
Differentiation of histological subtypes of BMs was driven by the presence of well-defined enhancing and non-enhancing tumor margins and localization in the vision center. SVM models with optimal hyperparameters led to weighted F1-score of 0.865 for differentiation of GBMs from BMs and weighted F1-score of 0.326 for differentiation of BM subtypes. Conclusion VASARI MR imaging features related to definition of non-enhancing margin, ependymal extension, and tumor localization may serve as potential imaging biomarkers to differentiate GBMs from BMs.


INTRODUCTION
Brain metastases (BMs) are the most common tumors of the central nervous system (1,2). With an incidence of about 5,000 newly diagnosed cases every year in Switzerland (3), they by far exceed primary brain tumors (around 600 newly diagnosed cases per year) (4). Among primary brain tumors, glioblastoma multiforme (GBM), a very malignant form of diffusely infiltrating WHO grade IV astrocytoma, is the most frequent in adults (5).
BMs and GBMs have different growth patterns on a cellular scale: BMs usually present as well-defined spherical lesions that displace adjacent brain tissue without notable infiltration (6), whereas GBMs mostly exhibit invasive growth with infiltration of the surrounding structures, preferentially along white matter tracts (7). Nevertheless, they may be hard to distinguish on MR images if no evident features, such as multiplicity for BM or transhemispheric spread for GBM, are present (8).
The morphology of brain tumors can be characterized in different ways, for example through volumetric segmentation of the tumor compartments (9), i.e. quantitative analysis, or by characterization of qualitative features. Quantitative analysis notably also includes the analysis of heterogeneity of brain tumors, which has been demonstrated to be an important imaging biomarker for differentiation of cancerous tissues in gliomas (10-12). For primary brain tumors, a set of qualitative features has been defined by a group of experienced neuroradiologists from the cancer research community to enable standardized scoring of subjective MR features, which are regularly encountered on routine contrast-enhanced MR images. This set is called the VASARI feature guide [Visually AccesSIble REMBRANDT (The Repository of Molecular Brain Neoplasia Data) Images], and it currently comprises 24 morphologic features, describing the location of the tumor, characteristics of the tumor compartments, and presence of distinct features such as hemorrhage or pial invasion (13-15). Even before VASARI, researchers assessed the utility of such information (16). For example, as early as 2005, Pope et al. studied the relationship between 15 imaging variables and survival in patients with grade III/IV gliomas (17), but the VASARI feature guide allows radiologists to study and report their findings in a standardized way, largely independent of the rater, institution, and approach used (16).
The VASARI features have been employed for different research questions, most commonly prediction of patient survival (18-20) or prediction of tumor progression (21). Most of these studies did not evaluate the predictive quality of VASARI features alone, but in combination with clinical, pathological and/or molecular data. Some used the TCGA (The Cancer Genome Atlas)-GBM dataset (13, 19, 22-24), as it provides easily accessible and ready-to-use imaging data including a variety of additional information about the available patients (25).
Even though most commonly used to describe GBMs, the VASARI features have also been applied to lower grade gliomas (LGG): Hyare et al. tried to predict isocitrate dehydrogenase 1 (IDH1) mutation status (26), Zhou et al. aimed at predicting histological grade and tumor progression as well as mutation status (IDH1 and 1p/19q codeletion) (27) and Lehrer et al. evaluated the relationship between MR tumor characteristics and protein measurements (28).
So far, the VASARI feature set has only been applied to primary brain tumors; its use in brain metastases has not been evaluated. Consequently, this study explores the applicability of the VASARI feature set in patients with BMs. The objective is the identification of a subset of VASARI features for the diagnostic discrimination among GBMs and BMs of different primary tumors.

Study Population
Eligible for this study were patients admitted to the Inselspital Bern between 2000 and 2018 with histologically confirmed diagnosis of one of the following five brain pathologies: 1) brain metastasis (BM) from carcinoma of the breast, 2) BM of non-small cell (NSCLC) adenocarcinoma of the lung, 3) BM of NSCLC squamous cell carcinoma (SCC) of the lung, 4) BM of small cell carcinoma of the lung (SCLC), or 5) GBM. The collective of eligible patients was reviewed for existence of pre-operative MR images.
Initially, 119 patients with histologically confirmed BMs were included. For the BM groups, patients were excluded for the following reasons: one or more of the four required MR sequences were unavailable (n = 23, detailed in Section Imaging Protocol), poor image quality (n = 4), previous tumor resection (n = 1), exclusively extra-axial lesions (n = 6), and an unmanageable number of metastases (>50, n = 1) (cf. Figure 1). For every BM group we included all patients meeting the inclusion and exclusion criteria, up to a maximum of 30 patients per group. In groups with more than 30 patients meeting the criteria, we gave preference to those who underwent imaging more recently, owing to the better imaging quality in recent years. We were able to include a total of 84 BM patients: 30 with BM from carcinoma of the breast, 13 with BM from NSCLC SCC of the lung, 30 with BM from NSCLC adenocarcinoma of the lung, and 11 with BM from SCLC. Twenty-five GBM patients were included from a patient cohort published previously in the context of brain tumor segmentation (29). Lesions which did not show any enhancing tumor component or exhibited a very large proportion of non-enhancing tumor were not considered in our analysis (in total five lesions in four patients).
This led to a total of 109 included patients (84 with BM, 25 with GBM). The research described in this paper took place at the Inselspital Bern, in the context of the trial CATCh, a single-center retrospective cohort study without intervention, using MR images which had been acquired in the process of clinical diagnostics. CATCh has been approved by the local research ethics commission (Kantonale Ethikkomission Bern).

Imaging Protocol
Due to the extensive time span of patient eligibility and images being partly externally acquired, imaging protocols for BMs were highly heterogeneous (parameter values are reported as ranges). MR images were acquired on 1.5 or 3 T MR scanners from Philips Medical Systems, Siemens and GE Medical Systems. Four representative MR sequences were used: T2-weighted (T2), T2-weighted with fluid attenuated inversion recovery (FLAIR), native T1-weighted (T1) and T1-weighted with gadolinium contrast agent (T1c). Sequence parameters were as follows:
- T2: acquired as a T2 SPACE iso-voxel sequence with a slice thickness of 1 mm in sagittal direction or as spin-echo or turbo spin-echo sequence with a slice thickness of 3-6 mm in axial direction, using an echo time (TE) of 13-409 ms and a repetition time (TR) of 438-15,000 ms.
- FLAIR: acquired as T2 SPACE dark fluid iso-voxel sequence with slice thickness 0.9 mm or 1.4 mm in sagittal direction or as FLAIR sequence with slice thickness 3-6 mm in coronal or axial direction, TE 7.4-386 ms, TR 2,000-11,000 ms.
- T1: acquired as gradient echo sequence with a slice thickness of 1 mm in sagittal direction or as spin-echo sequence with a slice thickness of 3-6 mm in axial direction, TE 1.5-17 ms, TR 164-1,910 ms.
- T1c: acquired as gradient echo sequence with a slice thickness of 0.9 or 1 mm in sagittal direction, T1 vibe iso-voxel sequence with a slice thickness of 0.8 or 0.9 mm in transversal direction or as spin-echo sequence with a slice thickness of 3-6 mm in axial direction, with gadolinium enhancement, TE 2.3-17 ms, TR 6.1-2,320 ms.

Visually AcceSIble Rembrandt Image Magnetic Resonance Features
Based on the VASARI MR feature guide, we derived a set of morphological features which were evaluated for a total of 239 individual brain lesions from 109 patients. The defined set comprised localization-based features as well as appearance-based features.
The localization-based features were not treated as mutually exclusive, but every lesion was attributed to multiple of the above-mentioned categories, e.g. right hemisphere, temporal lobe, vision center.
The appearance-based features included:
- definition of contrast-enhancing margin (margin of the contrast-enhancing tumor compartment [CET], strongly T1c hyperintense) and non-enhancing margin (margin of the non-contrast-enhancing tumor compartment [nCET]/peritumoral edema, FLAIR/T2 hyperintense)
- existence of: hemorrhage (defined as T1 hyperintensity visible in both native T1 and T1c images), pial invasion, ependymal invasion
- involvement of cortex
- crossing of brain midline by the CET and nCET
- multiplicity of enhancing foci
We chose these features from the VASARI guide by excluding semi-quantitative measurements (e.g. proportion of compartment 1 to compartment 2) and features that evaluate post-interventional status, as we a priori excluded patients who had already undergone surgical treatment. Furthermore, features that are very rarely observed in GBMs and BMs, such as calvarial remodeling, were omitted.
In order to facilitate the rating of VASARI MR features, the four MR sequences (T1, T1c, T2, FLAIR) were rigidly coregistered using a versor 3D rigid transform optimized using Mattes Mutual Information metric from the Insight Toolkit (ITK) (30). The feature evaluation for all 239 lesions has been consecutively performed by a medical student (AP) and was subsequently confirmed by an experienced, board-certified neuroradiologist with more than 8 years of experience in brain tumor diagnostics (UPK).

Statistical Analysis
The aim of the exploratory data analysis was to identify patterns in the data, which could serve as a basis for hypothesis formulation and subsequent prospective confirmatory analysis. The VASARI features were treated as asymmetric binary attributes (1 = presence or 0 = absence of a feature): the shared presence of a feature in two lesions implies that they are more similar to one another, whereas the shared absence of a feature does not carry the same amount of information. In the first phase, the features were counted, and proportions were computed for each histological type separately.
Proportions were visualized using heatmaps, and 95% binomial confidence intervals for proportions were estimated using the Wilson method (31).
In the second phase, the localization-based and appearance-based MR features were combined for an unsupervised exploratory data analysis. Pairwise differences were measured among binary feature vectors of all lesions and histological subtypes using the Jaccard distance (=1 − Jaccard index). The resulting distance matrix (of dimension 239 × 239) was clustered using agglomerative hierarchical clustering with average linkage. Furthermore, pairwise differences were summarized across all lesions (by averaging) within a given histological tumor type, to yield a distance matrix for the different types. The resulting distance matrix was transformed to an affinity matrix of an undirected graph by computing its entries $a_{i,j} = 1 - d_{i,j}$, with $d_{i,j}$ being the $(i,j)$-th entry of the distance matrix. Finally, the undirected graph was visualized using a spectral layout, which puts nodes with high affinity closer to each other than nodes with low affinity. This procedure was repeated for different VASARI feature subsets. Subsets were generated by exclusion of features if they exhibited proportions of less than X% across all histological tumor types (with X ranging from 15 to 90%, in 15% increments). The rationale for the exclusion is to remove features which do not carry sufficient information for the purpose of tumor type differentiation.
In the third phase, the localization-based and appearance-based MR features were combined for a supervised exploratory data analysis, which included the tumor class label in the computation. Differentiation of BMs from GBMs was formulated as a binary classification problem and differentiation among histological subtypes of BMs as a multi-class problem (four classes). Univariate feature ranking was performed using the Mutual Information between feature values and tumor class labels.
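To illustrate the univariate ranking step, the following minimal sketch computes a plug-in estimate of the mutual information between one binary feature and the class labels, and then ranks features by that score. The exact estimator used in the study is not stated, so this is an assumption; the function name, feature names, and toy data are hypothetical and not taken from the study.

```python
from collections import Counter
from math import log


def mutual_information(feature, labels):
    """Plug-in mutual information (in nats) between two discrete sequences."""
    n = len(feature)
    joint = Counter(zip(feature, labels))  # counts of (feature value, label)
    px = Counter(feature)                  # marginal counts of feature values
    py = Counter(labels)                   # marginal counts of class labels
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log(c * n / (px[x] * py[y]))
    return mi


# Hypothetical toy data: 3 GBM lesions and 5 BM lesions.
labels = ["GBM", "GBM", "GBM", "BM", "BM", "BM", "BM", "BM"]
features = {
    "ependymal_extension": [1, 1, 1, 0, 0, 0, 0, 0],  # separates the classes
    "frontal_lobe": [1, 0, 1, 0, 1, 0, 1, 0],         # weakly informative
}

scores = {name: mutual_information(values, labels)
          for name, values in features.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

A feature that separates the two classes perfectly attains the entropy of the label distribution, which is the maximum possible score, while a feature that is independent of the labels scores zero.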
Multivariate feature ranking was performed by repeatedly fitting a linear soft-margin Support Vector Machine (SVM) to the data. In each iteration, the hyperparameter C [with values (0.001, 0.01, 0.1, 1, 10, 100)] and/or class balancing (on/off) was changed, resulting in a total of 12 models. Stratified 3-fold cross-validation with weighted-F1-score as performance metric was performed to find the optimal hyperparameter setting for the binary discrimination of BMs versus GBMs and for the multi-class problem of discriminating the histological subtypes of BMs. The latter was implemented in a one-vs-all approach. Finally, based on the optimal hyperparameters, feature ranking by recursive feature elimination was performed as an additional multivariate method. Descriptive statistics were computed using R (version 4.0.0) (32); unsupervised and supervised feature analyses were implemented using Python's networkx (version 2.4) and scikit-learn (version 0.22.1) modules.
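For the descriptive phase, the Wilson interval for a binomial proportion can be sketched as follows. This is an illustrative implementation using only the Python standard library, not the R code used in the study. The count of 11 positive lesions out of 34 is an assumption, chosen because it reproduces the reported proportion of 32.4% for corpus callosum involvement in GBMs; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a binomial proportion (no continuity correction)."""
    if n <= 0:
        raise ValueError("n must be positive")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    p = successes / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half), min(1.0, center + half)


# Assumed counts (not stated in the text): 11 of 34 lesions show the feature.
low, high = wilson_interval(11, 34)
print(f"{100 * 11 / 34:.1f}% (95% CI {100 * low:.1f}-{100 * high:.1f})")
```

Under these assumed counts, the interval rounds to 19.1-49.2, in line with the interval reported for that feature. Unlike the simple normal approximation, the Wilson interval stays informative at the boundary: for a feature present in all 34 of 34 lesions, the lower bound is roughly 89.8% rather than 100%.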

Patient Characteristics
Among the 109 patients included in this study are 63 women and 46 men. This asymmetry is explained by the fact that there are only women in the breast cancer group. According to the Swiss Cancer League, there have been only around 50 cases of breast cancer in men per year in the period of 2012-2016 (33). NSCLC SCC presented with an asymmetry as well, occurring in 10 men and three women. The median age for all groups is 63.37 years, with a minimum of 27.13 years (GBM) and a maximum of 81.92 years (breast cancer). For more detailed information, see Table 1.
In Figure 2, an exemplary BM of a NSCLC adenocarcinoma is shown alongside an exemplary GBM case. Evidently both BM and GBM can exhibit contrast-enhancing tumor, central necrosis, and peritumoral edema.

Comparison of Histological Subtypes
In Figure 4, the proportions of localization-based MR features are broken down for the different histological subtypes of BMs. BMs from breast cancer occurred more frequently in the cerebellum (33.3%, 95% CI 24.6-43.4) than any other type of brain tumor. In case they were located supratentorially, they involved the frontal lobe more often than any other lobe or subcortical structure (31.2%, 95% CI 22.7-41.2). BMs from NSCLC adenocarcinoma tumors seemed to have a slight preference for the frontal lobe (34.4%, 95% CI 23.7-47). NSCLC SCC metastases involved the occipital lobe (27.8%, 95% CI 12.5-50.9) and especially the visual system more often

Appearance-Based Magnetic Resonance Features
Glioblastoma Versus Metastasis
Figure 5 shows

Comparison of Histological Subtypes
In Figure 6, the proportions of appearance-based MR features are broken down for the different histological subtypes of BMs.
In general, the differences in appearance-based MR features among BMs of different histological types seemed to be subtle.

Unsupervised Analysis of Combined Magnetic Resonance Features
Based on the previous results, features were incrementally excluded (in 15% steps) if proportions were below a fixed threshold (<15 to <90% across all histological tumor types). In the case of the <15% threshold, the excluded features included "insular", "brainstem", "auditory center", "Wernicke", "Broca", "somatomotor", and "CET crosses midline" (seven features: six localization-based and one appearance-based feature). In Supplementary Figure 1, the result of an agglomerative hierarchical clustering of the reduced feature set (for <15% threshold) is shown. The corresponding mean Jaccard distance among the different primary tumors is shown in Table 2. The NSCLC SCC BMs exhibited the lowest intra-class Jaccard distance (0.508) among all histological subtypes, which indicates that they appear to be more homogeneous. The distance matrix for a given threshold (e.g. <15%) can be transformed to an affinity matrix of an undirected graph and visualized using a spectral layout (Figure 7). We can observe that overall GBM and SCLC appear to be very different from the remaining three histological subtypes. The same observation can be made if all available features are used (see Figure 7, outer left).
With an increasing threshold, the nodes of NSCLC adenocarcinoma and NSCLC SCC move closer towards GBM and away from breast cancer. For the <90% threshold, only three appearance-based features remained: definition of enhancing margin, definition of non-enhancing margin, and [...] (Figure 7, bottom left).
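The distance and affinity computations behind this unsupervised analysis can be sketched as follows. The lesion vectors are hypothetical toy data, and two conventions are assumptions not stated in the text: two all-zero vectors are given distance 0, and self-pairs are excluded when averaging within a tumor type.

```python
from collections import defaultdict
from itertools import combinations


def jaccard_distance(a, b):
    """1 - |A and B| / |A or B| for binary vectors; shared absences are ignored."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either  # assumed convention


def mean_type_distances(vectors, types):
    """Average lesion-to-lesion distances within and between tumor types."""
    sums, counts = defaultdict(float), defaultdict(int)
    for i, j in combinations(range(len(vectors)), 2):  # excludes self-pairs
        key = tuple(sorted((types[i], types[j])))
        sums[key] += jaccard_distance(vectors[i], vectors[j])
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Hypothetical toy data: four lesions scored on four binary features.
vectors = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1], [0, 1, 1, 1]]
types = ["GBM", "GBM", "BM", "BM"]

distances = mean_type_distances(vectors, types)
# Affinity between tumor types: a = 1 - d, as in the graph construction.
affinities = {key: 1.0 - d for key, d in distances.items()}
```

Because only positions where at least one lesion shows the feature enter the denominator, two lesions that merely share many absent features are not counted as similar, which matches the treatment of the scores as asymmetric binary attributes.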

Supervised Analysis of Combined Magnetic Resonance Features
In Table 3, the results of the univariate ranking using mutual information between feature values and target labels are shown alongside the multivariate ranking using either repeated fitting (varied hyperparameters) or recursive feature elimination (RFE, using optimal hyperparameters) based on linear soft-margin Support Vector Machines (SVMs). The features are ranked according to their scores (or weights) from the most important to the least important one. For differentiation of GBMs from BMs, high scores for definition of non-enhancing margin, ependymal extension, and features describing laterality were observed in case of both univariate and multivariate analyses.

DISCUSSION
In this study, we investigated the potential of the VASARI MR feature guide in differentiating BMs of different primaries from GBMs using localization-based and appearance-based features. The VASARI MR feature guide has been developed for primary brain tumors and has mostly been applied to GBMs (13, 18-24) and in a few studies to lower grade gliomas (26-28). To the best of our knowledge, this is the first study applying VASARI features to BMs. An exploratory approach was chosen as a first evaluation of the applicability of VASARI features in this setting. We found that GBMs differ from BMs both (i) in their preferred localization and (ii) in their MR image appearance: (i) GBMs involved all lobes of the cerebrum with a slight fronto-occipital gradient, often affected central structures and showed a strong preference for the right hemisphere, whereas BMs occurred most frequently in the frontal lobe and in the cerebellum. (ii) GBMs always exhibited a well-defined non-enhancing margin and appeared more often than BMs as solitary lesions and/or with ependymal extension. Differences among BMs were very subtle; the only distinct finding was that NSCLC SCC metastases were localized occipitally, affecting the vision center, more often than any other type of BM.
[Table 3 caption fragment: Differentiation of GBMs from BMs corresponds to a binary classification problem, whereas differentiation of histological subtypes of BMs is a multi-class classification problem (four classes). For the multi-class problem, the Support Vector Machine (SVM) was trained in a one-vs-all fashion (bold, black font = features which are consistently top-ranked).]
Unsupervised analysis of combined MR features showed that NSCLC adenocarcinoma and NSCLC SCC BMs appear to be most similar and that this similarity is driven by the definition of tumor margins and cortical involvement. Furthermore, GBMs appear to be distinctively different from all types of BMs for different feature subsets. For differentiation of GBMs from BMs, supervised analysis of combined MR features showed high scores for definition of non-enhancing margin, ependymal extension, and features describing laterality in both univariate and multivariate analyses. For differentiation of histological subtypes of BMs, high scores for definition of enhancing and non-enhancing margin as well as localization in the vision center appeared in both univariate and multivariate analyses. A grid search based on 3-fold cross-validation using all features yielded an optimal model with a weighted F1-score of 0.865 for the differentiation of GBMs from BMs. For the differentiation of histological subtypes of BMs, the optimal model led to a weighted F1-score of 0.326.
In the spirit of multiverse analysis (34), i.e. by viewing the data from different statistical angles, the same features and "feature families" have come up repeatedly, substantiating the suspicion that these might play a role in brain tumor differentiation:
i. non-enhancing tumor margin: GBMs presented with a well-defined non-enhancing tumor margin more often than BMs. Previously, brain lesions have been found to show peritumoral edema if they are larger than ~9.5 mm in diameter (35). GBMs are usually large at the time of diagnosis, thus typically exhibiting extensive edema with a well-defined margin. The extent of edema is constrained by the surrounding gray matter structures (e.g. cortex), which may also contribute to the increased perception of the edema's definedness. In our population, some BMs were very small and did not exhibit any peritumoral edema; therefore we argue that they also did not present with a well-defined non-enhancing tumor margin. In a future study, we plan to investigate the relationship between the volume of edema and its radiological presentation. Furthermore, qualitative differences in T2-weighted signal alterations between high-grade gliomas and BMs have been demonstrated previously, with high-grade gliomas more frequently exhibiting high signal intensity of the cortex for non-enhancing tumor regions on T2-weighted FLAIR sequences when compared to brain metastases (36). A third aspect might be image quality: for the GBMs in our study a standardized imaging protocol was used, whereas MR images of BMs were acquired with a variety of different protocols. Consequently, the comparability of the MR images suffered.
ii. ependymal extension: The dogma of the brain being a quiescent organ without neuronal regenerative potential is outdated. Multipotent, self-renewing neural stem cell populations have been confirmed in the forebrain subventricular zone (SVZ) and the subgranular zone (SGZ) of the dentate gyrus. These populations are capable of neurogenesis in the adult brain, and it is widely acknowledged that glioma initiating cells arise from these populations (37-40). Because of the close spatial proximity of the SVZ to the lateral ventricles, the feature "ependymal extension" might be viewed as a surrogate marker for the involvement of the SVZ. This potentially explains why ependymal extension was more often present in GBMs than in BMs, which emerge in the brain through hematogenous spread of systemic tumor cells.
iii. localization: It is still unclear if and why BMs preferably arise in certain locations of the brain. The most accepted hypothesis is that the rate of metastases is proportional to the blood flow in the respective area (41,42). This hypothesis is in accordance with our findings of bigger cerebral lobes exhibiting more metastases. Moreover, BMs from breast cancer seemed to have a particular affinity to the cerebellum, which has also previously been described in the literature (43). Compared to BMs, GBMs occurred more often in subcortical gray matter structures such as the basal ganglia, involved the corpus callosum and presented with a slight decreasing fronto-occipital gradient. This finding has previously been described by Larjavaara et al. (44), whose "findings indicate that gliomas arise mainly from the anterior subcortical structures of the brain, with an excess in the frontal and temporal lobes that is not accounted for by tissue volume alone." This can be explained by the close spatial proximity of the previously discussed origin of glioma initiating cells in the SVZ and SGZ. Furthermore, radiographic atlases of GBMs showed high tumor incidence in periventricular white matter regions and found that laterality and involvement of the frontal lobe may be related to underlying genetic and molecular characteristics of the tumor (45).
In general, our results could be useful in defining a subset of MR features that help radiologists to differentiate between GBMs and BMs in a more structured manner. As especially GBMs exhibit some distinct characteristics, it could be argued that in the absence of these, BM becomes the more likely diagnosis. Based on our observations, the differentiation of GBMs from BMs using definition of non-enhancing margin, ependymal extension and localization in the brain should be evaluated as imaging biomarkers for differential diagnosis in an independent confirmatory analysis.
Some limitations of the study should be noted. Despite the extensive inclusion period (years 2000 through 2018), we obtained only small sample sizes for rare types of BMs (NSCLC SCC and SCLC with 13 and 11 patients, respectively), which caused a class imbalance when compared to more frequently occurring BMs (NSCLC adenocarcinoma and breast cancer with 30 included patients each). Aware of this imbalance, we intended to compensate by applying statistical techniques that account for different group sizes: i) we used stratified cross-validation in order to ensure that the original class distribution was maintained, ii) we used class-weighting as a hyperparameter for the SVM to adapt it for handling imbalanced classes, and iii) we used the weighted F1-score as performance metric, which is computed for each class label independently and weighted by its support, thus providing a robust classification metric for imbalanced data. Since the initial class distribution of the four tumor types approximates their prevalence in clinical routine and the class distribution is maintained throughout our analysis, we obtain an estimate of the SVMs' classification performance in a setting which closely reflects the clinical scenario. With the extensive inclusion period another issue arose: the heterogeneity of imaging protocols and image quality. At our institution, a standardized imaging protocol for brain tumor patients has existed since 2014. Therefore, available images and image quality varied widely over the years. This can potentially lead to failure to detect the smallest lesions or to slight alteration of lesion features. Conversely, one can argue that the features which were found most discriminative are so in a manner robust to different image acquisition protocols.
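As an illustration of the metric described above, the following minimal sketch computes a support-weighted F1-score from scratch: the F1-score is calculated per class label and then averaged with weights equal to each class's number of true instances. The labels and predictions are hypothetical toy data, and the function is a plain re-implementation for clarity, not the library routine used in the study.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's support."""
    support = Counter(y_true)  # number of true instances per class
    total = 0.0
    for label, n_label in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n_label - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += n_label * f1  # weight the class F1 by its support
    return total / len(y_true)


# Hypothetical toy data with imbalanced classes (6 BM lesions, 2 GBM lesions).
y_true = ["BM"] * 6 + ["GBM"] * 2
y_pred = ["BM", "BM", "BM", "BM", "BM", "GBM", "GBM", "BM"]

score = weighted_f1(y_true, y_pred)
```

In this toy case the per-class F1-scores are 5/6 for BM and 1/2 for GBM, so the weighted score is (6 × 5/6 + 2 × 1/2) / 8 = 0.75. The larger class dominates the average, which is why the weighted score should be read together with per-class results when classes are strongly imbalanced.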
The applied VASARI features were capable of effectively highlighting differences between GBMs and BMs, which were reflected in descriptive statistics, showing consistently large inter-tumoral distances in unsupervised analyses, and high weighted F1-score for binary discrimination in supervised multivariate analyses. Definition of non-enhancing margin, ependymal extension, and tumor localization seem to play a major role. Regarding the differentiation between histological types of BMs, differences were much less accentuated. They seem to be driven mainly by definition of tumor margins, and localization in the vision center.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Kantonale Ethikkomission Bern. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.