The value of multiparametric MRI radiomics in predicting IDH genotype in glioma before surgery

Objective To explore the value of multiparametric magnetic resonance imaging (MRI) radiomics in the preoperative prediction of isocitrate dehydrogenase (IDH) genotype in gliomas. Methods The preoperative routine MRI sequences of 114 patients with pathologically confirmed grade II-IV gliomas were retrospectively analysed. All patients were randomly divided into a training cohort (n=79) and a validation cohort (n=35) at a ratio of 7:3. After feature extraction, we reduced collinearity by calculating the linear correlation coefficients between features, and then identified the best features using the F-test. Logistic regression was used to build the radiomics model, the clinical model, and the combined model. The models were assessed by receiver operating characteristic (ROC) curves, area under the curve (AUC), sensitivity and specificity. Results The multiparametric radiomics model was built from eight selected radiomics features and yielded AUC values of 0.974 and 0.872 in the training and validation cohorts, outperforming the single-sequence models. After incorporating the clinical model, the combined model outperformed the radiomics model in the validation cohort, with AUCs of 0.963 and 0.892 for the training and validation cohorts. Conclusion Radiomic models based on multiparametric MRI sequences could help to predict glioma IDH genotype before surgery.


Introduction
Gliomas are the most common primary brain tumors; they originate from neuroepithelial cells and can occur in any part of the central nervous system (CNS) (1). Previously, the World Health Organization (WHO) classified gliomas into grades I-IV, with grades I and II considered low-grade gliomas (LGG) and grades III and IV considered high-grade gliomas (HGG) (2). Median survival is 14 months for glioblastomas (grade IV) and more than 7 years for grade II and III gliomas (3,4). However, many studies have reported that the prognosis of gliomas depends mainly on the molecular characteristics of the tumor rather than on the pathological grade: tumors with the same genotype may show similar biological behavior and prognosis even if they have different pathological grades (5,6). The 2016 WHO CNS tumor classification incorporated molecular features alongside histology for the first time, in particular isocitrate dehydrogenase (IDH) status, which is divided into IDH mutant (IDH-M) and IDH wild type (IDH-W), and used it as one of the important bases for molecular typing of gliomas (7,8). Low-grade gliomas with IDH-W are similar to glioblastomas in terms of molecular features and prognosis, while IDH-M gliomas have a better prognosis than IDH-W gliomas (9,10). The impact of total tumor resection on the prognosis of low-grade gliomas has been reported to depend on IDH mutation status (11). Therefore, preoperative prediction of IDH status is necessary for appropriate treatment planning.
Currently, IDH genotypes are identified mainly by sequencing or immunohistochemistry of tumor specimens, which can only be obtained after surgery. Even biopsies of unresectable gliomas carry the risk of neurological impairment, and the small samples obtained do not reflect the full heterogeneity of the entire tumor (6,12,13). To overcome these limitations, there is an urgent need to establish a non-invasive technique to identify the IDH genotype of the tumor (14,15), and MRI examination is therefore of great value in the preoperative diagnosis of glioma. Studies have evaluated the performance of various machine learning algorithms in predicting glioma genotypes (16-20). High-throughput features from MRI have been shown to be highly advantageous and effective in predicting IDH status (15). Conventional MRI assessment relates tumor morphology, border and enhancement to IDH genotype and prognosis, but it relies mostly on the subjective diagnosis of radiologists and cannot analyze the whole tumor area comprehensively. Radiomics can extract a large number of intrinsic features that cannot be observed by the naked eye and analyze the shape and texture of images (14, 21-23), which is of great advantage and value in the diagnosis of glioma.
In recent years, studies on preoperative prediction of tumor genotype by radiomics have been widely carried out, and the methods and results of different studies are not the same. Zhang et al. (18) predicted IDH-M in LGGs preoperatively with a multiparametric MRI radiomics model and obtained an AUC value of 0.83, with T2-weighted imaging (T2WI) images being the most important. Another study, titled "Predicting IDH Mutation Status in Low-Grade Gliomas Based on Optimal Radiomic Features Combined with Multi-Sequence Magnetic Resonance Imaging" (2022) (24), concluded that a multiparametric radiomics model of T2-weighted fluid-attenuated inversion recovery (T2-FLAIR) is most effective in distinguishing IDH mutation status in low-grade gliomas. However, Niu et al. (20) found that a contrast-enhanced T1-weighted imaging (CE-T1WI) radiomics model could effectively predict IDH genotype in high-grade glioma, which is inconsistent with the results of the two aforementioned studies. Therefore, despite numerous studies on using radiomics models to predict the IDH genotype of gliomas, the results are not conclusive and still differ between studies. Further research from more clinical centers is required to enhance the accuracy of the model's results. Sun et al. (25) concluded that a combined machine learning algorithm exhibits excellent predictive performance in non-invasively predicting the molecular subtypes of lower-grade glioma (LGG) preoperatively. Several studies have incorporated clinical data into radiomics to build a combined model and found superior results (24-27). Zhou et al. (28) and Tan et al. (27) concluded that incorporating age information can improve the predictive results of the models. Furthermore, compared to imaging features, age information has a higher predictive value. This finding is of great importance for clinical practice, as clinical information can be obtained preoperatively and can provide more valuable information for treatment and prognosis. Therefore, more research results are needed to corroborate this conclusion. Furthermore, several studies have integrated functional sequences such as perfusion-weighted imaging (PWI) and diffusion tensor imaging (DTI) to develop radiomics models for the prediction of IDH gene status in gliomas (29,30). These studies have achieved satisfactory outcomes. However, a meta-analysis (1) indicates that despite the growing adoption of advanced imaging sequences for constructing feature models, traditional MRI sequences exhibit superior specificity in predicting IDH gene status in gliomas. Currently, most research is limited to either low-grade or high-grade gliomas, which restricts it by pathological findings and gives variable results.
In this study, both high-grade and low-grade gliomas were included, and radiomics models were constructed based on multiparametric MRI sequences, including T1WI, T2-FLAIR, CE-T1WI, and apparent diffusion coefficient (ADC) maps. Additionally, a combined model of radiomics features and clinical data was established to make the model results more stable and to allow a more comprehensive assessment of the value of radiomics in predicting the IDH genotype in gliomas.

Patients
Clinical and radiological data of 132 patients with glioma who underwent preoperative MRI at the Aerospace Center Hospital from December 2018 to October 2022 were retrospectively collected. According to the WHO classification of central nervous system tumors, the pathological findings were grade II-IV glioma.
Inclusion criteria were the following: (1) pathological data reported as glioma; (2) preoperative cranial MRI examination; (3) age over 18 years; (4) complete clinical data; (5) no history of other brain tumors. Exclusion criteria were: (1) preoperative radiotherapy; (2) poor-quality MRI images with heavy artifacts; (3) incomplete or missing clinicopathological data. The study finally included 114 patients with glioma, comprising 64 males and 50 females. Among the 114 patients, 83 were IDH-W and 31 were IDH-M. Clinical information was collected for each patient, including age, gender, pathological grade of the glioma, presence of peritumoral edema, presence of necrosis in the tumor, presence of tumor enhancement, and location of the lesion. Patients were randomly divided into training and validation cohorts at a ratio of 7:3. A flow diagram of patients is shown in Figure 1. This retrospective study was approved by our ethics committee.
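The random 7:3 division described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not say which tool or random seed was used, nor whether the split was stratified by IDH status, so the function name `split_cohort`, the seed, and the plain (unstratified) shuffle are all hypothetical. With 114 patients, taking the integer floor of 70% reproduces the reported cohort sizes.

```python
import random


def split_cohort(patient_ids, train_tenths=7, seed=0):
    """Randomly split patient IDs into training and validation lists.

    train_tenths=7 gives a 7:3 split. The seed is an arbitrary choice made
    here only so that the example is reproducible.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_train = len(ids) * train_tenths // 10  # integer floor of 70 %
    return ids[:n_train], ids[n_train:]


# 114 hypothetical patient indices, matching the cohort size in the text.
train_ids, val_ids = split_cohort(range(114))
print(len(train_ids), len(val_ids))  # 79 35
```

A stratified split (shuffling IDH-M and IDH-W patients separately) would keep the mutation rate similar in both cohorts; whether the study did so is not stated.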

Data preprocessing and ROI segmentation
The images were analyzed independently by two radiologists, each with 3 years of experience in neurological MRI diagnosis, using a double-blind method. Each tumor was manually outlined slice by slice, and the region of interest (ROI) of the entire tumor was delineated using the Deepwise Multimodal Research Platform version 2.2 (https://keyan.deepwise.com, Beijing Deepwise & League of PHD Technology Co., Ltd, Beijing, China). The outline included the enhancing tumor and areas of necrosis and cystic change, but not peritumoral edema (Figure 2). Features extracted from the ROIs outlined by the two radiologists were used to calculate intraclass correlation coefficients (ICC) for interobserver agreement. One of the radiologists outlined the ROIs and extracted features again after 2 weeks, and these were compared with the first set of features to evaluate the intraobserver concordance of the radiomics features. Features with ICC > 0.75 were considered to have good concordance.
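As an illustration of the agreement check, the sketch below computes one common form of the ICC for a single feature. The text does not state which ICC form was used, so the choice of ICC(2,1) (two-way random effects, absolute agreement, single rater) is an assumption, and the function name `icc_2_1` and the example values are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` holds one row per lesion; each row contains the value of one
    radiomics feature as measured by each rater (or in each session).
    """
    n = len(ratings)       # number of lesions
    k = len(ratings[0])    # number of raters or sessions
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical values of one feature for five lesions, two radiologists.
feature_values = [[10.1, 10.4], [12.3, 12.0], [9.8, 9.9], [15.2, 15.0], [11.0, 11.3]]
keep_feature = icc_2_1(feature_values) > 0.75  # threshold taken from the text
```

The same function can be applied to the two sessions of one radiologist to assess intraobserver concordance.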

Feature extraction
All images were resampled to a voxel size of 2 mm × 2 mm × 2 mm to give a uniform resolution, and their intensities were scaled to 0-100 before radiomics feature extraction. For feature extraction, a total of 10 image filtering methods were applied to the images. The specific details of these filtering methods can be found in Supplementary Table S1. These methods involved mathematical processing techniques such as Laplacian of Gaussian (LoG) filtering, wavelet filtering, gradient calculation, Local Binary Patterns (LBP) in both 2D and 3D, as well as non-linear intensity transformations such as square, square root, logarithm, and exponential. It is important to note that the features were extracted not only from the original image but also from the images produced by these filters.
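The intensity scaling step can be illustrated with a simple min-max rescale. This is an assumption: the text gives only the target range of 0-100 and not the exact normalization used by the platform, and the function name `rescale_intensity` is hypothetical.

```python
def rescale_intensity(voxels, new_min=0.0, new_max=100.0):
    """Linearly map voxel intensities onto [new_min, new_max] (min-max scaling)."""
    lo, hi = min(voxels), max(voxels)
    if hi == lo:
        # A constant image carries no contrast; map everything to new_min.
        return [new_min for _ in voxels]
    return [new_min + (v - lo) / (hi - lo) * (new_max - new_min) for v in voxels]


print(rescale_intensity([200, 450, 700]))  # [0.0, 50.0, 100.0]
```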
The features analyzed in this study included first-order features, shape features, and texture features. The texture features included the gray-level co-occurrence matrix (GLCM), gray-level run-length matrix (GLRLM), gray-level size zone matrix (GLSZM), and gray-level dependence matrix (GLDM). These features capture various aspects of the image texture, providing information about the spatial relationships and patterns within the image (21, 31-33).
Overall, a total of 1906 radiomics features were extracted for each lesion in the study, including features derived from both the original and filtered images. These features offer a comprehensive representation of the lesion characteristics, potentially enabling more accurate and detailed analysis. To eliminate severe collinearity, the linear correlation coefficient r between features was first calculated, and one feature of a pair was removed when r ≥ 0.75, until the linear correlation coefficient between all remaining feature pairs was less than 0.75. The remaining features were then reduced and transformed using principal component analysis (PCA).
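The correlation filter described above can be sketched as a greedy pass over the features. Several details are assumptions, since the text does not specify them: the absolute value of Pearson's r is compared with the threshold, the earlier feature of a correlated pair is the one kept, and the names `pearson_r` and `drop_correlated` are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson linear correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant feature has no defined correlation
    return sxy / sqrt(sxx * syy)


def drop_correlated(features, threshold=0.75):
    """Keep a feature only if |r| with every already-kept feature is below threshold.

    `features` maps a feature name to its list of values across patients.
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson_r(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy data: f2 is almost a multiple of f1, f3 is weakly related.
toy = {"f1": [1, 2, 3, 4, 5], "f2": [2, 4, 6, 8, 10.5], "f3": [5, 1, 4, 2, 3]}
print(drop_correlated(toy))  # ['f1', 'f3']
```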

Feature selection and radiomics model construction
The PCA features obtained after conversion retain the most important information in the original features, reduce the interference of noise and redundant information to a certain extent, and eliminate the influence of the original features on each other. On this basis, feature selection was performed for each feature and label pair using the F-test: an individual F-value was calculated for each feature, all features were ranked by F-value, and selection based on this ranking ensured that the most informative features were retained. Finally, multivariable logistic regression analysis was performed to build the radiomics model for predicting IDH genotype. The area under the curve (AUC) of the receiver operating characteristic (ROC) was used to evaluate the diagnostic efficiency of the model (0.5 < AUC < 0.7 for low diagnostic efficiency, 0.7 < AUC < 0.9 for moderate diagnostic efficiency, and AUC > 0.9 for high diagnostic efficiency), and the sensitivity, specificity and accuracy were calculated. Calibration curves were plotted to analyze the calibration of the model.
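A minimal sketch of F-test ranking is given below, assuming that the F-test here is the one-way ANOVA F-statistic of each feature across the two IDH groups; the text says only "F-test", so this reading, together with the names `f_statistic` and `rank_by_f`, is an assumption.

```python
def f_statistic(values, labels):
    """One-way ANOVA F-value of a single feature across the label groups."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand = sum(values) / n
    means = {y: sum(g) / len(g) for y, g in groups.items()}
    ss_between = sum(len(g) * (means[y] - grand) ** 2 for y, g in groups.items())
    ss_within = sum((v - means[y]) ** 2 for y, g in groups.items() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_by_f(features, labels, top_k):
    """Return the names of the top_k features with the largest F-values."""
    scores = {name: f_statistic(vals, labels) for name, vals in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Hypothetical toy data: feature "a" separates the two groups, "b" does not.
labels = [0, 0, 0, 1, 1, 1]
toy_features = {"a": [1, 2, 3, 7, 8, 9], "b": [1, 9, 2, 8, 3, 7]}
print(f_statistic(toy_features["a"], labels))  # 54.0
```

Features with the largest F-values would then be passed to the logistic regression step.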

Clinical model construction
The clinical characteristics studied were gender, tumor grade, and age. Radiologists with 15 years of experience evaluated imaging features including tumor border (well-defined or ill-defined), cystic necrosis (yes, no), peritumoral edema (yes, no), tumor enhancement, and tumor site (frontal, occipital, parietal, temporal, central, cerebellum, two or more). Univariate analysis was performed to identify clinico-radiological characteristics that differed significantly between the IDH-M and IDH-W groups in the training and validation cohorts (Table 1). A multivariable logistic regression approach was used to build the clinical model.

Statistical analysis
IBM SPSS 25.0 (https://www.ibm.com) software was used. Measurement data that conformed to a normal distribution with homogeneous variances were compared using the independent samples t test; otherwise the Mann-Whitney U test was used. Count data were analyzed by the chi-square test, and P < 0.05 was considered statistically significant.
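For count data in a 2 × 2 table (for example, enhancement yes/no against IDH-M/IDH-W), the chi-square test can be sketched as below. The counts are hypothetical and are not taken from Table 1, the function name `chi_square_2x2` is illustrative, and the sketch uses the Pearson statistic without continuity correction, which may differ from the SPSS output actually reported.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]].

    Returns the statistic and its p-value for 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = erfc(sqrt(chi2 / 2.0))  # survival function of chi-square, df = 1
    return chi2, p_value


# Hypothetical counts only.
chi2, p = chi_square_2x2(10, 20, 30, 15)
significant = p < 0.05  # significance level taken from the text
```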

Clinicopathological data
A total of 114 patients were finally enrolled and randomly assigned to the training cohort (n = 79) and validation cohort (n = 35) in this study. There were 64 males and 50 females with an average age of 50.3 ± 15.2 years. The clinical data of the patients are shown in Table 1. Among all clinical characteristics, age was measurement data and was found not to be normally distributed, so the Mann-Whitney U test was used. The remaining characteristics were count data, and the chi-square test was used. The results showed that age and tumor enhancement differed significantly between the two groups (P < 0.05); the differences in gender, peritumoral edema, lesion site, and tumor necrosis between the two groups were not statistically significant (P > 0.05).

Radiomics feature selection
The mean ICC of these features was 0.792 (95% CI 0.678 to 0.883), showing good interobserver agreement. We extracted 1906 features from each sequence: 396 first-order features, 14 shape features and 1496 texture features, the latter comprising 484 gray-level co-occurrence matrix (GLCM), 352 gray-level run-length matrix (GLRLM), 352 gray-level size zone matrix (GLSZM), and 308 gray-level dependence matrix (GLDM) features. In total, 7624 radiomics features were extracted from the four single MRI sequences for each patient. After filtering by the linear correlation coefficient, 5868 of the 7624 features remained. After redundancy reduction, 242 features were selected for the subsequent analysis. Finally, the most significant features were selected by the F-test to build prediction models by logistic regression, including five features on T1WI, six features on CE-T1WI, four features on T2-FLAIR, and four features on ADC. For the T1WI + CE-T1WI + T2-FLAIR + ADC images, eight features were selected. The selected radiomics features are listed in Table 2.

Performance of the radiomics models
The T1WI, CE-T1WI, T2-FLAIR, ADC and T1WI + CE-T1WI + T2-FLAIR + ADC models produced AUC values of 0.940, 0.947, 0.947, 0.932 and 0.974 in the training cohort and 0.780, 0.848, 0.792, 0.764 and 0.872 in the validation cohort, respectively. The ROC curves, waterfall plots and boxplots are shown in Figures 3 and 4. The accuracy, sensitivity and specificity of the radiomics models are shown in Table 3. The results showed that the multiparametric model of T1WI + CE-T1WI + T2-FLAIR + ADC had the best diagnostic efficacy, followed by the CE-T1WI radiomics model.
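For readers who want to reproduce metrics of this kind, the sketch below computes the AUC as the probability that a positive case receives a higher score than a negative case, and sensitivity and specificity at a fixed cutoff. The scores and labels are hypothetical, the 0.5 cutoff and the choice of positive class are assumptions (the text does not state them), and the function names are illustrative.

```python
def roc_auc(scores, labels):
    """AUC via pairwise comparison of positives and negatives (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(scores, labels, threshold=0.5):
    """Sensitivity and specificity when score >= threshold is called positive."""
    tp = fn = tn = fp = 0
    for s, y in zip(scores, labels):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical predicted probabilities for six patients (1 = positive class).
scores = [0.9, 0.8, 0.4, 0.35, 0.6, 0.2]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)                       # 8 of 9 pairs ranked correctly
sens, spec = sensitivity_specificity(scores, labels)
```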

Performance of the clinico-radiological model
In the analysis of clinical data, patient age and tumor enhancement were statistically significant. In this study, these two characteristics were used to build the clinical model, which yielded AUC values of 0.960 and 0.804 for the training and validation cohorts. When the clinical model was combined with the radiomics model, the AUC values for the training and validation cohorts were 0.963 and 0.892, respectively, showing that the combined clinical-radiomics model had higher diagnostic efficacy in the validation cohort than the radiomics model alone. Table 3 summarizes the sensitivity, specificity and accuracy of the clinico-radiological model. The ROC curves, waterfall plots and boxplots are shown in Figures 3 and 4. Among all models, the clinical-radiomics model including eight radiomics features and two clinical features achieved a classification accuracy of 0.828 and an AUC of 0.892. Test results for all variables included in the model are listed in Supplementary Table S2.

Discussion
In this study, the IDH genotype of grade II-IV gliomas was predicted based on a multiparametric MRI radiomics model. Among the clinical data, patient age and tumor enhancement were statistically significant, and the clinical model and radiomics model were combined for prediction. The multiparametric model combining T1WI + CE-T1WI + T2-FLAIR + ADC had the best diagnostic performance. Features obtained jointly from multiparametric MR sequences can thus predict glioma IDH genotype with higher diagnostic performance than single-sequence studies, and combined radiomics features from multiparametric sequences can quantify comprehensive information on glioma heterogeneity. CE-T1WI contains information on local angiogenesis and blood-brain barrier disruption of the tumor. T2-FLAIR reflects the anatomical information of the tumor, and ADC provides information on the structure and density of the tumor cells. We also found that the diagnostic performance of the CE-T1WI radiomics model was the highest among the single-sequence models and was higher than that of the ADC maps. Some studies have shown that DWI sequences are not a stable indicator of grading or molecular subtype, which may be responsible for this result (29-34).
We finally extracted the most significant features, including five features on T1WI, six features on CE-T1WI, four features on T2-FLAIR, and four features on ADC. For the T1WI + CE-T1WI + T2-FLAIR + ADC images, eight features were selected, comprising 2 first-order features and 6 texture features. The first-order features are obtained by calculating the tumor's gray values and reflect the tumor's gray intensity distribution. They also capture the tumor's heterogeneity, representing low-dimensional information easily perceived by vision. Texture features, including GLSZM, GLCM, and GLRLM, quantify the texture or tissue distribution within the tumor. These features are difficult to perceive visually but can provide information about the structure of tumor cells and the microenvironment (13,22). In our study, we applied filters to the original images before extracting radiomics features. Most of the final independent imaging features are wavelet features, which analyze spatial frequency changes in a comprehensive way. These features effectively capture high-frequency and low-frequency signals in the image, allowing for a detailed analysis of texture changes. The wavelet features can describe clinical problems related to the visual features of tumor images (30). Additionally, it is believed that wavelet features may contribute to our understanding of tumor morphology, pathophysiology, and proteomics (35).
In 2016, the WHO classification of central nervous system tumors divided gliomas into mutant and wild-type according to the IDH gene (7,8). The IDH gene is an important genetic marker of glioma that plays an important role in glioma metabolism, pathogenesis and progression (2,10). A growing number of studies have shown the clinical importance of genotype in developing treatment plans and assessing prognosis (20-23). Pathology is still the gold standard for diagnosis, but histological examination is invasive and subject to sampling error, especially when stereotactic biopsy is performed. In this study, we attempted to predict the IDH genotype of glioma noninvasively before surgery using a radiomics model based on MRI images, to provide some reference and guidance for the clinical selection of surgical and postoperative radiotherapy regimens; the results showed that this method can predict IDH genotype well. Radiomics can dig deeper into the intrinsic features of medical images through machine learning methods and extract a large number of quantitative features that cannot be observed by the naked eye, which can support the implementation of precision medicine and individualized treatment. Some radiomics-based studies have also focused on modeling conventional sequences; conventional structural sequences can reveal basic information about the tumor, such as its location and size, whether the glioma is accompanied by necrotic or cystic lesions, the extent of edema, and the blood supply, which helps to provide more informative clinical data (1,13,23,25). In this paper, the clinical data were compiled in detail and statistically analyzed; age and tumor enhancement were statistically significant, indicating that age and tumor enhancement were independent risk factors for predicting the IDH genotype of glioma.
Previous studies have suggested that gender is a risk factor for IDH mutation, potentially due to hormonal fluctuations in females (27,36). However, our study found that gender did not exhibit significant predictive capability in univariate logistic regression analysis. This discrepancy underscores the need for additional research to ascertain the relevance of gender in predictive models. More and more scholars are using advanced MRI techniques to analyze the relationship between IDH genotype and glioma. Kim et al. (37) concluded that DWI and PWI have high diagnostic performance in predicting IDH mutations in low-grade glioma, with ADC features playing a significant role. In contrast, our study found that CE-T1WI played a more significant role, which is inconsistent with the results of that study. In addition, Park et al. (30) found that adding DTI radiomics to conventional-sequence radiomics significantly improved the predictive accuracy of IDH status in low-grade glioma. However, a meta-analysis (1) revealed that despite an increasing number of scholars using more advanced examination sequences to establish feature models, the conventional MRI sequence imaging model showed better specificity in predicting IDH gene status in gliomas. Li et al. (17) extracted features from six regions on multiparametric MRI (enhanced, non-enhanced, necrotic, edematous, tumor core, and whole tumor) and showed that multiregional radiomics models can predict the mutational status of glioblastoma preoperatively. Most studies have focused on either high-grade gliomas or low-grade gliomas (16, 18, 20, 23-25, 31, 38). The present study did not include pathological grading in the model characteristics, making the study unrestricted by pathological grading, extending the clinical application of radiomic models, and offering the possibility of preoperative prediction of IDH gene status in glioma. This study still has some limitations. Firstly, the sample size of our study is relatively small, which may obscure the predictive value of clinical data and radiomics features of patients. This should be further investigated in larger cohorts. Secondly, we conducted a retrospective study and selected only the IDH genotype for analysis. As we gather more cases, we will further investigate the relationship between abnormal expression of other important genotypes and imaging features. Lastly, our study is a single-center study, and in the future, it is necessary to collect multi-center data to validate the stability of the radiomics model.

Conclusion
The multiparametric radiomics model performs better than the single-sequence models in predicting the IDH genotype of gliomas. After incorporating patient age and tumor enhancement, the clinical-radiomics model gives more satisfactory results, indicating that the combined model is an effective tool for predicting the IDH genotype. Furthermore, the variables in the model contribute differently to the prediction of the IDH genotype. These findings will be beneficial for future research on using brain tumor imaging to predict molecular status and tumor invasiveness.

FIGURE 1 Flow diagram of the study population.
Different features have different means and variances, which can vary widely, so we performed Z-score normalization; after normalization, each feature has a mean of 0 and a standard deviation of 1, making all features comparable. The features were then analyzed by the Mann-Whitney U test, and features that differed significantly between the two groups (P < 0.05) were retained.
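The Z-score normalization mentioned here can be sketched as follows. The use of the population standard deviation and the function name `z_score` are assumptions made for illustration; the text does not specify them.

```python
from math import sqrt


def z_score(values):
    """Standardize one feature to mean 0 and standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)  # population SD
    if sd == 0:
        return [0.0] * n  # a constant feature carries no information
    return [(v - mean) / sd for v in values]


# Hypothetical feature values with mean 5 and SD 2.
print(z_score([2, 4, 4, 4, 5, 5, 7, 9]))
# [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

In a train/validation design it is common to estimate the mean and standard deviation on the training cohort only and apply them to the validation cohort; whether this was done here is not stated.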

FIGURE 2 Examples of ROI segmentation.

FIGURE 3 The ROC curves of the clinical model, the radiomics models of T1WI, CE-T1WI, T2-FLAIR, ADC, and T1WI + CE-T1WI + T2-FLAIR + ADC, and the combined model in the training cohort (A-G) and validation cohort (H-N), and the waterfall plots of the validation cohort (a-g).

TABLE 1
Clinical and radiological characteristics of patients.

TABLE 3
Performance of the clinical model, radiomics models, and combined model.

Niu et al. (20) found that a radiomics model based on preoperative enhanced MR was effective in predicting the IDH gene status in high-grade gliomas. This is consistent with the findings of this study, but the present study was not limited to a single MRI sequence; instead, it built models on multiparametric conventional MRI sequences and a combined model, which are more stable and accurate. Tan et al. (27) used radiomics, clinical and combined models to predict IDH mutations in astrocytoma, with age as the clinical model, and found that the combined model had higher diagnostic performance. This is consistent with the findings of this study, but it only indicates the importance of age as an independent risk factor for preoperative prediction of IDH mutation in gliomas, whereas this study also includes tumor enhancement as a variable in the model. Importantly, both age and tumor enhancement demonstrate satisfactory results as important variables in the model, and clinical data can be easily obtained before surgery. This is crucial for the stability of the model and its clinical application.