Practice of the new supervised machine learning predictive analytics for glioma patient survival after tumor resection: Experiences in a high-volume Chinese center

Li, Yushan; Ye, Maodong; Jia, Baolong; Chen, Linwei; Zhou, Zubang

doi:10.3389/fsurg.2022.975022

ORIGINAL RESEARCH article

Front. Surg., 17 February 2023
Sec. Neurosurgery
Volume 9 - 2022 | https://doi.org/10.3389/fsurg.2022.975022

Yushan Li^1,†

Maodong Ye^2,†

Baolong Jia^3,†

Linwei Chen^4*

Zubang Zhou^1*

¹Department of Ultrasound, Gansu Provincial Hospital, Lanzhou, China
²Medical Cosmetic Center, First Affiliated Hospital of Shantou University Medical College, Shantou, China
³Pingliang Second People's Hospital Neurosurgery Department, Pingliang, China
⁴Neurosurgery Unit, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, China

Objective: This study aims to assess the effectiveness of the Gradient Boosting (GB) algorithm on glioma prognosis prediction and to explore new predictive models for glioma patient survival after tumor resection.

Methods: A cohort of 776 glioma cases (WHO grades II–IV) between 2010 and 2017 was obtained. Clinical characteristics and biomarker information were reviewed. We then constructed a conventional Cox survival model and four supervised machine learning models: support vector machine (SVM), random survival forest (RSF), Tree GB, and Component GB. Model performance was compared across models, and the feature importance of the models was assessed.

Results: The concordance indexes of the conventional survival model, SVM, RSF, Tree GB, and Component GB were 0.755, 0.787, 0.830, 0.837, and 0.840, respectively. All areas under the cumulative receiver operating characteristic curve of both GB models were above 0.800 at different survival times. Their calibration curves showed good calibration of survival prediction. The analysis of feature importance identified Karnofsky performance status, age, tumor subtype, and extent of resection, among others, as crucial predictive factors.

Conclusion: Gradient Boosting models performed better in predicting glioma patient survival after tumor resection than other models.

Introduction

Glioma is the most common primary malignant tumor of the central nervous system (CNS) (1). Accounting for around 80% of malignant CNS tumors (1), gliomas comprise lower-grade gliomas [LGGs; World Health Organization (WHO) grades II and III] and grade IV gliomas (glioblastoma, GBM). Treatment of glioma is challenging, and tumor resection is the main approach. Because of the large heterogeneity among different kinds of gliomas, the prognosis of glioma patients is diverse, with survival typically ranging from a few months to 10 years (2, 3). GBM is expected to have a poorer prognosis than diffuse low-grade and intermediate-grade gliomas because of its invasive growth and tendency to recur. However, depending on the presence of certain molecular markers and on clinical characteristics such as age, Karnofsky performance status (KPS), and symptoms, the prognosis varies even within GBM, the most malignant type of glioma. Predicting glioma patient survival after tumor resection therefore remains a great challenge for clinicians.

Efforts to develop useful predictive models for glioma prognosis have mainly followed three directions. Some researchers have focused on traditional multivariate Cox regression models built on a set of established prognostic factors. For example, Gittleman et al. (4) developed a survival nomogram for LGGs with independent validation. Others have turned to new biomarkers for model construction. Zhang et al. (5) recently constructed a model using an immune-related gene signature, which was also effective in predicting overall survival in primary LGG. Still others have concentrated on radiomics feature prediction models and have made some achievements (6). Despite these efforts, several shortcomings limit the usefulness and availability of these models. First, the traditional statistical approach has a major limitation: its analysis assumes a linear relationship and might miss nonlinear relationships between input and outcome. In other words, this approach cannot make full use of medical information, which makes it poorly suited to the era of big data. Second, as Jakola et al. (7) claimed, a pure biomarker approach to prediction, such as a gene signature model, is of limited value because tumor classes and tumor cells are neither stable over time nor homogeneous throughout the lesion tissue. Third, prediction models based on radiomics features are powerful and promising, but the techniques are at an early stage, available only at a limited number of centers, and not yet readily validated in medical practice (7). Therefore, it is still necessary to explore a new predictive model based on an algorithm suited to the big-data era that combines common clinical features and reliable biomarkers as prognostic factors.

Recently, supervised machine learning (ML) methods have demonstrated precise predictive capacity and are increasingly used in the prognosis prediction of different diseases (8). Supervised ML is a data-driven approach, including support vector machine (SVM) (9), decision tree (10), and other methods, that integrates multiple risk factors into a predictive algorithm and performs well with complex information (11). Gradient Boosting (GB) is one such supervised ML algorithm. Although unfamiliar to many medical workers, it has performed well in medical settings, such as predicting the survival outcome of triple-negative breast cancer (12) and the recurrence of colorectal cancer (13). So far, few studies have used Gradient Boosting to analyze and predict glioma prognosis. This study was conducted to assess its effectiveness in glioma prognosis prediction and to explore new predictive models for glioma patient survival after tumor resection.

Patients and methods

Patients

This study was approved by the Institutional Review Board of Sun Yat-sen University and was conducted in the Neurosurgery Unit, the First Affiliated Hospital, Sun Yat-sen University, a high-volume center that performs approximately 100 glioma surgeries yearly. In accordance with the guidelines for retrospective studies in our institution, the institutional review board waived the requirement for patients' informed consent. Our study only included cases of grade II glioma (astrocytoma, oligodendroglioma, and oligoastrocytoma), grade III glioma (anaplastic astrocytoma, oligodendroglioma, and oligoastrocytoma), and glioblastoma. A cohort of 776 glioma cases (WHO grades II–IV) between 2010 and 2017 was obtained. This consecutive malignant series consisted of 434 cases of WHO grade II (astrocytoma, oligodendroglioma, and oligoastrocytoma), 74 cases of WHO grade III (anaplastic astrocytoma, oligodendroglioma, and oligoastrocytoma), and 268 cases of WHO grade IV (glioblastoma).

Clinical characteristics

Most data were accessible through the hospital database. All data were extracted into two copies of a standardized form by two research assistants independently and integrated into the final file version by a third. Discrepancies were discussed and resolved by consensus. The extracted characteristics included age at surgery, gender, symptoms (seizures, headaches/dizziness, nausea/vomiting, limb dysfunction, blurred vision, or other cranial nerve deficit), duration of the first presenting symptom, preoperative KPS, tumor size and location, time of surgery, extent of resection (gross-total resection and others), tumor subtype, treatment after surgery (chemotherapy and/or radiotherapy), survival status (alive or dead), and survival/follow-up time. The subtype of glioma was reviewed by a pathologist according to the 2016 WHO criteria (14). Deficits of motor, visual, or cranial nerve function were confirmed by physical examination, diffusion tensor imaging (DTI)-based tractography, and similar evidence. Following the definition by Okamoto et al. (15), the extent of resection was categorized, with gross-total resection defined as residual tumor of less than 5%. Follow-up data were collected until December 2019. Survival/follow-up time was calculated from the date of tumor resection to death (any cause) or censoring (still alive) in December 2019. All patients were followed up at regular intervals of 3 months for the initial 3 years and every year afterward until death. The last follow-up for every accessible patient was completed in December 2019.

Biomarkers

Biomarker detection, including immunohistochemistry (IHC) and molecular genetics, was performed on histological specimens obtained at the time of resection surgery, prior to chemotherapy and/or radiotherapy. Ki-67, p53, vimentin, and glial fibrillary acidic protein (GFAP) were detected using immunohistochemical stains by standard techniques described previously (16). For specimens with p53 immunohistochemical stain, the presence of strongly positive tumor nuclei in more than 10% of cells was marked as immunopositive, which indicated the mutational status of TP53 (17). Vimentin immunopositivity was identified when more than 25% of tumor nuclei stained positive with vimentin IHC stain. GFAP immunopositivity was marked when any tumor nuclei were positive with GFAP IHC stain. The Ki-67 index was recorded as the average percentage of positive nuclei among the total number of nuclei at 400× magnification, where “≥10%” represented high Ki-67 expression (18). As for molecular genetics, the biomarker we detected was methylation of the O⁶-methylguanine-DNA-methyltransferase (MGMT) promoter. This test was done using methylation-specific PCR.

Supervised machine learning algorithm

SVM is a machine learning algorithm that has been widely used in disease prognosis. Decision tree is a well-known ML approach that represents the mapping between properties and values. It consists of a root node, internal nodes, and leaf nodes, where each leaf node corresponds to the value reached along the path from the root node to that leaf. Decision trees can be used for survival analysis (19). Here, we used survival decision trees as the base learners of random forests (RFs). RF is an ensemble tree method whose final prediction is the average of the predictions from every tree in the forest. RF predicts better than a single tree because combining predictions from separate models can substantially improve prediction performance (20). Random survival forest (RSF) is an adaptation of random forest designed for the analysis of survival data (21).

GB is an ML technique that can be used for survival analysis. Here, we used component-wise least squares and survival decision trees as two types of base learners, respectively. The Gradient Boosting algorithm produces a weak prediction model (for instance, component-wise least squares) at each step and combines these models into an overall model with different weights. The weak model produced at each step is fitted along the negative gradient direction of the loss function. The details have been described previously (22).
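The boosting idea can be illustrated with a minimal sketch. It is not the study's implementation: for brevity it assumes a squared-error loss instead of a survival loss, and the function name `componentwise_boost`, the learning rate, and the toy data are hypothetical. At each step the residual (the negative gradient of the squared-error loss) is computed, every single feature is tried as a one-variable least-squares learner, and only the best one is added with a small weight.

```python
def componentwise_boost(X, y, n_steps=100, learning_rate=0.1):
    """Component-wise least-squares boosting for squared-error loss.

    X maps feature name -> list of centered values; y is the target.
    Squared error is an assumption made for illustration only.
    """
    n = len(y)
    intercept = sum(y) / n
    fitted = [intercept] * n
    coef = {name: 0.0 for name in X}
    for _ in range(n_steps):
        # Negative gradient of 0.5 * (y - F)^2 with respect to F is the residual.
        residual = [yi - fi for yi, fi in zip(y, fitted)]
        best_name, best_beta, best_gain = None, 0.0, -1.0
        for name, col in X.items():
            sxx = sum(v * v for v in col)
            beta = sum(v * r for v, r in zip(col, residual)) / sxx
            gain = beta * beta * sxx  # reduction in residual sum of squares
            if gain > best_gain:
                best_name, best_beta, best_gain = name, beta, gain
        # Add only the best single-feature learner, shrunk by the learning rate.
        coef[best_name] += learning_rate * best_beta
        col = X[best_name]
        fitted = [fi + learning_rate * best_beta * v for fi, v in zip(fitted, col)]
    return intercept, coef


# Hypothetical toy data: y depends on x1 only; x2 is orthogonal to x1.
X = {"x1": [-2, -1, 0, 1, 2], "x2": [1, -2, 2, -2, 1]}
y = [-6, -3, 0, 3, 6]
intercept, coef = componentwise_boost(X, y)
print(intercept, coef)
```

Because only one feature is updated per step, features that never reduce the loss keep a coefficient of zero, which is why component-wise boosting also acts as a form of variable selection.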

Model evaluation

Harrell's concordance index (c-index), defined as the ratio of correctly ordered (concordant) pairs to comparable pairs, is a measure of the rank correlation between predicted risk scores and observed time points. A value of 1 refers to perfect prediction, while a value of 0.5 means that prediction does not perform better than random guessing.
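As a worked illustration of this definition, the following minimal sketch counts comparable and concordant pairs directly. The function name `harrell_c_index` and the toy data are hypothetical, pairs with tied observed times are simply skipped, and ties in predicted risk are counted as half-concordant, which is one common convention.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's c-index: concordant pairs divided by comparable pairs.

    A pair (i, j) is comparable when the subject with the shorter
    observed time had an event. It is concordant when that subject
    also has the higher predicted risk. Ties in risk count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Subject i must fail first, and the failure must be observed.
            if times[i] < times[j] and events[i]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up in months, event flag (True = death), risk score.
times = [5, 12, 20, 30, 40]
events = [True, True, False, True, False]
risk = [0.9, 0.7, 0.8, 0.2, 0.1]
c = harrell_c_index(times, events, risk)
print(c)  # 7 concordant out of 8 comparable pairs -> 0.875
```

In this example the only discordant pair is the patient who died at 12 months (risk 0.7) and the patient censored at 20 months (risk 0.8).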

The area under the receiver operating characteristic curve (ROC curve) is often used to assess the discrimination of the binary classification model. When extending the ROC curve to survival time, it gives rise to the time-dependent cumulative ROC curve at a certain survival time t. The area under the cumulative ROC curve (AUC) at time t indicates how well a model can distinguish subjects who will experience an event by time t from those who will not.
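A simplified sketch of the AUC at a fixed time t follows. It treats patients who died by t as cases and patients still under observation after t as controls. As a stated simplification, patients censored before t are dropped rather than handled with inverse-probability-of-censoring weights, so this is not the estimator a survival library would use; the function name and data are hypothetical.

```python
def cumulative_dynamic_auc(times, events, risk_scores, t):
    """AUC at horizon t: cases died by t, controls were observed past t.

    Simplification: subjects censored before t are dropped instead of
    being reweighted.
    """
    cases = [r for ti, e, r in zip(times, events, risk_scores) if e and ti <= t]
    controls = [r for ti, r in zip(times, risk_scores) if ti > t]
    if not cases or not controls:
        raise ValueError("need at least one case and one control at t")
    wins = 0.0
    for rc in cases:
        for rk in controls:
            if rc > rk:
                wins += 1.0
            elif rc == rk:
                wins += 0.5
    # Fraction of case/control pairs in which the case has the higher risk.
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: follow-up in months, event flag, predicted risk.
times = [5, 12, 20, 30, 40]
events = [True, True, False, True, False]
risk = [0.9, 0.7, 0.8, 0.2, 0.1]
auc_12 = cumulative_dynamic_auc(times, events, risk, t=12)
auc_36 = cumulative_dynamic_auc(times, events, risk, t=36)
print(auc_12, auc_36)
```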

The calibration curve is a graphical measure of model calibration: a linear plot with the predicted event probability on the x-axis and the observed event proportion on the y-axis. Good calibration corresponds to a regression line with a 45° slope.
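The construction of such a curve can be sketched as follows, under the simplifying assumption of a binary outcome at one fixed horizon with no censoring. Patients are grouped by predicted probability, and the mean prediction of each group is compared with the observed event proportion; a least-squares slope near 1 corresponds to the 45° line. The helper names and the toy data are hypothetical.

```python
def calibration_points(predicted, observed, n_bins):
    """Group by predicted probability; return (mean predicted, observed rate) per group."""
    pairs = sorted(zip(predicted, observed))
    size = len(pairs) // n_bins
    points = []
    for b in range(n_bins):
        start = b * size
        # The last group absorbs any remainder.
        end = len(pairs) if b == n_bins - 1 else start + size
        group = pairs[start:end]
        mean_pred = sum(p for p, _ in group) / len(group)
        obs_rate = sum(o for _, o in group) / len(group)
        points.append((mean_pred, obs_rate))
    return points


def calibration_slope(points):
    """Least-squares slope of observed rate on mean predicted probability."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    return sxy / sxx


# Hypothetical toy data: predicted event probability and observed event (1 = event).
predicted = [0.25] * 4 + [0.5] * 4 + [0.75] * 4
observed = [1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0]
points = calibration_points(predicted, observed, n_bins=3)
slope = calibration_slope(points)
print(points, slope)
```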

To fully capture the true utility of a prediction model, the sensitivity and specificity of models for predicting 6-, 12-, 36-, and 60-month survival were calculated after determining the optimal threshold through the ROC curve.
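The text does not state which criterion defined the optimal threshold. A common choice, assumed here purely for illustration, is Youden's J (sensitivity + specificity − 1), maximized over the candidate cutoffs on the ROC curve. The function name `best_threshold` and the data are hypothetical.

```python
def best_threshold(scores, labels):
    """Return (threshold, sensitivity, specificity) maximizing Youden's J.

    A subject is predicted positive when score >= threshold.
    Youden's J is an assumed criterion, not one confirmed by the source.
    """
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    best = None
    for thr in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y and s >= thr)
        tn = sum(1 for s, y in zip(scores, labels) if not y and s < thr)
        sens = tp / positives
        spec = tn / negatives
        j = sens + spec - 1
        # Strict comparison keeps the lowest threshold when J is tied.
        if best is None or j > best[0]:
            best = (j, thr, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical toy data: predicted risk and whether the event occurred by the horizon.
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1, 1]
thr, sens, spec = best_threshold(scores, labels)
print(thr, sens, spec)
```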

Model construction

All clinical characteristics and biomarker information were included in the model training set as variables. Missing values were filled by multiple imputation. We randomly split the data into a training set and a test set at an 8:2 ratio using the train_test_split function in the scikit-learn module of Python (version 3.7). The scikit-survival module (version 0.12.1) was used to construct the ML models: SVM, RSF, Tree GB, and Component GB. ML algorithms involve many hyperparameters that strongly affect prediction performance. The optimal combination of hyperparameters was determined by grid search. During every cross-validation, 1/3 of the data in the training set were randomly excluded as out-of-bag (OOB) data for validation. For each combination of hyperparameters, the mean c-index on the validation data was calculated over 50 rounds of cross-validation. The hyperparameter combination with the best c-index was selected as optimal. After constructing the model, we assessed the importance of each feature by calculating its contribution to the c-index, namely, the decrease in c-index after randomly shuffling the feature's values to break its relationship with survival.
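The feature-importance step can be sketched as a permutation procedure, assuming that is what the description refers to: one feature column is shuffled, the c-index is recomputed, and the average drop from the baseline is taken as the importance. Everything named here (`permutation_importance`, the stand-in `predict` function, the toy columns) is hypothetical and stands in for a fitted model and real test data.

```python
import random


def c_index(times, events, risk):
    """Harrell's c-index; ties in risk count as 0.5, tied times are skipped."""
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        for j in range(len(times)):
            if times[i] < times[j] and events[i]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable


def permutation_importance(predict, X, times, events, n_repeats=20, seed=0):
    """Mean decrease in c-index after shuffling each feature column."""
    rng = random.Random(seed)
    baseline = c_index(times, events, predict(X))
    importance = {}
    for name in X:
        drops = []
        for _ in range(n_repeats):
            shuffled = dict(X)       # shallow copy; other columns are reused
            column = list(X[name])
            rng.shuffle(column)      # break the link between feature and outcome
            shuffled[name] = column
            drops.append(baseline - c_index(times, events, predict(shuffled)))
        importance[name] = sum(drops) / n_repeats
    return importance


def predict(X):
    """Stand-in for a fitted model: lower KPS means higher risk."""
    return [-k for k in X["kps"]]


# Hypothetical toy data; "noise" is a column the stand-in model ignores.
X = {"kps": [90, 80, 70, 60, 50, 40, 30, 20], "noise": [3, 1, 4, 1, 5, 9, 2, 6]}
times = [80, 70, 60, 50, 40, 30, 20, 10]
events = [True] * 8
importance = permutation_importance(predict, X, times, events)
print(importance)
```

A feature the model never uses gets an importance of exactly zero, because shuffling it leaves the predicted risks unchanged.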

To compare the performance of ML models with that of conventional survival models, we also built a Cox proportional hazards model. Three continuous variables, age at surgery, preoperative KPS, and tumor size, were transformed into categorical variables to obtain the best model prediction performance. Cutoffs for these variables were 50 years, 70, and 55 mm, respectively. All variables were entered into the model step by step, and the final model only included variables with a significant hazard ratio.

Statistical analysis

Continuous variables were described as mean ± SD or median (IQR) according to their statistical distribution, while categorical variables were expressed as number of cases (%). P < 0.05 was set as the criterion of statistical significance in all analyses. The 95% confidence interval (CI) of the AUC was computed by the bootstrap method with 2,000 stratified bootstrap replicates. Comparisons of the c-index between different models were conducted using the R package survcomp.
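A stratified bootstrap CI for an AUC can be sketched as below. "Stratified" is taken here to mean that cases and controls are resampled separately so that each replicate keeps the original group sizes, and a percentile interval is assumed; the source does not specify the interval type. Function names and scores are hypothetical.

```python
import random


def auc(case_scores, control_scores):
    """Probability that a random case scores higher than a random control."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def stratified_bootstrap_ci(case_scores, control_scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile CI of the AUC; cases and controls are resampled separately."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        cases = [rng.choice(case_scores) for _ in case_scores]
        controls = [rng.choice(control_scores) for _ in control_scores]
        stats.append(auc(cases, controls))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical predicted risks for patients with and without the event.
case_scores = [0.9, 0.8, 0.7, 0.4]
control_scores = [0.6, 0.3, 0.2, 0.1]
lower, upper = stratified_bootstrap_ci(case_scores, control_scores)
print(auc(case_scores, control_scores), lower, upper)
```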

Results

Characteristic overview

The sociodemographic and clinical characteristics of the study population are presented in Table 1. The most frequent symptom was headaches or dizziness, while the most frequent tumor location and subtype were parietal lobe and diffuse astrocytoma, respectively. The medians of the duration of the first presenting symptom, preoperative KPS, and tumor size were 1.90 months, 60, and 45.00 mm, respectively. Gross-total resection was achieved in 90.0% of patients. Immunopositivity of GFAP, vimentin, and p53 was observed in more than half of the patients. The median survival time was 32.65 months.

Table 1. Clinical characteristics and biomarkers of patients.

Figure 1 shows the correlation coefficients between the independent variables. The correlations between variables were low.

Figure 1. Correlation coefficient matrix of each variable. Each coefficient is annotated. The closer it gets to 1, the more positively correlated it is. The closer it gets to −1, the more negatively correlated it is.

Model performance

The flow chart of model construction is shown in Supplementary Figure S1. Supplementary Table S1 shows the detailed descriptions of the selected modules, classes, and hyperparameters in Python for each model, including the Cox survival model and supervised ML models.

The five models are compared in Table 2. The Cox proportional hazards model had the worst performance, with a concordance index of 0.755 for the test set. The SVM model also performed relatively poorly, with a c-index of 0.787 for the test set. The Tree GB survival model ranked second, with a c-index of 0.837, while the RSF model ranked third, with a c-index of 0.830. The Component GB survival model had the best prediction performance, with a c-index of 0.840. In addition, we statistically compared the c-index values of different models on the test set. The c-index values of the Tree GB and Component GB survival models were significantly higher than those of the Cox proportional hazards model (P < 0.05) and the SVM model (P < 0.05). Although the differences between the two GB models and the RSF model were not statistically significant (P values were 0.332 and 0.112, respectively), relatively superior performance of the Tree GB and Component GB models was still observed. The lack of statistical significance could be attributed, to a certain extent, to the small sample size of the test set.

Table 2. Concordance indexes of models for the training set and test set.

Figure 2 shows the AUCs of both GB models. All AUCs at different survival times were above 0.800, which indicated excellent discrimination. The AUC and CI values of both GB models for 6-, 12-, 36-, and 60-month survival are listed in Supplementary Tables S2 and S3, which highlight their superior predictive performance. The calibration curves of both GB models are shown in Supplementary Figures S2 and S3, where good calibration was found in survival prediction. Based on the optimal thresholds, the Tree GB model predicted 6-, 12-, 36-, and 60-month survival with 94.4%, 90.6%, 99.3%, and 100% sensitivity and 91.3%, 85.7%, 71.3%, and 73.8% specificity, while the Component GB model predicted the survival results with 90.0%, 73.5%, 85.2%, and 90.4% sensitivity and 84.1%, 92.9%, 87.5%, and 82.8% specificity, respectively. The results are listed in Supplementary Tables S4 and S5.

Figure 2. Area under the cumulative ROC curves of the two Gradient Boosting models at different survival times. (A) AUC of the Component GB survival model at different survival times. (B) AUC of the Tree GB survival model at different survival times. ROC curve, receiver operating characteristic curve; AUC, area under the curve; GB, Gradient Boosting.

Feature importance

From Table 3, we found that KPS, the tumor subtype of glioblastoma not otherwise specified (NOS), age, tumor size, and the tumor subtype of oligodendroglioma (NOS) ranked top five in the Tree GB survival model in terms of the feature importance. As for the Component GB survival model, it was KPS, the tumor subtype of glioblastoma (NOS), age, extent of resection, and tumor size that ranked the top five. As described in Supplementary Figure S4, significant variables included in the final Cox proportional hazards model were KPS, age, tumor size, tumor subtype, extent of resection, chemotherapy, radiotherapy, p53 immunopositivity, and methylation of the MGMT promoter.

Table 3. Top 15 feature importance of two Gradient Boosting models.

Sensitivity analysis

We also performed a sensitivity analysis to test the robustness of ML model prediction performance. The c-indexes of the ML survival models trained and tested on variables without imputation are listed in Table 4. All c-indexes were at a high level (above 0.800), which indicated the robustness of ML model prediction performance. Considering the large heterogeneity among different types of gliomas, we specifically tested the prediction performance of the ML models on the three most common gliomas, namely, diffuse astrocytoma, oligodendroglioma, and glioblastoma. As can be seen from Supplementary Table S6, the c-indexes were about 0.8, which suggests that the models are applicable to different types of gliomas.

Table 4. Concordance indexes of models for variables without imputation.

Discussion

This study was designed to assess the effectiveness of supervised ML algorithms, especially Gradient Boosting, in the prediction of glioma patient survival after tumor resection and to explore new predictive models useful for medical workers. Judging by Harrell's concordance index of the training set and test set, the Gradient Boosting algorithm ranked first in prediction performance. There were differences in prediction performance between the Tree GB and Component GB algorithms. Tree GB showed better performance on the training set (c-index: 0.905) but worse performance on the test set (c-index: 0.837) than Component GB, which suggested a tendency toward overfitting. Considering discrimination and calibration through time-dependent cumulative ROC curves, sensitivity, specificity, and calibration curves, Tree GB and Component GB were both good at predicting 6-, 12-, 36-, and 60-month survival after surgery. The results of the sensitivity analysis revealed that both GB models were stable in their prediction outcomes.

The feature importance of the ML models was also assessed. The top 15 important features in the Tree GB or Component GB models could be reduced to nine variables, namely, KPS, age, tumor size, tumor subtype, extent of resection, chemotherapy, radiotherapy, p53 immunopositivity, and methylation of the MGMT promoter. This was in line with the significant variables included in the Cox proportional hazards model.

Consistent with the results of previous studies (23, 24), our results revealed that KPS influenced glioma patient survival after resection surgery. KPS and similar crude scales are commonly used to evaluate gross functional status and have been repeatedly described as prognostic factors in the management of glioma patients (23, 24). Age is also one of the most established prognostic factors in patients with malignant gliomas, whether lower-grade (24, 25) or higher-grade (26). As claimed by Paugh et al. (27), the substantial differences in the molecular features underlying age-stratified gliomas might lead to different treatment responses, accounting for different survival outcomes. A cutoff value of 55 years has been reported repeatedly to stratify glioma patients, and significantly impaired survival is commonly observed in those aged 55 years and above. Our study once more confirmed advanced age as an unfavorable prognostic factor. Previous studies (28, 29) have shown a strong association between preoperative tumor size and glioma survival, which is in line with our findings. Regarding the extent of resection, complete curative resection is thought to be impossible because of the lack of clear tumor borders and the invasive behavior of the tumor. Although a number of studies (30, 31) have demonstrated that maximal resection substantially improves progression-free and overall survival, it has also been reported that aggressive glioma resection might increase the risk of postoperative complications and lead to worse survival. Therefore, the relationship between the extent of surgical resection and patient outcome remains controversial. Even so, our results showed a positive association between gross-total resection and improved prognosis compared with partial resection or biopsy. Chemotherapy and radiotherapy are crucial elements in the treatment plan of glioma patients. Postoperative adjuvant radiotherapy and chemotherapy have generally been recommended to start within 2–4 weeks after surgical resection and have been shown to be significant prognostic factors by previous studies (32, 33) and by this study. Tumor subtype is one of the most commonly recognized prognostic factors, and subtyping based on the 2016 WHO criteria helps to predict patient prognosis more accurately. Our research also provides evidence of the critical role of tumor subtype in glioma management.

We next consider the biomarker information. MGMT is a DNA repair protein that removes alkyl groups and adducts at the O⁶ position of guanine, protecting the cell against mutagenic effects. Promoter methylation of MGMT silences the MGMT gene and causes loss of protein expression, leading to the accumulation of DNA damage and increased sensitivity to temozolomide-based chemoradiotherapy. A prognostic effect of MGMT promoter methylation in patients with lower-grade (34) or higher-grade (35) glioma has already been observed. Located on human chromosome 17p13, the p53 gene is a tumor suppressor that regulates apoptosis, inhibits DNA replication, and controls cell motility and invasion. As a consequence of p53 gene mutation, the mutant p53 protein escapes degradation and accumulates in the cells, leading to positive staining by IHC. A meta-analysis concluded that p53 immunopositivity is useful in analyzing the prognosis of glioma patients (36). As for GFAP, vimentin, and Ki-67, a number of studies (37–39) have concentrated on their prognostic value. However, our analysis only validated the prognostic value of MGMT promoter methylation and p53 immunopositivity, and the other biomarkers need to be further evaluated.

This study has several limitations. First, it was a single-center study, which might make the analysis prone to bias and limit the generalizability of the supervised ML models. Second, the study included cases that occurred before 2016, when the glioma subtype classification differed from the 2016 WHO criteria, so that half of those cases lacked the evidence (for instance, isocitrate dehydrogenase (IDH) status) needed for subdivision under the later classification criteria. This might influence the calibration and discrimination of the prediction models. Nevertheless, we believe our research has merit, given that it is the first study to apply Gradient Boosting algorithms to glioma prognosis prediction. We constructed predictive models successfully and found that Gradient Boosting models were more likely to improve the prediction of glioma patient survival after tumor resection.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement

The studies involving human participants were reviewed and approved by the Institutional Review Board of Sun Yat-sen University. The patients/participants provided their written informed consent to participate in this study.

Author contributions

YL: conception and design of study, acquisition, analysis, and interpretation of data, and drafting the manuscript. MY: conception and design of the study and critically revising the manuscript for important intellectual content. BJ: analysis and interpretation of data and critically revising the manuscript for important intellectual content. LC: conception and design of the study and critically revising the manuscript for important intellectual content. ZZ: conception and design of the study and critically revising the manuscript for important intellectual content. All authors contributed to the article and approved the submitted version.

Funding

The study was funded by the Guangzhou Science and Technology Planning Project (201802020027) of Guangzhou Science and Technology Innovation Committee.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher's note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Supplementary material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fsurg.2022.975022/full#supplementary-material.

References

1. Ostrom QT, et al. CBTRUS statistical report: primary brain and other central nervous system tumors diagnosed in the United States in 2011–2015. Neuro Oncol. (2018) 20(suppl_4):iv1–86. doi: 10.1093/neuonc/noy131

2. Cavaliere R, Lopes MB, Schiff D. Low-grade gliomas: an update on pathology and therapy. Lancet Neurol. (2005) 4(11):760–70. doi: 10.1016/S1474-4422(05)70222-2

3. Gorovets D, et al. IDH mutation and neuroglial developmental features define clinically distinct subclasses of lower grade diffuse astrocytic glioma. Clin Cancer Res. (2012) 18(9):2490–501. doi: 10.1158/1078-0432.CCR-11-2977

4. Gittleman H, Sloan AE, Barnholtz-Sloan JS. An independently validated survival nomogram for lower grade glioma. Neuro Oncol. (2020) 22:665–74. doi: 10.1093/neuonc/noz191

5. Zhang M, et al. Novel immune-related gene signature for risk stratification and prognosis of survival in lower-grade glioma. Front Genet. (2020) 11:363. doi: 10.3389/fgene.2020.00363

6. Han W, et al. Deep transfer learning and radiomics feature prediction of survival of patients with high-grade gliomas. AJNR Am J Neuroradiol. (2020) 41(1):40–8. doi: 10.3174/ajnr.A6365

7. Jakola AS, et al. Advancements in predicting outcomes in patients with glioma: a surgical perspective. Expert Rev Anticancer Ther. (2020) 20(3):167–77. doi: 10.1080/14737140.2020.1735367

8. Kourou K, et al. Machine learning applications in cancer prognosis and prediction. Comput Struct Biotechnol J. (2015) 13:8–17. doi: 10.1016/j.csbj.2014.11.005

9. Barakat NH, Bradley AP, Barakat MN. Intelligible support vector machines for diagnosis of diabetes mellitus. IEEE Trans Inf Technol Biomed. (2010) 14(4):1114–20. doi: 10.1109/TITB.2009.2039485

10. Esteban C, et al. Development of a decision tree to assess the severity and prognosis of stable COPD. Eur Respir J. (2011) 38(6):1294–300. doi: 10.1183/09031936.00189010

11. Passos IC, Mwangi B, Kapczinski F. Big data analytics and machine learning: 2015 and beyond. Lancet Psychiatry. (2016) 3(1):13–5. doi: 10.1016/S2215-0366(15)00549-0

12. Xu Y, et al. Supervised machine learning predictive analytics for triple-negative breast cancer death outcomes. Onco Targets Ther. (2019) 12:9059–67. doi: 10.2147/OTT.S223603

13. Xu Y, et al. Machine learning algorithms for predicting the recurrence of stage IV colorectal cancer after tumor resection. Sci Rep. (2020) 10(1):2519. doi: 10.1038/s41598-020-59115-y

14. Louis DN, et al. The 2016 World Health Organization classification of tumors of the central nervous system: a summary. Acta Neuropathol. (2016) 131(6):803–20. doi: 10.1007/s00401-016-1545-1

15. Okamoto Y, et al. Population-based study on incidence, survival rates, and genetic alterations of low-grade diffuse astrocytomas and oligodendrogliomas. Acta Neuropathol. (2004) 108(1):49–56. doi: 10.1007/s00401-004-0861-z

16. Hu W, et al. Expression of CPEB4 in human glioma and its correlations with prognosis. Medicine (Baltimore). (2015) 94(27):e979. doi: 10.1097/MD.0000000000000979

17. Takami H, et al. Revisiting TP53 mutations and immunohistochemistry—a comparative study in 157 diffuse gliomas. Brain Pathol. (2015) 25(3):256–65. doi: 10.1111/bpa.12173

18. Cai J, et al. ATRX mRNA expression combined with IDH1/2 mutational status and Ki-67 expression refines the molecular classification of astrocytic tumors: evidence from the whole transcriptome sequencing of 169 samples. Oncotarget. (2014) 5(9):2551–61. doi: 10.18632/oncotarget.1838

19. Fan J, Nunn ME, Su X. Multivariate exponential survival trees and their application to tooth prognosis. Comput Stat Data Anal. (2009) 53(4):1110–21. doi: 10.1016/j.csda.2008.10.019

20. Taylor JM. Random survival forests. J Thorac Oncol. (2011) 6(12):1974–5. doi: 10.1097/JTO.0b013e318233d835

21. Ishwaran H, et al. Random survival forests. Ann Appl Stat. (2008) 2(3):841–60. doi: 10.1214/08-AOAS169

22. Rashmi KV, Gilad-Bachrach R. DART: dropouts meet multiple additive regression trees. In: AISTATS (2015).

23. Capelle L, et al. Spontaneous and therapeutic prognostic factors in adult hemispheric World Health Organization grade II gliomas: a series of 1097 cases: clinical article. J Neurosurg. (2013) 118(6):1157–68. doi: 10.3171/2013.1.JNS121

24. Chang EF, et al. Preoperative prognostic classification system for hemispheric low-grade gliomas in adults. J Neurosurg. (2008) 109(5):817–24. doi: 10.3171/JNS/2008/109/11/0817

25. Corell A, et al. Age and surgical outcome of low-grade glioma in Sweden. Acta Neurol Scand. (2018) 138(4):359–68. doi: 10.1111/ane.12973

26. Sun Y, et al. Characteristics and prognostic factors of age-stratified high-grade intracranial glioma patients: a population-based analysis. Bosn J Basic Med Sci. (2019) 19(4):375–83. doi: 10.17305/bjbms.2019.4213

27. Paugh BS, et al. Integrated molecular genetic profiling of pediatric high-grade gliomas reveals key differences with the adult disease. J Clin Oncol. (2010) 28(18):3061–8. doi: 10.1200/JCO.2009.26.7252

28. Jairam V, et al. Defining an intermediate-risk group for low-grade glioma: a national cancer database analysis. Anticancer Res. (2019) 39(6):2911–8. doi: 10.21873/anticanres.13420

29. Leu S, et al. Preoperative two-dimensional size of glioblastoma is associated with patient survival. World Neurosurg. (2018) 115:e448–63. doi: 10.1016/j.wneu.2018.04.067

30. Brown TJ, et al. Association of the extent of resection with survival in glioblastoma: a systematic review and meta-analysis. JAMA Oncol. (2016) 2(11):1460–9. doi: 10.1001/jamaoncol.2016.1373

31. Pan IW, Ferguson SD, Lam S. Patient and treatment factors associated with survival among adult glioblastoma patients: a USA population-based study from 2000 to 2010. J Clin Neurosci. (2015) 22(10):1575–81. doi: 10.1016/j.jocn.2015.03.032

32. Barnholtz-Sloan JS, et al. Racial/ethnic differences in survival among elderly patients with a primary glioblastoma. J Neurooncol. (2007) 85(2):171–80. doi: 10.1007/s11060-007-9405-4

33. Aizer AA, et al. Underutilization of radiation therapy in patients with glioblastoma: predictive factors and outcomes. Cancer. (2014) 120(2):238–43. doi: 10.1002/cncr.28398

34. Bell EH, et al. Association of MGMT promoter methylation status with survival outcomes in patients with high-risk glioma treated with radiotherapy and temozolomide: an analysis from the NRG oncology/RTOG 0424 trial. JAMA Oncol. (2018) 4(10):1405–9. doi: 10.1001/jamaoncol.2018.1977

35. Gilbert MR, et al. Dose-dense temozolomide for newly diagnosed glioblastoma: a randomized phase III clinical trial. J Clin Oncol. (2013) 31(32):4085–91. doi: 10.1200/JCO.2013.49.6968

36. Jin Y, et al. Expression and prognostic significance of p53 in glioma patients: a meta-analysis. Neurochem Res. (2016) 41(7):1723–31. doi: 10.1007/s11064-016-1888-y

37. Chen WJ, et al. Ki-67 is a valuable prognostic factor in gliomas: evidence from a systematic review and meta-analysis. Asian Pac J Cancer Prev. (2015) 16(2):411–20. doi: 10.7314/APJCP.2015.16.2.411

38. Schwab DE, et al. Immunohistochemical comparative analysis of GFAP, MAP-2, NOGO-A, OLIG-2 and WT-1 expression in WHO 2016 classified neuroepithelial tumours and their prognostic value. Pathol Res Pract. (2018) 214(1):15–24. doi: 10.1016/j.prp.2017.12.009

39. Lin L, et al. Analysis of expression and prognostic significance of vimentin and the response to temozolomide in glioma patients. Tumour Biol. (2016) 37(11):15333–9. doi: 10.1007/s13277-016-5462-7

Keywords: glioma, machine learning, Gradient Boosting, survival, tumor resection

Citation: Li Y, Ye M, Jia B, Chen L and Zhou Z (2023) Practice of the new supervised machine learning predictive analytics for glioma patient survival after tumor resection: Experiences in a high-volume Chinese center. Front. Surg. 9:975022. doi: 10.3389/fsurg.2022.975022

Received: 21 June 2022; Accepted: 28 December 2022; Published: 17 February 2023.

Edited by:

Rafael De La Garza Ramos, Montefiore Medical Center, United States

Reviewed by:

Anant Naik, University of Illinois at Urbana-Champaign, United States
Vinayak Narayan, Northwell Health, United States

© 2023 Li, Ye, Jia, Chen and Zhou. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Linwei Chen, wjszhongdafuyi@163.com; Zubang Zhou, zzbxjh@126.com

^†These authors have contributed equally to this work and share first authorship

Specialty Section: This article was submitted to Neurosurgery, a section of the journal Frontiers in Surgery