AUTHOR=Booth Thomas C., Grzeda Mariusz, Chelliah Alysha, Roman Andrei, Al Busaidi Ayisha, Dragos Carmen, Shuaib Haris, Luis Aysha, Mirchandani Ayesha, Alparslan Burcu, Mansoor Nina, Lavrador Jose, Vergani Francesco, Ashkan Keyoumars, Modat Marc, Ourselin Sebastien
TITLE=Imaging Biomarkers of Glioblastoma Treatment Response: A Systematic Review and Meta-Analysis of Recent Machine Learning Studies
JOURNAL=Frontiers in Oncology
VOLUME=12
YEAR=2022
URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2022.799662
DOI=10.3389/fonc.2022.799662
ISSN=2234-943X
ABSTRACT=

Objective

Monitoring biomarkers using machine learning (ML) may determine glioblastoma treatment response. We systematically reviewed quality and performance accuracy of recently published studies.

Methods

Following Preferred Reporting Items for Systematic Reviews and Meta-Analysis: Diagnostic Test Accuracy, we extracted articles from MEDLINE, EMBASE and Cochrane Register published between 09/2018 and 01/2021. Included study participants were adults with glioblastoma who had undergone standard treatment (maximal resection, radiotherapy with concomitant and adjuvant temozolomide) and follow-up imaging to determine treatment response status (specifically, distinguishing progression/recurrence from progression/recurrence mimics, the target condition). Using Quality Assessment of Diagnostic Accuracy Studies Two/Checklist for Artificial Intelligence in Medical Imaging, we assessed bias risk and applicability concerns. We determined test set performance accuracy (sensitivity, specificity, precision, F1-score, balanced accuracy). We used a bivariate random-effects model to determine pooled sensitivity, specificity and area under the receiver operating characteristic curve (ROC-AUC). Pooled measures of balanced accuracy, positive/negative likelihood ratios (PLR/NLR) and diagnostic odds ratio (DOR) were also calculated. The review was registered with PROSPERO (CRD42021261965).
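As an illustration of the per-study accuracy measures named above, the following minimal Python sketch computes them from a single 2x2 confusion matrix using their standard definitions. It is not the review's extraction or pooling code; the names `ConfusionCounts` and `diagnostic_metrics` and the example counts are hypothetical, and the bivariate random-effects pooling step is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Test-set counts; 'positive' means progression/recurrence (the target condition)."""
    tp: int  # progression correctly called progression
    fp: int  # mimic wrongly called progression
    tn: int  # mimic correctly called mimic
    fn: int  # progression wrongly called mimic


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Standard diagnostic accuracy measures for one study.

    Assumes no cell combination produces a zero denominator; real
    meta-analyses often apply a continuity correction in that case,
    which this sketch does not do.
    """
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    precision = c.tp / (c.tp + c.fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    balanced_accuracy = (sensitivity + specificity) / 2
    plr = sensitivity / (1 - specificity)   # positive likelihood ratio
    nlr = (1 - sensitivity) / specificity   # negative likelihood ratio
    dor = plr / nlr                         # diagnostic odds ratio
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "balanced_accuracy": balanced_accuracy,
        "plr": plr,
        "nlr": nlr,
        "dor": dor,
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = diagnostic_metrics(ConfusionCounts(tp=40, fp=20, tn=30, fn=10))
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, sensitivity is 40/50 and specificity is 30/50, so the DOR equals (40 x 30)/(20 x 10), which is the same value obtained as PLR/NLR.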

Results

Eighteen studies were included (1335/384 patients for training/testing respectively). Small patient numbers, high bias risk, applicability concerns (particularly confounding in the reference standard and patient selection) and a low level of evidence allow only limited conclusions to be drawn from the studies. Ten studies (10/18, 56%) included in the meta-analysis gave 0.769 (0.649-0.858) sensitivity [pooled (95% CI)]; 0.648 (0.532-0.749) specificity; 0.706 (0.623-0.779) balanced accuracy; 2.220 (1.560-3.140) PLR; 0.366 (0.213-0.572) NLR; 6.670 (2.800-13.500) DOR; 0.765 ROC-AUC.
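As a rough plausibility check on the pooled figures, the sketch below derives balanced accuracy, PLR, NLR and DOR directly from the pooled sensitivity and specificity point estimates. This is only an illustration: the abstract reports these measures as separately pooled quantities, so the derived values are expected to be close to, but not identical with, the reported ones. The function name `derived_from_pooled` is hypothetical.

```python
def derived_from_pooled(sensitivity: float, specificity: float) -> dict[str, float]:
    """Derive summary measures from a single (sensitivity, specificity) pair.

    This is plain arithmetic on point estimates, not a meta-analytic
    pooling procedure, and it gives no confidence intervals.
    """
    plr = sensitivity / (1 - specificity)
    nlr = (1 - sensitivity) / specificity
    return {
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "plr": plr,
        "nlr": nlr,
        "dor": plr / nlr,
    }


# Pooled point estimates as reported in the Results.
derived = derived_from_pooled(sensitivity=0.769, specificity=0.648)

# Reported pooled values, for side-by-side comparison.
reported = {"balanced_accuracy": 0.706, "plr": 2.220, "nlr": 0.366, "dor": 6.670}

for name in reported:
    print(f"{name}: derived {derived[name]:.3f} vs reported {reported[name]:.3f}")
```

The derived balanced accuracy, PLR and NLR land within a few hundredths of the reported pooled values, while the derived DOR is somewhat lower than the reported 6.670, as can happen when each measure is pooled across studies in its own right.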

Conclusion

ML models using MRI features to distinguish between progression and mimics appear to demonstrate good diagnostic performance. However, study quality and design require improvement.