AUTHOR=Cho Jungheum, Kim Young Jae, Sunwoo Leonard, Lee Gi Pyo, Nguyen Toan Quang, Cho Se Jin, Baik Sung Hyun, Bae Yun Jung, Choi Byung Se, Jung Cheolkyu, Sohn Chul-Ho, Han Jung-Ho, Kim Chae-Yong, Kim Kwang Gi, Kim Jae Hyoung
TITLE=Deep Learning-Based Computer-Aided Detection System for Automated Treatment Response Assessment of Brain Metastases on 3D MRI
JOURNAL=Frontiers in Oncology
VOLUME=11
YEAR=2021
URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2021.739639
DOI=10.3389/fonc.2021.739639
ISSN=2234-943X
ABSTRACT=Background

Although accurate treatment response assessment for brain metastases (BMs) is crucial, it is highly labor intensive. This retrospective study aimed to develop a computer-aided detection (CAD) system for automated BM detection and treatment response evaluation using deep learning.

Methods

We included 214 consecutive MRI examinations of 147 patients with BM obtained between January 2015 and August 2016. These were split by temporal separation into a training dataset (174 MR images from 127 patients) and a test dataset (temporal test set #1; 40 MR images from 20 patients). For external validation, 24 patients with BM and 11 patients without BM from other institutions were included (geographic test set). In addition, we included 12 MRIs from BM patients obtained between August 2017 and March 2020 (temporal test set #2). We assessed detection sensitivity, the Dice similarity coefficient (DSC) for segmentation, and agreement between the CAD system and radiologists on the one-dimensional and volumetric Response Assessment in Neuro-Oncology Brain Metastases (RANO-BM) criteria.
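
As a minimal sketch of the detection and segmentation metrics named above (not the authors' implementation), the DSC for two binary masks A and B is 2|A∩B| / (|A| + |B|), and detection sensitivity is the fraction of reference lesions that were detected. The function names and the flat 0/1 mask representation are assumptions made for illustration.

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient for two flat binary masks (0/1 per voxel).

    Hypothetical helper: the mask layout is an assumption, not from the paper.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labeled as lesion in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total lesion voxels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        # Both masks empty: treated here as perfect agreement (a convention).
        return 1.0
    return 2.0 * intersection / total


def detection_sensitivity(n_detected, n_reference):
    """Fraction of reference lesions that the detector found."""
    if n_reference <= 0:
        raise ValueError("n_reference must be positive")
    return n_detected / n_reference


# Toy example: prediction covers 2 voxels, reference covers 1, overlap is 1.
toy_dsc = dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0])
```

In the toy example the overlap is one voxel and the two masks contain three lesion voxels in total, so the DSC is 2/3.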

Results

In the temporal test set #1, the sensitivity was 75.1% (95% confidence interval [CI]: 69.6%, 79.9%), mean DSC was 0.69 ± 0.22, and false-positive (FP) rate per scan was 0.8 for BM ≥ 5 mm. Agreement between the CAD system and radiologists on the RANO-BM criteria was moderate (κ, 0.52) for one-dimensional assessment and substantial (κ, 0.68) for volumetric assessment. In the geographic test set, sensitivity was 87.7% (95% CI: 77.2%, 94.5%), mean DSC was 0.68 ± 0.20, and FP rate per scan was 1.9 for BM ≥ 5 mm. In the temporal test set #2, sensitivity was 94.7% (95% CI: 74.0%, 99.9%), mean DSC was 0.82 ± 0.20, and FP rate per scan was 0.5 (6/12) for BM ≥ 5 mm.
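
To illustrate how figures of this kind are derived, the sketch below computes an FP rate per scan and an unweighted Cohen's κ. It is illustrative only: the abstract does not state whether κ was weighted, and the response labels (CR, PR, SD, PD) and the example ratings are hypothetical, not study data. Only the 6/12 ratio is taken from the text above.

```python
from collections import Counter


def fp_rate_per_scan(n_false_positives, n_scans):
    """Average number of false-positive detections per scan."""
    return n_false_positives / n_scans


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels.

    Assumption: unweighted kappa; the source does not specify the variant.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of cases where both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical category for every case.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Ratio reported for temporal test set #2: 6 false positives over 12 scans.
fp_rate = fp_rate_per_scan(6, 12)

# Hypothetical response categories for six patients (not study data).
cad_ratings = ["PR", "SD", "PD", "SD", "CR", "PD"]
radiologist_ratings = ["PR", "SD", "SD", "SD", "CR", "PD"]
kappa = cohen_kappa(cad_ratings, radiologist_ratings)
```

With these hypothetical ratings the raters agree on five of six cases, chance agreement is 10/36, and κ works out to 10/13 (about 0.77).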

Conclusions

Our CAD system showed potential for automated treatment response assessment of BM ≥ 5 mm.