AUTHOR=Zhan Geng, Wang Dongang, Cabezas Mariano, Bai Lei, Kyle Kain, Ouyang Wanli, Barnett Michael, Wang Chenyu TITLE=Learning from pseudo-labels: deep networks improve consistency in longitudinal brain volume estimation JOURNAL=Frontiers in Neuroscience VOLUME=Volume 17 - 2023 YEAR=2023 URL=https://www.frontiersin.org/journals/neuroscience/articles/10.3389/fnins.2023.1196087 DOI=10.3389/fnins.2023.1196087 ISSN=1662-453X ABSTRACT=Introduction: Brain atrophy is a critical biomarker of disease progression and treatment response in neurodegenerative diseases such as multiple sclerosis (MS). Confounding factors such as inconsistent imaging acquisitions hamper the accurate measurement of brain atrophy in the clinic. This study aims to develop and validate a robust deep learning model to overcome these challenges and to evaluate its impact on the measurement of disease progression. Methods: Voxel-wise pseudo-atrophy labels were generated using SIENA, a widely adopted tool for the measurement of brain atrophy in MS. Deformation maps were produced for 195 pairs of longitudinal 3D T1 scans from patients with MS. A 3D U-Net, namely DeepBVC, was developed specifically to overcome common variations in resolution, signal-to-noise ratio and contrast ratio between baseline and follow-up scans. The performance of DeepBVC was compared against SIENA using the McLaren test-retest dataset and 233 in-house MS subjects with MRI from multiple time points. Clinical evaluation included disability assessment with the Expanded Disability Status Scale (EDSS) and traditional imaging metrics such as lesion burden. Results: For 3 subjects in test-retest experiments, the median percent brain volume change (PBVC) for DeepBVC and SIENA was 0.105% vs. 0.198% (subject 1), 0.061% vs. 0.084% (subject 2), 0.104% vs. 0.408% (subject 3). 
For testing consistency across multiple time points in individual MS subjects, the mean (± standard deviation) PBVC differences for DeepBVC and SIENA were 0.028% (± 0.145%) and 0.031% (± 0.154%), respectively. The linear correlations with baseline T2 lesion volume were r = −0.288 (p < 0.05) and r = −0.249 (p < 0.05) for DeepBVC and SIENA, respectively. There was no significant correlation of disability progression with PBVC as estimated by either method (p = 0.86, p = 0.84).