AUTHOR=Bozsik Bence, Tóth Eszter, Polyák Ilona, Kerekes Fanni, Szabó Nikoletta, Bencsik Krisztina, Klivényi Péter, Kincses Zsigmond Tamás TITLE=Reproducibility of Lesion Count in Various Subregions on MRI Scans in Multiple Sclerosis JOURNAL=Frontiers in Neurology VOLUME=Volume 13 - 2022 YEAR=2022 URL=https://www.frontiersin.org/journals/neurology/articles/10.3389/fneur.2022.843377 DOI=10.3389/fneur.2022.843377 ISSN=1664-2295 ABSTRACT=Purpose: Lesion number and burden can predict the long-term outcome in multiple sclerosis, and the localization of the lesions is also a good predictive marker of disease progression. These biomarkers are used in studies and in clinical practice, but the reproducibility of lesion count is not well known. Methods: Five raters evaluated T2-hyperintense lesions in 140 multiple sclerosis patients in six locations: periventricular, juxtacortical, deep white matter, infratentorial, spinal cord, and optic nerve. Black holes on T1-weighted images and brain atrophy were subjectively assessed on a binary scale. Reproducibility was measured using the intraclass correlation coefficient (ICC). ICCs were also calculated for the four most accurate raters to see how a single outlier can influence the results. Results: Overall, moderate reproducibility (ICC 0.5 – 0.75) was shown, which did not improve considerably when the most divergent rater was excluded. The areas that produced the worst results were the optic nerve region (ICC: 0.118) and atrophy judgement (ICC: 0.364). Comparing high and low lesion burden in each region revealed that the ICC is higher when the lesion count is in the mid-range. In the periventricular and deep white matter areas, where lesions are common, higher ICC was found in patients who had a lower lesion count. On the other hand, juxtacortical lesions and black holes, which are less common, showed higher ICC when the subjects had more lesions.
This difference was significant in the juxtacortical region when, among the most accurate raters, patients with low (ICC: 0.406, CI: 0.273 – 0.546) and high (ICC: 0.702, CI: 0.603 – 0.785) lesion loads were compared. Conclusion: Lesion classification showed high variability by location and overall moderate reproducibility. The excellent range was not reached, which may be attributed to the poor performance in some areas. Hence, investing effort in the development of artificial intelligence for the evaluation of lesion burden should be considered.