Thalamic Atrophy Without Whole Brain Atrophy Is Associated With Absence of 2-Year NEDA in Multiple Sclerosis

Purpose: To study which brain volume measures best differentiate early relapsing MS (RMS) and secondary progressive MS (SPMS) patients and correlate with disability and cognition. To test whether isolated thalamic atrophy at study baseline correlates with NEDA (no evidence of disease activity) at 2 years.
Methods: Total and regional brain volumes were measured from 24 newly diagnosed RMS patients 6 months after initiation of therapy and 2 years thereafter, and in 36 SPMS patients. Volumes were measured by SIENAX and cNeuro. The patients were divided into subgroups based on whole brain parenchyma (BP) and thalamic atrophy at baseline. Standard scores (z-scores) were computed by comparing individual brain volumes against healthy controls. A z-score cut-off of −1.96 was applied to separate atrophic from normal brain volumes. The Expanded Disability Status Scale (EDSS) and Symbol Digit Modalities Test (SDMT) were assessed at baseline and at 2 years. Differences in achieving NEDA-3, NEDA-4, EDSS progression, and SDMT change were analyzed between patients with no thalamic or BP atrophy and in patients with isolated thalamic atrophy at baseline.
Results: At baseline, 7 SPMS and 12 RMS patients had no brain atrophy, 8 SPMS and 10 RMS patients had isolated thalamic atrophy and 2 RMS and 20 SPMS patients had both BP and thalamic atrophy. NEDA-3 was reached in 11/19 patients with no brain atrophy but only in 2/16 patients with isolated thalamic atrophy (p = 0.012). NEDA-4 was reached in 7/19 patients with no brain atrophy and in 1/16 of the patients with isolated thalamic atrophy (p = 0.047). At 2 years, EDSS was same or better in 16/19 patients with no brain atrophy but only in 5/17 patients with isolated thalamic atrophy (p = 0.002). There was no significant difference in the EDSS, relapses or SDMT between patients with isolated thalamic atrophy and no atrophy at baseline.
Conclusion: Patients with isolated thalamic atrophy were at a higher risk for not reaching 2-year NEDA-3 and for EDSS increase than patients with no identified brain atrophy. The groups were clinically indistinguishable. A single measurement of thalamic and whole brain atrophy could help identify patients needing most effective therapies from early on.


INTRODUCTION
The quantification of brain atrophy by MRI has become an increasingly important part of evaluating neurodegeneration in MS (1,2). Atrophy measures can reflect the damage to the central nervous system (CNS) caused by the pathological processes of the disease. However, some contributors to volumetric change, such as fluid shifts, are potentially more reversible than others, e.g., axonal and/or neuronal loss. Brain atrophy occurs at all clinical stages in untreated MS patients at a rate of 0.5-1.35%/year, in comparison with 0.1-0.3%/year in healthy individuals (1). Thalamus and other deep gray matter (GM) nuclei are among the first GM structures to be affected in MS. Several studies have shown associations between GM atrophy and measures of disease progression, such as accumulation of physical disability (3,4) and cognitive impairment (5). Thalamic atrophy occurs from early on in the disease course (6) and it has been associated with the transition from clinically isolated syndrome (CIS) to definite MS (7). Thalamic atrophy has been shown to correlate with accumulation of disability in patients with MS (8).
Brain atrophy measures could help predict the risk of future cognitive impairment and disability progression. Knowledge of the extent of brain atrophy could affect early treatment decisions, such as choice of disease modifying therapy (DMT). However, brain volume is not routinely measured in normal clinical practice for several reasons. There are no standardized guidelines on how to measure brain atrophy in MS and the measures are potentially influenced by technical, biological, and pharmacological factors (2). SIENA and SIENAX are the most established automated methods for measuring brain atrophy. Even though these methods are automated, their use in image analysis requires time-consuming refinement of the images. In recent years, several other automated methods of brain segmentation, such as the FIRST tool from the FMRIB Software Library (FSL) (9) and FreeSurfer, have been introduced (10). These methods have mainly been used in research settings, and their applicability in real life and implementation in evaluating treatment outcomes is not yet included in the guidelines for the use of MRI in MS (2).
NEDA (no evidence of disease activity) has emerged as a potential treatment target in patients with relapsing MS (RMS). NEDA-3 is defined as no clinical relapse, no confirmed Expanded Disability Status Scale (EDSS) disability progression sustained during the follow-up period, no new gadolinium (Gd)-enhancing lesions, and no new or enlarging T2 lesions on MRI; NEDA-4 adds annualized whole brain volume loss ≤0.4% as a fourth criterion (11). NEDA-3 has been used to evaluate DMT (12,13) and is becoming integrated into clinical decision-making. It has been shown to predict long term disability (11). However, the implementation of NEDA-4 in clinical practice is currently hindered by logistical and technical difficulties (11).
A recent study showed that MRI-based brain volumetry at a single time point was able to reliably distinguish MS patients with isolated thalamic atrophy from those without brain atrophy (14). These patients could not be clinically distinguished from the RMS patients with no thalamic or whole brain atrophy, but patients with both thalamic and whole brain parenchyma (BP) atrophy showed significantly higher EDSS scores than patients in the other groups. The authors suggested that grouping patients based on MRI-based volumetry could provide information that goes beyond clinical assessments and could help identify MS patients at risk of developing widespread atrophy and disease progression.
We performed a prospective 2-year clinical follow-up study in a single academic center. A fully automated multi-atlas segmentation-based method, cNeuro (15), was used for the regional brain volume assessment, SIENAX for measurement of whole brain volume, and SIENA for volume changes. Our first objective was to study which brain volume measures best differentiate early RMS and secondary progressive MS (SPMS) patients and correlate with disability and cognition. Secondly, we wished to test whether isolated thalamic atrophy at study baseline correlates with clinical disability worsening, relapses, cognition, and MRI outcomes at 2 years. To meet these aims, total and regional brain volumes were measured from 3D T1 brain MR images of 24 newly diagnosed RMS patients and 36 SPMS patients at baseline and after 2 years. The correlations of global and regional brain volumes with cognition and disability were analyzed at baseline and after 2 years. For the second aim, the patients were divided into subgroups according to baseline thalamic and whole brain atrophy, and subgroup differences in reaching NEDA-3 and NEDA-4 status, in EDSS progression and in cognitive performance after 2 years were analyzed. We have earlier shown that vitamin D supplement use is associated with less brain atrophy progression in a pooled analysis of the patients from the FREEDOMS trials (16), while others have shown that serum 25(OH)D levels are associated with clinical and MRI outcomes in MS patients receiving interferon-beta (17). Therefore, we included measurement of 25(OH)D levels in the baseline characteristics.

MATERIALS AND METHODS
Ethics Committee Approval
The study was approved by the Ethics Committee of Turku University Hospital and University of Turku on 21 January 2014, and written informed consent was obtained from all patients participating in the study.

Subjects
In this prospective study, 24 patients with newly diagnosed RMS on first-line immunomodulatory treatment initiated 6 months before study baseline, and 36 patients with SPMS were included. Inclusion criteria for the RMS patients were RMS fulfilling McDonald 2010 criteria, EDSS 0-3.5, and either glatiramer acetate or interferon-beta treatment initiated within 12 months as the first DMT. In the SPMS group, patients who had entered the secondary progressive stage of MS, as defined by the treating neurologist, and had EDSS 4-6.5 were included. All the participants were patients at the out-patient clinic of Turku University Hospital. Exclusion criteria were malignancy, contraindications to MRI, pregnancy or planning of a pregnancy, and failure to obtain informed consent from the patient. The following background data were collected using the structured StellarQ MS registry (www.stellarq.com): age, sex, date of MS diagnosis, first symptoms, neurological status findings, EDSS at baseline and at 2 years, serum concentration of 25(OH)D (25-hydroxyvitamin D), socioeconomic status, data on immunomodulatory drug usage, and data on relapses from the date of first symptoms until the end of the study. The first patient was enrolled in March 2014 and the last patient in January 2015. NEDA-3 status was defined as no relapses, no new or enhancing lesions on MRI, and no disability progression from the study baseline (6 months after initiation of DMT in the RMS patients) to 2 years later. NEDA-4 was defined as NEDA-3 together with annualized whole brain volume loss ≤0.4%, as determined by SIENA.
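The NEDA-3 and NEDA-4 definitions above can be expressed as a small classification rule. The following is a minimal sketch and not the study's own code: the record type `FollowUp`, its field names, and the simple linear annualization of the SIENA percentage brain volume change are assumptions made for illustration.

```python
from dataclasses import dataclass

NEDA4_MAX_ANNUAL_LOSS_PCT = 0.4  # threshold from the text: annualized loss <= 0.4%


@dataclass
class FollowUp:
    """Hypothetical per-patient follow-up record (field names are illustrative)."""
    relapses: int                   # clinical relapses during follow-up
    edss_progression: bool          # disability progression during follow-up
    new_or_enhancing_lesions: bool  # any new or enhancing lesions on MRI
    pbvc: float                     # percentage brain volume change (negative = loss)
    years: float                    # length of follow-up in years


def annualized_loss_pct(f: FollowUp) -> float:
    """Brain volume loss per year in percent, assuming linear annualization."""
    return -f.pbvc / f.years


def is_neda3(f: FollowUp) -> bool:
    """No relapses, no disability progression, no new or enhancing lesions."""
    return f.relapses == 0 and not f.edss_progression and not f.new_or_enhancing_lesions


def is_neda4(f: FollowUp) -> bool:
    """NEDA-3 plus annualized whole brain volume loss within the threshold."""
    return is_neda3(f) and annualized_loss_pct(f) <= NEDA4_MAX_ANNUAL_LOSS_PCT


# Example: clinically and radiologically stable, but 0.9% volume loss over 2 years.
patient = FollowUp(relapses=0, edss_progression=False,
                   new_or_enhancing_lesions=False, pbvc=-0.9, years=2.0)
print(is_neda3(patient), is_neda4(patient))  # True False
```

With a 2-year follow-up, the 0.4%/year threshold corresponds to a total loss of 0.8%, so the example patient reaches NEDA-3 but not NEDA-4.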

MRI Acquisition
MRIs were obtained from the 24 newly diagnosed RMS patients 6 months after initiating DMT and 2 years later, and from the 36 SPMS patients at study baseline and 2 years later. Images were analyzed by the same neuroradiologist with long experience in MS image analyses (JOK). Two female patients did not undergo the 2-year MRI: one refused, and the other developed a total atrioventricular block that necessitated pacemaker implantation before the second MRI. The patients had to be clinically stable with no corticosteroid administration within 30 days before the MRI. All brain scans were acquired at the same radiological facility using the same scanner and the same acquisition protocol settings. We aimed to minimize the pseudoatrophy effect in the RMS group by timing the baseline MRI 6 months after the onset of the DMT. It is routine clinical practice in Finland to rescan all patients 6 months after initiation of DMT (18).

Global and Regional Brain Volume Measurement
Baseline white and gray matter volumes, normalized for subject head size, were determined by SIENAX and volume change between baseline and 2 years by SIENA on the 3D T1 MRI images acquired prior to gadolinium administration (19,20) by an experienced neuroradiologist (TP). Manually generated lesion masks and the lesion-filling tool (part of FSL) were used to minimize the impact of hypointense T1 lesions on volume measurements (21). Briefly, the lesion-filling tool uses the co-registered lesion masks and structural images (i.e., images to be filled) to fill the lesions with intensities that are similar to those in the non-lesional neighboring white matter.
In the SIENAX analyses, the same BET (Brain Extraction Tool) parameters (f = 0.1; g = 0; option "B") were used for all images as described by Popescu et al. (22). Quality control was performed to exclude imaging artifacts and when necessary, BET parameters were further refined individually after manual correction of the segmentations. Regional brain volumes were determined at both time points by a fully automated multi-atlas segmentation tool cNeuro (Combinostics Ltd, Tampere, Finland) from the 3D T1 MRI images that were obtained prior to Gadolinium administration (15,23). Volumes normalized for age, sex and head size were used in the statistical analyses. The cNeuro method was used since it is a fully automated tool for brain atrophy measurement, it has a CE marking and is in clinical use at the Turku University Hospital. The segmentation method described by Koikkalainen et al. (23), and Wang et al. (24), was used to compute volumes of white-matter lesions from 3D FLAIR images.

Clinical and Neuropsychological Assessment
The Symbol digit modalities test [SDMT (25)] was performed at baseline and at 24 months by the study nurse and the results were collected using SQ-MS. EDSS was performed by a trained neurologist with 21 years of experience in EDSS evaluation (M S-H). EDSS and SDMT evaluations were performed within 2 weeks of the MRI acquisition.

25(OH)D Measurement
Plasma levels of 25(OH)D were determined in Turku University Hospital clinical chemistry laboratory (Tykslab) using a chemiluminescence binding assay (Roche Diagnostics GmbH, Mannheim, Germany). The values were adjusted by the season of sampling using the following formula: -sin(2πX/12)cos(2πX/12), where X is month of sample collection (26). Values adjusted for the season were used for the statistical analyses.

Standard Score (z-Score) Calculation and Grouping of MS-Patients According to Atrophy Measures
Standard scores (z-scores) were defined by comparing individual brain volumes with corresponding volumes from the Open Access Series of Imaging Studies (OASIS) cohort of 295 healthy controls acquired using Siemens 3T scanners (27). A z-score cut-off of −1.96 was applied to separate pathologically atrophic from normal brain volumes for thalamus and whole brain parenchymal (BP) volume (accepting a 2.5% error probability). Based on z-scores the patients were divided into groups with no brain atrophy, isolated thalamic atrophy, both thalamic and whole BP atrophy, and whole BP atrophy without thalamic atrophy.
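The z-score computation and the four-way grouping can be sketched as follows. This is an illustrative sketch under stated assumptions, not the cNeuro implementation: the function names are hypothetical, the reference volumes are assumed to be already normalized, the sample standard deviation is used, and a volume is treated as atrophic when its z-score is strictly below the cut-off.

```python
from statistics import mean, stdev

Z_CUTOFF = -1.96  # cut-off stated in the text


def z_score(volume: float, reference_volumes: list[float]) -> float:
    """Standard score of one volume against a healthy reference sample.

    Assumes both the patient volume and the reference volumes have already
    been normalized (e.g. for age, sex and head size).
    """
    return (volume - mean(reference_volumes)) / stdev(reference_volumes)


def atrophy_group(thalamus_z: float, bp_z: float, cutoff: float = Z_CUTOFF) -> str:
    """Assign one of the four atrophy groups from thalamic and whole BP z-scores."""
    thalamic = thalamus_z < cutoff  # assumption: strictly below the cut-off is atrophic
    whole_bp = bp_z < cutoff
    if thalamic and whole_bp:
        return "thalamic and BP atrophy"
    if thalamic:
        return "isolated thalamic atrophy"
    if whole_bp:
        return "BP atrophy without thalamic atrophy"
    return "no brain atrophy"


# z-scores reported in the text for the single patient with BP atrophy only.
print(atrophy_group(thalamus_z=-1.88, bp_z=-3.34))
# prints "BP atrophy without thalamic atrophy"
```

The example also shows how sensitive the grouping is near the threshold: a thalamic z-score of −1.88 is only 0.08 above the cut-off.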

Statistical Analysis
Cognition, disability, and brain volumes were compared between RMS and SPMS patients. Correlations of brain volumes with cognition and disability were analyzed. In the group comparisons, Fisher's exact test was used for categorical parameters and the Wilcoxon rank-sum test for continuous variables. Pearson correlation coefficients were used in the correlation analyses and the significances of correlations were obtained from the t distribution. To control the false discovery rate, the Benjamini-Hochberg procedure was used as correction for multiple comparisons. After grouping of the patients based on the z-scores, between-group differences in NEDA-3, NEDA-4, EDSS, and SDMT were tested using Fisher's exact test, and the p-values were validated using a Monte Carlo simulated chi-square test. Additional analysis for 2 × 2 contingency tables was done using predictive values, logistic regression-derived odds ratios and confidence intervals. R was used for all the analyses and p-values < 0.05 were considered significant. The NEDA-3 positive predictive value was defined as the proportion of patients with thalamic atrophy not reaching NEDA-3 among all patients with thalamic atrophy (true positives). The NEDA-3 negative predictive value was defined as the proportion of patients with no thalamic atrophy reaching NEDA-3 among all patients with no thalamic atrophy (true negatives). Similarly, the EDSS positive predictive value was defined as the proportion of patients with thalamic atrophy and EDSS progression among all patients with thalamic atrophy (true positives). The EDSS negative predictive value was defined as the proportion of patients with no thalamic atrophy and no EDSS progression among patients with no thalamic atrophy (true negatives). The number of patients reaching NEDA-4 was too small for a meaningful predictive value analysis.
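The 2 × 2 analyses described above can be illustrated with a short sketch. The analyses in the study were run in R; the Python code below is an independent re-implementation for illustration only, with hypothetical function names. It computes a two-sided Fisher's exact p-value from the hypergeometric distribution and the predictive values as defined above, using the NEDA-3 counts given in the abstract (11/19 without atrophy and 2/16 with isolated thalamic atrophy reached NEDA-3). Restricting the predictive values to these two groups is an assumption of this sketch.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    k_min, k_max = max(0, col1 - row2), min(row1, col1)

    def weight(k: int) -> int:
        # Unnormalized hypergeometric probability (exact integer arithmetic).
        return comb(row1, k) * comb(row2, col1 - k)

    observed = weight(a)
    extreme = sum(weight(k) for k in range(k_min, k_max + 1) if weight(k) <= observed)
    return extreme / comb(row1 + row2, col1)


def predictive_values(atrophy_bad: int, atrophy_total: int,
                      no_atrophy_good: int, no_atrophy_total: int) -> tuple[float, float]:
    """(PPV, NPV): bad outcome given atrophy, good outcome given no atrophy."""
    return atrophy_bad / atrophy_total, no_atrophy_good / no_atrophy_total


# Rows: no atrophy, isolated thalamic atrophy. Columns: NEDA-3 reached, not reached.
p_neda3 = fisher_exact_two_sided(11, 8, 2, 14)
print(round(p_neda3, 3))  # 0.012

ppv, npv = predictive_values(atrophy_bad=14, atrophy_total=16,
                             no_atrophy_good=11, no_atrophy_total=19)
print(round(ppv, 3), round(npv, 3))  # 0.875 0.579
```

The same function applied to the NEDA-4 counts from the abstract (7/19 vs. 1/16) gives approximately 0.047, in agreement with the reported value.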

RESULTS
Baseline Clinical Characteristics and SDMT Test Results in the RMS and SPMS Groups
The baseline clinical characteristics and SDMT test results in the RMS and SPMS patients are shown in Table 1. There were significant differences between RMS and SPMS patients in age, disability, disease activity and cognition. The SPMS patients were older, had longer disease duration, higher EDSS scores and lower SDMT scores. There were no significant differences between the RMS and SPMS patients in their comorbidities, alcohol consumption, education status, smoking habits, serum vitamin D levels or BMI. All patients in the RMS group started either glatiramer acetate (GA) or interferon-beta (IFNB) therapy 6 months prior to the baseline MRI. A total of 30% of the SPMS patients included in the study were using DMTs. The DMTs used by the patients are shown in Supplemental Table 1. As a local treatment practice, all patients were supplemented with vitamin D3. The mean dose of vitamin D3 supplement was 65.9 µg (range 50-150) in the RMS patients and 55.2 µg (range 10-100) in the SPMS patients. One of the patients in the SPMS group had ulcerative colitis treated with azathioprine. Other comorbidities included depression, arterial hypertension, glaucoma, irritable bowel syndrome, hypothyroidism, hypercholesterolemia, epilepsy, mitral prolapse, and frozen shoulder with no clustering of any comorbidities in either patient group (data not shown).

MRI Characteristics in the RMS and SPMS Groups
First, we examined the differences in MRI characteristics between the RMS and SPMS patients. Significant differences between RMS and SPMS patients at both study baseline and at 2 years were detected in the white matter lesion volume, total BP, total gray and white matter, thalamus, hippocampus, cerebellar white matter, and putamen volumes. The volumes normalized for age, sex and head size were used in the analyses. The total and regional brain volumes of the patients at study baseline and at 2 years are shown in Table 2. The most significant differences between RMS and SPMS patients were found in thalamic volume (p < 0.001) and cerebellar WM volume (p < 0.001) at both time points. The numbers of T2 lesions and gadolinium-enhancing lesions at study baseline and at 2 years in the RMS and SPMS patient groups are shown in Supplemental Table 2. We aimed to minimize the effect of pseudoatrophy in the RMS group by timing the baseline MRI 6 months after the onset of the DMT.

Correlation of Total and Regional Brain Volumes and Lesion Volume With EDSS and SDMT
Results of the correlation analyses of the brain volumes with cognition and disability are presented in Table 3. In brief, better performance in the SDMT significantly correlated with larger cerebellum white matter, hippocampus, thalamus, total gray matter, and total brain volumes at both baseline and follow-up, and with putamen volume at study baseline but not at follow-up. Higher EDSS significantly correlated with smaller cerebellum white matter, hippocampus, thalamus, putamen, total gray matter, and total brain volumes at both time points. White matter lesion volume positively correlated with EDSS (p = 0.045 at baseline, p = 0.015 at follow-up) and negatively with SDMT (p < 0.001 at both time points). The two-year data were compared cross-sectionally because, as a segmentation-based tool, cNeuro is not ideal for measuring changes between two time points. However, we wanted to show that the group differences were reproducible at two different time points.
Frontiers in Neurology | www.frontiersin.org
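When many correlations are tested at once, as in the volume-by-outcome analyses above, the Benjamini-Hochberg procedure controls the false discovery rate. The sketch below is a generic implementation for illustration, not the code used in the study; the function name and the example p-values are hypothetical and are not taken from Table 3.

```python
def benjamini_hochberg(p_values: list[float], alpha: float = 0.05):
    """Benjamini-Hochberg adjusted p-values and rejection flags, in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    rejected = [q <= alpha for q in adjusted]
    return adjusted, rejected


# Hypothetical raw p-values from five correlation tests.
raw = [0.001, 0.008, 0.039, 0.041, 0.20]
adjusted, rejected = benjamini_hochberg(raw)
print([round(q, 5) for q in adjusted])  # [0.005, 0.02, 0.05125, 0.05125, 0.2]
print(rejected)                         # [True, True, False, False, False]
```

In this made-up example, two raw p-values just below 0.05 are no longer significant after correction, which is the situation the procedure is designed to guard against.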

Grouping of the Patients Based on Thalamic and BP Atrophy
After the comparisons between the RMS and SPMS patient groups, the patients were divided into new subgroups based on thalamic and BP atrophy using a z-score cut-off value of −1.96. Patients with no thalamic or BP atrophy formed Group 1 and patients with isolated thalamic atrophy formed Group 2. NEDA status could not be determined for two patients in Group 2 because of missing two-year MRI, and EDSS change could not be determined for one patient due to missing two-year EDSS data. The patients with both BP and thalamic atrophy formed Group 3 (n = 22). The grouping of the patients according to their thalamic and BP atrophy is illustrated in Figure 1. The one SPMS patient who had BP atrophy without thalamic atrophy (Group 4, n = 1) was very close to the cut-off value of the patients in Group 3 (thalamus z-score −1.88, whole BP −3.34), and one patient fell in between Groups 1 and 2 and could not be categorized in either group.

Subgroup Comparisons Between the Groups Formed Based on Atrophy Measures
After grouping of the patients according to atrophy measures, between-group differences in EDSS and SDMT at baseline and at 2 years and the achievement of NEDA-3 and NEDA-4 during the follow-up were tested. There was no significant difference in the EDSS, relapses or SDMT scores between Groups 1 and 2 at baseline. In Group 3, most of the patients (20/22) were SPMS patients and had higher EDSS, lower SDMT and fewer relapses compared to patients in Groups 1 and 2 and were therefore not included in the between-group outcome analyses (data not shown). At the 2-year follow-up, a significant difference was detected between Groups 1 and 2 in EDSS change. EDSS was same

DISCUSSION
Thalamus and other deep GM nuclei are among the first GM structures to be affected in MS. Previous studies have suggested that subcortical atrophy might precede whole brain atrophy in MS (6,14,28). In this study, only 1/60 MS patients had whole BP atrophy without thalamic atrophy, but thalamic atrophy without whole BP atrophy could be shown in 16/60 patients, indicating that thalamic atrophy is meaningfully clinically measurable before whole brain atrophy. Further, a total of 20 out of the 22 patients who had both whole BP and thalamic atrophy were SPMS patients.
Thalamic atrophy has been shown to correlate with accumulation of disability in patients with MS (8). The unique role of the thalamus in predicting future disability, in comparison to other subcortical structures, has been shown in previous studies (14,29). Eshaghi et al. (29) found that smaller deep gray matter volume at baseline was associated with increased risk of shorter time to EDSS progression and that thalamus was a better predictor of future disability than other deep gray matter regions. In our study, thalamic atrophy significantly correlated both with worse performance in the SDMT and with higher EDSS score. Patients with isolated thalamic atrophy were at a higher risk for not reaching 2-year NEDA-3 and for EDSS increase than patients with no identified brain atrophy. However, 4 out of the 11 patients who had no thalamic or whole brain atrophy at baseline and reached NEDA-3 at 2 years still developed brain atrophy exceeding the 0.4% annual atrophy rate during the 2-year follow-up (>0.8% brain atrophy at 2 years). This result is in line with the study by Uher et al. (30), suggesting that reaching NEDA-4, at least on platform MS therapies, is a hard goal to achieve.
Based on relapse activity, EDSS and SDMT, the patients with isolated thalamic atrophy and those with no detectable brain atrophy were not clinically distinguishable (p > 0.05). Most of the patients falling within the group of isolated thalamic atrophy were treatment naïve RMS patients and they were treated with either IFN or GA during the study. Our results suggest that isolated thalamic atrophy could serve as a subclinical prognostic factor associated with the NEDA-3 outcome measure on the injectable platform therapies. Measuring thalamic and whole brain atrophy could thus help to identify patients with a more severe disease and a need for the most effective therapies from early on.
The association of global and regional brain atrophy with disability and the differences between RMS and SPMS patients regarding atrophy measures have been reported in several previous studies (3,29,31,32). In line with previous findings, we detected significant differences in total and regional brain volumes between newly diagnosed RMS patients and SPMS patients. In the early RMS group, WM lesion volume was smaller, whereas total brain, total cerebral GM, thalamus, hippocampus, cerebellar WM and putamen volumes were greater than in the SPMS group. Differences in volumes were most significant for thalamus and cerebellar WM. SPMS patients have a longer history of the neurodegenerative disease process, explaining their progression to a greater extent of brain atrophy irrespective of their age. The age difference between the RMS and SPMS patients could not solely explain the observed differences in brain volumes because the brain volumes were normalized for age in the analyses.
Brain volume assessment involves multiple confounding variables, which make its clinical applicability challenging. These include the physiological variables of the patient, such as age, sex, head size, hydration state, menstrual cycle, smoking, alcohol consumption, and comorbidities (e.g., hypertension and diabetes). MRI-related variables include differences between scanners, acquisition protocol, quantification method, position in the scanner, and head motion. MS-related variables include brain lesions, pseudoatrophy, disease duration, inflammation, and drug treatment (33,34). In this study, we aimed to take into account the confounding variables. We gathered information on comorbidities, education and lifestyle factors. The regional brain volumes were normalized for age, sex and head size. There were no significant differences between the two groups of patients in their comorbidities, alcohol consumption, education status, smoking habits, serum vitamin D levels, or BMI. All of the brain MRIs were acquired with the same Siemens MRI scanner and protocol. Unfortunately, we did not have normative data available from exactly the same scanner model that was used for the MS patients. However, we analyzed our data using normative data from more than 400 cases from 8 different Siemens scanner models from the OASIS database (27) showing relatively consistent results: the 95% confidence interval of the average thalamus volume for each scanner contains the average thalamus volume computed for all eight scanners. Because of this, we think that using normative data from Siemens 3T scanners is a feasible compromise. If enough normative data were available, it would be ideal to compute the z-scores using data from exactly the same scanner model. In clinical practice, it may also be challenging to always scan the patient in the same scanner.
In line with Raji et al. (14), we found that it is possible to detect isolated thalamic atrophy by MRI-based brain volumetry at a single time point. A single time point predictive MRI brain volume measure could help implement brain volume measurement in clinical practice, since measuring brain volume changes between time points is technically challenging outside clinical trial settings (2). Segmentation-based methods for brain volume assessment, such as SIENAX, are not designed for the assessment of volume change between time points.
The challenges discussed above are among the reasons why brain atrophy is not routinely measured in MS clinical practice. However, being relatively easy to carry out, measuring thalamic atrophy at a single time point could be applicable in clinical practice.

CONCLUSION
Our study supports the usefulness of a single measurement of thalamic and whole brain atrophy in the identification of patients at risk of disability progression and of not reaching NEDA-3 with platform injectable therapies. The results of our study have to be considered preliminary, since the number of patients was rather small and the follow-up time of 2 years is short for detection of confirmed disability progression. Further studies with larger cohorts of patients and longer follow-up, possibly complemented by other prognostic markers such as neurofilament, are needed.

ETHICS STATEMENT
The study was performed in accordance with the ethical standards laid down in the 1964 Declaration of Helsinki and its later amendments. All persons included in the study gave their informed consent prior to their inclusion in the study.

AUTHOR CONTRIBUTIONS
KH, MS-H, and JL contributed conception and design of the study. KH, MS-H, and MV organized the database. MV performed the statistical analysis. KH wrote the first draft of the manuscript. MV wrote the statistics section of the manuscript and JOK and TP the MRI acquisition section of the manuscript. All authors contributed to manuscript revision, read and approved the submitted version.

FUNDING
The study was supported by an investigator-initiated trial grant from Biogen Idec Finland to Turku University Hospital, Division of Clinical Neuroscience. KH received personal research grants from Maire Taponen foundation, from the Instrumentarium Science Foundation and from the Finnish MS foundation and a travel grant from the University of Turku Doctoral Programme in Clinical Research.