Evaluating normalized registration and preprocessing methodologies for the analysis of brain MRI in pediatric patients with shunt-treated hydrocephalus

Introduction: Registration to a standardized template (i.e., “normalization”) is a critical step when performing neuroimaging studies. We present a comparative study evaluating general-purpose registration algorithms for pediatric patients with shunt-treated hydrocephalus. Our sample dataset presents a number of intersecting challenges for registration, including potentially large deformations to both brain structures and overall brain shape, artifacts from shunts, and morphological differences corresponding to age. The current study assesses the normalization accuracy of shunt-treated hydrocephalus patients using freely available neuroimaging registration tools.
Methods: Anatomical neuroimages from eight pediatric patients with shunt-treated hydrocephalus were normalized. Four non-linear registration algorithms were assessed in addition to the preprocessing steps of skull-stripping and bias correction. Registration accuracy was assessed using the Dice Coefficient (DC) and Hausdorff Distance (HD) in subcortical and cortical regions.
Results: A total of 592 registrations were performed. On average, normalizations performed using the brain-extracted and bias-corrected images had a higher DC and lower HD compared to full-head, non-bias-corrected images. The most accurate registration was achieved using SyN by ANTs with skull-stripped and bias-corrected images. Without preprocessing, the DARTEL Toolbox was able to produce normalized images with comparable accuracy. The use of a pediatric template as an intermediate registration did not improve normalization.
Discussion: Using structural neuroimages from patients with shunt-treated pediatric hydrocephalus, it was demonstrated that there are tools which perform well after specified preprocessing steps are taken. Overall, these results provide insight into the performance of registration programs that can be used for normalization of brains with complex pathologies.


Introduction
Pediatric hydrocephalus is a disease characterized by a complex set of neurological indications, in particular a high volume of cerebrospinal fluid in the cerebral ventricles. While there is interest in studying pediatric hydrocephalus using neuroimaging techniques to learn more about the disease, working with these images may prove difficult given the potentially large pathology-induced deformations and artifacts from surgical treatment (e.g., shunts) (Ou et al., 2014; Patel et al., 2017). When performing neuroimaging studies, a common goal is to be able to compare findings between participants. In order to accomplish this, the neuroimages must be registered to a standard stereotaxic space (i.e., spatial normalization), such as the Montreal Neurological Institute (MNI) space, using a template image, such that there is a one-to-one correspondence between images (Mazziotta et al., 1995). Poor normalizations, wherein there is suboptimal alignment of brain regions relative to the template image, can have a variety of impacts on the results of neuroimaging studies. For example, in functional magnetic resonance imaging studies, poor normalizations can result in decreased sensitivity and false negatives wherein observed effects could be driven by structural rather than functional differences (Crinion et al., 2007). It is therefore not surprising that image registration is a non-trivial task, and there has been ongoing interest in assessing the accuracy of various programs used for registration (Crinion et al., 2007; Klein et al., 2009; Ou et al., 2014).
Image registration can be divided into two categories according to the transformations permitted: linear and non-linear. Linear registration in 3D can perform translations, rotations, scales, and skews in three directions (x, y, and z). In contrast, non-linear registration allows for local deformations. Normalization can take advantage of a combination of both methods, wherein an initial linear registration is followed by a non-linear registration. A number of freely available neuroimaging and medical imaging programs include functions for performing both of these registrations (e.g., FMRIB Software Library [FSL; see text footnote 1] and Statistical Parametric Mapping [SPM; see text footnote 2]) (Smith et al., 2004; Ashburner, 2007).
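To make the linear case concrete, the following is a minimal sketch of a 3D affine transform built from the four operations listed above, which together give 12 degrees of freedom when applied along all three axes. It is illustrative only: the helper names and parameter values are hypothetical and are not taken from any of the registration programs assessed here, and only one rotation axis and one shear term are shown for brevity.

```python
import math

def matmul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def translation(tx, ty, tz):
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]

def scaling(sx, sy, sz):
    return [[sx, 0, 0, 0], [0, sy, 0, 0], [0, 0, sz, 0], [0, 0, 0, 1]]

def rotation_z(theta):
    """Rotation about the z axis by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def shear_xy(k):
    """Skew x in proportion to y."""
    return [[1, k, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def apply_affine(m, point):
    """Apply a 4x4 affine matrix to an (x, y, z) point."""
    v = (point[0], point[1], point[2], 1.0)
    return tuple(sum(m[i][j] * v[j] for j in range(4)) for i in range(3))

# Composition order (rightmost first): scale, shear, rotate, translate.
affine = matmul(translation(10, 0, -5),
                matmul(rotation_z(math.pi / 2),
                       matmul(shear_xy(0.5), scaling(2, 2, 2))))

moved = apply_affine(affine, (1.0, 0.0, 0.0))
print(moved)
```

A non-linear registration cannot be written as a single matrix of this kind; it instead estimates a displacement for every voxel, which is what allows local deformations such as enlarged ventricles to be modeled.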
Difficulty in performing registrations can occur for a variety of reasons. Ou et al. (2014) have grouped the potential difficulties into four overarching challenges: inter-participant anatomical variation, intensity and noise differences, protocol and field-of-view differences, and pathology-induced missing correspondence. Often, several of these challenges are present in one dataset. For example, clinical pediatric populations can present pathology-induced missing correspondence in addition to age-based anatomical variation (Courchesne et al., 2000).
There exist various methods to improve normalization accuracy with pathological brains. Tang et al. (2017) have grouped these methods into three overarching categories: masking, pathology simulation, and inpainting. Specifically, cost function masking, wherein a region of non-correspondence in the image is masked, has been shown to result in more accurate registrations (Brett et al., 2001). The generation of masks, however, can be incredibly time consuming, particularly when there are many participants and the regions of interest cannot be accurately segmented automatically, thus requiring manual segmentation. Further, even when segmentation can be completed automatically, many segmentation methods are computationally intensive. Indeed, segmenting enlarged ventricles such as those found in hydrocephalus can pose a challenge for many programs that perform automated segmentations. As such, there has been increasing exploration of solutions that improve segmentation accuracy while decreasing processing time (Shao et al., 2019; Quon et al., 2020). As a result, there is interest in general-purpose normalization pipelines that can be used for these complex images and can produce accurate results without extensive manual work or computationally expensive processes. Furthermore, given the heterogeneity of the data due to the large variation in ventricle size, having a single pipeline that can apply to all patients would be beneficial. This is particularly pertinent as the data associated with medical images are becoming increasingly large (Scholl et al., 2011).

1 https://fsl.fmrib.ox.ac.uk/fsl
2 https://www.fil.ion.ucl.ac.uk/spm
To date, there have been no studies assessing the efficacy of various normalization pipelines for pediatric hydrocephalus. The normalization of neuroimages in those with shunt-treated pediatric hydrocephalus provides a unique series to study, as these images present a variety of challenges including non-correspondence and artifacts from shunt treatment, potentially large pathology-induced deformities in the ventricles and surrounding tissues, and age-based anatomical variation. Indeed, once treated with a ventriculoperitoneal shunt, the ventricles can range from being smaller than normal to staying extremely large, depending on when the shunt is inserted in the life of the child and what type of valve is used. The objective of the current study is to assess the accuracy of a variety of freely available registration programs after preprocessing steps in pediatric hydrocephalus, a population with wide variation in brain imaging, and to explore the impact of ventriculomegaly.

Participants
Clinically stable children with hydrocephalus treated by ventriculoperitoneal shunts were recruited from a pediatric neurosurgical outpatient clinic in London, Ontario, Canada. Written informed consent and assent were obtained from all parents and children, respectively. Approval was obtained from our institutional research ethics board. Patients were included if they had hydrocephalus within the first two years of life or intraventricular hemorrhage at birth. Patients were not eligible for the study if they had a programmable shunt or any other contraindications for MRI. Figure 1 shows characteristics of neuroimages of those with pediatric hydrocephalus that could impact normalization.

Magnetic resonance imaging acquisition
Neuroimages were acquired from a Siemens MAGNETOM Prisma 3-Tesla MRI scanner with a 32-channel head coil. A whole

Image preprocessing
Given the potential impact of image preprocessing on registration accuracy, various preprocessing steps were performed. Registrations were performed with and without skull-stripping, and with and without bias correction (see Figure 2 for the preprocessing pipeline). Registration using the DARTEL Toolbox was performed only with whole-brain data, as segmentation of the tissue types is required to run DARTEL and SPM's segmentation tool removes the non-brain tissues.
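The preprocessing design described above can be sketched as a small grid of input variants per program. This is a hedged illustration only: the labels and the function name are hypothetical, and the assumption that DARTEL was run on both the bias-corrected and uncorrected whole-brain images is ours rather than something stated in this section.

```python
from itertools import product

# Labels are illustrative; they are not taken from the study's scripts.
BRAIN_OPTIONS = ("whole_brain", "skull_stripped")
BIAS_OPTIONS = ("uncorrected", "bias_corrected")

def preprocessing_variants(program):
    """Return the (brain, bias) input variants to register for a program."""
    variants = list(product(BRAIN_OPTIONS, BIAS_OPTIONS))
    if program == "DARTEL":
        # Assumption: DARTEL only receives whole-brain inputs, because SPM's
        # segmentation step removes non-brain tissue itself.
        variants = [v for v in variants if v[0] == "whole_brain"]
    return variants

for program in ("SyN", "DARTEL"):
    print(program, preprocessing_variants(program))
```

Enumerating the variants explicitly makes it easier to confirm that every image is processed under every intended condition before any registration is started.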

Skull stripping
Removal of non-brain tissues was completed using the Brain Extraction Tool (BET) from FMRIB Software Library (FSL) version 6.0 (see text footnote 1) (Smith, 2002). In order to achieve an accurate brain extraction given the large deformities present in the dataset, various BET parameters were tuned, and manual removal of non-brain structures was performed following BET on a per-subject basis.

Bias correction
Bias correction to correct for intensity inhomogeneities was performed using N4 bias field correction from Advanced Normalization Tools (ANTs; see text footnote 3) (Tustison et al., 2010).

Image registration
Images were registered directly to the 1 × 1 × 1 mm³ MNI-152 nonlinear 6th generation template. Additionally, images were registered to the age-specific NIHPD symmetric pre- to mid-puberty (7.5 to 13.5 years) 1 × 1 × 1 mm³ pediatric template, followed by registration to the aforementioned MNI-152 template. This additional registration was performed as it has been suggested that registering to an age-specific template could produce more accurate registrations (Wilke et al., 2002; Fonov et al., 2011). Image registration using the DARTEL Toolbox differed from the aforementioned process: DARTEL first creates a groupwise template image to which each participant's neuroimage is registered, and these images can then be normalized to MNI space.
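The two-stage route described above (participant to pediatric template, then pediatric template to MNI-152) amounts to composing two transforms. The sketch below shows this for the linear case only, using hypothetical matrices with made-up values; for non-linear registrations the same idea applies, but deformation fields are composed rather than matrices multiplied.

```python
def matmul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def apply_affine(m, point):
    """Apply a 4x4 affine matrix to an (x, y, z) point."""
    v = (point[0], point[1], point[2], 1.0)
    return tuple(sum(m[i][j] * v[j] for j in range(4)) for i in range(3))

# Hypothetical transforms: uniform scaling plus a small translation.
subject_to_pediatric = [[1.10, 0, 0, 2.0],
                        [0, 1.10, 0, -3.0],
                        [0, 0, 1.10, 1.0],
                        [0, 0, 0, 1]]
pediatric_to_mni = [[1.05, 0, 0, -1.0],
                    [0, 1.05, 0, 0.5],
                    [0, 0, 1.05, 0.0],
                    [0, 0, 0, 1]]

point = (10.0, 20.0, 30.0)

# Route 1: apply the two registrations one after the other.
two_step = apply_affine(pediatric_to_mni,
                        apply_affine(subject_to_pediatric, point))

# Route 2: compose first (second transform on the left), then apply once.
subject_to_mni = matmul(pediatric_to_mni, subject_to_pediatric)
one_step = apply_affine(subject_to_mni, point)

print(two_step, one_step)
```

Composing the transforms before resampling is a common design choice because the image (or atlas) is then interpolated only once.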

Registration program details
A variety of freely available programs commonly used for neuroimaging analysis were chosen. The selected programs employ a variety of registration algorithms and implementations. All registrations were implemented using default parameters (see Appendix 1), except for FLIRT, for which two variants were run: one with the default parameters and a second with a reduced angular range for initial optimization (FLIRT 2 and FNIRT 2 denote the linear and non-linear registrations completed with a reduced angular range during the linear registration step). See Table 1 for the programs used. Characteristics of the algorithms, including deformation model, similarity measure, and regularization, have been summarized by Ou et al. (2014) and Klein et al. (2009).

Region selection and verification
A series of cortical and subcortical regions were selected to represent areas proximal and distal to the area of deformation, as it has been previously demonstrated that registration accuracy can be impacted by proximity to the region of deformation (Ou et al., 2014). Areas included in the custom atlas are the corpus callosum, internal capsule, superior temporal gyrus, hippocampus, superior occipital gyrus, and paracentral lobule (see Figure 3 for the atlas). All areas were manually segmented from each patient's neuroimage as well as the template image and verified by an expert (SdeR). The custom study atlas in each participant's native space was then warped using the warps generated by each registration for analysis.

Computational time
All registrations were performed on a computer with the Linux CentOS version 8 operating system, 64 GB of RAM, GeForce 970GTX GPU, and an AMD Ryzen 5 3600 6-Core Processor (3.6 GHz/4.2 GHz boost). Registration time was reported in minutes, rounded up to the nearest minute. Given that computation time can be influenced by the size of the deformation to be estimated, computation time for both the most and least deformed brains has been reported. Multicore processing was used whenever supported by the software tool, and the number of cores used was reported.
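As a minimal sketch of the reporting convention above (wall-clock time rounded up to the nearest whole minute), the following helper could be wrapped around any registration call. The function names are hypothetical and the timed call here is a stand-in, not one of the assessed programs.

```python
import math
import time

def minutes_rounded_up(seconds):
    """Convert elapsed seconds to whole minutes, rounding up."""
    return math.ceil(seconds / 60)

def timed(func, *args, **kwargs):
    """Run func and return (result, elapsed minutes rounded up)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, minutes_rounded_up(elapsed)

# Stand-in workload; a real pipeline would call the registration tool here.
result, minutes = timed(sum, [1, 2, 3])
print(result, minutes)
```

Rounding up means that any run lasting longer than a whole number of minutes is reported as the next minute, so short runs are never reported as zero.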

Similarity metrics
In order to evaluate the accuracy of the registration, two commonly reported similarity metrics were used (Taha and Hanbury, 2015). The warped participant atlas was compared to the same areas segmented from the MNI-152 template image. The Dice coefficients (DICE) were computed for each registration to assess similarity in overlap of the selected 3-dimensional regions (Dice, 1945). Using two sets, A and B, the DICE is defined as:

\[ \mathrm{DICE}(A, B) = \frac{2\,|A \cap B|}{|A| + |B|} \]

Additionally, the Hausdorff Distance (HD), which is a measure of spatial distance, was assessed (Hausdorff, 1914). Using two sets, A and B, HD is defined as:

\[ \mathrm{HD}(A, B) = \max\left\{ \sup_{a \in A} \inf_{b \in B} d(a, b),\ \sup_{b \in B} \inf_{a \in A} d(a, b) \right\} \]

where d(a, b) is the distance between points a and b.

Results

Participants
Eight patients with hydrocephalus treated with a VP shunt were included in the current study (1 female, mean age = 8.79 years, sd = 1.81). Their voxel-based ventricle volume ranged from 7,250 mm³ to 336,735 mm³. The etiology of the hydrocephalus varied between patients and included intraventricular hemorrhage, Dandy-Walker malformation, meningitis, and spina bifida. Complete atlas generation was possible in seven of the eight participants. In the participant with the largest ventricle size, severe deformities made it impossible to distinguish three cortical regions (i.e., the left superior occipital gyrus and both the left and right paracentral lobules).

Normalization
A total of 592 registrations were performed. Excluding the registration of whole-brain bias corrected data, registrations directly
A similar pattern was observed in those registered first to an age-appropriate template, where the largest DICE (median DICE = 0.5637, IQR = 0.1900) was observed in images which underwent skull-stripping and bias correction. Table 2 outlines the median DICE and HD for all regions in the study atlas across all programs and the various preprocessing steps. In Figures 5 and 6, the median score for each participant is shown by ventricle size. Qualitative results for three participants (small, medium, and large ventricle size) are depicted for skull-stripped, bias-corrected images in Figure 7.
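For readers who wish to reproduce summaries of this kind, the following is a minimal sketch of how DICE, HD, and a median/IQR summary could be computed from voxel coordinates. It is not the code used in this study: the function names are hypothetical, regions are represented as sets of integer voxel coordinates, distances are Euclidean in voxel units (equal to millimetres only for 1 mm isotropic voxels), and the quartile convention is the default of Python's `statistics.quantiles`, which may differ from other software.

```python
import math
import statistics

def dice(a, b):
    """Dice coefficient between two sets of voxel coordinates."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention chosen here for two empty regions
    return 2 * len(a & b) / (len(a) + len(b))

def directed_hausdorff(a, b):
    """Largest distance from a point in a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance (brute force, O(|a| * |b|))."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

def median_iqr(values):
    """Median and interquartile range of a list of scores."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return statistics.median(values), q3 - q1

# Toy regions: two 3-voxel lines offset by one voxel along x.
warped = {(0, 0, 0), (1, 0, 0), (2, 0, 0)}
template = {(1, 0, 0), (2, 0, 0), (3, 0, 0)}

print(dice(warped, template))
print(hausdorff(warped, template))
print(median_iqr([0.5, 0.6, 0.7, 0.8]))
```

The brute-force Hausdorff computation is adequate for small regions such as those in a custom atlas; for large label volumes a distance-transform-based implementation would be the more practical choice.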
Whether assessed with DICE or HD, the interquartile range is often smaller for bias-corrected images that underwent skull-stripping compared to whole-brain images for the majority of programs assessed. Additionally, regardless of program and preprocessing performed, patients with the largest ventricle size predominantly have poorer registration accuracy compared to those with a smaller ventricle size as measured using DICE. In contrast, when accuracy is measured using HD there is less distinction in accuracy based on ventricle size, though participants with the two largest ventricle sizes (i.e., ventricle size > 100,000 mm³) often have scores worse than the median.
The best DICE was seen with the SyN algorithm by ANTs with the preprocessing steps of skull-stripping and bias correction, with or without initial registration to a pediatric atlas (without intermediate registration: median DICE = 0.6504, IQR = 0.1009; median HD = 10.3920, IQR = 4.9754; with initial registration to a pediatric atlas: median DICE = 0.6590, IQR = 0.1449; median HD = 9.4340, IQR = 5.7898). Figures 8A,B show the individual performance for each participant and each region of interest using the SyN algorithm with bias correction and skull-stripping; results are depicted qualitatively in Figure 9. As ventricle size increases, subcortical regions, which are closer to the ventricles, on average have lower DICE compared to cortical regions. When assessing accuracy using HD, participants with the smallest ventricle sizes (i.e., < 8,000 mm³) predominantly have an HD for subcortical structures below the median and an HD for cortical structures above the median.
The best performance with the fewest preprocessing steps (i.e., whole brain, no bias correction) was achieved by the DARTEL toolbox by SPM (median DICE = 0.5541, sd = 0.1604; median HD = 11.5330, sd = 5.2630). Similar to SyN, subcortical regions generally have a lower DICE (Figures 8C,D) as participant ventricle size increases, compared to cortical regions. Results are depicted qualitatively in Figure 10.

Computational time
Computational times for the participants with the smallest and largest ventricle sizes are shown in Table 3. Despite the large difference in volumes, the time to complete the normalization is similar for both participants, except for FNIRT, wherein the participant with the smaller ventricle size has a much quicker registration to the MNI-152 atlas relative to the participant with the larger ventricle size. The fastest non-linear registration is with ANTs (approximately 18 min). DRAMMS and FNIRT have comparable times (approximately 20 to 40 min) and both use a single core. Performing two registrations in series, from the patient T1 to the NIHPD atlas and then from the NIHPD atlas to the MNI-152 atlas, almost doubles all of the times, which is to be expected as this process involves twice the number of registrations. The majority of the algorithms only offer single-core computation. As the DARTEL Toolbox creates a group-wise template, its computational time is dependent on the number of participants. Given the performance of the DARTEL Toolbox with whole-brain non-bias-corrected data, this computational time was included.

Discussion
Normalization of neuroimages of pediatric patients with shunt-treated hydrocephalus was assessed using a variety of freely available software tools commonly used for neuroimaging studies. Fifty ways of normalizing neuroimages were examined, wherein variations included programs, parameters, and preprocessing steps, for a total of 592 registrations performed.
Our study revealed that SyN had the most accurate registration as measured by DICE and HD, with or without registration to a pediatric atlas. Previous studies assessing the accuracy of registration in healthy brains and/or brains with pathologies have also highlighted SyN as a registration algorithm that performs with high accuracy (Klein et al., 2009; Ripolles et al., 2012; Ou et al., 2014). While only a few studies have focused on registration in pediatric populations, there has been interest in registration in pathological adult populations. For example, in the study conducted by Ou et al. (2014), databases including patients with Alzheimer's Disease and brain tumors were assessed. Notably, parallels can be drawn between the aforementioned datasets and the current study's dataset. Specifically, Alzheimer's Disease patients can present with larger ventricles, and brain tumors are a non-correspondence when compared to healthy brain images. Similar to the current study, SyN by ANTs had high accuracy in these pathological populations. Additionally, DRAMMS was one of their best performing algorithms. In contrast, in the current study, DRAMMS was outperformed by other algorithms including SyN. Similarly, in a study which assessed the normalization of deep brain structures in adults who underwent neurosurgery, SyN outperformed all other assessed algorithms (Vogel et al., 2020). Furthermore, in a study assessing surface and volume registration in healthy pediatric brains, ANTs also outperformed other volume registration techniques (Ghosh et al., 2010). The SyN algorithm has been identified as robust when faced with different non-pediatric datasets, and this may be due to its large number of degrees of freedom (Klein et al., 2009; Ou et al., 2014). Further, Ou et al. (2014) suggested that the decrease in registration accuracy observed particularly in the dataset with Alzheimer's Disease patients could be due to variable degrees of neurodegeneration (e.g., various ventricle sizes, brain structure sizes, and atrophy). Indeed, the potential limitations caused by that dataset are similar to the problem set in the current study.
In the current study, we found superior registration accuracy using bias-corrected, skull-stripped data; however, normalization with non-preprocessed data (i.e., whole-brain, non-bias-corrected) can also result in normalized images with good accuracy. Typically, performing bias correction can help to improve normalization accuracy, and it is not computationally intensive relative to the time required to perform registrations. In addition to having a small benefit for normalization, bias correction has been shown to improve brain extraction (Fennema-Notestine et al., 2006). The current dataset, composed of pathological pediatric neuroimages, revealed that removal of the skull is highly beneficial for normalization. However, it is worth noting that skull-stripping has been identified as a non-trivial task which can be very time consuming (Popescu et al., 2012). Poor brain extractions can result in removing areas of the brain or including non-brain matter. Furthermore, the presence of the neck in the volumes has been shown to negatively impact brain extraction (Popescu et al., 2012). These errors in brain extraction could result in poor registration wherein, for example, non-brain matter could be interpreted as brain matter.
Towards minimizing preprocessing steps, whole-brain normalization can also be performed in patients with complex neuropathology, such as that seen in hydrocephalus. It was demonstrated that the DARTEL toolbox outperforms many of the other algorithms under these circumstances. The DARTEL toolbox makes use of groupwise registration and is the only tool assessed in this study which uses this process (Ashburner, 2007). In this case, a group-specific template is created based on the whole input dataset, and each participant's neuroimage is then registered to the group template. Group-wise registrations are beneficial as there is no a priori template selection required; however, performing group-wise registration between different groups (e.g., healthy controls compared to patients with morphological differences) is non-trivial and an area of interest (Liao et al., 2012; Ribbens et al., 2013). Given the differences between pediatric and adult neuroimages, such as the size, shape, and tissue type, it has been previously suggested that using an age-appropriate brain template can help to improve registrations by reducing the age-based variability between images (Courchesne et al., 2000; Fonov et al., 2011). We have demonstrated that, with our current dataset, registering to an age-appropriate template for the most part did not improve registration. Accuracy was similar whether or not an age-specific template was used, though accuracy was slightly reduced overall with the age-specific template. Given that the purpose of an age-specific template is to better represent a pediatric brain, structural changes to the brain due to hydrocephalus may make these registrations more difficult (Del Bigio, 2010). In addition, there was almost a two-fold increase in processing time with age-based registration compared to a single registration between the participant's image and the target MNI-152 image, which may not be ideal in some circumstances. Therefore, while we would still advise registering healthy participants with age-specific templates, this step can be skipped when registering children with large anatomical deformation due, for example, to hydrocephalus.
Regardless of the overall accuracy of the registration (measured using DICE), participants with larger ventricle sizes often had poorer normalization accuracy compared to those with smaller ventricle sizes. Further, the areas that were most impacted as measured by DICE were those near the ventricles (i.e., subcortical regions), such that these areas have low overlap with ground truth. This observation may be due to the sensitivity of the DICE when comparing regions of different sizes, wherein the subcortical structures, particularly the ones chosen, are much smaller in volume compared to the cortical structures. The inclusion of subcortical brain structures in neuroimaging studies can be challenging given their small size, and many are already excluded from standard atlases (Forstmann et al., 2016). To address this, some studies have used a modified DICE, specifically dilated DICE, when assessing subcortical structures (Bazin et al., 2020).
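The size sensitivity of DICE noted above can be illustrated with a toy example, assuming cubic regions and a misalignment of exactly one voxel along one axis. The shapes and sizes are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
def cube(side, offset=0):
    """Voxel coordinates of a cube, optionally shifted along x."""
    return {(x + offset, y, z)
            for x in range(side) for y in range(side) for z in range(side)}

def dice(a, b):
    """Dice coefficient between two sets of voxel coordinates."""
    return 2 * len(a & b) / (len(a) + len(b))

# Same one-voxel registration error applied to a small and a large region.
small = dice(cube(4), cube(4, offset=1))    # 2 * 48 / 128
large = dice(cube(20), cube(20, offset=1))  # 2 * 7600 / 16000

print(small)  # 0.75
print(large)  # 0.95
```

An identical one-voxel error therefore costs the small region far more overlap than the large one, which is consistent with small subcortical structures scoring lower than large cortical structures even when the spatial error is comparable.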
Notably, many of the assessed programs do not make use of multicore computing for a single subject. Only SyN allowed a streamlined method of utilizing multiple cores by modifying the function call (see text footnote 4) (Avants et al., 2009). Improving the computational efficiency of registrations is an area of interest. Given their time-consuming nature, registrations are often performed outside of busy clinical practice, though they have utility in clinical/surgical practice (Alam et al., 2018). Ultimately, there is ongoing interest in utilizing the power of modern GPUs, which are built for parallel processing, to improve the computational efficiency of image registration (Shams et al., 2010).
Normalizing images of pediatric patients with shunt-treated hydrocephalus provides a unique opportunity to assess the accuracy of various non-linear registration programs given many different challenges. While the patients in this study had a wide range of ventricle sizes, our study was limited by sample size. Having a larger sample size would potentially allow us to better understand the impact of large deformities on normalization outcomes. Furthermore, as we used a custom atlas for assessing registration accuracy, many regions were excluded. Given a more comprehensive atlas, we could have further assessed the impact of the shunt location on registration accuracy in nearby areas, as registration performance can vary based on proximity to a pathological site (Ou et al., 2014).
Given that the largest DICE was marginally over 0.60, more robust registration algorithms are needed to better account for complex pathologies. Notwithstanding, it is worth noting that all the programs we have used have parameters that can be tuned, and that there are various techniques such as masking that could have been used to better accommodate our data; however, these processes were outside the scope of the current study. Finally, though our best overlap value was not very large, this is consistent with results from other studies that have assessed potentially challenging registrations (Ghosh et al., 2010; Ou et al., 2014; Vogel et al., 2020).
Given the performance of these programs in complex datasets, it is therefore important to complete visual checks following registration and to consider manual segmentation of areas of interest when possible.
In sum, we assessed four different non-linear registration algorithms to normalize neuroimages from pediatric patients with shunt-treated hydrocephalus. Ultimately, preprocessing the neuroimages to remove non-brain tissue (i.e., skull-stripping) and bias correcting resulted, on average, in the most accurate normalized images using the SyN algorithm. Notably, various other studies have demonstrated good registration accuracy with SyN in non-pediatric populations, which suggests that SyN is a robust normalization algorithm under a variety of circumstances (Klein et al., 2009; Ghosh et al., 2010; Ou et al., 2014). We also demonstrated that the DARTEL Toolbox, which performs a group-wise registration, can produce a similarly accurate registration without any preprocessing steps. Finally, while registering to an age-appropriate atlas has been shown to produce a superior registration outcome, overall it did not have a positive impact on the registration accuracy in the current study. These results may help to inform normalization pipeline and algorithm selection for studies with pediatric patients and complex neuronal pathologies.

FIGURE 1 Characteristic asymmetries seen in pediatric hydrocephalus. (A-C) The ventricles are segmented in purple and highlight the potentially atypical brain shape. The circle in (A) outlines an artifact that can occur as a result of the shunt. (D) The catheter is segmented in orange.
Figure 5 depicts box plots for the DICE score per program and Figure 6 depicts box plots for the HD per program. Both figures use the preprocessing step of bias correction and include results with and without skull-stripping.

FIGURE 3 Custom atlas used for registration, which includes various cortical and subcortical structures. All selected regions are represented at least once unilaterally.

FIGURE 4 Normalization results for three participants with different ventricle sizes. The participants' neuroimages were skull-stripped and bias corrected prior to undergoing normalization.

FIGURE 5 Box plots of the DICE scores per program for normalization from participant T1 to the MNI-152 template image. All images were bias corrected. Results from both whole-brain and skull-stripped images are shown in dark gray and light gray, respectively. The median DICE per participant is plotted by ventricle size.

FIGURE 6 Box plots of the HD per program for normalization from participant T1 to the MNI-152 template image. All images were bias corrected. Results from both whole-brain and skull-stripped images are shown in dark gray and light gray, respectively. The median HD per participant is plotted by ventricle size.

FIGURE 7 Results from three participants with varying ventricle sizes. The warped atlases for each participant have been overlaid onto the MNI-152 template. All images were preprocessed with brain extraction and bias correction, and the participant's image was registered directly to the MNI-152 template.

FIGURE 8 (A,B) The DICE and HD, respectively, by participant by region for SyN with the preprocessing steps of skull-stripping and bias correction. (C,D) The DICE and HD, respectively, by participant by region for DARTEL with no preprocessing steps (whole-brain, no bias correction). In all graphs, cortical structures are red and subcortical structures are blue.

FIGURE 9 Warped atlases using the SyN algorithm by ANTs. All four preprocessing options with direct registration to the MNI-152 template are depicted. Three participants are shown to represent different ventricle sizes. Warps have been overlaid onto the MNI-152 template.

FIGURE 10 Results from the DARTEL algorithm with no preprocessing (whole brain, no bias correction) for three participants with varying ventricle sizes. Results have been overlaid onto the MNI-152 template image.

TABLE 1
Registration programs assessed.

TABLE 2
Summary statistics for normalization accuracy across 50 registration conditions.