Multisite Harmonization of Structural DTI Networks in Children: An A-CAP Study

The analysis of large, multisite neuroimaging datasets provides a promising means for robust characterization of brain networks that can reduce false positives and improve reproducibility. However, the use of different MRI scanners introduces variability to the data. Managing those sources of variability is increasingly important for the generation of accurate group-level inferences. ComBat is one of the most promising tools for multisite (multiscanner) harmonization of structural neuroimaging data, but no study has examined its application to graph theory metrics derived from the structural brain connectome. The present work evaluates the use of ComBat for multisite harmonization in the context of structural network analysis of diffusion-weighted scans from the Advancing Concussion Assessment in Pediatrics (A-CAP) study. Scans were acquired on six different scanners from 484 children aged 8.00–16.99 years [Mean = 12.37 ± 2.34 years; 289 (59.7%) Male] ~10 days following mild traumatic brain injury (n = 313) or orthopedic injury (n = 171). Whole brain deterministic diffusion tensor tractography was conducted and used to construct a 90 x 90 weighted (average fractional anisotropy) adjacency matrix for each scan. ComBat harmonization was applied separately at one of two different stages during data processing, either on the (i) weighted adjacency matrices (matrix harmonization) or (ii) global network metrics derived using unharmonized weighted adjacency matrices (parameter harmonization). Global network metrics based on unharmonized adjacency matrices and each harmonization approach were derived. Robust scanner effects were found for unharmonized metrics. Some scanner effects remained significant for matrix harmonized metrics, but effect sizes were less robust. Parameter harmonized metrics did not differ by scanner. Intraclass correlations (ICC) indicated good to excellent within-scanner consistency between metrics calculated before and after both harmonization approaches. Age correlated with unharmonized network metrics, but was more strongly correlated with network metrics based on both harmonization approaches. Parameter harmonization successfully controlled for scanner variability while preserving network topology and connectivity weights, indicating that harmonization of global network parameters based on unharmonized adjacency matrices may provide optimal results. The current work supports the use of ComBat for removing multiscanner effects on global network topology.

The analysis of large, multisite neuroimaging datasets provides a promising means for robust characterization of brain networks that can reduce false positives and improve reproducibility. However, the use of different MRI scanners introduces variability to the data. Managing those sources of variability is increasingly important for the generation of accurate group-level inferences. ComBat is one of the most promising tools for multisite (multiscanner) harmonization of structural neuroimaging data, but no study has examined its application to graph theory metrics derived from the structural brain connectome. The present work evaluates the use of ComBat for multisite harmonization in the context of structural network analysis of diffusion-weighted scans from the Advancing Concussion Assessment in Pediatrics (A-CAP) study. Scans were acquired on six different scanners from 484 children aged 8.00-16.99 years [Mean = 12.37 ± 2.34 years; 289 (59.7%) Male] ∼10 days following mild traumatic brain injury (n = 313) or orthopedic injury (n = 171). Whole brain deterministic diffusion tensor tractography was conducted and used to construct a 90 x 90 weighted (average fractional anisotropy) adjacency matrix for each scan. ComBat harmonization was applied separately at one of two different stages during data processing, either on the (i) weighted adjacency matrices (matrix harmonization) or (ii) global network metrics derived using unharmonized weighted adjacency matrices (parameter harmonization). Global network metrics based on unharmonized adjacency matrices and each harmonization approach were derived. Robust scanner effects were found for unharmonized metrics. Some scanner effects remained significant for matrix harmonized metrics, but effect sizes were less robust. Parameter harmonized metrics did not differ by scanner. Intraclass correlations (ICC) indicated good to excellent within-scanner consistency between metrics calculated before and after both harmonization approaches.

INTRODUCTION
Network neuroscience has become a popular approach to characterize brain structure in vivo in healthy and clinical populations (1)(2)(3). The structural connectome can be mapped using diffusion-weighted MRI (2,(4)(5)(6), a non-invasive technique that is sensitive to white matter microstructure (7).
Pediatric mild traumatic brain injury (mTBI) is a prevalent global public health concern (8-11) that is characterized by subtle and diffuse alterations in brain tissue [reviewed in (12,13)]. The neurobiology of pediatric mTBI remains poorly understood [see (12)]. White matter microstructural alterations can occur after pediatric mTBI, and multiple studies have examined specific white matter tracts using diffusion tensor imaging [DTI; (13)(14)(15)]. Emerging evidence indicates that pediatric mTBI can alter global and regional brain networks (16)(17)(18)(19). Thus, network neuroscience may be a potentially promising tool that could provide a robust characterization of network mechanisms involved in this important and highly prevalent neurological disorder.
Large, multisite neuroimaging studies of pediatric mTBI have become increasingly common to reduce false positive results from small samples, increase statistical power, and enhance reproducibility and generalizability of results (20,21). For instance, the Advancing Concussion Assessment in Pediatrics (A-CAP) study (22) is the largest neuroimaging study of pediatric mTBI to date, with recruitment occurring at five children's hospitals across Canada including longitudinal MRI assessment using 6 different scanners. The A-CAP study has the potential to increase scientific and clinical knowledge about neurobiological outcomes in pediatric mTBI. However, using multiple MRI scanners introduces non-biological data variability due to different scanner systems, models, and sequence protocols, among other factors (23)(24)(25)(26). Managing these non-biological sources of variability in multisite studies is increasingly important to generate accurate group-level inferences and enable detection of underlying biological phenomena (23).
ComBat is a widely used method for multisite (multiscanner) harmonization that originated from techniques used for genomics data (27). It is one of the most well-validated tools for multiscanner harmonization of structural neuroimaging data that makes no assumptions about the origin of scanner variation (26). ComBat implements a multivariate linear mixed effects regression with terms for biological variables and site to model the features of interest; the model parameters are estimated using an empirical Bayes approach. For diffusion tractography, ComBat has already demonstrated higher performance for multiscanner harmonization than other methods such as removal of artificial voxel effect by linear regression (RAVEL) and functional normalization of metrics (26). Unlike a general linear model approach that includes site or scanner as a fixed effect covariate, ComBat demonstrates better outlier robustness to account for small within-scanner sample sizes by borrowing information across features to shrink estimates toward a common mean (27,28). The multiplicative scanner effects are also corrected by removing heteroscedasticity of model errors across scanners (29). Furthermore, ComBat preserves the variability contributed by true biological effects [e.g., sex and age; (26)]. However, no study has yet examined whether ComBat is suitable for graph theory metrics derived from the structural connectome based on DTI.
Unlike tractography, which yields a final value for each white matter tract, connectome analyses use weighted adjacency matrices to calculate network parameters. In tractography or region-of-interest analyses, multisite harmonization is performed on final metrics [e.g., average fractional anisotropy measures; (26,(30)(31)(32)]. However, graph theory analysis takes place after connections in a network have been mapped, mathematically represented as an adjacency matrix, and summarized by the computation of network parameters (33). Two distinct approaches to data harmonization are therefore possible in network analysis: (1) before the calculation of network parameters (i.e., matrix harmonization; harmonization at the level of connectivity weights), or (2) after calculation of network parameters (i.e., parameter harmonization). Identifying the optimal timing of data harmonization during data processing and analysis may influence the harmonization of multisite data, and hence has important implications for the accuracy of conclusions drawn from multisite connectivity studies.
To our knowledge, the performance of ComBat for multiscanner harmonization in studies of network topology and neurological disorders has not been evaluated. Therefore, the present study examined the application of two approaches to data harmonization across sites in a sample of DTI scans from children with mTBI or mild orthopedic injury (OI).

Study Design and Procedure
Data were drawn from the Advancing Concussion Assessment in Pediatrics (A-CAP) study (22), a multisite prospective, cohort study with longitudinal follow-up in children [Mean age (range) = 12.37 ± 2.34 years (8.00-16.99 years); 289 (59.7%) Male; see Table 1] with pediatric mTBI (n = 313) or mild orthopedic injury (OI; n = 171). Briefly, children were recruited within 48 h of injury from five children's hospitals across Canada (32), all of which were members of Pediatric Emergency Research Canada [PERC; (34)], and returned for three post-injury followup assessments: post-acute (targeted for 10 days post-injury), 3 and 6 months. Injuries and acute signs and symptoms were assessed during an initial emergency department visit that took place within 48 h post-injury.
The study was conducted with the approval of the research ethics board at each study site. All participants provided written informed assent and parents/guardians provided written informed consent (22). This study examined data from the MRI scans collected during the post-acute visit, as previously described (32, 35).

Quality Assurance
Visual quality checks of all raw images were conducted to identify and exclude scans with structural abnormalities/incidental findings, scanner artifacts (e.g., warping), incomplete acquisition, or not collected using the standardized scan parameters (32). Data that passed the initial quality assessment were subsequently rated for motion by two trained analysts. Discrepancies were resolved through a third reviewer blind to initial ratings. Diffusion-weighted volumes with severe motion artifact were removed, and any scans with > 7 volumes with severe motion artifact were excluded from subsequent analysis (36).

Structural Connectome
Detailed image processing methodology has been previously described (18). Briefly, ExploreDTI (37) was used to preprocess diffusion images, calculate the diffusion tensor, conduct whole brain fiber tractography, and compute an adjacency matrix for each participant. Preprocessing included correction for signal drift (38), eddy currents, subject motion with rotation of the Bmatrix (39), and susceptibility distortions (40). A deterministic streamline approach was used for whole brain fiber tractography (randomized seed points; seed and tractography FA threshold = 0.10; step size = 0.50 mm; angle threshold = 30 • ; step size = 0.5 mm; streamline length 50-500 mm). The resulting whole brain fiber tractography was extracted and used to compute an adjacency matrix for each participant.
The automated anatomical labeling [AAL-90, (41)] template was used to define 90 nodes in native (diffusion) space using functions from open-source software packages in MATLAB R2019a [see (18)]. Fully connected 90 × 90 adjacency matrices were constructed using the average FA of passing fibers among nodes in ExploreDTI for each participant and an absolute threshold of 0.10.

Global Network Metrics and Multisite Harmonization
The following global network parameters were calculated in MATLAB using the GRETNA software toolbox [http:// www.nitrc.org/projects/gretna/; (42)]: global efficiency, global clustering coefficient, small worldness, modularity, and density. Network parameters were normalized against 1,000 randomly generated matrices.
Global network parameters were evaluated before harmonization and after two different harmonization approaches: matrix harmonization and parameter harmonization. The steps used for each approach are summarized in Figure 1. For both harmonization approaches, ComBat v1.0.5 (https://github.com/Jfortin1/ ComBatHarmonization/tree/master/R) was conducted in R v3.6.3 [(43); https://www.R-project.org/] to harmonize the data for scanner differences. A covariate matrix with group (mTBI/OI), age at injury, and biological sex was included to preserve this variance: mod <− model.matrix(∼ injury + age + sex)

Matrix Harmonization
For matrix harmonization, weighted connectivity matrices were harmonized for multiple scanners and global network parameters were calculated using the harmonized connectivity matrix for each participant (see Figure 1). First, the lower diagonal values of each connectivity matrix were extracted to construct a dataframe of 4,005 columns corresponding to node connection pairs among the 90 defined brain regions (nodes), excluding self-connections [i.e., principal diagonal; (n(n-1))/2]. This was done because undirected adjacency matrices are diagonally symmetrical. ComBat was then used to harmonize those extracted values: After the harmonization of the extracted connectivity weights, the adjusted square and symmetric weighted matrices were reconstructed for each participant and subsequently used to calculate global network parameters. During matrix harmonization, many of the connection weights that were 0 before harmonization (i.e., indicating that no connection FIGURE 1 | Overall study procedure illustrating the data processing steps for the generation global network parameters (A) before harmonization, and the implementation of (B) matrix harmonization, and (C) parameter harmonization. existed between two nodes) were transformed to negative values.
To correct for this transformation, an additional masking step was applied to reassign negative weights to 0 prior to graph analysis. Specifically, the masking step multiplied the binary connectivity matrix derived before harmonization with the harmonized weighted connectivity matrix for each participant (see Figure 1).

Parameter Harmonization
For parameter harmonization, the raw global network metrics (i.e., calculated before harmonization) were harmonized for multiple scanners using ComBat. Each parameter was harmonized in separate models because the distribution of each parameter is not necessarily related to the distribution of other parameters. The empirical bias estimation option was not applied (i.e., eb = FALSE) during parameter harmonization because each global network parameter was harmonized separately (i.e., number of features < n). For each global network metric, the following model was used to harmonize the data: neuroCombat (dat = Parameter, batch = Site, mod = mod, eb = FALSE)

Statistical Analysis
Statistical analyses were conducted using R v3.6.3. To evaluate the performance of each harmonization approach (i.e., adjacency matrix or network parameters), the effect of site was examined using separate one-way ANOVA models for each global network parameter. Non-significant scanner effects (p > 0.05) were interpreted as a successful removal of variability due to different scanners. The proportion of significant (p < 0.05) post-hoc pairwise between-site comparisons was evaluated by calculating the number of significant uncorrected pairwise comparisons across scanners, divided by the total number of possible pairwise comparisons (i.e., n = 15). Correction for multiple comparisons was not applied for post-hoc t-tests followups, providing a more conservative evaluation of scanner effects.
The within-scanner consistency of the global network metrics before (unharmonized) and after each harmonization approach (matrix harmonization, parameter harmonization) was examined by calculating the intraclass correlation coefficient (ICC), with ICC < 0.50, 0.50 ≤ ICC < 0.75, 0.75 ≤ ICC < 0.90, and ICC ≥ 0.90 indicative of poor, moderate, good, and excellent consistently, respectively (44). Successful harmonization would reduce the effect of site while preserving the within-scanner variability for each parameter observed before harmonization.
To evaluate whether ComBat harmonization preserves biological variability, analysis of covariance (ANCOVA) was used to examine the effect of site, age at injury, sex, and group (mTBI, OI) on each network parameter. Significant effects involving age at injury were further examined using Pearson correlations, which were compared using a backtransformed average Fisher's Z procedure for dependent and overlapping correlations (45), as implemented using the cocor package (46). Overlapping correlations were used to conduct the following pairwise comparisons of age correlations: (1) matrix harmonization vs. unharmonized data, (2) parameter harmonization vs. unharmonized data, and (3) parameter harmonization vs. matrix harmonization.
Within-scanner age correlations on the unharmonized data were calculated to provide a reference value for the expected age correlation for each network parameter following harmonization. The reference value was calculated based on the means of within-site age correlations, weighted by the corresponding sample size of each scanner. Weighted means were calculated because sites with a greater number of participants may influence the correlation values to a greater extent than sites with smaller cohorts. Successful preservation of age-related biological variability across all scanners following harmonization would approximate the weighted mean of within-scanner agecorrelations. Group differences between mTBI and OI were calculated using t-test.

Matrix Harmonization
Main effects of site remained significant for global network metrics after matrix harmonization ( Table 2 and Figures 2B,  3B). However, pairwise site differences were less pervasive after harmonization for global efficiency [F (5)  Within-site consistency of unharmonized and matrix harmonized metrics ranged from poor to excellent ( Figure 3B).

Network Parameter Harmonization
Site was not significantly associated with global network metrics after parameter harmonization ( Table 2 and Figures 2C, 3C).
No significant correlations with age were observed for modularity and small worldness before harmonization, but modularity significantly correlated with age following matrix  harmonization (r = −0.11, p =0.018), and both parameters showed significant age relationships following parameter harmonization (modularity: r = −0.15, p < 0.001; small worldness: r = −0.13, p < 0.003), although the coefficients were not significantly higher compared to the unharmonized data (p > 0.05). Clustering coefficient (z = 0.26, p < 0.008) and density (z = 2.40, p < 0.016) were significantly stronger following parameter as compared to matrix harmonization. Within-group age correlations are reported in Supplementary Table 1. No significant effects of sex or group were observed for any of the network parameters (see Supplementary Table 2).

DISCUSSION
The popularity of large, representative datasets from collaborative, multisite research initiatives and structural connectomics has increased in recent years. Previous studies demonstrated that ComBat can control for site (scanner) differences while preserving biological variability (e.g., due to injury group, age, sex) for several MRI modalities (26,29,47,48). This is the first study to validate the use of ComBat for the structural connectome. Here, ComBat was successfully used to harmonize structural connectivity data based on diffusionweighted MRI across multiple scanners ("sites"). Parameter harmonization reduced the variability associated with different scanners to a greater extent than matrix harmonization, although both approaches reduced site differences in global network metrics. As expected, both harmonization approaches also preserved biological effects of age on network parameters. Moreover, expected age-related associations with global network parameters were stronger after applying parameter as opposed to matrix harmonization. Overall, the results extend the validity of using ComBat harmonization to network parameters derived using diffusion-weighted MRI. Parameter harmonization showed superior performance for removing scanner effects compared to matrix harmonization. Furthermore, parameter harmonization is more computationally efficient. Matrix harmonization requires a series of steps that involve value translation. Specifically, connectivity weights were deconstructed from the matrices by extracting the lower (or upper) diagonal elements, organized in a high dimensional data frame for harmonization, and reconstructed back in square matrices following harmonization. In some instances, this approach transformed connection weights that were initially 0 (i.e., no connection exists between two nodes in unharmonized data) to negative values, requiring an additional step reassigning these values to zero before graph analysis. Thus, matrix harmonization preserved the location, but not the strength of connections among node pairs. In contrast, parameter harmonization requires only one step. This appears beneficial in preserving the true global properties of the network, as illustrated by the reduced variability of the within-site consistency between the parameter harmonized and unharmonized global network metrics (see Figure 3).
Before harmonization, global efficiency exhibited more robust site effects than other measures, such as clustering coefficient.
Matrix harmonization reduced (e.g., global efficiency), introduced (e.g., clustering coefficient), or maintained (e.g., density) site effects compared to the unharmonized data. The variable performance of matrix harmonization across different metrics may indicate that properties of the network other than the pairwise connection strengths are affected by scanner. Except for density, global parameters included in the present analysis encode information about the topology (i.e., location) as well as the weights of connections among distinct brain regions. Density, which reflects the number of connections regardless of their strengths, did not demonstrate differences in the magnitude of site effects following matrix harmonization (see Figures 2, 3), indicating that site differences are present in topological properties of the network beyond the strengths of pairwise connections (e.g., the number or location of connections). In addition, clustering coefficient quantifies segregation across brain regions (i.e., nodes) by counting the occurrence of existing connections between groups of three nodes. Since matrix harmonization does not alter the location of connections, groups of connected nodes maintain their configuration before and after harmonization. Furthermore, the magnitude of scanner effects might differ slightly among connections, and matrix harmonization might differently impact the reciprocal connection strengths across groups of nodes (i.e., it targets pairs of nodes), potentially explaining the variable performance for removing site effects in the case of clustering coefficient (see Figures 3A,B). These topological properties may be better controlled by parameter harmonization, because global parameters already encode this information.
Correlations between age and network parameters were generally larger following both harmonization approaches, but were slightly more robust following parameter harmonization. One exception was the relationship between age and clustering coefficient, which weakened following matrix harmonization. This is in line with the other results, suggesting that matrix harmonization may be problematic for clustering coefficient. The detection of significant age effects following parameter harmonization, even in the absence of significant correlations for the unharmonized data, raises the question of whether additional variability was added to the data during harmonization that might have artificially boosted the relationship between age and network topology. Further analyses suggest this is not the case, because the age correlation following parameter harmonization were closer to the weighted means of within-scanner correlations before harmonization (see Supplementary Table 1). In addition, previous studies show relationships between age and global network topology in typical development (49)(50)(51) and children with TBI (19). This indicates that parameter harmonization may better preserve age-related biological variability compared to matrix harmonization and to the unharmonized data, although differences between the two harmonization approaches were small.
The children with mTBI and OI did not differ in any global network metrics before or after each harmonization approach. This was expected given that DTI and NODDI indices of white matter microstructure did not differ between groups previously in this sample (18,32,35), and other pediatric samples at similar time points (52). Another study compared a subset of this sample (children recruited at the Calgary site) to typically developing children and also did not find global or regional (nodal) network differences between mTBI and mild OI groups post-acutely, but did find an effect of injury more generally relative to typical development (18).
The current study did not address the effect of data harmonization applied prior to the generation of adjacency matrices, which is an additional possibility to account for the variability across different scanners [e.g., using methods described by (26,53,54)]. It has been suggested that connectome generation can be stable across scanners based on the derived network parameters (55). While future studies may consider this, data harmonization prior to connectome construction is increased in complexity, involving additional processing steps. These include warping the data into a common space and deconstructing brain images to build a voxel by participant data frame, which does not allow for the construction of adjacency matrices in native diffusion space. Following voxelwise harmonization, data would need to be reconstructed into subject-specific brain images (i.e., harmonized FA maps), which may impose substantial feasibility challenges due to the high computational complexity and number of additional transformations involved in this process.
There are some limitations to the current work. Weighted connectivity matrices were analyzed in this study; future multisite studies might examine whether differences in binary matrices relate differently to the effects of site. We did not assess the influence of different thresholds on harmonization.
In addition, the current study used only one parcellation for the construction of adjacency matrices, and future studies might focus on whether other parcellations are similarly affected by site effects particularly when running matrix harmonization. While most methods use similar preprocessing steps, slight variations in these steps and how they are applied can impact calculated diffusion metrics, and thus may be important to explore in future studies. Data acquisition in the current study included single shell diffusion-weighted data. Multishell acquisition protocols may be differently affected by site effects, which might be addressed in future studies. Lastly, the current study used deterministic tractography, and future analyses might consider testing the effect of harmonization on networks derived using probabilistic tractography, as the two approaches have been shown to differ in terms of within-and between-scanner consistency (55).

CONCLUSIONS
The present paper validates the utility of ComBat harmonization in the context of graph theoretical analysis for structural connectivity derived from DTI. The harmonization of global parameters derived from unharmonized adjacency matrices provided superior performance as compared with the harmonization of connectivity weights for removing betweensite differences, preserving the within-site variability and preserving age-related biological variability in the data.

DATA AVAILABILITY STATEMENT
A dataset with deidentified participant data and a data dictionary will be made available upon reasonable request from any qualified investigator, subject to a signed data access agreement.

ETHICS STATEMENT
The current study was reviewed and approved by each site where data was collected. Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin.

AUTHOR CONTRIBUTIONS
AO, AW, AH, KY, and CL: conception and design of the study, acquisition and analysis of data, and drafted significant portion of the manuscript and figures. MB, CB, WC, QD, SF, and RZ: conception and design of the study. All authors contributed to the article and approved the submitted version.