Parametric Probability Distribution Functions for Axon Diameters of Corpus Callosum

Axon diameter is an important neuroanatomical characteristic of the nervous system that alters in the course of neurological disorders such as multiple sclerosis. Axon diameters vary, even within a fiber bundle, and are not normally distributed. An accurate distribution function is therefore beneficial, either to describe axon diameters that are obtained from a direct measurement technique (e.g., microscopy), or to infer them indirectly (e.g., using diffusion-weighted MRI). The gamma distribution is a common choice for this purpose (particularly for the inferential approach) because it resembles the distribution profile of measured axon diameters which has been consistently shown to be non-negative and right-skewed. In this study we compared a wide range of parametric probability distribution functions against empirical data obtained from electron microscopy images. We observed that the gamma distribution fails to accurately describe the main characteristics of the axon diameter distribution, such as location and scale of the mode and the profile of distribution tails. We also found that the generalized extreme value distribution consistently fitted the measured distribution better than other distribution functions. This suggests that there may be distinct subpopulations of axons in the corpus callosum, each with their own distribution profiles. In addition, we observed that several other distributions outperformed the gamma distribution, yet had the same number of unknown parameters; these were the inverse Gaussian, log normal, log logistic and Birnbaum-Saunders distributions.


INTRODUCTION
Axon diameter is an important structural characteristic of tissue in the central nervous system. Axon diameter correlates with conduction velocity, is affected by some neurological disorders, such as multiple sclerosis and autism, and changes during development (Ritchie, 1982;Piven et al., 1997;Bauman and Kemper, 2005;Hughes, 2006;Kunz et al., 2014). The conventional approach to obtain axon diameter values is through histological techniques such as electron microscopy (Aboitiz et al., 1992). Given the large number of axons in a region of interest and the variation in axon diameter, a statistical representation such the distribution of axon diameters distribution, is useful to describe axon diameter. In additional, recent efforts to infer axon diameter non-invasively using diffusion-weighted MRI often utilize a statistical model of axon diameter distribution, to be fitted to MRI measurements (Stanisz et al., 1996;Assaf et al., 2008;Barazany et al., 2009;Alexander et al., 2010;Dyrby et al., 2013;McNab et al., 2013;Horowitz et al., 2014;Huang et al., 2015;Sepehrband et al., 2016).
The gamma distribution is the most common probability distribution function used for this purpose. It has two parameters, shape and scale, and broadly reflects the shape of the axon diameter distribution, which has been consistently observed to be right-skewed and heavy-tailed (i.e., an asymmetric distribution, in which the right tail of the distribution is much longer than the left tail). Recently Pajevic and Basser (2013) argued, from neurophysiological perspective, that this skewed profile optimizes information transfer and capacity along bundles of axons. They also reported optimum distributions, based on parameters describing the fiber's ability to transmit information, that outperform the gamma distribution, in practical applications such as AxCaliber (Assaf et al., 2008).
Here we introduce another probability distribution function that provides a good representation of the heavy-tailed axon diameter distribution, the generalized extreme value distribution. We empirically compared the generalized extreme value distribution with the gamma distribution and fourteen others, without any prior assumptions with respect to the anatomy or metabolic requirements of axons. We assessed different probability functions using electron microscopy images of mouse corpus callosum in which we manually measured more than 20,000 axons. In addition, we examined previously published electron microscopy data for the human and macaque corpus callosum (Liewald et al., 2014).

MATERIALS AND METHODS
Axons of the corpus callosum of a mouse corpus callosum were manually measured using cross sectional electron microscopy images. Animal preparation and electron microscopy are described in detail in (Sepehrband et al., 2016), but are also included in Appendix A for readability. Electron microscopy images and measurements are available at: https://github.com/ sepehrband/AxonDiameter. Figure 1 shows a representative electron microscopy image and measured axons. In addition, we investigated the axon diameter distribution profile of human and macaque from an electron microcopy study performed by Liewald et al. (2014). Three regions of two human corpus callosum (genu, truncus, and splenium) and a region of macaque corpus callosum (truncus) were examined; from Figure 10 of Liewald et al. (2014).
Sixteen different parametric continuous probability distribution functions were fitted to the axon diameter data ( Table 1; Severini, 2005). In particular, distribution functions with three or fewer unknown parameters were used. The beta distribution was excluded from the comparison, as it only defines the values in 0 to 1 range, and even after normalizing axon diameter values to this range, it fitted the data poorly (see Supplementary Figure 1). Distribution functions were FIGURE 1 | An example electron microscopy image of mouse corpus callosum. On the right panel the circular representation of the measured axons is presented. fitted using the allfitdist function, in the MATLAB R statistics toolbox.
To evaluate the performance of the distribution functions, the Akaike Information Criterion (AIC) was used (Akaike, 1974); it trades off goodness of fit against model complexity. AIC measures the relative quality of a statistical model, in which the model with the lowest AIC score is ranked highest. The AIC was corrected for finite sample sizes and was calculated as follows: where L and k are the maximum value of the likelihood function and the number of estimated parameters, respectively. In addition to AIC, the Bayesian Information Criterion (BIC) amd negative log-likelihood values were also assessed. However, we only report the AIC, as all criteria used led to same rankings of tested distributions.

Basic Statistics
Mean axon diameters were around 0.56 µm in different regions of the corpus callosum ( Table 2). The largest axons were observed in the body and genu, and the smallest axons in the splenium. The mean axon diameter in the genu was significantly smaller than in both the body and the splenium (p < 0.01). The mean axon diameter in the splenium was only slightly smaller than in the body, but the difference was not significant. The splenium has the highest median value and the lowest standard deviation, demonstrating homogeneity of axon diameter in this region. A similar trend was seen in other mammalian species (Olivares et al., 2001). Table 3 shows the ranking of the evaluated probability distribution functions. Regardless of the evaluation criterion, the same ranking was obtained. The generalized extreme value distribution, while having three unknown parameters, ranked highest. Log normal, inverse Gaussian, log logistic, and Birnbaum-Saunders distributions, with relatively similar AIC

Axon Diameter Distribution in the Corpus Callosum
Frontiers in Neuroanatomy | www.frontiersin.org values and the same number of unknown parameters, were good alternatives to represent the distribution of axon diameter. The gamma distribution, despite also having two unknown parameters, performed relatively poorly. The Generalized Pareto and t-location-scale distributions performed poorly, even though they both have three unknown parameters. As expected, the normal distribution was also ranked low; it fails to represent the skewness of axon diameter distribution. Interestingly, while the generalized extreme value distribution outperformed other distributions, the extreme value distribution had the lowest ranking. Extreme value distribution models extreme deviation from the median of probability distribution, but may fail to accurately describe the rest of the distribution. Generalized extreme value distribution, however, combines three types of extreme value distributions, allowing a continuous range of possible shapes, which most likely explains the divergence of performance. Figure 2 compares the top seven probability distribution functions with the empirical data. As shown in Table 3, generalized extreme value distribution gave an accurate representation of the data. Most of distribution functions accurately represented the location of the mode (peak of the distribution), but failed to represent the scale of the mode (see log normal, inverse Gaussian, and Birnbaum-Saunders). The gamma and t-location-scale distributions missed both location and scale of the mode. In addition, they gave a poor representation of both tails of the distribution. These poor representations raise questions about the reliability of the neuroanatomical measures obtained from the commonly used gamma distribution. Figure 3 demonstrates the cumulative distribution of top five distribution functions, together with their error. The error plot ( Figure 3B) shows that generalized extreme value function had relatively constant errors across axon diameter values. The remaining distributions showed two error peaks, one before and one after the mode. gamma and t-location-scale distributions had the highest negative and positive error values.

Sub-Regions of the Corpus Callosum
The main difference between the sub-regions was in skewness (Figure 4). The axon diameter distribution of the genu and body was more skewed compared with that of the splenium. Regardless of region, the generalized extreme value distribution always ranked highest. As expected, the ranking of the probability distribution functions in the genu and body (with relatively similar distribution profile) was almost the same as for the splenium. In the splenium, log logistic and log normal distributions were ranked second and third, respectively. Regardless of the region, gamma and t-locationscale distributions were the lowest ranked of the top seven. Similar to the analysis of the whole corpus callosum, the gamma distribution had high negative and positive error values compared with other top ranked distribution functions. Supplementary Tables show the ranking of all the distribution functions across sub-regions of the corpus callosum.
Corroborating our mouse brain data, the generalized extreme value distribution consistently ranked highest when human or monkey corpus callosum were assessed ( Table 4). Similar to previous results, the gamma distribution ranked sixth, regardless of the region or sample. The inverse Gaussian, log normal, log logistic, and Birnbaum-Saunders distributions were ranked 2nd to 5th, with the inverse Gaussian appearing to give a slightly better fit to the data. A further non-parametric Friedman's test followed by post-hoc Nemenyi test was performed on the ranking results of the top six probability distribution functions to assess whether the goodness of fit of different probability functions differed significantly (Supplementary Figure 2). The probability distribution function affected goodness of fit significantly (p < 0.0001). The gamma distribution was significantly poorer fitting than the generalized extreme value, inverse Gaussian, log normal and log logistic distributions (p < 0.05).

DISCUSSION
In this work, we investigated the optimum probability distribution function for describing axons of corpus callosum. The optimum probability distribution is most useful for techniques that use mathematical models either to describe axon diameters that are measured directly (e.g., from electron histology), or to infer the distribution of axon diameter from non-invasive measurements indirectly (e.g., diffusion-weighted FIGURE 2 | Comparing top ranked probability density functions with empirical data from electron microscopy. MRI). To find the optimum probability distribution function, we performed electron microscopy of the corpus callosum of an adult C57Bl/6J mouse and carefully measured the diameter of more than 20,000 axons. We also assessed electron microscopy data for human and macaque corpus callosum from the literature. Model selection for axon diameter distribution functions was based on information criteria (i.e., AIC) and error propagation.
Gamma distribution failed to accurately describe the main characteristics of the axon diameter distribution, such as location and scale of the mode and the profile of distribution tails. On the contrary, generalized extreme value distribution consistently fitted the measured distribution better than other distribution functions. Axon morphology correlates with axonal function (e.g., axon diameter correlates with conduction velocity). It is possible that axons fall into different subpopulations with different distribution profiles; i.e., axons with small diameter may have a different distribution profile to large diameter axons. Therefore, a distribution function that can capture such a characteristic would outperform others. The generalized extreme value combines three simpler distributions and may fit better because it can capture different subpopulations simultaneously.
Techniques such as AxCaliber that use a mathematical model of tissue microstructure to indirectly infer axon diameter distribution require a small number of unknown parameters. The generalized extreme value distribution function has one more unknown parameter than the gamma distribution. Four other probability distribution functions outperformed the gamma distribution and, like the gamma distribution, also have only two unknown parameters (Tables 1, 4). Log-normal, log-logistic, and inverse Gaussian distribution functions proved to be significantly better descriptors of axon diameter distribution than the gamma distribution function. In particular, the log-normal model which outperforms the gamma distribution in all of our comparisons, has the virtues of simplicity, widespread use in biology and neurobiological validity (Buzsáki and Mizuseki, 2014).
Unlike previous studies (Gov, 2009;Perge et al., 2012;Pajevic and Basser, 2013), we did not explicitly focus on explaining the skewness of the axon diameter distribution. Rather, we evaluated a range of parametric probability distribution functions to find the most parsimonious model in terms of unknown parameters that optimized model accuracy. Model selection did not consider prior knowledge about axon morphometry.
As discussed in (Sepehrband et al., 2016), our estimated values are higher than those reported in a previous study (median of 0.25 µm and mean of 0.43 µm) (Innocenti et al., 1995). The difference could be due to the shrinkage artifact caused by older techniques for embedding and fixation of the tissue. We  used a method that, in some settings, has been demonstrated to produce almost no shrinkage during processing compared to 40-70% shrinkage with other techniques (Hanssen et al., 2013).

ACKNOWLEDGMENTS
We would like to thank Gayeshika Leanage from Centre for Advanced Imaging of UQ, and Dr. Kathryn Green from Centre for Microscopy and Microanalysis of UQ, for their valuable input, and help during animal preparation and histological imaging.

SUPPLEMENTARY MATERIAL
The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fnana. 2016.00059