Low-Rank Representation of Head Impact Kinematics: A Data-Driven Emulator

Head motion induced by impacts has been deemed as one of the most important measures in brain injury prediction, given that the vast majority of brain injury metrics use head kinematics as input. Recently, researchers have focused on using fast approaches, such as machine learning, to approximate brain deformation in real time for early brain injury diagnosis. However, training such models requires large number of kinematic measurements, and therefore data augmentation is required given the limited on-field measured data available. In this study we present a principal component analysis-based method that emulates an empirical low-rank substitution for head impact kinematics, while requiring low computational cost. In characterizing our existing data set of 537 head impacts, each consisting of 6 degrees of freedom measurements, we found that only a few modes, e.g., 15 in the case of angular velocity, is sufficient for accurate reconstruction of the entire data set. Furthermore, these modes are predominantly low frequency since over 70% of the angular velocity response can be captured by modes that have frequencies under 40 Hz. We compared our proposed method against existing impact parametrization methods and showed significantly better performance in injury prediction using a range of kinematic-based metrics—such as head injury criterion (HIC), rotational injury criterion (RIC), and brain injury metric (BrIC)—and brain tissue deformation-based metrics—such as brain angle metric (BAM), maximum principal strain (MPS), and axonal fiber strains (FS). In all cases, our approach reproduced injury metrics similar to the ground truth measurements with no significant difference, whereas the existing methods obtained significantly different (p < 0.01) values as well as substantial differences in injury classification sensitivity and specificity. This emulator will enable us to provide the necessary data augmentation to build a head impact kinematic data set of any size. The emulator and corresponding examples are available on our website1.

Head motion induced by impacts has been deemed as one of the most important measures in brain injury prediction, given that the vast majority of brain injury metrics use head kinematics as input. Recently, researchers have focused on using fast approaches, such as machine learning, to approximate brain deformation in real time for early brain injury diagnosis. However, training such models requires large number of kinematic measurements, and therefore data augmentation is required given the limited on-field measured data available. In this study we present a principal component analysis-based method that emulates an empirical low-rank substitution for head impact kinematics, while requiring low computational cost. In characterizing our existing data set of 537 head impacts, each consisting of 6 degrees of freedom measurements, we found that only a few modes, e.g., 15 in the case of angular velocity, is sufficient for accurate reconstruction of the entire data set. Furthermore, these modes are predominantly low frequency since over 70% of the angular velocity response can be captured by modes that have frequencies under 40 Hz. We compared our proposed method against existing impact parametrization methods and showed significantly better performance in injury prediction using a range of kinematic-based metrics-such as head injury criterion (HIC), rotational injury criterion (RIC), and brain injury metric (BrIC)-and brain tissue deformation-based metrics-such as brain angle metric (BAM), maximum principal strain (MPS), and axonal fiber strains (FS). In all cases, our approach reproduced injury metrics similar to the ground truth measurements with no significant difference, whereas the existing methods obtained significantly different (p < 0.01) values as well as substantial differences in injury classification sensitivity and specificity. This emulator will enable us to provide the necessary data augmentation to build a head impact kinematic data set of any size. The emulator and corresponding examples are available on our website 1 .
Keywords: traumatic brain injury, concussion, head impact kinematics, injury biomechanics, data-driven emulator, injury metrics INTRODUCTION Traumatic brain injury (TBI) is one of the most debilitating health problems in our society today, with nearly two million new cases in the US every year (Taylor et al., 2017). The majority of these cases are considered mild, also known as concussion (Defense and Veterans Brain Injury Center, 2018). The substantial increase in reported concussions in contact sports (Selassie et al., 2013), together with the recent findings of increased long-term pathological changes (DeKosky et al., 2013), has sparked a public discussion and raised awareness about TBI. An important requirement is an accurate and objective diagnosis of concussions, which in turn could inform better protective equipment design and safer activities (Manoogian et al., 2006;Wu et al., 2014;Kuo et al., 2017;Kurt et al., 2017;Siegkas et al., 2019).
Head motion kinematics, including the rate, frequency, and direction of head's movement during collision, has been deemed as one of the most consequential metric in predicting brain injury. Historically, kinematic-based metrics such as head injury criterion (HIC) (NTSA, 1972), rotational injury criterion (RIC) (Kimpara and Iwamoto, 2012), and brain injury criterion (BrIC) (Takhounts et al., 2013) have been used to detect injury. These metrics are still widely used among researchers and are endorsed by safety regulating organizations such as the National Highway Traffic Safety Administration (NHTSA) (Laituri et al., 2016) and the National Operating Committee on Standards for Athletic Equipment (NOCSAE) (National Operating Committee on Standards for Athletic, 2012). More recently, brain tissue deformation-based metrics have been introduced that use head kinematics as input to computational models that can approximate the effect of head motion on brain displacement and deformation. These metrics either use simple discrete mechanical elements in lumped-parameter models, i.e., mass-spring-damper combinations, to give a rigid-body estimate of brain's relative motion with respect to the skull (Kornhauser, 1954;Low and Stalnaker, 1987;Laksari et al., 2015;Gabler et al., 2018b), or more complex finite element (FE) models with detailed geometry of the brain anatomy, which can simulate the local brain deformation and interaction with the stiff bony or membranous structures (Kleiven, 2013;Zhao et al., 2016). In the case of lumped models, brain angle metric (BAM), developed based on a data set of concussive and sub-concussive head impacts , and in the case of FE models, maximum principal strain (MPS) and axonal fiber strain (FS) along the white matter axon fibers have been proposed as effective injury diagnosis metrics (Wu et al., 2019b).
Evidently, both for the kinematic-based and the brain deformation-based metrics, head impact kinematics play a major role. With the advent of wearable sensor technology, several groups have been collecting on-field head kinematic measurements during contact sports events (Hernandez et al., 2014;Laksari et al., 2018;Miller et al., 2018Miller et al., , 2019Wu et al., 2018). However, despite these pioneering efforts, on-field head kinematic measurements are not widely available. As a result, researchers have resorted to simplifying parameterizations of head collisions as idealized biphasic acceleration impulses (Yoganandan et al., 2008;Abderezaei et al., 2019). These biphasic impulses are commonly represented either by a triangle or half-sine and defined by two parameters: height and width constitute the magnitude and duration of a head impact impulse. The simplification of kinematic impulses serves the objective of emulating on-field kinematic data of a head impact with a few and manageable number of parameters to populate an otherwise infinite-dimensional loading space to investigate and establish a relation between head motion and brain injury. However, a potential disadvantage of these simplifications is overlooking valuable information that could prove detrimental in developing injury metrics. Therefore, it is paramount to understand the characteristics of real-world head impacts and whether we can accurately capture them through simplified approximations. Furthermore, advances in computational methods, including machine learning algorithms, have provided new and exciting avenues for fast and reliable prediction and diagnosis of brain injury. As a result, given the prohibitively high computational cost of current FE models, the biomechanics community has been trying to utilize such interpolative and machine learning techniques Cai et al., 2018;Wu et al., 2019a,b). However, a limitation of those techniques is the large number of kinematic data required to train these algorithms (in the order of thousands of head impacts; Wu et al., 2019a). Currently such a data set is not widely available. Thus, artificial augmentation of kinematic samples has been utilized as an alternative to satisfy that training data set requirements of such algorithms.
In this paper, we present a formal method to extract the most dominant features of on-field head impact kinematics from an existing data in the context of contact sports. We subsequently use the extracted features in order to construct an augmented data set that resembles the on-field measurements. We hypothesize that by using our method, based on principal component analysis (PCA), we will acquire more accurate injury predictions than current biphasic impulse approximations when compared against the ground truth measurements. Furthermore, we present a modal reconstruction technique that, despite using relatively few modes, can emulate a desired number of augmented head impacts that are statistically similar to the ground truth impact measurements.

MATERIALS AND METHODS
In order to study the characteristics of head impact kinematics and the efficacy of simplified approximations, we used a previously-collected data set of 537 head impact kinematics measured during contact sports, including American football, boxing, and mixed martial arts (Hernandez et al., 2014;Laksari et al., 2018). For each impact, 6 degrees of freedom (DoF) kinematics-linear acceleration and angular velocity in the three anatomical directions-were collected at 1,000 Hz for 100 ms using a mouthguard instrumented with a triaxial accelerometer and a triaxial gyroscope (Hernandez et al., 2014). We construct three different reduced kinematics data sets to approximate the measured kinematics: (1) using principal component analysis (PCA), we decrease the dimensionality of the measured head impact kinematics to construct a low-rank kinematics data set, and (2-3) using previously proposed biphasic assumptions for acceleration impulses with acceleration magnitude and duration as the two variables, we construct biphasic data sets once for triangle (Tri) and once for half-sine (HS) approximations. We investigate the efficacy of each approximation by comparing its performance in detecting brain motion/deformation and injury prediction using three types of metrics: (1) kinematicsbased injury metrics, including HIC, RIC, and BrIC, (2) brain angle metric (BAM), and (3) tissue deformation-based finite element injury metrics, including maximum principal strain (MPS) and axonal fiber strain (FS) in the whole brain (WB) and corpus callosum (CC). We compare the performance of each approximation against the ground truth (GT) data described above. We perform power spectral density (PSD) analysis on the derived temporal modes to obtain their predominant frequencies. These values are given by the maximum power spectral density (PSD) values of each mode. Finally, we present a modal reconstruction method for emulating augmented head impacts.

Dimension Reduction Through Principal Component Analysis
Our goal is to exploit the correlations between different measurements and find a reduced representation for three quantities of interest (QoIs), including: linear acceleration, angular velocity, and angular acceleration in each anatomical direction. In the case of linear kinematics, anterior-posterior, inferior-superior, and lateral directions are considered and in the case of angular kinematics, axial, coronal, and sagittal directions are considered as separate QoIs. To this end, we apply principal component analysis (PCA) to our data set. For each QoI, we form a data matrix X m×n , where n = 537 is the number of measured head impacts and m = 100 is the number of time steps, and X = [x 1 |x 2 | . . . |x n ], where each column represents the measured QoI for a particular head impact and each row represents the time instance of the measurement. To perform PCA, we compute the singular value decomposition (SVD) of the data matrix: σ n are the singular values, and Y = [y 1 |y 2 | . . . |y n ] are the uncorrelated linear components, i.e., y T i y j = δ ij with the joint probability distribution function (PDF) of p(y 1 , y 2 , . . . , y n ). This gives an ordered array of the modal contributions inherent to head impacts response. Finally, we perform power spectral density (PSD) analysis on the derived modes u i to obtain their predominant frequencies. These values are given by the maximum PSD values of each mode.
A reduced representation of the impact data is obtained by: To quantify the performance of the reduction, we introduce:

Data-Driven Emulator
Once we extract the modal characteristics of the head impact data set, a reduced emulator of the head impact kinematics is obtained by truncating to k PCA modes: where (y * 1 , y * 2 , . . . , y * k ) is a random point with k components drawn from the marginal PDF of p(y 1 , y 2 , . . . , y k ). Equation (2) can be used as an emulator for producing new time series (x * ) for each of the QoIs that are nearly indistinguishable from the ground truth head impact kinematics measurements. Our approach can be interpreted as a stochastic dimension reduction technique and it is a special case of dimension reduction with time-dependent modes (Sapsis and Lermusiaux, 2009;Babaee and Sapsis, 2016;Babaee et al., 2017;Babaee, 2019;Patil and Babaee, 2020), in which the loading is a random input and the QoIs are random output. In order to ensure high quality augmented data, we establish features to statistically compare new time series with the GT, including parameters such as impact time, peak, and duration of impacts, and orthonormal projections of PCA modes. We will provide this emulator as a stand-alone program that allow users to build low-rank head kinematics data sets with various approximation levels (k) and emulate a pre-defined number of impacts.

Parameterizing Biphasic Impulse Profiles
In order to compare the performance of previously proposed biphasic models for angular and linear acceleration impulses Abderezaei et al., 2019) against our low-rank PCA approximation, we reconstruct biphasic triangle (tri) and half-sine (hs) representations of the 537 head impacts described above. First, the maximum absolute value of the angular acceleration profile (α M )-computed by differentiating angular velocity measurements using a first-order forward divided difference method (two points)-was identified, including the time of peak (t M ). The impact duration ( t) is defined as the time interval on either side of t M . The boundary of this interval is defined as where the sign or the convexity of the acceleration profile (whichever comes first) changes. Convexity changes are computed based on the second derivative test using a common three-point stencil central finite difference derivative. This process is described in Figure 1, where the impact duration ( t = t 1 − t 0 ) is the time elapsed between the initiation time (t 0 < t M ) and the completion time (t 1 > t M ) of impact. In the cases where t M → 0 ms or t M → 100 ms, since it is not possible to define t 0 or t 1 correctly, only half of the simplified pulse was created. Finally, the change in velocity ( ω) was computed as the area under the acceleration impulse ( ω = t 1 t 0 α(t)dt), and the corresponding acceleration magnitudes for the triangle (α tri ) and half-sine (α hs ) approximations were calculated through: Angular velocity pulses were then computed through direct temporal integration of low-rank angular acceleration pulses.

Kinematics-Based Injury Metrics
We used HIC 15 , RIC 36 , and BrIC to compare the performance of our proposed PCA reduction against the triangle and half-sine biphasic signals, i.e., triangle and half-sine approximations. To this end, we used previously published injury threshold values: (1) for HIC 15 , values of 240 and 667 have been reported as 50% risk of concussion (Newman and Shewchenko, 2000) and skull fracture (Marjoux et al., 2008), respectively; (2) for RIC 36 , a value of 10.3 × 10 6 is reported as 50% risk of concussion (Kimpara and Iwamoto, 2012); and (3) for BrIC, a value of 0.5 constitutes a 50% concussion risk (Takhounts et al., 2013). We used these thresholds to assess the performance of each reduction approach in providing injury predictions in terms of sensitivity and specificity with respect to the ground truth measurements. Injury thresholds were used as indicators, defining true positives (above the threshold) and negatives (below the threshold), while the predictive value of each impulse approximation was compared against the GT measurement.

Brain Angle Injury Metric
We further compared the performance of each approximation using injury criterion based on lumped-parameter models of the head. These models generally consider simplifying assumptions: skull and brain are considered rigid bodies and relative motion between the two represents a form of deformation and injury, and the compliance of the brain-skull interface such as the effect of bridging veins, dura and pia maters is represented by linear spring and damper elements (Kornhauser, 1954;Low and Stalnaker, 1987;Gabler et al., 2018a). Given the head kinematics as the base excitation input, these models can estimate the relative motion of brain and skull, particularly the angular motion since that has been seen as the more consequential type of motion (Sullivan et al., 2015). Recently, brain angle metric (BAM) was developed based on the characteristics of human brain and skull in finite element simulations, and validated against observed concussive and sub-concussive head impacts . We compute BAM for each kinematic approximation (PCA, triangle, and half-sine).

Tissue Deformation-Based Injury Metrics
As a final step in studying the efficacy of the different kinematic approximations, we compared the performance of each approximation in predicting the tissue-level deformation metrics using finite element simulations, including maximum principal strain (MPS) in the whole brain (WB) and in the corpus callosum (CC) region, as well as axonal fiber strains (FS) in the corpus callosum region, which have all been proposed as predictive tissue-level metrics for injury classification (Kleiven, 2007;Laksari et al., 2018) (Figure 2). Recently a convolutional neural network (CNN) was developed based on pre-trained FE simulations based on the Worcester Head Injury Model (WHIM) (Zhao et al., 2017). This CNN method uses angular velocity data as input to approximate the regional brain deformations, i.e., maximum principal and axonal fiber strain (Wu et al., 2019a).

Injury Metric Error Analysis
Having simulated the injury metrics for each head impact measurement, we calculated the corresponding injury metric (metric GT ) and the injury metric estimated by the kinematics approximation (metric approx ) using the equation below: Subsequently, we performed Friedman test (Daniel, 1990) with a p-value of 0.01 (MATLAB, friedman) to show significant differences between each approximated impulse and the ground truth. We also performed sensitivity and specificity analysis to provide an estimate for the efficacy of approximating the metrics for injury diagnosis. For this purpose, we used previously published values for 50% risk of concussion, including MPS WB =

Dimension Reduction Through PCA
We performed PCA on the measured kinematics data for the QoIs, i.e., linear acceleration, angular velocity and angular acceleration in each anatomical direction. We used the reduction criterion of η = 0.90, as defined in Equation (1), for all these cases. In the case of angular velocity, the minimum number of modes that satisfies this reduction criterion is k = 13, 15, and 10 modes for coronal, sagittal, and axial directions, respectively. In Figure 3 (top row), the PCA reconstruction of angular velocity in three anatomical directions for a sample case is shown. The sample case was chosen randomly from the 537 cases and it is represented by a column of the data matrix X. The ground truth measurement for the sample case as well as the reconstructed impulses with different levels of reduction are shown. It is clear that the 15-mode reduction yields a satisfactory reconstruction. In Figure 3 (bottom row), the individual and cumulative contribution of PCA modes are shown for the entire kinematics data set. These results demonstrate that with a relatively small number of PCA modes an accurate approximation of the head kinematic measurements can be achieved. As an additional analysis, in order to determine the convergence of this method, we performed PCA with several randomly selected subsets of the 537 ground truth measurements (with 100, 200, 300, 400, and 500 samples) and investigated the number of modes required to satisfy the η > 0.90 criterion. The results show that the minimum number of modes slightly grows with subsets size but levels off below 500 cases, indicating that our data set of 537 could be sufficient to reliably reconstruct a head impact data set (see Supplementary Figure 8).

Data-Driven Emulator
The first three temporal modes show a classic modal behavior with 1, 2, and 3 peaks in all three anatomical directions (Figure 4). Together, these first three modes capture nearly half of the total angular velocity response, and with each additional mode, we can reconstruct a closer approximation with the ground truth. For more analysis of the modes (see Supplementary Figures 2, 4). To further study the distribution of these modal approximations, we show the orthonormal projection of the first five PCA modes (y 2 , ..., y 5 ) against the first and most energetic mode (y 1 ) of the PCA data (black circles and bars in Figure 5). It is clear that the modes follow a Gaussian distribution, which would be an important consideration for emulating more data points.
To show the performance of our data-driven emulator, we reproduced an additional k = 537 head impact cases by randomizing the y * k columns each with the same mean and variance as the original Y matrix (Equation 2). Projection of the orthonormal vectors y * i on y * 1 for an augmented data set (red circles and bars in Figure 5) shows similar distribution as the ground truth data. In addition, features of the augmented data generated on our emulator, such as duration and peak distributions, have not significant differences with respect those of GT (see Supplementary Figures 5-7). Thus, our emulator is able to successfully generate reliable kinematics sets of data based on real on-field measurements. A copy of the head impact kinematics emulator will be available on our website (Arrué, 2020).

Natural Frequencies
The contribution of the most dominant frequencies, obtained through PSD criterion, are displayed in Figure 6. In general, low frequencies interval such from 10 to 40 Hz have the highest predominance for each parameter. Notably, the cumulative contribution for rotational velocity is progressively decreasing with increasing frequency.

Parameterizing Biphasic Impulse Profiles
Using the criteria described above, we fitted triangle and halfsine analog pulses to the ground truth kinematics measurements in order to parameterize the rotational acceleration magnitude and duration for each head impact. As a result, we derived 537 analog impulses in the three anatomical directions for both triangle and half-sine approximations. The results are presented in Table 1, where the rotational acceleration magnitude and duration average and standard errors of the mean are given for the ground truth and the impulses approximations.
Additionally, whereas the PCA-based impulses showed accurate predictions compared to the ground truth, we observed that the biphasic approximations either under-predicted injury (higher number of false negatives) in terms of HIC and RIC, or over-predicted injury (higher number of false positives) in terms of BrIC (Figure 7).
To better illustrate this, we performed sensitivity and specificity analysis for injury classification with respect to the ground truth, where the PCA-based signals showed high predictive performance compared to the biphasic impulses ( Table 2).

Brain Angle Injury Metric
We computed 3DoF relative brain angles using the lumped model proposed in Laksari et al. (2019) to obtain the maximum resultant relative brain angle as a result of each head impact based on the ground truth kinematics and each of the three approximations. In coronal and axial directions, triangular and half-sine approximations gave significantly lower predictions for the brain angle metric, whereas the PCA modes showed no statistically significant difference from the ground truth (Figure 8). We observed substantially smaller approximation errors (Equation 4) for the PCA approach (∼3%) compared to the two biphasic approximations (∼25%) (Figure 8).

Tissue Deformation-Based Injury Metrics
Similar to brain angle metric, we calculated the errors between the ground truth strain metrics and those of low-rank approximations. As can be seen in Figure 9, the PCA approach closely follows the ground truth simulation results in all three strain metrics: maximum principal strain (MPS) in the whole brain (WB) and corpus callosum (CC), as well as axonal fiber strain (FS) in the corpus callosum. The strains computed for the biphasic impulses significantly differ from the ground truth values and successively provide higher errors as we include region-specific and morphological information. Furthermore, based on the 50% concussion risk thresholds mentioned above, the PCA approach provides substantially higher sensitivity and specificity for injury classification (Table 3).
FIGURE 4 | The three most energetic temporal modes for angular velocity for the entire 537 head impact data set.
FIGURE 5 | Distribution graphs for second to fifth principal components (y 2 , ..., y 5 ) projected on to the first principal component (y 1 ) for angular velocity in the sagittal direction. PCA data (in black) follows a Gaussian distribution. Emulated data (red) is generated performing a Gaussian random number generator based on mean and variances from the PCA modes.

DISCUSSION
In this study we provide a formal approach for reducing the dimensionality of head impact kinematics in contact sports settings. We first derived the most important modes contributing to the head kinematics through principal component analysis and then compared those to existing methods that approximate head kinematics with simple biphasic impulses. We show that The PCA magnitudes were obtained by decomposing the ground truth angular accelerations for the criterion η = 0.90, which constituted of 21 modes for coronal and sagittal directions and 20 modes for axial direction (see Supplementary Figure 3).
FIGURE 7 | Computed HIC 15 , RIC 36 , and BrIC for each of the ground truth (GT) and the three approximated data sets: PCA, triangle (Tri), and half-sine (HS). The circles represent each sample, the solid red line represents the mean, and the blue and red regions show the standard deviation and standard error, respectively. The solid blue and dashed red lines represent 50% risk of concussion and skull fracture, respectively. Significant differences are indicated with (*) for p < 0.01. HIC and RIC graphs are on log-scale.  the modal decomposition approach can capture the kinematic behavior of the head with better accuracy and provide better approximations of brain deformation and injury classification. This analysis confirms that although head kinematics during head collisions span a wide range of magnitudes and frequencies , we can accurately capture the impact head kinematics by using only a relatively small number of modes. The low-rank database constructed based on PCA analysis require FIGURE 9 | Simulated strain metrics for ground truth head impact measurements and the corresponding PCA and biphasic approximations. The blue line shows the 50% risk of concussion for each metric. The significant differences are indicated with (*) for p < 0.01. Also the mean and standard error of the mean are shown on the right for strain estimation errors. only 15 modes to build the ground truth angular velocity kinematics with over 90% accuracy as well as accurately capture the predictive value of head impact kinematics using a variety of injury metrics. A major advantage of our approach is that with the acquired modes above, we are able to emulate a head impact data set with any given number of cases without needing access to the ground truth measurements. This emulated data set would closely replicate head impacts measured by on-field wearable sensors that constitute current state of the art.
The advantage of this low-rank emulator, in addition to its computational efficiency, is that it avoids simplifying assumptions for the shape of acceleration impulses and only uses empirical measurements. In contrast, the conventional biphasic assumption for head impacts as simple impulses with only two variables, i.e., acceleration magnitude and duration, falls short in providing accurate estimates. This effect is more pronounced for acceleration impulses that are more variable due to the time derivative but is true for velocity profiles as well. This apparent lack of accuracy in injury prediction in the biphasic approximations might be due to the fact that the biphasic triangle and half-sine signals are built using acceleration signals and then integrated to give the corresponding velocity profiles. Since there is no restitutive deceleration for these impulses, angular velocity eventually becomes constant after the acceleration returns to zero, contrary to the actual measured impulses (Yoganandan et al., 2008). In the case of kinematics-based injury metrics, the discrepancies in misidentifying concussive and subconcussive cases by HIC and RIC could be explained by these metrics' dependence on the shape of the acceleration impulse. In contrast, BrIC is only a function of peak angular velocity, and therefore exhibits less sensitivity to the shape of the impulse and the biphasic impulse's lack of restitutive deceleration (Figure 7).
In the case of brain angle metric, the biphasic impulse approximations show over five-fold errors compared to PCAbased impulses. This difference could be attributed to the simplification of the biphasic models that influences the solution of the mechanical lumped-parameter models. This discrepancy seems to affect the coronal direction the most and the sagittal direction the least for the biphasic approximations.
Similar to the brain angle results, brain finite element strains showed superior performance by our PCA-based approach. In previous publications, it has been shown that the closer the model is to the correct anatomical and morphological attributes of the brain, the more accurate the injury predictions become Zhao and Ji, 2019). Similarly, we see a decline in performance for the biphasic models as we include more region-specific and morphological details: from maximum principal strains (MPS) in the whole brain to MPS in corpus callosum and on to axonal fiber strains ( Table 3).
Several other points are worth noting based on our analysis. Relative with time domain, we observe differences in kinematics profiles in the three anatomical directions. It seems that the head experiences higher linear accelerations in the anterior-posterior direction (3.54 ± 2.97 m/s 2 ) compared to lateral (2.62 ± 2.25 m/s 2 ) and inferior-superior (3.01 ± 3.44 m/s 2 ) directions, and higher angular accelerations in the sagittal direction (190.53 ± 177.34 rad/s 2 ) compared to coronal (314.50 ± 401.53 rad/s 2 ) and axial (177.34 ± 128.16 rad/s 2 ) directions. This observation might be attributed to the type and direction of loading in the specific activity, e.g., direction of tackling in football, as well as anatomical features such as the neck constraint in those directions (Eckersley et al., 2017). In the frequency domain, there is a dominant low-frequency response (10 to 40 Hz) in the head kinematics, expressed by ∼ 90% of the total angular velocity response in the axial plane, ∼83% in the coronal plane and ∼74% in the sagittal plane, confirming previous findings on frequency dependence of head impacts (Wu et al., 2016;Laksari et al., 2018). These results could prove useful for designing better helmets and other safety devices to avoid brain injury by targeting specific low-frequency range of motion.
In summary, our proposed PCA decomposition approach not only provides a deeper understanding of the head's response during impacts, but also provides a formal basis for reconstructing and augmenting head impact kinematics data. Our current emulator is built upon the 537 measured on-field head impacts described above and successfully generates new kinematic data, whose features such peak or duration are nearly indistinguishable (see Supplementary Figure 7). It is expected that with more on-field measurements, we would be able to improve the performance of the emulator even further, but our convergence analysis showed the available 537 cases to be sufficient (see Supplementary Figure 8). This type of approach might prove necessary given the increased need for larger training data sets in modern machine learning algorithms.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

AUTHOR CONTRIBUTIONS
PA: study design, data analysis, and writing of the paper. KL: study design, data collection and analysis, and writing of the paper. HB and NT: study design and contribution to writing. All authors contributed to the article and approved the submitted version.