Intracranial Densitometry-Augmented Machine Learning Enhances the Prognostic Value of Brain CT in Pediatric Patients With Traumatic Brain Injury: A Retrospective Pilot Study

Background: The inter- and intrarater variability of conventional computed tomography (CT) classification systems for evaluating the extent of ischemic-edematous insult following traumatic brain injury (TBI) may hinder the robustness of TBI prognostic models. Objective: This study aimed to employ fully automated quantitative densitometric CT parameters and a cutting-edge machine learning algorithm to construct a robust prognostic model for pediatric TBI. Methods: Fifty-eight pediatric patients with TBI who underwent brain CT were retrospectively analyzed. Intracranial densitometric information was derived from the supratentorial region as a distribution representing the proportion of Hounsfield units. Furthermore, a machine learning-based prognostic model based on gradient boosting (i.e., CatBoost) was constructed with leave-one-out cross-validation. At discharge, the outcome was assessed dichotomously with the Glasgow Outcome Scale (favorability: 1–3 vs. 4–5). In-hospital mortality, length of stay (>1 week), and need for surgery were further evaluated as alternative TBI outcome measures. Results: Densitometric parameters indicating reduced brain density due to subtle global ischemic changes were significantly different among the TBI outcome groups, except for need for surgery. The skewed intracranial densitometry of the unfavorable outcome became more distinguishable in the follow-up CT within 48 h. The prognostic model augmented by intracranial densitometric information achieved adequate AUCs for various outcome measures [favorability = 0.83 (95% CI: 0.72–0.94), in-hospital mortality = 0.91 (95% CI: 0.82–1.00), length of stay = 0.83 (95% CI: 0.72–0.94), and need for surgery = 0.71 (95% CI: 0.56–0.86)], and this model showed enhanced performance compared to the conventional CRASH-CT model. Conclusion: Densitometric parameters indicative of global ischemic changes during the acute phase of TBI are predictive of a worse outcome in pediatric patients. The robustness and predictive capacity of conventional TBI prognostic models might be significantly enhanced by incorporating densitometric parameters and machine learning techniques.


INTRODUCTION
Pediatric traumatic brain injury (TBI) accounts for over 500,000 emergency department visits in the United States each year (1). The anatomical and physiological properties (e.g., thinner cranium, less myelinated tissue) of the pediatric brain can result in more rapid and severe development of secondary ischemic-edematous insults after head injury (2)(3)(4), and hence worse outcomes, than in adults (5). Given that cerebral edema is the fundamental pathophysiological mechanism underlying TBI, assessing the extent of cerebral edema may be crucial for evaluating the risk of intracranial hypertension and predicting outcomes (6).
Magnetic resonance imaging (MRI) is of significant diagnostic value for identifying pathological diffuse brain swelling; however, it requires children to be stationary and often sedated during a long acquisition time. Therefore, brain computed tomography (CT) remains the gold standard imaging modality during the acute phase of TBI for rapidly evaluating TBI and developing an appropriate intervention strategy (7,8). The degree of brain swelling is generally evaluated by CT classification systems [i.e., Marshall (9) and Rotterdam score (10)] based on the status of the mesencephalic cisterns or midline shift; this classification is easily applicable and is considered useful for predicting outcomes after TBI (9)(10)(11)(12). The efficacy of CT classification systems makes them important prognostic factors in well-known TBI prognostic models (i.e., Corticosteroid Randomization after Significant Head Injury [CRASH] or International Mission for Prognosis and Analysis of Clinical Trials [IMPACT] models) (13,14). However, the classification system relies on manual visual inspection of CT images by clinicians and hence has been criticized for its intrinsic inter-and intrarater variabilities (10,15).
It is well-known that there is a linear relationship between edema-induced water accumulation and Hounsfield unit (HU) Abbreviations: ANN, artificial neural network; cEs, cerebral edema score; CRASH, corticosteroid randomization after significant head injury; CT, computed tomography; dGx, proportions of normal-density gray matter; dWx, proportions of normal-density white matter; GBDT, gradient-boosted decision tree; GM, gray matter; HU, Hounsfield unit; IMPACT, international mission for prognosis and analysis of clinical trials; LOOCV, leave-one-out cross-validation; MRI, magnetic resonance imaging; PCA, principal component analysis; SHAP, Shapley additional explanations; SVM, support vector machine; TBI, traumatic brain injury; WM, white matter. values (16); a 1% increase in tissue water content causes a 2-3 HU reduction in attenuation on CT images (17). This dose-responsive relationship could be applied to quantitatively evaluate edematous changes by using HU distribution via densitometric CT analysis in patients with TBI (18). Accordingly, densitometric analysis could be an appropriate tool for evaluating the majority of early-stage pediatric TBIs that show no visually identifiable abnormalities on CT images. Moreover, by performing whole-cerebrum densitometric analysis in a fully automated manner, the degree of secondary edematous accumulation following TBI could be objectively measured without inter-or intrarater variability (18). These advantages of densitometric CT analysis could be exploited to construct a more robust prognostic model for TBI.
The past few years have seen a surge in attempts to use machine learning to construct prognostic models. Traditional logistic regression may have low robustness for explaining multivariate non-linear relationships (19), whereas machine learning can better apprehend non-linear relationships and interactions through more flexible modeling (20). Gradientboosted decision trees (GBDTs), a widely used machine learning algorithm, produce an interpretable prognostic model that is an ensemble of decision trees, which are in high demand in the medical domain (21). Among GBDT variants, CatBoost has recently been introduced and has shown notable robustness and highly accurate generalizability (22). This study hypothesized that combining CT densitometry and a machine learning technique (i.e., CatBoost) would enhance the prognostic value of brain CT with a more robust prognostic model for pediatric TBI. The objectives of this study are 2-fold: (1) to investigate the association between intracranial densitometry based on brain CT and various outcomes in pediatric TBI patients and (2) to evaluate the prognostic value of brain CT by constructing a densitometry-augmented TBI prognostic model based on a robust machine learning method.

Study Design and Setting
This retrospective pilot study investigated the relationship between intracranial densitometry based on brain CT and the outcome of pediatric patients with TBI. Furthermore, TBI prognostic models were constructed using the CatBoost model (22), a cutting-edge gradient boosting algorithm optimized in a small dataset and robust to overfitting, based on intracranial densitometric information. The prognostic models were evaluated through leave-one-out cross-validation (LOOCV), which is appropriate for small datasets. An overview of this study is shown in Figure 1.
The setting of this study is at a single trauma center (i.e., the Trauma Center of Seoul National University Hospital). Basic clinical information of TBI patients, such as brain CT images and Glasgow Coma Scale (GCS) scores, was obtained during the same period. Anonymized clinical information obtained on admission to the emergency room, radiology reports, and brain CT images were retrospectively reviewed and collected from the institutional database of Seoul National University Hospital from 2013 to 2017. This study was approved by the Ethics Committee of Seoul National University Hospital (IRB H-1706-144-862).
The requirement for informed consent was waived due to the retrospective nature of this study.

Study Population
The subjects were enrolled according to the following inclusion criteria: (a) direct admission for TBI through the emergency medicine department (not transferred from another hospital); (b) age ≤ 19 years; (c) underwent non-enhanced brain CT scan; and (d) eligible CT acquisition conditions (tube voltage = 120 kVp, tube current ≥ 150 mA) for analysis of HU values (23). Additionally, the following exclusion criteria were applied: (a) significant image artifacts (e.g., beam-hardening effects) in the CT images and (b) outcome unrelated to traumatic head injury (e.g., acute respiratory failure, myocardial contusions). Furthermore, patients with significant infratentorial hemorrhage were excluded because even a small infratentorial lesion may be fatal, which would significantly affect the outcomes and skew the data (24).
These criteria were entered into the institutional clinical data warehouse of Seoul National University Hospital, SUPREME R . Consequently, a total of 58 pediatric TBI patients who underwent head CT examinations were retrospectively included in this pilot study. Enrolled subjects were directly transferred to the trauma center. The Glasgow Outcome Scale (GOS) score was recorded at the time of discharge. Additionally, alternative outcome measures such as in-hospital mortality, length of stay (LOS) and need for surgery were used as reference standards.

Densitometric Analysis of Brain CT
Densitometric CT analysis utilizes HU values obtained by brain CT to derive intracranial densitometric data as a distribution representing the material density. It can provide a quantitative evaluation of brain density alterations caused by edema-induced water accumulation without inter-or intrarater variability. This study utilized the methods proposed by Kim et al. (18), which allow the quantitative derivation of intracranial densitometric data of the whole cerebrum. To implement the method, inhouse software was written in Java (Oracle, Inc., Redwood Shores, California, USA), providing a fully automatic method for the densitometric analysis of head CT images. To analyze only the major intracranial components (e.g., cerebrospinal fluid, parenchyma, blood), a threshold limit of 0-79 HU was applied (18). The intracranial densitometry was derived from Equation 1: where λ k is the number of pixels having an HU value of λ from the k th CT image in a series of CT scans showing the supratentorial brain region, the denominator of the equation is the entire number of pixels having an HU value of 0 to 79 in the whole cerebrum, and p(λ) is the proportion of pixels having an HU value of λ in the whole cerebrum. The graph of the intracranial densitometric data was derived by plotting the p(λ) in the range of 0-79 HU.
In this study, densitometric analysis was based on initial brain CT scans at admission and follow-up CT scans acquired within 48 h of initial CT scans. Brain CT scans were obtained with a Brilliance 64 scanner (Philips Medical Systems, Eindhoven, Netherlands). The CT acquisition parameters were as follows: tube voltage = 120 kVp, tube current ≥ 150 mA, and image matrix = 512 by 512, with a 5-mm slice thickness. Follow-up CT scans were also performed with the same acquisition settings.

Quantitative Evaluation of Intracranial Densitometry
There are two paradigms to quantitatively evaluate intracranial densitometry: one for evaluating the proportion of a specific HU region and the other for evaluating the morphology of the HU distribution.
In evaluating the proportion of a specific HU region, three parameters, dWx, dGx, and cEs, proposed in previous studies were used in the study (18,25). Excessive water accumulation caused by cerebral edema or ischemia lowers the material density of the brain parenchyma and consequently affects both white matter (WM) and gray matter (GM). In addition, the presence of space-occupied lesions with high density (e.g., subdural, epidural, and intracerebral hemorrhage) also decreased the relative proportion of normal density parenchyma in the cranium. Accordingly, the proportions of normal-density WM (dWx) and GM (dGx) were quantitatively assessed. dWx and dGx were defined as the proportions of pixels of 26-30 and 31-35 HU, respectively, among pixels depicting the entire cerebrum (25). In addition, Kim et al. suggested the use of the cerebral edema score (cEs), ranging from 17 to 24 HU, as an indicator of cerebral edema severity to compensate for the CT classification system (18).
The presence of a hypodense lesion (e.g., ischemic-edematous lesion) or hyperdense lesions (e.g., intracranial hemorrhage) in the intracranial area can contribute to the left-and right-sided dominance of the densitometry, respectively. Such pathological brain changes can be identified by assessing the intracranial densitometric morphology, which was evaluated as the proportional HU distribution by calculating µ, skewness and kurtosis, where µ is the mean HU value of the distribution, and skewness and kurtosis are the measure of the asymmetry and the tail of the distribution, respectively.
Intracranial densitometry reveals pathological changes in the material density of the whole cerebrum. Thus, it may be suggestive of various TBI pathologies (e.g., cerebral ischemia, edema, intracranial hemorrhage).  (Figure 2A), the patient with massive brain edema following severe ischemic insults ( Figure 2B) showed a leftward shift of the densitometry center with an increase in both skewness and kurtosis due to excessive water accumulation. On the other hand, in the patient with a large degree of acute subdural hemorrhage (Figure 2C), the center of the densitometry shifted to the right due to the space-occupied lesion. In addition, both Figures 2B,C show that the proportion of parenchyma within the normal density range was also affected by the occurrence of intracranial pathology.
Lesions with partially overlapping HU ranges (e.g., subacute subdural hemorrhage) may skew the interpretation of the densitometric parameters, but only patients during the acute stage were involved in this study.

Prognostic Model Construction
Prognostic models were constructed based on the four outcome measures (i.e., favorability, mortality, LOS and need for surgery) with CatBoost (22), which is a state-of-the-art gradient-boosted decision tree. CatBoost mitigates the overfitting problem and shows robustness and generalizability. In addition, CatBoost has various advantages of (1) having a swift training speed through parallel processing, (2) being appropriate for small sample sizes and unbalanced data, and (3) exhibiting higher accuracy than other gradient boosting algorithms. The IMPACT model was excluded from this study because the IMPACT model does not consider patients under the age of 14 years (14).
In this study, a total of three types of prognostic models were constructed. (1) CatBoost-based CRASH-CT (i.e., CRASH-CT CatBoost ) is the model based on admission characteristics (i.e., age, GCS, pupil reaction, and presence of extracranial injury) and initial CT findings (i.e., presence of petechial hemorrhage, obliteration of the third ventricle or basal cisterns, subarachnoid hemorrhage, midline shift, and non-evacuated hematoma). These are the same input of the conventional CRASH-CT model (13). (2) The densitometry-augmented model (i.e., D/CRASH-CT CatBoost ) was constructed with the intracranial densitometric information (i.e., dWx, dGx, cEs, µ, skewness, kurtosis, and HU distribution) and the CRASH-CT input. (3) The reference model was the conventional CRASH-CT model (i.e., CRASH-CT LR ) established by logistic regression.
Prognostic models for outcome favorability were derived from both initial and follow-up CT to assess the changes in prognostic value during the acute phase of TBI. Undoubtedly, the model based on the follow-up CT data uses input variables derived from the follow-up CT. On the other hand, prognostic models for mortality, LOS, and need for surgery were all established only based on initial CT data at admission.
Principal component analysis (PCA) was used to reduce the dimensions of the input variables that still contain most of the information to alleviate the dimensionality and overfitting problems that can occur in machine learning procedures. Accordingly, the high dimensions of admission characteristics, initial CT findings, and HU distribution were reduced by PCA.
SHAPs (Shapley Additional exPlanations) of the tree ensemble model were derived to assess the importance of the model input variables. Based on this, it is possible to conveniently examine the importance of individual input variables used in the CatBoostbased prognostic model and to determine how the input variables contribute to the prediction of the outcomes.
The predictive performance of all prognostic models was assessed by LOOCV (Figure 1), which results in unbiased and reliable estimates of model performance; iterative validations were performed 58 times by separating training (N = 57) and testing subjects (N = 1). These machine learning models were developed by scikit-learn 0.24.1 and CatBoost library 0.25.1 in the Python 3.7 environment.

Statistical Analysis
The calculation of the sample size was performed by Viechtbauer's formula developed for the pilot study (28); the calculated sample size was 45, with a confidence level of 0.90 and a probability of 0.05. Considering that the dropout rate was 30%, 58 subjects were enrolled in this study. Non-parametric statistical methods were employed due to the sample size. The Mann-Whitney U-test was applied to compare continuous data between outcome groups. The discrimination of the prognostic models was assessed by receiver operating characteristic (ROC) curve analysis. The optimal cutoff values for discriminating outcomes were calculated using the maximal Youden's J statistic (sensitivity + specificity -1) (29) in the ROC curve. Hanley's test (30) was performed to compare the area under the ROC curve (AUC). In this study, four reference standards against AUC of all prognostic models were used: (1) dichotomous outcome measured at discharge [i.e., GOS score 1-3 (positive) vs. 4-5 (negative)], (2) in-hospital mortality [i.e., deceased (positive) vs. survived (negative)], (3) LOS in hospital [i.e., > 1 week (positive) vs. ≤ 1 week (negative)], and (4) need for surgical intervention [i.e., underwent TBI-related surgery during hospitalization (positive) vs. no need for surgical procedure (negative)]. In this study, there was no need to reassign the LOS outcome of the deceased patients because no patients died in the group with LOS of <1 week (31). The sensitivity, specificity, positive predictive value, and negative predictive value with 95% confidence interval of each prognostic model were determined at the optimal cutoff. The analyses were considered statistically significant at two-sided p < 0.05. Statistical analyses were conducted using commercial software (SPSS 24, IBM Corp., Chicago, Illinois, USA).

Demographics
Fifty-eight pediatric TBI patients were included in this study. Of the 58 patients, 46 (79.3%) were assigned to the favorable outcome group, and 12 (20.7%) were assigned to the unfavorable outcome group. The detailed demographics are listed in Table 1.

Changes in Intracranial Densitometry at the Acute Phase of Pediatric TBI
For initial CT acquired at admission and follow-up CT acquired within 48 h later, the changes in intracranial densitometry according to the outcome were evaluated. Figure 3A shows the density distribution obtained from the initial CT scan at admission, which skewed to the right in the unfavorable outcome group compared to the density distribution of the favorable outcome group, indicating brain density alteration in the acute phase of TBI. These morphological disagreements resulted in significant differences between the outcome groups in the specific HU range. Contrary to the intracranial densitometry of the favorable outcome relatively analogous to the normal distribution, the skewed distribution of the unfavorable outcome became more distinguishable in the follow-up CT ( Figure 3B). This deformation suggests that the change in brain density due to secondary insults became more substantial.
The intracranial densitometric morphology was further assessed through densitometric parameters in a quantitative manner. Table 2 shows the densitometric parameters in both outcome groups. Among the proportional densitometric parameters from the initial CT, cEs showed a significant difference, whereas dWx and dGx, indicating the proportion of normal-density WM and GM, showed no significance. Nevertheless, only the dGx from the follow-up CT showed a significant difference. As the intracranial morphology of the unfavorable outcome group showed a distorted distribution, all of the morphological densitometric parameters differed significantly between outcome groups both in initial and followup CT. On the other hand, the CT classification systems (i.e., Marshall and Rotterdam score) did not significantly distinguish outcome favorability in either initial or follow-up CT. The predictive performance of the prognostic models constructed from the variables acquired during the acute phase was assessed by LOOCV based on the outcome favorability ( Figure 4). As the reference model, the CRASH-CT model   Figure 4B). Accordingly, the difference in the AUC between the CRASH-CT LR and the D/CRASH-CT CatBoost became more significant (p < 0.02 by Hanley's test) despite the smaller number of subjects. The performance of the prognostic models at the optimal cutoff point for predicting the outcome was further evaluated ( Table 3).
The feature importance of the input variable and SHAP of the tree ensemble of the best model from the initial CT, D/CRASH-CT CatBoost , was derived ( Figure 5A). In the D/CRASH-CT CatBoost model, morphological parameters (i.e., kurtosis, skewness, and µ) and dGx were the main contributors to the prediction. As kurtosis and skewness increased, the prediction of unfavorable outcomes increased. Additionally, lower µ and dGx values contributed to the prediction of unfavorable outcomes. D/CRASH-CT CatBoost from the followup CT also showed that dGx and skewness mainly contributed to the outcome prediction (Figure 5B), which is similar to the model from the initial CT. This may reflect the shift in the center of intracranial densitometry toward lower HUs with morphological distortion in patients with an unfavorable outcome.

Comparative Assessment of Intracranial Densitometry According to Alternative TBI Outcome Measures
Intracranial densitometry acquired at admission was further compared for alternative outcome measures for TBI (i.e., inhospital mortality, LOS, and need for surgical intervention). Intracranial densitometry in the deceased group showed considerable skewness, suggesting that low-attenuated pixels dominate ( Figure 6A). On the other hand, when the groups were dichotomized based on the LOS (1 week), the intracranial densitometry of the worse outcome group was transversely shifted to the left without noticeable deformation ( Figure 6B). However, there was no association between need for surgery and intracranial densitometry; the morphology between the outcome groups (i.e., need for surgery vs. no surgery) was indistinguishable ( Figure 6C). In addition, Table 4 further describes the densitometric parameters and CT classifications that depend on the three outcome measures. Outcome groups based on in-hospital mortality and LOS revealed significant morphological parameters, although proportional parameters and CT scores showed limited significance.

Prognostic Model for Alternative TBI Outcome Measures Augmented by Intracranial Densitometry
For the three alternative outcome measures, the performance of the prognostic models was evaluated through LOOCV (Figure 7). Figure 7A shows exceptionally superior performance of the D/CRASH-CT CatBoost than that of the CRASH-CT LR for predicting in-hospital mortality (p < 0.018 by Hanley's test). On     the other hand, prognostic models for LOS prediction showed consistent capacity without a significant difference between the prognostic models ( Figure 7B). The prognostic ability of the model to predict the need for surgery showed a moderate AUC of 0.71. Table 5 indicates detailed statistical measures for the performance of the prognostic model. The feature importance and SHAP of the tree ensemble of the D/CRASH-CT CatBoost were derived based on the three outcome measures (Figure 8). The in-hospital mortality prediction model possessed the skewness of intracranial densitometry as a significant contributor, but in the models predicting LOS or need for surgery, the conventional CRASH-CT input, not the densitometric parameter, made the most dominant contribution to the prediction. This result was related to the weak statistical significance of the input variables from the intracranial densitometry.

DISCUSSION
The efficacy and practicality of the conventional CT classification system for evaluating injury severity after TBI have long been acknowledged; however, this method suffers from high inter-and intrarater variability (10,15). Non-etheless, the system has been incorporated into well-known prognostic models for TBI, which may significantly affect the robustness of the models. This study utilized a recently developed interpretable machine learning model (CatBoost) to build a TBI prognostic model and employed CT densitometry, which is fully automated and thus does not suffer from inter-and intrarater variability (18), to compensate for the subjective CT classification system. The results indicate that (1) densitometric parameters are independently associated with various TBI outcome measures (i.e., favorability, mortality, and LOS) but not the need for surgery, (2) the prognostic capacity of the conventional CRASH model could be enhanced by employing CatBoost rather than traditional logistic regression, and (3) the capacity can be further increased by supplementing densitometric parameters as model inputs. The novelty and importance of the utilized methods and derived findings of this study warrant a detailed discussion.

The Rationale of Densitometric CT Analysis in TBI Pathologies
The inter-and intrarater variability of conventional CT classification systems mainly stems from the wide heterogeneity of injury types and severity (15) and the dependence on arbitrary estimations based on visual inspection (10). Densitometric CT analysis aims to overcome such limitations by quantitatively evaluating the density of brain structures and/or lesions based on brain CT scans. The technique has been widely applied in evaluations of various pathological brain changes [e.g., early cerebral edema (18,32) or ischemic changes (33), lesion water uptake (34), parenchymal compression (35), and hemorrhage growth (36)] and for predicting outcomes (18,25,37,38) in acquired brain injury. There are two approaches to applying densitometry: region of interest-based analysis (33)(34)(35)(36)38) and whole-cerebrum analysis (18,25,37). This study adopted the  latter approach (1) to ensure the robustness of the prognostic model to be constructed and (2) because it was originally designed to be applied for pediatric TBI (18). Whole-cerebrum analysis-based intracranial densitometry can respond to two major intracranial pathologies of TBI: ischemic-edematous insults and intracranial hemorrhage. However, there is a discrepancy in the sensitivity of intracranial densitometry for the two pathologies. According to volumetric CT measurements, the proportion of parenchyma in pediatric subjects accounts for 93-94% of the cranium (39). Undoubtedly, parenchyma also occupies the equivalent proportion in intracranial densitometry. Therefore, intracranial densitometry derived from whole-cerebrum analysis benefits from the sensitive response to diffuse ischemic injury that contributes to global changes in parenchymal density, despite subtle alterations (18,25). Of course, the significant amount of hyperdense lesion (e.g., crescent-shaped acute subdural hemorrhage) also markedly expanded the proportion of the intracranial densitometry area above 40 HU (Figure 2C). On the other hand, whole-cerebrum densitometry shows a limited response to focal lesions occupying a relatively small proportion. For example, intracranial densitometry may be less sensitive for focal lesions such as petechial hemorrhages or focal brain ischemia. Accordingly, consideration of focal lesions in the whole-cerebrum densitometric analysis may be restricted, and this study thus focused on the interpretation of TBI pathologies involved in global parenchymal changes.
Quantitative Assessment of Secondary Ischemic-Edematous Insults in the Acute Phase of TBI Intracranial densitometry offers a wide variety of quantitative parameters, i.e., morphological parameters (µ, skewness, and kurtosis) and proportional parameters (dWx, dGx, and cEs). The morphological parameters are derived from a whole-cerebrum density distribution (Figure 3). Based on the dichotomized outcome favorability, the averaged density distribution of patients with an unfavorable outcome was determined to be significantly right-skewed in distribution with a leftward shift of the center compared to that of patients with a favorable outcome. The distortion of the intracranial densitometry in the unfavorable outcome group became further apparent in the follow-up CT. These morphological characteristics are reflected as an increase in skewness and a decrease in µ, respectively, and suggest that the overall density of the parenchyma was reduced due to a subtle global ischemic change after TBI (37). The unfavorable outcome group also showed a higher kurtosis value in their density distributions than the favorable outcome group. Kurtosis, which has often been wrongfully interpreted as a measure of the "peakedness" of distribution graphs, is actually a measure of outliers (40). Localized hyper-or hypodense lesions result in significant increases in a specific range of HU values, i.e., they are outliers that contribute to increased kurtosis of the density distribution ( Table 2). Although the statistical power was limited due to the small number of subjects in the follow-up CT subgroup, this group exhibited the same trend as the outcome group in terms of the morphological parameters from the initial CT. These morphological parameters were the main contributors to the outcome prediction of the proposed model ( Figure 5). Unlike morphological parameters, which mainly reflect whole-cerebrum density distribution, proportional parameters reflect the density distribution of specific, major intracranial entities, and can be interpreted in relation to the pathophysiology of cerebral edema. Cerebral edema is divided into vasogenic and cytotoxic types. In general, cytotoxic edema affects both WM and GM (41), whereas vasogenic edema primarily affects WM and easily spreads to other locations via WM tracts (42). In the acute phase of TBI, vasogenic, and cytotoxic edema often coexist (43). Thus, reduced brain tissue density mediated by cerebral edema can be simultaneously reflected by lower WM and GM density values on CT. The increased proportion of hypodense pixels in CT images is reflected as an increase in the cEs (18), whereas changes in WM and GM densities are reflected as changes in dWx and dGx (37). Intriguingly, the conventional statistical analysis indicated that only cEs was significant in differentiating favorable and unfavorable outcome groups based on the initial CT ( Table 2), whereas the machine learning model indicated that dGx had the highest importance among the proportional parameters, followed by cEs and dWx ( Figure 5). Nonetheless, inferential statistics are influenced by effect size or sample size (44), and statistical significance does not guarantee high feature importance in the prognostic model. Indeed, dGx, the proportion of normal-density GM, which did not differ significantly between the two outcome groups, was considered a significant predictor of the outcome by the utilized machine learning model. Nevertheless, in follow-up CT within 48 h, dGx showed a significant difference between the outcome favorability. Despite the shortage of statistical power, this would suggest that brain density alternation even affected the relative proportion of GM with normal density.
In addition to the densitometric analysis, quantitative CT classification systems (i.e., Marshall and Rotterdam score) have been evaluated for use in assessing the severity of TBI. In pediatric TBI, the Rotterdam score has better discriminatory power than that of Marshall (45). However, neither the Marshall nor the Rotterdam scores classified outcomes well in our cohort (Tables 2, 4), which may be attributed to the cohort characteristics of this pilot study. Originally, the Rotterdam score was developed for only moderate or severe TBI, excluding mild head injury (10). Liesemer et al. reported that although the Rotterdam score was initially developed for use in the adult population, the prognostic function worked well in the pediatric TBI population (46); however, ∼80% of the subjects had moderate or severe TBI (GCS, [3][4][5][6][7][8]. A study in which the Marshall score showed adequate prognostic ability in pediatric TBI was also composed of moderate or severe TBI in 60% of the cohort (45). Consequently, the conventional CT classification systems could not work properly in this study, where ∼70% of mild TBI patients are composed.

Responsiveness of Intracranial Densitometry to Alternative Outcome Measures
Mortality, LOS, and the need for surgical procedures are primary outcome measures for pediatric TBI in clinical practice (47). In addition to outcome favorability, we investigated how intracranial densitometry responds to these outcome measures. LOS was closely associated with mortality in TBI patients (48); nevertheless, mortality is the worst consequence of TBI. Therefore, undoubtedly, the most sensitive response of intracranial densitometry was mortality. The averaged densitometry between the survival and deceased groups showed the most distinguishable difference among the alternative outcome measures. This finding suggests that deceased patients enter an irreversible state in which the brain densitometry was significantly different from the normal state. On the other hand, the morphology of intracranial densitometry was relatively similar between groups based on the LOS, a less severe outcome. However, the dichotomized LOS revealed a significant difference in the GM-related HU range. In contrast to mortality and LOS, no statistically significant difference in intracranial densitometry was observed between the groups classified by surgery necessity. This non-significance implies that, as mentioned above, the focal lesions that require surgical intervention may have a limited effect on the deformation of whole-cerebrum densitometry.

Machine-Learning-Based Prognostic TBI Model Augmented by Densitometric Information
This study proposed a novel prognostic model based on GDBTs augmented with CRASH-CT and intracranial densitometric information. Based on various outcome measures, the proposed models (i.e., D/CRASH-CT CatBoost ) achieved enhanced prognostic capacity compared with the conventional model; this consistent improvement can be contributed by a combination of (1) densitometric information and (2) machine learning models. Intracranial densitometry is a quantitative input variable that responds to global pathological changes in the intracranial region (18,25). The proposed model included additional densitometric information, enhancing prognostic values, especially for GOSrelated outcome measures (Figures 5, 6A). On the other hand, in terms of LOS and surgery necessity, CRASH-CT input variables (e.g., age, GCS, pupil reaction, extracranial injury, CT findings) played a more substantial contribution than the densitometric information (Figures 6B,C). In this case, it can be assumed that machine learning itself contributed to increasing the prognostic value rather than adding the densitometric variables. Logistic regression does not consider the correlation between the input variables and has multicollinearity problems (49), whereas CatBoost can lead to better performance by reducing information loss by creating a combination considering the correlation between the input variables (50).
The significant factors contributing to the decreased prognostic capacity of the conventional CRASH-CT model are 2-fold, namely, interrater variability, and validation method. The CRASH-CT model is a widely used prognostic TBI model (13), and its variables are based on the injury status and initial CT findings. The radiological findings used as the input of the CRASH-CT model are based on the Marshall CT classification, which has been reported to have ∼12.7% interrater variability (51). Interrater variability changes the results of a prognostic model and thus lowers its reliability. In addition, CRASH-CT yielded low accuracy for pediatric TBI patients in this study, unlike reports in the literature (25,52). It can be assumed that it is not properly fitted to predict the outcome at discharge since the original CRASH-CT model predicts outcome favorability after 14 days (13). In addition, the LOOCV method, a stricter validation method than others reported in the literature (25,52), also contributed to the low prognostic capacity of the CRASH-CT model. Previous studies (13,25,52) in which the training and testing of logistic regression models were performed with the same cohort without external validation have the potential to overestimate the prognostic capacity, and it is complicated to resolve the overfitting problem oriented to the cohort. In this study, individual and iterative validations were performed 58 times by separating training and testing subjects through LOOCV. Despite the more rigorously evaluated results, improvements in prognostic value were observed by adding densitometric information to the CRASH-CT input. A significantly enhanced prognostic value could be achieved compared with CRASH-CT when only initial brain CT data were used, suggesting that machine learning-based automated densitometry is a useful prognostic tool that can minimize unnecessary radiation exposure in children with TBI.
In several recent studies, prognostic models using machine learning have been proposed to predict the outcome favorability of patients with pediatric TBI more accurately (53)(54)(55) (54). Tunthanatip et al. compared the prognostic value of various machine learning models using comorbidity and radiological finding information and concluded that the SVMbased model showed the highest performance (55). These studies consistently reported that machine learning outperformed conventional logistic regression (53)(54)(55), and the same results were obtained in the present study. However, unlike the GBDT used in this study, the ANN and SVM models used in previous studies are "black box" models that are difficult to interpret (56). Interpretability of the model is an important issue when using machine learning in the medical domain (21). A black box-based prognostic model that lacks an explanation of how much an input variable contributed to the prediction may have limited application in the clinical environment (57). It is complicated to convince a clinician of the results of a model when they are presented without any explanation. On the other hand, the proposed GBDT-based model showed reliable prognostic capacity and could explain how the input variables contribute to the prediction. Consequently, unlike the previous ML-based prognostic models for pediatric TBI, which entail a trade-off between interpretability and prognostic capacity (53)(54)(55), the proposed model accomplished both.

Limitations and Suggestions
Several limitations should be considered. First, this study was a single-center, retrospective pilot study with a small cohort size. The small cohort size hampered the appropriate distribution of TBI severity; mild TBI patients were the most dominant in this study. Thus, the proposed model should be validated by a large-scale dataset, and a prospective, multicenter study is required for generalization of the model proposed in this study. Second, this study used only the in-hospital outcome measures because there was no long-term outcome information in our institutional database. The model's prediction of long-term outcomes should also be validated. Third, except for intracranial densitometry, elaborative modalities for assessing cerebral edema in TBI patients were not used in this study. The extent of cerebral edema would have been better assessed and cross-validated using MRI (e.g., diffusion-weighted imaging and apparent diffusion coefficient mapping). In future validation studies, the benefits of the proposed method will be investigated to address the following issues: (1) changes in physician decision-making as the result of utilizing densitometric analysis; (2) estimation of other intracranial pathologies (e.g., intracranial hypertension); and (3) prognostic capacity of the proposed models for predicting long-term outcomes.

CONCLUSION
This study revealed that intracranial densitometric information derived from initially acquired brain CT scans performed at admission was highly associated with worse outcomes in pediatric TBI patients. Contrary to the conventional TBI prognostic model (i.e., CRASH-CT model), which mainly uses arbitrary measures (i.e., Marshall classification) that suffer from inter-and intrarater variability, fully automated densitometric analysis of the whole cerebrum was supplemented in the construction of the prognostic model. Accordingly, the prognostic value of brain CT was significantly enhanced by augmenting densitometric information with a cuttingedge GBDT-based machine learning model. In conclusion, intracranial densitometry information could improve the reliability of brain CT-based clinical decision-making during the acute phase of TBI and may serve as the basis for enhancing the TBI prognostic model.

DATA AVAILABILITY STATEMENT
The datasets presented in this article are not readily available because sharing data outside is not available according to the policy of our institution. Requests to access the datasets should be directed to http://hrpp.snuh.org/.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Ethics Committee of Seoul National University Hospital (IRB H-1706-144-862). Written informed consent from the participants' legal guardian/next of kin was not required to participate in this study in accordance with the national legislation and the institutional requirements.

AUTHOR CONTRIBUTIONS
Y-TK drafted and revised the manuscript, carried out clinical data analysis and interpretation, and contributed to the design of the study. HK critically revised the manuscript and contributed to the design of the study. C-HL performed prognostic modeling via machine learning. C-HL, BY, JK, YC, W-SC, and B-MO critically revised the manuscript. B-MO performed data collection. D-JK conceptualized and designed the study, critically revised the manuscript, and oversaw the creation of the final manuscript. All authors approved the final manuscript as submitted and agree to be accountable for all aspects of the work.