Abnormal structural and functional network topological properties associated with left prefrontal, parietal, and occipital cortices significantly predict childhood TBI-related attention deficits: A semi-supervised deep learning study

Introduction: Traumatic brain injury (TBI) is a major public health concern in children. Children with TBI have an elevated risk of developing attention deficits. Existing studies have found that structural and functional alterations in multiple brain regions are linked to TBI-related attention deficits in children. Most of these studies have used conventional parametric models for group comparisons, which have limited capacity to deal with large-scale, high-dimensional neuroimaging measures that have unknown nonlinear relationships. Moreover, none of these findings have been successfully translated into clinical practice for guiding diagnoses and interventions of TBI-related attention problems. Machine learning techniques, especially deep learning techniques, are able to handle multi-dimensional and nonlinear information to generate more robust predictions. Therefore, the current research proposed to construct a deep learning model, a semi-supervised autoencoder, to investigate the topological alterations in both structural and functional brain networks in children with TBI and their predictive power for post-TBI attention deficits.

Methods: Functional magnetic resonance imaging data acquired during a sustained attention processing task and diffusion tensor imaging data from 110 subjects (55 children with TBI and 55 group-matched controls) were used to construct the functional and structural brain networks, respectively. A total of 60 topological properties were selected as brain features for building the model.

Results: The model was able to differentiate children with TBI and controls with an average accuracy of 82.86%. Functional and structural nodal topological properties associated with left frontal, inferior temporal, postcentral, and medial occipitotemporal regions served as the most important brain features for accurate classification of the two subject groups. Post hoc regression-based machine learning analyses in the whole study sample showed that, among these most important neuroimaging features, those associated with the left postcentral area, superior frontal region, and medial occipitotemporal regions had significant value for predicting elevated inattentive and hyperactive/impulsive symptoms.

Discussion: Findings of this study suggest that deep learning techniques may have the potential to help identify robust neurobiological markers for post-TBI attention deficits, and that the left superior frontal, postcentral, and medial occipitotemporal regions may serve as reliable targets for diagnosis and interventions of TBI-related attention problems in children.


Introduction
Traumatic brain injury (TBI) is a major public health concern. For children in the United States, TBI-related emergency department visits exceed 600,000 every year (Dewan et al., 2016). Children with TBI have an elevated risk of developing neurocognitive impairments and behavioral abnormalities (Konigs et al., 2015; Polinder et al., 2015; Lumba-Brown et al., 2018). Significant attention deficits are among the most common cognitive consequences and can be observed in more than 35% of children two years post-TBI (Max et al., 2005). The attention problems in children post-TBI can persist into late adolescence and have been linked to the development of severe psychopathology and impairments in overall functioning (Le Fur et al., 2019; Narad et al., 2019). In the absence of established neurobiological signatures, treatments and interventions for TBI-related attention deficits in children have been based on subjective observations from clinicians and have resulted in suboptimal efficacy (Backeljauw and Kurowski, 2014; Kurowski et al., 2019; LeBlond et al., 2019).
In the past two decades, a number of clinical and neuroimaging studies have investigated the neuroanatomical and functional substrates associated with TBI-related attention problems in children. Several diffusion tensor imaging (DTI) studies reported that white matter integrity in the corpus callosum, superior longitudinal fasciculus, and inferior fronto-occipital fasciculus was linked with impaired attention function in children with chronic TBI (Wozniak et al., 2007; Kurowski et al., 2009; Dennis et al., 2015; Konigs et al., 2018). Task-based functional magnetic resonance imaging (fMRI) studies have also reported functional alterations in frontal, parietal, and occipital regions during inhibition and sustained attention processes (Kramer et al., 2008; Tlustos et al., 2011, 2015; Strazzer et al., 2015).
It is a foundational principle of neuroscience that human brain regions do not work in isolation. The existing voxel- and region-of-interest (ROI)-based studies have limitations in addressing how, at the systems level, certain brain regions are vulnerable to TBI and contribute to related cognitive and behavioral consequences. Graph theoretical technique (GTT)-based approaches have been increasingly applied to human brain imaging data to construct structural and/or functional brain networks at the systems level, and to characterize network integration, segregation, centrality, and small-worldness at both the global and regional (sub-network) scales (Bullmore and Sporns, 2009). Studies have reported that children with TBI demonstrated a less integrated structural or functional brain network compared to healthy controls (Caeyenberghs et al., 2012; Konigs et al., 2017; Yuan et al., 2017; Botchway et al., 2022; Ware et al., 2022). Our recent GTT-based studies in both DTI and task-based fMRI data reported that, compared to group-matched typically developing children (TDC), children with diagnosed TBI-related attention deficits (TBI-A) had significant regional topological alterations associated with frontal, parietal, and temporal lobes in both structural and functional networks, and that the altered regional topological properties associated with parietal and temporal regions were significantly linked to elevated inattentive symptoms in children with TBI-A (Cao et al., 2021a,b). These existing studies suggest that TBI-related attention deficits in children are closely related to systems-level functional and structural abnormalities associated with multiple brain regions. However, all these studies adopted conventional parametric models (such as the t-test and analysis of variance) for group comparisons, which have very limited capacity to deal with large-scale and nonlinearly related neuroimaging measures.
Compared to conventional parametric models, machine learning techniques are able to learn the joint effects of measures in a high-dimensional space and are sensitive to subtle information that has high discriminative/predictive power (Nielsen et al., 2020). When aided by feature selection and cross-validation methods, machine learning techniques can deliver efficient and robust classifications between different groups. A few existing studies in children with TBI have applied machine learning techniques. By constructing a classification model using a support vector machine (SVM) and edge density images, one study was able to differentiate 14 children with TBI and 10 controls with an area under the receiver-operating-characteristic curve (AUC) of 0.94 (Raji et al., 2020). Another study built an SVM-based classification model using structural MRI data and DTI data from 29 student athletes (aged from 15 to 20 years) and 27 controls and achieved an AUC of 0.84 (Tamez-Pena et al., 2021). A longitudinal study reported that, when combining resting-state MRI data and structural MRI data in 99 children with TBI at 4 weeks after the injury, an SVM algorithm was able to predict the recovery of post-concussion symptoms at 8 weeks with an AUC of 0.86 (Iyer et al., 2019). However, the majority of these machine learning studies in children with TBI applied supervised models that focused only on discriminating the labels of the two diagnostic groups, and none of these studies attempted to detect the neurobiological features associated with the most common TBI-related cognitive deficits.
In this study, we propose to utilize a deep learning technique, the semi-supervised autoencoder, to identify robust functional and structural brain signatures of TBI-related attention deficits in children. Deep learning techniques are highly effective in generating feature representations by learning the deep linear or nonlinear relationships within a high-dimensional space of the study measures (LeCun et al., 2015). Based on results of previous studies from our and other teams (Wozniak et al., 2007; Kramer et al., 2008; Kurowski et al., 2009; Tlustos et al., 2011; Dennis et al., 2015; Strazzer et al., 2015; Tlustos et al., 2015; Konigs et al., 2018; Cao et al., 2021a,b), we hypothesize that topological anomalies associated with frontal, parietal, and temporal regions in the functional and structural brain networks not only play the most important role in characterizing children with TBI when compared to controls, but also contribute most significantly to TBI-related attention deficits in the affected children.

Participants
A total of 110 children, including 55 children with TBI and 55 group-matched controls, were initially involved in this study. The TBI subjects were recruited from the New Jersey Pediatric Neuroscience Institute, Saint Peter's University Hospital, and local communities in New Jersey. Controls were solicited from the local communities by advertisement in public places. The study received institutional review board approval at the New Jersey Institute of Technology and Saint Peter's University Hospital. Prior to the study, all the participants and their parents or guardians provided written informed assent and consent, respectively.
The inclusion criteria for the TBI group were: (1) a history of at least one clinically diagnosed mild or moderate non-penetrating TBI (Teasdale and Jennett, 1974); (2) no overt focal brain damage or hemorrhage during any of the TBI incidents; (3) the first TBI incident occurred at least 6 months prior to the study date; (4) no significant inattention or hyperactivity problems before the injury. The control group included children with no history of diagnosed TBI and no history of diagnosed attention deficit/hyperactivity disorder (ADHD). The Conners 3rd Edition-Parent Short form (Conners 3-PS) was administered during the study visit to characterize the inattention problems and hyperactivity/impulsivity problems in both groups (Conners, 2008).
To further improve the homogeneity of the study sample, the general inclusion criteria for both groups included: (1) right-handedness only, to remove potential handedness-related effects on brain structures, with handedness evaluated using the Edinburgh Handedness Inventory (Oldfield, 1971); (2) full scale IQ ≥ 80, estimated by the Wechsler Abbreviated Scale of Intelligence II (WASI-II) (Wechsler, 2011); (3) no current or previous diagnosis of autism spectrum disorders, pervasive developmental disorder, psychosis, major mood disorders (except dysthymia not under treatment), post-traumatic stress disorder, obsessive compulsive disorder, conduct disorder, anxiety (except simple phobias), or substance use disorders, based on the Diagnostic and Statistical Manual of Mental Disorders 5 (DSM-5) (American Psychiatric Association, 2013) and supplemented by the Kiddie Schedule for Affective Disorders and Schizophrenia for School-Age Children-Present and Lifetime Version (K-SADS-PL) (Kaufman et al., 2000); (4) no learning disabilities, neurological disorders, or any types of diagnosed chronic medical illnesses, according to the medical history. None of the subjects involved in this study had received any treatment with long-acting stimulants or non-stimulant psychotropic medications within the past month, nor had any contraindications for MRI scanning, such as claustrophobia, tooth braces, or other metal implants.
After initial processing of the neuroimaging data from each subject, three subjects from the TBI group and two subjects from the control group were excluded due to low imaging quality or excessive motions in either DTI data or functional MRI data. Therefore, a total of 52 children with TBI and 53 controls were included in the group-level analyses. All the demographic information was shown in Table 1.

Neuroimaging data acquisition protocol
For each subject, a DTI scan, a task-based functional MRI scan, and a high-resolution T1-weighted MRI scan were collected using a 3-Tesla Siemens TRIO (Siemens Medical Systems, Germany) scanner at Rutgers University Brain Imaging Center. The DTI data were acquired using a single-shot echo planar sequence, with the following parameters: voxel size = 2.0 mm × 2.0 mm × 2.5 mm, repetition time (TR) = 7,700 ms, echo time (TE) = 103 ms, field of view (FOV) = 250 mm × 250 mm, 30 diffusion-sensitizing gradient directions with b-value = 700 s/mm², and one image with b-value = 0 s/mm². The fMRI data were acquired using a whole brain gradient echo-planar sequence, with the following parameters: voxel size = 1.5 mm × 1.5 mm × 2.0 mm, TR = 1,000 ms, TE = 28.8 ms, and FOV = 208 mm. A high-resolution T1-weighted structural image was also collected with a sagittal multi-echo magnetization-prepared rapid acquisition gradient echo sequence with the following parameters: voxel size = 1 mm³ isotropic, TR = 1,900 ms, TE = 2.52 ms, flip angle = 9°, FOV = 250 mm × 250 mm, and 176 sagittal slices. The T1-weighted images were used for fMRI co-registration and creation of individualized brain region masks in DTI-based structural brain network construction.

Visual sustained attention task for fMRI
In the current study, the fMRI data for each subject were collected during an enhanced continuous performance task, the visual sustained attention task (VAST), which was designed to achieve optimal power in maintaining sustained attention and to assess related functional brain pathways in children (Li et al., 2012; Cao et al., 2021a). The VAST is a block-designed task that includes five task stimulation blocks interleaved with five resting blocks. The total duration is 5 min, with each block lasting 30 s. During task blocks, subjects were asked to remember a sequence of three numbers and to respond when the stimulus sequence matched the target. To ensure full understanding of the instructions, practice trials of the task were provided to each subject before the scan session.

Individual level structural MRI and DTI data processing and structural brain feature generation
Each individual's structural MRI data was visually checked for artifacts and excessive motions. Then the preprocessing steps, including registration into Talairach space, skull-stripping, and intensity normalization, were performed using Freesurfer v6.0.0 (Fischl, 2012). The preprocessed structural MRI data were parcellated using Desikan atlas and were used for node generation in constructing the structural brain network.
To construct the structural network, the DTI data were preprocessed using the Diffusion Toolbox from FMRIB Software Library v6.0 (FSL) (Woolrich et al., 2001). The preprocessing steps included head-motion correction, removal of non-brain voxels, and intensity normalization. The head motions and eddy-current distortion were then corrected with affine transformation and predictions estimated by a Gaussian Process (Andersson and Sotiropoulos, 2016). Heavy head movement is a critical issue that can significantly affect the quality of imaging data and cause inaccurate tractography results. In this study, heavy head movement was defined as > 2 mm translational displacement, > 5° rotational displacement, or > 0.2 mm mean volume-by-volume displacement. Three subjects from the TBI group and one subject from the control group were excluded due to heavy head motion. Then, the probabilistic tractography parameters of each voxel were estimated with a two-fiber model in each individual's native space. For each subject, a total of 78 ROIs were selected as the nodes for the structural brain network, including 34 cortical regions and 5 subcortical regions per hemisphere. The mask for each ROI was generated based on the parcellation in the preprocessed structural MRI data and transformed into the native diffusion space. Probabilistic tractography was used to estimate the connecting fibers between each pair of the seed masks. Five thousand streamlines per voxel were initiated from each seed mask, with a step distance of 0.5. A fiber was terminated when (1) it reached other seed masks; (2) it exceeded the limit of 2,000 steps; (3) it looped back to the same streamline; or (4) its curvature exceeded 80. The streamlines between seed masks were averaged in both directions to determine the structural connectivity between network nodes.
Due to the connection density bias, white matter bundles with higher anisotropy usually generate significantly higher streamline counts in the probabilistic tractography process (Jones, 2010; Zhang et al., 2022). Therefore, in this study, the weight of a non-zero edge was evaluated by the log-transformed streamline count and normalized by dividing by the maximum edge weight in the same network, to increase the discriminability of low edge weights (Ashourvan et al., 2019; Hansen et al., 2022). Then, for each subject, a 78 × 78 symmetric connectivity matrix was generated for construction of the weighted structural brain network.
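The edge-weighting step can be sketched in a few lines of NumPy. This is a minimal illustration rather than the pipeline used in the study: the function name `normalize_edge_weights` and the toy counts are hypothetical, and `log(1 + count)` is an assumed form of the log transform, chosen so that absent connections keep a weight of exactly zero.

```python
import numpy as np

def normalize_edge_weights(streamlines: np.ndarray) -> np.ndarray:
    """Turn a directed streamline-count matrix into symmetric weights in [0, 1]."""
    counts = np.asarray(streamlines, dtype=float)
    # Average the two tracking directions (seed i -> j and seed j -> i).
    sym = (counts + counts.T) / 2.0
    np.fill_diagonal(sym, 0.0)  # no self-connections
    # Assumption: log(1 + count), so zero counts remain zero-weight edges.
    weights = np.log1p(sym)
    max_w = weights.max()
    # Divide by the largest edge weight in this network.
    return weights / max_w if max_w > 0 else weights

# Toy 3-node example with very unequal streamline counts.
counts = np.array([[0, 10, 0],
                   [30, 0, 5000],
                   [0, 4000, 0]])
W = normalize_edge_weights(counts)
```

In this toy case the weak 0-1 connection receives a normalized weight far above its raw share of the strongest connection, which is the intended effect of the log transform on low edge weights.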
After the weighted structural brain network was constructed for each subject, the network topological properties were calculated [technical details for computations were provided in our previous publications (Cao et al., 2021b)]. The nodal-level topological properties for weighted network, including the nodal strength, nodal global efficiency, nodal local efficiency, clustering coefficient, and betweenness centrality, were calculated for each node in the structural brain networks to serve as structural brain features. All structural network topological properties were calculated using the Brain Connectivity Toolbox (Rubinov and Sporns, 2010). A total of 390 structural brain features were generated for building the semi-supervised autoencoder.

Individual level fMRI data processing and functional brain feature generation
The preprocessing of the fMRI data was carried out using the FEAT Toolbox from FSL v6.0 (Woolrich et al., 2001). For fMRI data, the same cutoffs for heavy head motion that were used in DTI preprocessing were applied, with which two subjects from the TBI group (overlapping with subjects excluded in DTI preprocessing) and one subject from the control group were excluded. After motion correction and slice timing correction, the fMRI data of each subject were co-registered to standard Montreal Neurological Institute (MNI) space using the high-resolution structural MRI. The hemodynamic response to the task-related condition was modeled using the general linear model with 24 motion parameters. The activated voxels were identified by cluster-based thresholding on the Z statistic map with Z > 2.3 and p < 0.05. To construct the functional brain network for each subject, the network nodes were generated by defining a spherical region with a radius of 5 mm at the local maximum of any cluster that had more than 100 activated voxels. A total of 59 ROIs were generated based on the automatic anatomical labeling atlas (Tzourio-Mazoyer et al., 2002). The connectivity of an ROI pair was represented by the Pearson's correlation coefficient of the blood-oxygen-level-dependent (BOLD) signals in the two ROIs. The connectivity matrix was then binarized using the network cost range that satisfied the small-world property to construct the binarized functional brain network (Achard and Bullmore, 2007).
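As a rough illustration of the last two steps, the sketch below computes a Pearson correlation matrix from ROI time series and binarizes it at a single network cost (the fraction of possible edges retained). The random time series, the cost value of 0.2, and the helper name `binarize_at_cost` are hypothetical; the study used a range of costs, and thresholding on signed rather than absolute correlations is an assumption made here.

```python
import numpy as np

def binarize_at_cost(corr: np.ndarray, cost: float) -> np.ndarray:
    """Keep the strongest fraction `cost` of possible edges as a 0/1 adjacency matrix."""
    n = corr.shape[0]
    iu = np.triu_indices(n, k=1)            # each undirected edge counted once
    values = corr[iu]
    n_keep = int(round(cost * values.size))
    adj = np.zeros((n, n), dtype=int)
    if n_keep > 0:
        keep = np.argsort(values)[::-1][:n_keep]   # largest correlations first
        adj[iu[0][keep], iu[1][keep]] = 1
    return adj + adj.T                      # mirror the upper triangle

rng = np.random.default_rng(0)
bold = rng.standard_normal((300, 59))       # hypothetical: 300 time points x 59 ROIs
corr = np.corrcoef(bold, rowvar=False)      # 59 x 59 Pearson correlation matrix
adj = binarize_at_cost(corr, cost=0.2)
```

With 59 nodes there are 1,711 possible edges, so a cost of 0.2 retains 342 of them.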
The nodal-level topological properties for the binarized network, including the nodal degree, nodal global efficiency, nodal local efficiency, clustering coefficient, and betweenness centrality, were calculated for each node in the functional brain networks [technical details for computations were provided in our previous publications (Cao et al., 2021a)]. The individual-level analysis was performed using the pipeline tool GAT-FD (Cao et al., 2022), where all network topological properties were calculated by calling functions from the Brain Connectivity Toolbox (Rubinov and Sporns, 2010). A total of 295 nodal topological properties were calculated from the functional brain networks to serve as the functional brain features of each subject for building the semi-supervised autoencoder.
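For readers unfamiliar with these measures, two of the five nodal properties can be written directly from their standard definitions for a binary undirected graph. The sketch below is an independent NumPy illustration on a toy graph; it does not call GAT-FD or the Brain Connectivity Toolbox, and the function names are hypothetical.

```python
import numpy as np

def nodal_degree(adj: np.ndarray) -> np.ndarray:
    """Number of edges attached to each node."""
    return np.asarray(adj).sum(axis=1)

def clustering_coefficient(adj: np.ndarray) -> np.ndarray:
    """Fraction of a node's neighbor pairs that are themselves connected."""
    A = np.asarray(adj, dtype=float)
    k = A.sum(axis=1)
    triangles = np.diag(A @ A @ A) / 2.0    # triangles passing through each node
    possible = k * (k - 1) / 2.0            # neighbor pairs of each node
    cc = np.zeros_like(k)
    mask = possible > 0                     # nodes with degree < 2 get 0
    cc[mask] = triangles[mask] / possible[mask]
    return cc

# Toy graph: triangle 0-1-2 plus a pendant node 3 attached to node 2.
adj = np.array([[0, 1, 1, 0],
                [1, 0, 1, 0],
                [1, 1, 0, 1],
                [0, 0, 1, 0]])
degree = nodal_degree(adj)
cc = clustering_coefficient(adj)
```

Node 2 has three neighbors but only one connected neighbor pair, so its clustering coefficient is 1/3, while the pendant node 3 has a coefficient of 0.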

Modeling of semi-supervised autoencoder
To increase training robustness and reduce overfitting risk, a combination of three approaches, including the two-sample t-test, a mutual information-based method (Ross, 2014), and a Lasso-based method (Muthukrishnan and Rohini, 2016), was utilized for feature reduction. In the end, a total of 60 top features from the 685 source brain features derived from structural and functional brain networks were selected for training the model. Before being passed to the autoencoder model, all these features were normalized to a range of 0 to 1 using min-max normalization.
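The min-max step maps each feature to [0, 1] via (x − min) / (max − min). A minimal sketch follows; fitting the minimum and maximum on the training split and reusing them for validation data is an assumption made here for illustration, and the array sizes and function names are hypothetical.

```python
import numpy as np

def min_max_fit(X_train: np.ndarray):
    """Per-feature minimum and range, estimated from training data."""
    lo = X_train.min(axis=0)
    hi = X_train.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against constant features
    return lo, span

def min_max_apply(X: np.ndarray, lo: np.ndarray, span: np.ndarray) -> np.ndarray:
    """Scale features with previously fitted parameters."""
    return (X - lo) / span

rng = np.random.default_rng(0)
X_train = rng.normal(size=(84, 60))         # hypothetical: 84 subjects x 60 features
lo, span = min_max_fit(X_train)
X_scaled = min_max_apply(X_train, lo, span)
```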
The semi-supervised autoencoder consisted of three major components, the encoder, the decoder, and the classifier, as shown in Figure 1. The encoder and decoder were part of a regular autoencoder model, which learns a compressed representation of the original brain features by optimizing the reconstructed brain features in an unsupervised manner (Hinton and Salakhutdinov, 2006). The encoder transformed inputs from the original feature space into a latent space by compressing the information in the inputs. The encoder in the proposed model contained one input layer with a size of 60, one hidden layer of 40 neurons, and one output layer of 20 neurons. Then the autoencoder-generated features, i.e., AE-features, in the latent space were passed into the decoder to reconstruct the original input. The decoder included an input layer with a size of 20, a hidden layer of 40 neurons, and one output layer of 60 neurons. An additional classifier was included in the proposed autoencoder to act as a constraint on the learning of the compressed AE-features in the latent space. The classifier took the 20 AE-features in the latent space to predict the group label for each sample. The classifier included a hidden layer with 20 neurons and an output layer of 1 neuron. The sigmoid function was used as the activation function for all the artificial neurons in the semi-supervised autoencoder neural network.
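The layer sizes described above can be made concrete with a forward pass written in plain NumPy. This sketch only illustrates how data flow through the three components; the weights are random and untrained, the initialization scale and batch size are hypothetical, and the study itself trained the model in TensorFlow.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dense(rng, n_in, n_out):
    """Random weights and zero biases for one fully connected layer (illustrative init)."""
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)

def forward(x, layers):
    """Apply each layer followed by a sigmoid activation."""
    for W, b in layers:
        x = sigmoid(x @ W + b)
    return x

rng = np.random.default_rng(0)
encoder = [dense(rng, 60, 40), dense(rng, 40, 20)]      # 60 -> 40 -> 20
decoder = [dense(rng, 20, 40), dense(rng, 40, 60)]      # 20 -> 40 -> 60
classifier = [dense(rng, 20, 20), dense(rng, 20, 1)]    # 20 -> 20 -> 1

x = rng.uniform(0.0, 1.0, size=(8, 60))   # hypothetical batch of 8 subjects
latent = forward(x, encoder)              # 20 AE-features per subject
reconstruction = forward(latent, decoder) # reconstructed 60 input features
group_prob = forward(latent, classifier)  # predicted probability of the group label
```

Both the decoder and the classifier read the same 20-dimensional latent vector, which is how the classification objective can constrain the learned AE-features.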
Two different loss functions were used to compensate for the different training speeds of the regression task (the decoder) and the classification task (the classifier). Mean squared error (MSE) was selected as the loss function of the reconstruction process, which was calculated using the following formula,

$$\mathrm{MSE} = \frac{1}{nf}\sum_{i=1}^{n}\sum_{j=1}^{f}\left(\hat{x}_{ij} - x_{ij}\right)^{2}$$

where $n$ is the number of subjects in the training data, $f$ is the number of brain features, $\hat{x}_{ij}$ is the reconstructed value for feature $j$ of subject $i$, and $x_{ij}$ is the original value for feature $j$ of subject $i$. Binary cross-entropy was selected as the loss function of the classification process, which was calculated using the following formula,

$$H_{\mathrm{binary}} = -\frac{1}{n}\sum_{i=1}^{n}\left[y_{i}\log p_{i} + \left(1 - y_{i}\right)\log\left(1 - p_{i}\right)\right]$$

where $H_{\mathrm{binary}}$ is the binary cross-entropy, $n$ is the number of subjects in the training data, $y_{i}$ is the binary indicator of the class label of subject $i$, and $p_{i}$ is the predicted probability that $y_{i}$ is 1.
In order to force the model to learn the latent AE-features for reconstruction earlier than for classification, the loss of the decoder model was assigned a higher weight than the loss of the classifier model. The loss function of the full model was calculated using the following formula,

$$L_{\mathrm{total}} = 0.7\,\mathrm{MSE} + 0.3\,H_{\mathrm{binary}}$$

where the weight of the decoder loss is 0.7 and the weight of the classifier loss is 0.3.
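A small numerical example may help make the weighted loss concrete. The sketch below implements the MSE, the binary cross-entropy, and their 0.7/0.3 combination in NumPy; the function names, the clipping constant `eps`, and the toy values are hypothetical and serve only to show the arithmetic.

```python
import numpy as np

def mse_loss(x, x_hat):
    """Mean squared reconstruction error over all subjects and features."""
    return float(np.mean((x_hat - x) ** 2))

def binary_cross_entropy(y, p, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))

def total_loss(x, x_hat, y, p, w_decoder=0.7, w_classifier=0.3):
    """Weighted sum of the decoder loss and the classifier loss."""
    return w_decoder * mse_loss(x, x_hat) + w_classifier * binary_cross_entropy(y, p)

# Toy data: 2 subjects, 2 features.
x = np.array([[0.2, 0.8], [0.5, 0.5]])       # original features
x_hat = np.array([[0.1, 0.9], [0.5, 0.3]])   # reconstructed features
y = np.array([1.0, 0.0])                     # group labels
p = np.array([0.9, 0.2])                     # predicted probabilities of label 1
loss = total_loss(x, x_hat, y, p)
```

Here the squared errors are 0.01, 0.01, 0, and 0.04, giving an MSE of 0.015, so the reconstruction term contributes 0.7 × 0.015 to the total.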

Model training and evaluation
Training of the model was performed using Python v3.8.0 and TensorFlow v2.10 (Abadi et al., 2016). The Adam optimizer was used for the back-propagation process (Kingma and Ba, 2014). To increase the robustness of the model, five-fold cross validation was employed in the training process. Specifically, the data were split into five stratified folds such that each fold consisted of a balanced 20% of the entire data. For each iteration, four folds were used as training data and the remaining one for validation. To avoid potential leakage effects in the training process, the feature selection algorithms only used the training data in each cross validation (Pereira et al., 2009). To further minimize the risk of overfitting in the training process, Gaussian noise with a mean of zero and a standard deviation of 0.02 was randomly added to 20% of the input features before they were fed into the encoder model. The training process stopped when the accuracy on the training data exceeded 95% or when a total of 1,000 epochs was reached.
The performance of the reconstruction process of the semi-supervised autoencoder model was measured using the MSE on the validation data, averaged over all five cross validations. The classification performance was measured in terms of classification accuracy and AUC on the validation data, which were also averaged over all five cross validations.
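The stratified split and the noise injection can be sketched as follows. This is an illustration under stated assumptions rather than the training code: the helper names are hypothetical, the group sizes are taken from the final sample (52 TBI, 53 controls), and "20% of the input features" is interpreted here as roughly 20% of the individual input entries chosen at random.

```python
import numpy as np

def stratified_folds(y, n_folds=5, seed=0):
    """Split sample indices into folds with near-equal class proportions."""
    rng = np.random.default_rng(seed)
    folds = [[] for _ in range(n_folds)]
    for label in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == label))
        for i, chunk in enumerate(np.array_split(idx, n_folds)):
            folds[i].extend(chunk.tolist())
    return [np.array(sorted(f)) for f in folds]

def add_input_noise(X, frac=0.2, sigma=0.02, seed=0):
    """Add zero-mean Gaussian noise to a random subset (about `frac`) of the entries."""
    rng = np.random.default_rng(seed)
    mask = rng.random(X.shape) < frac        # assumption: entries picked independently
    return X + mask * rng.normal(0.0, sigma, size=X.shape)

y = np.array([1] * 52 + [0] * 53)            # 1 = TBI, 0 = control (coding assumed)
folds = stratified_folds(y)

X = np.random.default_rng(1).uniform(size=(105, 60))
for val_idx in folds:
    train_idx = np.setdiff1d(np.arange(len(y)), val_idx)
    # Noise is applied to the training inputs only, before the encoder.
    X_train_noisy = add_input_noise(X[train_idx])
```

Any feature selection or scaling would be fitted inside the loop on `train_idx` only, in line with the leakage precaution described above.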
In comparison, a conventional machine learning model was also constructed using the same training and validation procedure. The model used principal component analysis (PCA) for feature reduction and SVM for classification.

Feature importance score calculation
To identify the most important brain features for the successful classification process, a permutation-based method was used to calculate the importance score of each input feature (Breiman, 2001). A feature's importance was determined by the amount of error caused by shuffling the feature's value over all the samples (Fisher et al., 2019). For the classification process, the importance of a feature was characterized by the binary cross-entropy, which was calculated using the following formula,

$$I_{j} = \frac{1}{m}\sum_{k=1}^{m}\left(\tilde{H}^{(k)}_{\mathrm{binary},j} - H_{\mathrm{binary}}\right)$$

where $I_{j}$ is the importance score of feature $j$, $m$ is the number of random shufflings, $H_{\mathrm{binary}}$ is the cross-entropy of the original input, and $\tilde{H}^{(k)}_{\mathrm{binary},j}$ is the cross-entropy of the input with feature $j$ shuffled for the $k$-th time. The importance scores for features in the current study were calculated by shuffling 1,000 times. Features with an importance score more than two standard deviations higher than the mean importance score of all features were identified as important features (Sun et al., 2020).
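A compact sketch of permutation importance with the two-standard-deviation cutoff is given below. It assumes that importance is the mean increase in binary cross-entropy after shuffling one feature, which is one common reading of the description; the function names, the toy logistic model, and the reduced number of shuffles are hypothetical.

```python
import numpy as np

def bce(y, p, eps=1e-7):
    p = np.clip(p, eps, 1 - eps)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def permutation_importance(predict, X, y, n_shuffles=100, seed=0):
    """Mean increase in cross-entropy when each feature is shuffled across samples."""
    rng = np.random.default_rng(seed)
    base = bce(y, predict(X))                # error on the intact input
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increase = 0.0
        for _ in range(n_shuffles):
            Xs = X.copy()
            Xs[:, j] = rng.permutation(Xs[:, j])   # break feature j's link to y
            increase += bce(y, predict(Xs)) - base
        scores[j] = increase / n_shuffles
    return scores

def important_features(scores, n_sd=2.0):
    """Indices of features scoring more than n_sd SDs above the mean score."""
    return np.flatnonzero(scores > scores.mean() + n_sd * scores.std())

# Toy model (hypothetical): only feature 0 drives the prediction.
rng = np.random.default_rng(1)
X = rng.uniform(size=(100, 10))
y = (X[:, 0] > 0.5).astype(float)
predict = lambda X: 1 / (1 + np.exp(-10 * (X[:, 0] - 0.5)))
scores = permutation_importance(predict, X, y, n_shuffles=50)
selected = important_features(scores)
```

In the toy example, shuffling any feature other than feature 0 leaves the predictions unchanged, so only feature 0 clears the threshold.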

Modeling of brain-behavior relationships
A regression-based machine learning model, support vector regression (SVR), was first constructed to study the relations between the most important brain features for successful group discrimination and the severity measures of inattentive and hyperactive/impulsive symptoms (T-scores derived from Conners 3-PS) in the whole study sample. To minimize overfitting, five-fold cross validation was used for training and validation. The R² and MSE were used to evaluate the performance of the SVR model. Permutation importance scores were used to evaluate the importance of the brain features.
To further validate the robustness of the relationships between the identified important brain features and clinical measures, partial least squares structural equation modeling (PLS-SEM) was conducted (Hair et al., 2011). The rationale of the PLS-SEM was to test whether the important brain features for classification were associated with any AE-features, and whether those AE-features were associated with the clinical measures, while accounting for the effects of age, sex, handedness, SES, and IQ. The PLS-SEM analysis was carried out using R 4.1.3 and SEMinR 2.3.2 (Hair et al., 2021). First, Pearson's correlations between the AE-features in the latent space and T-scores of the inattentive and hyperactive/impulsive subscales from Conners 3-PS were computed within the whole study sample. The correlation analyses were controlled for multiple comparisons (for 20 features in the latent space) by using the Bonferroni correction with a significance threshold of corrected α = 0.05. The AE-features in the latent space that showed significant correlations with the clinical scores were selected as the intermediate variables in the PLS-SEM. Bootstrapping with 5,000 random samples was performed to determine the significance levels of the path coefficients in the PLS-SEM analysis (Henseler and Chin, 2010).
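As a worked illustration of the Bonferroni step, and assuming the correction is applied across the 20 latent AE-features for each clinical score, the per-test threshold implied by a corrected α of 0.05 is:

```latex
\alpha_{\text{per-test}} = \frac{\alpha_{\text{corrected}}}{\text{number of tests}} = \frac{0.05}{20} = 0.0025
```

Under this assumption, an AE-feature would be carried forward to the PLS-SEM only if its uncorrected correlation p-value fell below 0.0025.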

Demographic and clinical/behavioral measures
There were no significant between-group differences in any demographic measures in our sample. Among the subjects in the TBI group, 14 subjects had no significant inattentive or hyperactive problems, 27 had significant inattentive problems, 2 had significant hyperactive/impulsive problems, and 12 had significant problems in both inattention and hyperactivity/impulsivity. In the TBI group, the interval between the first TBI incident and the MRI scan ranged from 6 to 90 months (7 years 6 months), with an average of 33.8 ± 24.2 months. The results showed that children with TBI had significantly more inattentive (t = −9.145, p < 0.001) and hyperactive/impulsive (t = −4.747, p < 0.001) symptoms, measured using the T-scores in Conners 3-PS, when compared to controls. No significant correlations were observed between the time after injury and inattention or hyperactivity/impulsivity T-scores. The demographic and clinical information is shown in Table 1.

Figure 2. Comparisons between the normalized features and reconstructed features by the semi-supervised autoencoder model. The normalized input data are shown on the left, the reconstructed data in the middle, and the squared error on the right. The vertical axis represents the subjects in each cross-validation set, and the horizontal axis represents the features. CV, cross-validation; MSE, mean squared error.

Performance of the semi-supervised autoencoder
The semi-supervised autoencoder model was able to differentiate children with TBI and controls with a classification accuracy of 82.86% ± 7.97% and an AUC of 0.860 ± 0.061. At the same time, the model was able to reconstruct the original brain features with an MSE of 0.035 ± 0.005, as shown in Figure 2. In comparison, the PCA+SVM model achieved a classification accuracy of 78.09% ± 11.47% with an AUC of 0.825 ± 0.114.

Most important brain features for classification
Network topological properties associated with left inferior and superior frontal, postcentral, inferior temporal and medial occipitotemporal regions were identified as the most important brain features for successful discrimination between children with

Regression model performance and brain-behavior relationships
The SVR model using the top 6 most important brain features was able to explain 9.44% of the variance (R² of 9.44% ± 4.02%) in the inattentive symptom T-score in the study sample (Figure 3A). The predicted inattentive symptom T-score yielded an MSE of 0.057 ± 0.015. The functional nodal clustering coefficient of the left medial occipitotemporal gyrus and the functional nodal local efficiency of the left postcentral gyrus showed the highest predictive values, with feature importance scores of 0.132 and 0.104, respectively. For the SVR model predicting the hyperactive/impulsive symptom T-score, the R² was 7.25% ± 2.69% and the MSE was 0.039 ± 0.009 (Figure 3B). The most important brain features for predicting hyperactive/impulsive symptoms were the structural betweenness centrality of the left superior frontal gyrus, with an importance score of 0.114, and the functional nodal clustering coefficient of the left medial occipitotemporal gyrus, with an importance score of 0.050 (Table 3).
In the PLS-SEM analysis, AE-feature 17 showed a significant direct effect on the inattentive symptoms T-score, and both AE-features 4 and 17 showed significant direct effects on the hyperactive/impulsive symptoms T-score in the whole study sample. Important brain features in left inferior temporal, medial occipitotemporal, postcentral, and superior frontal regions showed significant direct effects on AE-features 4 and 17. The detailed results of the PLS-SEM analysis were shown in

Discussion
To the best of our knowledge, this is the first study in the field to apply a deep learning approach to multimodal neuroimaging data to identify the neural signatures associated with post-TBI attention deficits in children. By constructing a semi-supervised autoencoder on task-based fMRI and DTI data from 110 children, this study identified the 6 most predictive brain features, involving functional and structural network topological properties associated with left frontal, parietal, temporal, and occipital lobes. Regression-based machine learning analysis in our study sample further showed that, among these most important brain features, those associated with the left postcentral area showed significant predictive value for inattentive symptoms; those associated with the left superior frontal gyrus showed significant predictive value for hyperactive/impulsive symptoms; and those associated with the left medial occipitotemporal gyrus showed significant predictive value for both inattentive and hyperactive/impulsive symptoms.
In the current study, our semi-supervised autoencoder model performed effectively and robustly in discriminating between children with TBI and controls, with satisfactory accuracy and AUC. The reconstructed features also showed minimal error, measured using MSE, when compared to the input features. Compared to the conventional PCA+SVM model, our semi-supervised autoencoder model achieved higher classification accuracy and AUC. The reconstruction process preserved the distinctive information while reducing the feature dimensionality for the classification process (Hinton and Salakhutdinov, 2006; Kamal and Bae, 2022). In addition, the Gaussian noise added to the input features during the training process of the semi-supervised autoencoder model further improved the generalization performance of the constructed deep neural network model (Audhkhasi et al., 2016; Noh et al., 2017). Therefore, relative to those reported in the majority of existing conventional model-based studies, our identified brain substrates for childhood TBI and its related attention deficits are more reliable and of greater value in guiding tailored diagnoses and interventions in affected children.
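The two ingredients discussed above, noise injection and joint reconstruction plus classification, can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: the noise level `sigma`, the weighting factor `weight`, the use of binary cross-entropy for the classification term, and the choice to compare the reconstruction with the clean (noise-free) input are all hypothetical.

```python
import math
import random


def add_gaussian_noise(features, sigma, rng):
    """Return a copy of the features with zero-mean Gaussian noise added.
    sigma is a hypothetical noise level; the study's value is not given here."""
    return [x + rng.gauss(0.0, sigma) for x in features]


def reconstruction_loss(x, x_hat):
    """Mean squared error between clean input and reconstructed features."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def classification_loss(label, prob, eps=1e-12):
    """Binary cross-entropy for one subject (label 1 = TBI, 0 = control).
    prob is the predicted probability of label 1, clipped for stability."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def semi_supervised_loss(x, x_hat, label, prob, weight=1.0):
    """Assumed combined objective: reconstruction term plus a weighted
    classification term, so the latent features serve both tasks."""
    return reconstruction_loss(x, x_hat) + weight * classification_loss(label, prob)
```

In training, the encoder would receive `add_gaussian_noise(x, sigma, rng)` while the loss is evaluated against the clean `x`, which discourages the network from memorizing individual inputs.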
Our study observed the important roles of the structural topological alterations of left inferior frontal gyrus, left superior frontal gyrus, and left frontal pole in differentiating children with TBI and controls. In addition, the betweenness centrality (which represents the capacity of a node to serve as a bridge) of left superior frontal gyrus showed significant value for predicting the severity of the hyperactive/impulsive symptoms in the whole study sample. These regions are part of the prefrontal cortex, which is an essential component of the top-down control pathway that facilitates selective attention, inhibition, and sensory modulation (Buschman and Miller, 2007; Rossi et al., 2009; Katsuki and Constantinidis, 2014). Structural MRI and DTI studies have consistently reported decreased gray matter volume, reduced cortical thickness, and disrupted white matter integrity in the left prefrontal area in children with TBI (Wilde et al., 2012a; Mayer et al., 2015; Dennis et al., 2016). Our previous investigation also reported significant structural topological alterations in left inferior frontal gyrus in children with TBI-A (Cao et al., 2021b). Taken together with these existing findings, our findings of altered structural connectivity within left prefrontal cortex and between left prefrontal and other brain regions may be related to the axonal damage caused by TBI; and the persistent structural alterations in the left prefrontal area in children with chronic TBI might disrupt the attention processing pathways and contribute to the emergence of hyperactive/impulsive symptoms.
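To make the notion of a bridging node concrete, the sketch below computes betweenness centrality with Brandes' algorithm on a binary, undirected graph. This is a simplification: the structural networks in the study carry normalized streamline-count weights, and the toolbox and normalization actually used are not described in this section. The adjacency format and function name are illustrative.

```python
from collections import deque


def betweenness_centrality(adj):
    """Unnormalized betweenness centrality for an unweighted, undirected
    graph given as {node: set_of_neighbours} (Brandes' algorithm)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:              # w reached for the first time
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:   # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each undirected pair was counted from both endpoints.
    return {v: c / 2.0 for v, c in bc.items()}
```

In a three-node chain, only the middle node lies on a shortest path between the other two, so it is the only node with nonzero betweenness.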
Meanwhile, the functional nodal local efficiency (which represents regional integration in the whole network) of the left postcentral gyrus was identified as one of the most important brain features for accurate group classification, as well as one of the most valuable brain features in predicting the severity of inattentive symptoms in the whole study sample. The postcentral gyrus is responsible for transferring tactile information during spatial attention and is a key region in the top-down and bottom-up attention pathways (Macaluso et al., 2000; Buschman and Miller, 2007; Katsuki and Constantinidis, 2014). Existing task-based fMRI studies have reported functional alterations of the postcentral gyrus in children with TBI during inhibitory control (Tlustos et al., 2015) and sustained attention (Cao et al., 2021b). Our functional network study also reported that the increased nodal local efficiency

Figure. Results of partial least squares structural equation modeling analysis. The paths with significant direct effects are shown as black solid lines. The paths without significant effects are shown as gray dashed lines. The numbers next to the significant paths are standardized path coefficients. The p-values were calculated by applying bootstrapping with 5,000 random samples. AE-Features, autoencoder-generated features; NCC, nodal clustering coefficient; NLE, nodal local efficiency; BC, betweenness centrality; SES, socioeconomic status, calculated using the average education years of the parents.
Intriguingly, our study also found that the functional nodal clustering coefficient (which represents regional connectivity) of the left medial occipitotemporal gyrus was an important brain feature in differentiating children with TBI and controls, as well as a significant predictor for both inattentive and hyperactive/impulsive symptoms. The occipitotemporal gyrus has been associated with visual information processing, especially letter processing (Mechelli et al., 2003; Vinckier et al., 2007), and was also found to play an important role in visual imagery and internally directed cognition (Benedek et al., 2016; Ceh et al., 2021). Structural MRI studies have reported reductions in gray matter volume of the medial occipitotemporal gyrus in children with TBI, and the reduction can persist for years after the injury (Wilde et al., 2012b; Dennis et al., 2016). However, no existing studies have reported functional alterations in the medial occipitotemporal gyrus in children with TBI. One of the reasons might be that conventional parametric models lack the sensitivity to detect subtle functional alterations in the medial occipitotemporal gyrus.
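The two functional nodal measures discussed in this section can be illustrated on a binary, undirected graph as follows. This is a sketch using the common graph-theoretical definitions, which are assumed rather than confirmed to match the study's computation; thresholding of the functional connectivity matrix and any weighting are omitted, and the function names are illustrative.

```python
from collections import deque
from itertools import combinations


def clustering_coefficient(adj, node):
    """Fraction of pairs of the node's neighbours that are themselves
    connected. adj maps each node to the set of its neighbours."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


def _distances_within(adj, source, allowed):
    """Shortest path lengths from source, using only nodes in `allowed`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w in allowed and w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def local_efficiency(adj, node):
    """Average inverse shortest path length among the node's neighbours,
    computed in the subgraph induced by those neighbours (node removed).
    Unreachable pairs contribute zero."""
    nbrs = set(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    total = 0.0
    for u in nbrs:
        dist = _distances_within(adj, u, nbrs)
        total += sum(1.0 / d for v, d in dist.items() if v != u)
    return total / (k * (k - 1))
```

The two measures coincide when all neighbours are either directly connected or disconnected, and differ when neighbours are linked only indirectly: local efficiency gives partial credit to such two-step connections, whereas the clustering coefficient does not.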
There are some limitations associated with the current study. First, although a total of 110 subjects were involved in the study, this sample size is still relatively modest for the deep learning field. Such a sample size carries a potential risk of model overfitting and limited generalizability. To minimize this risk, we utilized multiple feature selection methods, applied cross-validation, and implemented an additional Gaussian noise layer during the training process. Future research with a larger sample size is expected to further validate the findings of this study. Second, streamline count-based structural brain networks constructed using probabilistic tractography can be biased (Zhang et al., 2022). To reduce such potential bias, estimation of streamline count was performed in the native diffusion space using individualized brain parcellations, and edge weights were normalized in the individual-level analysis. Other graph theory techniques for structural brain networks, such as fiber density-based (Smith et al., 2015), connectivity probability-based (Cao et al., 2013), and microstructural measure-based (Girard et al., 2017) approaches, can be explored to validate the significance of the current findings. Third, the sex factor associated with post-TBI attention deficits was not investigated in this study. Recent clinical studies with large sample sizes (> 500) reported that girls with TBI had a significantly higher risk of developing attention problems than boys (Keenan et al., 2018; Wade et al., 2020). We did not investigate sex-specific neural markers, considering the sample size limitation mentioned above. To partially remove the potential confounding effects, sex was added in our post hoc analysis and showed no significant associations with inattentive or hyperactive symptoms. Future studies with much larger samples are required to thoroughly investigate the sex-specific neural markers of post-TBI attention deficits in children.
In summary, the current study constructed a semi-supervised autoencoder to effectively and robustly discriminate children with TBI from controls while preserving the intrinsic neuroimaging characteristics in the reconstruction of brain features. All the predominant brain features in differentiating children with TBI and controls were in the left hemisphere, including the functional and structural topological alterations involving left frontal regions, postcentral regions, and temporal regions. More importantly, the highly discriminative brain features in left frontal regions, parietal regions, and medial occipitotemporal regions demonstrated significant value for predicting elevated inattentive and/or hyperactive/impulsive symptoms in children post-TBI. The findings of this study suggest that deep learning techniques may have the potential to help identify robust neurobiological markers for post-TBI attention deficits, and that the left superior frontal, postcentral, and medial occipitotemporal regions may serve as reliable targets for the diagnosis and interventions of TBI-related attention problems in children.

Data availability statement
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement
The studies involving human participants were reviewed and approved by the Institutional Review Board at the New Jersey Institute of Technology and the Institutional Review Board at Saint Peter's University Hospital. Written informed consent to participate in this study was provided by the participants or their legal guardian/next of kin.

Author contributions
XL and JH designed the study. MC worked on literature searching, clinical and imaging data analyses, and wrote the first draft of the manuscript. MC, KW, JH, and XL edited and revised the manuscript. All authors contributed to and have approved the final manuscript.

Funding
This work was partially supported by research grants from the National Institute of Mental Health (R03MH109791, R15MH117368, and R01MH126448) and the New Jersey Commission on Brain Injury Research (CBIR17PIL012 and CBIR22PIL002).