Diagnostic Performance of MRI Volumetry in Epilepsy Patients With Hippocampal Sclerosis Supported Through a Random Forest Automatic Classification Algorithm

Introduction: Several methods offer free volumetry services for MR data that adequately quantify volume differences in the hippocampus and its subregions. These methods are frequently used to assist in clinical diagnosis of suspected hippocampal sclerosis in temporal lobe epilepsy. A strong association between severity of histopathological anomalies and hippocampal volumes has been reported using MR volumetry, with a higher diagnostic yield than visual examination alone. Interpretation of volumetry results is challenging because of inherent methodological differences and the reported variability of hippocampal volume. Furthermore, normal morphometric differences are recognized across populations and may need consideration. To address this concern, we highlight procedural discrepancies, including atlas definition and computation of total intracranial volume, that may affect volumetry results. We aimed to quantify diagnostic performance and to propose reference values for hippocampal volume from two well-established techniques: FreeSurfer v6.0 and volBrain-HIPS.
Methods: Volumetry measures were calculated using clinical T1 MRI from a local population of 61 healthy controls and 57 epilepsy patients with confirmed unilateral hippocampal sclerosis. We further validated the results with a machine learning classification algorithm (Random Forest), computing accuracy and feature relevance to distinguish between patients and controls. This validation was performed using the FreeSurfer dataset alone, considering morphometric values not only from the hippocampus but also from non-hippocampal brain regions that could be relevant for group classification. Mean reference values and 95% confidence intervals were calculated for left and right hippocampi, along with hippocampal asymmetry degree, to test diagnostic accuracy.
Results: Both methods showed excellent classification performance (AUC > 0.914), with noticeable differences in absolute (cm3) and normalized volumes. Hippocampal asymmetry was the most accurate discriminator of all estimates (AUC: 0.97–1). Similar results were achieved in the validation test with an automatic classifier (AUC > 0.960), which identified hippocampal structures as the most relevant features for group differentiation among all brain regions.
Conclusion: We calculated reference volumetry values from two commonly used methods to accurately identify patients with temporal lobe epilepsy and hippocampal sclerosis. Validation with an automatic classifier confirmed the principal role of the hippocampus and its subregions for diagnosis.


INTRODUCTION
Quantification of brain anatomical structures from magnetic resonance images (MR) is being increasingly used to recognize pathologic conditions such as temporal lobe epilepsy. Volumetric estimates of hippocampal size are postulated to be more sensitive than visual assessment alone, and also to improve clinical diagnosis in dementia and epilepsy (1)(2)(3)(4).
Temporal lobe epilepsy with hippocampal sclerosis (HS) is one of the most frequent focal epilepsies in adults and is often refractory to pharmacological treatment; surgical resection is an effective therapeutic option for these patients, achieving a seizure-free rate close to 80%.
Patients with temporal epilepsy and HS usually share clinical key features associated with the majority of seizure discharges including characteristic aura, arrest, alteration of consciousness (and amnesia), and automatisms. Relatively typical scalp EEG findings can be seen in the interictal state, at the seizure onset, during the course of the seizure, and postictally.
Hippocampal sclerosis is suspected in epilepsy patients when compatible ictal semiology and scalp EEG findings are found, but definitive diagnosis is established based on characteristic brain MR anomalies. In clinical practice, neuroimaging abnormalities are typically recognized in the hippocampus proper, including atrophy, loss of internal structure, decreased T1 signal intensity, and increased T2-FLAIR signal intensity (5). Inspection of hippocampal coronal sections allows for a side-by-side comparison of asymmetry in volume, shape, and signal, which is important for clinical diagnosis. Atrophy seems to be the most specific and signal change the most sensitive biomarker in HS (6). Magnets with high field strengths above 3 T are able to depict subtle blurring of the internal architecture of the hippocampus on T2-weighted images (5).
Originally, manual segmentation of the hippocampus based on anatomical knowledge and specific MRI landmarks was used to estimate structural volumes. Previous studies using these methods adequately identified lateralization of seizure origin in the temporal lobe of patients with HS. Earlier reports also documented a strong association between severity of histopathological anomalies and hippocampal volumes, with an increased diagnostic yield of MR studies (7,8). The recent development of automatic volumetry methods such as the FreeSurfer (FS) suite (9) and VolBrain (vB) HIPS (10), among others, makes it possible to account for hippocampal volume differences that may escape visual detection. Several studies validated the utility of hippocampal volumetry for HS detection in temporal epilepsy, mostly based on postoperative correlation or using ex vivo neuroimaging analysis (7)(8)(9)(10). The potential of volumetry measures for postsurgical outcome prediction is still modest, with some improvement in reports considering subfield patterns of atrophy (11).
The main objective of this work is to estimate reference values of sensitivity, specificity, and confidence intervals for classification of a local population of epilepsy patients with unilateral hippocampal sclerosis using two different volumetry approaches.
We analyzed T1 brain MRI volumetry of the hippocampus and hippocampal subfields in a cohort of 61 healthy subjects and in 57 epilepsy patients with confirmed unilateral mesial temporal sclerosis. Anatomical volumes were computed using two well-established automatic methods, FS and vB. Recorded values for the hippocampus and subregions are expressed as absolute values (in cm3) and further normalized to brain size, quantified as a percent of total intracranial volume (TIV).
Furthermore, we provide hippocampal and subfield volume distribution for a community-based sample of healthy controls (HC) and evaluate subregion asymmetry differences in HC and between patients. We also compared the degree of asymmetry in left and right HS to investigate its relevance for diagnosis and the presence of distinctive patterns of atrophy at the subregion level.
Finally, a validation process was implemented to explore the contribution of non-hippocampal structures to group classification. This was performed using machine learning techniques, considering only FreeSurfer's morphometric information of whole-brain regions, including anatomical volumes and cortical thickness. Specifically, we used a feature selection technique to obtain the optimal number of features to discriminate between patients and HC, and then we performed three classifications using a Monte Carlo cross-validation (MCCV) scheme (33) with a random forest classifier.

Participants
Patients were retrospectively enrolled based on medical records from the epilepsy unit between 2014 and 2019 at Nestor Kirchner-El Cruce Hospital at Florencio Varela, Buenos Aires, with a final diagnosis of temporal lobe epilepsy associated with unilateral right (n = 22, 15 females) or left (n = 35, 17 females) hippocampal sclerosis. Diagnosis was established using standardized practices as described in Oddo et al. (34) through clinical examination, assessment of disease history and seizure semiology, neuropsychological tests, prolonged video EEG, and compatible findings on 3-T MRI as suggested by ILAE (5). Thirty-one patients (54%) underwent surgical treatment with histopathology confirmation of HS after standard amygdalohippocampectomy with partial temporal lobectomy. The remaining patients have not yet undergone surgery but are scheduled for it. Age- and sex-matched HC (n = 61, 44 females) were recruited mostly from local universities, including students and academic personnel.
All participants gave written consent to participate and to make use of medical information for this study. The work described in this paper was carried out in accordance with the code of ethics of the World Medical Association (Declaration of Helsinki). Research ethics approval was obtained from the Hospital Research Ethics Board at El Cruce Hospital.

Imaging Characteristics and Analysis Methods
Only volumetric T1-weighted images were used in this study. These images were obtained as part of the clinical protocol for epilepsy workup in our institution and were acquired using the same MR unit (Philips Achieva 3T, 8-channel head coil), as recommended in recent specialized guidelines (5). Structural images consist of a 3D T1WI (FFE) sequence, with 180 slices of 1-mm isotropic resolution, TE = 3.3 msec, TR = 2300 msec, TI = 900 msec, flip angle = 9°, and field of view (FOV) = 240 × 240 × 180 mm. Images were exported from the scanner and converted to NIfTI format for further analysis. For the statistical analysis, the same T1 volumetric images were processed using two established and freely available methods for brain region segmentation and quantification, namely, FreeSurfer Suite v6.0 (FS), run on an offline workstation, and VolBrain-HIPS 2016 (vB), which provides online services running on remote servers through a website interface.
Both methods offer validated hippocampal and hippocampal subfield segmentation through different approaches, distinct reference atlases, dissimilar processing times, and specific subfield region delineations. Output files and results from both methods were independently reviewed by two experienced neuroradiologists (JPP and GDS) to look for labeling inconsistencies and to assure quality control (no manual correction was performed). (See segmentation details for each method in Figure 1.) Full documentation of processing details is available for each software platform; here we describe a summarized version of each method.

FREESURFER V6.0
All T1 brain volumes were processed to obtain a complete morphometric description. Cortical reconstruction and volumetric segmentation were performed in each participant's native space with the FreeSurfer (v6.0) image analysis suite.
Briefly, image processing included removal of non-brain tissue using a hybrid watershed/surface deformation procedure, an automatic Talairach transformation, segmentation of the subcortical WM and deep GM volumetric structures (including hippocampus, amygdala, caudate, putamen, and ventricles), intensity normalization, tessellation of the GM-WM boundary, an automatic topology correction, and surface deformation following intensity gradients to optimally place the GM/WM and GM/CSF borders at the location where the greatest shift in intensity defines the transition to the other tissue class (9).
Once the cortical models were complete, a number of deformable procedures were performed for further data processing and analysis, including surface inflation and registration to a spherical atlas based on individual cortical folding patterns to match cortical geometry across subjects, parcellation of the cerebral cortex into units relative to gyral and sulcal structure, and creation of a variety of surface-based data, including maps of curvature and sulcal depth. These methods use both intensity and continuity information of the entire 3D MR volume from segmentation and deformation procedures to produce representations of cortical thickness, which is calculated as the closest distance from the GM/WM boundary to the GM/CSF boundary at each vertex on the tessellated surface (9). The maps were created using spatial intensity gradients across tissue classes; therefore, they were not simply reliant on absolute signal intensity. Since the ensuing maps were not restricted to the voxel resolution of the original data, they can detect submillimeter differences between groups. Procedures for the measurement of cortical thickness have been validated against histological analysis and manual measurements. FreeSurfer morphometric procedures including principal hippocampal subfields have been demonstrated to show good test-retest reliability across scanner manufacturers and across field strengths (35,36).

FIGURE 1 | (A) Examples of common subfield atlas definitions using vB and FS. Boxplots represent mean volumes as percent of TIV and whiskers the 95% confidence interval for the HC group. (B) Right hippocampal 3D models for the same subject, constructed using all subfields from both methods in 3D Slicer; boxplots represent mean hippocampal volumes for left and right hippocampi in HC expressed in mm3 and in percent of TIV. Upper models show the anterior-superior view, and lower images represent inferior projections for comparison. The shaded gray wireframe area embodies the whole-hippocampus representation created from standard FS segmentation; note the reduced size of the vB model. The most noticeable subregion differences are related to the definition of the anterior and posterior extent of CA1 and posterior subiculum, which are more medially and dorsally extended in vB. Coincidentally, the hippocampal tail, pre-subiculum, and para-subiculum regions defined in FS represent at least partially overlapping areas between methods. Other deep internal hippocampal structures such as the molecular layer, GCMLDG, fissure, and fimbria are individually ascribed only in FS (C). CA4-DG and CA2-CA3 subfields are jointly segmented in vB; CA4 and GCMLDG are grouped together in FS for comparison purposes. Volume differences are probably not only related to atlas definition; both approaches also show methodological discrepancies for intracranial volume computation. *Significant after Bonferroni correction. **Significant uncorrected p < 0.05. Paired-sample t-test for inter-hemispheric comparison of volumes as percent of TIV in HC.
The FreeSurfer v6.0 algorithm follows a generative, parametric approach which focuses on modeling the spatial distribution of the hippocampal subregions and surrounding brain structures, which is learned from labeled training data. FreeSurfer v6.0 is built with a novel atlasing algorithm and ex vivo MRI data from autopsy brains. The segmentation provides 15 different subregions (12 used in this work), based on the histology and morphometry from Rosene and Van Hoesen (37) and partly also on (38)(39)(40)(41). See Figure 1 for details on the implemented atlas and segmentation.
The ex vivo imaging protocol yields images with high resolution and signal-to-noise ratio. The segmentation algorithm is similar to Van Leemput (42) which is appropriate for analyzing in vivo MRI scans of different manufacturers using different T1 contrasts.
Compared with other recently available methods, FreeSurfer involves a prolonged processing time of 8–24 h on standard single-core systems, but it also yields extended quantification of additional brain structures, including whole-brain regions beyond the hippocampal formation.
We converted the fixed-width-column plain-text files containing the FreeSurfer outputs to comma-separated values (CSV) plain-text files, which are more readily loaded as a pandas DataFrame (Python package). To ensure that classifiers did not consider features lacking specific regional information, we eliminated general features such as cortical volume, mean cortical thickness, brain volume, and ventricle volume. Finally, to avoid potential biases due to differences in the participants' head size (43), volume measures of each area were normalized as a percentage of the estimated total intracranial volume (eTIV), which is also provided in FreeSurfer's results.
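The conversion and normalization steps described above could be sketched roughly as follows. This is a minimal illustration and not the authors' script: it assumes a stats file laid out like FreeSurfer's `aseg.stats` (comment lines starting with `#`, a `# ColHeaders` line naming the columns, and a `# Measure EstimatedTotalIntraCranialVol` line carrying eTIV), and the function names, the `pct_eTIV` column, and the sample values are hypothetical.

```python
import pandas as pd

# Hypothetical miniature stats file; the volumes are made up for illustration.
SAMPLE_STATS = """\
# Measure EstimatedTotalIntraCranialVol, eTIV, Estimated Total Intracranial Volume, 1500000.0, mm^3
# ColHeaders Index SegId NVoxels Volume_mm3 StructName
  1  17  3900  3900.0  Left-Hippocampus
  2  53  4050  4050.0  Right-Hippocampus
"""


def read_stats_table(text: str) -> pd.DataFrame:
    """Parse the whitespace-delimited table; column names come from '# ColHeaders'."""
    columns, rows = None, []
    for line in text.splitlines():
        if line.startswith("# ColHeaders"):
            columns = line.split()[2:]
        elif line.strip() and not line.startswith("#"):
            rows.append(line.split())
    if columns is None:
        raise ValueError("no '# ColHeaders' line found")
    table = pd.DataFrame(rows, columns=columns)
    table["Volume_mm3"] = table["Volume_mm3"].astype(float)
    return table


def read_etiv(text: str) -> float:
    """Return eTIV (mm^3) from the '# Measure EstimatedTotalIntraCranialVol' line."""
    for line in text.splitlines():
        if line.startswith("# Measure EstimatedTotalIntraCranialVol"):
            return float(line.split(",")[3])
    raise ValueError("eTIV measure not found")


def normalize_to_etiv(table: pd.DataFrame, etiv_mm3: float) -> pd.DataFrame:
    """Add each structure's volume expressed as a percentage of eTIV."""
    out = table.copy()
    out["pct_eTIV"] = 100.0 * out["Volume_mm3"] / etiv_mm3
    return out


volumes = normalize_to_etiv(read_stats_table(SAMPLE_STATS), read_etiv(SAMPLE_STATS))
csv_text = volumes.to_csv(index=False)  # pass a file path instead to write to disk
```

Parsing from a string keeps the sketch self-contained; with real data, the file contents would be read from each subject's stats file first.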

VolBrain-HIPS
VolBrain-HIPS is a patch-based segmentation method for high-resolution hippocampal subfields. It has been validated and uses two publicly available segmentation protocols, different from that of FreeSurfer, based on manually segmented ex vivo datasets (44,45).
Both hippocampal segmentation protocols are available in volBrain-HIPS; the Winterburn atlas, which defines 5 subregions, was used for this work because it is more similar to the FreeSurfer v6.0 definition than Kulaga-Yoskovitz. VolBrain-HIPS is based on the combination of MOPAL (46), a multi-contrast extension of the OPAL (47) patch-based label fusion segmentation method, and a novel neural network-based error corrector. The adaptation of MOPAL, a patch-matching segmentation method, produces fast and accurate T1 brain segmentations. The approach also performs well on mono-contrast T1w or T2w images and on standard-resolution images from clinical practice, which are upsampled using the LASR (48,49) super-resolution method. The HIPS method also includes an error-correction post-processing step, based on a boosted ensemble of neural networks, that is intended to minimize systematic segmentation errors. It works in a fully automated manner, providing accurate results that outperform state-of-the-art methods such as MAGeT (50), ASHS (51), and SurfPatch (52), which usually require extended periods of computing time. VolBrain-HIPS takes <20 min and performs fast segmentation as well as subject-specific library registration that only requires estimating one non-linear registration over small regions to translate the whole library to the case to be segmented.
Finally, an online report is generated and results are plotted as absolute or percent values adjusted for intracranial volume against a normal reference standard for each anatomical region. Segmentation images can also be downloaded for evaluation purposes.
The same T1 volumetric images used for FreeSurfer v6.0 were uploaded to VolBrain-HIPS for this analysis, using the Winterburn atlas definition for controls and patients (45). The absolute values (mm3) and the values normalized to percent of brain volume from the final report were recorded for analysis.
To study structural changes in the brain without any bias, we used FreeSurfer v6.0 metrics, specifically parcels of cortical thickness and volumes of all the cerebral structures, in combination with machine learning methods based on Random Forest Classifiers (RFC) (60). This process was based on the implementation of an automatic classification algorithm to evaluate group discrimination performance considering the morphometric contribution of whole-brain structures as independent features, without any a priori consideration. The selection of RFC was based on several premises: (i) We were interested in considering linear and, more importantly, non-linear relationships between all the features. (ii) As the number of samples was relatively low (although it is high for this type of study), parameter tuning should be an optional step. (iii) The interpretability of the relevant features in the classification should be clear. Given these conditions and the experience of the research team, we selected RFC as the most suitable algorithm for the analysis (61)(62)(63).
Preprocessed features of cortical-subcortical volumes and cortical thickness normalized to estimated total intracranial volume (eTIV) were analyzed via a progressive feature elimination (PFE) procedure (64) with a Monte Carlo cross-validation scheme (33). Briefly, we performed 30 shuffle-splits in which we randomly selected 80% of the samples (with balanced classes) to train the RFC and the remaining 20% for testing, to optimize the accuracy of the RFC by varying the number of features from all to a single one according to their relevance for classification. RFC quantifies a feature's importance depending on how much the average Gini impurity index decreases in the forest due to its use as a node in a tree (65). We used this score to progressively eliminate features by removing the feature with the lowest importance at each iteration. Finally, we kept the first N features in the ranking, where N is the optimal number of features such that using more than N features fails to improve the classifier's performance.
The optimal number of features was selected visually by identifying the smallest number of features at which accuracy reached a plateau. We used this fixed number of features to compute the accuracy, the confusion matrix, and the ROC curve, and to obtain each subject's probability of being in each group (HC, left HS, and right HS).
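The elimination loop described above might look like the sketch below. It is an illustration under stated assumptions rather than the authors' code: the data are synthetic, `progressive_feature_elimination` and the `region_*` names are hypothetical, class balance is approximated with scikit-learn's `StratifiedShuffleSplit`, and importances are summed over the splits before the weakest feature is dropped (the text does not say how importances were aggregated). The demo call uses far fewer splits and trees than the 30 splits and 2000 trees reported, only to keep it fast.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedShuffleSplit


def progressive_feature_elimination(X, y, names, n_splits=30, n_trees=2000, seed=0):
    """Drop the least important feature one at a time, scoring each subset by MCCV."""
    remaining = list(range(X.shape[1]))
    history = []
    while remaining:
        splitter = StratifiedShuffleSplit(
            n_splits=n_splits, test_size=0.2, random_state=seed
        )
        accuracies = []
        importance_sum = np.zeros(len(remaining))
        for train_idx, test_idx in splitter.split(X, y):
            forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
            forest.fit(X[train_idx][:, remaining], y[train_idx])
            accuracies.append(forest.score(X[test_idx][:, remaining], y[test_idx]))
            # feature_importances_ is the impurity-based (Gini) importance.
            importance_sum += forest.feature_importances_
        history.append(
            {
                "n_features": len(remaining),
                "accuracy": float(np.mean(accuracies)),
                "features": [names[i] for i in remaining],
            }
        )
        # Remove the feature with the lowest accumulated importance.
        remaining.pop(int(np.argmin(importance_sum)))
    return history


# Synthetic stand-in for the morphometric feature table (118 subjects, 8 features).
X, y = make_classification(
    n_samples=118, n_features=8, n_informative=3, n_redundant=0, random_state=0
)
names = [f"region_{i}" for i in range(X.shape[1])]
history = progressive_feature_elimination(X, y, names, n_splits=5, n_trees=50)
```

Plotting `accuracy` against `n_features` from `history` gives the curve on which the plateau would be read off visually.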
We implemented this processing framework to perform three classifications: (i) a binary classification to discriminate HC and HS; (ii) a binary classification to discriminate left HS and right HS; and (iii) a multiclass classification to discriminate HC, left HS, and right HS. For each classification, we obtained the optimal number of features, the list of selected features, and the classification performance metrics (accuracy, confusion matrix, and ROC curves). Asymmetry metrics were not included in these analyses because RFCs consider the relationship between features, and therefore the asymmetry between hemispheric regions was indirectly taken into account.
These analyses were performed with the RFC implemented in Python's scikit-learn package, with a fixed number of trees (2000) and the recommended number of features (P) in each split, where P is the square root of the full set of features. The maximum depth of each tree was not restricted a priori, i.e., nodes were expanded until all leaves were pure or until all leaves contained fewer than two samples.
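In scikit-learn terms, the configuration described above could be written roughly as follows. This is a sketch on synthetic data, not the study's pipeline: `build_forest` is a hypothetical helper, a single stratified 80/20 split stands in for the full cross-validation, and the demo uses 200 trees instead of 2000 to run quickly.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split


def build_forest(n_trees=2000, seed=0):
    """Random forest matching the settings described in the text."""
    return RandomForestClassifier(
        n_estimators=n_trees,   # fixed number of trees
        max_features="sqrt",    # P = sqrt(number of features) tried at each split
        max_depth=None,         # trees grow until leaves are pure ...
        min_samples_split=2,    # ... or hold fewer than two samples
        random_state=seed,
    )


# Synthetic binary problem standing in for HC vs. HS.
X, y = make_classification(
    n_samples=118, n_features=20, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

forest = build_forest(n_trees=200)
forest.fit(X_train, y_train)

predicted = forest.predict(X_test)
probabilities = forest.predict_proba(X_test)  # per-subject class probabilities
accuracy = accuracy_score(y_test, predicted)
confusion = confusion_matrix(y_test, predicted)
auc = roc_auc_score(y_test, probabilities[:, 1])
```

For the three-group problem (HC, left HS, right HS), `predict_proba` would return one column per group, and `roc_auc_score` would need its `multi_class` argument set.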

Statistical Analysis
Results were analyzed independently for each method using the Statistical Package for Social Sciences SPSS (Version 23; IBM, Armonk, New York). Mean volumes and 95% confidence intervals (CI) were calculated for each hemisphere. Receiver operating characteristic analyses were used to obtain optimal sensitivity, specificity, and 95% CI for left and right hippocampal sclerosis patients. Absolute values were normalized to TIV and used for group comparison and correlation tests, since TIV was previously described as the most significant covariate to be considered (25). Normalization was performed using the following expression:

\[ \text{normalized \%TIV}_{\text{subject}} = \frac{\text{absolute volume}_{\text{subject}}\ (\text{cm}^3) \times 100}{\text{TIV}_{\text{subject}}} \]
Both methods implement different atlas definitions and strategies to quantify TIV, thus precluding a direct comparison between absolute values.
Asymmetry degree was analyzed as an independent measure representing the difference between right and left regions divided by their mean (in percent) as implemented in vB and used in previous reports (25). Thus, positive values represent greater volumes on the right side.
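As a worked illustration of the asymmetry definition above (right minus left, divided by their mean, in percent), a minimal sketch follows; the function name and the example volumes are hypothetical.

```python
def asymmetry_percent(right: float, left: float) -> float:
    """Asymmetry degree: (right - left) divided by the mean of both sides, in percent."""
    mean_volume = (right + left) / 2.0
    return 100.0 * (right - left) / mean_volume


# Hypothetical volumes in cm3: a larger right side gives a positive value.
rightward = asymmetry_percent(right=3.3, left=2.7)   # about +20
leftward = asymmetry_percent(right=2.7, left=3.3)    # about -20
symmetric = asymmetry_percent(right=3.0, left=3.0)   # 0
```

Because the difference is divided by the mean of the two sides, the index is the same whether volumes are given in cm3 or as percent of TIV for a given subject.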
Nominal variables were compared using the chi-square test. Paired-sample t-tests (right vs. left) and ANCOVA (between groups) were used for normally distributed scalar variables, adjusted for age and sex. Correlations were tested using the two-tailed Pearson coefficient controlling for age and sex. The significance level was adjusted for the effect of multiple comparisons using Bonferroni correction when appropriate. To test the difference between HS sides in the group of patients, an ANCOVA test was calculated on z-scores computed for each region using the following formula: Age, sex, and clinical characteristics of epilepsy were included in the analysis as covariates.
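The ROC analysis in this section was run in SPSS; for readers working in Python, a roughly equivalent sketch is given below. The text does not state how the "optimal" sensitivity and specificity were chosen, so the use of Youden's J statistic (sensitivity + specificity − 1) here is an assumption, as are the function name and the toy scores.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve


def optimal_cutoff(labels, scores):
    """AUC plus the cutoff that maximizes Youden's J (an assumed criterion)."""
    fpr, tpr, thresholds = roc_curve(labels, scores)
    best = int(np.argmax(tpr - fpr))
    return {
        "auc": float(roc_auc_score(labels, scores)),
        "cutoff": float(thresholds[best]),
        "sensitivity": float(tpr[best]),
        "specificity": float(1.0 - fpr[best]),
    }


# Toy data: 0 = control, 1 = patient; scores mimic absolute asymmetry in percent.
labels = np.array([0, 0, 0, 1, 1, 1])
scores = np.array([1.0, 2.0, 3.0, 10.0, 12.0, 15.0])
result = optimal_cutoff(labels, scores)
```

Scores at or above the returned cutoff are classified as patients; when lower values indicate disease (for example, ipsilateral volume), the scores would be negated before calling the function.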

RESULTS
After correction for TIV, no significant correlation was found between age or sex and hippocampal or subfield volumes (p > 0.05) in controls or patients. Controls and patients were matched according to age and sex, with a female predominance (controls 44f/17m, right HS 15f/7m, and left HS 17f/18m) that did not differ significantly between groups (p = 0.062). Groups did not differ in age (p = 0.495): control subjects had a mean age of 32 (18-62y), right HS patients 33 (21-64y), and left HS patients 34 (19-52y).
No correlation was found between clinical features of epilepsy and hippocampus or subregion volumes.

Hippocampal Results
Estimated hippocampal volume and 95% confidence interval (CI) for controls on the right side were 3,454 cm3 (3.  (48-58.3) for vB. Hippocampal volumes ipsilateral to the HS side were significantly reduced compared with controls and also with the non-lesional side of right and left HS groups (p < 0.001). Additionally, the right hippocampus was larger in left HS patients than in HC (FS, p = 0.022) (see details in Tables 1 and 2).
Hippocampal asymmetry was the most reliable indicator for accurate classification between HC and right and left HS, with an AUC of 1 for vB (measured in cm3 and in brain percent) and AUCs of 0.998 (using cm3) and 0.977 (in brain percent) for FS. Optimal sensitivity and specificity were also calculated using hippocampal volumes, with high accuracy (AUC: 0.914–0.993) for patient classification. Detailed results are specified in Figure 2 and Table 3.
To specifically account for atrophy differences between HS sides, z-score volumes for each hippocampus were compared, and no significant differences were found (p = 0.692 for FS and p = 0.768 for vB).

Results for Hippocampal Subfields
The mean volume and 95% CI estimates of hippocampal subfields for HC and patients are detailed in Tables 1 and 2. In the HC group, a significant rightward asymmetry of hippocampal subfields was recognized for CA1, CA2-CA3, and CA4-DG (in vB) and for CA1, CA3, CA4, molecular layer, hippocampal fissure, and GC-ML of DG (in FS). Leftward lateralization was recognized for the subiculum (vB) and pre-subiculum (FS) subregions (see details in Tables 1 and 2).
All subregions on the ipsilateral side of HS patients showed significant volume reduction compared with HC using vB, and most subfields were also reduced using FS, except for the right (p = 0.446) and left (p = 0.140) HATA, right and left fissure (p = 1), and right ipsilateral fimbria (p = 0.849).
The only subregion with a significant volume difference between sides of the affected hemisphere in patients was CA2-CA3 (p = 0.024) for the group of right HS patients (observed in vB). Accordingly, ipsilateral to the sclerotic side, CA2-CA3 (vB) and CA3 (FS) subfields in left HS patients were less atrophic than any other cornu ammonis division. The most atrophic subfields ipsilateral to the sclerotic side were CA4, GCMLDG, and molecular layer for FS, and SLSRSM for vB, in both right and left HS patients. See details in Figure 3.

Validation With the Automatic (Random Forest) Classifier
Our supervised machine learning validation process disclosed anatomical regions that were restricted to hippocampal subregions as the most relevant features to discriminate between patients and HC. In other terms, non-hippocampal regions were not identified as relevant for the classification.
The classifier was able to discriminate between controls and patients with a high accuracy in the three main classifications we performed: the classification between HC and patients

DISCUSSION
In this work, we define reference volumetric values and confidence intervals for the hippocampus and hippocampal subfields using two commonly available approaches in a small community-based sample of healthy adults from Buenos Aires, Argentina. This is a limited sample but an important contribution to the field, given the scarce research literature on brain morphometric variations available in Latin America (66)(67)(68).
Since population variability in brain morphometric estimates is being increasingly reported (14, 69–72), it is important to consider the possibility of innate differences for adequate interpretation of MRI volumetry. Several methods provide quantification of brain structures from MRI data, including freely available software and online processing services that usually report adjusted values considering total intracranial volume, age, and sex as covariates. Unfortunately, wide variability exists in the methodology employed, which impairs appropriate comparison of results between different techniques. Results are usually matched against a mixture of publicly available databases of normal subjects that may not entirely account for variation among populations. Thus, the absence of local references for normal and pathologic hippocampal volumes may also be a challenge for non-neuroimaging experts.
In this work, we report volumes of hippocampal structures and subregions that are specific for two different methods, evaluating patients from Latin America. The proposed reference values are intended to clarify the results obtained using two different methodologies, which are based on unequal anatomical definitions, and therefore the resulting scores cannot be directly used for cross-comparisons (see details in Figure 1). We calculated mean volumes, confidence intervals, and cutoff estimations to recognize a regional sample of patients with confirmed unilateral mesial sclerosis and temporal lobe epilepsy with high sensitivity and specificity. Hippocampal asymmetry degree was the most accurate measure for classification regardless of the volumetry method used, as previously reported by others (3, 23, 31).
Our results agree with previous reports supporting rightward asymmetry for whole hippocampal volume, which is present not only in HC but also in other animal species (73).
Interestingly, as recently reported (74), some hippocampal subregion volumes in our study were leftward lateralized in HC including the subiculum and pre-subiculum, the former based on volBrain and the latter on FreeSurfer. This discrepancy probably represents similar findings observed in overlapping areas related to known differences in atlas definitions (75) (see Figure 1).
Contrary to previous findings (31), our results did not show any significant correlation between hippocampus volume and its subfields with clinical features of epilepsy.
Few studies have focused on assessing differences in subregion atrophy between HS sides based on imaging data. We found a specific volume reduction of CA2-CA3 (vB) in right HS patients, with partial preservation in left HS patients. Future investigation using adequate methodology and involving a greater number of participants may confirm our findings. A distinctive pattern of modifications can be expected from left and right HS, which is not usually considered in histopathology research, probably supporting differences in functional abilities (76-80).
To our knowledge, only one published study directly addressed asymmetry differences between hippocampal subregions among left and right HS patients using FS v6.0 (81). The authors found reduced volumes contralateral to the side of HS for the presubiculum, HATA, and TAIL subfields. Unfortunately, known constitutional asymmetries present in HC (74), which could influence the results as in our analysis, are not usually considered.
Another recent study used an approach similar to ours (but based on manual segmentation) and found greater (rather than reduced) volume of left subiculum (contrary to our findings) in right HS participants (32). Additionally, the authors also showed significant reduction of ipsilateral CA1 subfield compared against any other subregion on the sclerotic side.
An interesting observation from our analysis is a trend toward larger volumes of mesial-temporal structures contralateral to the side of HS in patients compared with HC. Diverse hippocampal subfields, and also the hippocampus (FS), in the right (non-lesional) hemisphere of left HS patients support this assumption, showing significantly greater volumes than the same regions in healthy controls (Table 2). We should stress that in clinical practice the interpretation of hippocampal volumetry alone may not adequately identify some confirmed cases (∼10%) with compatible clinical and paraclinical findings of HS, which may only show subtle signal intensity changes on T2/FLAIR images (5,82). Furthermore, it is important to note that a small group (∼20%) of confirmed temporal lobe epilepsy patients without abnormal MRI findings will be postoperatively classified as "Gliosis only" without hippocampal sclerosis based on histopathology (83), showing no evidence of neuronal loss or hippocampal volume reduction.
Supplementary functional imaging examinations are useful for diagnosis in temporal lobe epilepsy with HS and unremarkable MRI findings that may preserve normal hippocampal volumes. Interictal FDG-PET (2-[18F]-fluoro-2-D-deoxyglucose positron emission tomography) is a relatively widely available neuroimaging modality with high sensitivity (∼80%) to disclose abnormal cortex hypo-metabolism in temporal lobe epilepsy (84,85). Importantly, about 20% of patients with confirmed hippocampal sclerosis and normal MRI
FIGURE 3 | Differences in hippocampal subregion atrophy; comparison of Z-scores between HS sides. Mean Z-score volume comparison between left and right HS; obtained from vB (A) and from FS (B). *Significant for ANCOVA test between groups; Bonferroni corrected (p < 0.05) adjusted for age, sex, and epilepsy characteristics. Whiskers represent 95% confidence interval.
Although great progress has been made in recent years in the preoperative diagnosis of HS using non-invasive methods, a considerable group of patients (20-40%) will fail to achieve complete seizure freedom after surgery (88,89), even following appropriate medical practices in experienced epilepsy centers. A recognized limitation of our study is the absence of histopathology information on the recent standardized ILAE classification for HS subtype (ILAE HS I-III) (90), which could have allowed us to correlate volumetry findings with specific subfield anomalies. Nevertheless, some controversies remain concerning the role of histopathologic classification in predicting clinical evolution in HS patients and also regarding the feasibility of MRI-histopathology correlations, which are limited by the amount of brain sample available for examination. Additional benefits of MRI volumetry include the ability to examine the entire length of the sclerotic hippocampus and its contralateral homologue and also to consider inherited asymmetries for comparison. Another caveat of this study is its relatively small sample size and the uncertain accuracy of automated segmentation methods when quantifying structures in an atrophic hippocampus. Some studies suggest that manual tracing methods may provide more accurate volumetric measurement than automated segmentation, especially in cases of HS (91,92). However, validation results from the FreeSurfer v6.0 developers indicate that subfield volumes still carry useful information, even when T1 images display limited contrast at the internal subregion boundaries (75). Equivalent methodology was also successfully implemented in previous studies on cognitive function and epilepsy (3, 4, 22, 23, 91, 93) with satisfactory results.
The colored boxes are the features which were selected by the progressive feature elimination procedure. The feature importance value was normalized with respect to the trivial importance level 1/N, where N is the number of features; that is, at the trivial level all the features have the same importance. Whiskers represent 95% confidence interval; small rhombuses indicate outliers.
Contrary to previous observations supporting a fundamental role for cortical mesial-temporal regions, our machine learning-based validation process using an automatic algorithm failed to identify non-hippocampal structures such as the thalamus, temporal pole, fornix, or mammillary bodies as relevant for group classification. It should be stressed that the abovementioned structures, and others known to be involved in HS patients, may have been wrongly left unrecognized as important because of the superior performance of hippocampal and subregion metrics in the tradeoff between accuracy and number of analyzed features. Moreover, non-hippocampal anomalies preferentially involve white matter tracts (94,95) and are usually related to prolonged epilepsy duration or high seizure frequency, which were not considered in our validation process.
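The normalization of feature importance against the trivial level 1/N, where N is the number of features, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the feature names and raw importances are hypothetical, and the simple "above 1.0" filter stands in for, and is not equivalent to, the progressive feature elimination procedure used in the study.

```python
def relative_importance(importances):
    """Rescale importances so that 1.0 equals the trivial level 1/N.

    Importances are first normalized to sum to 1, then divided by 1/N,
    which is the same as multiplying each value by N / total.
    """
    n = len(importances)
    total = sum(importances)
    if n == 0 or total <= 0:
        raise ValueError("need at least one feature with positive total importance")
    return [value * n / total for value in importances]


def above_trivial(names, importances):
    """Return the names of features whose relative importance exceeds 1.0."""
    scaled = relative_importance(importances)
    return [name for name, s in zip(names, scaled) if s > 1.0]


# Hypothetical feature names and raw importances, illustration only.
names = ["asymmetry", "left_hippocampus", "thalamus", "temporal_pole"]
raw = [0.5, 0.3, 0.1, 0.1]
selected = above_trivial(names, raw)
```

On this scale a value of 1.0 means a feature contributes no more than it would if all features were equally informative, so a strongly dominant hippocampal metric necessarily pushes the remaining features below that level.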
In conclusion, hippocampal anatomical structures are the most relevant features to recognize HS patients, as confirmed by an automatic classification based on RFC. The local reference values proposed for hippocampal volumes and subfields may prove a useful guide for diagnosis in adult patients with temporal lobe epilepsy and suspected HS, particularly for non-specialized radiologists.
Providing normal hippocampal reference values is a significant contribution to future studies focusing on regional morphometric variations in Latin America.
Finally, our results are also important for the interpretation of studies reporting hippocampal subfield volumes based on different atlases, which may show noticeable differences even when the same anatomical labels are used (96-98).

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Research and Bio-ethics review board at Hospital El Cruce, Carlos N. Kirschner. The patients/participants provided their written informed consent to participate in this study. Written informed consent was obtained from the individual(s) for the publication of any potentially identifiable images or data included in this article.

AUTHOR CONTRIBUTIONS
JP: study design, writing, statistical analysis, and quality revision. SK: study design and manuscript editing. PS: patient selection, clinical data, and follow-up. AN: patient selection, clinical data, and demography curation and analysis. SC: statistical analysis, data pre-processing, and quality revision. AD: data processing, analysis, classification algorithm, and manuscript writing and editing. GP: classification algorithm and manuscript writing and editing. JV: data pre-processing, analysis, quality revision, and manuscript editing. MV-A: data pre-processing, analysis, quality revision, and manuscript editing. PD-K: study design, data processing, analysis, classification algorithm, and manuscript writing and editing. All authors contributed to the article and approved the submitted version.