Ultrasonographic Estimation of Total Brain Volume: 3D Reliability and 2D Estimation. Enabling Routine Estimation During NICU Admission in the Preterm Infant

Objectives: The aim of this study is to explore if manually segmented total brain volume (TBV) from 3D ultrasonography (US) is comparable to TBV estimated by magnetic resonance imaging (MRI). We then wanted to test 2D based TBV estimation obtained through three linear axes which would enable monitoring brain growth in the preterm infant during admission. Methods: We included very low birth weight preterm infants admitted to our neonatal intensive care unit (NICU) with normal neuroimaging findings. We measured biparietal diameter, anteroposterior axis, vertical axis from US and MRI and TBV from both MRI and 3D US. We calculated intra- and interobserver agreement within and between techniques using the intraclass correlation coefficient and Bland-Altman methodology. We then developed a multilevel prediction model of TBV based on linear measurements from both US and MRI, compared them and explored how they changed with increasing age. The multilevel prediction model for TBV from linear measures was tested for internal and external validity and we developed a reference table for ease of prediction of TBV. Results: We used measurements obtained from 426 US and 93 MRI scans from 118 patients. We found good intra- and interobserver agreement for all the measurements. US measurements were reliable when compared to MRI, including TBV which achieved excellent agreement with that of MRI [ICC of 0.98 (95% CI 0.96–0.99)]. TBV estimated through 2D measurements of biparietal diameter, anteroposterior axis, and vertical axis was comparable among both techniques. We estimated the population 95% confidence interval for the mean values of biparietal diameter, anteroposterior axis, vertical axis, and total brain volume by post-menstrual age. A TBV prediction table based on the three axes is proposed to enable easy implementation of TBV estimation in routine 2D US during admission in the NICU. 
Conclusions: US measurements of biparietal diameter, vertical axis, and anteroposterior axis are reliable. TBV segmented through 3D US is comparable to MRI estimated TBV. 2D US accurate estimation of TBV is possible through biparietal diameter, vertical, and anteroposterior axes.


INTRODUCTION
Very low birth weight infants (VLBWI) are a population at high risk for cognitive, motor, neurosensory, and behavioral disability (1). These sequelae are associated with findings of brain injury and/or impaired growth of different brain structures (2).
Routine neonatal brain imaging via ultrasound (US) and magnetic resonance imaging (MRI) is usually evaluated through visual qualitative assessment of the images, which leads to loss of information and variability among observers (3)(4)(5)(6). To improve reproducibility in the assessment of brain growth, composite and global scores have been developed that combine subjective items and objective measures. These composite scores incorporate quantitative measurements of different structures such as corpus callosum thickness, lateral ventricle width, biparietal diameter, cerebellar height and diameter, subarachnoid space dimensions, and interhemispheric distance (7)(8)(9). While these scores are undoubtedly useful for the neonatologist, an approach to US total brain volume estimation could provide new useful information.
A global reduction in brain volume can be visualized through these neuroimaging techniques and, when severe, it has been shown to be associated with adverse neurodevelopmental outcome (10). However, a more subtle tissue loss can be overlooked and prognostic information can therefore be inaccurate. This tissue loss could be hypothesized to occur in the context of brain dysmaturation, an increasingly recognized problem in the preterm infant that leads to impaired myelination and delayed cortical development (11). Despite survival of the extreme preterm infant without an increase in the prevalence of severe brain injury, the long-term outcome is still compromised in up to 70% of preterm infants. A better approach to the evaluation of the brain growth pattern during admission in the NICU could help in the identification of this dysmaturation process. While some researchers have proposed different linear measurements as a proxy for early measurement of brain volume (12,13), these studies have not compared the estimated TBV with manual segmentation in 3D US or with MRI-estimated TBV, which could be considered the gold standard. Moreover, longitudinal estimation of brain volume through US has not been systematically tested, as most studies rely on MRI for volumetric assessment of brain structures.
3D US allows acquisition of the whole brain and navigation in three orthogonal planes, which can improve orientation and symmetry of two-dimensional views (14). Furthermore, it enables a volumetric approach to the brain, which could lead to longitudinal assessment of brain growth during the neonatal period and facilitate assessment of the impact of neonatal comorbidities on early brain growth.
The aim of this study is to determine the reproducibility of total brain volume estimated through 3D US and the accuracy of this estimation when compared to the gold standard MRI technique.
As 2D US is the standard of care, we also wanted to analyse whether three axes, one in each orthogonal plane of the brain, could provide a good estimate of TBV both in MRI and 3D US. We hypothesize that, if feasible, this estimation could be implemented in the routine US evaluation during admission.

Patients
This study is part of a longitudinal cohort that includes VLBWI born at Hospital Puerta del Mar, Cádiz, Spain as of May 2018 with recruitment still ongoing. The study (PI0052/2017) aims to investigate the association of pre- and perinatal factors, brain growth and brain injury, and socioeconomic status with long-term neurodevelopmental outcome in the preterm infant. We consecutively enrolled VLBWI who met inclusion criteria (weight at birth equal to or less than 1,500 grams, gestational age at birth equal to or less than 32 weeks of gestation) and whose parents or legal guardians had signed informed consent. Exclusion criteria consisted of congenital and chromosomal anomalies, metabolic disorders and central nervous system infections. For the purpose of this study, we included those who were born from May 2018 to December 2019. We further excluded those with abnormal brain US or MRI findings (any degree of germinal matrix/intraventricular hemorrhage and/or white matter injury). Perinatal data and details of the infants' clinical course were prospectively collected. All patients were followed prospectively and underwent weekly cranial US until either discharge or term-equivalent age. We strived to perform two MRIs: one early scan, done as soon as the patient was clinically stable, and another one at term-equivalent age. On the same day the MRI was done, we performed a 3D US as per protocol to enable comparison of both neuroimaging tools.

2D and 3D US
Weekly cranial 2D US and 3D US were performed with the infant lying supine with his or her head turned to the right. Volume acquisition was carried out through the 4D option in the 3D/4D Voluson i portable ultrasound system (GE Healthcare) as previously described by our group (14,16). Through this option, with the transducer positioned in the third coronal plane, the beam moves from anterior to posterior planes using a center frequency of 6.5 MHz with a scan angle set at 90°. Scans were saved and analysis was performed off-line by using 4D View software (version 17.0; GE Healthcare).

US and MRI Measurements
The same linear measurements were performed in MRI and US searching for the most similar plane in both techniques following the same anatomical landmarks. Parenchymal biparietal diameter (BPD), anteroposterior (AP) axis and vertical axis were measured in millimeters, and total brain volume (TBV) was measured in cubic centimeters.
BPD has been extensively studied and we have applied the same anatomical landmarks, measuring in the third coronal plane the maximal distance from side to side of the parietal cortex (9) (see Figure 1).
The anteroposterior and vertical axes were newly defined for this study, with the aim of making them easy and reliable to measure and of diminishing variability among observers. A mid-line sagittal view was obtained ensuring the contours of the vermis were clearly seen, with an orientation enabling the anterior vermis contour to be followed as an imaginary vertical line. A horizontal line was then drawn at the level of the inferior vermis contour. This line was considered to be the inferior limit of the vertical axis. The vertical axis was then drawn, starting at the cerebral cortex, visible just below the transducer as a narrow echogenic line bordered by a broader line of lower echogenicity, and representing the maximum distance between the cerebral cortex and the inferior limit. The anteroposterior axis was always drawn after the vertical axis was defined, to ensure a 90-degree angle between both axes, and was the maximum horizontal distance from the frontal to the occipital cortex. When defining the cortex was difficult, the inner bony mantle was used (Figure 2).
3D US TBV estimation was performed through manual segmentation using the Virtual Organ Computer-Aided Analysis (VOCAL) method (GE Healthcare), which allows manual contouring of consecutive 2D US planes with a 30° rotation angle and a final 3D rendering (14,17).

Statistical Analysis
The intra- and interobserver reliabilities were assessed by two observers (S.L.F. and M.L.G. for MRI scans and I.B.F. and E.R.G. for US scans) blinded to clinical information. Based on previous studies by our group (16), we estimated a necessary sample size ranging from 12 to 22 to detect relevant differences for different measures within and among both techniques. All linear measurements were repeated three times to evaluate intraobserver variability. To control for memory effects potentially biasing subsequent measurements, each observer performed all the measurements for all patients prior to starting again with the first one, with a 24-h interval between repeat measurements of the same subject, and recorded them while blinded to previous measurements. The interobserver reliability was evaluated by comparing the mean of the three measures performed by each observer.
The intraclass correlation coefficients were calculated by using the two-way random model for absolute agreement and interpreted according to the strength of agreement scale by Brennan and Silman (18). Furthermore, Bland-Altman (BA) analysis was also used as it allows a graphical representation of reproducibility and adds value through a complementary quantitative analysis (19).
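The two agreement statistics named above can be illustrated with a minimal sketch. It assumes the single-measure form of the two-way random, absolute-agreement ICC (the text does not state whether single or average measures were used) and the conventional 1.96 SD limits of agreement for the Bland-Altman analysis. The function names and the example values are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def icc_absolute_agreement(ratings):
    """Single-measure ICC, two-way random model, absolute agreement.

    ratings: list of rows, one row per subject, one column per observer.
    Assumption: single-measure form; the study may have used another form.
    """
    n = len(ratings)          # number of subjects
    k = len(ratings[0])       # number of observers
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired measurements (mm) from two techniques
us_values = [70.1, 72.4, 68.9, 75.0, 71.3]
mri_values = [70.8, 72.0, 69.5, 75.9, 71.0]
icc = icc_absolute_agreement([list(p) for p in zip(us_values, mri_values)])
bias, lower, upper = bland_altman(us_values, mri_values)
print(f"ICC = {icc:.3f}; bias = {bias:.2f} mm; limits = {lower:.2f} to {upper:.2f} mm")
```

Because the absolute-agreement form penalizes systematic offsets between observers or techniques, a constant shift lowers the ICC even when the ranking of subjects is identical, which is why it is paired here with the Bland-Altman bias.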
We estimated the linear and TBV measurements by postmenstrual age expressing the mean and the population 95% confidence interval values of the mean.
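As a rough illustration of summarizing a measurement by post-menstrual age, the sketch below groups values by completed week and reports the mean with a normal-approximation 95% confidence interval of the mean. This is an assumption made for illustration only: the text does not specify how the intervals were computed, and they may have been derived from a fitted model rather than from per-week groups. The function name and the example records are hypothetical.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def mean_ci_by_week(records, z=1.96):
    """Mean and approximate 95% CI of the mean for each post-menstrual week.

    records: iterable of (pma_weeks, value) pairs.
    Returns {week: (mean, lower, upper)}; bounds are None when a week
    has fewer than two values, since no standard error can be computed.
    """
    groups = defaultdict(list)
    for week, value in records:
        groups[week].append(value)

    summary = {}
    for week in sorted(groups):
        values = groups[week]
        m = mean(values)
        if len(values) < 2:
            summary[week] = (m, None, None)
            continue
        half_width = z * stdev(values) / sqrt(len(values))
        summary[week] = (m, m - half_width, m + half_width)
    return summary


# Hypothetical TBV values (cm^3) at 30 and 31 weeks post-menstrual age
example = [(30, 180.0), (30, 192.0), (30, 186.0), (31, 201.0), (31, 209.0)]
for week, (m, lower, upper) in mean_ci_by_week(example).items():
    print(week, round(m, 1), round(lower, 1), round(upper, 1))
```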
As 3D US is not a standard US tool available in most NICUs, we aimed to facilitate volumetric estimation of TBV based on 2D US measurements. Using mixed effects regression models we compared the prediction of TBV based on linear measurements of the three brain axes made from 3D US and from MRI.
After model estimation, its reliability was further assessed through internal and external validation. External validity of the model was tested by evaluating its loss of prediction (shrinkage) after randomly splitting the sample into a training group and a validation group. The model is considered acceptable if the shrinkage is <10%. Internal validity of the model was assessed through cross-validation. This procedure splits the data randomly into k partitions; for each partition, it fits the specified model using the other k−1 groups and uses the resulting parameters to predict the dependent variable in the unused group. Finally, crossfold reports a measure of goodness-of-fit from each attempt and the mean value for all (R² Mean), which is interpreted as the real predictive ability of the model when performed on external data.
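The two validation procedures described above can be sketched as follows. For brevity the sketch uses a single-predictor ordinary least squares fit rather than the multilevel model with three axes used in the study, and it assumes that shrinkage is defined as the R² in the training group minus the R² obtained when the training coefficients are applied to the validation group. All function names and the synthetic data are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def r_squared(xs, ys, intercept, slope):
    """Coefficient of determination of given coefficients on (xs, ys)."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def split_sample_shrinkage(xs, ys, train_fraction=0.5, seed=0):
    """Assumed definition: R2(training) - R2(validation)."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    cut = int(len(idx) * train_fraction)
    train, valid = idx[:cut], idx[cut:]
    b0, b1 = fit_line([xs[i] for i in train], [ys[i] for i in train])
    r2_train = r_squared([xs[i] for i in train], [ys[i] for i in train], b0, b1)
    r2_valid = r_squared([xs[i] for i in valid], [ys[i] for i in valid], b0, b1)
    return r2_train - r2_valid


def kfold_mean_r2(xs, ys, k=5, seed=0):
    """Mean held-out R2 over k random partitions."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        b0, b1 = fit_line([xs[i] for i in train], [ys[i] for i in train])
        scores.append(
            r_squared([xs[i] for i in held_out], [ys[i] for i in held_out], b0, b1)
        )
    return sum(scores) / k


# Synthetic data: a noisy linear relation, for illustration only
rng = random.Random(1)
axis_mm = [60 + i for i in range(40)]
volume_cm3 = [2.5 * x - 50 + rng.gauss(0, 5) for x in axis_mm]
print("shrinkage:", split_sample_shrinkage(axis_mm, volume_cm3))
print("mean cross-validated R2:", kfold_mean_r2(axis_mm, volume_cm3))
```

A small shrinkage and a mean held-out R² close to the R² of the full model both indicate that the fitted coefficients are not overly tuned to the particular sample.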

Study Population
During the study period, 156 VLBWI were admitted to the NICU at Hospital Puerta del Mar in Cádiz, Spain. We excluded 38 patients with abnormal US findings such as germinal matrix/intraventricular hemorrhage and/or white matter injury. US linear measurements of the three orthogonal axes and 3D US brain volume manual segmentation were performed in each US of all 118 patients, with 426 US measured and a median of 4 US per patient [IQR 2-7].
A brief description of the main perinatal characteristics of the included patients is given in Table 1.
The same measurements were performed in 93 MRIs that were available from 62 patients (31 patients had 2 MRIs). For the reliability analysis we compared US to MRI measurements of 21 randomly selected patients who had their US scans performed within a ±24 h interval from the MRI. For the prediction model of MRI estimation of TBV through the linear measurements we used all the measurements performed in the 92 MRI scans. We estimated the means in the measured US and the 95% confidence interval in our population means of TBV manually segmented in the 426 3D US by post-menstrual age (Table 2 and Figure 3).

Biparietal Diameter
Parenchymal BPD measurements obtained excellent intra- and interobserver reliability indices both in 3D US and in MRI, as measured by ICC.
FIGURE 3 | Scatterplot of total brain volume by post-menstrual age segmented in 3D US, predicted mean and 95% confidence interval of the population means.
We estimated the means in the measured US (425 BPD measurements were obtained of the 426 US) and the 95% confidence interval in our population means of BPD by post-menstrual age (Table 3 and Figure 4).

Anteroposterior Axis
Anteroposterior axis measurements obtained excellent intra- and interobserver reliability indices both in 3D US and in MRI, as measured by ICC, which ranged from 0.94 (95% CI 0.56-0.98) to 0.99 (95% CI 0.99-1.0). When comparing US anteroposterior axis estimation to MRI, we obtained an ICC of 0.83 (95% CI 0.07-0.95) (see Supplementary Material). Using the Bland-Altman method and plotting both US and MRI measurements (see Tab "AP axis" in Supplementary Material), we could see that the US measure of the anteroposterior axis overlaps the MRI-measured anteroposterior axis with a non-relevant mean difference of 3.62 (95% CI 2.14-5.1).
We estimated the means in the measured US (423/426 anteroposterior axis measurements were obtained) and 95% confidence interval in our population means of anteroposterior axis by post-menstrual age (Table 4 and Figure 5).

Vertical Axis
Vertical axis measurements obtained excellent intra- and interobserver reliability indices both in 3D US and in MRI, as measured by ICC, which ranged from 0.87 (95% CI 0.23-0.96) to 0.99 (95% CI 0.99-0.99). When comparing US vertical axis estimation to the MRI vertical axis, we obtained an ICC of 0.75 (95% CI 0.34-0.90). Using the Bland-Altman method and plotting both US and MRI measurements (see Tab "Vertical axis" in Supplementary Material), we could see that the US measure of the vertical axis overlaps the MRI vertical axis measure with a non-relevant mean difference of 1.7 (95% CI 0.44-3.01).
We estimated the means in the measured US (426/426 vertical axis measurements were obtained) and the 95% confidence interval in our population means of vertical axis by post-menstrual age (Table 5 and Figure 6).

Prediction of TBV Through Linear Axis by 3D US and MRI
Once we obtained good reliability of TBV measured by 3D US and MRI and excellent agreement regarding the linear measurements that represent the brain axes according to the three orthogonal planes, we wanted to take a step further to facilitate the TBV estimation from 2D US. To do this we made a predictive model of TBV based on the biparietal diameter, the anteroposterior axis, and the vertical axis. We made a model for US and another one for MRI and compared both models.

US TBV Prediction Based on US Linear Measurements
The relation between the brain axes and TBV was estimated by multilevel analysis, adjusting for repeated measurements. A detailed description of the estimated parameters can be seen in Table 6. The estimated TBV based on the axes, with TBV in cubic centimeters and the axes in millimeters, would follow this equation:
US TBV = −390.9 + 2.5 * BPD + 3.4 * Vertical axis + 2.3 * AP axis
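The equation above can be applied directly to three linear measurements. The following minimal sketch uses the coefficients reported in the text; the function name and the example axis values are hypothetical and serve only to show the arithmetic.

```python
def predict_us_tbv(bpd_mm, vertical_mm, ap_mm):
    """Estimated total brain volume (cm^3) from three US axes in mm.

    Coefficients are those of the US equation reported in the text.
    """
    return -390.9 + 2.5 * bpd_mm + 3.4 * vertical_mm + 2.3 * ap_mm


# Hypothetical measurements (mm), not taken from the study
tbv = predict_us_tbv(bpd_mm=70.0, vertical_mm=60.0, ap_mm=90.0)
print(f"Estimated TBV: {tbv:.1f} cm^3")
```

For these illustrative inputs the terms are 2.5 × 70 = 175, 3.4 × 60 = 204 and 2.3 × 90 = 207, so the estimate is −390.9 + 175 + 204 + 207 = 195.1 cm³.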

MRI TBV Prediction Based on MRI Linear Measurements
In a similar manner the relation between the brain axes and the MRI-TBV segmentation was estimated by multilevel analysis, adjusting for repeated measurements. A detailed description of the estimated parameters can be seen in Table 7. The estimated

Comparison of TBV Predicted by Linear Axis Through US vs. MRI
TBV predicted from the axes in US is reliable compared with that predicted from MRI, with a non-significant mean difference of 17.81 cm³ (Pearson r = 0.983; P = 0.09) (Figure 7). Once we had shown that we could accurately predict TBV in US by measuring the three proposed axes, we wanted to create an easy-to-use table that would further enable TBV use in routine clinical practice. The proposed model, detailed in Table 6 and summarized in the equation US TBV = −390.9 + 2.5 * BPD + 3.4 * Vertical axis + 2.3 * AP axis, was tested for external validation, showing a loss of prediction (shrinkage) of 2.3%. These results suggest its reliability, as shrinkage is <10%. Moreover, we obtained an R² Mean of 0.927 on the internal validation test, which indicates that the model has high predictive ability when applied to external data. We then estimated the predicted TBV based on the three axes and summarized it in an easy-to-use table (see Table 8).

DISCUSSION
Our study shows that monitoring brain growth in preterm infants during early life is feasible through TBV estimation based on 2D US measurement of the three orthogonal axes (BPD, vertical axis, and anteroposterior axis). Linear brain measurements of BPD, vertical axis, and anteroposterior axis are reliable when measured through US and show good agreement with MRI measures. We found, as other authors had previously reported (6), excellent intra- and interobserver agreement for parenchymal BPD. Interestingly, the intra- and interobserver agreement of the anteroposterior and vertical axes is almost perfect in our study. While our results for the anteroposterior axis are consistent with the results obtained in other studies (6,21), we propose a systematic approach ensuring a 90-degree angle between the anteroposterior axis and the vertical axis. Moreover, the proposed vertical axis takes into account the orientation of the sagittal view and the vermis anatomical landmarks, not the foramen magnum as suggested by Graça et al. (12), as we did not want to add the distance of the cisterna magna to a measure of brain growth, given its physiologic variability. This vertical axis could be more feasible for widespread use while still ensuring accuracy and reproducibility (6,9,22). 3D US enables obtaining optimal two-dimensional images of the desired anatomical section in all the explorations performed, while previous 2D ultrasound studies have found that optimal planes were obtained in 67% of the cases (23). This is the first time, to our knowledge, that the inter- and intraobserver agreement of these linear brain measurements has been studied using 3D US. Remarkably, we achieved an overall better intraobserver agreement, which could be attributable to the optimization of anatomical section selection achieved with the use of 3D US. Nevertheless, training and expertise in 3D US could have also contributed, as our group has been working with 3D US for the last decade.
TBV measured by MRI has been associated with different perinatal morbidities (24) and neurodevelopment (25). We have studied the concordance between TBV measured by 3D US and MRI and found excellent agreement between both methods (ICC of 0.98). Moreover, we have shown that TBV prediction based on the three orthogonal axes is reliable for both US and MRI. This indicates that an accurate estimation of TBV can be achieved through three very simple linear measures, and we believe it could lead to a change in clinical practice, since TBV estimation could be introduced as a tool to monitor brain growth of the preterm infant during admission. We propose an easy-to-use equation and a table of predicted TBV that would enable implementing TBV in routine neonatal care.
The achieved sample size of patients and neuroimaging techniques has allowed us to describe the observed mean and the 95% confidence interval for population means which could be of further interest in the neonatal brain growth assessment. As 3D US is not widely available we have focused on its utility to make 2D US measurements reliable compared to a 3D US manual segmentation and to both linear and volumetric MRI measures. However, this study reinforces the potential role 3D US has in the NICU which has been previously recognized by other authors (26,27) and by our group (14,16,28,29). 3D allows navigation through the three planes once a whole brain acquisition has been obtained, is faster than 2D US and allows review offline of any possible section of interest instead of having a static 2D image saved (30)(31)(32). We suggest that 3D US routine implementation in the NICU could lead to a whole new approach to the central nervous system in the neonatal period, with a better evaluation of brain growth, maturation, and brain injury (32)(33)(34).
Our study is subject to several limitations that need to be considered. We studied images of VLBWI without brain damage and the anatomic landmarks of these measurements may not be clear in the presence of brain injury; the results should therefore also be validated in a cohort with brain damage. Moreover, only 38 patients (24.3%) were classified as having brain injury, which is a smaller number than expected in a VLBW population and it is therefore possible that we did include infants with the mildest forms of brain injury (both GMH-IVH and white matter injury). However, our study intended to establish the relationship of linear measurements with TBV and cannot be taken as a measure of brain growth. A more detailed study on all known factors that compromise brain growth (comorbidities, sex, gestational age) in a preterm population that includes those with brain injury was beyond the scope of this study but is warranted in the near future by our group. This might help to reach a better understanding of normal brain growth pattern vs. a hypothesized deviated pattern in the sickest preterm infant. We must also acknowledge that, although we used the MRI slices that were closest to the ultrasound images, we are measuring structures acquired with different angulations: US images are obtained in coronal and sagittal planes through the anterior fontanel while MRI axial and coronal planes are used with coronal planes being parallel. Nonetheless, the measurements proposed should not be affected by the angulation difference; moreover, in our study, the use of 3D US has allowed us to select offline those anatomical sections that most closely resemble those obtained with MRI. As this study was performed in preterm infants this methodology would need to be assessed separately in term infants.
In conclusion, we have found that US measurements of BPD, vertical axis and anteroposterior axis are reliable. TBV segmented through 3D US is reliable and accurate compared to MRI measured TBV. When 3D US is not available, 2D US TBV estimation could be achieved through biparietal diameter, vertical and anteroposterior axis which could lead to a better assessment of brain growth in the preterm infant and could potentially be added to routine 2D US in the NICU.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Research and Ethics Committee, Cádiz. Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin.

AUTHOR CONTRIBUTIONS
IB-F and SL-L conceptualized and designed the study, participated in recruitment and performed the measurements, supervised data collection and performed analysis, drafted the initial manuscript, reviewed the manuscript, and approved the final manuscript as submitted. ER-G, ML-G, SL-F, CR-C, PO-D, and YC contributed to data collection and performed the measurements, reviewed the manuscript, and approved the final manuscript as submitted. All authors approved the final manuscript as submitted and agree to be accountable for all aspects of the work.