Deep Learning Over Reduced Intrinsic Domains for Efficient Mechanics of the Left Ventricle

Cardiac mechanics tools can be used to enhance medical diagnosis and treatment, and assessment of risk of cardiovascular diseases. Still, the computational cost to solve cardiac models restricts their use for online applications and routine clinical practice. This work presents a surrogate model obtained by training a set of Siamese networks over a physiological representation of the left ventricle. Our model allows us to modify the geometry, loading conditions, and material properties without needing of retraining. Additionally, we propose the novel concept of intrinsic domain that improves the accuracy of the network predictions by one order of magnitude. The neural networks were trained and tested with numerical predictions from a previously published finite element model of the left ventricle. Different loading conditions, material properties and geometrical definitions of the domain were simulated by the model leading to a dataset of 5, 670 cases. In terms of accuracy and performance, the surrogate model approximates the displacement field of the finite element model with an error of 4.4 ± 2.9% (with respect to the L2-norm of the true displacement field) across all cases while performing computations 62 times faster. Hence, the trained model is capable of computing a passive cardiac filling of the chamber at 10 different time points in just ~0.7 s. These outcomes prove usability of training surrogate models for efficient simulations to facilitate the use of complex mechanical models in clinical practice for therapeutic planning and online diagnosis.


INTRODUCTION
During the last decades, increasingly complex multi-scale computational models of cardiac structure and mechanical function have been developed for improving our understanding of cardiac physiology [1]. These advances coupled with precision medicine have shown great potential for improving diagnosis, guiding therapies, and predicting prognosis in a number of clinical applications [2,3]. However, clinical translation of these models is often hampered by the high computational expense involved in solving the multi-physics PDEs necessary for simulating cardiac mechanical function [4][5][6]. These models often require high performance computing (HPC) infrastructure to generate outputs in a tractable time. For example, personalizing models to patient-specific data, and assessing confidence in model predictions requires large numbers of simulations to be performed. The requirement for Verification, Validation and Uncertainty Quantification (VVUQ) in the modeling process imposes additional computational demands [7,8]. For example, assessing the uncertainties associated with constructing anatomical models from medical images, prescribing loading and boundary conditions, and incorporating microstructural features into the models can make the assessment of even relatively simpler models computationally intractable.
The use of surrogate models or efficient techniques for solving PDEs are promising approaches to reducing the computational burden in solving mathematical models. Numerical methods based on neural networks approximants, particularly multilayer perceptron (MLP), have been proposed by for PDE solving [9][10][11]. Many approaches based on the Galerkin method were proposed where an MLP is employed to combine [10][11][12] or replace classic basis functions [9,13,14]. The later methods provide a meshfree alternative which is especially appealing for high-dimensionality problems and complex geometrical descriptions of the domain of interest. Recently, Raissi et al. [15] introduced a simpler methodology to solve PDEs, in which a neural network is optimized by minimizing the residuals of the equilibrium equations (PDEs) inside the domain (unsupervised learning) and specified initial and boundary values (supervised learning). Further more, Raissi et al. [16] and Raissi [17] explored the discovery of linear and non-linear differential equations, using an MLP to generate data-driven models without any knowledge from the underlying PDEs.
All previous approaches were trained to solve a specific mechanical setup, i.e., a fixed geometry of the domain, material properties, loading and boundary conditions. For any variation in the mechanical setup, the network must be retrained which is time consuming. Recently, Liang et al. [18] proposed a surrogate model to predict the stress distribution of the aorta for patientspecific geometries. This was achieved by encoding the geometry of the domain as a set of parameters using a principal component analysis. Also, Swischuk et al. [19] introduced surrogate models that allow variations of the material parameters by reducing the dimensionality of the output fields using proper orthogonal decomposition (POD) and, then, creates a mapping between inputs and POD coefficient by means of an MLP.
In this work, we present an MLP-based surrogate model of the left ventricle (LV) trained only from a FEM model of the LV, capable of performing predictions for different domain geometries, loading conditions, and material parameters without retraining the network. Also, we proposed the construction of a domain representation-refereed as intrinsic domain-better suited for training the surrogate model, yielding 10 times more accurate results than using the spatial representation of the LV for a given training dataset.
The manuscript is structured as follows. Section 2 presents the mechanical model used to generate the training dataset and the network architectures used to assimilate mechanical models in an efficient manner. Section 3.1 describes the mechanical model setup and training datasets used to evaluate the methods performance. Section 3.3 compares the performance of the proposed architecture when using the traditional domain and the intrinsic domain descriptions. Section 3.5 quantifies the computational burden of the FEM mechanical model and the surrogate model. Section 4 discusses the findings and limitations of the proposed methodology. Finally, Section 5 outlines the conclusions of this manuscript.

Heart Modeling
Kinematics of the LV are simulated using a patient-specific FE model [3]. This involves solving the finite elasticity equilibrium equations during the diastolic phase of the cardiac cycle under an endocardial pressure boundary condition to simulate passive filling of the ventricle. Patient-specific geometrical models of the LV are constructed at the diastasis frame of the cardiac cycle for a range of individuals, which are assumed to be in a load-free configuration. Cubic Lagrange basis functions are used for constructing the FE mesh of the geometry. A typical mammalian description of the myocyte orientation through the LV wall [20] is incorporated into the geometry through a material fiber field. The LV myocardium is modeled as an ideally-incompressible transversely isotropic material by means of the Guccione constitutive model [21] with the following strain energy density function where the c 1 parameter scales the overall stiffness of the myocardium, and c 2 , c 3 , and c 4 control the material anisotropy in the fiber (f), cross-fiber (c), and radial (r) directions, respectively. Incompressibility of the myocardium is enforced through a mixed formulation that uses linear Lagrange basis functions for describing the hydrostatic pressure [22]. Homogeneous Dirichlet boundary conditions are applied on nodes of the FE mesh at the epicardial perimeter of the basal surface of the model. All simulations were performed using the OpenCMISS open-source computational modeling software package [23].

Reference Domain and Dimensionality Reduction
We decompose the load-free LV geometries of a set of individuals using PCA, allowing for a compact description of the LV. Observations x i , i = 1, . . . , n, described by m geometric FE mesh parameters, are mean-centered and concatenated into an m × n data matrix A covariance matrix is then constructed from the mean-centered data matrix and an eigendecomposition is applied as The normalized eigenvectors b 1 ..b n in E are the principal components and the eigenvalues, λ 1 > λ 2 > ... > λ n , in the diagonal matrix D are the variances (σ 2 ) along each principal component. In practice, the covariance matrix is rank deficient, as m > n. The rank of the covariance matrix is further reduced by one because the data matrix is mean-centered. Therefore, there can only be as many non-zero eigenvalues, and meaningful principal components, as the number of shape observations minus one (n − 1). In this analysis, the principal components are efficiently identified using singular value decomposition (SVD) [24]. This is achieved by normalizing the data matrix as such that Y T Y = C X . By performing SVD on Y, we factorize yielding a PCA space with orthonormal principal components, b 1 ..b n , in the columns of V, and the square root of their variances, σ 1 ..σ n , ranked in descending order, in the diagonals of . The weights, g, of the i-th individuals load-free LV geometry evaluated on their corresponding principal components in PCA space are calculated as A reduced space for these geometries can be constructed by dropping the less meaningful components, i.e., the components associated with lower eigenvalues. Thus, we can approximate the i-th load-free LV geometry with k components as where weights g j are computed from Equation [ (7)]. For ease of interpretation, the j-th weight value is normalized using the square root of the variance of its corresponding principal component, i.e.,ĝ The cumulative variance in shape explained with increasing number of principal components can be evaluated aŝ Note thatx can be used as a reference domain where you can recover the load-free LV geometriesx by means of the PCAweights g j using the mapping described by Equation (8) (see Figure 3). We can then fix the geometrical representation of the mean LV shape asx and encode the geometrical variations as the set of parameters g j , j = 1, . . . , k, for a given approximation space of k principal components. For the remainder of the manuscript, we denote the reference domain as PCA , which encapsulates the LV shape through a vector of PCA-weights, g.

Surrogate Network
We define a surrogate model for the FE model defined in section 2.1 by using Siamese neural networks. Specifically, two networks are defined: (i) a boundary network (BN) that learns Dirichlet boundary conditions; and (ii) a domain network (DN) that learns the mechanics of the inner-domain and non-Dirichlet boundary conditions. Both networks share the same parameters (weights and biases) and architecture. The only difference is that each may optimize different terms of the loss function using a different set of inputs. The Siamese network defines a point-wise operator that estimates a field for a given point on the domain. As denoted by their names, BN computes a chosen field for points at the Dirichlet boundary and DN computes the field in the remaining loci of the domain. For our particular application, the Siamese network predicts the displacement field of the LV. Each component of the displacement field is estimated by an independent network, leading to an architecture with 3 Siamese networks. The inputs and outputs of the networks are detailed in Figure 1. Note that each set of inputs fully characterizes the mechanical problem: g = (g 1 , . . . , g k ) defines the geometry of the domain, p defines the loading conditions, c 1 defines the material properties and We choose a loss functions that combines the L 2 -norm error of BN and DN networks by means of a mixture parameter as follows where B b and B d are sets of points (training batches) on the boundary and inside the domain, respectively, u i andũ i are the displacements predicted with the neural network and the FE model, respectively, α is the penalty for the Dirichlet boundary conditions, and B = B b + B d is a given training batch. During training, two batches of the same size are fed in parallel to BN and DN, one with points over the boundary and one with points inside the domain. Then, the loss values from BN and DN are added and a back-propagation process updates the network weights.

Intrinsic Learning Domain
Training networks over a domain that is sparse (see Figure 3, PCA is only defined sparsely through x, y, z in the Cartessian coordinates) can be challenging due to the large degree of uncertainty in the non-sampled (or non-defined) regions. Moreover, as the coordinates are defined in a global Cartesian reference frame, the kinematic description can be too complex and higher-order approximants (or higher-capacity networks) would be required to approximate it. In turn, higher-order approximants will require larger datasets to properly define the unknown parameters through optimization.
To tackle such issues, we ideally want a reference domain in which the solution field is continuous, can be described by lower-order approximants, and where the dataset samples are evenly and contiguously spaced. To obtain that space, while preserving the topology of the geometry, we morph the reference domain to an intrinsic learning domain by means of a diffeomorphic mapping.
In this case, we map the reference domain PCA into a cylindrical representation of the LV learn (see Figure 2). The mapping is performed by stretching the ventricle base to be the outer radius of a cylinder, and fixing the central axis of the cylinder to be the apex of the heart. In this description, the axial direction is the intramural direction. Figure 2 shows the relationship between image, reference, and learning domains and their mappings. Note that by constructing these mapping functions, we can obtain the description of the fields in any domain.

Training Data
To construct the LV representation, we performed PCA over reconstructions of 28 in-vivo MRI sequences discretized by 1200 FE nodal points. We performed a statistical shape analysis on these reconstructions, as detailed in section 2.2. We assessed the accuracy of the PCA approximation space by means of the cumulative varianceσ 2 k introduced in Equation (10) to determine the necessary number of components to generate an acceptable representation of the LV. Figure 4 illustrates the improvement of the representation as more principal components are included into the space. We selected to use the approximation space of 2 principal components, accounting for 78.51% of the variability observed across the geometries. A qualitative interpretation of these two selected components is depicted in Figure 5, where g 1 has an important scaling contribution while g 2 increases the axial curvature and basal tilt of the chamber.  To construct our training and testing datasets, we sampled geometries over the PCA-space and then predicted the displacement field for different loading conditions and material properties by means of our FE model. In this analysis, we considered a reduced PCA space with k = 2 principal components for reconstructing approximate LV geometries using Equation (8). Therefore, only weights w = {w 1 , w 2 } were specified to generate an LV geometry. We extracted equidistant samples of these weights using a step of s = 0.  [25]. This yielded 70 mechanical problems per geometry. Additionally, each geometry was sampled at 1108 domain points, which were equally spaced inside each mesh element. Thus, a total of ∼ 6M samples were obtained from the FE models (see Table 1).
We partitioned the samples as 90% for training and 10% for testing datasets, using the former dataset to train DN. To impose an homogeneous Dirichlet boundary condition at BN, we dynamically generated batches of random points along the boundary, prescribingd = 0 in the BN loss function. These dynamically generated points yielded better results than assimilation of the values at the mesh boundary nodes.
The networks were trained using the ADAM algorithm [26] with the parameters α = 0.01, β 1 = 0.9, and β 2 = 0.999 and a batch size of 5, 000 samples during 500 epochs. During testing, we computed the loss function among all cases in the testing dataset to assess the performance of the network. After finishing all training epochs, we assigned the network weights as those associated with the lowest loss value among all testing epochs. FIGURE 5 | Visualization of LV shape related to the first two principal components of shape variation seen across the population of individuals considered in this study (first and second rows correspond to g 1 and g 2 , respectively). The mean shape is shown in the central column, while the weights corresponding to −2 and 2 standard deviations (ĝ j = ±2, j = 1, 2) are shown together with the mean shape for each principal component in the left and right columns, respectively.

Evaluation Metrics
The error of the network prediction was assessed in terms of absolute and relative L 2 -norm errors. For a single case-i.e., a mechanical problem with a single set of boundary, material, and loading conditions-, the errors in the displacements over the entire mesh are computed as follows where u FEM are the ground truth displacements from the FE model and u DL are the displacements predicted by the network. The adimensionalized quantity ε rel allows a statistical comparison of errors among different loading conditions without biasing toward contributions from the cases with larger displacements.

Training in Reference and Learning Domains
The surrogate network was trained in the reference and learning domains using the same dataset (described in section 3.1) to compare the performance of our method when different representation spaces were employed during training. The sensitivity of the method's accuracy with respect to the network architecture was also analyzed for different numbers of layers and neurons. This involved training networks with 2, 3, and 4 identical layers, each of which consisted of three different cases with 32, 64, and 128 neurons per layer, yielding a total of nine different architectures. The accuracy was quantified in terms of the relative error ε rel . This allowed us to quantify the overall error ratio in predicting the displacement field. The relative error was computed through all cases in the testing dataset (mean and standard deviation of ε rel are reported in Table 2). The mean relative error across all architectures was 10.78 times lower when using the learning domain compared with the reference domain, showing a clear improvement in the learning capabilities by modifying the training domain. Of particular note, the best-ranked model (the one with the lowest µ ε rel ) presented an error of 4.4 ± 2.9%. Interestingly, the relative error has higher sensitivity when displacements are smaller, i.e., lower pressures and larger elasticity values (see Figure 6). In terms of absolute displacement error, the µ ε abs is directly proportional to the displacement, but not through a linear relationship as indicated by ε rel . Hence, the surrogate model favors cases with larger displacements in terms of accuracy. Note that such behavior is a consequence of the selected loss function (11), which penalizes the absolute displacement error during training.

Analysis of the Surrogate Model Error
To further characterize the surrogate model error, we analyzed influences of the geometry, pressure loading, and material properties on the error distribution. In the previous section, we found that the relative errors were higher for mechanical problems with the smaller loads or the stiffer material properties.
To understand the distribution of such errors across the PCAspace, we computed ε rel for the 81 geometries described in section 3.1 and analyzed its distribution for three different materials (with c 1 values for a softer, an intermediate and a stiffer model of myocardium) under the range of physiological pressures during the filling of the LV chamber (see Figure 7). We observed that lower pressures not only increased the mean relative error, but also resulted in a wider spread in its distribution (higher ε rel values were associated with higher g 1 values), indicating that some parts of the PCAspace are more affected by the error than others. A similar effect was observed when increasing the stiffness of the LV. Increasing c 1 led to a wider error distribution with a larger effect in the low pressure regime (as can be observed by comparing relative stretching of the violins for p < 0.4 vs. p > 0.4).
Regarding the spatial distribution of the error, no specific pattern was found, although the errors tend to localize toward the base of the LV (see Figures 8, 9). As mentioned previously, an increase in pressure and decrease in stiffness results in an increment of ε abs but does not change its spatial distribution. Conversely, geometrical variations result in changes in error distribution as expected (see Figure 9). Furthermore, an increment of error and overall displacements is observed asĝ 1 increases which is related to the fact that this principal component scales with the LV size.

Performance Assessment
To highlight the increase in performance, we recorded the wall clock time for the FEM and surrogate models to simulate the LV filling process at 10 different time points. Each simulation (of 10 time points) was repeated 10 times to evaluate the mean performance and its standard deviation. The FEM model was implementation in OpenCMISS (refer to section 2.1 and references within for further details) and executed in an Intel Xeon Gold 6136 CPU at 3.00GHz for

DISCUSSION
This work presents a surrogate model capable of reproducing FEM predictions of a load-free LV mechanical model 62.23 times faster with an overall error of 4.4%. Deeper networks would render even more accurate results but would result in longer training times, more memory consumption and larger computational times for predictions. This particular trade-off should be studied for each specific application and its clinical demands. The error in the surrogate model is the difference between the surrogate model predictions and the trainer model predictions (in this case, the FE model predictions) and is expected to be trainer-model-dependent. If a more complex training model is used, the surrogate network may require a deeper and wider architecture to maintain the same order of accuracy. Also, the methodology deals with geometric variations of the domain by encoding them in a finite set of parameters by means of principal component analysis. A similar approach has been adopted in Liang et al. [18] for the assessment of wall shear stress in the aorta. Our strategy presents an important step toward translation of computational models to clinical practice. In this particular case, our surrogate model assimilated the mechanical behavior from a FEM model although the methodology is independent of the computational strategy (e.g., FEM, finite volume method, mesh-free methods, or finite difference approximations). The strategy is interestingly appealing in scenarios where large number of simulations are required, such as inverse problems for mechanical or geometrical parameter identification, or applications that require fast response times such as real-time simulations for risk assessment during therapeutic treatments or for teaching purposes.
The intrinsic learning domain introduced in this work produced a significant increase in the surrogate model performance, gaining one order of magnitude in accuracy with respect to using the Cartesian description of the LV. Traditional approaches for improving the surrogate model accuracy would have required increasing the size of the dataset that would result in a larger training process and refinements to the FEM model to compute such new data (which in some cases can lead to intractable computational cost). This issue is even more prohibitive for complex multiphysics models where only small datasets can be generated in a reasonable execution time. The proposed domain relaxes such requirements, easing the data generation and training process, and is therefore more beneficial for training HPC and multi-physics models than using traditional Cartesian description. Generalizing this methodology to other mechanics problems, or more complex domains such as bi-ventricular or whole-heart models, could be achieved by automating the morphing task for obtaining the intrinsic learning domain. Flattening-based strategies [27] may provide such a solution for 2D domains meanwhile 3D domains may exploit the 2D solution to project its surfaces and map the internal points by means of an optimization problem. Overall, we showed that the choice of the representation domain upon which the training is performed, is of utmost importance for the performance and training of the network.
Here, we addressed the estimation of three scalar fields by predicting each field in a decoupled manner by means of a different neural network. This approach can be extended straightforwardly to compute as many fields as needed. Nevertheless, multi-physics problems can benefit through the use of networks that predict more than one field at the time, allowing the network to learn the correlation among the fields. In our particular problem, the prediction of the three fields by a single neural network yielded lower accuracy than its decoupled counterpart.
The FEM model, employed in this work, was a simple phenomenological model that approximates the left ventricle inflation phenomenon. Contractile, multi-scale or multiphysics cardiac models present more complex behavior involving higher-order and non-linear responses. This will impact the network architecture required to approximate the physical fields of interest, where deeper and wider layers of neurons in the surrogate model may be better suited to reproduce such models. It is noteworthy that only the target physical field to be learnt would change, thus the presented methodology and learning space is still valid for such cases.
Another interesting aspect is the independence from the dataset generator. We chose to employ a computational model because of the availability and control over the model. However, a similar study could be conducted using experimental data, as long as all relevant mechanical parameters are identifiable, turning the surrogate model into a mechanical model per se. An advantage of this approach is that it provides the capabilities of neural networks to model non-linear contributions by construction (most activation functions are non-linear) and offers a universal approximator for continuous functions [28] in contrast to traditional computational models.
While great strides have been made for modeling LV mechanics, a number of issues remain before reliable and accurate LV models can be incorporated in medical practice. During construction of the PCA spaces, it is difficult to establish point-to-point correspondence between reconstructions of different subjects due to the difficulty to find multiple anatomical landmarks throughout the myocardium. Also, the lack of myocardial texture in the MR images hinders the estimation of displacements inside the muscle, encumbering the definition of appropriate boundary conditions for the displacement at the base of the LV. Also, a non-invasive and reliable methodology to determine the inner ventricular pressure would be beneficial to determine the stiffness of the LV by means of this model in a clinical setup.

CONCLUSIONS
In this work, we presented a surrogate model of the loadfree left ventricle that was capable of predicting the heart's displacement field during passive inflation under physiological loading conditions and material properties. To the best of our knowledge, our model is the first to account for physiologically realistic geometric variations of the myocardium for training machine learning based mechanical models. Additionally, we introduce a novel representation domain to increase the accuracy of the model and ease the training task. The surrogate model trained with 28 in vivo geometries produced an error of just 4.4% compared with FEM simulations while increasing its performance by more than 62 times, delivering simulations in 0.7 s. The analysis of the model error showed a higher relative accuracy for larger displacements given by larger loading conditions or less stiffer material properties. The model developed in this study is suitable for a number of applications including the characterization of the myocardial material properties and quantifying uncertainties associated with cardiac mechanics modeling.

DATA AVAILABILITY STATEMENT
The datasets generated for this study are available on request to the corresponding author.

AUTHOR CONTRIBUTIONS
GM designed the study, implemented, trained, and validated the machine learning networks, and prepared the manuscript. TB developed the biomechanical modeling framework, performed the PCA, and contributed to the design of the study and the manuscript text. MS contributed to the design of the study. MN contributed to the design of the study, the development of the biomechanical modeling framework, and edited the manuscript.