# ADVANCED HPC-BASED COMPUTATIONAL MODELING IN BIOMECHANICS AND SYSTEMS BIOLOGY

EDITED BY: Mariano Vázquez, Peter V. Coveney, Hernan Edgardo Grecco, Alfons Hoekstra and Bastien Chopard

PUBLISHED IN: Frontiers in Physiology, Frontiers in Applied Mathematics and Statistics and Frontiers in Bioengineering and Biotechnology

#### Frontiers Copyright Statement

© Copyright 2007-2019 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.

The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.

Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.

Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.

As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.

All copyright, and all rights therein, are protected by national and international copyright laws.

The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.

ISSN 1664-8714, ISBN 978-2-88945-817-2, DOI 10.3389/978-2-88945-817-2


Topic Editors:

Mariano Vázquez, Barcelona Supercomputing Center, Spain; Peter V. Coveney, University College London, United Kingdom; Hernan Edgardo Grecco, Universidad de Buenos Aires, Argentina; Alfons Hoekstra, University of Amsterdam, Netherlands; Bastien Chopard, Université de Genève, Geneva, Switzerland

Citation: Vázquez, M., Coveney, P. V., Grecco, H. E., Hoekstra, A., Chopard, B., eds. (2019). Advanced HPC-based Computational Modeling in Biomechanics and Systems Biology. Lausanne: Frontiers Media. doi: 10.3389/978-2-88945-817-2

# Table of Contents

*Competing Mechanisms of Stress-Assisted Diffusivity and Stretch-Activated Currents in Cardiac Electromechanics*

Alessandro Loppini, Alessio Gizzi, Ricardo Ruiz-Baier, Christian Cherubini, Flavio H. Fenton and Simonetta Filippi

Miguel O. Bernabeu, Yang Lu, Omar Abu-Qamar, Lloyd P. Aiello and Jennifer K. Sun

*Parameter Estimation of Platelets Deposition: Approximate Bayesian Computation With High Performance Computing*

Ritabrata Dutta, Bastien Chopard, Jonas Lätt, Frank Dubois, Karim Zouaoui Boudjeltia and Antonietta Mira

Alberto García-González, Emanuela Jacchetti, Roberto Marotta, Marta Tunesi, José F. Rodríguez Matas and Manuela T. Raimondi

*Enabling Detailed, Biophysics-Based Skeletal Muscle Models on HPC Systems*

Chris P. Bradley, Nehzat Emamy, Thomas Ertl, Dominik Göddeke, Andreas Hessenthaler, Thomas Klotz, Aaron Krämer, Michael Krone, Benjamin Maier, Miriam Mehl, Tobias Rau and Oliver Röhrle

Hoskote Chandrashekar, Fergus Robertson and Peter V. Coveney

*A Highly Automated Computational Method for Modeling of Intracranial Aneurysm Hemodynamics*

Jung-Hee Seo, Parastou Eslami, Justin Caplan, Rafael J. Tamargo and Rajat Mittal

*Towards a Computational Framework for Modeling the Impact of Aortic Coarctations Upon Left Ventricular Load*

Elias Karabelas, Matthias A. F. Gsell, Christoph M. Augustin, Laura Marx, Aurel Neic, Anton J. Prassl, Leonid Goubergrits, Titus Kuehne and Gernot Plank

Mario Ceresa, Andy L. Olivares, Jérôme Noailly and Miguel A. González Ballester

*Modeling Patient-Specific Magnetic Drug Targeting Within the Intracranial Vasculature*

Alexander Patronis, Robin A. Richardson, Sebastian Schmieschek, Brian J. N. Wylie, Rupert W. Nash and Peter V. Coveney

*3D Fluid-Structure Interaction Simulation of Aortic Valves Using a Unified Continuum ALE FEM Model*

Jeannette H. Spühler, Johan Jansson, Niclas Jansson and Johan Hoffman

*High-Performance Agent-Based Modeling Applied to Vocal Fold Inflammation and Repair*

Nuttiiya Seekhao, Caroline Shung, Joseph JaJa, Luc Mongeau and Nicole Y. K. Li-Jessen

Piero Colli Franzone, Luca F. Pavarino and Simone Scacchi

*Mechanical Characterization of the Vessel Wall by Data Assimilation of Intravascular Ultrasound Studies*

Gonzalo D. Maso Talou, Pablo J. Blanco, Gonzalo D. Ares, Cristiano Guedes Bezerra, Pedro A. Lemos and Raúl A. Feijóo

*The use of Biophysical Flow Models in the Surgical Management of Patients Affected by Chronic Thromboembolic Pulmonary Hypertension*

Martina Spazzapan, Priya Sastry, John Dunning, David Nordsletten and Adelaide de Vecchi

Gábor Závodszky, Britt van Rooij, Victor Azizi and Alfons Hoekstra

# Competing Mechanisms of Stress-Assisted Diffusivity and Stretch-Activated Currents in Cardiac Electromechanics

Alessandro Loppini<sup>1</sup>, Alessio Gizzi<sup>1</sup>\*, Ricardo Ruiz-Baier<sup>2,3</sup>, Christian Cherubini<sup>1,4</sup>, Flavio H. Fenton<sup>5</sup> and Simonetta Filippi<sup>1,4</sup>

*<sup>1</sup> Unit of Nonlinear Physics and Mathematical Modeling, Department of Engineering, University Campus Bio-Medico of Rome, Rome, Italy, <sup>2</sup> Mathematical Institute, University of Oxford, Oxford, United Kingdom, <sup>3</sup> Laboratory of Mathematical Modelling, Institute of Personalized Medicine, Sechenov University, Moscow, Russia, <sup>4</sup> ICRANet, Pescara, Italy, <sup>5</sup> Georgia Institute of Technology, School of Physics, Atlanta, GA, United States*

#### Edited by:

*Raimond L. Winslow, Johns Hopkins University, United States*

#### Reviewed by:

*Arun V. Holden, University of Leeds, United Kingdom; Jazmin Aguado-Sierra, Barcelona Supercomputing Center, Spain*

\*Correspondence: *Alessio Gizzi, a.gizzi@unicampus.it*

#### Specialty section:

*This article was submitted to Computational Physiology and Medicine, a section of the journal Frontiers in Physiology*

Received: *18 December 2017* Accepted: *14 November 2018* Published: *03 December 2018*

#### Citation:

*Loppini A, Gizzi A, Ruiz-Baier R, Cherubini C, Fenton FH and Filippi S (2018) Competing Mechanisms of Stress-Assisted Diffusivity and Stretch-Activated Currents in Cardiac Electromechanics. Front. Physiol. 9:1714. doi: 10.3389/fphys.2018.01714*

We numerically investigate the role of mechanical stress in modifying the conductivity properties of cardiac tissue, and assess the impact of these effects on the solutions generated by computational models for cardiac electromechanics. We follow the recent theoretical framework from Cherubini et al. (2017), proposed in the context of general reaction-diffusion-mechanics systems emerging from multiphysics continuum mechanics and finite elasticity. In the present study, the adapted models are compared against preliminary experimental data from fluorescence optical mapping of pig right ventricle. These data contribute to the characterization of the observed inhomogeneity and anisotropy properties that result from mechanical deformation. Our novel approach simultaneously incorporates two mechanisms for mechano-electric feedback (MEF): stretch-activated currents (SAC) and stress-assisted diffusion (SAD); we also identify their influence on the nonlinear spatiotemporal dynamics. It is found that (i) only specific combinations of the two MEF effects allow proper conduction velocity measurement; (ii) expected heterogeneities and anisotropies are obtained via the novel stress-assisted diffusion mechanisms; (iii) spiral wave meandering and drifting are strongly mediated by the applied mechanical loading. We provide an analysis of the intrinsic structure of the nonlinear coupling mechanisms using computational tests conducted with finite element methods. In particular, we compare static and dynamic deformation regimes in the onset of cardiac arrhythmias and address other potential biomedical applications.

Keywords: cardiac electromechanics, stress-assisted diffusion, stretch-activated currents, finite elasticity, reaction-diffusion

### 1. INTRODUCTION

Cardiac tissue is a complex multiscale medium constituted by highly interconnected units, cardiomyocytes, that form a so-called syncytium with unique structural and functional properties (Pullan et al., 2005). Cardiomyocytes are excitable and deformable muscle cells that themselves present an additional multiscale architecture in which plasma membrane proteins and intracellular organelles all depend on the current mechanical state of the tissue (Salamhe and Dhein, 2013; Schönleitner et al., 2017). Dedicated protein structures, such as ion channels or gap junctions, rule the passage of charged particles throughout the cell as well as between different cells, and they are usually described mathematically through multiple reaction-diffusion (RD) systems (Cabo, 2014; Dhein et al., 2014; Kleber and Saffitz, 2014). All these coupled nonlinear and stochastic dynamics then combine to produce the coordinated contraction and pumping of the heart (Augustin et al., 2016; Land et al., 2016; Quarteroni et al., 2017). During the overall cycle, the mechanical deformation undoubtedly affects the electrical impulses that modulate muscle contraction, also modifying the properties of the substrate where the electrical wave propagates. These multiscale interactions are commonly referred to in the literature as the mechano-electric feedback (MEF) (Ravelli, 2003). Experimental, theoretical and clinical studies have contributed to the systematic investigation of MEF effects for over a century; however, several open questions remain (Quinn et al., 2014; Quinn and Kohl, 2016; Land et al., 2017; Sack et al., 2018). For example, focusing on the cellular level, it is still not completely understood what the effective contribution of stretch-activated ion channels is, nor which is the most appropriate way to describe them. In addition, focusing on the organ scale, the clinical relevance of MEF in patients with heart diseases remains an open issue (Orini et al., 2017), and more specifically, how MEF mechanisms translate into ECGs (Meijborg et al., 2017) and what the specific role of mechanics is during cardiac arrhythmias (Christoph et al., 2018).

The theoretical and computational modeling of cardiac electromechanics has been used to investigate some key aspects of general excitation-contraction mechanisms. Examples include the transition from cardiac arrhythmias to chaotic behavior, including the onset, drift and breakup of spiral/scroll waves (Panfilov and Keldermann, 2005; Bini et al., 2010; Keldermann et al., 2010; Dierckx et al., 2015), pinning and unpinning phenomena due to anatomical obstacles (Cherubini et al., 2012; Hörning, 2012; Chen et al., 2014), as well as the multiscale and stochastic dynamics at the subcellular, cellular and tissue scales (Trayanova and Rice, 2011; Hurtado et al., 2016; Land et al., 2017). However, the formulation of MEF effects in mathematical models has primarily focused on accounting for the additive superposition of active and passive stress, together with stretch-activated currents (Panfilov and Keldermann, 2005). Recent contributions have advanced an energy-based framework for the comparison of active stress, stretch-activated currents and inertia effects (Cherubini et al., 2008; Ambrosi and Pezzuto, 2012; Rossi et al., 2014; Costabal et al., 2017). These works further highlight the role of mechanics in the resulting heart function at different temporal and spatial scales.

To further motivate our theoretical developments, we provide a representative experimental example of the strong MEF coupling in cardiac tissue, observable on the macroscale. The data shown in **Figure 1** were obtained via dedicated fluorescence optical mapping applied to a pig right ventricle (the experimental procedure has been previously described in Fenton et al., 2009; Gizzi et al., 2013; Uzelac et al., 2017). After motion suppression via blebbistatin, the perfused tissue was electrically stimulated via an external bipolar stimulator with strength twice the diastolic threshold. An excitation pulse with constant pacing cycle length of 1 s was delivered within the field of view (red spot in **Figure 1**) for several seconds (reaching a steady-state configuration) and for three different mechanical loading conditions on the same wedge: (a) free edges, (b) static uniaxial horizontal stretch, (c) static uniaxial vertical stretch with respect to a prescribed tissue orientation. The figure displays the underlying structure with clear evidence of the deformed tissue architecture, isochrones of electrical activation for a representative stimulus, and a sequence of spatial activation maps, where the colors indicate the level of activation, i.e., the action potential (AP). Since in this proof-of-concept setup active contraction is inhibited by blebbistatin, these experiments clearly indicate that an additional degree of heterogeneity and anisotropy appears in the tissue and affects the AP excitation wave, depending on the intensity and direction of the externally applied deformation. In addition, this behavior does not correspond to a mere linear mapping from the reference to the deformed configuration (as a visual scaling of the image would easily show); rather, one observes that mechanical deformations induce higher, nonlinear and non-trivial anisotropies and heterogeneities in the tissue.

To better characterize such features, in **Figure 2** we provide an extended analysis of the local conduction velocity (CV) through histogram plots, measured as follows:


The chosen methodology allows us to represent tissue heterogeneity, provides a robust measure of the local CV distribution characterizing the underlying ventricular structure, and homogenizes physiological beat-to-beat variabilities. We summarize the results of this extended analysis in **Table 1**, distinguishing between the three loading cases described in **Figure 1** and providing the sample size and statistical features of the computed CV histogram distribution, i.e., mean and median. We also provide the box plot representation of the distributions obtained for the three stretch states, to further highlight the dispersion of the measured velocities. Every single feature in the study confirms a slower conduction velocity under stretch, and this behavior is in full agreement with previous studies (Ravelli, 2003).

Also, in **Figure 3** we demonstrate that the tissue is at steady state for the selected stimulation rate, providing a quantitative comparison of the spatial and temporal activation sequences. In particular, after several activations (> 5), beat n and beat n + 10 are shown for a selected frame in terms of normalized AP distribution and its spatial difference, as well as comparing the time course of two consecutive activations (B1, B2) for a representative pixel under the field of view. In both cases, the spatio-temporal differences recorded are within the physiological variability of a ventricular wedge, and the tissue shows a steady-state regime which is considered at resting state for the numerical model.

FIGURE 1 | MEF observed in pig right ventricle via fluorescence optical mapping. From top to bottom, we provide: underlying tissue structure in reference (A) and deformed (B,C) states; activation isochrones every 4 ms originating from the stimulation point (red spot in the field of view; the bar indicates a length of 1 cm); and activation sequences. The three cases refer to no stretch (A), static horizontal stretch (B), and static vertical stretch (C) in the directions indicated by the yellow arrows. The sequence of spatial activation uses the color code scaled to the AP level (yellow/green: high/low). Selected frames highlight the anisotropy induced by stretch. The outer black region is the noisy area not useful for the field of view.

FIGURE 2 | … pacing cycle length of 1 s. All the normal directions to the AP propagation are considered as indicated by orange arrows on a representative isochrone contour. The box plot of the distribution is provided as inset for the three histograms, respectively, highlighting the amount of dispersion and the reduction of CV under stretch (see Table 1 for details). Cut-off of spurious values is set at 0.05 and 1.3 m/s.

TABLE 1 | Summary of the local CV measurement, indicating histogram sample size and representative statistical features of the computed distribution: mean and median.

The clear MEF effects evidenced in the previous experimental exercise suggest incorporating deformation and stress into the conduction properties of the cardiac tissue itself. The preliminary character of the proposed minimal model implies that we do not take into account the intrinsic structural variability of the tissue, but we stress that these effects will be investigated in future validation works. Accordingly, as a baseline model, in the present study we adapt the formulation recently proposed in Cherubini et al. (2017) and designed for general-purpose stress-diffusion couplings. Doing so allows us to readily and selectively incorporate two main MEF-related mechanisms into the computational modeling of cardiac electromechanics: (i) stretch-activated currents (SAC) and (ii) stress-assisted diffusion (SAD). The first paradigm relates the deformed mechanical state to the excitability of the medium via additional reaction functions (ionic-like currents), whereas the second one collects the homogenized effects of the deformation field on the diffusion processes that give rise to the spatio-temporal patterns of the membrane voltage.

Within such a framework, we expect stretch-activated currents and stress-assisted diffusion to counterbalance each other by locally enhancing tissue excitability as well as smoothing the excitation wave according to the mechanical state of the tissue. In particular, since an external loading activates SAC at locations where the stretch is high and, at the same time, induces a heterogeneous and anisotropic diffusion tensor via the SAD mechanisms, our study focuses on the role of different mechanical boundary conditions in affecting action potential propagation and the onset of arrhythmias. Accordingly, these two MEF mechanisms will be studied numerically along three basic lines. First, we conduct a parametric analysis of the competing nonlinearities so as to identify the limits of applicability of the proposed models. In particular, we select in the SAD mechanisms the most reliable modeling approach able to reproduce the experimentally observed conduction velocity reduction upon an applied static loading state. Then, by performing a selective investigation of spiral onset protocols, we characterize the additional nonlinearities that arise due to MEF. Here we identify the different time span of the vulnerable window obtained via an S1S2 excitation protocol. Finally, by means of long-run analyses of arrhythmic scenarios, we compare and contrast static and dynamic displacement and traction loadings on a two-dimensional, idealized tissue slab. In this regard, we show how spiral core meandering is strongly affected by the mechanical state and becomes unstable when the SAC and SAD parameters are stronger.

FIGURE 3 | Spatial and temporal comparison of ventricular activation at constant pacing cycle length of 1 s under different mechanical loadings [free (A), horizontal (B) and vertical (C) stretch as in Figure 1]. The first two rows show the spatial distribution of the normalized voltage for beat *n* and beat *n* + 10 with the corresponding difference in the third row (color code is indicated). The last row indicates the time course of a representative pixel in the center of the field of view for two consecutive beats *n* and *n* + 10 with the corresponding difference provided in the red trace.

Our results highlight several interesting conclusions regarding the propagation of the excitation wave in the presence of two competing MEF effects. These findings call for novel and additional experimental investigations. Finally, we provide a thorough discussion of the applicability of the proposed modeling approach and its extensions toward more realistic and multiphysics scenarios.

### 2. METHODS

The classical stress-assisted formulation proposed in Aifantis (1980) was developed in the context of dilute solutes in a solid. A similarity exists between this fundamental process and the propagation of membrane voltage within cardiac tissue. Indeed, on a macroscopically rigid matrix, the propagating membrane voltage can be regarded as a continuum field undergoing slow diffusion. Here we consider a similar approach (developed in Cherubini et al., 2017) which generalizes Fick's diffusion by using the classical Euler's axioms of continuously distributed matter. In particular, the balance of momentum can be imposed so as to ensure frame invariance, a property of high importance in mechanical applications (Tadmor et al., 2012). We also assume quasi-static conditions for the continuum body, such that its macroscopic response is, in principle, independent of the diffusion process. By contrast, the diffusion process will strongly depend on the mechanical state of the tissue.

#### 2.1. Continuum Electromechanical Model

We will assume that the body is a hyperelastic material and its motion will be described using finite kinematics. We will adopt an indicial notation where repeated indices indicate summation. We identify the relationship between material (reference), $X_I$, and spatial (deformed), $x_i$, coordinates via the smooth map $x_i(X_I)$. The deformation gradient tensor $F_{iI} = \partial x_i/\partial X_I$ allows one to determine further properties of the continuum's motion. We indicate with $J = \det F_{iI}$ the Jacobian of the map and with $C_{IJ} = F_{kI}F_{kJ}$ and $B_{ij} = F_{iK}F_{jK}$ the right and left Cauchy-Green deformation tensors, respectively. We assume that the generic myocardial fiber direction (the unit vector characterizing the microstructural property of the continuum body) in the material configuration, $a_I$, is mapped to the deformed configuration as $a_i = F_{iJ}a_J$, such that the current fiber direction can be defined as $a_i/\lambda$. Following the standard frame-indifference mechanical framework (Spencer, 1989), these quantities are related to the invariants of the deformation in the following manner

$$I_1 = C_{II}, \quad I_2 = \frac{1}{2} \left[ (C_{II})^2 - C_{IJ} C_{IJ} \right], \quad I_3 = \det C_{IJ} = J^2, \quad I_4 = C_{IJ}\, a_I a_J. \tag{1}$$

The principal invariants $I_1$ and $I_2$ rule the deviatoric response of the medium, the third invariant $I_3$ quantifies volumetric changes of the material, while the fourth pseudo-invariant $I_4$ measures the directional fiber stretch, $\lambda$. This last entity is intrinsically directional, so for two-dimensional models, we will simply assign a horizontal myocardial direction $(1, 0)^T$. In what follows, the symbol $\delta_{ij}$ denotes the second-order identity tensor.
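The kinematic quantities above can be evaluated numerically for a given deformation gradient. The following NumPy sketch is only illustrative: the function name `kinematics`, the dictionary keys, and the example stretch of 1.2 are our own assumptions, and the fiber stretch is taken as $\lambda = \sqrt{I_4}$ with the current fiber direction normalized by $\lambda$, which is how we read the definitions in the text.

```python
import numpy as np

def kinematics(F, a0):
    """Evaluate Cauchy-Green tensors and the invariants of Equation (1).

    F  : deformation gradient F_iI (square array)
    a0 : unit fiber direction a_I in the reference configuration
    """
    F = np.asarray(F, dtype=float)
    a0 = np.asarray(a0, dtype=float)
    C = F.T @ F                      # right Cauchy-Green, C_IJ = F_kI F_kJ
    B = F @ F.T                      # left Cauchy-Green,  B_ij = F_iK F_jK
    I1 = np.trace(C)
    I2 = 0.5 * (I1**2 - np.sum(C * C))   # np.sum(C * C) is C_IJ C_IJ
    I3 = np.linalg.det(C)            # equals J**2
    I4 = a0 @ C @ a0                 # C_IJ a_I a_J
    lam = np.sqrt(I4)                # fiber stretch (assumed lambda = sqrt(I4))
    return {
        "J": np.linalg.det(F), "C": C, "B": B,
        "I1": I1, "I2": I2, "I3": I3, "I4": I4,
        "stretch": lam,
        "fiber_current": (F @ a0) / lam,   # deformed fiber, normalized
    }

# Hypothetical example: isochoric 2D stretch along the horizontal fiber (1, 0)^T
stretch = 1.2
F = np.diag([stretch, 1.0 / stretch])
q = kinematics(F, a0=[1.0, 0.0])
print(q["J"], q["I1"], q["I4"], q["stretch"])
```

For this volume-preserving example the Jacobian stays at one, so $I_3 = J^2 = 1$, while $I_4$ isolates the stretch along the prescribed fiber direction.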

As anticipated above, we will base our model on the stress-assisted diffusion formulation from Cherubini et al. (2017). We do, however, generalize the governing equations by adopting a more accurate nondimensional three-variable model of cardiac action potential (AP) propagation introduced in Fenton and Karma (1998b), and we will account for SAC (Panfilov and Keldermann, 2005), which were not considered in Cherubini et al. (2017). Even though several more physiological assumptions could be made, here we will focus on a purely phenomenological approach.

In the deformed configuration, the electrophysiological model consists of three variables: the membrane potential $u$, and the fast and slow transmembrane ionic gates $v$ and $w$. They satisfy the following RD system

$$\frac{\partial u}{\partial t} = \frac{\partial}{\partial x_i} \left( d_{ij}(\sigma_{ij}) \frac{\partial u}{\partial x_j} \right) - I_{\text{ion}}(u, v, w) + I_{\text{sac}}(\lambda, u) + I_{\text{ext}}, \tag{2a}$$

$$\frac{dv}{dt} = (1 - H_c) \left( \frac{1 - v}{\tau_v^{-}} \right) - H_c \frac{v}{\tau_v^{+}}, \tag{2b}$$

$$\frac{dw}{dt} = (1 - H_c) \left( \frac{1 - w}{\tau_w^{-}} \right) - H_c \frac{w}{\tau_w^{+}}, \tag{2c}$$

where Neumann zero-flux boundary conditions are imposed for Equation (2a), i.e., $[d_{ij}\,\partial u/\partial x_j]\,n_i = 0$, where $n_i$ is the outward normal on the domain boundary. System (2) describes the propagation of a normalized dimensionless membrane potential, which can be mapped to physical quantities as $u = (V_m - V_o)/(V_{fi} - V_o)$ (see Fenton and Karma, 1998b for details as modified Beeler-Reuter fit), where $V_m$ stands for the physical transmembrane potential, $V_o$ is the resting membrane potential and $V_{fi}$ represents the Nernst potential of the fast inward current. In Equation (2a), the total transmembrane current density, $I_{\text{ion}}(u, v, w)$, is the sum of a fast inward depolarizing current, $I_{\text{fi}}(u, v)$, a slow rectifying outward current, $I_{\text{so}}(u)$, and a slow inward current, $I_{\text{si}}(u, w)$, given by

$$\begin{aligned} I_{\text{fi}}(u, v) &= -\frac{v}{\tau_d} H_c \left( 1 - u \right) \left( u - u_c \right), \\ I_{\text{so}}(u) &= \frac{u}{\tau_o} \left( 1 - H_c \right) + \frac{1}{\tau_r} H_c, \\ I_{\text{si}}(u, w) &= -\frac{w}{2 \tau_{si}} \left( 1 + \tanh \left[ k \left( u - u_c^{si} \right) \right] \right), \end{aligned}$$

where $\tau_v^{-}(u) = H_v \tau_{v1}^{-} + (1 - H_v)\, \tau_{v2}^{-}$ is the time constant governing the reactivation of the fast inward current, and $H_x = H(u - u_x)$, with $H$ the standard Heaviside step function. $I_{\text{ext}}$ is the space- and time-dependent external stimulation current with amplitude $I_{\text{ext}}^{\max}$. All model parameters are collected in **Table 2**.
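To make the reaction terms of System (2) concrete, the sketch below evaluates the three ionic currents and the gate kinetics for a single space-clamped cell (no diffusion term) and advances them with one forward-Euler step. It is a minimal illustration under stated assumptions: the numerical values in `P` are placeholder parameters chosen by us and are not the values of Table 2, the convention $H(0) = 1$ is our choice, and the function names are hypothetical.

```python
import numpy as np

# Placeholder parameters (assumed for illustration; NOT the paper's Table 2)
P = dict(tau_d=0.25, tau_o=12.5, tau_r=33.33, tau_si=29.0, k=10.0,
         u_c=0.13, u_v=0.04, u_csi=0.85,
         tau_v_plus=3.33, tau_v1_minus=1250.0, tau_v2_minus=19.6,
         tau_w_plus=870.0, tau_w_minus=41.0)

def heaviside(x):
    """Heaviside step; the value at x == 0 is taken as 1 (an assumption)."""
    return 1.0 if x >= 0.0 else 0.0

def ionic_currents(u, v, w, p=P):
    """Return (I_fi, I_so, I_si) as defined after Equation (2)."""
    Hc = heaviside(u - p["u_c"])
    I_fi = -(v / p["tau_d"]) * Hc * (1.0 - u) * (u - p["u_c"])
    I_so = (u / p["tau_o"]) * (1.0 - Hc) + Hc / p["tau_r"]
    I_si = -(w / (2.0 * p["tau_si"])) * (1.0 + np.tanh(p["k"] * (u - p["u_csi"])))
    return I_fi, I_so, I_si

def gate_rates(u, v, w, p=P):
    """Right-hand sides of Equations (2b) and (2c)."""
    Hc = heaviside(u - p["u_c"])
    Hv = heaviside(u - p["u_v"])
    tau_v_minus = Hv * p["tau_v1_minus"] + (1.0 - Hv) * p["tau_v2_minus"]
    dv = (1.0 - Hc) * (1.0 - v) / tau_v_minus - Hc * v / p["tau_v_plus"]
    dw = (1.0 - Hc) * (1.0 - w) / p["tau_w_minus"] - Hc * w / p["tau_w_plus"]
    return dv, dw

def euler_step(u, v, w, dt, I_ext=0.0, I_sac=0.0, p=P):
    """One forward-Euler step of Equation (2) with the diffusion term dropped."""
    I_fi, I_so, I_si = ionic_currents(u, v, w, p)
    dv, dw = gate_rates(u, v, w, p)
    du = -(I_fi + I_so + I_si) + I_sac + I_ext
    return u + dt * du, v + dt * dv, w + dt * dw

# Hypothetical usage: start above threshold with fully open gates
state = (0.5, 1.0, 1.0)
for _ in range(10):
    state = euler_step(*state, dt=0.01)
print(state)
```

With a supra-threshold initial voltage and open gates, the fast inward current dominates and the voltage rises, which is the upstroke behavior the three-variable model is designed to reproduce. A full simulation would add the stress-dependent diffusion term of Equation (2a) through a spatial discretization, which this sketch deliberately omits.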

The mechanical problem, also stated on the current configuration and occupying the domain $\Omega(t)$, respects the balance of linear momentum and mass, written in terms of displacement, $\phi$, and pressure, $p$, and set in a quasi-static form. The problem is complemented with displacement and traction boundary conditions set on two different parts of the boundary, $\Gamma_D$ and $\Gamma_N$:

$$\frac{\partial \sigma_{ij}}{\partial x_i} = 0 \quad \text{and} \quad \rho\, d\hat{v} = \rho_0\, d\hat{V}, \qquad \text{in} \quad \Omega(t), \tag{3a}$$

$$\phi = \tilde{\phi}(t), \qquad \text{on} \quad \Gamma_D(t), \tag{3b}$$

$$\sigma_{ik} n_k = \tilde{t}_i(t), \qquad \text{on} \quad \Gamma_N(t), \tag{3c}$$

where $\rho_0$, $\rho$ and $d\hat{V}$, $d\hat{v}$ are the densities and volumes of the solid in the undeformed and deformed configurations, respectively.



*Time units are ms, length is cm, the term $\bar{g}_{fi}$ is in mS/cm$^2$, dimensional voltages are in mV, and stiffness in MPa. Square brackets indicate range of parameter variability, and the rightmost column specifies initial conditions for a resting tissue.*

In Equation (3b), $\tilde{\phi}(t)$ is a known (possibly time-dependent) displacement and in Equation (3c), $\tilde{t}_i(t)$ is a (possibly time-dependent) traction force. In both cases, the tissue is stretched up to a maximum level of 20% of the resting length so as to activate all MEF components. In addition, the time variation of the imposed boundary conditions is much slower than the governing dynamic physical processes, and therefore a quasi-static mechanical equilibrium is maintained.

The two sub-problems (Equations 2, 3) are completed via the following mixed constitutive prescriptions for incompressible isotropic hyperelastic materials (J = 1):

$$\sigma_{ij} = 2c_1 B_{ij} - 2c_2 B_{ij}^{-1} - p\,\delta_{ij} + T_a \delta_{ij}, \tag{4a}$$

$$\frac{\partial T_a}{\partial t} = \epsilon(u)\,(k_{T_a} u - T_a), \tag{4b}$$

$$d_{ij}(\sigma_{ij}) = D_0 \delta_{ij} + D_1 \sigma_{ij} + D_2 \sigma_{ik} \sigma_{kj}, \tag{4c}$$

$$I_{\text{sac}}(\lambda, u) = G_s H_{\text{sac}} (\lambda - 1)(u_{\text{sac}} - u). \tag{4d}$$

Equation (4a) specifies a constitutive form for the Cauchy stress tensor (total equilibrium stress in the current deformed configuration), highlighting two multiscale contributions to the tissue deformation. First, the passive material response follows that of an incompressible Mooney-Rivlin hyperelastic solid and is characterized by two stiffness parameters, c<sub>1</sub> and c<sub>2</sub>. Second, the active component contributes to the total stress in the form of an additional hydrostatic force with amplitude T<sub>a</sub>. The dynamics of T<sub>a</sub> are described by Equation (4b), where the constant k<sub>Ta</sub> modulates the amplitude of the active stress contribution, while ε(u) is a contraction switch function: ε(u) = ε<sub>0</sub> if u < 0.005, and ε(u) = 10ε<sub>0</sub> if u ≥ 0.005.
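As an illustration of Equation (4b), the following minimal sketch advances the active tension by one explicit Euler step, which mirrors the explicit treatment of the reaction terms but is not claimed to be the implementation used for the reported results. The function names and the numerical values of `eps0` and `k_Ta` in the usage lines are hypothetical placeholders.

```python
def contraction_switch(u, eps0):
    """Switch function from the text: eps0 for u < 0.005, 10*eps0 otherwise."""
    return eps0 if u < 0.005 else 10.0 * eps0


def active_tension_step(Ta, u, dt, eps0, k_Ta):
    """One explicit Euler step of dTa/dt = eps(u) * (k_Ta * u - Ta)."""
    return Ta + dt * contraction_switch(u, eps0) * (k_Ta * u - Ta)


# Usage with placeholder values (assumptions, not the calibrated parameters):
Ta_next = active_tension_step(Ta=0.0, u=1.0, dt=0.1, eps0=1.0, k_Ta=2.0)
```

Note that T<sub>a</sub> = k<sub>Ta</sub>u is a fixed point of the update, so the tension relaxes toward a level proportional to the membrane potential at a rate set by the switch.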

Equation (4c) characterizes the stress-assisted diffusion contribution, describing the effect of tissue deformation on AP spreading. The parameter D<sub>0</sub> represents the usual diffusion coefficient for isotropic media, with units of diffusivity [L<sup>2</sup> T<sup>−1</sup>], while D<sub>1</sub> and D<sub>2</sub> introduce the impact of mechanical stress on the diffusive flux through linear and nonlinear contributions, respectively. Accordingly, D<sub>1</sub> and D<sub>2</sub> have units of [L<sup>2</sup> T<sup>−1</sup> P<sup>−1</sup>] and [L<sup>2</sup> T<sup>−1</sup> P<sup>−2</sup>], respectively. We also remark that Equation (4c) reduces to the classical diffusion equation for D<sub>1</sub> = D<sub>2</sub> = 0.
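To make the structure of Equation (4c) concrete, the sketch below evaluates the stress-assisted diffusion tensor for a given Cauchy stress stored as a nested list. It is a plain-Python illustration only; the function name and the sample stress and coefficient values are assumptions chosen for readability.

```python
def stress_assisted_diffusivity(sigma, D0, D1, D2):
    """Return d_ij = D0*delta_ij + D1*sigma_ij + D2*sigma_ik*sigma_kj."""
    n = len(sigma)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Quadratic term: (sigma . sigma)_ij, summed over k
            sigma_sq = sum(sigma[i][k] * sigma[k][j] for k in range(n))
            delta = 1.0 if i == j else 0.0
            d[i][j] = D0 * delta + D1 * sigma[i][j] + D2 * sigma_sq
    return d


# Uniaxial stress along x with placeholder coefficients (hypothetical values)
d = stress_assisted_diffusivity([[2.0, 0.0], [0.0, 0.0]], D0=1.0, D1=-0.1, D2=0.01)
```

With a uniaxial stress and a negative D<sub>1</sub>, the diffusivity is reduced only along the loaded direction, which is one way to see how the coupling induces an effective anisotropy in an otherwise isotropic medium.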

Finally, Equation (4d) describes the stretch-activated current contribution (which is usually adopted as the sole MEF effect). The term I<sub>sac</sub>(λ, u) affects the ionic (reaction) currents in the electrophysiological system and is formulated as a linear function of the membrane potential u and the fiber stretch λ. Here, G<sub>s</sub> modulates the amplitude of the current, u<sub>sac</sub> represents a referential (resting) potential, and H<sub>sac</sub> is a switch activating this additional reaction current only when the myocardial fiber is elongated, i.e., H<sub>sac</sub> = 1 for λ ≥ 1 and H<sub>sac</sub> = 0 for λ < 1.
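A minimal sketch of Equation (4d) follows. It assumes that the switch H<sub>sac</sub> multiplies the factor (λ − 1), which is our reading of the statement that the current is linear in the stretch; the function name and the sample values are hypothetical.

```python
def sac_current(lam, u, Gs, u_sac):
    """Stretch-activated current, read as Gs * H_sac * (lam - 1) * (u_sac - u).

    H_sac = 1 for lam >= 1 (elongated fiber) and 0 for lam < 1.
    The multiplicative reading of H_sac is an assumption of this sketch.
    """
    H_sac = 1.0 if lam >= 1.0 else 0.0
    return Gs * H_sac * (lam - 1.0) * (u_sac - u)


# 20% stretch with placeholder values for Gs and u_sac
I_stretched = sac_current(lam=1.2, u=0.0, Gs=0.125, u_sac=1.0)
I_compressed = sac_current(lam=0.9, u=0.0, Gs=0.125, u_sac=1.0)
```

Under this reading the current vanishes both for compressed fibers and at the unstretched state λ = 1, and its sign is set by the difference between u<sub>sac</sub> and the local membrane potential.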

We also introduce the definition of spiral tip (core of the spiral wave) as the point with instantaneous null velocity (see Fenton and Karma, 1998b for details). In practice, for two-dimensional domains, we choose an isopotential line of constant membrane voltage, u(R<sub>I</sub>, t) = u<sub>iso</sub>, where R<sub>I</sub> = x<sub>tip</sub>X<sub>I</sub> + y<sub>tip</sub>Y<sub>I</sub> represents the position vector in the reference undeformed configuration, identifying the boundary between depolarized and repolarized regions. Accordingly, the spiral tip can be defined as the point in space where the excitation front meets the repolarization waveback of the action potential, conforming with the operative definition:

$$
u(R_I, t) - u_{\text{iso}} = \frac{\partial u(R_I, t)}{\partial t} \equiv 0 \,. \tag{5}
$$

We numerically identify the tip coordinates (x<sub>tip</sub>, y<sub>tip</sub>) by considering u<sub>iso</sub> = 0.5 with a tolerance of 10<sup>−4</sup>.
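The operative definition in Equation (5) can be illustrated on a regular grid with a coarse candidate search. The sketch below is not the tolerance-based procedure described above: it is a substitute technique that flags the grid cells crossed by the u<sub>iso</sub> isoline at two consecutive time levels, since a point lying on both isolines satisfies u = u<sub>iso</sub> with a vanishing finite-difference time derivative. The function name and the nested-list data layout are assumptions.

```python
def tip_candidate_cells(u_old, u_new, u_iso=0.5):
    """Return (row, col) indices of grid cells crossed by the u_iso isoline
    in both snapshots; such cells contain candidate tip locations."""

    def crosses(field, i, j):
        # A cell is crossed when u_iso lies between its corner extremes
        corners = (field[i][j], field[i][j + 1],
                   field[i + 1][j], field[i + 1][j + 1])
        return min(corners) <= u_iso <= max(corners)

    rows, cols = len(u_old), len(u_old[0])
    return [(i, j)
            for i in range(rows - 1)
            for j in range(cols - 1)
            if crosses(u_old, i, j) and crosses(u_new, i, j)]


# Synthetic snapshots: isoline vertical at the old time, horizontal at the new
u_old = [[0.5 + 0.1 * (j - 1.5) for j in range(4)] for i in range(4)]
u_new = [[0.5 + 0.1 * (i - 1.5) for j in range(4)] for i in range(4)]
cells = tip_candidate_cells(u_old, u_new)
```

A candidate cell would still need to be refined (for example by interpolating within the cell) before the stated tolerance could be checked.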

#### 2.2. Numerical Approximation

The electromechanical problem is rewritten in the undeformed configuration and subsequently solved computationally via a finite element method. Even if the model originates as an extension of our contribution in Cherubini et al. (2017), the numerical method employed here is simpler, as we do not solve for stresses explicitly but rather postprocess them from the computed discrete displacements. The overall numerical scheme for active stress electromechanics with SAC is therefore not precisely novel, but we still provide a few details for the sake of completeness of the presentation and future reproducibility of results. Further details can be found, e.g., in Ruiz-Baier (2015). We discretize displacements with vectorial piecewise quadratic and continuous polynomials, and the pressure field using piecewise linear and discontinuous elements. All remaining unknowns (associated with the electrophysiology and with the active tension) are approximated using piecewise linear and continuous elements. Let us then consider a regular, quasi-uniform partition 𝒯<sub>h</sub> of Ω(0) into triangles T of diameter h<sub>T</sub>, where h = max{h<sub>T</sub> : T ∈ 𝒯<sub>h</sub>} is the mesh size. The finite element spaces mentioned above are defined as (see e.g., Quarteroni and Valli, 1994)

$$\begin{aligned} \mathbf{H}_h &:= \{ \boldsymbol{\Psi} \in \mathbf{H}^1(\Omega(0)) : \boldsymbol{\Psi}|_T \in [\mathbb{P}_2(T)]^2 \,\,\forall T \in \mathcal{T}_h, \text{ and } \boldsymbol{\Psi} = \mathbf{0} \text{ on } \Gamma_D(0) \}, \\ Q_h &:= \{ q \in L^2(\Omega(0)) : q|_T \in \mathbb{P}_1(T) \,\,\forall T \in \mathcal{T}_h \}, \\ V_h &:= \{ \psi \in H^1(\Omega(0)) : \psi|_T \in \mathbb{P}_1(T) \,\,\forall T \in \mathcal{T}_h \}, \end{aligned}$$

for the case of clamped boundaries at Γ<sub>D</sub>(0).

Let us also construct an equispaced partition of the time domain 0 = t<sup>0</sup> < t<sup>1</sup> = Δt < ··· < t<sup>M</sup> = t<sub>max</sub>. The coupled problem is solved sequentially between the mechanical and electrochemical blocks. A description of the needed computations at each time step t<sup>n</sup> is as follows:

**Step 1:** From the known values u<sup>n</sup><sub>h</sub>, v<sup>n</sup><sub>h</sub>, w<sup>n</sup><sub>h</sub>, T<sup>n</sup><sub>a,h</sub>, D<sup>n</sup><sub>h</sub>, λ<sup>n</sup><sub>h</sub>, find u<sup>n+1</sup><sub>h</sub>, v<sup>n+1</sup><sub>h</sub>, w<sup>n+1</sup><sub>h</sub>, T<sup>n+1</sup><sub>a,h</sub> such that

$$\begin{aligned}
\int_{\Omega(0)} \frac{u_h^{n+1}}{\Delta t} \psi_h^{u} + \int_{\Omega(0)} D_h^n \nabla u_h^{n+1} \cdot \nabla \psi_h^{u} &= \int_{\Omega(0)} \left[ \frac{u_h^n}{\Delta t} + I_{\text{ion}}(u_h^n, v_h^n, w_h^n) + I_{\text{sac}}(\lambda_h^n, u_h^n) + I_{\text{ext}} \right] \psi_h^{u}, \\
\frac{1}{\Delta t} \int_{\Omega(0)} v_h^{n+1} \psi_h^{v} &= \int_{\Omega(0)} \left[ \frac{1}{\Delta t} v_h^n + f_v(u_h^n, v_h^n) \right] \psi_h^{v}, \\
\frac{1}{\Delta t} \int_{\Omega(0)} w_h^{n+1} \psi_h^{w} &= \int_{\Omega(0)} \left[ \frac{1}{\Delta t} w_h^n + f_w(u_h^n, w_h^n) \right] \psi_h^{w}, \\
\frac{1}{\Delta t} \int_{\Omega(0)} T_{a,h}^{n+1} \psi_h^{T_a} &= \int_{\Omega(0)} \left[ \frac{1}{\Delta t} T_{a,h}^n + f_{T_a}(u_h^n, T_{a,h}^n) \right] \psi_h^{T_a},
\end{aligned}$$

for all (ψ<sup>u</sup><sub>h</sub>, ψ<sup>v</sup><sub>h</sub>, ψ<sup>w</sup><sub>h</sub>, ψ<sup>Ta</sup><sub>h</sub>) ∈ [V<sub>h</sub>]<sup>4</sup>. This scheme for the electric/activation system is given in a first-order semi-implicit form: the nonlinear reaction terms and the coupling stress-assisted diffusion are taken explicitly, while the linear part of diffusion is advanced implicitly. Here

$$\begin{aligned} D_h^n &= D_0 \mathbf{C}^{-1} (\boldsymbol{\varphi}_h^n) + \frac{D_1}{J(\boldsymbol{\varphi}_h^n)} \mathbf{S} (\boldsymbol{\varphi}_h^n) + \frac{D_2}{J(\boldsymbol{\varphi}_h^n)^2} \mathbf{S} (\boldsymbol{\varphi}_h^n)^2, \\ \lambda_h^n &= \sqrt{C_{11}(\boldsymbol{\varphi}_h^n)}, \end{aligned}$$

are the explicit approximations of the stress-assisted diffusivity and of the stretch in the fiber direction, all in the reference configuration.
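The semi-implicit splitting of Step 1 can be sketched in a much simpler setting. The example below is a one-dimensional finite-difference analog (not the finite element discretization used here) with a constant scalar diffusivity, zero-flux ends, implicit diffusion and explicit reaction; the function name, the tridiagonal solver, and the sample diffusivity are assumptions made for illustration.

```python
def semi_implicit_step(u, dt, dx, D, reaction):
    """Advance u_t = D*u_xx + f(u) by one step on a 1D grid (len(u) >= 2).

    Diffusion is implicit, the reaction term f is evaluated at the old time
    level, and zero-flux conditions are imposed at both ends.
    """
    n = len(u)
    r = D * dt / dx ** 2
    # Right-hand side: old solution plus explicit reaction contribution
    rhs = [u[i] + dt * reaction(u[i]) for i in range(n)]
    # Tridiagonal matrix I - dt*D*Laplacian with zero-flux end rows
    lower = [-r] * n
    upper = [-r] * n
    diag = [1.0 + 2.0 * r] * n
    diag[0] = diag[-1] = 1.0 + r
    # Thomas algorithm: forward elimination, then back substitution
    for i in range(1, n):
        m = lower[i] / diag[i - 1]
        diag[i] -= m * upper[i - 1]
        rhs[i] -= m * rhs[i - 1]
    out = [0.0] * n
    out[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        out[i] = (rhs[i] - upper[i] * out[i + 1]) / diag[i]
    return out


# Pure diffusion of a localized bump (the diffusivity is a placeholder value)
bump = semi_implicit_step([0.0, 0.0, 1.0, 0.0, 0.0], dt=0.1, dx=0.062,
                          D=1e-3, reaction=lambda u: 0.0)
```

Treating diffusion implicitly removes the parabolic stability restriction on the time step, while the explicit reaction avoids a nonlinear solve for the electrophysiology block at each step.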

**Step 2:** Given the activation value T<sup>n+1</sup><sub>a,h</sub> computed in Step 1 of this iteration, solve the nonlinear elasticity equations

$$\int_{\Omega(0)} \mathbf{F}(\boldsymbol{\varphi}_h^{n+1}) \mathbf{S}(\boldsymbol{\varphi}_h^{n+1}, p_h^{n+1}, T_{a,h}^{n+1}) : \nabla \boldsymbol{\Psi}_h = 0 \quad \forall \boldsymbol{\Psi}_h \in \mathbf{H}_h,$$

$$\int_{\Omega(0)} q_h [J(\boldsymbol{\varphi}_h^{n+1}) - 1] = 0 \quad \forall q_h \in Q_h,$$

where

$$\begin{aligned} \mathbf{S} &= 2[c_1 + c_2 \operatorname{tr} \left( \mathbf{C} (\boldsymbol{\varphi}_h^{n+1}) \right)] \mathbf{I} - 2c_2 \mathbf{C} (\boldsymbol{\varphi}_h^{n+1}) \\ &\quad - p_h^{n+1} J(\boldsymbol{\varphi}_h^{n+1}) \mathbf{C}^{-1} (\boldsymbol{\varphi}_h^{n+1}) + T_{a,h}^{n+1} \mathbf{C}^{-1} (\boldsymbol{\varphi}_h^{n+1}), \end{aligned}$$

is the second Piola-Kirchhoff stress tensor.

**Step 3:** The solution of the problem in Step 2 uses a Newton-Raphson method whose iterations are terminated once the energy residual drops below the relative tolerance of 10<sup>−6</sup>. Each linear tangent problem is solved with the BiCGSTAB method preconditioned with an incomplete LU(0) factorization. The iterations of the Krylov solver are terminated after reaching the absolute tolerance 10<sup>−5</sup>. The residual computation for the mechanical problem also contains the terms arising from time-dependent displacement or traction boundary conditions, which also need to be assigned at each time step. For instance, in a uniaxial test (denoted dynamic displacement in the examples below), the left segment of the boundary is clamped (zero displacements are imposed), the bottom and top edges are subject to zero normal stress, and the right edge is pulled according to the displacement ϕ̃(t) = [0.2L sin<sup>2</sup>(πt/400), 0]<sup>T</sup>.
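The time-dependent boundary datum of the uniaxial test can be evaluated as in the short sketch below. It assumes t in ms and L in cm, and it reads the imposed displacement as a stretch (positive x-component) reaching 20% of the slab length; the function name and default arguments are hypothetical.

```python
import math


def pulled_edge_displacement(t, L=6.2, amplitude=0.2, period=400.0):
    """Displacement (x, y) imposed on the right edge at time t (ms).

    Implements [amplitude * L * sin^2(pi * t / period), 0]; the positive
    sign of the x-component is an assumption of this sketch.
    """
    return (amplitude * L * math.sin(math.pi * t / period) ** 2, 0.0)


# Peak stretch is reached at t = 200 ms and released again at t = 400 ms
peak_x, _ = pulled_edge_displacement(200.0)
```

With these values the loading cycle repeats every 400 ms, which is the sense in which the boundary data vary slowly compared with the time step used for the electrophysiology.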

All tests are conducted using a two-dimensional slab of dimensions L × L = 6.2 × 6.2 cm<sup>2</sup>, which is the same configuration used to produce the dynamics analyzed in Fenton and Karma (1998b). The computational domain is discretized with a structured triangular mesh of 10,000 elements. After a mesh convergence test involving conduction velocities and reproducing the expected values for planar excitation waves reported in Fenton and Karma (1998b), we proceeded to fix the temporal and spatial resolutions to Δt = 0.1 ms and h = 0.062 cm, respectively. A representative example of the mesh is provided in **Figure 4**, plotted in the deformed configuration under both traction and displacement boundary conditions and highlighting the spiral wave resolution. All numerical tests were carried out using the open-source finite element library FEniCS (Alnæs et al., 2015).

#### 3. RESULTS

In the following, we adopt a parametric setup fitted for the modified Beeler-Reuter model (Equation 2), while selectively changing the MEF parameters (D<sub>1</sub>, G<sub>s</sub>). This choice provides a reference, unloaded, model configuration with constant CV of 0.42 m/s and a circular meandering for a free spiral on a homogeneous and isotropic domain. Such values deviate as the MEF coupling is activated.

#### 3.1. Conduction Velocity Analysis

We start by analyzing the parameter space associated with the two MEF contributions in our model, that is, the stress-assisted coefficients D<sub>1</sub>, D<sub>2</sub> and the SAC amplitude G<sub>s</sub>. The study will be restricted to a static homogeneous stretched state (e.g., a uniaxial Dirichlet boundary condition ϕ = [0.2L, 0]<sup>T</sup> set on the right edge of the domain). All remaining material and electrophysiology parameters will be kept constant, except that we fix the relative influence of the nonlinear contribution in the stress-assisted diffusion by setting D<sub>2</sub> to be one order of magnitude smaller than D<sub>1</sub>. This configuration will highlight MEF effects in a minimal, but still comprehensive manner.

FIGURE 4 | Example of structured mesh employed in the computational results. The grid is displayed on the deformed configuration when the domain is subject to traction (arrows) and fixed displacement (lines) boundary conditions, and a zoom exemplifies the mesh resolution for a rather coarse spiral front.

**Figure 5** portrays the conduction velocity obtained for all combinations of (D<sub>1</sub>, G<sub>s</sub>) in the parameter space. The quantity is measured as the wave-front velocity of a planar excitation wave along its propagation. The plot illustrates the variability of the recorded CV amplitude (in the range 0.25–0.5 m/s) according to the MEF coupling intensity variation and to histogram measures in **Figure 2**. In particular, starting from a physiological baseline of 0.42 m/s, when neither SAC nor SAD is present (D<sub>1</sub> = 0, G<sub>s</sub> = 0), we observe a net increase of CV for (D<sub>1</sub> = 0, G<sub>s</sub> > 0) while we recover CV decrements for (D<sub>1</sub> < 0, G<sub>s</sub> = 0). This specific aspect reproduces what is expected from experimental evidence, i.e., MEF decreases the CV of the excitation wave (Ravelli, 2003).
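One common way to estimate the wave-front velocity of a planar wave is from the activation times recorded at two points along the propagation direction. The sketch below follows that two-probe idea, which is an assumption and not necessarily the measurement procedure behind **Figure 5**; it also converts from the cm and ms units of the model to m/s. The function name and the sample probe data are hypothetical.

```python
def conduction_velocity(x1_cm, x2_cm, t1_ms, t2_ms):
    """Planar-wave CV in m/s from two activation times (1 cm/ms = 10 m/s)."""
    return 10.0 * (x2_cm - x1_cm) / (t2_ms - t1_ms)


# Two probes 4.2 cm apart, activated 100 ms apart (illustrative numbers)
cv = conduction_velocity(1.0, 5.2, 20.0, 120.0)
```

For these illustrative numbers the estimate equals the 0.42 m/s baseline quoted for the unloaded configuration.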

Besides, for higher values of G<sub>s</sub>, we obtain two unexpected results. First, for G<sub>s</sub> > 0.15 we observe a decrement of CV for different values of D<sub>1</sub>. Second, for the particular combination (D<sub>1</sub> < −10<sup>−4</sup>, G<sub>s</sub> > 0.15) the wave disappears from the domain or annihilates due to excessive activation (see e.g., side panels in **Figure 5** or the top row in **Figure 8**). Consequently, we are not able to measure any propagation (which is reflected in the combinations marked with × in the figure). This last result is somewhat counterintuitive since, as evidenced by **Figure 1**, a complete depolarization of the tissue with AP propagation is observed experimentally in the case of fixed stretch. To support this point, in **Figure 6** we provide a representative sequence of point-wise activations delivered on our simplified 2D domain and mimicking the experimental protocol conducted in **Figure 1** for a selected parameter choice, i.e., (D<sub>1</sub>, G<sub>s</sub>) = (−0.75 · 10<sup>−4</sup>, 0). In this case, the AP excitation wave propagates differently according to the applied stretch state, for both horizontal and vertical displacement and traction. In addition, the computed CVs change similarly to what is observed in **Figure 2**. We remark that such a comparison with experimental observations is purely qualitative and does not represent a definitive validation of the model.

### 3.2. S1-S2 Excitation Protocol

We further investigate the strength of MEF coupling effects. In particular, we want to determine which specific contribution (stretch-activated currents or stress-assisted diffusion) exhibits a better match against experimental evidence, and for this we assess changes in the S1-S2 stimulation protocol. In practice, in order to induce a spiral wave on an excitable tissue, one typically generates a planar electrical excitation (S1), followed by a second broken stimulus (S2) during the repolarization phase of the S1 wave, the so-called vulnerable window (Karma, 2013). In our case, we selected a reduced set of MEF parameters (D<sub>1</sub>, G<sub>s</sub>), indicated in **Table 3** as A, B, C, D. These values are motivated by the results from **Figure 5**. In particular, we select only the parameter combinations that produce either a unique decrement or increment of CV.

TABLE 3 | Parameter calibration associated to the S1-S2 protocol.

*Combination of MEF parameters (D<sub>1</sub>, G<sub>s</sub>), corresponding CV, and minimum, t<sup>min</sup><sub>S2</sub>, and maximum, t<sup>max</sup><sub>S2</sub>, stimulation times required for spiral wave onset (vulnerable window).*

**Figure 7** shows the different dynamics obtained via the S1-S2 protocol for the four different sets of MEF parameters. The first column is set at 100 ms from the S1 stimulus for all the combinations, while the remaining frames are selected to highlight the elicited behavior. As a result, we observe that the deformation state of the tissue influences the overall dynamics differently. The first column highlights the variability in the AP wavelength, representing the spatial extension of the activation wave, which is due to the different repolarization states of the tissue induced by stress-assisted diffusion and stretch-activated currents. In particular, the AP wavelength varies as > 6.2 cm for case A, = 6.2 cm for case B, and < 2 cm for cases C, D.

In fact, when the G<sub>s</sub> contribution is present, the excitation wave is much reduced with respect to the profiles generated with the electrophysiological three-variable model (2) and fine-tuned on experimental data. Such an effect is not present when G<sub>s</sub> = 0.

Secondly, cases A and B (that is, where only D<sub>1</sub> is activated) provide a similar behavior for spiral onset, and case B shows the expected reduction in CV. In contrast, cases C and D (where the contribution of G<sub>s</sub> is also present) induce much more complex dynamics, not expected in an isotropic medium. In particular, case C leads to a wave break and the generation of multiple spirals at the S2 stimulus, which eventually collide and result in a single spiral wave. On the other hand, case D shows a more stable behavior generated by the presence of D<sub>1</sub>.

In addition, **Table 3** also provides the minimum and maximum delay for the S2 stimulation (vulnerable window) that allows a spiral wave to be induced in the uniaxially stretched tissue. It is evident that the presence of SAC reduces the minimum S2 stimulation time, t<sup>min</sup><sub>S2</sub>, by about 100 ms with respect to the other cases and slightly increases the overall time span of the vulnerable window. Such a variation is motivated by the additional reaction current induced by the presence of I<sub>sac</sub>(λ, u) everywhere in the medium, but it is not expected from the experimental isochrones provided in **Figure 1**.

To further corroborate this analysis, we provide in the top panels of **Figure 8** an additional sequence referring to the combination (D<sub>1</sub>, G<sub>s</sub>) = (−1.5 · 10<sup>−4</sup>, 0.25) in the case with static displacement boundary conditions, which falls in the range where no CV was measured. As anticipated, an excessive contribution due to SAC elicits extra activations where the stretch is maximum, i.e., at the corners of the domain. This particular behavior is not obtained when the stress-assisted contribution D<sub>1</sub> is very high. Next, the bottom panels of **Figure 8** show results using the combination (D<sub>1</sub>, G<sub>s</sub>) = (−0.75 · 10<sup>−4</sup>, 0.125), which allows the quantification of CV but can eventually lead to spiral breakup and non-sustainability of the arrhythmic patterns due to the mechanical state of the tissue (corresponding to the case of dynamic traction, described below). This is a representative example of the key importance of boundary conditions and how MEF effects could be effectively translated into clinical studies.

### 3.3. Spiral Drift and Effects due to Boundary Conditions

Finally, we turn to the analysis of the meandering of the spiral tip in long-run simulations (4 s of physical time), comparing the four selected sets of parameters A, B, C, D in combination with static/dynamic–displacement/traction boundary conditions. In particular, we initiate the spiral wave via the S1-S2 stimulation protocol as discussed in the previous section, in the absence of any mechanical loading, so as to start from the same initial conditions for each selected case. After spiral onset and stabilization (namely, for t > t<sub>2</sub> = 250 ms), we apply the following four different loadings:


Second row shows the dynamic traction configuration for which the initiated spiral wave goes through breakup due to the effect of mechanical loading.

• Dynamic traction: uniaxial time-dependent force t̃<sub>i</sub>(t) = t<sub>max</sub> sin<sup>2</sup>(πt/400) applied on the left and right boundaries while keeping the bottom side clamped (**Figure 9D**).

For each mechanical loading, panels in **Figure 9** show the trajectories of the spiral tip for the four MEF parameter combinations. Two important aspects are worthy of attention.

First, for each combination of the mechanical loading, the presence of the stress-assisted conductivity D<sub>1</sub> tends to stabilize the meandering (see black and green traces). This behavior is particularly evident in **Figure 9C**, where the combination D<sub>1</sub> = −0.75 · 10<sup>−4</sup>, G<sub>s</sub> = 0 results in a localized core, while the case D<sub>1</sub> = 0, G<sub>s</sub> = 0 presents a circular, but slightly drifting core. Consequently, local stress-based heterogeneities appear in the medium when D<sub>1</sub> is different from zero, leading to pinning-like phenomena also observed in Cherry and Fenton (2008), Cherubini et al. (2012), Jiménez and Steinbock (2012), and Liu et al. (2013). Moreover, these conditions are associated with an ellipsoidal shape of the core underlying the effective anisotropy induced by the stress-assisted coupling. All these observations agree with the conclusions from the extended analysis conducted on the chosen AP model in the original work from Fenton and Karma (1998b).

FIGURE 9 | Tip trajectories for four combinations of MEF parameters (*D*<sub>1</sub>, *G*<sub>s</sub>) (see Table 3), applying static/dynamic–displacement/traction boundary conditions as indicated in the corresponding inset. Inset color code refers to the magnitude of the displacement field. (A) The last second of simulation is shown for the four cases with localized cores. (B) The last 3 s of simulations are shown highlighting the differences of the meandering. (C) Different times are shown for the four cases since for *G*<sub>s</sub> > 0 the spirals exit the domain soon after initiation. (D) The last 3 s are shown for the case *G*<sub>s</sub> > 0 highlighting the different meandering obtained with respect to *G*<sub>s</sub> = 0. Minor discontinuities are due to the frame resolution for post processing analysis and are not linked to the accuracy of the numerical solution.

Secondly, when SAC is also present, the spiral meandering is unpredictable and strongly dependent on the applied boundary conditions (see blue and red traces). In this scenario, it is interesting to note that static loading induces a simple meandering which eventually pushes the spiral wave out of the domain (see **Figure 9C**), whereas dynamic conditions dictate a chaotic behavior that makes the spiral either explore the whole domain or exit it. These patterns seem to be extreme conditions of hyper-excitability not expected in a two-dimensional isotropic medium (Fenton and Karma, 1998a; Fenton et al., 2002).

Finally, we highlight the symmetry of the observed behavior according to the clockwise or counterclockwise rotation of the spiral. This particular analysis is provided in **Figure 10** and further links the excitation dynamics to the mechanical features. The different traces refer to the spiral core meandering observed for a dynamic uniaxially stretched case with MEF parameters D<sub>1</sub> = 0, G<sub>s</sub> = 0.125 and initiated via the S1-S2 stimulation protocol: case (a) compares a clockwise and a counterclockwise spiral propagation; case (b) shows a counterclockwise spiral core initiated from the top (red) and from the bottom (blue). Corresponding sequences are also shown as side panels. This result is limited by the simplified nature of the domain adopted, i.e., 2D isotropic. A more realistic computational domain, embedding fiber directionality and tissue thickness, would show more involved dynamics in a complex spatiotemporal and clinically relevant perspective.

progressive spiral frames for the two cases.

### 4. CONCLUSION

We have advanced a minimal model for the electromechanics of cardiac tissue, where the mechano-electrical feedback is incorporated through two competing mechanisms: the stretch-activated currents commonly found in the literature, and the stress-assisted diffusion (or stress-assisted conductivity) recently proposed by Cherubini et al. (2017). Both the electrophysiology and the mechanical response adopt a phenomenological simplified description, but a preliminary validation is provided through a set of numerical simulations that agree qualitatively with a set of experimental data for pig right ventricle.

The implications of the intensity and degree of nonlinearity assumed for the stress-assisted diffusion effect are studied from the viewpoint of changes in the conduction velocity and the dynamics of spiral waves in simplified 2D domains. Multiple electrical stimulation protocols and non-trivial mechanical loadings have been investigated, highlighting the strong coupling due to the different MEF contributions. The analysis supports the hypothesis that the simplistic formulation adopted for stretch-activated currents deviates from the experimental evidence, in line with recent contributions addressing the coupled modeling of SACs and stretch-induced myofilament calcium release at the myocyte level (Timmermann et al., 2017). On the other hand, in a homogenized setting, the stress-assisted diffusion formulation produces a series of interesting phenomena that qualitatively match heterogeneities and anisotropies observed during mechanical stretching of pig right ventricle via fluorescence optical mapping.

Limitations of the present work are partially linked to the phenomenological approach adopted to describe the complex multiscale mechanisms intrinsic in the cardiac tissue and partially due to the simplified computational domain. In this regard, we aim at investigating more reliable stretch-activated current formulations leading to alternans behaviors (Galice et al., 2016) within a multiscale mechanobiology perspective (Nava et al., 2016; Stålhand et al., 2016; Cyron and Humphrey, 2017) and taking into account the intracellular calcium cycling influenced by mechanical stretch, because all these effects have been proposed as concurring mechanisms of arrhythmogenesis within the heart. From the mechanical point of view, we mention as the main limitation the adoption of a simplified isotropic hyperelastic material model, which can be generalized to more complex and reliable formulations. This will include, for example, active strain anisotropies, muscular and collagen fiber distributions in an orthotropic mechanical framework that the authors have been extensively developing during the last decade (Cherubini et al., 2008; Nobile et al., 2012; Gizzi et al., 2015, 2016, 2018; Pandolfi et al., 2016). Such a generalization will maintain the nature of the present theoretical framework in terms of MEF competing effects. In this line, we also aim to generalize our theoretical and computational approach toward intrinsic multiscale and multiphysics mechano-transduction problems (Weinberg et al., 2017; Lenarda et al., 2018), e.g., the uterine smooth muscle activity (Young, 2016; Yochum et al., 2017) or the intestine biomechanics activity (Pandolfi et al., 2017; Brandstaeter et al., 2018) by employing network approaches (Giuliani et al., 2014; Robson et al., 2018) and data assimilation procedures (Barone et al., 2017).
In addition, the investigation of the complex spatiotemporal dynamics, chaos control and multiphysics couplings in excitable systems (see e.g., Hörning et al., 2017; Christoph et al., 2018) can be emphasized within the proposed electromechanical framework by using realistic three-dimensional cardiac structures (Lafortune et al., 2012). We also mention implications of the proposed models in the mathematical study of general stress-assisted diffusion problems, as recently carried out in Gatica et al. (2018). Finally, we hope that the present contribution may open new experimental studies to translate the complex MEF phenomena into the clinical practice (Meijborg et al., 2017; Orini et al., 2017) identifying novel risk indices for cardiac arrhythmias (Gizzi et al., 2017).

## ETHICS STATEMENT

All experiments conform to the current Guide for Care and Use of Laboratory Animals published by the National Institutes of Health (NIH Publication No. 85–23, revised 1996), and approved by the Office of Research and Integrity Assurance at Georgia Tech.

### AUTHOR CONTRIBUTIONS

AL, AG, RR-B, CC, FF, and SF design of the study; AL, AG, and RR-B numerical methods and computational tests; FF experimental measurements; AL and AG Statistical analysis; AL, AG, RR-B, CC, FF, and SF Manuscript writing.

### ACKNOWLEDGMENTS

This work has been supported by the Italian National Group of Mathematical Physics GNFM-INdAM; by the International Center for Relativistic Astrophysics Network ICRANet; by the London Mathematical Society through its Grant Scheme 4; and by the EPSRC through the Research Grant EP/R00207X/1.

#### Loppini et al. Competing Mechanisms in Cardiac Electromechanics

### REFERENCES


ventricular systolic wall thickening in cardiac electromechanics. Eur. J. Mech. 48, 129–142. doi: 10.1016/j.euromechsol.2013.10.009


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Loppini, Gizzi, Ruiz-Baier, Cherubini, Fenton and Filippi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Semi-implicit Non-conforming Finite-Element Schemes for Cardiac Electrophysiology: A Framework for Mesh-Coarsening Heart Simulations

Javiera Jilberto<sup>1</sup> and Daniel E. Hurtado<sup>1,2</sup>\*

<sup>1</sup> Department of Structural and Geotechnical Engineering, School of Engineering, Pontificia Universidad Católica de Chile, Santiago, Chile, <sup>2</sup> Institute for Biological and Medical Engineering, Schools of Engineering, Medicine and Biological Sciences, Pontificia Universidad Católica de Chile, Santiago, Chile

#### Edited by:

Mariano Vázquez, Barcelona Supercomputing Center, Spain

#### Reviewed by:

Joakim Sundnes, Simula Research Laboratory, Norway Mark Potse, Inria Bordeaux-Sud-Ouest Research Centre, France

> \*Correspondence: Daniel E. Hurtado dhurtado@ing.puc.cl

#### Specialty section:

This article was submitted to Computational Physiology and Medicine, a section of the journal Frontiers in Physiology

Received: 08 May 2018 Accepted: 09 October 2018 Published: 30 October 2018

#### Citation:

Jilberto J and Hurtado DE (2018) Semi-implicit Non-conforming Finite-Element Schemes for Cardiac Electrophysiology: A Framework for Mesh-Coarsening Heart Simulations. Front. Physiol. 9:1513. doi: 10.3389/fphys.2018.01513

The field of computational cardiology has steadily progressed toward reliable and accurate simulations of the heart, showing great potential in clinical applications such as the optimization of cardiac interventions and the study of pro-arrhythmic effects of drugs in humans, among others. However, the computational effort demanded by in-silico studies of the heart remains challenging, highlighting the need for novel numerical methods that can improve the efficiency of simulations while targeting an acceptable accuracy. In this work, we propose a semi-implicit non-conforming finite-element scheme (SINCFES) suitable for cardiac electrophysiology simulations. The accuracy and efficiency of the proposed scheme are assessed by means of numerical simulations of the electrical excitation and propagation in regular and biventricular geometries. We show that the SINCFES allows for coarse-mesh simulations that reduce the computation time when compared to fine-mesh models while delivering wavefront shapes and conduction velocities that are more accurate than those predicted by traditional finite-element formulations based on the same coarse mesh, thus improving the accuracy-efficiency trade-off of cardiac simulations.

Keywords: non-conforming finite elements, computational cardiology, cardiac electrophysiology, conduction velocity, nonlinear finite elements

### 1. INTRODUCTION

Computer simulations of the electrical activity of the heart have increasingly gained attention in the medical community, as they have steadily shown potential in the study of cardiac diseases and in the design of novel cardiac therapies. Current models of the human heart are able to represent the complex three-dimensional anatomical structure of the heart chambers, incorporating key functional features such as the Purkinje network and the cardiomyocyte orientation (Vadakkumpadan et al., 2009). Such advanced representation of the heart has enabled novel in-silico studies of undesired pro-arrhythmic effects of drugs in patients (Sahli Costabal et al., 2018), potentially reducing the number of subjects needed in a clinical trial by aiding the experiment design. Computational models of the heart have also shown promise in assisting the design of effective therapies for terminating atrial fibrillation (Trayanova et al., 2018). While these examples can only confirm the tremendous relevance of computational models in advancing the field of cardiology, they share the fundamental challenge of being highly demanding in terms of wall-clock time needed in computer simulations.

Mathematical models of the heart require the computer implementation of spatio-temporal discretization techniques in order to obtain a sequence of numerical representations of the physiological fields under study. Two fundamental aspects directly responsible for the computation time (CT) in a heart simulation are the ionic model used to account for subcellular electrochemical mechanisms, and the level of spatio-temporal discretization in terms of time-step size and mesh size (Sundnes et al., 2006). The choice of the mesh size typically faces a well-known trade-off between accuracy and efficiency, as decreasing the mesh size in a simulation results in more accurate numerical approximations, at the cost of increasing the number of degrees of freedom (DOFs), which drives the CT. Indeed, current simulations of the heart typically employ mesh sizes in the range of tens to hundreds of micrometers for domains with lengths on the order of centimeters, which ultimately translates into large systems of equations with several millions of DOFs that need to be solved at each time step. Such high dimensionality renders the solution of heart simulations extremely challenging for personal computers, and calls for improving their implementation in high-performance computing (HPC) platforms (Niederer et al., 2011a; Vazquez et al., 2011).

In the particular case of cardiac electrophysiology simulations, a common criterion to select the mesh size is the ability of the numerical simulation to recover an accurate conduction velocity (CV) and wavefront shape (Pathmanathan et al., 2010; Krishnamoorthi et al., 2013; Dupraz et al., 2015). It has been shown that both the wavefront shape and the CV depend strongly on the spatial discretization, which for finite-element (FE) discretizations using linear basis functions results in a significant loss of accuracy for mesh sizes > 0.1 mm (Pezzuto et al., 2016). In order to achieve larger mesh sizes, higher-order FE formulations have been proposed, which show that FE Lagrange basis functions of order 2 and 3 result in accurate CV for coarser meshes (Arthurs et al., 2012; Pezzuto et al., 2016). It should be noted, however, that higher-order FE schemes based on Lagrange basis functions necessarily increase the total number of DOFs in simulations when compared to linear-element formulations, and they require additional computational effort for quadrature procedures, as higher-order basis functions demand the use of more quadrature-point evaluations (Cantwell et al., 2014). Recently, Hurtado and Rojas (2018) introduced a non-conforming finite-element scheme (NCFES) for the spatial discretization of the monodomain equation of cardiac electrophysiology that allows for the use of coarse meshes without significant loss of accuracy measured in terms of CV and wavefront shape. More specifically, hexahedral trilinear elements (Q1) were enhanced with non-conforming basis functions of degree 2 to create a non-conforming element (Q1NC) that is capable of representing a second-order polynomial within the element domain, a concept widely employed in the context of solid mechanics FE simulations (Wilson et al., 1973; Taylor et al., 1976). Further, they showed that the DOFs associated with the non-conforming basis functions can be solved at the element level, and therefore the number of global DOFs of the Q1NC scheme equals that of a standard Q1 FE scheme. As a result, Q1NC simulations delivered a CV and wavefront shape similar to that of second-order Lagrange formulations (Q2) at a computational cost on the order of that of a Q1 formulation.

During the development of the NCFES for cardiac electrophysiology, a fully-implicit (FI) backward-Euler time-stepping method was considered (Hurtado and Rojas, 2018). While FI schemes have important advantages in delivering a larger time-step stability region in cardiac simulations (Ying et al., 2008; Hurtado and Henao, 2014), they require the solution of a large system of non-linear equations at each time step that can be very costly in computational terms, and may not be well-suited to parallel-computing platforms when compared to other numerical schemes. To improve the computational efficiency, the semi-implicit integration method has been proposed in the literature for solving the semi-discrete equations resulting from standard FE discretizations, showing a relevant decrease in the CT of cardiac simulations, as well as being amenable to HPC platforms (Whiteley, 2006; Pathmanathan et al., 2010). Consequently, the scientific question that motivates this work is: Can we further improve the efficiency-accuracy trade-off in cardiac simulations by combining non-conforming FE spatial discretizations with semi-implicit time-integration schemes? To answer this question, in the following we develop the numerical framework and present an algorithm for the implementation of a semi-implicit non-conforming FE scheme to solve the monodomain electrophysiology equations, and investigate the numerical consequences and potential contributions to cardiac simulations.

### 2. METHODS

### 2.1. Monodomain Model of Cardiac Electrophysiology

Let Ω ⊂ ℝ<sup>3</sup> be the heart domain where electrical impulses travel during the time interval [0, T], and V<sub>m</sub> : Ω × [0, T] → ℝ be the transmembrane potential. A local statement of current balance yields the monodomain equation (Pullan et al., 2005)

$$A\_{\rm m} \left( C\_{\rm m} \frac{\partial V\_{\rm m}}{\partial t} + I\_{\rm ion}(V\_{\rm m}, \mathbf{r}) \right) - \text{div}(\boldsymbol{\sigma} \nabla V\_{\rm m}) = 0, \quad \text{in } \Omega \times (0, T], \tag{1}$$

where A<sub>m</sub> and C<sub>m</sub> are the surface-to-volume ratio and membrane capacitance, respectively, **σ** is the conductivity tensor, I<sub>ion</sub> is the ionic current depending on the transmembrane potential V<sub>m</sub>, and **r** : Ω × (0, T] → ℝ<sup>m</sup> is a vector field of state variables that includes gating variables and ion concentrations. For convenience, we consider the normalized transmembrane potential field

$$\phi(\mathbf{x},t) = \frac{V\_{\rm m}(\mathbf{x},t) - V\_{\rm r}}{V\_{\rm p} - V\_{\rm r}},$$

where V<sub>p</sub> and V<sub>r</sub> are the peak and resting voltages, respectively. Based on this normalization, we obtain the non-dimensional monodomain equation,

$$\frac{\partial \phi}{\partial t} - \text{div}(\mathbf{D}\nabla \phi) - f(\phi, \mathbf{r}) = 0 \quad \text{in } \Omega \times (0, T], \tag{2}$$

where **D** = **σ**/(A<sub>m</sub>C<sub>m</sub>) is the normalized conductivity tensor, and f(φ, **r**) = −I<sub>ion</sub>(V<sub>m</sub>(φ), **r**)/[C<sub>m</sub>(V<sub>p</sub> − V<sub>r</sub>)] is the normalized ionic current. The time evolution of state variables is governed by kinetic equations of the form

$$\frac{\partial \mathbf{r}}{\partial t} = \mathbf{g}(\phi, \mathbf{r}) \quad \text{in } \Omega \times (0, T]. \tag{3}$$

The expressions for f(φ, **r**) and **g**(φ, **r**) depend on the choice of ionic model representing the transmembrane ionic current in a single cell. Equations (2) and (3) are complemented with Dirichlet and Neumann boundary conditions,

$$
\phi = \bar{\phi}, \quad \text{on } \partial \Omega\_{\phi} \times (0, T], \tag{4}
$$

$$
\mathbf{q} \cdot \mathbf{n} = \bar{q}, \quad \text{on } \partial \Omega\_{q} \times (0, T], \tag{5}
$$

respectively, as well as initial conditions

$$\begin{aligned} \phi(\mathbf{x},0) &= \phi\_0(\mathbf{x}), \quad \mathbf{x} \in \Omega, \\ \mathbf{r}(\mathbf{x},0) &= \mathbf{r}\_0(\mathbf{x}), \quad \mathbf{x} \in \Omega. \end{aligned}$$

To state the weak form of the cardiac electrophysiology problem, we consider trial spaces S<sup>φ</sup>, S<sup>r</sup> and test spaces V<sup>φ</sup>, V<sup>r</sup> defined as

$$\mathcal{S}^{\phi} = \{ \phi \in L^2((0, T]; H^1(\Omega, \mathbb{R})): \phi = \bar{\phi} \text{ on } \partial\Omega\_{\phi} \times (0, T] \} \tag{6}$$

$$\mathcal{S}^r = \{ \mathbf{r} \in L^2((0, T]; L^2(\Omega, \mathbb{R}^m)) \}\tag{7}$$

$$\mathcal{V}^{\phi} = \{ \nu \in H^1(\Omega, \mathbb{R}): \nu = 0 \text{ on } \partial\Omega\_{\phi} \} \tag{8}$$

$$\mathcal{V}^r = \{ \eta \in L^2(\Omega, \mathbb{R}^m) \}\tag{9}$$

Multiplying (2) and (3) by appropriate test functions, integrating over Ω and applying the divergence theorem yields the weak equations, and the statement of the weak formulation reads: ∀ t ∈ (0, T], find (φ, **r**) ∈ S<sup>φ</sup> × S<sup>r</sup> such that

$$\begin{aligned} G^{\phi}[(\phi,\mathbf{r}),(\nu,\eta)] &:= \int\_{\Omega} \nu \frac{\partial \phi}{\partial t} \, \mathrm{d}\mathbf{x} + \int\_{\Omega} \nabla \nu \cdot \mathbf{D} \nabla \phi \, \mathrm{d}\mathbf{x} \\ &\quad - \int\_{\Omega} \nu f(\phi,\mathbf{r}) \, \mathrm{d}\mathbf{x} + \int\_{\partial \Omega\_{q}} \nu \bar{q} \, \mathrm{d}s \\ &= 0, \quad \forall \, \nu \in \mathcal{V}^{\phi} \end{aligned} \tag{10}$$

$$\begin{aligned} G^r[(\phi, \mathbf{r}), (\nu, \eta)] &:= \int\_{\Omega} \eta \left\{ \frac{\partial \mathbf{r}}{\partial t} - \mathbf{g}(\phi, \mathbf{r}) \right\} \mathrm{d}\mathbf{x} \\ &= 0, \quad \forall \, \eta \in \mathcal{V}^r \end{aligned} \tag{11}$$

#### 2.2. Spatial Discretization Using a Non-conforming Finite-Element Scheme

A Galerkin finite-element scheme to solve the weak formulation of the monodomain problem can be stated as follows. Let Ω<sup>h</sup> = ∪<sub>e=1</sub><sup>N<sub>el</sub></sup> Ω<sub>e</sub> be a domain discretization, where N<sub>el</sub> is the number of elements, and all elements comply with the condition Ω<sub>i</sub> ∩ Ω<sub>j</sub> = ∅ for i ≠ j. We construct finite-dimensional subspaces S<sup>φ</sup><sub>h</sub> ⊂ S<sup>φ</sup>, S<sup>r</sup><sub>h</sub> ⊂ S<sup>r</sup> and V<sup>φ</sup><sub>h</sub> ⊂ V<sup>φ</sup>, V<sup>r</sup><sub>h</sub> ⊂ V<sup>r</sup>, to solve the following FE problem (Göktepe and Kuhl, 2009; Hurtado and Kuhl, 2014): ∀ t ∈ (0, T], find (φ<sup>h</sup>, **r**<sup>h</sup>) ∈ S<sup>φ</sup><sub>h</sub> × S<sup>r</sup><sub>h</sub> such that

$$G^{\phi}[(\phi^h, \mathbf{r}^h), (\nu^h, \eta^h)] = 0, \quad \forall \nu^h \in \mathcal{V}\_h^{\phi},$$

$$G^r[(\phi^h, \mathbf{r}^h), (\nu^h, \eta^h)] = 0, \quad \forall \eta^h \in \mathcal{V}\_h^r.$$

A traditional FE discretization employs the hexahedral isoparametric finite-element space,

$$\mathcal{V}\_h^\phi := \left\{ \nu^h \in C^0(\Omega^h, \mathbb{R}) : \nu^h|\_{\Omega\_e} \in Q\_k(\Omega\_e), \; e = 1, \dots, N\_{\rm el} \right\}$$

where Q<sub>k</sub>(Ω<sub>e</sub>) represents the space of isoparametric functions resulting from the n-fold tensor product of 1-D Lagrange polynomials of order k, which are defined over the standard (isoparametric) domain Ω̂ = [−1, 1]<sup>n</sup> and mapped to a hexahedral element. We expand an element ν<sup>h</sup> ∈ V<sup>φ</sup><sub>h</sub> as

$$\nu^h(\mathbf{x}) = \sum\_{A=1}^{N\_{\rm dofs}} N\_A(\mathbf{x}) \nu\_A,$$

where {N<sub>A</sub>}<sub>A=1,…,N<sub>dofs</sub></sub> are the basis functions, N<sub>dofs</sub> is the number of element nodes with unknown degrees of freedom, and {ν<sub>A</sub>}<sub>A=1,…,N<sub>dofs</sub></sub> are the nodal coefficients. Using the same element basis functions, we expand the trial functions as

$$\phi^h(\mathbf{x}, t) = \sum\_{A=1}^{N\_{\rm dofs}} N\_A(\mathbf{x}) u\_A(t) + u\_{\rm BC}(\mathbf{x}, t), \tag{12}$$

where {u<sub>A</sub>(t)}<sub>A=1,…,N<sub>dofs</sub></sub> correspond to the nodal values of the transmembrane potential field, and u<sub>BC</sub> ∈ S<sup>φ</sup> is a function that satisfies the boundary conditions (4), i.e., u<sub>BC</sub> = φ̄ on ∂Ω<sub>φ</sub> × (0, T]. For simplicity, and without loss of generality, in the following we assume that u<sub>BC</sub> = 0. To construct the elements of V<sup>r</sup><sub>h</sub>, we write

$$\eta^h(\mathbf{x}) = \sum\_{e=1}^{N\_{\rm el}} \sum\_{q=1}^{N\_{\rm q}} M\_q^{e}(\mathbf{x}) \eta\_q^{e},\tag{13}$$

where M<sup>e</sup><sub>q</sub> is a characteristic function defined by

$$M\_q^e(\mathbf{x}) = \begin{cases} 1, & \mathbf{x} \in \Omega\_{e,q} \\ 0, & \mathbf{x} \notin \Omega\_{e,q} \end{cases} \tag{14}$$

and Ω<sub>e,q</sub> ⊂ Ω<sub>e</sub> is the subdomain containing the q-th quadrature point **x**<sub>q</sub>, and is such that ∪<sub>q=1</sub><sup>N<sub>q</sub></sup> Ω<sub>e,q</sub> = Ω<sub>e</sub> and Ω<sub>e,q</sub> ∩ Ω<sub>e,q′</sub> = ∅ whenever q ≠ q′. Analogously, we expand an element **r**<sup>h</sup> ∈ S<sup>r</sup><sub>h</sub> as

$$\mathbf{r}^h(\mathbf{x}, t) = \sum\_{e=1}^{N\_{\rm el}} \sum\_{q=1}^{N\_{\rm q}} M\_q^{e}(\mathbf{x}) \mathbf{r}\_q^{e}(t), \tag{15}$$

where **r**<sup>e</sup><sub>q</sub> : (0, T] → ℝ<sup>m</sup> represents the time evolution of the state variables at the q-th quadrature point.

In this work, we consider a non-conforming spatial-discretization scheme for the monodomain equations (Hurtado and Rojas, 2018). To this end, we rewrite the residuals as

$$\begin{aligned} G^{\phi}[(\phi, \mathbf{r}), (\nu, \eta)] = \sum\_{e=1}^{N\_{\rm el}} \Big\{ &\int\_{\Omega\_e} \nu \frac{\partial \phi}{\partial t} \, \mathrm{d}\mathbf{x} + \int\_{\Omega\_e} \nabla \nu \cdot \mathbf{D} \nabla \phi \, \mathrm{d}\mathbf{x} \\ &- \int\_{\Omega\_e} \nu f(\phi, \mathbf{r}) \, \mathrm{d}\mathbf{x} + \int\_{\partial \Omega\_e \cap \partial \Omega\_q} \nu \bar{q} \, \mathrm{d}s \Big\}, \end{aligned} \tag{16}$$

$$G^{r}[(\phi,\mathbf{r}),(\nu,\eta)] = \sum\_{e=1}^{N\_{\rm el}} \left\{ \int\_{\Omega\_e} \eta \left\{ \frac{\partial \mathbf{r}}{\partial t} - \mathbf{g}(\phi,\mathbf{r}) \right\} \mathrm{d}\mathbf{x} \right\},\tag{17}$$

and note that in such form, integrability of the trial and test functions and their weak derivatives is required only at the element level. We enhance the polynomial basis of V<sup>φ</sup><sub>h</sub> at the element level by adding polynomial terms not included in Q<sub>k</sub>(Ω<sub>e</sub>). To this end, we consider the non-conforming space

$$\mathcal{E}\_h^{\phi} := \left\{ \beta^h : \beta^h|\_{\Omega\_e} \in P\_{k+m}(\Omega\_e) \backslash Q\_k(\Omega\_e) \right\},$$

where m ∈ ℤ<sup>+</sup> and P<sub>k+m</sub>(Ω<sub>e</sub>) is the space of polynomial functions of degree k + m defined on the standard domain Ω̂. We then consider enhanced test functions ν<sup>h</sup>, which we expand as

$$\nu^{h}(\mathbf{x}) = \sum\_{A=1}^{N\_{\rm dofs}} N\_A(\mathbf{x}) \nu\_A + \sum\_{e=1}^{N\_{\rm el}} \sum\_{c=1}^{N\_{\rm nc}} W\_{c}^{e}(\mathbf{x}) \beta\_{c}^{e} \tag{18}$$

where β<sup>e</sup><sub>c</sub> ∈ ℝ are coefficients, W<sup>e</sup><sub>c</sub> are non-conforming element basis functions, and it holds that W<sup>e</sup><sub>c</sub>(**x**) = 0 for **x** ∉ Ω<sub>e</sub>. Analogously, we enhance S<sup>φ</sup><sub>h</sub> with the time-dependent non-conforming space F<sup>φ</sup><sub>h</sub>, and expand the enhanced trial functions as

$$\phi^{h}(\mathbf{x},t) = u^{h}(\mathbf{x},t) + \alpha^{h}(\mathbf{x},t) \tag{19}$$

where

$$u^h(\mathbf{x}, t) := \sum\_{B=1}^{N\_{\rm dofs}} N\_B(\mathbf{x}) u\_B(t) \tag{20}$$

$$\alpha^h(\mathbf{x}, t) := \sum\_{e=1}^{N\_{\rm el}} \sum\_{d=1}^{N\_{\rm nc}} W\_d^{e}(\mathbf{x}) \alpha\_d^{e}(t), \tag{21}$$

and α<sup>e</sup><sub>d</sub> : (0, T] → ℝ is a time-dependent coefficient that scales the non-conforming basis functions W<sup>e</sup><sub>d</sub>. Substitution of approximations (13), (15), (18), and (19) into the residuals (16) and (17) yields the following semi-discrete problem: ∀ t ∈ (0, T], find (u<sup>h</sup>, α<sup>h</sup>, **r**<sup>h</sup>) ∈ S<sup>φ</sup><sub>h</sub> × F<sup>φ</sup><sub>h</sub> × S<sup>r</sup><sub>h</sub> such that

$$\begin{aligned} &\int\_{\Omega} N\_A \{\dot{u}^h + \dot{\alpha}^h\} \, \mathrm{d}\mathbf{x} + \int\_{\Omega} \nabla N\_A \cdot \mathbf{D} \nabla \{u^h + \alpha^h\} \, \mathrm{d}\mathbf{x} - \int\_{\Omega} N\_A f(u^h + \alpha^h, \mathbf{r}^h) \, \mathrm{d}\mathbf{x} \\ &\quad + \int\_{\partial \Omega\_{q}} N\_A \bar{q} \, \mathrm{d}s = 0, \quad A = 1, \ldots, N\_{\rm dofs}, \end{aligned} \tag{22}$$

$$\begin{aligned} &\int\_{\Omega\_e} W\_{c}^{e} \{\dot{u}^h + \dot{\alpha}^h\} \, \mathrm{d}\mathbf{x} + \int\_{\Omega\_e} \nabla W\_{c}^{e} \cdot \mathbf{D} \nabla \{u^h + \alpha^h\} \, \mathrm{d}\mathbf{x} \\ &\quad - \int\_{\Omega\_e} W\_{c}^{e} f(u^h + \alpha^h, \mathbf{r}^h) \, \mathrm{d}\mathbf{x} = 0, \quad e = 1, \dots, N\_{\rm el}; \ c = 1, \dots, N\_{\rm nc}, \end{aligned} \tag{23}$$

$$\int\_{\Omega\_e} M\_q^{e} \{ \dot{\mathbf{r}}^h - \mathbf{g}(u^h + \alpha^h, \mathbf{r}^h) \} \, \mathrm{d}\mathbf{x} = 0, \quad e = 1, \dots, N\_{\rm el}; \ q = 1, \dots, N\_{\rm q}. \tag{24}$$

#### 2.3. Semi-implicit Temporal Discretization

To integrate (22), (23) and (24) in time, we partition the time interval into [0, …, t<sub>n</sub>, t<sub>n+1</sub>, …, T], and approximate the time-dependent coefficients as □(t<sub>n</sub>) ≈ □<sub>n</sub>. For a generic time interval [t<sub>n</sub>, t<sub>n+1</sub>] we define Δt := t<sub>n+1</sub> − t<sub>n</sub>. We further group the expansion coefficients into vectors, and write

$$\begin{aligned} \boldsymbol{u}\_{n} &= [\boldsymbol{u}\_{n,1}, \dots, \boldsymbol{u}\_{n, \text{N}\_{\text{dofs}}}]^T, \quad \boldsymbol{\alpha}\_{n}^{\boldsymbol{e}} = [\boldsymbol{\alpha}\_{n,1}^{\boldsymbol{e}}, \dots, \boldsymbol{\alpha}\_{n, \text{N}\_{\text{nc}}}^{\boldsymbol{e}}]^T, \\ &\boldsymbol{r}\_{n}^{\boldsymbol{e}} = [\boldsymbol{r}\_{n,1}^{\boldsymbol{e}}, \dots, \boldsymbol{r}\_{n, \text{N}\_{\text{q}}}^{\boldsymbol{e}}]^T \end{aligned} \tag{25}$$

Following a semi-implicit (SI) time-integration approach (Whiteley, 2006), time derivatives are replaced by the finite-difference approximation

$$
\dot{\Box}(t\_{n+1}) \approx \frac{\Box\_{n+1} - \Box\_n}{\Delta t}.\tag{26}
$$

Diffusive terms in Equations (22) and (23) are evaluated at t = t<sub>n+1</sub> and the reaction terms are evaluated at t = t<sub>n</sub>. Evolution Equation (24) is integrated using an explicit forward-Euler scheme. As a result, the incremental time update for t = t<sub>n+1</sub> reads: Given **u**<sub>n</sub>, {**α**<sup>e</sup><sub>n</sub>, **r**<sup>e</sup><sub>n</sub>}<sub>e=1,…,N<sub>el</sub></sub>, find **u**<sub>n+1</sub>, {**α**<sup>e</sup><sub>n+1</sub>, **r**<sup>e</sup><sub>n+1</sub>}<sub>e=1,…,N<sub>el</sub></sub> such that

$$\begin{aligned} &\sum\_{B=1}^{N\_{\rm dofs}} \left\{ \int\_{\Omega} \frac{N\_A N\_B}{\Delta t} + \int\_{\Omega} \nabla N\_A \cdot \mathbf{D} \nabla N\_B \right\} u\_{n+1,B} + \sum\_{e=1}^{N\_{\rm el}} \sum\_{d=1}^{N\_{\rm nc}} \left\{ \int\_{\Omega} \frac{N\_A W\_d^e}{\Delta t} + \int\_{\Omega} \nabla N\_A \cdot \mathbf{D} \nabla W\_d^e \right\} \alpha\_{n+1,d}^e \\ &\quad - \left\{ \int\_{\Omega} \frac{N\_A}{\Delta t} \{u\_n^h + \alpha\_n^h\} + \int\_{\Omega} N\_A f(u\_n^h + \alpha\_n^h, \mathbf{r}\_n^h) \right\} = 0, \quad A = 1, \dots, N\_{\rm dofs}, \end{aligned} \tag{27}$$

$$\begin{aligned} &\sum\_{b=1}^{N\_{\rm en}} \underbrace{\left\{ \int\_{\Omega\_e} \frac{W\_c^e N\_b^e}{\Delta t} + \int\_{\Omega\_e} \nabla W\_c^e \cdot \mathbf{D} \nabla N\_b^e \right\}}\_{=: L\_{cb}^{e}} u\_{n+1,b}^{e} + \sum\_{d=1}^{N\_{\rm nc}} \underbrace{\left\{ \int\_{\Omega\_e} \frac{W\_c^e W\_d^e}{\Delta t} + \int\_{\Omega\_e} \nabla W\_c^e \cdot \mathbf{D} \nabla W\_d^e \right\}}\_{=: K\_{\alpha,cd}^{e}} \alpha\_{n+1,d}^{e} \\ &\quad - \underbrace{\left\{ \int\_{\Omega\_e} \frac{W\_c^e}{\Delta t} \{u\_n^h + \alpha\_n^h\} + \int\_{\Omega\_e} W\_c^e f(u\_n^h + \alpha\_n^h, \mathbf{r}\_n^h) \right\}}\_{=: p\_{\alpha,c}^{e}} = 0, \quad e = 1, \ldots, N\_{\rm el}; \ c = 1, \ldots, N\_{\rm nc}, \end{aligned} \tag{28}$$

$$\int\_{\Omega\_e} M\_q^e \left\{ \sum\_{s=1}^{N\_{\rm q}} M\_s^e \frac{\mathbf{r}\_{n+1,s}^e - \mathbf{r}\_{n,s}^e}{\Delta t} - \mathbf{g}(u\_n^h + \alpha\_n^h, \mathbf{r}\_n^h) \right\} \mathrm{d}\mathbf{x} = 0, \quad e = 1, \ldots, N\_{\rm el}; \ q = 1, \ldots, N\_{\rm q}, \tag{29}$$

where N<sup>e</sup><sub>b</sub> := N<sub>B</sub>|<sub>Ω<sub>e</sub></sub> is the restriction of the basis function to the local element domain, and u<sup>e</sup><sub>b</sub> is the corresponding nodal value, where lowercase letters indicate the local degree of freedom b corresponding to its global counterpart B. At this point, we note that Equation (28) can be written in matrix form as

$$\mathbf{L}^{e}\mathbf{u}\_{n+1}^{e} + \mathbf{K}\_{\alpha}^{e}\boldsymbol{\alpha}\_{n+1}^{e} - \mathbf{p}\_{\alpha}^{e}(\mathbf{u}\_n^{e}, \boldsymbol{\alpha}\_n^{e}, \mathbf{r}\_n^{e}) = \mathbf{0},$$

for e = 1, …, N<sub>el</sub>, from where we define the time update for the element non-conforming coefficient vector as

$$\begin{aligned} \boldsymbol{\alpha}\_{n+1}^{e,\*}(\mathbf{u}\_{n+1}^{e};\mathbf{u}\_{n}^{e},\boldsymbol{\alpha}\_{n}^{e},\mathbf{r}\_{n}^{e}) &:= \left\{\mathbf{K}\_{\alpha}^{e}\right\}^{-1} \mathbf{p}\_{\alpha}^{e}(\mathbf{u}\_{n}^{e},\boldsymbol{\alpha}\_{n}^{e},\mathbf{r}\_{n}^{e}) \\ &\quad - \left\{\mathbf{K}\_{\alpha}^{e}\right\}^{-1} \mathbf{L}^{e} \mathbf{u}\_{n+1}^{e} \end{aligned} \tag{30}$$

which is computed exclusively using element-level variables, given the element vector **u**<sup>e</sup><sub>n+1</sub>. To update the gating-variable field, we note from Equation (14) that Equation (29) can be solved point-wise at each quadrature point **x**<sub>q</sub> inside an element, and thus is equivalent to writing

$$\frac{\mathbf{r}\_{q,n+1}^{e} - \mathbf{r}\_{q,n}^{e}}{\Delta t} - \mathbf{g}(u\_n^h(\mathbf{x}\_q) + \alpha\_n^h(\mathbf{x}\_q), \mathbf{r}\_{q,n}^{e}) = 0, \quad e = 1, \dots, N\_{\rm el}; \quad q = 1, \dots, N\_{\rm q},$$

from which the (explicit) time update for the gating variables can be solved at the quadrature-point level as

$$\mathbf{r}\_{q,n+1}^{e,\*} (\mathbf{u}\_n^{e}, \boldsymbol{\alpha}\_n^{e}, \mathbf{r}\_n^{e}) := \mathbf{r}\_{q,n}^{e} + \Delta t \, \mathbf{g}(u\_n^h(\mathbf{x}\_q) + \alpha\_n^h(\mathbf{x}\_q), \mathbf{r}\_{q,n}^{e}). \tag{31}$$

We now turn to residual Equation (27), and note that it can be constructed by assembling element-level nodal contributions defined by

$$\begin{aligned} R\_a^{u,e} := &\sum\_{b=1}^{N\_{\rm en}} \underbrace{\left\{ \int\_{\Omega\_e} \frac{N\_a N\_b}{\Delta t} + \int\_{\Omega\_e} \nabla N\_a \cdot \mathbf{D} \nabla N\_b \right\}}\_{=: K\_{u,ab}^{e}} u\_{n+1,b}^{e} + \sum\_{d=1}^{N\_{\rm nc}} \underbrace{\left\{ \int\_{\Omega\_e} \frac{N\_a W\_d}{\Delta t} + \int\_{\Omega\_e} \nabla N\_a \cdot \mathbf{D} \nabla W\_d \right\}}\_{= (\mathbf{L}^{eT})\_{ad}} \alpha\_{n+1,d}^{e} \\ &- \underbrace{\left\{ \int\_{\Omega\_e} \frac{N\_a}{\Delta t} \{u\_n^h + \alpha\_n^h\} + \int\_{\Omega\_e} N\_a f(u\_n^h + \alpha\_n^h, \mathbf{r}\_n^h) \right\}}\_{=: p\_{u,a}^{e}}, \end{aligned} \tag{32}$$

which can also be written in matrix form at the element level as

$$\mathbf{R}^{u,e} = \mathbf{K}\_{u}^{e} \mathbf{u}\_{n+1}^{e} + \mathbf{L}^{eT} \boldsymbol{\alpha}\_{n+1}^{e} - \mathbf{p}\_{u}^{e} (\mathbf{u}\_n^{e}, \boldsymbol{\alpha}\_n^{e}, \mathbf{r}\_n^{e}). \tag{33}$$

Substituting update Equation (30) into Equation (33), we obtain an element residual that depends only on **u**<sup>e</sup><sub>n+1</sub> and reads

$$\begin{split} \mathbf{R}^{u,e} &= \underbrace{\left(\mathbf{K}\_{u}^{e} - \mathbf{L}^{eT} \left\{\mathbf{K}\_{\alpha}^{e}\right\}^{-1} \mathbf{L}^{e}\right)}\_{=: \mathbf{A}^{e}} \mathbf{u}\_{n+1}^{e} \\ &+ \underbrace{\mathbf{L}^{eT} \left\{\mathbf{K}\_{\alpha}^{e}\right\}^{-1} \mathbf{p}\_{\alpha}^{e}(\mathbf{u}\_{n}^{e}, \boldsymbol{\alpha}\_{n}^{e}, \mathbf{r}\_{n}^{e}) - \mathbf{p}\_{u}^{e}(\mathbf{u}\_{n}^{e}, \boldsymbol{\alpha}\_{n}^{e}, \mathbf{r}\_{n}^{e})}\_{=: \mathbf{b}^{e}\_{n}(\mathbf{u}\_{n}^{e}, \boldsymbol{\alpha}\_{n}^{e}, \mathbf{r}\_{n}^{e})} \end{split} \tag{34}$$

As a consequence, solving residual Equation (27) is equivalent to solving the matrix linear system

$$\mathbf{A}\mathbf{u}\_{n+1} + \mathbf{b}\_n = \mathbf{0} \tag{35}$$

where **A** and **b**<sub>n</sub> are the global matrix and vector assembled from element contributions defined in Equation (34). We note that Equation (35) defines the time update for the global potential vector

$$\mathbf{u}\_{n+1}^\*(\mathbf{u}\_n, \{ \boldsymbol{\alpha}\_n^e, \mathbf{r}\_n^e \}\_{e = 1, \dots, N\_{\rm el}}) := -\mathbf{A}^{-1} \mathbf{b}\_n \tag{36}$$
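As a sanity check on the condensation that leads from Equations (28) and (33) to Equations (34)-(36), the following NumPy sketch compares a monolithic solve of the coupled block system with the condensed route. It is an illustration only: the matrices are random symmetric positive-definite stand-ins rather than actual element matrices, the block sizes (8 nodal DOFs and 3 incompatible modes) merely mimic a single Q1NC element, and all variable names are assumptions introduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_u, n_a = 8, 3  # nodal DOFs and non-conforming DOFs (sizes of one Q1NC element)

# Random SPD stand-in for the coupled block matrix [[K_u, L^T], [L, K_alpha]].
G = rng.standard_normal((n_u + n_a, n_u + n_a))
M = G @ G.T + (n_u + n_a) * np.eye(n_u + n_a)
Ku, Lt = M[:n_u, :n_u], M[:n_u, n_u:]
L, Ka = M[n_u:, :n_u], M[n_u:, n_u:]
pu, pa = rng.standard_normal(n_u), rng.standard_normal(n_a)

# Monolithic solve of  K_u u + L^T alpha = p_u,  L u + K_alpha alpha = p_alpha.
full = np.linalg.solve(M, np.concatenate([pu, pa]))

# Condensed route: A u + b = 0 (Eqs. 34-36), then recover alpha (Eq. 30).
A = Ku - Lt @ np.linalg.solve(Ka, L)
b = Lt @ np.linalg.solve(Ka, pa) - pu
u = np.linalg.solve(A, -b)
alpha = np.linalg.solve(Ka, pa - L @ u)

print(np.allclose(full[:n_u], u), np.allclose(full[n_u:], alpha))
```

Both routes agree to round-off, which is the algebraic reason why the non-conforming coefficients never need to appear in the global system.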

We remark that matrix **A** does not depend on the coefficient vectors, and therefore takes the same values for all time steps. Thus, it can be computed in an initialization stage, inverted and stored for later use in updating the potential vector. For the sake of clarity, the steps for solving the semi-implicit scheme are summarized in **Algorithm 1**.
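To make the sequence of updates concrete, the sketch below applies the same steps to a deliberately simplified setting. Everything specific in it is an assumption made for illustration: a 1D domain with zero-flux ends, two-node linear elements enriched with a single non-conforming mode 1 − ξ², a uniform mesh, a dense inverse of **A**, and placeholder ionic-model constants that are not the values of Table 2. The function names `build_operators` and `step` are hypothetical.

```python
import numpy as np

def build_operators(n_el, h, D, dt):
    """Initialization stage: element matrices and condensed global matrix A."""
    xi, w = np.polynomial.legendre.leggauss(3)             # 3-point Gauss rule on [-1, 1]
    N = np.stack([(1 - xi) / 2, (1 + xi) / 2])              # nodal basis at quadrature points (2, nq)
    dN = np.stack([-np.ones_like(xi), np.ones_like(xi)]) / h  # dN/dx (2, nq)
    W = (1 - xi**2)[None, :]                                # non-conforming mode (1, nq)
    dW = (-4 * xi / h)[None, :]                             # dW/dx (1, nq)
    wq = w * h / 2                                          # quadrature weights on the element
    Ku = (N * wq) @ N.T / dt + D * (dN * wq) @ dN.T         # K_u^e
    L = (W * wq) @ N.T / dt + D * (dW * wq) @ dN.T          # L^e
    Ka = (W * wq) @ W.T / dt + D * (dW * wq) @ dW.T         # K_alpha^e
    Ka_inv = np.linalg.inv(Ka)
    Ae = Ku - L.T @ Ka_inv @ L                              # condensed element matrix, Eq. (34)
    A = np.zeros((n_el + 1, n_el + 1))
    for e in range(n_el):                                   # assembly
        A[e:e + 2, e:e + 2] += Ae
    return {"N": N, "W": W, "wq": wq, "L": L, "Ka_inv": Ka_inv,
            "A": A, "A_inv": np.linalg.inv(A)}

def step(op, u, alpha, r, f, g, dt):
    """One semi-implicit update from t_n to t_{n+1}."""
    N, W, wq, L, Ka_inv = op["N"], op["W"], op["wq"], op["L"], op["Ka_inv"]
    n_el = alpha.shape[0]
    ue = np.stack([u[:-1], u[1:]], axis=1)                  # element nodal values (n_el, 2)
    phi_q = ue @ N + alpha @ W                              # u_n^h + alpha_n^h at quadrature points
    src = phi_q / dt + f(phi_q, r)                          # integrand shared by p_u and p_alpha
    pu = (src * wq) @ N.T                                   # p_u^e (n_el, 2)
    pa = (src * wq) @ W.T                                   # p_alpha^e (n_el, 1)
    be = pa @ Ka_inv.T @ L - pu                             # b^e, Eq. (34)
    b = np.zeros_like(u)
    np.add.at(b, np.arange(n_el), be[:, 0])
    np.add.at(b, np.arange(n_el) + 1, be[:, 1])
    u_new = -op["A_inv"] @ b                                # Eq. (36)
    ue_new = np.stack([u_new[:-1], u_new[1:]], axis=1)
    alpha_new = (pa - ue_new @ L.T) @ Ka_inv.T              # Eq. (30)
    r_new = r + dt * g(phi_q, r)                            # Eq. (31)
    return u_new, alpha_new, r_new

# Placeholder Aliev-Panfilov-type kinetics (illustrative constants, not Table 2).
def f_ap(phi, r, c1=8.0, c2=8.0, a=0.15):
    return c1 * phi * (phi - a) * (1 - phi) - c2 * r * phi

def g_ap(phi, r, gamma=0.002, mu1=0.2, mu2=0.3, c2=8.0, b=0.15):
    return (gamma + mu1 * r / (mu2 + phi)) * (-r - c2 * phi * (phi - b - 1))

n_el, h, D, dt = 40, 0.5, 0.1, 0.01
op = build_operators(n_el, h, D, dt)
x = np.linspace(0.0, n_el * h, n_el + 1)
u = np.exp(-(x / 2.0) ** 2)                                 # smooth initial excitation at the left end
alpha, r = np.zeros((n_el, 1)), np.zeros((n_el, 3))
for _ in range(200):
    u, alpha, r = step(op, u, alpha, r, f_ap, g_ap, dt)
```

The explicit inverse mirrors the "inverted and stored" wording above; for large 3D meshes one would more plausibly store a factorization or a preconditioner instead.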

#### 2.4. The Q1NC Element

We materialize the non-conforming scheme defined in the previous section using incompatible-modes basis functions (Wilson et al., 1973; Taylor et al., 1976), which enhance Q1 elements. We recall that the isoparametric basis functions for Q1 3D (solid) elements are

$$\begin{aligned} \hat{N}\_{1} &= \frac{1}{8}(1-\xi\_{1})(1-\xi\_{2})(1-\xi\_{3}), & \hat{N}\_{2} &= \frac{1}{8}(1+\xi\_{1})(1-\xi\_{2})(1-\xi\_{3}), \\ \hat{N}\_{3} &= \frac{1}{8}(1+\xi\_{1})(1+\xi\_{2})(1-\xi\_{3}), & \hat{N}\_{4} &= \frac{1}{8}(1-\xi\_{1})(1+\xi\_{2})(1-\xi\_{3}), \\ \hat{N}\_{5} &= \frac{1}{8}(1-\xi\_{1})(1-\xi\_{2})(1+\xi\_{3}), & \hat{N}\_{6} &= \frac{1}{8}(1+\xi\_{1})(1-\xi\_{2})(1+\xi\_{3}), \\ \hat{N}\_{7} &= \frac{1}{8}(1+\xi\_{1})(1+\xi\_{2})(1+\xi\_{3}), & \hat{N}\_{8} &= \frac{1}{8}(1-\xi\_{1})(1+\xi\_{2})(1+\xi\_{3}), \end{aligned}$$

where (ξ<sub>1</sub>, ξ<sub>2</sub>, ξ<sub>3</sub>) ∈ Ω̂ := [−1, 1]<sup>3</sup>, and

$$N\_a^{e} = \hat{N}\_a \circ \hat{\mathbf{x}}^{-1}$$

with

$$
\hat{\mathbf{x}} = \sum\_{a=1}^{8} \hat{N}\_a \mathbf{x}\_{a}^{e},
$$

where **x**<sup>e</sup><sub>a</sub> is the vector of nodal coordinates. Incompatible modes enhance the Q1(Ω<sub>e</sub>) element basis by adding basis functions {M<sup>e</sup><sub>c</sub>}<sub>c=1,2,3</sub>, with M<sup>e</sup><sub>c</sub> = M̂<sub>c</sub> ∘ **x̂**<sup>−1</sup>, where

$$
\hat{M}\_1 = 1 - (\xi\_1)^2, \quad \hat{M}\_2 = 1 - (\xi\_2)^2, \quad \hat{M}\_3 = 1 - (\xi\_3)^2
$$

#### **Algorithm 1:** Solution algorithm

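/\* initialization \*/

**u**<sub>0</sub> = **0**; **r**<sub>0</sub> = **r**<sub>init</sub>; **α**<sup>e</sup> = **0**; **A** = **0**

**for** e = 1 **to** N<sub>el</sub> **do**

- Compute **K**<sup>e</sup><sub>α</sub>, **K**<sup>e</sup><sub>u</sub> and **L**<sup>e</sup> (Equations (28) and (32)) and store
- Compute **A**<sup>e</sup> (Equation (34)) and assemble contribution to **A**

**end**

Compute **A**<sup>−1</sup> and store

/\* time integration loop \*/

**for** n = 0 **to** N<sub>steps</sub> **do**

- **for** e = 1 **to** N<sub>el</sub> **do**
  - Compute **b**<sup>e</sup>(**u**<sup>e</sup><sub>n</sub>, **α**<sup>e</sup><sub>n</sub>, **r**<sup>e</sup><sub>n</sub>) (Equation (34)) and assemble contribution to **b**<sub>n</sub>
- **end**
- Update **u**<sub>n+1</sub> = **u**<sup>∗</sup><sub>n+1</sub>(**u**<sub>n</sub>, {**α**<sup>e</sup><sub>n</sub>, **r**<sup>e</sup><sub>n</sub>}<sub>e=1,…,N<sub>el</sub></sub>) = −**A**<sup>−1</sup>**b**<sub>n</sub>
- **for** e = 1 **to** N<sub>el</sub> **do**
  - Update **α**<sup>e</sup><sub>n+1</sub> = **α**<sup>e,∗</sup><sub>n+1</sub>(**u**<sup>e</sup><sub>n+1</sub>; **u**<sup>e</sup><sub>n</sub>, **α**<sup>e</sup><sub>n</sub>, **r**<sup>e</sup><sub>n</sub>) (see Equation 30)
  - **for** q = 1 **to** N<sub>q</sub> **do**
    - Update **r**<sup>e</sup><sub>q,n+1</sub> = **r**<sup>e,∗</sup><sub>q,n+1</sub>(**u**<sup>e</sup><sub>n</sub>, **α**<sup>e</sup><sub>n</sub>, **r**<sup>e</sup><sub>n</sub>) (see Equation 31)
  - **end**
- **end**

**end**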

TABLE 1 | Element DOFs and quadrature rules employed in numerical integration of residuals and tangents.


DOFs, degrees of freedom; NC, incompatible mode (internal variable).

for (ξ<sub>1</sub>, ξ<sub>2</sub>, ξ<sub>3</sub>) ∈ Ω̂. **Table 1** details the number of DOFs used for the 3D elements considered in this work. Integrals have been approximated using Gaussian quadrature on the standard domain. **Table 1** reports the quadrature rules employed in the numerical integration of Q1, Q1NC and Q2 element implementations.
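The basis functions of this section can be evaluated directly on the standard domain. The short sketch below does so and checks three properties that follow from the expressions above: the trilinear functions sum to one, each equals one at its own node and zero at the others, and the three incompatible modes vanish at all eight nodes, so they do not alter nodal values. The function and array names are assumptions introduced for this illustration.

```python
import numpy as np

# Signs of (xi_1, xi_2, xi_3) in N_1..N_8, in the node ordering used above.
NODE_SIGNS = np.array([
    [-1, -1, -1], [1, -1, -1], [1, 1, -1], [-1, 1, -1],
    [-1, -1, 1], [1, -1, 1], [1, 1, 1], [-1, 1, 1],
])

def q1_basis(xi):
    """Trilinear basis functions N_1..N_8 at a point xi of [-1, 1]^3."""
    xi = np.asarray(xi, dtype=float)
    return np.prod(1 + NODE_SIGNS * xi, axis=1) / 8

def incompatible_modes(xi):
    """Incompatible modes M_1..M_3 = 1 - xi_c^2 at a point xi of [-1, 1]^3."""
    xi = np.asarray(xi, dtype=float)
    return 1 - xi**2

point = np.array([0.2, -0.5, 0.7])
print(q1_basis(point).sum())          # partition of unity
print(incompatible_modes(point))      # nonzero inside the element
print(incompatible_modes(NODE_SIGNS[0]))  # zero at a node
```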

### 2.5. The Modified Aliev-Panfilov Model for Transmembrane Ionic Current

All simulations considered the modified Aliev-Panfilov model, which accounts for physiological voltage upstroke slopes and conduction velocities (Aliev and Panfilov, 1996; Hurtado et al., 2016), whose expressions are described below for completeness:

$$f(\phi, r) = c\_1 \phi (\phi - \alpha)(1 - \phi) - c\_2 r \phi \tag{37}$$

$$g(\phi, r) = \left(\gamma + \frac{\mu\_1 r}{\mu\_2 + \phi}\right) \left(-r - c\_2 \phi(\phi - b - 1)\right) \tag{38}$$

where c<sub>1</sub>, c<sub>2</sub>, α, γ, µ<sub>1</sub>, µ<sub>2</sub> and b are constants, whose values are included in **Table 2**, and are the same as those employed by Hurtado and Rojas (2018). To account for a steady-state regime, initial values of the recovery variable were set to r = 0.1146.
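For a single cell without diffusion, the semi-implicit update reduces to an explicit forward-Euler step on both φ and r, which gives a quick way to inspect Equations (37) and (38). The sketch below is a minimal illustration under that assumption. The constants in `APParams` are placeholders chosen only to make the example run; they are not the values of Table 2, and the class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class APParams:
    # Placeholder constants for illustration only; NOT the values of Table 2.
    c1: float = 8.0
    c2: float = 8.0
    alpha: float = 0.15
    gamma: float = 0.002
    mu1: float = 0.2
    mu2: float = 0.3
    b: float = 0.15

def f_ion(phi, r, p):
    """Normalized ionic current, Equation (37)."""
    return p.c1 * phi * (phi - p.alpha) * (1.0 - phi) - p.c2 * r * phi

def g_rec(phi, r, p):
    """Recovery kinetics, Equation (38)."""
    return (p.gamma + p.mu1 * r / (p.mu2 + phi)) * (-r - p.c2 * phi * (phi - p.b - 1.0))

def integrate_cell(phi0, r0, dt, n_steps, p):
    """Forward-Euler integration of one cell; returns the list of (phi, r) states."""
    phi, r = phi0, r0
    trace = [(phi, r)]
    for _ in range(n_steps):
        # Both right-hand sides are evaluated at the old state before assignment.
        phi, r = phi + dt * f_ion(phi, r, p), r + dt * g_rec(phi, r, p)
        trace.append((phi, r))
    return trace

params = APParams()
trace = integrate_cell(phi0=0.3, r0=0.0, dt=0.01, n_steps=1000, p=params)
```

With these placeholder constants the state (φ, r) = (0, 0) is a fixed point of the update, since both f and g vanish there.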

### 3. RESULTS

Finite-element simulations using Q1, Q2, and Q1NC element formulations were implemented for the FI and SI time-integration schemes described in the previous section in an enhanced version of FEAP (Taylor, 2014).

### 3.1. Plane-Wave Tests on CV and CT

A 3D cardiac rod with a total length of 25 mm was discretized using regular hexahedral elements with a uniform element size, with the exception of elements adjacent to the boundary, where the size was at times smaller to fit the geometry. To study the effect of the element size, simulations were carried out with mesh sizes ranging from h = 2 mm to h = 0.0156 mm. A zero-flux boundary condition was assumed for all boundary surfaces, with the exception of the left end of the rod, which was stimulated with a normalized external current of 20 mV/ms, which corresponds to 28,000 µA/cm<sup>3</sup>, for 2 ms to elicit a plane traveling wave along the direction of the rod. A time-step size of 0.001 ms was set for all simulations, which is small when compared to standard cardiac simulations using the selected ionic model (Hurtado et al., 2016). Such a small time-step size was chosen to minimize the contribution of the temporal discretization error to the overall numerical error. To compute the CV, we tracked the voltage evolution at x<sub>1</sub> = 18 mm and x<sub>2</sub> = 22 mm and recorded the activation time, defined as the first time at which φ > 0.5 at a given point. Then, the CV was calculated as the difference between x<sub>2</sub> and x<sub>1</sub> divided by the difference in activation times. The results for the CV for different element sizes are shown in **Figure 1A**. All formulations converged to a CV = 36.9 cm/s as the mesh size approached h = 0.0156 mm. CV monotonically decreased as mesh size was decreased for Q1 and Q2 formulations. The computational effort of simulations in terms of CT is reported in **Figure 1B**. We observe that the computational demand of simulations monotonically increases as the mesh size decreases, independently of the element formulation. We do observe, however, that the FI time-integration scheme always results in higher CT than the SI scheme for all element formulations considered.

FIGURE 1 | CV tests for plane-waves propagating on a 3D bar for FI and SI schemes on different element formulations. (A) Convergence study of CV as a function of the mesh size h. (B) Computational effort in terms of CT as a function of h.
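The CV measurement described above amounts to two threshold crossings and one division. A minimal sketch follows, assuming that the normalized potential has been recorded at the two probe locations on a common time grid; the function names and the synthetic step-like traces are made up for illustration, and no sub-step interpolation of the crossing time is attempted.

```python
import numpy as np

def activation_time(t, phi, threshold=0.5):
    """First recorded time at which phi exceeds the threshold."""
    t = np.asarray(t, dtype=float)
    above = np.flatnonzero(np.asarray(phi) > threshold)
    if above.size == 0:
        raise ValueError("trace never exceeds the threshold")
    return t[above[0]]

def conduction_velocity(t, phi_1, phi_2, x_1=18.0, x_2=22.0, threshold=0.5):
    """CV in cm/s from traces at x_1 and x_2 (positions in mm, times in ms)."""
    dt_act = activation_time(t, phi_2, threshold) - activation_time(t, phi_1, threshold)
    return 100.0 * (x_2 - x_1) / dt_act   # 1 mm/ms = 100 cm/s

# Synthetic traces: the front reaches x_1 at 10 ms and x_2 at 20 ms.
t = np.arange(0.0, 50.0, 1.0)
phi_1 = (t >= 10.0).astype(float)
phi_2 = (t >= 20.0).astype(float)
cv = conduction_velocity(t, phi_1, phi_2)   # 40 cm/s for these synthetic traces
```

Because the crossing is detected on the sampling grid, the resolution of the activation time equals the sampling interval, which is one reason a very small time step helps in this kind of measurement.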

To facilitate the analysis of the accuracy-efficiency trade-off of the different schemes studied, **Figure 2** shows the CT vs. the error in CV for the Q1, Q2, and Q1NC formulations for both the implicit and semi-implicit time updates. Since we seek to minimize two objective functions, namely the CT and the CV error, the Pareto frontier, defined as the set of choices that are Pareto-efficient, is included in each subfigure. The subset of the Pareto-efficient cases that correspond to the Q1NC formulation are {1.2, 1.5}[mm] and {1.0, 1.2, 1.5, 2.0}[mm] for the FI and SI cases, respectively.
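For reference, a Pareto-efficient set for two objectives that are both minimized can be extracted with a direct pairwise comparison. The sketch below assumes each simulation is summarized as a (CT, CV error) pair; the function name and the numbers in the usage example are hypothetical and are not results from this study.

```python
def pareto_front(points):
    """Indices of points not dominated by any other point (both objectives minimized)."""
    front = []
    for i, (ct_i, err_i) in enumerate(points):
        dominated = any(
            ct_j <= ct_i and err_j <= err_i and (ct_j < ct_i or err_j < err_i)
            for j, (ct_j, err_j) in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front

# Hypothetical (CT, CV error) pairs, for illustration only.
cases = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (5.0, 2.0)]
efficient = pareto_front(cases)   # indices 0, 1 and 3 are Pareto-efficient
```

A case is excluded as soon as another case is at least as good in both objectives and strictly better in one, which is why the third and fifth hypothetical cases drop out.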

### 3.2. Benchmark Simulations on a Cardiac Cuboid

We studied the behavior of the SINCFES using as a second test case the benchmark study on a cardiac cuboid developed by Niederer et al. (2011b) and adapted to the Aliev-Panfilov model by Hurtado et al. (2016). To this end, we consider a cuboid domain with dimensions of 20 × 7 × 3 mm with cardiac fibers oriented along the longest axis. A subdomain with dimensions 1.5 × 1.5 × 1.5 mm located at one of the corners of the cuboid was stimulated with an electrical current density of 50,000/cm<sup>3</sup> for 2 ms. The normalized longitudinal and transversal conductivities were 0.0952 and 0.0126 mm<sup>2</sup>/ms, respectively. **Figure 3A** shows the activation map and isochrones obtained on a plane that contains opposite corners in the diagonal, as defined in Niederer et al. (2011b), for a fine (Baseline) and coarse discretization using Q1 elements, and for the same coarse discretization using Q1NC elements. We note that the Q1NC case with mesh size h = 0.8 mm resulted in an activation map and isochrones similar to the baseline case, defined as a Q1 model with mesh size h = 0.1 mm. In contrast, the activation map delivered by the Q1 coarse-mesh case with mesh size h = 0.8 mm differed markedly from the baseline case, delivering a less curved wave-front profile. **Figure 3B** displays the activation time values along the diagonal of the cuboid for the three cases under study. We observe that the Q1NC case closely follows the baseline case, whereas the Q1 coarse-mesh case resulted in shorter activation times at all locations along the diagonal. As a reference, the CTs for the Baseline (Q1 fine), Q1NC and Q1 cases were 122,341, 344, and 184 s, respectively, which is equivalent to a CT ratio of 665 : 2 : 1.

### 3.3. Biventricular Human Heart Simulations

To study the potential of the Q1NC-SI formulation in whole-heart cardiac simulations, we modeled the propagation of an action potential on an idealized human biventricular domain stimulated at the atrio-ventricular node. The biventricular geometry was generated from two truncated ellipsoids (Streeter and Hanna, 1973), and later discretized using non-regular hexahedral elements. For the baseline case, a size-varying mesh with an average characteristic length of 0.48 mm was employed. A coarse mesh with an average element length of 1.0 mm was also considered for two additional cases with Q1 and Q1NC element formulations; see the left column of **Figure 4** for a representation of the biventricular meshes. All three cases considered the same initial and boundary conditions and a time-step size of 0.001 ms. The transmembrane potential distribution at different time instants during ventricular activation is depicted in **Figure 4**. We clearly observe that, as time elapses, the action-potential wave front of the Q1NC case is very similar to that of the Baseline case, whereas the Q1 case results in a wave front that propagates faster than in the other two models due to the artificially high CV. The last column in **Figure 4** shows the activation maps, where we observe that the isochrones for the Baseline and Q1NC cases are very similar, and they both differ from those of the Q1 case. Biventricular simulations were run on an HPC cluster with 128 GB of RAM using 32 processors and the parallel implementation of the code FEAP (Taylor, 2014). The CTs for the baseline, the Q1NC and the Q1 simulations were 1805, 452 and 154 min, respectively, which is equivalent to a CT ratio of 18 : 3 : 1.

### 3.4. Spiral Wave Simulations

To assess the performance of the proposed non-conforming scheme in the simulation of spiral waves, we considered a 50 × 50 mm cardiac domain excited by means of an S1–S2 stimulation protocol. To this end, we first applied a surface stimulus (S1) of 12 mV/(ms mm<sup>2</sup>) for 2 ms on the border defined by x = 0 to create a plane wave. After 280 ms, we applied a second stimulus (S2) of 15 mV/(ms mm<sup>3</sup>) in the quadrant x < 25, y < 25 mm for 5 ms, which resulted in the formation of a spiral wave (Costabal et al., 2017). This S1–S2 model was solved using three numerical models: a fine mesh with element size h = 0.1 mm using Q1 elements (Baseline), a coarse mesh with element size h = 1 mm using Q1 elements (Q1), and a coarse mesh with h = 1 mm using the proposed non-conforming element formulation (Q1NC). In all cases, we considered a semi-implicit time update with time-step size Δt = 0.005 ms. **Figure 5** shows the distribution of the transmembrane potential of the three models under study for several time instants. We note that at early times (t = 110 ms) the Q1 case displays a wave front that has advanced considerably faster than in the baseline and Q1NC cases. At t = 400 ms a spiral wave had formed in the Baseline and Q1NC cases, whereas for the Q1 case a curved wave front propagated in the outward direction but did not create a spiral. At a later instant (t = 600 ms), a spiral was steadily rotating in the Baseline and Q1NC cases, constantly re-exciting tissue, whereas in the Q1 case the cardiac tissue was at complete rest, and no electrical activity was observed.
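The S1–S2 protocol described above can be encoded as two simple stimulus functions, sketched below in Python. This is only an illustration under stated assumptions: the function and constant names are hypothetical, time is assumed to be measured from the onset of S1, and the S2 onset is assumed to be t = 280 ms. The amplitudes, durations, and regions are those quoted in the text.

```python
S1_AMPLITUDE = 12.0  # mV/(ms mm^2), surface stimulus on the border x = 0
S1_DURATION = 2.0    # ms
S2_AMPLITUDE = 15.0  # mV/(ms mm^3), volumetric stimulus
S2_ONSET = 280.0     # ms, assumed to be measured from the onset of S1
S2_DURATION = 5.0    # ms


def s1_surface_stimulus(x, t, tol=1e-9):
    """Surface stimulus on the border x = 0 (mm) during the first 2 ms."""
    on_border = abs(x) <= tol
    active = 0.0 <= t < S1_DURATION
    return S1_AMPLITUDE if (on_border and active) else 0.0


def s2_volume_stimulus(x, y, t):
    """Volumetric stimulus in the quadrant x < 25 mm, y < 25 mm for 5 ms."""
    in_quadrant = x < 25.0 and y < 25.0
    active = S2_ONSET <= t < S2_ONSET + S2_DURATION
    return S2_AMPLITUDE if (in_quadrant and active) else 0.0
```

Keeping the two stimuli as separate functions reflects that S1 enters as a boundary flux while S2 enters as a volumetric source term, so they would be applied in different parts of a finite-element assembly.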

### 4. DISCUSSION

In this work we have studied the features and advantages of a novel SINCFES in the solution of the monodomain model of cardiac electrophysiology. From plane-wave CV tests we note that the FI and SI schemes yield similar results for the conduction velocity for the time-step size employed, see **Figure 1A**. This is expected, as the time-step size considered here is small compared to standard values employed in numerical simulations (Krishnamoorthi et al., 2013). Interestingly, we observe that for mesh sizes h < 0.6 mm, the Q1, Q2, and Q1NC element formulations delivered very similar results in terms of CV error. For h > 0.6 mm, the CV error incurred by the Q1 formulation grows at a much faster rate than that of the Q2 and Q1NC formulations. An interesting result that deserves further study is the convergence trend of the Q1NC formulation, as it is not monotonically convergent over the whole range of mesh sizes studied, and the sign of the CV error reverses in a bounded interval of mesh sizes. A similar convergence trend has been reported in the literature for standard FE discretizations, in the context of mass-lumping techniques (Pezzuto et al., 2016), which suggests as future work a more detailed study of the effect of NC spatial discretization schemes on the stiffness and mass matrices that govern the dynamics of the problem. To better analyze the accuracy-efficiency trade-off for each scheme, we constructed CT vs. CV-error plots, in which the Pareto frontier has been identified. We conclude that the SINCFES delivers Pareto-optimal results for mesh sizes in the set {1.0, 1.2, 1.5, 2.0}[mm]. For smaller mesh sizes, traditional Q1 formulations deliver better combinations of CT and CV error than Q1NC and Q2. It is interesting to note that, in general, Q2 elements are less efficient than the Q1 and Q1NC elements from a Pareto-optimality viewpoint over the whole range of mesh sizes studied.
We also note that these conclusions are particular to a plane-wave propagation case, where anisotropy of conductivity and curvature of propagating wavefronts are absent.

We further studied the performance of Q1NC elements using a benchmark problem on a cuboid cardiac domain (Niederer et al., 2011b). Our simulations showed that the Q1NC formulation on a coarse mesh (h = 0.8 mm) can result in activation maps that are similar to those obtained on fine meshes using Q1 (h = 0.1 mm), adequately capturing the anisotropic conduction of the propagating waves, see **Figure 3A**. An analysis of the activation-time profile along the cuboid diagonal shows that the Q1NC scheme delivers an accurate conduction velocity, comparable to that of Q1 meshes with mesh sizes that are 8 times smaller, see **Figure 3B**. This result confirms the ability of Q1NC elements to capture the propagation of electrical waves in anisotropic media with good accuracy at significantly reduced CTs. In contrast, Q1 coarse-mesh simulations resulted in markedly higher conduction velocity profiles, and performed poorly in capturing the anisotropic propagation of wavefronts when compared to the Q1NC formulation.

Numerical simulations on a biventricular domain showed that our non-conforming scheme can be effectively used in unstructured meshes of idealized anatomical geometries of the heart, see **Figure 4**. Similarly to the cardiac rod case, a coarse mesh using Q1NC elements performs much better than a simulation using standard Q1 elements at the same discretization level, as it predicts the wavefront propagation pattern more accurately when compared to the baseline case. This conclusion is also reached from observing the resulting activation maps, where the spatial distribution and curved shape of the isochrones in the Q1NC and baseline cases are similar, whereas the Q1 formulation delivers an isochrone distribution with lower activation-time values. We note here that this study considered an idealized and smooth geometrical representation of the ventricles of the human heart, useful for numerical verification purposes. It is important to note that such an idealized domain does not include important anatomical structures such as the intricate endocardial surface, papillary muscles, and Purkinje network, which are currently included in advanced heart models (Ponnaluri et al., 2016; Sahli Costabal et al., 2016). Future work should focus on understanding how non-conforming formulations can handle such fine-scale anatomical details and structures.

The performance of the SINCFES was studied in the simulation of spiral waves. Remarkably, a very coarse mesh using Q1NC elements is capable of correctly producing, and sustaining in time, a spiral wave, whereas a standard Q1 formulation using the same mesh size results in no activation of cardiac tissue. The ability of the SINCFES to reproduce spiral wave formation and dynamics is a key result of this work, as it shows that the method is physically more accurate than standard FE formulations for coarse discretizations. This result can be explained by the reduced dependence of the CV on the mesh size, and highlights the potential of the SINCFES in the simulation of cardiac arrhythmias, the main clinical focus of cardiac electrophysiology simulations. While spiral patterns and dynamics obtained with the Q1NC formulation are very similar to the baseline results, a time delay is observed for the former, which resulted in differences in the spatial distribution of the transmembrane voltage, see the last column of **Figure 5**. Such a delay, which can ultimately be attributed to differences in the local CV, has also been observed in studies employing very high-order space-time formulations (Coudière and Turpault, 2017), confirming that state-of-the-art simulations of spirals using standard values of mesh size and time step are also affected by this time delay. Despite this persistent numerical error, we believe that the focus of future studies should be on recovering the overall dynamical features of spirals, i.e., spiral tip trajectories (Fenton and Karma, 1998; Gizzi et al., 2013).

We close by noting that while whole-heart simulations reported in the literature predominantly employ tetrahedral discretizations, effective methods for generating patient-specific hexahedral meshes are currently available (Lamata et al., 2011). Further, hexahedral meshes have gained great attention in the context of cardiac simulations, as the numerical performance of hexahedral elements is superior to that of tetrahedral elements when solving the mechanics of the heart, particularly under the assumption of incompressible and quasi-incompressible regimes (Hadjicharalambous et al., 2014). A natural continuation of this work is therefore the application of non-conforming schemes to the solution of electromechanical models of the heart (Nash and Panfilov, 2004). One important reason for mesh-coarsening FE models of the heart is to reduce the number of DOFs, which in the case of electromechanical cardiac models is much larger than in pure electrophysiological simulations, as displacement, fiber stretch/stress variables, and the nonlinearity of tissue constitutive models drastically increase the dimensionality and computational effort needed to solve the governing equations (Göktepe and Kuhl, 2010; Hurtado et al., 2017).

### AUTHOR CONTRIBUTIONS

JJ and DH developed the theoretical framework, numerical schemes and computer algorithms. DH designed the numerical experiments. JJ coded the implementation and ran simulations. JJ and DH wrote the manuscript draft. DH reviewed the final version of the manuscript.

### ACKNOWLEDGMENTS

This research was financially supported by the Chilean Fondo Nacional de Desarrollo Científico y Tecnológico (FONDECYT) through grant #1180832. The support of the UCT Three-Way PhD Global Partnership Programme funded by the Dr. Leopold und Carmen Ellinger Stiftung is also appreciated. Authors would like to thank the reviewers for the constructive feedback and comments.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Jilberto and Hurtado. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Muscle Thickness and Curvature Influence Atrial Conduction Velocities

Simone Rossi<sup>1</sup>\*, Stephen Gaeta<sup>2</sup>, Boyce E. Griffith<sup>1,3,4</sup> and Craig S. Henriquez<sup>5</sup>

*<sup>1</sup> Cardiovascular Modeling and Simulation Laboratory, Carolina Center for Interdisciplinary Applied Mathematics, University of North Carolina, Chapel Hill, NC, United States, <sup>2</sup> Clinical Cardiac Electrophysiology/Cardiology Division, Duke University Medical Center, Durham, NC, United States, <sup>3</sup> Departments of Mathematics, Applied Physical Sciences, and Biomedical Engineering, University of North Carolina, Chapel Hill, NC, United States, <sup>4</sup> McAllister Heart Institute, University of North Carolina, Chapel Hill, NC, United States, <sup>5</sup> Department of Biomedical Engineering, Pratt School of Engineering, Duke University, Durham, NC, United States*

Electroanatomical mapping is currently used to provide clinicians with information about the electrophysiological state of the heart and to guide interventions like ablation. These maps can be used to identify ectopic triggers of an arrhythmia such as atrial fibrillation (AF) or changes in the conduction velocity (CV) that have been associated with poor cell to cell coupling or fibrosis. Unfortunately, many factors are known to affect CV, including membrane excitability, pacing rate, wavefront curvature, and bath loading, making interpretation challenging. In this work, we show how endocardial conduction velocities are also affected by the geometrical factors of muscle thickness and wall curvature. Using an idealized three-dimensional strand, we show that transverse conductivities and boundary conditions can slow down or speed up signal propagation, depending on the curvature of the muscle tissue. In fact, a planar wavefront that is parallel to a straight line normal to the mid-surface does not remain normal to the mid-surface in a curved domain. We further demonstrate that the conclusions drawn from the idealized test case can be used to explain spatial changes in conduction velocities in a patient-specific reconstruction of the left atrial posterior wall. The simulations suggest that the widespread assumption of treating atrial muscle as a two-dimensional manifold for electrophysiological simulations will not accurately represent the endocardial conduction velocities in regions of the heart thicker than 0.5 mm with significant wall curvature.

Keywords: cardiac electrophysiology, bidomain model, conduction velocity, bath-loading conditions, left atrial posterior wall, electroanatomical mapping, atrial fibrillation

### 1. INTRODUCTION

Atrial fibrillation (AFib) is the most common cardiac arrhythmia, and symptoms can range from being nonexistent to severe, possibly leading to stroke, heart failure, sudden death, and cardiovascular morbidity (January et al., 2014; Kirchhof et al., 2016). Electroanatomic mapping, which involves acquiring extracellular signals (electrograms) at multiple locations using catheter-based electrodes, is often used in clinical procedures to identify triggers of AFib and to characterize the electrophysiological health of the tissue. One outcome of this mapping is a display of the pattern of the spread of electrical activation, obtained by identifying the local activation time from the electrograms. These activation maps can be used to estimate the conduction velocity and help to localize regions of slow conduction associated with cellular decoupling and fibrosis

#### Edited by:

*Hernan Edgardo Grecco, Universidad de Buenos Aires, Argentina*

#### Reviewed by:

*Martin Bishop, King's College London, United Kingdom Sanjay Ram Kharche, University of Western Ontario, Canada*

> \*Correspondence: *Simone Rossi simone.rossi@unc.edu*

#### Specialty section:

*This article was submitted to Computational Physiology and Medicine, a section of the journal Frontiers in Physiology*

Received: *05 June 2018* Accepted: *06 September 2018* Published: *29 October 2018*

#### Citation:

*Rossi S, Gaeta S, Griffith BE and Henriquez CS (2018) Muscle Thickness and Curvature Influence Atrial Conduction Velocities. Front. Physiol. 9:1344. doi: 10.3389/fphys.2018.01344*

(King et al., 2013; Grossi et al., 2016). Several approaches can be used to evaluate CVs from the measured electrophysiological data, such as polynomial surface-fitting algorithms, finite-difference techniques, and triangulation, among many others (Cantwell et al., 2015). Because of the paucity of data that can be acquired at high resolution in a clinical procedure, accurate CV estimates are difficult to obtain, particularly in regions of the heart with significant curvature. In addition, it is well known that conduction velocity is very sensitive to membrane excitability, tissue conductivity, fiber orientation, wavefront shape, and even the properties of the adjoining blood, making interpretation of CV measurements challenging at best.

To better understand the various factors affecting both normal and abnormal conduction, computer models of the atria have been developed (Harrild and Henriquez, 2000; Seemann et al., 2006; Muñoz et al., 2011; McDowell et al., 2012; Rossi and Griffith, 2017). Because of the high computational cost required by these simulations, the atria are sometimes simplified as a single two-dimensional manifold (Harrild and Henriquez, 2000; Seemann et al., 2006; Muñoz et al., 2011; McDowell et al., 2012; Rossi and Griffith, 2017). However, this surface representation of the atria cannot be used to describe the endo-epicardial electrical dissociation taking place during AFib (Gharaviri et al., 2012). To overcome this limitation, bilayer models have been proposed (Gharaviri et al., 2012; Labarthe et al., 2014; Coudière et al., 2017). Although these models have a reduced computational cost with respect to fully three-dimensional simulations, they fail to capture the loading effects of the muscle thickness and adjoining blood layer. The complete effects of the geometric factors on conduction can only be determined in a volumetric model of the atria. The goal of this work is to investigate how wall thickness and curvature affect conduction velocity and whether these geometric factors need to be considered in modeling relatively thin tissue such as the atria. In addition, we investigate how the thickness of the adjoining blood layer affects the CV and the resulting extracellular signals. Simulations are performed on idealized geometries and on a patient-specific geometry of the posterior wall of the atria. The results show that variations of more than 10% in CV can arise from the atrial geometry even without considering changes in the transmural properties.

The blood is the natural volume conductor that bathes the cardiac wall (Trayanova, 1997). Since endocardial bipolar signals measured by electroanatomical mapping systems are influenced by the presence of blood, our computational model is augmented with a perfusing endocardial bath (Henriquez et al., 1996). The role of muscle thickness on the CV in the presence of a bath has been studied only on a thick strand of muscle without curvature (Roth, 1991). Although the role of the perfusing bath has been extensively studied (Roth, 1991, 1996; Henriquez et al., 1996; Trayanova, 1997; Srinivasan and Roth, 2004; Vigmond et al., 2009; Bishop et al., 2011; Colli-Franzone et al., 2011), and methods have been proposed to reduce the computational demands of these simulations (Bishop and Plank, 2011), the minimum depth of the bath that adequately accounts for the bath-loading conditions on CV is not currently known. For this reason, we investigate the role of the bath thickness on the CVs and bipolar signals. Our results show that a bath thickness of at least 1.5 mm is needed to capture endocardial CVs with good accuracy. The same thickness of the intracardiac bath layer also guarantees a satisfactory representation of the endocardial bipolar signals.

### 2. THE BIDOMAIN MODEL

Most common tissue-scale models of cardiac electrophysiology consider the myocardium to be composed of continuous intracellular and extracellular compartments, coupled via a continuous cellular membrane. This study considers such a model, namely the bidomain model of action potential propagation in cardiac muscle. Formulated by Tung (1978), the bidomain model is a continuum model derived from a homogenized description of excitation propagation in the cardiac microarchitecture (Neu and Krassowska, 1993; Keener and Sneyd, 1998; Griffith and Peskin, 2013).

The bidomain equations describe the dynamics of a local average of the voltages in the intracellular and extracellular compartments over a control volume. One of the assumptions required by the homogenization procedure is that the control volume is large compared to the scale of the cellular microarchitecture, but small compared to any other important spatial scale of the system, such as the width of the action potential wavefront. Although the validity of this model has been questioned, for example by Bueno-Orovio et al. (2014), this approach has been extremely successful, and at present, most organ-scale simulations of cardiac electrophysiology use such a model. For a detailed review of the bidomain model and other models of cardiac electrophysiology, we refer to other references (Griffith and Peskin, 2013; Franzone et al., 2014).

In our model, we also consider a conductive blood cavity adjacent to the tissue. In the bidomain model, current flow is restricted to the intracellular (denoted by subscript i), extracellular (denoted by subscript e), and bath (denoted by subscript b) compartments and is described by a set of coupled partial differential equations. Referring to **Figure 1A**, Ω<sub>m</sub> denotes the muscle region and Ω<sub>b</sub> denotes the perfusing bath. From charge conservation, the bidomain equations can be written in the muscle region Ω<sub>m</sub> as

$$\nabla \cdot (\mathbf{D}\_{\mathbf{i}} \nabla V) + \nabla \cdot (\mathbf{D}\_{\mathbf{i}} \nabla V\_{\mathbf{e}}) = \chi \left(C\_{\mathbf{m}} \partial\_t V + I\_{\text{ion}}\right) - I\_{\mathbf{i}}^{\mathbf{v}}, \tag{1}$$

$$\nabla \cdot (\mathbf{D}\_{\mathbf{i}} \nabla V) + \nabla \cdot ((\mathbf{D}\_{\mathbf{i}} + \mathbf{D}\_{\mathbf{e}}) \nabla V\_{\mathbf{e}}) = I\_{\text{total}}^{\text{v}},\tag{2}$$

in which V<sub>i</sub> and V<sub>e</sub> are the potentials of the homogenized intracellular and extracellular compartments, respectively, such that V = V<sub>i</sub> − V<sub>e</sub> is the transmembrane potential difference, **D**<sub>i</sub> and **D**<sub>e</sub> are the intracellular and extracellular conductivity tensors, C<sub>m</sub> is the membrane capacitance, χ is the membrane area per unit volume of tissue, and I<sup>v</sup><sub>i</sub> and I<sup>v</sup><sub>e</sub> are the volumetric intracellular and extracellular applied currents, such that I<sup>v</sup><sub>total</sub> = I<sup>v</sup><sub>i</sub> + I<sup>v</sup><sub>e</sub>. The dynamics of the transmembrane current I<sub>ion</sub>, accounting for charged ionic species moving from the intracellular compartment to the extracellular compartment

FIGURE 1 | (A) Schematic representation of the configurations of the muscle and blood bath. Inside the heart, blood acts as a low resistance conductor. Outside the heart, between the epicardium and the epicardial sac, an interstitial fluid can also act as a conductor. (B) Schematic representation of the idealized left atrial posterior wall. A rectangular strand of muscle (green) of length *L* = 2.3 cm and thickness ℓ<sub>m</sub> is adjacent to an endocardial bath (blue) of thickness ℓ<sub>b</sub>. The thicknesses of the muscle ℓ<sub>m</sub> and of the bath ℓ<sub>b</sub> are varied to study their influence on endocardial CVs. Curvature is applied to the top part of the rectangle such that the curved endocardial length *L*<sub>e</sub> is fixed at 2 cm. The corresponding curvature κ is defined as the inverse of the endocardial radius. The curvature is positive if the muscle is bent to the left and negative if it is bent to the right. Endocardial CVs are measured using the activation times at **x**<sub>1</sub> (yellow circle) and **x**<sub>2</sub> (red circle). In the straight geometry, **x**<sub>1</sub> and **x**<sub>2</sub> correspond to the points **X**<sub>1</sub> = (0 cm, 1 cm) and **X**<sub>2</sub> = (0 cm, 1.5 cm), a fixed distance of 5 mm apart. As described by equation (7), endocardial CVs are defined as the distance between these two points divided by the difference of the respective activation times. Unipolar extracellular signals *V*<sup>1</sup><sub>e</sub> and *V*<sup>2</sup><sub>e</sub> are recorded at 1 kHz at **p**<sub>1</sub> and **p**<sub>2</sub>, corresponding to the points **P**<sub>1</sub> = (0 cm, 1.75 cm) and **P**<sub>2</sub> = (0 cm, 1.95 cm) in the straight geometry. Bipolar signals were computed as the difference *V*<sup>2</sup><sub>e</sub> − *V*<sup>1</sup><sub>e</sub>.

and vice versa, are described by a set of ordinary differential equations, called the ionic model. More precisely, the ionic model introduces the additional variables **w**, satisfying ∂<sub>t</sub>**w** = **g**(V, **w**), such that I<sub>ion</sub> = I<sub>ion</sub>(V, **w**).

In the blood region, the bath potential V<sub>b</sub> satisfies Poisson's equation

$$\nabla \cdot (\mathbf{D}\_{\mathbf{b}} \nabla V\_{\mathbf{b}}) = I\_{\mathbf{b}},\tag{3}$$

in which **D**<sub>b</sub> represents the blood conductivity tensor and I<sub>b</sub> is a volumetric applied current in the blood domain.

The anisotropic nature of the muscle is accounted for in the bidomain model through the conductivity tensors **D**<sub>i</sub> and **D**<sub>e</sub>. Denoting with **f** the local direction of the fiber field, we assume axial symmetry relative to **f**, such that the conductivity tensors can be written as **D**<sub>i</sub> = σ<sup>t</sup><sub>i</sub> **I** + (σ<sup>f</sup><sub>i</sub> − σ<sup>t</sup><sub>i</sub>) **f** ⊗ **f**, and **D**<sub>e</sub> = σ<sup>t</sup><sub>e</sub> **I** + (σ<sup>f</sup><sub>e</sub> − σ<sup>t</sup><sub>e</sub>) **f** ⊗ **f**. Here, σ<sup>f</sup><sub>i</sub> and σ<sup>t</sup><sub>i</sub> denote the tissue conductivities along and across the fiber direction in the intracellular space, σ<sup>f</sup><sub>e</sub> and σ<sup>t</sup><sub>e</sub> denote the corresponding extracellular conductivities, and **I** is the identity tensor. The blood conductivity is assumed isotropic, so that **D**<sub>b</sub> = σ<sub>b</sub>**I**. Representative values of the intracellular, extracellular, and blood conductivities are taken from the work of Roth (1996). Specifically, we set σ<sup>f</sup><sub>e</sub> = σ<sup>f</sup><sub>i</sub> = 4.5 mS/cm, σ<sup>t</sup><sub>e</sub> = 1.8 mS/cm, σ<sup>t</sup><sub>i</sub> = 0.45 mS/cm, and σ<sub>b</sub> = 20 mS/cm.
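The axially symmetric form of the conductivity tensors can be checked numerically with a short Python sketch. The helper names and the chosen fiber direction are illustrative assumptions; the conductivity values are the intracellular ones quoted above.

```python
def conductivity_tensor(sigma_f, sigma_t, fiber):
    """Return D = sigma_t * I + (sigma_f - sigma_t) * f (x) f.

    The fiber vector is normalized internally, so any nonzero vector works.
    """
    norm = sum(c * c for c in fiber) ** 0.5
    f = [c / norm for c in fiber]
    n = len(f)
    return [
        [sigma_t * (1.0 if i == j else 0.0) + (sigma_f - sigma_t) * f[i] * f[j]
         for j in range(n)]
        for i in range(n)
    ]


def apply_tensor(D, v):
    """Matrix-vector product D v for a tensor stored as nested lists."""
    return [sum(D[i][j] * v[j] for j in range(len(v))) for i in range(len(D))]


# Intracellular conductivities from the text (mS/cm); fiber along the second axis
# is an illustrative choice.
D_i = conductivity_tensor(4.5, 0.45, [0.0, 1.0, 0.0])
along = apply_tensor(D_i, [0.0, 1.0, 0.0])   # scaled by the fiber conductivity
across = apply_tensor(D_i, [1.0, 0.0, 0.0])  # scaled by the transverse conductivity
```

By construction, the fiber direction is an eigenvector with eigenvalue equal to the along-fiber conductivity, and every direction orthogonal to it has the transverse conductivity as eigenvalue.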

The boundary conditions for the tissue are those derived for a spatially periodic cellular syncytium by Krassowska and Neu (1994). Referring to **Figure 1A**, at the muscle boundaries Γ<sub>m</sub>, current fluxes in the intracellular and extracellular compartments depend on externally applied surface currents,

$$\mathbf{n} \cdot (\mathbf{D}\_{\mathbf{i}} \nabla V\_{\mathbf{i}}) = \mathbf{n} \cdot (\mathbf{D}\_{\mathbf{i}} \nabla \left(V + V\_{\mathbf{e}}\right)) = I\_{\mathbf{i}}^{\mathrm{s}},\tag{4}$$

$$\mathbf{n} \cdot (\mathbf{D}\_{\mathbf{e}} \nabla V\_{\mathbf{e}}) = I\_{\mathbf{e}}^{\mathbf{s}},\tag{5}$$

in which the vector **n** is the outward unit normal to the boundary of the tissue domain. In the following, we shall assume that the intracellular and extracellular surface currents, I<sup>s</sup><sub>i</sub> and I<sup>s</sup><sub>e</sub>, are zero. At the bath boundary Γ<sub>b</sub>, we enforce no current flux in the blood domain, **n** · (**D**<sub>b</sub>∇V<sub>b</sub>) = 0. Along the muscle-blood interface Γ<sub>i</sub>, we require continuity of the extracellular and bath potentials, V<sub>e</sub> = V<sub>b</sub>, continuity of the normal currents, **n** · (**D**<sub>e</sub>∇V<sub>e</sub>) = **n** · (**D**<sub>b</sub>∇V<sub>b</sub>), and zero intracellular current density, **n** · (**D**<sub>i</sub>∇V<sub>i</sub>) = 0. We refer to the **Supplementary Material** for more details on the model and its numerical discretization.

A modified version of the Courtemanche et al. (1998) ionic model available on ModelDB (Carnevale and Hines, 2006; McDougal et al., 2017), defining the ionic current I<sub>ion</sub> and gating variable dynamics **g**(V, **w**), is chosen to represent the human atrial action potential.

### 2.1. Idealized Model of the Atrial Left Posterior Wall

To study the effects of muscle thickness, muscle curvature, and bath-loading conditions on the measured conduction velocities, we devised a simple idealized test case. Although our main interest is to understand measurements of endocardial CVs in the posterior wall of the left atrium, which has overall positive curvature, our study is not limited to that application. For this reason, we also consider negative curvatures. Although those cases are not representative of the left atrial posterior wall, the relationships between negative curvature and CV can be easily explored in our idealized model.

A strand of muscle is connected to a bath as depicted in **Figure 1B**. This simplified model represents a piece of atrial tissue where the endocardial surface Γ<sub>i</sub> separates the blood from the muscle. We consider a straight strand of tissue of length 2.3 cm. To study the influence of curvature, the strand of tissue is then bent to either side, keeping the length of the endocardial interface fixed. The epicardial boundary is instead allowed to change in length. For this reason, curved domains with opposite curvature are not symmetric.

Consider **Figure 1B**. The initial muscle domain is a rectangle of length L = 2.3 cm and thickness ℓ<sub>m</sub>, such that Ω<sub>m</sub> = [0 cm, ℓ<sub>m</sub>] × [−0.3 cm, 2 cm]. Denote by L<sub>e</sub> = 2 cm the length of the portion of the rectangle, and hence of the endocardial surface, on which curvature will be imposed. We set the "endocardial" surface Γ<sub>i</sub> = {0 cm} × [−0.3 cm, L<sub>e</sub>] on the left edge of the muscle. A bath of thickness ℓ<sub>b</sub> is added adjacent to the interface Γ<sub>i</sub>, such that Ω<sub>b</sub> = [−ℓ<sub>b</sub>, 0 cm] × [−0.3 cm, L<sub>e</sub>]. These geometrical settings represent the straight domain with zero curvature. We denote with (X, Y) the coordinates of a point in this straight domain. Given an angle θ ∈ (0, π], we bend the domain, keeping the length L<sub>e</sub> of the top part of the endocardial interface Γ<sub>i</sub> fixed at 2 cm. The transformation from the straight rectangle to the curved one is performed using the relations

$$\begin{aligned} x &= \begin{cases} X & \text{if } Y \le 0, \\ c \left( R_0 - R \cos \left( \theta \frac{Y}{Y_{\text{max}}} \right) \right) & \text{if } Y > 0, \end{cases} \\ y &= \begin{cases} Y & \text{if } Y \le 0, \\ cR \sin \left( \theta \frac{Y}{Y_{\text{max}}} \right) & \text{if } Y > 0, \end{cases} \end{aligned} \tag{6}$$

in which the parameter c specifies the direction of the bending: c = −1 means bending to the right, whereas c = 1 means bending to the left. The radius of curvature of Γ<sub>i</sub> is given by R<sub>0</sub> = Y<sub>max</sub>/θ, such that R = R<sub>0</sub> + cX, in which Y<sub>max</sub> is the maximum Y coordinate. We define the curvature of the interface Γ<sub>i</sub> by κ = 1/(cR<sub>0</sub>), such that the curvature is negative if c is negative (bending to the right) and positive if c is positive (bending to the left). Given this construction, geometries with opposite curvature are not symmetric even though the length and the magnitude of the curvature of the endocardial interface are the same. For any curvature, the region defined by the points (x, y) with y ≤ 0 is the same. Applying the same initial conditions and the same initial stimulus in this region, we can compare the effects of curvature on the conduction velocity.
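As a quick check of these definitions, the curvature can be written directly in terms of the bending angle. The worked relation below assumes that Y<sub>max</sub> coincides with the 2 cm portion of the interface over which the bending is applied; this identification is an assumption made here for illustration and is not stated explicitly in the text.

$$\kappa = \frac{1}{c R_0} = \frac{c\,\theta}{Y_{\text{max}}}, \quad c = \pm 1, \qquad |\kappa| = \frac{\pi}{2\ \text{cm}} \approx 1.57\ \text{cm}^{-1} \quad \text{for } \theta = \pi,\ Y_{\text{max}} = 2\ \text{cm}.$$

Under this assumption, the largest admissible angle θ = π gives the curvature magnitude π/2 cm<sup>−1</sup>, which is the extreme value used in the figures, and the arc length of the bent interface, R<sub>0</sub>θ = Y<sub>max</sub>, is independent of θ.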

Before applying any curvature, the domain is discretized using a structured triangular mesh with mesh size h<sub>Y</sub> = 50 µm in the longitudinal fiber direction and h<sub>X</sub> = 25 µm in the transversal fiber direction. Denoting with v<sub>X</sub> the conduction velocity in the longitudinal fiber direction, we used the CFL condition v<sub>X</sub>Δt/h<sub>X</sub> ≤ 1 to determine the timestep Δt (Rossi and Griffith, 2017). We chose the largest negative power of 2 such that the CFL condition was satisfied, which led to the timestep Δt = 0.03125 ms. This choice is also sufficient to ensure the stability of the time integrator used for the ionic model.
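The timestep selection can be reproduced with a few lines of arithmetic. The sketch below takes the CFL condition in its usual form v Δt / h ≤ 1 and assumes a reference speed of 73.7 cm/s, the planar CV reported later for the straight domain; the function name `cfl_timestep_ms` is hypothetical and not part of the authors' code.

```python
import math

def cfl_timestep_ms(h_cm: float, cv_cm_per_s: float) -> float:
    """Largest power-of-two timestep (in ms) satisfying cv * dt / h <= 1."""
    dt_max_ms = h_cm / cv_cm_per_s * 1e3      # CFL bound, converted from s to ms
    exponent = math.floor(math.log2(dt_max_ms))
    return 2.0 ** exponent

h_x = 25e-4   # 25 um expressed in cm
cv = 73.7     # cm/s, assumed reference conduction velocity
dt = cfl_timestep_ms(h_x, cv)
print(dt)     # 0.03125
```

With these values the CFL bound is about 0.0339 ms, and the largest power of two below it is 2<sup>−5</sup> ms = 0.03125 ms, in agreement with the timestep quoted above.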

To quantify the changes in endocardial conduction velocity with respect to the curvature, we use a simple finite difference method: measuring the activation times t<sub>1</sub> and t<sub>2</sub> on the endocardial surface Γ<sub>i</sub> at two locations **x**<sub>1</sub> and **x**<sub>2</sub>, corresponding to the points **X**<sub>1</sub> = (0 cm, 1 cm) and **X**<sub>2</sub> = (0 cm, 1.5 cm) in the straight domain (no curvature), we define the conduction velocity as

$$v = \frac{\|\mathbf{x}_2 - \mathbf{x}_1\|_{\Gamma_i}}{t_2 - t_1} = \frac{\|\mathbf{X}_2 - \mathbf{X}_1\|}{t_2 - t_1} = \frac{0.5\ \text{cm}}{t_2 - t_1}, \tag{7}$$

where ‖ · ‖<sub>Γ<sub>i</sub></sub> represents the distance between **x**<sub>1</sub> and **x**<sub>2</sub> measured along the endocardial surface. Referring to **Figure 1B**, the points **x**<sub>1</sub> and **x**<sub>2</sub> correspond to the positions on the muscle-bath interface Γ<sub>i</sub> of the yellow and red circles, respectively.
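Because the bending preserves the length of the endocardial interface, the numerator of Equation (7) is always 0.5 cm and only the activation times change with curvature. A minimal sketch of this measurement follows; the function name and the activation times are hypothetical and serve only to illustrate the arithmetic.

```python
def endocardial_cv(t1_ms: float, t2_ms: float, arc_length_cm: float = 0.5) -> float:
    """Conduction velocity in cm/s from two activation times, following Eq. (7)."""
    dt_s = (t2_ms - t1_ms) * 1e-3             # activation delay in seconds
    if dt_s <= 0.0:
        raise ValueError("the second point must activate after the first")
    return arc_length_cm / dt_s

# Hypothetical activation times (ms), not taken from the simulations.
print(round(endocardial_cv(13.0, 19.8), 1))   # 73.5
```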

An equal and opposite stimulus is applied in the interior and exterior compartments of the muscle such that I<sub>e</sub><sup>v</sup> = 100 µA/cm<sup>3</sup> for the first 2 ms wherever y is smaller than –0.2797 mm. This choice generates a plane wave propagating in the longitudinal direction whenever the domain is straight (no curvature) and no bath region is considered.

Unipolar signals V<sub>e</sub><sup>1</sup> and V<sub>e</sub><sup>2</sup> were recorded at 1 kHz on the endocardial surface at **p**<sub>1</sub> and **p**<sub>2</sub>, corresponding to the points **P**<sub>1</sub> = (0 cm, 1.75 cm) and **P**<sub>2</sub> = (0 cm, 1.95 cm) of the straight domain. Referring to **Figure 1B**, the points **p**<sub>1</sub> and **p**<sub>2</sub> correspond to the positions on the muscle-bath interface Γ<sub>i</sub> of the light blue and pink red stars, respectively. Bipolar signals were then computed by taking the difference V<sub>e</sub><sup>2</sup> − V<sub>e</sub><sup>1</sup>.
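The recording step can be sketched as a subsampling followed by a difference. The snippet below assumes that the 1 kHz recording is obtained by keeping every 32nd solver step (1 ms / 0.03125 ms = 32); this sampling strategy and the function name `bipolar_signal` are assumptions, not a description of the authors' implementation.

```python
import numpy as np

def bipolar_signal(v1_e, v2_e, dt_ms: float = 0.03125, rate_hz: float = 1000.0) -> np.ndarray:
    """Subsample two unipolar traces to the recording rate and return V2 - V1."""
    stride = round(1e3 / rate_hz / dt_ms)     # solver steps per recorded sample
    v1 = np.asarray(v1_e, dtype=float)[::stride]
    v2 = np.asarray(v2_e, dtype=float)[::stride]
    return v2 - v1

# Synthetic traces covering 2 ms of simulated time (65 solver steps).
steps = np.arange(65)
signal = bipolar_signal(steps, 2 * steps)
print(signal)
```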

#### 2.2. Left Atrial Posterior Wall

A detailed geometry of the whole left atrium was collected by fast anatomical mapping (FAM) with a 2-5-2 PentaRay catheter. High-density maps of the left atrial posterior wall (LAPW) endocardial surface were created using the Carto3 electroanatomic mapping system (Biosense Webster, Diamond Bar, CA). The LAPW was mapped following pulmonary vein isolation by wide antral circumferential ablation (WACA). The region was defined as the area of the posterior left atrium between WACA lesion sets encircling the bilateral pulmonary veins and extending from their inferior margin to their superior margin. The LAPW surface was extracted from the reconstructed geometry of the left atrium and its geometrical representation was generated using SOLIDWORKS (Dassault Systèmes, Waltham, MA). The surface was then thickened outward to obtain a uniform 1.5 mm LAPW thickness. The bath region was created by thickening the surface in the opposite direction by 2.85 mm. We used the Trelis software (Computational Simulation Software, American Fork, UT) to generate a simplex mesh of the LAPW with bath. The mesh size for the muscle domain was selected to yield 16 elements through the muscle thickness. As the solution in the bath is smooth, a larger mesh size was used in the bath domain. Still, on the muscle-bath interface Γ<sub>i</sub>, the two meshes are conforming.

To investigate the role of muscle thickness and curvature on the LAPW, we used SOLIDWORKS to flatten the LAPW endocardial surface. The same procedure as for the curved LAPW was then used to thicken and mesh the resulting geometry. The resulting geometries are shown in **Figure 2A**.

The fiber field was created by assuming the existence of a harmonic potential ϕ(**x**) such that **f** = ∇ϕ. In practice, we numerically solve the equation Δϕ = 0 with mixed boundary conditions. In particular, we set ϕ = 0 on the surface Γ<sub>0</sub>, ϕ = 1 on Γ<sub>1</sub>, and ∂<sub>**n**</sub>ϕ = 0 on ∂Ω<sub>m</sub> \ (Γ<sub>0</sub> ∪ Γ<sub>1</sub>), where the surfaces Γ<sub>0</sub> and Γ<sub>1</sub> are the boundaries delimiting the LAPW from the top and from the bottom. The resulting fiber field, depicted in **Figure 2B**, qualitatively matches the anatomical structures of the LAPW shown by Markides et al. (2003). Referring to **Figure 2B**, Γ<sub>0</sub> and Γ<sub>1</sub> are the boundaries orthogonal to the fiber field. Similarly, a fiber field was generated for the flattened geometry.
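The idea of a harmonic-potential fiber field can be illustrated on a much simpler domain. The sketch below is not the finite element solve used on the LAPW mesh: it assumes a structured rectangle, a finite-difference Jacobi iteration, and the hypothetical function name `harmonic_fibers`, with ϕ = 0 on the bottom edge, ϕ = 1 on the top edge, and zero flux on the lateral edges.

```python
import numpy as np

def harmonic_fibers(nx: int = 21, ny: int = 31, iters: int = 5000):
    """Solve Laplace(phi) = 0 on a rectangle and return phi and the unit fibers grad(phi)/|grad(phi)|."""
    phi = np.zeros((ny, nx))
    phi[-1, :] = 1.0                          # phi = 1 on the top edge; bottom edge stays at 0
    for _ in range(iters):
        # Replicating the edge columns imposes zero normal flux on the lateral edges.
        padded = np.pad(phi, ((0, 0), (1, 1)), mode="edge")
        new = phi.copy()
        new[1:-1, :] = 0.25 * (phi[:-2, :] + phi[2:, :]
                               + padded[1:-1, :-2] + padded[1:-1, 2:])
        phi = new                             # first and last rows keep their Dirichlet values
    gy, gx = np.gradient(phi)                 # derivatives along rows (y) and columns (x)
    norm = np.hypot(gx, gy)
    return phi, gx / norm, gy / norm

phi, fx, fy = harmonic_fibers()
print(fx[15, 10], fy[15, 10])
```

On a rectangle the potential is linear between the two Dirichlet edges, so the fibers are uniform and orthogonal to those edges. On a curved surface the same construction yields fibers that follow the geometry while remaining orthogonal to the two boundaries where ϕ is prescribed.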

The simulations were initiated by imposing an initial condition for the transmembrane potential V. As shown in **Figure 2C**, the potential was set to 30 mV on Γ<sub>0</sub> and to its resting value of −81.2 mV everywhere else. Similarly, we imposed the initial condition on the flattened geometry.

For both the LAPW and the flattened LAPW, we solved the bidomain equations in three scenarios: (1) considering only the endocardial surface (referred to as 2D); (2) considering only the muscle domain (referred to as 3D); and (3) considering the muscle with bath-loading conditions (referred to as Bath). In all cases, we registered the activation time A<sub>t</sub>(**x**) as the earliest time at which the transmembrane potential exceeded −5 mV. The endocardial conduction velocities were then reconstructed on each node of the triangulation in the following way. For each triangle K on the endocardial surface, we define the elemental conduction velocity **v**<sub>K</sub> = ∇A<sub>t</sub>/(∇A<sub>t</sub> · ∇A<sub>t</sub>). Since A<sub>t</sub> is interpolated between the nodes using linear basis functions on each element, its gradient ∇A<sub>t</sub> and the conduction velocity **v**<sub>K</sub> are constant on each triangle. For each vertex q, we define the patch Π<sub>q</sub> as the set of triangles K surrounding the node q. The averaged nodal velocity is then given as

$$\mathbf{v}_q = \frac{\sum_{K \in \Pi_q} |K| \,\mathbf{v}_K}{\sum_{K \in \Pi_q} |K|},\tag{8}$$

in which |K| denotes the area of the triangle K.
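The reconstruction in Equation (8) can be sketched for a planar triangulation as follows. This is an illustrative implementation under stated assumptions: the mesh lies in the plane (a curved surface would require local tangent coordinates on each triangle), the activation time has a nonzero gradient on every element, and the function name `nodal_cv` and the two-triangle test mesh are hypothetical.

```python
import numpy as np

def nodal_cv(points: np.ndarray, triangles: np.ndarray, act_times: np.ndarray) -> np.ndarray:
    """Area-weighted nodal conduction velocities on a planar triangulation, following Eq. (8)."""
    num = np.zeros_like(points, dtype=float)
    den = np.zeros(len(points))
    for tri in triangles:
        p0, p1, p2 = points[tri]
        e1, e2 = p1 - p0, p2 - p0
        det = e1[0] * e2[1] - e1[1] * e2[0]   # twice the signed area of the triangle
        area = 0.5 * abs(det)
        d1 = act_times[tri[1]] - act_times[tri[0]]
        d2 = act_times[tri[2]] - act_times[tri[0]]
        # Gradient of the linear interpolant of the activation time on this triangle.
        grad = np.array([e2[1] * d1 - e1[1] * d2, -e2[0] * d1 + e1[0] * d2]) / det
        v_k = grad / grad.dot(grad)           # elemental velocity grad(A) / |grad(A)|^2
        num[tri] += area * v_k
        den[tri] += area
    return num / den[:, None]

# Unit square (cm) split into two triangles, with a plane wave travelling along x.
points = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
triangles = np.array([[0, 1, 2], [0, 2, 3]])
speed = 0.0737                                # cm/ms, i.e. 73.7 cm/s
velocities = nodal_cv(points, triangles, points[:, 0] / speed)
print(velocities)
```

For a plane wave the elemental gradient is the same on every triangle, so each nodal velocity equals the imposed speed along x with zero transverse component.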

#### 3. NUMERICAL RESULTS

The bidomain model was discretized in space using linear finite elements (Plank et al., 2005; Franzone et al., 2006; Pathmanathan et al., 2010; Bishop and Plank, 2011; Landajuela et al., 2018). The intracellular, extracellular, and bath potentials are solved monolithically (Bernabeu and Kay, 2011), using IMEX temporal schemes. We use the C++ implementation of the model of Courtemanche et al. (1998) provided by Hsing-Jung Lai and Sheng-Nan Wu on Model DB (McDougal et al., 2017), which includes the modifications by Ingemar Jacobson (Carnevale and Hines, 2006) needed for ion concentrations to be stable at a pacing rate of 1 Hz. Since the Courtemanche ionic model contains many discontinuous parameters that negatively influence the expected optimal rate of convergence of the finite element discretization (Arthurs et al., 2012), we rely on the simple IMEX BDF1 method. We refer to the **Supplementary Material** for more details on the numerical methods used. Unless explicitly stated, we used the same set of parameters for all the numerical tests presented below. The parameters are reported in **Table 1**.
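To make the time discretization concrete, the following sketch applies a first-order IMEX scheme to a one-dimensional reaction-diffusion equation. It is a deliberately simplified stand-in and not the BeatIt implementation: it assumes that diffusion is treated implicitly and the reaction explicitly, replaces the bidomain equations with a single scalar equation, and replaces the Courtemanche model with a cubic bistable reaction. All names and parameter values are illustrative.

```python
import numpy as np

def neumann_laplacian(n: int, h: float) -> np.ndarray:
    """Second-difference matrix with zero-flux ends on a uniform 1D grid."""
    lap = np.zeros((n, n))
    idx = np.arange(n)
    lap[idx, idx] = -2.0
    lap[idx[:-1], idx[:-1] + 1] = 1.0
    lap[idx[1:], idx[1:] - 1] = 1.0
    lap[0, 0] = lap[-1, -1] = -1.0
    return lap / h**2

def cubic_reaction(v, a: float = 0.1, rate: float = 8.0):
    """Bistable cubic source term, a placeholder for an ionic model."""
    return rate * v * (1.0 - v) * (v - a)

def imex_bdf1(v0, dt, steps, diffusivity, h, reaction=cubic_reaction):
    """Advance V by solving (I - dt*D*L) V_new = V_old + dt*f(V_old) at each step."""
    v = np.asarray(v0, dtype=float).copy()
    system = np.eye(len(v)) - dt * diffusivity * neumann_laplacian(len(v), h)
    for _ in range(steps):
        v = np.linalg.solve(system, v + dt * reaction(v))
    return v

v0 = np.zeros(100)
v0[:5] = 1.0                                  # excite the left end of the cable
v = imex_bdf1(v0, dt=0.03125, steps=400, diffusivity=1.0, h=0.25)
print(v.min(), v.max())
```

Treating the stiff diffusion term implicitly removes the parabolic timestep restriction, while the explicit reaction term avoids a nonlinear solve at every step; the price is first-order accuracy in time.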

The code developed in this work, BeatIt (available at github.com/rossisimone/beatit), relies on the parallel C++ finite element library libMesh (Kirk et al., 2006) and on PETSc (Balay et al., 1997, 2017) and HYPRE linear solvers (Falgout and Yang, 2002). More specifically, we used the FieldSplit preconditioner (Brown et al., 2012) provided by PETSc to solve the system using the block Gauss-Seidel method, and each sub-block is preconditioned using BoomerAMG (Falgout et al., 2010). More details about the algorithmic implementation can be found in the **Supplementary Material**. Using a uniform structured grid for the muscle and bath domain, simulations of the two-dimensional idealized test case were run in serial on a Linux workstation. Simulations on the patient-specific left atrial posterior wall used a fine discretization in the muscle domain and a coarse one in the bath domain. A boundary layer in the mesh of the bath was created to correctly resolve the bath potential close to the muscle interface. Simulations were run on a single node (44 processors) of the Dogwood Linux cluster at the University of North Carolina at Chapel Hill. The visualization of the results and their analysis have been carried out using Paraview (Ahrens et al., 2005) and MATLAB (The Mathworks, Inc., Natick, MA).

TABLE 1 | Bidomain model parameters used in the numerical simulations.

change depending on the curvature. Relative changes in CVs are only slightly influenced by transversal conductivities.

### 3.1. Without Bath-Loading Conditions Endocardial CVs Depend on Tissue-Thickness and Curvature

We start by investigating how muscle thickness influences the endocardial conduction velocities when no bath-loading conditions are considered. For this test, we consider muscles of thicknesses ℓ<sub>m</sub> = 0.025, 0.5, 1, 1.5, and 2 mm. The thickness ℓ<sub>m</sub> = 0.025 mm corresponds to the case where the atrial tissue is considered to be so thin that it can be approximated by a two-dimensional manifold.

**Figure 3A** shows the evaluation of the conduction velocities on the endocardial surface for the considered muscle thicknesses. When the curvature is zero, the conduction velocity is independent of muscle thickness. If the muscle thickness is small enough, say of the order of a handful of cardiomyocytes, the conduction velocities are also independent of the curvature. On the other hand, when curvature is imposed on a thicker muscle the endocardial conduction velocities can change quite drastically. For positive curvatures (bending to the left) the endocardial CVs become slower, while for negative curvatures (bending to the right) the endocardial CVs become faster.

We also test whether the relative changes in the CVs are influenced by the anisotropy ratio (AR) for a muscle thickness of 1.5 mm. In a first test, we increased and decreased the longitudinal conductivities σ<sub>e</sub><sup>f</sup> = σ<sub>i</sub><sup>f</sup> by 50%, keeping the transversal conductivities fixed. As shown in **Figure 3B**, variations in the longitudinal conductivities do not affect the relative changes in CVs with respect to curvature. Clearly, the magnitude of the CVs is different. For κ = 0 cm<sup>−1</sup>, if σ<sub>e</sub><sup>f</sup> = σ<sub>i</sub><sup>f</sup> = 4.5 mS/cm, the CV is measured to be 73.7 cm/s; if σ<sub>e</sub><sup>f</sup> = σ<sub>i</sub><sup>f</sup> = 6.75 mS/cm (+50% case), the CV is measured to be 90.4 cm/s; if σ<sub>e</sub><sup>f</sup> = σ<sub>i</sub><sup>f</sup> = 2.25 mS/cm (–50% case), the CV is measured to be 51.8 cm/s. These values are in accordance with the expected dependence of the CVs on the square root of the conductivities. In a second test, we increased and decreased the transversal conductivities by 50%, keeping the longitudinal conductivities fixed. The relative changes in CVs are shown in **Figure 3C**, where we have also included the results for isotropic propagation. Although some differences can be found at different anisotropy ratios, changes in the ARs seem to have only a minor effect on the relative changes in CVs. In all these cases, for κ = 0 cm<sup>−1</sup>, the endocardial CV was measured to be about 73.7 cm/s.
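The square-root dependence can be checked against the three reported values. The short sketch below takes the baseline pair (4.5 mS/cm, 73.7 cm/s) as reference and compares the predicted CVs with those reported for the ±50% cases; the function name `scaled_cv` is hypothetical.

```python
import math

cv_ref = 73.7      # cm/s at the baseline longitudinal conductivity
sigma_ref = 4.5    # mS/cm

def scaled_cv(sigma: float) -> float:
    """CV predicted by a square-root dependence on the conductivity."""
    return cv_ref * math.sqrt(sigma / sigma_ref)

for sigma, reported in [(6.75, 90.4), (2.25, 51.8)]:
    predicted = scaled_cv(sigma)
    error = abs(predicted - reported) / reported
    print(f"sigma = {sigma} mS/cm: predicted {predicted:.1f} cm/s, "
          f"reported {reported} cm/s, relative difference {100 * error:.2f}%")
```

Both predictions fall within 1% of the reported velocities, which supports the stated scaling.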

Although we have found that the AR does not influence the relative changes in the CVs, the AR does influence the shape of the wavefront in the curved domains. We show in **Figure 4** the shapes of the wavefronts at different internal and external anisotropy ratios (AR<sub>i</sub> and AR<sub>e</sub>) for κ = π/2 cm<sup>−1</sup>. Specifically, fixing the longitudinal conductivity coefficients σ<sub>e</sub><sup>f</sup> = σ<sub>i</sub><sup>f</sup> = 4.5 mS/cm, we show the activation times (black lines are isochrones separated by 1 ms increments) in various cases changing the transversal conductivities. As shown in **Figure 4A**, the initial condition creates a plane wave in the straight region of the domain. **Figure 4A** also shows the rectangular region where we look at the wavefronts. Under isotropic conditions, the wavefronts remain straight even in the curved domain. This is shown in **Figure 4B**, where the isochrones are radial. Under anisotropic conditions (**Figures 4C–F**), the wavefronts have a different orientation with respect to the radial direction. Additionally, the boundary conditions induce wavefront curvatures close to the boundaries.

FIGURE 4 | (A) Transmembrane potential wavefront at *t* = 4 ms without bath-loading conditions for curvature κ = π/2 cm<sup>−1</sup>. The initial stimulus generates a plane wave in the straight part of the muscle for any anisotropy ratio (AR). (B–F) Shape of the wavefronts at 1 ms intervals (zoom of the rectangular region in A) for several external (AR<sub>e</sub>) and internal (AR<sub>i</sub>) anisotropy ratios. The longitudinal conductivity σ<sub>e</sub><sup>f</sup> = σ<sub>i</sub><sup>f</sup> is fixed for all cases and the transversal conductivities are changed. While in the isotropic case (B) the fronts remain almost planar, in all other cases the wavefronts become curved.

FIGURE 5 | (A) Activation times at intervals of about 3.3 ms at different curvatures without bath-loading conditions for a muscle thickness of 1.5 mm. The change in shape of the wavefronts in the curved domains is clearly noticeable. The shapes depend on the sign and magnitude of the curvature κ. Note that the construction of the domain leads to different geometries for positive and negative curvatures. This is because we keep the endocardial length fixed, but we allow the epicardial surface to become shorter or longer. (B) Endocardial CVs as a function of the curvature for several muscle thicknesses when an intracardiac bath of size ℓ<sub>b</sub> = 6 mm is considered. (C) Endocardial CVs as a function of the curvature for several muscle thicknesses when intracardiac and extracardiac baths of size ℓ<sub>b</sub> = 3 mm are considered. As for the case with no bath-loading conditions, endocardial CVs speed up for negative curvatures and slow down for positive curvatures (B,C). The case of muscle thickness 25 µm corresponds to the case of two-dimensional manifolds in three-dimensional simulations. When the muscle thickness ℓ<sub>m</sub> is very small (25 µm), the CVs are independent of the curvature. In this case, the signal speed is strongly influenced by the bath conductivities. If ℓ<sub>m</sub> > 1 mm, then muscle thickness does not change the endocardial CVs much, but curvature does.

Finally, we show in **Figure 5A** the activation times at different curvatures for ℓ<sub>m</sub> = 1.5 mm, using the parameters specified in **Table 1**. The black isolines are spaced 3.3 ms apart. The marked solid black line represents the endocardial surface where we measure the conduction velocities. As can be seen, curvature greatly influences the activation times: the endocardial activation is slower for positive curvature (bending to the left) and faster for negative curvature (bending to the right) than in the straight case.

### 3.2. Endocardial CVs Depend on Muscle-Thickness and Curvature With Bath-Loading Conditions

Here, we investigate the role of muscle thickness and curvature in the presence of a bath. We consider a fixed intracardiac bath thickness ℓ<sub>b</sub> = 6 mm and we test muscle thicknesses ℓ<sub>m</sub> = 0.025, 0.5, 1, 1.5, and 2 mm. In **Figure 5B**, we show the endocardial CVs evaluated for the different muscle thicknesses. Note that for a muscle thickness of ℓ<sub>m</sub> = 25 µm, the conduction velocities are mainly dictated by the conductivity of the bath.

FIGURE 6 | (A) When the domain is straight (κ = 0 cm<sup>−1</sup>) and both intracardiac and extracardiac bath-loading conditions are considered, the wavefront takes the characteristic "V" shape. (B) When the domain is curved, the wavefront does not take the characteristic "V" shape. (C) If we consider only the intracardiac bath, the endocardial CVs are still in good agreement with the case of intracardiac and extracardiac baths.

A small thickness is sufficient to reveal the dependence on muscle curvature.

We also consider the case of intracardiac and extracardiac baths of thickness ℓ<sup>b</sup> = 3 mm, testing again muscle thicknesses ℓ<sup>m</sup> = 0.025, 0.5, 1, 1.5, and 2 mm. The corresponding CVs are shown in **Figure 5C**. Once again, if ℓ<sup>m</sup> = 25 µm, the CVs are independent of the curvature. As expected, the extracardiac bath mainly influences endocardial CVs for muscle thickness smaller than 1 mm.

We can conclude that if we are interested only in capturing endocardial CVs, using only an intracardiac bath is sufficient if the muscle thickness is greater than 1 mm. We illustrate this in **Figure 6**, which shows the wavefront and the extracellular potential at time t = 32 ms for a muscle thickness of 2 mm. As a reference, we show in **Figure 6A** the characteristic V-shaped wavefront when intracardiac and extracardiac baths are both considered in a straight domain. In the curved domain (**Figure 6B**), the front loses the characteristic V-shape. When only the intracardiac bath is considered (**Figure 6C**), the epicardial details of the front are lost, but the endocardial CVs are about the same. This can be noted by comparing the position of the endocardial fronts in **Figures 6B,C**.

Finally, in **Figure 7**, we plot the bipolar signals measured on the endocardial surface, as explained in section 2.1, for three selected curvatures: κ = π/2 cm<sup>−1</sup>, κ = 0 cm<sup>−1</sup>, and κ = −π/2 cm<sup>−1</sup>. Once again, if the muscle thickness is not accounted for, the peak of the signal is out of phase due to the increased CVs. Moreover, the amplitude of the signal is not accurate. Nonetheless, we can appreciate that the amplitude of the signals is greatly affected by the thickness of the muscle. No major differences in the signals can be noted for different curvatures.

### 3.3. Endocardial CVs Depend on Bath-Size and Curvature at Fixed Muscle Thickness

In this test, we evaluate the size of the bath that is needed to correctly capture the endocardial CVs. Fixing the muscle thickness ℓ<sub>m</sub> = 1.5 mm, we vary the bath thickness ℓ<sub>b</sub>.

bath-loading conditions increases the speed at which the wavefronts propagate. Positive (negative) curvature of the muscle decreases (increases) the endocardial CVs. A bath size of at least 1.5 mm was needed to correctly capture the effects of the intracardiac bath-loading conditions.

We show in **Figure 8A** the extracellular potential V<sub>e</sub> at t = 20 ms for some selected curvatures and ℓ<sub>b</sub> = 6 mm. The solid black line corresponds to the endocardial interface. The front of the wave is localized in the muscle region where we have an abrupt change in the polarity of V<sub>e</sub>. We can appreciate from these plots the differences in the wavefront curvatures, which depend both on the curvature of the domain and on the imposed boundary and interface conditions.

**Figure 8B** shows the dependence of the endocardial CVs on the curvature of the domain. We note here that for baths larger than 1.5 mm we measure the same CVs. This suggests that the bath should be at least of the size of the muscle to correctly capture the magnitude of the CVs.

Similarly to the simplified case studied above, the conduction velocities strongly depend on the muscle curvature. Still, curvature has small effects on the bipolar signals. In **Figure 9**, we show the bipolar signals recorded at 1 kHz for the different bath sizes at three selected curvatures. Specifically, we show in **Figures 9A–C** the bipolar signals recorded for bath sizes between 0 and 2 mm, and in **Figures 9D–F** the bipolar signals recorded for bath sizes between 1.5 and 6 mm. These plots also suggest that a bath size of at least 1.5 mm is needed to correctly capture the bipolar signals.

### 3.4. Patient-Specific Left Atrial Posterior Wall

In **Figure 10**, we show the solutions of the bidomain model on the patient-specific LAPW. More specifically, we show in **Figure 10A** the endocardial activation times (black isochrones at intervals of about 10 ms) when using the bath-loading conditions. In **Figure 10B**, we show the extracellular potential in the muscle and in the bath regions at time t = 40 ms. In **Figure 10C**, we show the shape of the wavefront at time t = 80 ms without a bath. The wavefront is highlighted in white, and the corresponding straight wavefront is depicted by the dashed green line. The corresponding results in the flattened LAPW are shown in **Figures 10D–F**. While in the flat geometry the wavefront remains straight, in the curved domain transversal conductivity and boundary conditions lead to a transmurally curved wavefront.

Finally, we show in **Figure 11** the distributions of the endocardial CV evaluated using (8). The CV of the LAPW and of the flattened LAPW have the same distribution if a bidimensional manifold is considered, where the most frequent conduction velocities are around 74–76 cm/s; see **Figure 11A**. Additionally, in the flattened LAPW, the thickness of the muscle does not influence the endocardial CV distribution; see **Figure 11B**. In the curved LAPW, the small thickness of the muscle is sufficient to slow down the endocardial conduction velocities; see **Figure 11C**. This is represented by the broader distribution of the 3D simulation in the CVs smaller than 70 cm/s. Additionally, the peak of the three-dimensional distribution corresponds to slightly slower CVs of about 72–75 cm/s. A similar difference can be noted also when comparing the distributions of the endocardial CVs for the flat and curved three-dimensional domains; see **Figure 11D**. When the bath-loading effects are considered; see **Figure 11E**, the differences are smaller but still noticeable: the peak of the distribution slows down from 85 to 84 cm/s and CVs smaller than 80 cm/s are more frequent throughout the domain.

### 4. CONCLUSIONS

Measurements of endocardial CVs can be used to characterize the electrophysiological health of the tissue substrate in patients with atrial fibrillation (AFib). CV is known to be affected by membrane excitability, front curvature, fiber orientation, and tissue anisotropy (Roberts et al., 1979; Rogers and McCulloch, 1994; Kléber and Rudy, 2004). In patients with persistent AFib, the morphological structure of the left atrium is correlated with pro-arrhythmic wave dynamics (Song et al., 2018). Heterogeneous atrial wall thickness is believed to contribute to spiral wave localization or drift (Yamazaki et al., 2012; Biktasheva et al., 2015) and to support scroll waves underlying AFib maintenance (Yamazaki et al., 2012). The regional left atrial wall thickness has been strongly correlated with the dominant frequency, Shannon entropy, and the presence of complex fractionated atrial electrograms, associated with diseased tissue (Song et al., 2018). In addition, it has been shown that electrical dissociation between the epicardial layer and the endocardial layer during AFib increases the stability and complexity of the AFib and is more pronounced in regions of thicker atrial wall (Eckstein et al., 2010). Along with wall thickness, curvature changes in wall geometry can also contribute to the initiation and maintenance of reentries by promoting wavebreaks (Rogers, 2002). Rogers showed that an expansion of the diffusive term of the monodomain model in terms of curvilinear coordinates reveals the role of curvature and muscle thickness on CVs (Rogers, 2002). For a spiral wave on a spherical manifold, an analytical expression for the angular velocities as a function of the curvature can be derived (Davydov and Zykov, 1991). These findings suggest that even under spatially uniform electrical and membrane properties, the complex geometry of the heart can destabilize wavefronts, causing fragmentation and complex activation patterns (Rogers, 2002). Rogers found that propagation was only affected by surface curvature when curvature was present in two directions (Rogers, 2002). In our simulations of an initially planar wave, we also found that surface curvature in one direction does not influence propagation if the muscle is represented by a surface in three dimensions. Our simulations on a surface representation of a patient-specific left atrial posterior wall (LAPW) showed that the distributions of CVs are not influenced by the Gaussian curvature (curvature in two directions). As soon as muscle thickness is incorporated, curvature in one direction is sufficient to affect wavefront propagation speed.

FIGURE 9 | Bipolar signals *V*<sub>e</sub><sup>2</sup> − *V*<sub>e</sub><sup>1</sup> recorded at 1 kHz for three selected curvatures κ = π/2 cm<sup>−1</sup>, 0 cm<sup>−1</sup>, −π/2 cm<sup>−1</sup> for different bath sizes and muscle thickness ℓ<sub>m</sub> = 1.5 mm using the modified version of the Courtemanche atrial ionic model. (A–C) Bipolar signals for bath sizes between 0 and 2 mm. (D–F) Bipolar signals for bath sizes between 1.5 and 6 mm. An overlap of the data has been used between the top and bottom rows to better understand the differences in signals for different bath sizes. The curvature of the domain does not play a major role in the recorded signals. Large differences in the signal amplitudes can be found for bath sizes smaller than 2 mm. Although minor differences can also be noted for baths larger than 1.5 mm, the amplitude of the signals is well captured for baths of size at least 3 mm.

We started our investigation by considering the role of thickness and curvature without bath-loading conditions. As expected, CVs are not influenced by muscle thickness if no curvature is imposed on the domain. Additionally, CVs are not influenced by curvature whenever the muscle thickness is negligible (e.g., 25 µm). This situation corresponds also to the manifold representation of the atria in some computational models (Vigmond et al., 2001; Zemlin et al., 2001; Virag et al., 2002; van Dam and van Oosterom, 2003; Weiser et al., 2010; Patelli et al., 2017). For larger muscle thicknesses, geometrical curvature influences the propagation of the electrical signal. For negative curvatures, the signal propagates faster, whereas for positive curvatures, the signal propagates with decreased CVs. For thin muscles (up to about 1 mm), the thicker the muscle the slower (faster) the CVs for positive (negative) curvatures. This relationship between curvature, muscle thickness, and CV is analogous to the well-known dependence of propagation efficacy on wavefront curvature (Tyson and Keener, 1988; Rogers and McCulloch, 1994; Rogers, 2002). We have shown that these changes in CV take place even without considering variations in the transmural properties of the muscle.

Under conditions with uniform transmural properties one might assume that a planar wavefront remains planar for any curvature. The term planar wavefront is used in analogy with the theory of plates, in which straight lines normal to the mid-surface remain normal to the mid-surface after deformation. In a similar sense, we understand that a planar wavefront is a front that is parallel to a straight line normal to the mid-surface and remains normal for any curvature of the domain. We have demonstrated in our tests that this assumption holds only under isotropic conditions. When anisotropy is introduced, the wavefront in curved domains does not remain planar. The transmural shape of the wavefront depends on two factors: (i) the anisotropy ratio and (ii) the boundary conditions. Even in more refined versions of the surface-based models of atrial electrophysiology (Chapelle et al., 2013), derived from asymptotic analysis averaging through the thickness, these factors are not well captured. For example, the surface-based models cannot represent the dissociation of endocardial and epicardial electrical activities during fibrillation. Single layer surface models have been improved by introducing a second layer to account for a more three-dimensional character of the fibrillatory conduction (Gharaviri et al., 2012; Labarthe et al., 2014; Coudière et al., 2017). Comparisons of these bilayer models with three-dimensional simulations are very limited and do not consider the possible influence of geometrical curvature on the electrical propagation. We have also shown that even under isotropic conditions where the fronts remain planar in curved domains, the endocardial CVs depend on the curvature. These results show that the fully three-dimensional atrial models are necessary to accurately capture the propagation of electrical signals and the corresponding conduction velocities on the endocardial surface.

A number of studies have shown that bath-loading conditions can increase conduction velocities (Roth, 1991, 1996; Henriquez et al., 1996; Srinivasan and Roth, 2004; Bishop et al., 2011). Comparing **Figures 3** and **5**, the CV for a muscle thickness of 25 µm increases from 74 to 104 cm/s in the presence of a bath. But as in the cases that omit the bath, curvature does not play a major role in determining the velocities. For muscle thicknesses between 0.5 and 2 mm, we have found that curvature in the presence of a bath acts to increase endocardial conduction velocities, but, in accordance with Roth (1991), the differences between the various thicknesses are smaller than they are without a bath. For positive curvatures, we have found that when no bath is considered, changes of up to 10% of the planar CVs can be measured. Although the curvature effect is smaller with bath-loading conditions, changes of up to 6% were found. These variations in CVs can actually be measured by electroanatomic mapping systems. We also found that changes in CVs for negative curvatures were more pronounced. For large negative curvature, we found variations of more than 10 cm/s. Even if these results may not be applied directly to the measurements of the CVs on the LAPW, which has mostly positive curvature, they highlight the strong correlation between structure and speed of propagation. We conclude that in the presence of bath-loading, three-dimensional atrial models are still necessary to accurately capture the propagation of electrical signals and their conduction velocities. To reduce the computational cost when bath-loading conditions are considered and the main interest is the evaluation of endocardial conduction velocity during normal propagation, it could be possible to consider a uniform atrial thickness of about 1 mm. On the other hand, this approximation may fail to correctly represent endo-epicardial dissociation and transmural breakthrough during AFib.

In accordance with the results shown by Bishop and Plank (2011), in our simplified test case, fixing the muscle thickness at 1.5 mm, a bath size larger than 1.5 mm was necessary to correctly capture endocardial CVs. The effects of curvature on CVs are important for all bath sizes, and the same considerations as in the case with no bath-loading conditions described above hold.

The above considerations, drawn from a simple two-dimensional test case, were found to also hold in realistic geometries. Specifically, we have reconstructed a human model of the LAPW, assuming a uniform fiber field, in which the direction of anisotropy was obtained from a scalar harmonic potential. We solved the anisotropic bidomain model considering: (i) only the endocardial surface; (ii) only the atrial muscle with thickness 1.5 mm; and (iii) the atrial muscle with an intracardiac bath of 2.85 mm of thickness. Additionally, in patient-specific geometries, it is not possible to precisely control the direction of propagation. Therefore, to study the role of curvature, we recreated a flattened version of the LAPW. Endocardial conduction velocities were computed at each vertex of the triangulation of the domain using weighted averages based on the gradients of the activation times. Comparing the distributions in the various scenarios, we have concluded that curvature and muscle thickness can strongly influence the measured conduction velocities. In fact, we have found a shift in the peak CVs, with a reduction of about 2–4 cm/s, when comparing the distributions of the three-dimensional patient-specific geometry with those of a manifold or a flattened representation of the LAPW. More importantly, when muscle thickness and curvature are included, the overall distributions have slower decays on the left and faster decays on the right. A two-sample t-test (Snedecor and William, 1989) determined that the difference between the means of the two distributions is statistically significant (p value = 0). This behavior, shown in **Figure 11**, leads to an overall slowdown of the propagation of the electrical signal.
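The vertex-wise CV evaluation described above can be illustrated with a minimal sketch. It assumes that the local CV is the inverse of the magnitude of the activation-time gradient on each triangle and that the vertex value is an area-weighted average over the adjacent triangles; the exact weights used in the study are not stated here, so the function name `vertex_conduction_velocity` and the area weighting are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def vertex_conduction_velocity(points, triangles, activation_times):
    """Area-weighted vertex CVs from per-triangle activation-time gradients.

    Assumes non-degenerate triangles; CV units follow the inputs
    (e.g., mm and ms give mm/ms).
    """
    points = np.asarray(points, dtype=float)
    triangles = np.asarray(triangles, dtype=int)
    times = np.asarray(activation_times, dtype=float)

    p0 = points[triangles[:, 0]]
    e1 = points[triangles[:, 1]] - p0
    e2 = points[triangles[:, 2]] - p0
    normal = np.cross(e1, e2)                      # length = 2 * triangle area
    norm_sq = np.einsum("ij,ij->i", normal, normal)
    area = 0.5 * np.sqrt(norm_sq)

    # Gradient of the piecewise-linear interpolant of the activation time.
    dt1 = times[triangles[:, 1]] - times[triangles[:, 0]]
    dt2 = times[triangles[:, 2]] - times[triangles[:, 0]]
    grad = (dt1[:, None] * np.cross(e2, normal)
            + dt2[:, None] * np.cross(normal, e1)) / norm_sq[:, None]

    # Local CV is the inverse of the gradient magnitude (the "slowness").
    slowness = np.linalg.norm(grad, axis=1)
    valid = slowness > 0.0
    tri_cv = np.zeros_like(slowness)
    tri_cv[valid] = 1.0 / slowness[valid]

    # Area-weighted average over the triangles that share each vertex.
    weighted_sum = np.zeros(len(points))
    weight_total = np.zeros(len(points))
    for k in range(3):
        np.add.at(weighted_sum, triangles[:, k], area * tri_cv * valid)
        np.add.at(weight_total, triangles[:, k], area * valid)
    cv = np.full(len(points), np.nan)
    np.divide(weighted_sum, weight_total, out=cv, where=weight_total > 0.0)
    return cv


# Planar wave travelling along x at 0.8 mm/ms over a unit square (two triangles).
points = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]]
triangles = [[0, 1, 2], [0, 2, 3]]
activation_times = [p[0] / 0.8 for p in points]
print(vertex_conduction_velocity(points, triangles, activation_times))
```

Triangles with a vanishing gradient are excluded from the average, and vertices without any valid neighbor are returned as NaN.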

Because clinical CV maps are derived from extracellular electrograms, we also investigated how bipolar signals are affected by muscle thickness, curvature, and bath size. To mimic clinical conditions, the unipolar signals were sampled at 1 kHz at two points on the endocardial surface at a distance of 2 mm. We found that curvature does not play any substantial role in the electrogram morphology. On the other hand, muscle thickness and bath size can influence the amplitude of the signals. Still, the differences for a bath size larger than 1.5 mm were small. A major difference was found when approximating the muscle thickness with a bidimensional manifold. This corresponds to the test with a muscle thickness of 25 µm. In this case, the amplitude and the shape of the signals were very different from the cases in which the muscle thickness was between 0.5 and 2 mm. In particular, we recorded a maximum peak smaller than 1 mV for 25 µm muscle thickness, while for thicker muscles the peak was greater than 1 mV. Given the faster CVs for the thin muscle case, the maximum peak was recorded earlier than for thicker muscles. We also note that, in accordance with the discussion above on CVs, the time at which the peak bipolar signal is recorded depends on the muscle curvature. This finding suggests the use of three-dimensional models of atrial electrophysiology for accurately simulating surface electrograms.
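As a rough illustration of how a bipolar signal and its peak can be extracted from two unipolar traces, the sketch below takes the bipolar electrogram as the difference of the two unipolar recordings, which is the usual convention. The waveforms are synthetic and purely hypothetical (they are not simulation output), and the helper names are assumptions made for this example.

```python
import numpy as np


def bipolar_electrogram(unipolar_a, unipolar_b):
    """Bipolar signal as the difference of two unipolar recordings."""
    return np.asarray(unipolar_a, dtype=float) - np.asarray(unipolar_b, dtype=float)


def peak_amplitude_and_time(signal, sampling_rate_hz=1000.0):
    """Largest absolute deflection and the time (in ms) at which it occurs."""
    signal = np.asarray(signal, dtype=float)
    index = int(np.argmax(np.abs(signal)))
    return float(abs(signal[index])), 1000.0 * index / sampling_rate_hz


# Synthetic unipolar deflections (hypothetical shapes, arbitrary units): the
# wave reaches electrode B 2.5 ms after electrode A, i.e. 2 mm at 0.8 mm/ms.
t_ms = np.arange(0.0, 50.0, 1.0)              # 1 kHz sampling
unipolar_a = -np.tanh((t_ms - 20.0) / 2.0)
unipolar_b = -np.tanh((t_ms - 22.5) / 2.0)
bipolar = bipolar_electrogram(unipolar_a, unipolar_b)
print(peak_amplitude_and_time(bipolar))
```

With this convention a faster wavefront shortens the delay between the two electrodes, which moves the bipolar peak earlier in time.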

To verify that our findings were not largely affected by the numerical methods used, we also solved the bidomain model with a cubic reaction term in place of the Courtemanche ionic model. This simple reaction model can be used to represent a propagating front guaranteeing a second order convergence of the numerical method used herein (Rossi and Griffith, 2017). The details of this model and the results can be found in the **Supplementary Material**. Except for some differences in the details of the registered bipolar signals due to the different shape of the propagating pulse, the same qualitative behavior with respect to muscle thickness, curvature, and bath size was found. This suggests that our findings obtained using numerical methods with a suboptimal order of convergence are correct.

In conclusion, we have found evidence that even under homogeneous conditions, a surface-based model of the atria is not accurate in capturing the endocardial CVs and the magnitude of the endocardial bipolar signals. In general, the change in CV for different curvatures is a function of muscle thickness (**Figure 3**). This effect is reduced in the presence of an adjoining bath. For the left atrial posterior wall with positive curvature, the electrical signal propagates more slowly on the endocardial surface than it would on a flat region. It has been shown in the ventricles that regions of slow conduction are correlated with anatomical sites critical for tachycardia (Irie et al., 2015). The slowing associated with curvature may be exacerbated under compromised electrophysiological conditions. The effects of geometry and bath-loading on conduction are important if CV is to be used as an index to indicate regions with fibrosis or poor conductivity. From the computational point of view, the findings suggest that models of atrial electrophysiology used to guide and understand endocardial catheter measurements should be fully three-dimensional and account for bath-loading effects. In our simulations, a bath size of at least 1.5 mm was necessary to obtain consistent CV measurements.

### 5. LIMITATIONS

Our simulations had several limitations. First, we considered uniform muscle thicknesses between 0.5 mm and 2 mm and uniform curvatures. The atrial wall thickness varies (Bishop et al., 2015) and has been shown to affect wavefront dynamics in atrial fibrillation (Rogers, 2002; Biktasheva et al., 2015; Song et al., 2018). Although the cases we considered are within the range of left atrial wall thicknesses (Bishop et al., 2015), measurements of the LAPW have shown that the muscle thickness can be as large as 5 mm superiorly and 8 mm inferiorly (Platonov et al., 2008). Additionally, the average atrial wall thickness is about 2.73 mm (Pashakhanloo et al., 2016). We have shown that the thicker the muscle, the more important it is to consider a three-dimensional model of cardiac electrophysiology, but the endocardial CVs are captured with good accuracy even when bath-loading conditions are considered and a thickness of about 1.5 mm is used. Introducing non-uniform wall thickness in the patient-specific simulations can be challenging because a description of the epicardial surface is not readily available from endocardial electroanatomical maps or easily discernible from standard imaging. Finally, we have mainly considered propagation along a strand of tissue without considering transmural propagation. In fact, we have shown that in an anisotropic domain with no bath, a planar wavefront becomes curved if the domain has curvature. A more detailed study should be carried out to gain proper insight into transmural propagation.

## DATA AVAILABILITY STATEMENT

BeatIt, the C++ code created for this study, is publicly available on GitHub at github.com/rossisimone/beatit.git.

## AUTHOR CONTRIBUTIONS

SR: conception, design, code implementation, drafting, data preprocessing, data analysis, and interpretation of data; SG: conception, design, patient-specific data acquisition, critical revision; BG and CH: design, critical revision.

### FUNDING

The authors gratefully acknowledge research support from NIH Award No. HL117063 and HL143336 and from NSF Award No. OAC 1450327.

### ACKNOWLEDGMENTS

We gratefully acknowledge support by the libMesh developers in aiding the development of the code used in the simulations reported herein.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fphys.2018.01344/full#supplementary-material

### REFERENCES

Balay, S., Abhyankar, S., Adams, M. F., Brown, J., Brune, P., Buschelman, K., et al. (2017). PETSc Users Manual. Technical Report ANL-95/11 - Revision 3.8. Argonne National Laboratory.

Balay, S., Gropp, W. D., McInnes, L. C., and Smith, B. F. (1997). Efficient management of parallelism in object oriented numerical software libraries, in Modern Software Tools in Scientific Computing, eds E. Arge, A. M. Bruaset, and H. P. Langtangen (Boston, MA: Birkhäuser Press), 163–202.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Rossi, Gaeta, Griffith and Henriquez. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# GEMS: A Fully Integrated PETSc-Based Solver for Coupled Cardiac Electromechanics and Bidomain Simulations

#### Sander Arens<sup>1</sup>, Hans Dierckx<sup>1</sup> and Alexander V. Panfilov<sup>1,2</sup>\*

<sup>1</sup> Department of Physics and Astronomy, Ghent University, Ghent, Belgium, <sup>2</sup> Laboratory of Computational Biology and Medicine, Ural Federal University, Ekaterinburg, Russia

Cardiac contraction is coordinated by a wave of electrical excitation which propagates through the heart. Combined modeling of electrical and mechanical function of the heart provides the most comprehensive description of cardiac function and is one of the latest trends in cardiac research. The effective numerical modeling of cardiac electromechanics remains a challenge, due to the stiffness of the electrical equations and the global coupling in the mechanical problem. Here we present a short review of the inherent assumptions made when deriving the electromechanical equations, including a general representation for deformation-dependent conduction tensors obeying orthotropic symmetry, and then present an implicit-explicit time-stepping approach that is tailored to solving the cardiac mono- or bidomain equations coupled to electromechanics of the cardiac wall. Our approach makes it possible to find numerical solutions of the electromechanics equations using stable, higher-order time integration. Our methods are implemented in a monolithic finite element code, GEMS (Ghent Electromechanics Solver), using the PETSc library, which is inherently parallelized for use on high-performance computing infrastructure. We tested GEMS on standard benchmark computations and discuss further development of our software.

#### Edited by:

Mariano Vázquez, Barcelona Supercomputing Center, Spain

#### Reviewed by:

Jazmin Aguado-Sierra, Barcelona Supercomputing Center, Spain Constantine Butakoff, Universidad Pompeu Fabra, Spain

#### \*Correspondence:

Alexander V. Panfilov Alexander.Panfilov@ugent.be

#### Specialty section:

This article was submitted to Biophysics, a section of the journal Frontiers in Physiology

Received: 14 January 2018 Accepted: 20 September 2018 Published: 16 October 2018

#### Citation:

Arens S, Dierckx H and Panfilov AV (2018) GEMS: A Fully Integrated PETSc-Based Solver for Coupled Cardiac Electromechanics and Bidomain Simulations. Front. Physiol. 9:1431. doi: 10.3389/fphys.2018.01431

Keywords: cardiac arrhythmias, electromechanics, cardiac modeling, ionic models, anatomical models

## 1. INTRODUCTION

The heart is an electromechanical pump whose mechanical contraction is initiated by electrical activation, in a process called excitation-contraction coupling. In normal circumstances, contraction is highly synchronized, resulting in an efficient throughput of oxygenated blood to the body. Failure to do so can lead to sudden cardiac death. The contraction also affects excitation via the process called mechano-electrical feedback. An example of mechano-electrical feedback that has fatal consequences is commotio cordis (Maron and Estes, 2010), a long-known (Akenside, 1763; Meola, 1879; Nesbitt et al., 2001) phenomenon where a blow to the chest (even without damaging the heart) may cause ventricular fibrillation. Commotio cordis is still an important cause of sudden cardiac death in young athletes (Maron, 2003). Mechano-electrical feedback arises from several factors, including stretch-activated ionic channels (Kohl et al., 2001). Although much is already known about the subcellular contributions to mechano-electrical feedback (Quinn et al., 2014), it is still unclear how these translate to macroscopic scales. Computational models can further help understand the mechanisms and consequences of cardiac mechano-electrical feedback up to the organ level.

The heart is mostly modeled as a continuum via partial differential equations (PDEs). For the spatial coupling between cells, the cardiac mono- or bidomain equations (Keener and Sneyd, 2009) are commonly used, in which any specific model for individual cardiac cells can be inserted. For the mechanical problem, the most commonly used are the PDEs of finite (hyper)elasticity (Nash and Hunter, 2000). The joint solution of these equations is a considerable numerical challenge. The difficulties largely originate from the different physical interactions that occur on a wide range of spatial and temporal scales (Plank et al., 2008; Keyes et al., 2013). The multiphysics nature makes it impossible to use a general-purpose black-box solver for this task. Solvers can only be optimal if they use as much information as possible about the problem. For example, implicit/explicit integrators need to know which processes are fast or slow, field-split preconditioners (Brown et al., 2012; Liu and Keyes, 2015) need to be able to extract fields belonging to different physics, and multigrid (Briggs et al., 2000; Trottenberg et al., 2000) and domain decomposition (Quarteroni and Valli, 1999; Smith et al., 2004) solvers need information about the meshes and discretizations.

In recent years, computational modeling of cardiac electromechanics has become an active field of research (see e.g., Göktepe and Kuhl, 2010; Lafortune et al., 2012; Land et al., 2012; Fritz et al., 2014; Rossi et al., 2014; Franzone et al., 2015; Augustin et al., 2016). However, different groups often use different descriptions for the same problems, with different forms for deformation-dependent conduction tensors and sometimes convective terms in the undeformed configuration. In addition, current electromechanics codes are often the result of ad hoc coupling between electrophysiology and finite elasticity codes, which limits time integration to first-order numerical schemes and leads to poor stability, although some approaches are known to address these stability issues (Niederer and Smith, 2008; Pathmanathan and Whiteley, 2009). This problem is common in other fields that use multiphysics (Keyes et al., 2013).

Our contributions in this paper are the following. First, we give a consistent derivation of the continuum equations of coupled electromechanics of the heart based on basic principles from geometry and physics, and we clarify the constitutive equations used. From this we show that there are no convective terms in the undeformed configuration and that the various deformation-dependent conduction tensors from the literature are all special cases of a more general form that we present here. Second, we generalize Euler-based implicit-explicit schemes for electromechanics to higher order implicit-explicit Runge-Kutta schemes, based on the knowledge of fast/slow dynamics. Third, we explain how to solve the resulting nonlinear implicit equations from a general multiphysics perspective.

This paper is structured as follows. In section 2 we introduce the necessary notations and concepts and present the strong and weak form for the continuum electromechanics equations, followed by a brief discussion on how to discretize the weak form equations using finite elements in section 3. Next, we discuss how to discretize the electromechanics equations in time using implicit-explicit schemes and how to solve the resulting nonlinear equations in section 4. Finally, in section 5 we explain how we implemented this using PETSc (Balay et al., 1997, 2016a,b) in our GEMS (Ghent ElectroMechanics Solver) code, and give examples of numerical results in section 6.

### 2. PHYSICS

In this section we introduce the mathematical basis for physical modeling in the moving domain, distinguishing between the Eulerian and Lagrangian viewpoints. Then we show how the balance equations (i.e., physical conservation laws) need to be closed by constitutive equations. When a symmetry is imposed (e.g., a locally uniaxial medium), the constitutive equations involving tensors cannot be chosen freely, but need to be of a certain form, which we propose and discuss here. We conclude by splitting the equations into fast and slow components, which will be treated implicitly and explicitly, respectively, during time stepping in section 4. At the end of this section, we will have cast the modeling equations in variational form, suitable for use in the finite element approach.

### 2.1. Definitions and Notation for Geometrical Concepts

To formulate the problem of electromechanics, it is important to understand the underlying geometry. Since we will consider continuum equations here, it is natural to consider them on a manifold, i.e., a "curved" space which locally resembles Euclidean space. For additional background material we refer to Marsden and Hughes (1994) and Frankel (2012).

Let B be the material manifold of dimension m. This is a reference manifold for our body. For an excitable surface, m = 2 and for a three-dimensional tissue, m = 3. On every patch of B, we define material coordinates $X^I$, $I = 1, \ldots, m$.

The space in which the body moves is given by the spatial manifold S (which is sometimes called the ambient or target manifold), of dimension n. For example, if an excitable surface is restricted to move in a plane, n = 2. However, in the general case where the tissue can move in 3D, n = 3. On every patch of S, we define spatial coordinates $x^i$.

We will assume that we have a metric for these manifolds, which we denote by G and g, respectively, so that we have Riemannian manifolds. In the simplest case (which we will use further) S will be n-dimensional Euclidean space, such that $x^i$ are Cartesian coordinates x, y, z, and B will be an open subset of Euclidean m-dimensional space. However, a non-Euclidean metric on B can be important in growth and remodeling phenomena (Ozakin and Yavari, 2010), e.g., hypertrophy and thermoelasticity (Yavari, 2010).

A configuration of B is a mapping φ : B → S which represents the deformation of the body, and we will often use the notation $x^i = \phi^i$. The set of all configurations of B is called the configuration space $\mathcal{C}$ and is an infinite-dimensional manifold.

The tangent map Tφ : TB → TS, Tφ(X, V) = (φ(X), Dφ(V)) is called the deformation gradient F and reads $F^i{}_I = \partial \phi^i / \partial X^I$ in components. This tells us how a tangent vector at a point X ∈ B transforms under φ.

Another important concept is the deformation tensor C, which is the pullback of the metric g: $C = \phi^* g$, or in components $C_{IJ} = F^i{}_I \, g_{ij} \, F^j{}_J$. Note that the squared infinitesimal distance between nearby points with coordinates $X^I$ and $X^I + dX^I$, or $x^i$ and $x^i + dx^i$, is $ds^2 = g_{ij} \, dx^i dx^j = C_{IJ} \, dX^I dX^J$, showing that $C_{IJ}$ is a measure for how lengths and angles between fixed pairs of points in the tissue change under a deformation. If we pull back the volume form dv on S to B, we get $\phi^* dv = J \, dV$, where dV is the volume form on B and $J = \frac{\sqrt{\det g}}{\sqrt{\det G}} \det F$ is the Jacobian of the deformation.

The strain in the tissue will depend on how the current lengths and angles relate to the reference case, which is quantified by the strain tensor $E = \frac{1}{2} \left( \phi^* g - G \right)$. Since φ is an isometry only if $\phi^* g = G$, E measures the deviation between the current deformation and an isometry.
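The kinematic quantities defined above can be checked numerically. The following minimal sketch assumes m = n = 3, Cartesian coordinates and, by default, identity metrics; the function name `kinematics` and the two test deformations are illustrative choices.

```python
import numpy as np


def kinematics(F, g=None, G=None):
    """Return (C, E, J) for a square deformation gradient F (m = n)."""
    F = np.asarray(F, dtype=float)
    n = F.shape[0]
    g = np.eye(n) if g is None else np.asarray(g, dtype=float)   # spatial metric
    G = np.eye(n) if G is None else np.asarray(G, dtype=float)   # material metric
    C = F.T @ g @ F                       # C_IJ = F^i_I g_ij F^j_J
    E = 0.5 * (C - G)                     # strain: deviation from an isometry
    J = np.sqrt(np.linalg.det(g) / np.linalg.det(G)) * np.linalg.det(F)
    return C, E, J


# Volume-preserving stretch of 20% along the first axis.
stretch = 1.2
F_stretch = np.diag([stretch, stretch**-0.5, stretch**-0.5])
C, E, J = kinematics(F_stretch)
print(C[0, 0], E[0, 0], J)

# A rigid rotation is an isometry, so its strain tensor vanishes.
angle = np.pi / 6
F_rot = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle), np.cos(angle), 0.0],
                  [0.0, 0.0, 1.0]])
print(np.allclose(kinematics(F_rot)[1], 0.0))
```

The rotation case makes the role of E concrete: F differs from the identity, but C = G, so no strain is registered.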

In cardiac contraction, the configuration (or deformation) φ is time-dependent, which can be represented by a curve in the configuration space $\mathcal{C}$, i.e., a mapping $\mathbb{R} \to \mathcal{C};\ t \mapsto \phi_t$, called the motion. The material velocity and acceleration are then defined to be respectively the first and second time derivatives of the motion. Their components are given by $V^i = \partial \phi^i / \partial t$ and $A^i = \partial V^i / \partial t + (\gamma^i_{jk} \circ \phi) \, V^j V^k$, where $\gamma^i_{jk}$ are the connection coefficients on S. Since we use Euclidean space for S, we have $\gamma^i_{jk} = 0$.

At this point, it is useful to discuss the Eulerian and Lagrangian viewpoints. Given the above definitions, any objects that are defined on B are called Lagrangian or material, while the concepts defined on S are called Eulerian or spatial. The Lagrangian and Eulerian point of view are equivalent, because anything that is defined in one can be transformed to the other. For cardiac tissue it is natural to use the Lagrangian framework. This has the advantage that we do not need convective derivatives in the description.

To model the cardiac microstructure, i.e., the fiber, sheet and normal direction, we will use frame fields, which are also called vielbeins in physics. Frame fields are a set of orthonormal vector fields. They span at each point of a manifold a basis for the tangent space. If G is the metric of our (material) manifold and $\{E_A\}_{A=1}^{m}$ the frame field, the orthonormality condition is

$$G(E\_A, E\_B) = \delta\_{AB}.\tag{1}$$

The dual of the frame field is denoted $E^A$ (with upper indices) and called the coframe field. It is defined to obey $E^A(E_B) = \delta^A_B$, such that it can be used to write the metric in the simple form

$$G = \sum\_{A=1}^{m} E^A \otimes E^A. \tag{2}$$

We will denote the components of the frame field $E_A$ in the coordinate basis by $E^I{}_A$ and of the coframe field $E^A$ by $E^A{}_I$.
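Equations (1) and (2) can be verified for a concrete frame. The sketch below uses a hypothetical non-Euclidean material metric and a frame chosen to be orthonormal with respect to it; storing the frame components $E^I{}_A$ as matrix columns and the coframe components $E^A{}_I$ as rows of the inverse matrix is a convention adopted for this example only.

```python
import numpy as np

# Hypothetical non-Euclidean material metric (components G_IJ).
G = np.diag([4.0, 1.0, 1.0])

# Columns are the frame vectors E_A (components E^I_A): a rotation about the
# third axis, rescaled so that the columns are orthonormal with respect to G.
angle = np.pi / 5
rotation = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                     [np.sin(angle), np.cos(angle), 0.0],
                     [0.0, 0.0, 1.0]])
frame = np.diag([0.5, 1.0, 1.0]) @ rotation

# Equation (1): G(E_A, E_B) = delta_AB.
gram = frame.T @ G @ frame

# Rows of the inverse are the coframe components E^A_I, because
# E^A(E_B) = delta^A_B is the statement coframe @ frame = identity.
coframe = np.linalg.inv(frame)

# Equation (2): G_IJ = sum_A E^A_I E^A_J.
G_rebuilt = coframe.T @ coframe

print(np.allclose(gram, np.eye(3)), np.allclose(G_rebuilt, G))
```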

#### 2.2. Balance Equations

Although the bidomain and elasticity equations are well-known, we will still derive the equations of cardiac electromechanics here for consistency, starting from basic continuum balance laws. This will allow us to explicitly mention the assumptions and approximations made, and to emphasize that cardiac electromechanics is more than just the sum of bidomain and elasticity equations, giving rise to more complicated constitutive equations (such as deformation-dependent conduction tensors).

Our starting point is a set of physical conservation laws: balance of charge in the intra- and extracellular domains, no accumulation of total charge, balance of momentum, and the dynamics of the internal variables (such as gating variables and ionic concentrations):

$$\frac{\partial Q_i}{\partial t} + \text{DIV}\,J_i = -I_{ion}, \tag{3a}$$

$$\frac{\partial Q_e}{\partial t} + \text{DIV}\,J_e = I_{ion}, \tag{3b}$$

$$\frac{\partial (Q_i + Q_e)}{\partial t} = 0, \tag{3c}$$

$$\rho_{\text{Ref}} A - \text{DIV}\,P - \rho_{\text{Ref}} B = 0, \tag{3d}$$

$$\frac{\partial \Gamma}{\partial t} = R, \tag{3e}$$

where $Q_i$ and $Q_e$ are the intra- and extracellular charge densities, $J_i$ and $J_e$ are the intra- and extracellular current densities, $\rho_{\text{Ref}}$ is the reference mass density, A is the acceleration, P is the first Piola-Kirchhoff stress tensor, B is the body force (e.g., gravity), Γ is a column matrix of the internal variables and R are their reaction rates. Note that all quantities live on the material manifold $\mathcal{B}$ and DIV is the divergence operator on $\mathcal{B}$.

The assumptions in the bidomain formulation are the following. First, the cell membrane can be modeled as a capacitor: $Q_i - Q_e = 2 C_m V_m$, where $C_m$ is the capacitance per volume and $V_m = V_i - V_e$ the transmembrane voltage. Second, the intra- and extracellular spaces are ohmic conductors, with intra- and extracellular conductivities $\Sigma_i$ and $\Sigma_e$. Thus we get the following set of equations:

$$\frac{\partial (C_m V_m)}{\partial t} + \text{DIV}\left(\Sigma_i \cdot \text{GRAD}\,V_m\right) + \text{DIV}\left(\Sigma_i \cdot \text{GRAD}\,V_e\right) = -I_{ion}, \tag{4a}$$

$$\text{DIV}\left(\Sigma_i \cdot \text{GRAD}\,V_m\right) + \text{DIV}\left(\Sigma_{i+e} \cdot \text{GRAD}\,V_e\right) = 0, \tag{4b}$$

$$\frac{\partial \Gamma}{\partial t} = R, \tag{4c}$$

$$\rho_{\text{Ref}} A - \text{DIV}\,P - \rho_{\text{Ref}} B = 0. \tag{4d}$$

An assumption often made in cardiac mechanics is the neglect of the inertial term ρRefA. This is justified because sound waves occur on a much faster time scale than the electrical waves in cardiac tissue: the ratio of the speed of sound to conduction velocity is around 25. This was also validated numerically in an electromechanical model of a 1D fiber (Whiteley et al., 2007).

#### 2.3. Constitutive Equations

To close Equations (4) we need to specify constitutive equations for $\Sigma_i$, $\Sigma_e$, $I_{ion}$, R, and P. We will only consider the dependencies as pointwise functions of material position X, transmembrane potential $V_m$, internal variables Γ and deformation C:

$$\Sigma_i = \hat{\Sigma}_i(X, C), \tag{5a}$$

$$\Sigma_e = \hat{\Sigma}_e(X, C), \tag{5b}$$

$$I_{ion} = \hat{I}_{ion}(X, V_m, \Gamma, C), \tag{5c}$$

$$R = \hat{R}(X, V_m, C), \tag{5d}$$

$$P = F \hat{S}(X, \Gamma, C). \tag{5e}$$

Instead of working with a function $\hat{P}$ for the first Piola-Kirchhoff stress tensor, we directly work with a function $\hat{S}$ for the second Piola-Kirchhoff stress tensor, because it is symmetric. It is also possible that the material capacitance depends on deformation, and therefore we write $C_m = \hat{C}_m(X, C)$. Based on the symmetries of the material we can deduce more specific representations for the scalar ($\hat{I}_{ion}$, $\hat{R}$, and $\hat{C}_m$) and symmetric second order tensor functions ($\hat{\Sigma}_i$, $\hat{\Sigma}_e$, $\hat{S}$). Because of the specific microstructure of cardiac tissue, we only focus on orthotropic materials, but more general symmetries based on crystal groups are possible (Smith, 2012). For the following we will use the notation $\{E_A\}_{A \in \{F, S, N\}}$ for the local fiber, sheet and sheet normal directions.

Let us start with the scalar functions. It can be shown (Itskov, 2013) that every scalar-valued function of a symmetric rank-2 tensor M, such as the deformation tensor C, the second Piola-Kirchhoff stress tensor S and the conduction tensors $\Sigma_i$, $\Sigma_e$, which is invariant under orthotropic symmetries can be written as a function of the seven invariants $\{M_{FF}, M_{SS}, M_{NN}, (M_{FS})^2, (M_{FN})^2, (M_{SN})^2, M_{FS} M_{SN} M_{NF}\}$. If det(M) = 1 (e.g., when M is the deformation tensor of an incompressible material), these seven invariants are not independent anymore and we can leave out the last one. In that case our scalar constitutive equations would be a function of the six invariants $\{M_{FF}, M_{SS}, M_{NN}, (M_{FS})^2, (M_{FN})^2, (M_{SN})^2\}$. Often $\hat{I}_{ion}$ and $\hat{R}$ are taken to be a function of the fiber stretch $\lambda = \sqrt{C_{FF}}$ only, see for example Niederer et al. (2006) and Panfilov et al. (2007).
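To make the invariants concrete, the sketch below evaluates them for a deformation tensor and checks that they are unchanged when the fiber axis is reflected, which is one of the orthotropic symmetries. It assumes a Euclidean metric and an orthonormal frame stored as matrix columns; the function name and the sample deformation gradient are hypothetical.

```python
import numpy as np


def orthotropic_invariants(M, frame):
    """Seven orthotropic invariants of a symmetric tensor M.

    Columns of `frame` are the orthonormal fiber, sheet and normal vectors
    (Euclidean metric assumed).
    """
    frame = np.asarray(frame, dtype=float)
    M_local = frame.T @ np.asarray(M, dtype=float) @ frame   # M_AB = E_A . M . E_B
    M_FF, M_SS, M_NN = np.diag(M_local)
    M_FS, M_FN, M_SN = M_local[0, 1], M_local[0, 2], M_local[1, 2]
    return np.array([M_FF, M_SS, M_NN, M_FS**2, M_FN**2, M_SN**2,
                     M_FS * M_SN * M_FN])


# Sample deformation with shear (hypothetical values), frame aligned with axes.
F = np.array([[1.10, 0.20, 0.00],
              [0.00, 0.95, 0.10],
              [0.05, 0.00, 1.00]])
C = F.T @ F
frame = np.eye(3)
invariants = orthotropic_invariants(C, frame)

# Reflecting the fiber axis flips the sign of M_FS and M_FN only.
Q = np.diag([-1.0, 1.0, 1.0])
invariants_reflected = orthotropic_invariants(Q @ C @ Q.T, frame)

fiber_stretch = np.sqrt(invariants[0])        # lambda = sqrt(C_FF)
print(np.allclose(invariants, invariants_reflected), fiber_stretch)
```

The individual off-diagonal components change sign under the reflection, which is why only their squares and the triple product appear in the list.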

Orthotropic tensor-valued functions T of a symmetric tensor M can be shown to be of the form (Itskov, 2013)

$$
\begin{aligned}
\hat{T}(M) = \sum_{A \in \{F, S, N\}} \Big[ &\hat{\alpha}_A \left( E_A \otimes E_A \right) \\
&+ \frac{\hat{\beta}_A}{2} \left( M \cdot E_A \otimes E_A + E_A \otimes E_A \cdot M \right) \\
&+ \frac{\hat{\gamma}_A}{2} \left( M^2 \cdot E_A \otimes E_A + E_A \otimes E_A \cdot M^2 \right) \\
&+ \frac{\hat{\delta}_A}{2} \left( M \cdot E_A \otimes E_A - E_A \otimes E_A \cdot M \right) \\
&+ \frac{\hat{\epsilon}_A}{2} \left( M^2 \cdot E_A \otimes E_A - E_A \otimes E_A \cdot M^2 \right) \Big],
\end{aligned} \tag{6}
$$

where $\hat{\alpha}$, $\hat{\beta}$, $\hat{\gamma}$, $\hat{\delta}$, and $\hat{\epsilon}$ are now scalar-valued functions of M. Note that for $\hat{T}(M)$ symmetric $\hat{\delta}_A = \hat{\epsilon}_A = 0$, while for $\hat{T}(M)$ antisymmetric $\hat{\alpha}_A = \hat{\beta}_A = \hat{\gamma}_A = 0$.

When we write out this expression in components of the $E_A$ frame (A, B ∈ {F, S, N}, no summation implied) we get:

$$
\hat{T}_{AB}(M) = \frac{\hat{\alpha}_A + \hat{\alpha}_B}{2} \delta_{AB} + \frac{\hat{\beta}_A + \hat{\beta}_B + \hat{\delta}_A - \hat{\delta}_B}{2} M_{AB} + \frac{\hat{\gamma}_A + \hat{\gamma}_B + \hat{\epsilon}_A - \hat{\epsilon}_B}{2} (M^2)_{AB}. \tag{7}
$$

The second Piola-Kirchhoff stress tensor S is symmetric and in the case that it is hyperelastic (such that $\hat{S}^{IJ}(C) = 2 \, \partial \hat{\psi} / \partial C_{IJ}$, where $\hat{\psi}$ is a function of the invariants), the constitutive equation simplifies to

$$\hat{S}(C) = \sum_{A \in \{F, S, N\}} \left[ \hat{\alpha}_A \left( E_A \otimes E_A \right) + \frac{\hat{\beta}_A}{2} \left( C \cdot E_A \otimes E_A + E_A \otimes E_A \cdot C \right) \right] + \hat{\gamma} \, C^2. \tag{8}$$

For ventricular cardiac tissue, the Guccione (Guccione et al., 1995) and Holzapfel-Ogden (Holzapfel and Ogden, 2009) constitutive equations are popular choices.

Throughout the literature on cardiac electromechanical modeling, several deformation-dependent conduction tensors have been proposed. The simplest form is obtained by making the conduction coefficients $\Sigma_A$ dependent on the stretch along the principal material directions: with $\lambda_A = \sqrt{C(E_A, E_A)}$,

$$
\hat{\Sigma}(\mathbf{C}) = \sum\_{A \in \{F, S, N\}} \hat{\Sigma}\_A(\lambda\_A) E\_A \otimes E\_A \tag{9}
$$

Examples of these are $\hat{\Sigma}_A(\lambda_A) = \Sigma_A$, i.e., deformation-independent or "gap-junction based" conduction (Bakir and Dokos, 2015), or $\hat{\Sigma}_A(\lambda_A) = \Sigma_A / \lambda_A^2$ (Colli Franzone et al., 2016). Yet another form for the conduction tensor can be found in Bakir and Dokos (2015), which they call "spatially based" conduction:

$$\hat{\Sigma}(C) = J \, U^{-1} \cdot \left( \sum_{A \in \{F, S, N\}} \Sigma_A(\lambda_A) \, E_A \otimes E_A \right) \cdot U^{-T}, \tag{10}$$

where U is the right stretch tensor, i.e., $U = \sqrt{C}$. A related form is (Sachse, 2004):

$$\hat{\Sigma}(C) = W \cdot \left( \sum_{A \in \{F, S, N\}} \Sigma_A(\lambda_A) \, E_A \otimes E_A \right) \cdot W^T, \tag{11}$$

where $W = U^{-1} \left( 1 + \theta (U - 1) \right)$ and θ ∈ [0, 1] is a parameter which reduces this conduction tensor to the "spatially based" conduction for θ = 0 (apart from the Jacobian factor), since then $W = U^{-1}$, and to a "gap-junction based" conduction for θ = 1, since then W = 1.

In Göktepe and Kuhl (2010) and Göktepe et al. (2013) the following transversely isotropic form

$$
\hat{\Sigma}(\mathbf{C}) = \Sigma\_{iso}\mathbf{C}^{-1} + \Sigma\_{ani}E\_F \otimes E\_F \tag{12}
$$

was used and in Plank et al. (2013):

$$\hat{\Sigma}(C) = \left( \sum_{A \in \{F, S, N\}} \Sigma_A \, E_A \otimes E^A \right) \cdot C^{-1}. \tag{13}$$

This variety of deformation-dependent conduction tensors is mostly a consequence of the assumptions that were made about the conduction coefficients, for example that the conduction coefficients are constant in the spatial or in the material frame. However, nothing says a priori that these should be constant at all. So to obtain realistic deformation-dependent conduction tensor relationships, the conduction coefficients should be based on measurements at different deformations.
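The relation between several of these forms can be explored numerically. The sketch below implements Equations (10), (11) and (13) for constant coefficients $\Sigma_A$, assuming Cartesian coordinates, an orthonormal frame stored as matrix columns, and the reading $W = U^{-1}(1 + \theta(U - 1))$ of the interpolation tensor. The function names and the numerical values are hypothetical.

```python
import numpy as np


def right_stretch(C):
    """U = sqrt(C) for a symmetric positive-definite C."""
    eigenvalues, eigenvectors = np.linalg.eigh(C)
    return eigenvectors @ np.diag(np.sqrt(eigenvalues)) @ eigenvectors.T


def material_conduction(sigma, frame):
    """sum_A Sigma_A E_A (x) E_A with constant coefficients."""
    return frame @ np.diag(sigma) @ frame.T


def spatially_based(F, sigma, frame):
    """Equation (10): J U^-1 . D . U^-T."""
    U_inv = np.linalg.inv(right_stretch(F.T @ F))
    return np.linalg.det(F) * U_inv @ material_conduction(sigma, frame) @ U_inv.T


def interpolated(F, sigma, frame, theta):
    """Equation (11) with W = U^-1 (1 + theta (U - 1))."""
    U = right_stretch(F.T @ F)
    identity = np.eye(3)
    W = np.linalg.inv(U) @ (identity + theta * (U - identity))
    return W @ material_conduction(sigma, frame) @ W.T


def pulled_back(F, sigma, frame):
    """Equation (13): D . C^-1."""
    return material_conduction(sigma, frame) @ np.linalg.inv(F.T @ F)


sigma = np.array([3.0, 1.0, 0.5])          # hypothetical Sigma_F, Sigma_S, Sigma_N
frame = np.eye(3)
F = np.array([[1.10, 0.20, 0.00],
              [0.00, 0.95, 0.10],
              [0.05, 0.00, 1.00]])

D = material_conduction(sigma, frame)
print(np.allclose(interpolated(F, sigma, frame, 1.0), D))
print(np.allclose(np.linalg.det(F) * interpolated(F, sigma, frame, 0.0),
                  spatially_based(F, sigma, frame)))
```

For a stretch aligned with the frame, such as `np.diag([1.2, 0.9, 1.0])`, U is diagonal and Equation (10) equals J times Equation (13), so in that special case these two forms differ only by the Jacobian factor.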

#### 2.4. Variational Formulation

In view of the time-integration methods which will be presented in section 4.1, let us split $R$ into fast processes (to be treated implicitly) and slow processes (to be treated explicitly): $R = R_I + R_E$. Furthermore, let $P_{\text{appl}}$ denote the applied pressure on the pressure boundary of the deformation $\phi$ (e.g., the fluid pressure at the endocardial surfaces). Writing the fast processes on the left-hand side and the slow processes on the right-hand side, the weak or variational form for electromechanics can be written as: find $V_m$, $V_e$, $\Gamma$, $\phi$ such that

$$\int_{\mathcal{B}} \delta V_m \frac{\partial V_m}{\partial t}\, dV + \int_{\mathcal{B}} \delta V_m|_I\, (\Sigma_i)^{IJ} \left( V_m|_J + V_e|_J \right) dV = -\int_{\mathcal{B}} \delta V_m\, I_{\text{ion}}\, dV, \tag{14a}$$

$$\int_{\mathcal{B}} \delta V_e|_I \left( (\Sigma_i)^{IJ}\, V_m|_J + (\Sigma_{i+e})^{IJ}\, V_e|_J \right) dV = 0, \tag{14b}$$

$$\int_{\mathcal{B}} \delta \Gamma \left( \frac{\partial \Gamma}{\partial t} - R_I \right) dV = \int_{\mathcal{B}} \delta \Gamma\, R_E\, dV, \tag{14c}$$

$$\int_{\mathcal{B}} \delta \phi^i|_I\, P_i^I\, dV + \int_{\partial_N \mathcal{B}} \delta \phi^i\, P_{\text{appl}}\, J \left( F^{-1} \right)^I_i N_I\, dS = 0, \tag{14d}$$

for all test functions $\delta V_m$, $\delta V_e$, $\delta \Gamma$, $\delta \phi$. The notation $|_I$ was introduced for the $I$-th component of the covariant derivative, i.e., $V_m|_I = \frac{\partial V_m}{\partial X^I}$ and $\delta \phi^i|_I = \frac{\partial (\delta \phi^i)}{\partial X^I} + \gamma^i_{jk}\, \delta \phi^j F^k_I$ (again, for Euclidean $\mathcal{S}$ the connection $\gamma$ vanishes).

Note that we can write any left-hand side of (14) in the following form:

$$\int_{\mathcal{B}} \left( \boldsymbol{v} \cdot \boldsymbol{f}_0 + \nabla \boldsymbol{v} \cdot \boldsymbol{f}_1 \right) dV + \int_{\partial_N \mathcal{B}} \boldsymbol{v} \cdot \boldsymbol{g}_0\, dS \tag{15}$$

where $\boldsymbol{v}$ represents any of the test functions and $\boldsymbol{f}_0$, $\boldsymbol{f}_1$, and $\boldsymbol{g}_0$ are general functions of $V_m$, $V_e$, $\Gamma$, and $\phi$, their gradients and time derivatives, time and spatial coordinates. More specifically, we can summarize all the fast physics by pointwise functions in the following table:

$$\begin{array}{c|ccc} & f_0 & f_1 & g_0 \\ \hline V_m & \frac{\partial V_m}{\partial t} & \Sigma_i \cdot \nabla V_m + \Sigma_i \cdot \nabla V_e & \\ V_e & & \Sigma_i \cdot \nabla V_m + \Sigma_{i+e} \cdot \nabla V_e & \\ \Gamma & \frac{\partial \Gamma}{\partial t} - R_I & & \\ \phi & & P & P_{\text{appl}}\, J F^{-T} \cdot N \end{array} \tag{16}$$

For implicit time integration we will also need the Jacobian of the left-hand side. Its action on the increments $\Delta V_m$, $\Delta V_e$, $\Delta \Gamma$, and $\Delta \phi$ is given by

$$\int_{\mathcal{B}} \delta V_m\, \gamma\, \Delta V_m\, dV + \int_{\mathcal{B}} \delta V_m|_I\, (\Sigma_i)^{IJ}\, \Delta V_m|_J\, dV \tag{17a}$$

$$\int_{\mathcal{B}} \delta V_m|_I\, (\Sigma_i)^{IJ}\, \Delta V_e|_J\, dV \tag{17b}$$

$$\int_{\mathcal{B}} \delta V_e|_I\, (\Sigma_i)^{IJ}\, \Delta V_m|_J\, dV \tag{17c}$$

$$\int_{\mathcal{B}} \delta V_e|_I\, (\Sigma_{i+e})^{IJ}\, \Delta V_e|_J\, dV \tag{17d}$$

$$\int_{\mathcal{B}} \delta \Gamma \left( \gamma - \frac{\partial R_I}{\partial \Gamma} \right) \Delta \Gamma\, dV \tag{17e}$$

$$\int_{\mathcal{B}} \delta \phi^i|_I\, A_{i\;j}^{I\;J}\, \Delta \phi^j|_J\, dV + \int_{\partial_N \mathcal{B}} \delta \phi^i\, P_{\text{appl}}\, B_{ij}^{J}\, \Delta \phi^j|_J\, dS \tag{17f}$$

where $\gamma$ is the shift factor determined by the numerical integration scheme (for example, for backward Euler with time step $h$, $\gamma = h^{-1}$) and

$$B_{ij}^{J} = \frac{\partial \left( J \left( F^{-1} \right)^I_i N_I \right)}{\partial F^j_J} = J N_I \left( \left( F^{-1} \right)^I_i \left( F^{-1} \right)^J_j - \left( F^{-1} \right)^I_j \left( F^{-1} \right)^J_i \right), \tag{18}$$

and

$$A_{i\;j}^{I\;J} = \frac{\partial P_i^I}{\partial F^j_J} \tag{19}$$

is called the first elasticity tensor (Marsden and Hughes, 1994). The expressions (17) can generally be written as

$$\int_{\mathcal{B}} \begin{bmatrix} \boldsymbol{v}^T & \nabla \boldsymbol{v}^T \end{bmatrix} \begin{bmatrix} \boldsymbol{f}_{0,0} & \boldsymbol{f}_{0,1} \\ \boldsymbol{f}_{1,0} & \boldsymbol{f}_{1,1} \end{bmatrix} \begin{bmatrix} \boldsymbol{w} \\ \nabla \boldsymbol{w} \end{bmatrix} dV + \int_{\partial_N \mathcal{B}} \boldsymbol{v}^T \begin{bmatrix} \boldsymbol{g}_{0,0} & \boldsymbol{g}_{0,1} \end{bmatrix} \begin{bmatrix} \boldsymbol{w} \\ \nabla \boldsymbol{w} \end{bmatrix} dS \tag{20}$$

where $\boldsymbol{w}$ represents any of the increments, and the pointwise Jacobians can be summarized as

$$\begin{array}{c|ccc} & f_{0,0} & f_{1,1} & g_{0,1} \\ \hline \{V_m, V_m\} & \gamma & \Sigma_i & \\ \{V_m, V_e\} & & \Sigma_i & \\ \{V_e, V_m\} & & \Sigma_i & \\ \{V_e, V_e\} & & \Sigma_{i+e} & \\ \{\Gamma, \Gamma\} & \gamma - \frac{\partial R_I}{\partial \Gamma} & & \\ \{\phi, \phi\} & & A & P_{\text{appl}} B \end{array} \tag{21}$$

where, for example, $\{V_m, V_e\}$ refers to the derivative of the weak equation for $V_m$ with respect to $V_e$.

#### 3. DISCRETIZATION

In this section we apply standard methods to express the variational equations in a finite element basis, to obtain a large non-linear system to solve instead of continuum partial differential equations.

We will use the finite element method (Ciarlet, 2002; Brenner and Scott, 2007; Zienkiewicz et al., 2013) to spatially discretize the weak forms (14). Let the manifold $\mathcal{B}$ be triangulated into $E$ $m$-simplices $\{K^e\}_{e=1}^{E}$ (cells/elements), each diffeomorphic to the standard $m$-simplex $\hat{K}$ (with coordinates $\xi^{\hat{I}}$): for each $e$ there is a coordinate map $\hat{X}^e: \hat{K} \to K^e$ for which the element Jacobian $(J^e)^I_{\hat{I}} = \frac{\partial \hat{X}^I}{\partial \xi^{\hat{I}}}$ and its inverse exist (and are continuous). If we also choose a function space $P$ and a basis for the dual space $\Sigma$ over each element, the triple $(K, P, \Sigma)$ defines the finite element (Ciarlet, 2002). Here we will only use first-order Lagrange elements (Brenner and Scott, 2007). Let $\{\varphi_p\}_{p=1}^{\dim P}$ denote the basis functions for $P$ and let $\{\xi_q\}_{q=1}^{Q}$ and $\{w_q\}_{q=1}^{Q}$ be the quadrature points and weights of a quadrature rule with $Q$ quadrature points (e.g., Gauss-Jacobi in the case of simplices; Karniadakis and Sherwin, 2013). Then we can define the element basis evaluation, derivative and integration matrices as

$$(B^e)_{qp} = \varphi_p(\xi_q), \qquad (D^e_I)_{qp} = \frac{\partial \varphi_p}{\partial \xi^{\hat{I}}}(\xi_q) \left( \left( J^e \right)^{-1} \right)^{\hat{I}}_I \qquad \text{and} \qquad (W^e)_{qp} = \delta_{qp}\, w_q \det \left( J^e \right).$$

Following Brown (2010) and Knepley et al. (2013), we discretize the volume terms

$$\int_{\mathcal{B}} \left( \boldsymbol{v} \cdot \boldsymbol{f}_0 + \nabla \boldsymbol{v} : \boldsymbol{f}_1 \right) dV \tag{22}$$

as

$$\sum_e \mathcal{E}_e^T \left[ \left( B^e \right)^T W^e \Lambda^e(f_0) + \sum_I \left( D^e_I \right)^T W^e \Lambda^e(f_1^I) \right], \tag{23}$$

where $\mathcal{E}_e$ is the element restriction operator and $\Lambda^e$ transforms a function into function evaluations at the quadrature points. Note that a field $u$ is evaluated at the quadrature points as $u^e = B^e \mathcal{E}_e u$ and its derivatives as $\nabla_I u^e = D^e_I \mathcal{E}_e u$.
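The bracketed element contribution in (23) can be made concrete with a deliberately small C sketch for a single one-dimensional linear Lagrange element. This is a toy under stated assumptions, not the PETSc or GEMS implementation: the function name `element_residual`, the restriction to 1D and the use of a two-point Gauss-Legendre rule (instead of the Gauss-Jacobi rules used on simplices) are choices made for this example, and the pointwise values $f_0$ and $f_1$ are assumed to be already evaluated at the quadrature points.

```c
#include <assert.h>

#define NQ 2 /* quadrature points per element */
#define NP 2 /* basis functions per element (P1 line element) */

/* Element residual r_p = sum_q B_qp W_q f0_q + sum_q D_qp W_q f1_q
   for one linear element [x0, x1], i.e. the bracket in Eq. (23). */
void element_residual(double x0, double x1,
                      const double f0[NQ], const double f1[NQ],
                      double r[NP])
{
    /* Two-point Gauss-Legendre rule on the reference element [-1, 1]. */
    static const double xi[NQ] = {-0.5773502691896257, 0.5773502691896257};
    static const double w[NQ]  = {1.0, 1.0};
    const double detJ = 0.5 * (x1 - x0); /* element Jacobian dX/dxi */
    assert(detJ > 0.0);

    r[0] = 0.0;
    r[1] = 0.0;
    for (int q = 0; q < NQ; ++q) {
        /* B_qp = phi_p(xi_q); D_qp = dphi_p/dxi * (J^-1). */
        const double B[NP] = {0.5 * (1.0 - xi[q]), 0.5 * (1.0 + xi[q])};
        const double D[NP] = {-0.5 / detJ, 0.5 / detJ};
        const double W = w[q] * detJ; /* diagonal entry of W^e */
        for (int p = 0; p < NP; ++p)
            r[p] += B[p] * W * f0[q] + D[p] * W * f1[q];
    }
}
```

For $f_0 = f_1 = 1$ on the element $[0, 2]$ the two contributions are $(1, 1)$ and $(-1, 1)$, so the element residual is $(0, 2)$. In a full code the result would then be scattered into the global vector by $\mathcal{E}_e^T$.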

The boundary integrals

$$\int_{\partial_N \mathcal{B}} \boldsymbol{v} \cdot \boldsymbol{g}_0\, dS \tag{24}$$

are discretized as

$$\sum_f \mathcal{E}_{e(f)}^T \left[ \left( B^{e(f)} \right)^T W^f \Lambda^{e(f)}(g_0) \right], \tag{25}$$

where $e(f)$ refers to the neighboring element of $f$, i.e., we evaluate at the quadrature points of the face using the neighboring element's basis functions and field coefficients.

#### 4. ALGORITHMS

In this section we present IMEX integration schemes, the resulting non-linear equations and approaches to solve them numerically for the specific structure of the electromechanical equations.

### 4.1. Time Integration Using IMEX Schemes

For systems with multiple well-separated time scales, we have to choose the time scale that we are interested in. When studying the long-term or slowly varying behavior, the fast transient processes do not need to be fully resolved, as they decay rapidly. Such systems are called stiff (see Söderlind et al., 2015 for a discussion on stiffness). Note that in discretized PDEs, the fastest time scale often comes in the form of a Courant-Friedrichs-Lewy limit (Courant et al., 1928), making it mesh-dependent.

Explicit schemes require the time step to be of the same order as the fastest process for stability, so they are very inefficient for stiff systems. Implicit schemes can step over those fast processes, but the downside is that they produce large fully coupled nonlinear systems. Implicit-Explicit (IMEX) schemes combine the best of both worlds: they integrate the fast processes implicitly and the slow processes explicitly. A class of IMEX methods are Additive Runge-Kutta Implicit-Explicit (ARKIMEX) schemes (Ascher et al., 1997; Kennedy and Carpenter, 2001; Giraldo et al., 2013). They combine two s-stage methods (ERK and (ES)DIRK), summarized by two Butcher tableaus (Butcher, 2016)

$$\begin{array}{c|ccc} c_1^E & a_{11}^E & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ c_s^E & a_{s1}^E & \cdots & 0 \\ \hline & b_1^E & \cdots & b_s^E \end{array} \qquad \qquad \begin{array}{c|ccc} c_1^I & a_{11}^I & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ c_s^I & a_{s1}^I & \cdots & a_{ss}^I \\ \hline & b_1^I & \cdots & b_s^I \end{array} \tag{26}$$

additively to integrate equations of the following form

$$M\dot{\boldsymbol{y}} = f^I(\boldsymbol{y}, t) + f^E(\boldsymbol{y}, t), \tag{27}$$

where $y: I \to \mathbb{R}^N$ describes the evolution of the discretized state, $f^I$ and $f^E$ are the implicitly and the explicitly treated functions, respectively, and $M$ is a mass matrix. The implicit function contains the fast or stiff physics, whereas the explicit function contains the slow or non-stiff physics. Often $f^I$ is linear and $f^E$ non-linear. The $i$-th stage value $Y_i$ can then be computed as

$$Y_i = y_n + h \sum_{j=1}^{i-1} a_{ij}^E \dot{Y}_j^E + h \sum_{j=1}^{i} a_{ij}^I \dot{Y}_j^I, \tag{28}$$

where the implicit and explicit stage derivatives are given by $\dot{Y}_i^I = M^{-1} f^I(Y_i, t_n + c_i h)$ and $\dot{Y}_i^E = M^{-1} f^E(Y_i, t_n + c_i h)$. The difference between the two terms is that, for the explicit part, the stage $Y_i$ depends only on previous stages, whereas for the implicit part it also depends on the current stage. The numerical constants $a_{ij}^I$, $a_{ij}^E$ follow from the chosen integration scheme, see the Butcher tableaus (26).

After rearranging, Equation (28) produces a non-linear equation in $Y_i$ if $a_{ii}^I \neq 0$:

$$M \gamma (Y_i - Z_i) - f^I(Y_i, t_n + c_i h) = 0, \tag{29}$$

where $Z_i = y_n + h \sum_{j=1}^{i-1} \left( a_{ij}^E \dot{Y}_j^E + a_{ij}^I \dot{Y}_j^I \right)$ collects the known contributions of the previous stages and $\gamma = (h\, a_{ii}^I)^{-1}$ is the shift factor determined by the numerical integration scheme (for example, for backward Euler with time step $h$, $\gamma = h^{-1}$). The Jacobian for this equation is

$$
\gamma M - \frac{\partial f^I}{\partial y} (Y_i, t_n + c_i h) \tag{30}
$$

and is used while iteratively solving Equation (29) for $Y_i$. Thereafter, the implicit stage derivative can be simply found as

$$\dot{Y}_i^I = \gamma (Y_i - Z_i) \tag{31}$$

and the explicit stage derivative by evaluating the explicit function

$$\dot{Y}_i^E = M^{-1} f^E(Y_i, t_n + c_i h). \tag{32}$$

The solution at the next time step is then calculated as

$$y_{n+1} = y_n + h \sum_{i=1}^{s} b_i^E \dot{Y}_i^E + h \sum_{i=1}^{s} b_i^I \dot{Y}_i^I. \tag{33}$$

Note that if $f^I = 0$ we have a purely explicit scheme and if $f^E = 0$ we have a purely implicit scheme. In order to avoid the need to invert $M$, we will only use schemes for which $a_{si}^I = b_i^I$ and $a_{si}^E = b_i^E$, the so-called globally stiffly accurate schemes (Boscarino et al., 2013). Then, the completion step (33) can be skipped. For a more thorough discussion on the technical aspects, we refer to Kennedy and Carpenter (2001). In the context of electrophysiology, these schemes were previously applied only to single cell models, where they have been shown to outperform other integration schemes (Spiteri and Dean, 2008).
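The simplest member of this family, forward-backward Euler, can be sketched in a few lines of C. The example below is a hedged illustration on a hypothetical scalar model problem $\dot{y} = -\lambda y + f^E(y, t)$ with $M = 1$, not the electromechanical system: the stiff linear term is treated implicitly, the slow term explicitly, and the stage equation (29) is solved in closed form because $f^I$ is linear. All function names are assumptions made for this sketch.

```c
#include <assert.h>

/* Slow (non-stiff) right-hand side f_E(y, t), supplied by the caller. */
typedef double (*SlowRhs)(double y, double t);

/* One forward-backward (IMEX) Euler step of size h for
   dy/dt = -lambda*y + f_E(y, t).
   Z collects the known explicit contribution, gamma = 1/h is the shift,
   and gamma*(Y - Z) - f_I(Y) = 0 with f_I(Y) = -lambda*Y is solved exactly. */
double imex_euler_step(double y, double t, double h, double lambda,
                       SlowRhs f_E)
{
    assert(h > 0.0 && lambda >= 0.0);
    const double gamma = 1.0 / h;
    const double Z = y + h * f_E(y, t);
    return gamma * Z / (gamma + lambda);
}

/* Fully explicit Euler step on the same problem, for comparison. */
double explicit_euler_step(double y, double t, double h, double lambda,
                           SlowRhs f_E)
{
    return y + h * (-lambda * y + f_E(y, t));
}

/* Trivial slow part used in the usage example below. */
double zero_slow_rhs(double y, double t)
{
    (void)y;
    (void)t;
    return 0.0;
}
```

With $\lambda = 1000$, $h = 0.1$ and $y_n = 1$, the IMEX step returns $1/101$ and decays as the exact solution does, whereas the explicit step returns $-99$ and grows in magnitude at every further step. This is the stability gain that allows the time step to be chosen from the slow physics only.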

#### 4.2. Non-linear Solvers

The IMEX schemes allow us to put some of the complicated non-linear dependencies in the right-hand sides, making the implicit solve easier. Under the following assumptions, we can essentially solve the whole non-linear system by solving each subproblem one after another: the ionic current, the stretch-dependent terms in the cell models and the dependence of the tension variables on $\mathrm{Ca}_i$ or $V_m$ must be in the right-hand side (RHS). We can then solve for the stage values as follows: first solve the active tension internal variable equations, then solve the mechanical equations (14d), then solve the bidomain equations ((14a) and (14b)) together and finally solve the electrophysiological internal variable equations (14c). This approach is nothing more than the non-linear Gauss-Seidel method applied to the fields.


During this process, we solve the bidomain and, if possible, the implicit internal variables equations with a linear solver (to be specified below), while we solve the non-linear mechanical equations with Newton's method. If for some reason some of the above assumptions do not hold and coupling between variables is strong enough, more Gauss-Seidel sweeps are done to converge. Alternatively, one could use the above algorithm as a non-linear preconditioner (Liu and Keyes, 2015).
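The field-by-field strategy described above can be illustrated with a hypothetical two-field toy system in C. The equations, names and tolerances below are assumptions made for this sketch and have nothing to do with the actual electromechanical residuals; the point is only the structure: one field is solved with a linear subsolve, the other with Newton's method, and sweeps are repeated until the coupled residual is small.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical two-field toy system (not the electromechanical equations):
     F_a(a, b) = 4a - b - 2     = 0   (linear in its own field a)
     F_b(a, b) = b^3 + 2b - a   = 0   (non-linear in its own field b) */
double F_a(double a, double b) { return 4.0 * a - b - 2.0; }
double F_b(double a, double b) { return b * b * b + 2.0 * b - a; }

/* Solve F_b(a, b) = 0 for b with a frozen, using Newton's method. */
double solve_b_newton(double a, double b)
{
    for (int it = 0; it < 50 && fabs(F_b(a, b)) > 1e-14; ++it)
        b -= F_b(a, b) / (3.0 * b * b + 2.0);
    return b;
}

/* Non-linear block Gauss-Seidel: sweep over the fields, solving each
   subproblem with the latest values of the other field.
   Returns the number of sweeps used, or -1 if max_sweeps was reached. */
int nonlinear_gauss_seidel(double *a, double *b, double tol, int max_sweeps)
{
    assert(a && b && tol > 0.0);
    for (int sweep = 1; sweep <= max_sweeps; ++sweep) {
        *a = (2.0 + *b) / 4.0;       /* linear subsolve for field a */
        *b = solve_b_newton(*a, *b); /* Newton subsolve for field b */
        if (fabs(F_a(*a, *b)) + fabs(F_b(*a, *b)) < tol)
            return sweep;
    }
    return -1;
}
```

Because the coupling in this toy problem is weak, a handful of sweeps suffices; when the coupling is one-directional, as under the assumptions listed above, a single sweep already solves the coupled system.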

#### 4.3. Linear Solvers and Preconditioners

#### 4.3.1. Bidomain

We solve the discretized bidomain equations with conjugate gradients preconditioned by block preconditioners (Sundnes et al., 2002; Pennacchio and Simoncini, 2009; Bernabeu et al., 2010; Pavarino and Scacchi, 2011). For this we use PETSc's FieldSplit preconditioner, allowing us to flexibly choose between different strategies (Brown et al., 2012) from the command line. Both blocks are preconditioned with one V-cycle of PETSc's native algebraic multigrid preconditioner (GAMG). If no Dirichlet boundary conditions are given for the extracellular voltage, we also provide the constant nullspace vector to the respective block solve.

#### 4.3.2. Mechanics

We solve the linearized elasticity equations arising from Newton's method with conjugate gradients, preconditioned with PETSc's algebraic multigrid preconditioner. The difference here with previous work (Franzone et al., 2015; Gurev et al., 2015; Augustin et al., 2016) is that this algebraic multigrid preconditioner uses smoothed aggregation (Vaněk et al., 1996), which is more efficient for elasticity problems (Van et al., 2001; Adams, 2002). We provide the rigid body modes to PETSc's GAMG preconditioner to obtain more accurate coarse spaces, resulting in a significant drop in iterations. Here we use a full multigrid cycle as this also helps in lowering the number of iterations of the linear solver at the expense of only a small percentage more work than a single V-cycle.
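To make explicit what is meant by the rigid body modes, the following plain-C sketch fills the six modes of a three-dimensional displacement field from nodal coordinates. It is only an illustration of the vectors involved, under the assumption of an interlaced x, y, z layout; the function name is hypothetical, the modes are not orthonormalized, and in PETSc itself a routine such as `MatNullSpaceCreateRigidBody` would normally be used to build this space from a coordinate vector.

```c
#include <assert.h>
#include <stddef.h>

/* Fill the six rigid body modes of a 3D displacement field.
   coords: n nodes, interlaced as x0, y0, z0, x1, ...
   modes:  6 arrays of length 3*n with the same interlaced layout.
   Modes 0-2 are unit translations; modes 3-5 are infinitesimal
   rotations about the x, y and z axes, u = e_k x X. */
void rigid_body_modes(size_t n, const double *coords, double *modes[6])
{
    assert(coords && modes);
    for (size_t i = 0; i < n; ++i) {
        const double x = coords[3 * i];
        const double y = coords[3 * i + 1];
        const double z = coords[3 * i + 2];

        /* Translations along x, y, z. */
        for (int k = 0; k < 3; ++k)
            for (int d = 0; d < 3; ++d)
                modes[k][3 * i + d] = (d == k) ? 1.0 : 0.0;

        /* Rotation about x: e_x x X = (0, -z, y). */
        modes[3][3 * i] = 0.0;
        modes[3][3 * i + 1] = -z;
        modes[3][3 * i + 2] = y;
        /* Rotation about y: e_y x X = (z, 0, -x). */
        modes[4][3 * i] = z;
        modes[4][3 * i + 1] = 0.0;
        modes[4][3 * i + 2] = -x;
        /* Rotation about z: e_z x X = (-y, x, 0). */
        modes[5][3 * i] = -y;
        modes[5][3 * i + 1] = x;
        modes[5][3 * i + 2] = 0.0;
    }
}
```

These six vectors produce no strain and hence span the near-null space of the elasticity operator, which is why supplying them helps smoothed aggregation build coarse spaces that represent them accurately.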

#### 4.3.3. Internal Variables

As the internal variables at different points are completely decoupled, they can be solved easily as small linear systems. Very often these systems are even diagonal, for example when most of the stiffness comes from the gating variables.

#### 5. IMPLEMENTATION: GEMS

#### 5.1. Source Code in C Using PETSc

We implemented our code in C using the PETSc library (Balay et al., 1997, 2016a,b). This gives us a large choice of scalable and efficient algorithms and data structures for the solution of time-dependent PDEs, which can be easily changed or fine-tuned through command line options. By using PETSc's unstructured mesh data structure, we can easily read and write common mesh formats and (re)distribute meshes and associated data, and we have access to powerful solvers which need access to mesh and field information (e.g., multigrid and block preconditioners). More specifically, we used DMPlex (Isaac and Knepley, 2015; Knepley et al., 2015; Lange et al., 2015) for mesh management and PetscFE for finite element technology, TS (Abhyankar, 2014) for time stepping, SNES for non-linear solvers and KSP/PC for linear solvers and preconditioners. Input and output routines are coupled to PETSc. Meshes can be read in through DMPlex if they are in the ExodusII (Schoof and Yarberry, 1994), Gmsh (Geuzaine and Remacle, 2009), CGNS (Poirier et al., 1998), MED (Open CASCADE, 2017), Fluent Case (Fluent, 2006), or PLY (Wikipedia, 2017) file format. Alternatively, meshes can also be created by giving the vertex numbers per cell and vertex coordinates. Output can be generated using the built-in PETSc viewers. For example, DM (mesh) and Vec (representing discrete fields) objects can be stored as HDF5 (The HDF Group, 1997-2017) data, which can be read by ParaView (Ayachit, 2015) or VisIt (Childs et al., 2012) with XDMF metadata (Kitware, 2017). The extensible nature of PETSc also makes it possible to implement new solvers and use them through PETSc. This way we implemented a SNES solver called SNESFieldSplit, which is the non-linear block Gauss-Seidel solver we discussed in section 4.2. Once this solver knows about the field layout and the equations per field through the DM, it can automatically do the subsolves. This is the non-linear equivalent to PCFieldSplit (Brown et al., 2012), already in PETSc.

### 5.2. Main GEMS Classes and Usage

The most important part of our GEMS library is the GEMSModel class. It is responsible for providing all the model-dependent information such as pointwise residuals and Jacobians, discretizations, null spaces, and initial guess/conditions to the appropriate PETSc classes. Current subclasses include GEMSModelMonodomain, GEMSModelBidomain, GEMSModelElasticity, GEMSModelElectromechanics (combining monodomain and quasi-static elasticity), and GEMSModelFibres (to create rule-based fiber directions based on solving Laplace equations, following Bayer et al., 2012).

Typical usage for a non-linear problem is illustrated in Listing 1. Note that nothing extra needs to be done to run simulations in different dimensions besides changing the mesh, which can be as simple as changing the filename of the mesh. The ...FromOptions(...) functions are meant to be configured from the command line or an options file. For example, if the GEMSModel should be changed to GEMSModelMonodomain, the option gemsmodel\_type monodomain would be added to the command line or options file.

Listing 1 | Typical usage of the GEMSModel class

```c
MPI_Comm comm;
SNES snes;
DM dm;
Vec u;
GEMSModel model;

/* Initialize GEMS, PETSc, MPI, read options */
GEMSInitialize(&argc, &argv, NULL, help);
comm = PETSC_COMM_WORLD;
/* Create a DMPlex using, e.g., DMPlexCreateFromFile() */
DMPlexCreate...(comm, ..., &dm);
/* Create and configure a GEMSModel */
GEMSModelCreate(comm, &model);
GEMSModelSetFromOptions(model);
/* Set model-specific discretizations and equations in the DM */
GEMSModelSetUpDiscretization(model, dm);
/* Create model-specific near-nullspace (this is used by GAMG) */
GEMSModelCreateNearNullSpace(model, dm, NULL);
/* Create and initialize the solution vector */
DMCreateGlobalVector(dm, &u);
PetscObjectSetName((PetscObject)u, "solution");
ModelInitializeSolutionVector(model, dm, u);
/* Use DMPlex's internal FEM routines */
DMSNESSetBoundaryLocal(dm, DMPlexSNESComputeBoundaryFEM, NULL);
DMSNESSetFunctionLocal(dm, DMPlexSNESComputeResidualFEM, NULL);
DMSNESSetJacobianLocal(dm, DMPlexSNESComputeJacobianFEM, NULL);
/* Create and configure the nonlinear solver and solve */
SNESCreate(comm, &snes);
SNESSetDM(snes, dm);
SNESSetFromOptions(snes);
SNESSolve(snes, NULL, u);
/* View the mesh */
DMViewFromOptions(dm, NULL, "-dm_view");
/* View the solution */
VecViewFromOptions(u, NULL, "-sol_vec_view");
/* Clean up */
SNESDestroy(&snes);
VecDestroy(&u);
ModelDestroy(&model);
DMDestroy(&dm);
GEMSFinalize();
```
Further we have a class for the electrophysiological 0D cell models called GEMSCellModel. Its only function is to give the pointwise implicit and explicit functions, Jacobian and initial conditions. Currently implemented cell models include FitzHugh-Nagumo (FitzHugh, 1961; Nagumo et al., 1962) and Ten Tusscher-Panfilov 2006 (ten Tusscher and Panfilov, 2006) models.

### 5.3. Comparison to Other Cardiac Solvers

One of the main features of GEMS is that it uses PETSc (and the third-party packages it interfaces with) as much as possible, and not just as a linear algebra solver. In particular, it uses the DM object prominently, which makes it easy to read and write meshes and field data in various formats and to feed field and mesh data to various advanced (non)linear solvers, which often consist of combinations of subsolvers (for example, the block preconditioners for bidomain or incompressible elasticity, in which each field has a different preconditioner and linear iterative solver). These solvers (and their subsolvers) can then be configured from command line options alone, without recompiling. GEMS thus strives for maximal flexibility and easy experimentation. Other cardiac solvers such as Chaste (Mirams et al., 2013) or Continuity (Continuity, 2018) have already existed for many years and have functionalities such as reading generic cell models through CellML and solving mechanics. However, the typical approach to electromechanics is first-order operator splitting with separate codes for mechanics and electrophysiology. Our library was built from the beginning with a flexible approach to the coupling between different physics. To specify a problem we start with a coupled set of equations (defined by pointwise residuals, right-hand sides and Jacobians), and through command line options we can configure the solvers. This makes experimentation with different combinations of solvers considerably easier and also makes it possible to use higher-order integration schemes.

### 6. NUMERICAL RESULTS

### 6.1. Electrophysiology

As a first test we ran the benchmark for electrophysiology with the cardiac monodomain equations as described in Niederer et al. (2011), with the suggested spatial resolutions of 0.5, 0.2, and 0.1 mm (using linear tetrahedral elements) and temporal resolutions of 0.05, 0.01, and 0.005 ms. We ran the benchmark of propagation in a 3D slab with three different integration schemes: FBE111 (forward-backward Euler), ARS222 (Ascher et al., 1997), and BPR353 (Boscarino et al., 2013) (the numbers in the names of these integration schemes reflect the number of explicit and implicit stages and the order of accuracy). As an extra, we also ran the benchmark using a large time step of 0.5 ms at a spatial resolution of 0.1 mm, to showcase the stability and temporal convergence of the methods used. The internal variables were stored at the quadrature points. In **Figure 1** we display the activation times along the diagonal of the bar geometry. We see that increasing the spatial and temporal resolutions has opposite effects on arrival times: increasing the spatial resolution raises the arrival time, while increasing the temporal resolution lowers the arrival time. The faster convergence rate of the arrival time for higher-order time integration is also noticeable. For example, for the BPR353 scheme the arrival times for the time steps of 0.05, 0.01, and 0.005 ms are almost indistinguishable. In Niederer et al. (2011) different codes were found to have arrival times between 37.8 and 48.7 ms at the highest spatial and temporal resolutions. Our arrival times are within those bounds at these highest resolutions. (It is inevitable that at lower resolutions the arrival time will deviate more.) Execution times for the simulations can be found in **Table 1**.

### 6.2. Electromechanics

At this stage of development of our package we decided to illustrate the solution of the electromechanical equations using only the simplest tools. The comparison of various integration methods and constitutive relations will be done at a later stage. As an illustration for the fully coupled electromechanical equations we simulated the contraction of an idealized biventricular geometry that was stimulated at the apex. The mesh for this geometry was created using Gmsh (Geuzaine and Remacle, 2009) with a resolution of 0.2 mm, resulting in a tetrahedral mesh consisting of 1529230 cells and 312888 vertices. We used the algorithm from Bayer et al. (2012) to generate myofiber orientations. The fiber angle varied from −45° (epi) to 75° (endo). We used the monodomain formulation and the TNNP06 (ten Tusscher and Panfilov, 2006) model for electrophysiology, with the same parameters as in Niederer et al. (2011). For the passive hyperelastic equations we used the Guccione constitutive equations (Guccione et al., 1995), where a penalty term $\frac{\kappa}{2}(J - 1)^2$ was added to the strain energy, and for the active tension generation we used the Niederer-Hunter-Smith model (Niederer et al., 2006). Parameters were taken from Keldermann et al. (2010) and $\kappa$ was taken as 350 kPa. Here we used a time step of 0.5 ms with the FBE111 scheme and linear elements for the transmembrane voltage and deformations, while the internal variables were stored at the quadrature points. The resulting activation and contraction sequence can be seen in **Figure 2**. The simulation took 7.5 h on 32 nodes of Intel E5-2670 CPUs, using 1 core per node. The electromechanical testing will be continued in subsequent studies.

Table 1 | Execution times for Niederer's electrophysiology benchmark.

Simulations were run on 32 nodes of Intel E5-2670 CPUs, using 1 core per node. See section 6.1 for details.

### 7. DISCUSSION AND OUTLOOK

In this paper we presented an overview of the methodology used in cardiac electromechanics and our numerical approach to these challenging problems. In particular, in section 2 we presented a short derivation of the main equations of electromechanics from basic principles (i.e., geometry and balance equations) in strong and weak form. We discussed constitutive equations to close these equations and clearly listed all assumptions made. We derived a general representation of a deformation-dependent conduction tensor, assuming orthotropic symmetry and pointwise dependence on deformation, and showed that previous deformation-dependent conduction tensors found in the literature are all special cases of this. Note, however, that the scalar functions in this representation still need to be determined from experiment. In section 3 we applied standard finite element methods to express the variational equations in a finite basis, which can then be solved by the numerical methods in section 4. There we discussed additive implicit-explicit Runge-Kutta time integration methods and how, with appropriate partitioning of fast and slow physics, the non-linear implicit equations can be solved more easily by solving smaller problems belonging to different fields one after another. Efficient (non-)linear solvers for these problems were also discussed. Further, we reviewed the structure and possibilities of the GEMS library in section 5 and how PETSc gives us a wide range of tools to solve our PDEs, including meshes, I/O and solvers. In section 6 we presented some numerical results as verification and illustration of the GEMS library.

Our main conclusion is that additive implicit-explicit Runge-Kutta time integration methods, combining the advantages of implicit and explicit integration, work very well for electromechanical problems. This method allows larger time steps, with limited complication of Jacobians and non-linear solves. Our numerical implementation uses the PETSc library extensively, which gave us access to powerful and scalable mesh management, time stepping and (non)linear solvers which may need mesh and field information. One topic for further research is whether we can benefit substantially from anisotropic mesh adaptation through the PRAgMaTIc library (Rokos and Gorman, 2013), which has recently been interfaced to PETSc (Barral et al., 2016). This could also be used to build mesh hierarchies in a geometric multigrid approach.

The GEMS package is still under development. Although the user can access and set all solver options through the command line, a graphical user interface may be desirable in the future, both for input and visualization. Regarding modeling, we have currently hard-coded two cell models (FHN and TP06) and plan to import more models of cardiac electrophysiology in a semi-automated way via the CellML repository (www.cellml.org). We are currently using pressure boundary conditions on the endocardial surface, which can be extended with physical models for circulation and valve action.

Our method has been designed to enable strong coupling between the electrical and mechanical subsystems at every time step of the simulation, and at the same high spatial resolution, both for the electrical and mechanical equations. One possible speed-up factor is the following: currently all field values and gradients at the quadrature points are calculated for each residual or Jacobian belonging to some field(s), for maximum flexibility. Thus, one may avoid unnecessary interpolation in order to accelerate the computation of the residuals: if the residual of field A is independent of field B, the value or gradient of field B at the quadrature points is not needed.

The use of PETSc makes it possible to parallelize the computation on high-performance computing (HPC) clusters. Smaller (test) runs can be performed on a desktop computer, requiring about 32 GB of RAM to run the biventricular model in **Figure 2** with the TP06 cell model. There is no significant difference in memory cost between mono- and bidomain equations, since the latter introduces only a few new state variables (extracellular potential, extracellular conductivities).

In this paper we have chosen to illustrate our approach using simple standard problems: the benchmark for electrophysiology (Niederer et al., 2011) and a simple illustration of electromechanics for the fully coupled equations on an idealized biventricular geometry. This is because we mainly wanted to describe the methodology and place it in the context of existing work, and did not focus on specific scientific applications. Such simulations can be performed using our methodology and will be presented in subsequent papers.

## AUTHOR CONTRIBUTIONS

AP and HD designed the research. SA implemented the methods. SA, HD, and AP wrote the manuscript.

### FUNDING

SA was funded by BOF-Ghent. HD was funded by FWO-Flanders during part of this work.

## ACKNOWLEDGMENTS

The computational resources (Stevin Supercomputer Infrastructure) and services used in this work were provided by the VSC (Flemish Supercomputer Center), funded by Ghent University, FWO and the Flemish Government—department EWI.

#### Arens et al. GEMS: An Integrated Solver for Coupled Cardiac Electromechanics



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Arens, Dierckx and Panfilov. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.



# Estimation of Diabetic Retinal Microaneurysm Perfusion Parameters Based on Computational Fluid Dynamics Modeling of Adaptive Optics Scanning Laser Ophthalmoscopy

Miguel O. Bernabeu<sup>1†</sup>, Yang Lu<sup>2†</sup>, Omar Abu-Qamar<sup>2</sup>, Lloyd P. Aiello<sup>2,3</sup> and Jennifer K. Sun<sup>2,3\*</sup>

<sup>1</sup> Centre for Medical Informatics, Usher Institute, The University of Edinburgh, Edinburgh, United Kingdom, <sup>2</sup> Beetham Eye Institute, Joslin Diabetes Center, Boston, MA, United States, <sup>3</sup> Department of Ophthalmology, Harvard Medical School, Boston, MA, United States

Diabetic retinopathy (DR) is a leading cause of vision loss worldwide. Microaneurysms (MAs), which are abnormal outpouchings of the retinal vessels, are early and hallmark lesions of DR. The presence and severity of MAs are utilized to determine overall DR severity. In addition, MAs can directly contribute to retinal neural pathology by leaking fluid into the surrounding retina, causing abnormal central retinal thickening and thereby frequently leading to vision loss. Vascular perfusion parameters such as shear rate (SR) or wall shear stress (WSS) have been linked to blood clotting and endothelial cell dysfunction, respectively in non-retinal vasculature. However, despite the importance of MAs as a key aspect of diabetic retinal pathology, much remains unknown as to how structural characteristics of individual MAs are associated with these perfusion attributes. MA structural information obtained on high resolution adaptive optics scanning laser ophthalmoscopy (AOSLO) was utilized to estimate perfusion parameters through Computational Fluid Dynamics (CFD) analysis of the AOSLO images. The HemeLB flow solver was used to simulate steady-state and time-dependent fluid flow using both commodity hospital-based and high performance computing resources, depending on the degree of detail required in the simulations. Our results indicate that WSS is lowest in MA regions furthest away from the feeding vessels. Furthermore, areas of low SR are associated with clot location in saccular MAs. These findings suggest that morphology and CFD estimation of perfusion parameters may be useful tools for determining the likelihood of clot presence in individual diabetic MAs.

Keywords: diabetic retinopathy, microaneurysm, adaptive optics, blood flow, computational fluid dynamics

#### Edited by:

Peter V. Coveney, University College London, United Kingdom

#### Reviewed by:

Paolo Melillo, Università degli Studi della Campania "Luigi Vanvitelli" Caserta, Italy; Mariano Vázquez, Barcelona Supercomputing Center, Spain; Jazmin Aguado-Sierra, Barcelona Supercomputing Center, Spain

#### \*Correspondence:

Jennifer K. Sun jennifer.sun@joslin.harvard.edu

†These authors have contributed equally to this work

#### Specialty section:

This article was submitted to Computational Physiology and Medicine, a section of the journal Frontiers in Physiology

Received: 15 December 2017; Accepted: 05 July 2018; Published: 07 September 2018

#### Citation:

Bernabeu MO, Lu Y, Abu-Qamar O, Aiello LP and Sun JK (2018) Estimation of Diabetic Retinal Microaneurysm Perfusion Parameters Based on Computational Fluid Dynamics Modeling of Adaptive Optics Scanning Laser Ophthalmoscopy. Front. Physiol. 9:989. doi: 10.3389/fphys.2018.00989

## INTRODUCTION

As the worldwide prevalence of diabetes mellitus continues to increase, diabetic retinopathy (DR) remains the most common vascular complication in diabetic patients (Kempen et al., 2004; Klein, 2007; Ko et al., 2012). The chronic hyperglycemic state of diabetes results in pathological alterations of retinal microvascular structures and blood flow (Curtis et al., 2009). Retinal microaneurysms (MAs), which are outpouchings of the retinal capillary walls, are one of the earliest clinical signs in the diabetic eye and are among the key lesions for DR severity classification (ETDRS\_No10, 1991; ETDRS\_No12, 1991; Wilkinson et al., 2003; Hirai et al., 2007). Whereas some MAs do not appear to affect vision, other MAs can be associated with abnormal vascular leakage caused by the local loss of endothelial barrier function. In some cases, this may lead to subsequent retinal edema and associated vision loss (Nunes et al., 2009; Murakami et al., 2011). MA leakage affecting the local neural retina can often be detected by fluorescein angiography (FA), and treated by intraocular injections of anti vascular endothelial growth factor agents or steroids, as well as macular laser photocoagulation (Duh et al., 2017).

Several studies have evaluated the pathogenesis and natural history of MAs using ex vivo (e.g., transmission electron microscopy and scanning electron microscopy) and in vivo (scanning laser ophthalmoscopy and optical coherence tomography) imaging approaches to characterize pericyte loss, basement membrane thickening, and endothelial proliferation and disruption (Wise, 1957; Cogan et al., 1961; de Oliveira, 1966; Ashton, 1974; Moore et al., 1999). One study (Ezra et al., 2013) proposed using MA-to-vessel radius ratio as a potential marker for assessing risk of leakage, and suggested that shear stress at the MA wall may lead to endothelial dysfunction.

Advances in adaptive optics scanning laser ophthalmoscopy (AOSLO) have recently enabled non-invasive investigation of the living human retina with single cell level resolution (∼2µm) (Tam et al., 2010; Chui et al., 2012, 2013), allowing detailed characterization of MA features (wall hyper-reflectivity, wall deformability), morphology (saccular, fusiform, focal bulge, irregular) and perfusion status (fully/partially perfused or non-perfused). One recent study (Dubow et al., 2014), which combined high resolution AOSLO with FA to provide a high-resolution and high-contrast view of individual MAs, extended the qualitative morphologic classification into six morphology groups.

Retinal MAs are known to be highly dynamic lesions. Over the course of the disease, some lesions will disappear (possibly due to thrombus formation and revascularization) while others will either stabilize or grow. A series of studies (Goatman et al., 2003; Bernardes et al., 2009; Ribeiro et al., 2013) have characterized MA turnover (defined as the sum of the MA formation and disappearance rates Ribeiro et al., 2013) and found this metric to be a predictor of macular edema progression. However, these studies were limited in their ability to fully characterize MAs and did not include perfusion status or morphological characteristics of individual MAs in their analysis.

In a recent study, we demonstrated the feasibility of computational fluid dynamics (CFD) analysis to characterize the hemodynamic environment of the diabetic eye (Lu et al., 2016). Comparable approaches have been extensively used for the characterization of larger scale vascular lesions, such as intracranial aneurysms (IA) (Dhar et al., 2008; Chien et al., 2011). Morphological parameters, such as aneurysm aspect ratio and non-sphericity index (Chien and Sayre, 2014) have been identified as risk factors for rupture of IA. Perfusion parameters, such as velocity, wall shear stress (Tarbell, 2010), and shear rate have been proposed to study IA progression and resolution. In particular, a relationship between shear rate and IA thrombosis has been established (Ribeiro de Sousa et al., 2016), leading to a better understanding of IA progression.

In this study, morphological and CFD analyses of individual diabetic MAs were performed based on high resolution AOSLO technology. Our aim is to develop a method capable of establishing which MA characteristics are associated with a higher risk of leakage or clotting. We propose two novel morphological indices to quantify MA shape and aspect ratio. In addition, we introduce two CFD-based perfusion parameters to predict areas with higher risks of endothelial dysfunction and blood clotting. Finally, we demonstrate how to account for the pulsatile nature of blood flow during model development and investigate the previous indices throughout the cardiac cycle.

## METHODS

### Imaging Instrument

The AOSLO used in this study was a modified version of the Indiana system described previously (Burns et al., 2007). A near infrared superluminescent diode (SLD) with a central wavelength of 830 nm (BLM-S-830, Superlum, Ireland) was used for imaging. Another SLD with a central wavelength of 780 nm (BLM-S-780, Superlum, Ireland) was used for wavefront sensing. A micro-electro-mechanical system deformable mirror (DM, Multi-DM, Boston Micromachines Corp., Cambridge, MA, USA) provided wavefront correction. The DM has an active area of 4.95 × 4.95 mm and 12 × 12 actuators with a maximum stroke of 5.5 µm. The system uses doubler mirrors to amplify the usable stroke of the DM (Webb et al., 2004). The maximum beam size at the exit pupil is 6.5 mm. Based on theoretical calculations, this AOSLO system is capable of compensating for over 90% of the optical aberrations from an eye with clear media and a dilated pupil, achieving ∼2.5 µm resolution. With such resolution, MA structural and perfusion information can be characterized in much greater detail than previously achievable with standard techniques such as fundus photography or fluorescein angiography (**Figure 1**).

### Image Processing and Morphological Analysis

#### MA Segmentation and Skeletonisation

The body and feeding/draining vessels of the MAs under study were manually segmented from AOSLO images. The MA outline was created by using the Fiji/ImageJ "Polygon Selections" tool to define a series of line segments along the MA wall. The outline was adjusted based on both the scattered light images (**Figure 2a**) and their corresponding "perfusion map" (standard deviation map) images calculated from the AOSLO frames (**Figure 2b**). The "Create Mask" function was used to turn the segmentation file into a binarized figure file. In the segmentation process, the length of each feeding/draining capillary was taken to be roughly equal to the MA body length along the flow direction axis (see section Hemodynamic Analysis for more details). The region representing the MA body was differentiated from the feeding/draining vessels (**Figure 2c**). In the subset of MAs in which we could identify blood clots, the perfused versus clotted areas within the MA body were also segmented (**Figure 2d**). Direction of flow was recorded from the AOSLO videos of each MA. Binary masks defining the two-dimensional projection of the MA body along with feeding/draining capillaries were prepared for further processing. In the cases where clots were present, the clotted area was also included in the binary mask. We employed the methodology described previously (Bernabeu et al., 2014) to calculate the MA centerline and radii along the centerline from the Voronoi diagram of the pixels defining the boundary of each binary mask (Attali and Montanvert, 1997). Briefly, the centerline is the subset of the Voronoi diagram defining the medial axis of the mask. For any point along the centerline, its radius is given by the largest circle centered on that point and inscribed within the mask (**Figure 3**).
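
The radius definition above can be illustrated with a short sketch. It is not the Voronoi-based procedure of Bernabeu et al. (2014): it is a brute-force stand-in in which the radius at a centerline point is taken as the distance to the nearest boundary sample. The function name and the synthetic circular boundary are assumptions made for illustration only.

```python
import math


def inscribed_radius(point, boundary_points):
    """Radius of the largest circle centered on `point` that fits inside the
    mask, approximated as the distance to the nearest boundary sample."""
    px, py = point
    return min(math.hypot(px - bx, py - by) for bx, by in boundary_points)


# Hypothetical boundary: a circle of radius 5 sampled every degree.
boundary = [
    (5.0 * math.cos(math.radians(a)), 5.0 * math.sin(math.radians(a)))
    for a in range(360)
]

radius_at_center = inscribed_radius((0.0, 0.0), boundary)
radius_off_center = inscribed_radius((2.0, 0.0), boundary)
```

For this synthetic boundary the radius is largest at the center and shrinks as the point moves toward the wall, which is the behavior the centerline radii are expected to show along a dilated MA body.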

#### Morphological Analysis

In this work, we propose two novel indices to describe the morphology of a retinal MA: the body-to-neck ratio (BNR) and the asymmetry ratio (AR). BNR provides a measure of how dilated the MA body is in relation to the caliber of the feeding/draining capillaries. BNR is defined as the quotient between the MA body width and the caliber of the feeding/draining vessels (see **Figure 3**). Chien et al. employed a similar measure to characterize arterial brain aneurysms and found a trend for increases in this index when comparing aneurysms before and after rupture (Chien and Sayre, 2014). BNR is computed based on the skeleton/radii analysis described in section MA Segmentation and Skeletonisation. Briefly, the MA width and feeding/draining vessel caliber are defined to be the largest and smallest radii registered along the skeleton of the MA, respectively.
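
As a minimal sketch of the BNR definition above, the index can be computed directly from the list of radii registered along the skeleton. The function name and the example radii are hypothetical and are not taken from the authors' scripts.

```python
def body_to_neck_ratio(radii):
    """BNR = largest radius / smallest radius along the MA skeleton."""
    if not radii:
        raise ValueError("radii must be non-empty")
    smallest = min(radii)
    if smallest <= 0:
        raise ValueError("radii must be strictly positive")
    return max(radii) / smallest


# Hypothetical radii (micrometres) sampled along one centerline:
# narrow feeding vessel, dilated body, narrow draining vessel.
example_radii = [3.1, 3.0, 4.8, 12.5, 18.0, 11.2, 3.4, 3.2]
bnr = body_to_neck_ratio(example_radii)
print(f"BNR = {bnr:.2f}")
```

Because the ratio uses radii for both numerator and denominator, it is identical to the ratio of the corresponding widths.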

AR quantifies the degree of asymmetry of the MA body. AR is defined as the ratio between the larger (A1) and smaller (A2) areas in the MA body mask to each side of the centerline (A1 divided by A2 in **Figure 4**, where A1>A2). Vorp et al. proposed a comparable measure of asymmetry for idealized abdominal aortic aneurysm (AAA) geometries and used it to characterize mechanical wall stress (Vorp et al., 1998). In subsequent work, Finol et al. studied the impact of AAA asymmetry on their hemodynamics and found that asymmetry tends to increase the maximum wall shear stress at peak flow and to induce the appearance of secondary flows in late diastole in idealized AAA geometries (Finol et al., 2003). AR is computed based on the MA body segmentation and centerline. Briefly, the polygon approximating the MA body is split into two along the MA centerline and the area of each sub-polygon is subsequently calculated. Custom Python scripts were developed to calculate BNR and AR.
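
The AR computation described above can be sketched as follows, assuming the MA body polygon has already been split along the centerline into two sub-polygons. The shoelace formula is used here for the areas; the function names and the toy geometry are illustrative assumptions and not the authors' implementation.

```python
def polygon_area(vertices):
    """Unsigned area of a simple polygon (shoelace formula)."""
    twice_area = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


def asymmetry_ratio(side_a, side_b):
    """AR = larger area / smaller area of the two sub-polygons (A1 / A2)."""
    area_a = polygon_area(side_a)
    area_b = polygon_area(side_b)
    larger, smaller = max(area_a, area_b), min(area_a, area_b)
    if smaller == 0.0:
        raise ValueError("both sub-polygons must have non-zero area")
    return larger / smaller


# Hypothetical MA body split by a straight centerline from (0, 0) to (4, 0).
upper_side = [(0, 0), (4, 0), (4, 3), (0, 3)]   # area 12
lower_side = [(0, 0), (4, 0), (2, -2)]          # area 4
ar = asymmetry_ratio(upper_side, lower_side)
```

By construction AR is at least 1, with values close to 1 indicating a body that is nearly symmetric about the centerline.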

### Hemodynamic Analysis

Based on the MA skeletonisation previously described and assuming rotational symmetry, we reconstructed the three-dimensional luminal surface of each MA under study (**Figure 5**). This surface encloses the approximate MA volume including its body and feeding/draining capillaries. The CFD package HemeLB (Bernabeu et al., 2014) was used to simulate both steady-state and time-dependent flow of a shear-thinning fluid modeled with the Carreau-Yasuda rheology model parametrized for human blood (Boyd and James, 2007). HemeLB uses the Lattice Boltzmann Method for the numerical simulation of blood flow. The interested reader can refer to Aidun and Clausen (2010) and Krüger et al. (2017) for a complete presentation. The velocity field at the inlet was assumed to be parabolic (Poiseuille flow) for a given centerline peak velocity. To define this velocity, we took advantage of recent measurements of blood flow velocities in parafoveal capillaries by de Castro et al. (2016). Figure 4b of de Castro et al. (2016) reports velocity values over 4 cardiac cycles (equivalent to 3.13 s), which we used in the time-dependent flow simulations, with a mean capillary velocity of 1.69 mm/s, which we used in the steady-state flow simulations. Furthermore, a no-slip velocity condition was imposed at the walls and a reference pressure was set at the outlet. To ensure that the flow field in the MAs is not affected by the finite length of the feeding/draining capillaries, we take them to be longer than the entrance length, L<sub>e</sub>, required for laminar flow to fully develop in a circular straight pipe. This is given by the expression L<sub>e</sub> = 0.035 · D · Re (Bird et al., 2002), where D and Re are the diameter and Reynolds number of the feeding vessel, respectively. In all the MAs studied L<sub>e</sub> can be shown to be shorter than D. Therefore, the feeding/draining capillaries were segmented to be of length comparable to the MA body length along the flow axis for statistical purposes in the hemodynamic analyses that follow.

Steady-state HemeLB simulations were run inexpensively on a four-core commodity hospital-based workstation, while time-dependent simulations made use of ARCHER, the UK National Supercomputing Service (http://www.archer.ac.uk). Typical execution times for the latter ranged between 4 and 10 h using 312 cores. All computational domains were discretized as a regular grid ensuring a minimum of 8 lattice sites across the narrowest point in the domain (Bernabeu et al., 2014) and comprised between 45,000 and 520,000 fluid lattice sites.
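
The entrance-length argument used above to justify the capillary lengths can be checked with a back-of-the-envelope calculation. In the sketch below, only the mean velocity (1.69 mm/s) and the 0.035 coefficient come from the text; the capillary diameter and the constant kinematic viscosity are assumed illustrative values, and a constant viscosity is itself a simplification of the shear-thinning model used in the simulations.

```python
def reynolds_number(velocity, diameter, kinematic_viscosity):
    """Re = U * D / nu for flow in a circular pipe."""
    return velocity * diameter / kinematic_viscosity


def entrance_length(diameter, reynolds):
    """Laminar entrance length, L_e = 0.035 * D * Re."""
    return 0.035 * diameter * reynolds


mean_velocity = 1.69e-3        # m/s, mean capillary velocity quoted in the text
capillary_diameter = 10e-6     # m, ASSUMED illustrative capillary diameter
kinematic_viscosity = 3.3e-6   # m^2/s, ASSUMED constant value for blood

re_number = reynolds_number(mean_velocity, capillary_diameter, kinematic_viscosity)
le = entrance_length(capillary_diameter, re_number)

print(f"Re  = {re_number:.2e}")
print(f"L_e = {le:.2e} m, D = {capillary_diameter:.2e} m")
```

Under these assumptions the Reynolds number is far below 1, so the entrance length is several orders of magnitude smaller than the vessel diameter, in line with the statement that it is shorter than D in all the MAs studied.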

Our computer simulations generated a description of the velocity, shear rate, and pressure fields in the whole computational domain as well as the wall shear stress on the model surface. In this study, we decided to characterize the changes in shear rate (SR) and wall shear stress (WSS) present in the MAs. Low SR has been associated with blood cell aggregation and clotting (Runyon et al., 2007) and abnormal WSS levels have been linked to endothelial cell dysfunction and changes in permeability (Tarbell, 2010). To reduce the dimensionality of the data and facilitate further statistical analysis we propose two indices for the characterization of the hemodynamic state of an MA: the shear rate mean drop (SRMD) and the wall shear stress mean drop (WSSMD). SRMD reports the ratio between the mean of the SR field in the MA feeding/draining vessels and the same measurement inside the MA body. Similarly, WSSMD indicates the ratio between the mean of the WSS on the MA feeding/draining vessels surface and the same measurement on the surface of the MA body. SRMD and WSSMD are dimensionless quantities. Finally, in the case of MAs displaying clots, we also estimated SRMD for the clotted and perfused parts of the MA separately. Custom Python scripts were developed to calculate SRMD and WSSMD.
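
Both indices defined above are ratios of regional means, so a single helper can illustrate them. This is a sketch under the assumption that the SR (or WSS) samples have already been grouped by region; the function name and sample values are hypothetical and do not reproduce any result from the study.

```python
from statistics import fmean


def mean_drop(vessel_values, body_values):
    """Ratio of the mean in the feeding/draining vessels to the mean in the
    MA body. Applies to shear rate (SRMD) and wall shear stress (WSSMD)."""
    body_mean = fmean(body_values)
    if body_mean == 0.0:
        raise ValueError("mean in the MA body must be non-zero")
    return fmean(vessel_values) / body_mean


# Hypothetical shear rate samples (s^-1) in each region.
sr_vessels = [200.0, 220.0, 180.0]
sr_body = [4.0, 6.0, 5.0]
srmd = mean_drop(sr_vessels, sr_body)
```

A value close to 1 would indicate no change between vessel and body, whereas larger values indicate a stronger reduction of the quantity inside the MA body.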

### Study Cohort

In this study, 20 MAs were imaged from 13 eyes of 11 diabetic patients with varying severity of DR. The patient and MA characteristics are given in **Supplementary Table 1**. In this cohort, 9 of 11 patients had Type 1 diabetes, mean diabetes duration was 25 years and mean HbA1c was 8.1%. Informed written consent was obtained from each subject prior to the performance of any study procedures. This study adhered to the tenets of the Declaration of Helsinki and was approved by the institutional review board of the Joslin Diabetes Center.

### Imaging Protocol and Light Safety

Mydriasis and cycloplegia were achieved by instillation of 1 drop each of 1% tropicamide (Akorn, Inc., Lake Forest, IL) and 2.5% phenylephrine hydrochloride (Akorn, Inc., Lake Forest, IL). Prior to AOSLO imaging, eye axial length (IOL Master, Zeiss, Germany) was measured to determine the magnification factor on AOSLO images. Ultrawide field, 200° digital fundus photographs (Optos 200Tx and Optos California, UK) were taken to determine MA location. During imaging, the subject's head was placed on a chin rest, and a head rest was used against the forehead for secure positioning. Precise head position adjustment and pupil alignment were achieved using a three-axis motorized stage (MT3-Z8 Thorlabs, NJ). MAs were imaged using AOSLO confocal imaging mode and multiply scattered light imaging mode with 75-frame videos. Pinholes of 500 µm and 150 µm were used for forward scattering imaging and confocal imaging, respectively. Two SLDs were used for imaging (830 nm) and wavefront sensing (780 nm). Output power at the cornea was 200 µW for the imaging SLD, and 70 µW for the wavefront sensing SLD. The light power was checked periodically to ensure compliance with the ANSI laser safety standard (American National Standards Institute, 2014).

### Statistical Analysis

The segmentation of all the MAs and clotted regions was performed by at least 2 trained graders. To ensure agreement between graders, an area variation of <10% was required for each MA and sub-region. All statistical analyses were completed using custom Python scripts and the Statistics package of the SciPy library (https://www.scipy.org). The Wilcoxon rank-sum test was used to test for significance in the comparison between groups.
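
For readers unfamiliar with the Wilcoxon rank-sum test mentioned above, the following standard-library sketch implements its usual large-sample normal approximation (two-sided, without tie or continuity correction). It is an illustration only, not the SciPy routine used in the study, and the AR values are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Return (z, two-sided p) using the normal approximation."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    expected = n1 * (n1 + n2 + 1) / 2.0
    std = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (rank_sum_x - expected) / std
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Hypothetical AR values for two groups of five MAs each.
group_a = [1.10, 1.20, 1.30, 1.25, 1.35]
group_b = [1.90, 2.40, 2.10, 3.00, 2.60]
z_stat, p_value = rank_sum_test(group_a, group_b)
```

With groups this small the normal approximation is coarse, which is one reason to treat p-values from such cohorts with caution.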

## RESULTS

### Morphological and Hemodynamic Indices

Twenty MAs were imaged from 13 eyes of 11 diabetic subjects, as shown in **Figure 6**. Ten were classified as saccular (5 partially clotted) and 10 as fusiform (none clotted). For each MA, projected MA body size, asymmetry ratio (AR), body-to-neck ratio (BNR), shear rate mean drop (SRMD), and wall shear stress mean drop (WSSMD) are shown in **Supplementary Table 2**.

### Analysis of Partially Perfused MAs

In the 5 partially perfused MAs (**Figure 7**), which had evidence of clot within the MA body, we calculated the hemodynamic indices within the perfused and clotted regions of the MA separately (**Table 1**).

Among the partially clotted MAs, the SRMD and WSSMD values in the perfused regions were lower (mean ± SD: 63.36 ± 39.66 and 29.02 ± 16.74, respectively) than the values (211.85 ± 118.22 and 82.94 ± 30.91, respectively) in the clotted regions.

### Asymmetry Ratio Predicts Manual MA Morphology Classification

All of the MAs in the study were qualitatively classified as saccular or fusiform according to the taxonomy proposed by Dubow et al. (2014). AR was calculated for all MAs and was found to be lower on average in the fusiform group compared to the saccular group (p < 0.001, **Figure 8**). Our data indicate that an AR threshold of ∼1.5 broadly separates fusiform from saccular MAs in this cohort. However, given the degree of overlap between both groups in terms of AR, it may not be advisable to define a unique cutoff value for automatic classification. Instead, we propose a semi-automatic approach where MAs with an AR below 1.4 or above 1.8 are automatically classified as fusiform or saccular, respectively, while those in the 1.4-1.8 region are labeled for manual classification by graders.
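
The semi-automatic scheme proposed above amounts to a three-way decision rule. The sketch below uses the 1.4 and 1.8 cutoffs stated in the text; the function name, the label strings, and the treatment of values exactly equal to a cutoff (sent to manual review) are assumptions.

```python
def classify_by_asymmetry(ar, lower=1.4, upper=1.8):
    """Semi-automatic MA classification from the asymmetry ratio (AR).

    AR < lower  -> "fusiform"
    AR > upper  -> "saccular"
    otherwise   -> "manual review" (left to trained graders)
    """
    if ar < lower:
        return "fusiform"
    if ar > upper:
        return "saccular"
    return "manual review"


# Hypothetical AR values.
for ar_value in (1.1, 1.6, 2.3):
    print(ar_value, "->", classify_by_asymmetry(ar_value))
```

Keeping the cutoffs as parameters makes it straightforward to revisit them if a larger cohort suggests a different overlap region.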

### Association of MA Morphology and Size

The area defined by the MA body segmentation, which is determined from an en face projection (xy plane) of the MA volume, was calculated for all the MAs in the study and used as a surrogate measure of MA size. Saccular MAs were found to be smaller than fusiform MAs (p = 0.004, **Figure 9**) with some saccular outliers having comparable size to the fusiform group. Moore et al. (1999) measured the extent of saccular and fusiform MAs in the direction perpendicular to the en face projection (z axis) and found no statistically significant difference. Taken together these results could indicate that size variability is more likely to be observed along the en face cross section compared to the transverse direction.

TABLE 1 | Perfusion indices of the 5 MA within perfused versus clotted areas.

### Shear Rate Mean Drop Is Higher in MA Regions Likely to Clot

In the current study, all the MAs containing clots were of saccular type. MAs presenting clots appeared to have a higher AR than those without, with the difference approaching statistical significance (p = 0.061, **Figure 10**).

Clots were always identified in contact with the MA wall. We performed hemodynamics analysis of the MAs to understand the relationship between flow and clot formation. The flow models were defined to include both the perfused and clotted portions of any given MA. In the 5 partially perfused MAs, we found that SRMD and WSSMD were higher in the regions where the clots were present compared to those that had not developed clots (p = 0.028 and p = 0.009, respectively, **Figure 11**). This speaks in favor of a model where MA thrombosis occurs in regions adjacent to the wall that experience low shear rates (hence high SRMD). In agreement with our results, low SR has been associated with blood clotting in vitro (Runyon et al., 2007) and with thrombus formation in intracranial aneurysms (Ribeiro de Sousa et al., 2016). Indeed, despite the obvious structural and hemodynamic differences between the macro and microcirculation, flow diverters, which rely on the principle of flow reduction from the parent circulation into the aneurysm body (hence SR reduction), are an established treatment for brain aneurysms (Jiang et al., 2016) to promote progressive intra-aneurysmal thrombosis.

### Body-to-Neck Ratio Correlates With Perfusion Changes in the MA Body

Both saccular and fusiform MAs are characterized by a sudden and non-uniform expansion of the vascular lumen. This change is most asymmetrical in the saccular class of MAs. This abnormal morphological configuration has a profound impact on the hemodynamics of the MA. We propose BNR as a simple metric for the quantification of hemodynamic abnormalities. Our results demonstrate that BNR is a good surrogate marker of SRMD (Pearson's r = 0.9, **Figure 12**) and WSSMD (Pearson's r = 0.83, **Figure 12**). Furthermore, mean WSSMD in this cohort was 35.2 with values as high as 78.4 (compared to a theoretical value of ∼1 in the absence of MAs) showing the highly abnormal level of WSS experienced by endothelial cells lining the MA body wall compared to those in neighboring vessels.
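
The correlation coefficients reported above are Pearson's r. As a reminder of what is being computed, a standard-library sketch follows; the paired BNR and SRMD values are hypothetical and do not reproduce the study data.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("both samples must have non-zero variance")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired measurements for four MAs.
bnr_values = [2.0, 3.0, 4.0, 5.0]
srmd_values = [10.0, 22.0, 28.0, 41.0]
r_value = pearson_r(bnr_values, srmd_values)
```

Pearson's r measures linear association only, so a high value supports BNR as a surrogate for SRMD within the sampled range rather than outside it.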

### Hemodynamic Changes Throughout the Cardiac Cycle

Blood flow displays pulsatile characteristics throughout the cardiac cycle. In our flow models, we can account for this property by defining a time-dependent inlet boundary condition based on the velocity traces measured by de Castro et al. (2016). Based on these simulations, we investigate the changes in velocity and shear rate throughout the cardiac cycle and their potential link with MA perfusion status and MA progression.

As expected, we find velocity and shear rate to be largest during systole, with regions that have developed clots experiencing reduced velocity and shear rate. We hypothesize that clots will form in areas of slow flow (i.e., low velocity) due to a sustained reduction in shear rate throughout the cardiac cycle (i.e., a low shear rate threshold). This is in agreement with in vitro studies looking at clot formation and propagation (Runyon et al., 2007). We calculate this threshold for the clotted region of MA1 to be ∼15 s<sup>−1</sup> based on the previously described AOSLO delineation. In **Supplementary Movie 1**, we show the variation in the velocity field inside MA1 throughout the cardiac cycle and, color-coded in yellow, the regions of the MA experiencing a shear rate smaller than or equal to 15 s<sup>−1</sup>. Interestingly, we observe how MA regions adjacent to the clotted part fall below the threshold following systole, when flow in the MA slows down (hence the yellow color appears and disappears in this region).
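
The idea of a sustained sub-threshold shear rate can be made concrete by measuring, for a given location, the fraction of the cardiac cycle spent at or below the threshold. The sketch below uses the 15 s<sup>−1</sup> value shown in the movie; the function name and the three shear rate traces are hypothetical.

```python
def fraction_below_threshold(shear_rates, threshold=15.0):
    """Fraction of time samples with shear rate <= threshold (s^-1)."""
    if not shear_rates:
        raise ValueError("shear_rates must be non-empty")
    below = sum(1 for sr in shear_rates if sr <= threshold)
    return below / len(shear_rates)


# Hypothetical shear rate traces (s^-1) sampled uniformly over one cycle.
traces = {
    "clotted region": [3.0, 5.0, 9.0, 12.0, 8.0, 4.0],
    "adjacent region": [10.0, 22.0, 30.0, 18.0, 12.0, 9.0],
    "feeding vessel": [150.0, 240.0, 310.0, 220.0, 170.0, 140.0],
}

fractions = {name: fraction_below_threshold(sr) for name, sr in traces.items()}
```

In this toy example the clotted region stays below the threshold for the whole cycle, the adjacent region only for part of it, and the feeding vessel never, mirroring the qualitative pattern described for MA1.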

Based on this observation, we postulate that a clot can propagate over time in areas where shear rate remains under threshold. We selected two MAs from the same eye for follow-up, one partially clotted at the time of baseline imaging (MA1) and another fully perfused (MA4). After 15 months of follow-up, the body of MA1 appeared to become non-perfused with persistent blood flow through a central vessel (**Figure 13**). Interestingly, the shape of MA4 remained unchanged and no clot development was observed.

FIGURE 11 | Shear rate mean drop (SRMD, left) and wall shear stress mean drop (WSSMD, right) in partially perfused MAs by clotted/unclotted regions.

MAs studied. Regression lines and associated correlation coefficients are given to demonstrate the good correlation between the pairs of variables. The marginal histograms in each of the plots present the distribution of each of the variables studied.

## DISCUSSION AND CONCLUSIONS

In the current work, we propose 4 novel indices for the classification and study of retinal MAs. Two of them are structural (asymmetry ratio, AR and body-to-neck ratio, BNR), and the other two describe the hemodynamic environment of the MA (shear rate mean drop, SRMD and wall shear stress mean drop, WSSMD). The limitations of the CFD methodology include the assumption of rotational symmetry in the MA surface reconstruction and the use of non-patient-specific boundary conditions. We calculated these indices in a set of 20 retinal MAs imaged with AOSLO. Our aim is to develop a method capable of establishing which MA characteristics are associated with a higher risk of leakage or clotting.

The data demonstrate that the proposed AR index is highly correlated with the qualitative MA classification of being either saccular or fusiform as performed by trained graders. The area calculated from the en face AOSLO projection of the MA body volume was found to be smaller in the saccular MAs studied compared to fusiform MAs.

FIGURE 13 | Saccular (A) and fusiform (B) MAs from the same eye of a patient with severe NPDR. The shape and the perfusion status of the saccular MA changed dramatically, whereas the fusiform MA's shape and perfusion status was maintained during the 15-month non-treatment period.

It remains elusive why only some MAs are associated with retinal edema due to the disruption of endothelial cell barrier function. Previous work has linked abnormal WSS levels to endothelial cell dysfunction and changes in permeability (Tarbell, 2010). In the current work, we have proposed a method for the quantification of the changes in WSS experienced by the cells lining the MAs. Our results show a consistent WSS reduction with up to one order of magnitude difference among all cases (7- vs. 78-fold reduction). In future work, we will investigate the association between WSSMD and clinically observed MA leakage in longitudinal datasets. Furthermore, we shall investigate associations between the changes in WSSMD throughout the cardiac cycle and MA outcomes as changes in hemodynamic frequency have been shown to regulate pathologic phenotypes in endothelial cells (Feaver et al., 2012).

Previous studies have described and quantified the dynamic turnover of MAs in retinal vasculature (Goatman et al., 2003; Bernardes et al., 2009). In the current work, we took advantage of high resolution AOSLO imaging to observe partially clotted MAs. Five out of 20 MAs presented clots. All the partially clotted cases were of saccular type. Therefore, asymmetry appeared to play a role in clotting. On one occasion, we could observe thrombosis of the MA body and remodeling of the affected capillary. Based on previous reports of the relationship between hemodynamics and blood clotting (Runyon et al., 2007) and thrombosis of vascular lesions (Ribeiro de Sousa et al., 2016), we studied SRMD and WSSMD in the MAs prior to clot development and identified a statistically significant increase of both indices (i.e., a reduction in shear rate and wall shear stress) in the regions that would subsequently develop clots. Taken together, these results are consistent with the hypothesis that MA asymmetry promotes MA thrombosis through the well-characterized mechanism of blood clotting at low shear stress.

We anticipate that this work will shed light on the assessment of the dynamic processes of retinal MA development, clotting, and regression. We believe the proposed indices can be exploited as biomarkers for vascular stability and DR disease progression. In future work, we will quantify this relationship and establish WSSMD/SRMD thresholds that facilitate the prediction of MA progression on a lesion-specific basis, as well as their relationship with MA leakage.

### ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the institutional review board of the Joslin Diabetes Center with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the institutional review board of the Joslin Diabetes Center.

#### AUTHOR CONTRIBUTIONS

MB, YL, LA, and JS designed research. MB, YL, and OA-Q performed research. MB, YL, OA-Q, and JS analyzed data. MB, YL, OA-Q, LA, and JS wrote the manuscript.

### FUNDING

National Eye Institute (1R01EY0-24702-01, 2R44EY-16295-04A1); Juvenile Diabetes Research Foundation (2-SRA-2014-264-M-R, 17-2011-359); Eleanor Chesterman Beatson Childcare Ambassador Program Foundation Grant; Massachusetts Lions Eye Research Fund; Engineering and Physical Sciences Research Council (EP/L00030X/1); Fondation Leducq (17 CVD 03). Funding support for author OA-Q comes from the Dubai Harvard Foundation for Medical Research.

#### ACKNOWLEDGMENTS

BEI research study coordinators (mydriasis), technicians. The authors acknowledge the contributions of the HemeLB development team. This work used the ARCHER UK National Supercomputing Service (http://www.archer.ac.uk).

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fphys.2018.00989/full#supplementary-material



**Conflict of Interest Statement:** JS received research support from Boston Micromachines Corp.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Bernabeu, Lu, Abu-Qamar, Aiello and Sun. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Parameter Estimation of Platelets Deposition: Approximate Bayesian Computation With High Performance Computing

Ritabrata Dutta<sup>1</sup>, Bastien Chopard<sup>2</sup>\*, Jonas Lätt<sup>2</sup>, Frank Dubois<sup>3</sup>, Karim Zouaoui Boudjeltia<sup>4</sup> and Antonietta Mira<sup>1,5</sup>

<sup>1</sup> Institute of Computational Science, Università della Svizzera italiana, Lugano, Switzerland, <sup>2</sup> Computer Science Department, University of Geneva, Geneva, Switzerland, <sup>3</sup> Microgravity Research Centre, Université libre de Bruxelles (ULB), Brussels, Belgium, <sup>4</sup> Laboratory of Experimental Medicine (ULB 222 Unit), Université Libre de Bruxelles and CHU de Charleroi, Brussels, Belgium, <sup>5</sup> Department of Science and High Technology, Università degli Studi dell'Insubria, Varese, Italy

#### Edited by:

Rajat Mittal, Johns Hopkins University, United States

#### Reviewed by:

Jacopo Biasetti, CorWave SA, France; Lucy T. Zhang, Rensselaer Polytechnic Institute, United States

> \*Correspondence: Bastien Chopard Bastien.Chopard@unige.ch

#### Specialty section:

This article was submitted to Computational Physiology and Medicine, a section of the journal Frontiers in Physiology

Received: 29 September 2017 Accepted: 27 July 2018 Published: 20 August 2018

#### Citation:

Dutta R, Chopard B, Lätt J, Dubois F, Zouaoui Boudjeltia K and Mira A (2018) Parameter Estimation of Platelets Deposition: Approximate Bayesian Computation With High Performance Computing. Front. Physiol. 9:1128. doi: 10.3389/fphys.2018.01128

Cardio/cerebrovascular diseases (CVD) have become one of the major health issues in our societies. Recent studies show that the existing clinical tests to detect CVD are ineffectual, as they do not consider the different stages of platelet activation or the molecular dynamics involved in platelet interactions. Further, they are incapable of considering inter-individual variability. A physical description of platelets deposition was introduced recently in Chopard et al. (2017), by integrating fundamental understandings of how platelets interact in a numerical model, parameterized by five parameters. These parameters specify the deposition process and are relevant for a biomedical understanding of the phenomena. One of the main intuitions is that these parameters are precisely the information needed for a pathological test identifying CVD and that they capture the inter-individual variability. Following this intuition, here we devise a Bayesian inferential scheme for estimation of these parameters, using experimental observations, at different time intervals, on the average size of the aggregation clusters, their number per mm<sup>2</sup>, the number of platelets, and the ones activated per µℓ still in suspension. As the likelihood function of the numerical model is intractable due to the complex stochastic nature of the model, we use a likelihood-free inference scheme, approximate Bayesian computation (ABC), to calibrate the parameters in a data-driven manner. As ABC requires the generation of many pseudo-data by expensive simulation runs, we use a high performance computing (HPC) framework for ABC to make the inference possible for this model. We consider a collective dataset of seven volunteers and use this inference scheme to get an approximate posterior distribution and the Bayes estimate of these five parameters. The mean posterior prediction of the platelet deposition pattern matches the experimental dataset closely with a tight posterior prediction error margin, justifying our main intuition and providing a methodology to infer these parameters given patient data. The present approach can be used to build a new generation of personalized platelet functionality tests for CVD detection, using numerical modeling of platelet deposition, Bayesian uncertainty quantification, and high performance computing.

Keywords: platelet deposition, numerical model, Bayesian inference, approximate Bayesian computation, high performance computing

### 1. INTRODUCTION

Blood platelets play a major role in the complex process of blood coagulation, involving adhesion, aggregation, and spreading on the vascular wall to stop a hemorrhage while avoiding vessel occlusion. Platelets also play a key role in the occurrence of cardio/cerebro-vascular accidents that constitute a major health issue in our societies. In 2015, cardiovascular diseases (CVD), including disorders of the heart and blood vessels, were the leading cause of mortality worldwide, causing 31% of deaths (Organization, 2015). Antiplatelet therapy generally reduces complications in patients undergoing arterial intervention (Mehta et al., 2001; Steinhubl et al., 2002). However, the individual response to dual antiplatelet therapy is not uniform, and studies have consistently reported recurrences of atherothrombotic events even under antiplatelet therapy (Matetzky et al., 2004; Gurbel et al., 2005; Geisler et al., 2006; Hochholzer et al., 2006; Marcucci et al., 2009; Price et al., 2008; Sibbing et al., 2009). In most cases, a standard posology is prescribed to patients, which does not take into account the inter-individual variability linked to the absorption or the effectiveness of these molecules. This was supported by a recent study (Koltai et al., 2017), reporting the high patient-dependency of the response to antithrombotic drugs. We should also note that the evaluation of the response to a treatment by the existing tests is test-dependent.

Nowadays, platelet function testing is performed either as an attempt to monitor the efficacy of anti-platelet drugs or to determine the cause of abnormal bleeding or prothrombotic status. The most common method consists of using an optical aggregometer that measures the transmittance of light passing through plasma rich in platelets (PRP) or whole blood (Born and Cross, 1963; Harrison, 2009), to evaluate how platelets tend to aggregate. Other aggregometers determine the amount of aggregated platelets by electric impedance (Velik-Salchner et al., 2008) or luminescence. In specific contexts, flow cytometry (Michelson et al., 2002) is also used to assess platelet reactivity (VASP test; Bonello et al., 2009). The determination of platelet function using these different existing techniques in patients undergoing coronary stent implantation has been evaluated in Breet et al. (2010), which shows that the correlation between the clinical biological measures and the occurrence of a cardiovascular event was null for half of the techniques and rather modest for the others. This may be due to the fact that no current test allows the analysis of the different stages of platelet activation or the prediction of the in vivo behavior of those platelets (Picker, 2011; Koltai et al., 2017). It is well-known that the phenomenon of platelet margination (the process of bringing platelets to the vascular wall) is dependent on the number and shape of red blood cells and their flow (Piagnerelli et al., 2007), creating different pathologies for different diseases (e.g., diabetes, End Renal Kidney Disease, hypertension, sepsis). Further, platelet margination is also known to be influenced by the aspect ratio of surrogate platelet particles (Reasor et al., 2013). Although a lot of data has been reported by recent research works (Maxwell et al., 2007) on the molecules involved in platelet interactions, these studies indicate that there is a lack of knowledge on some fundamental mechanisms that should be revealed by new experiments.

Hence, the challenge is to find parameters connecting the dynamic processes of adhesion and aggregation of platelets to the data collected from the individual patients. Recently, by combining digital holography microscopy (DHM) and mathematical modeling, Chopard et al. (2015), Boudjeltia et al. (2015), and Chopard et al. (2017) provided a physical description of the adhesion and aggregation of platelets in the Impact-R device. A numerical model was developed that quantitatively describes how platelets in a shear flow adhere and aggregate on a deposition surface. This is the first innovation in understanding the molecular dynamics involved in platelet interactions. Five parameters specify the deposition process and are relevant for a biomedical understanding of the phenomena. One of the main intuitions is that the values of these parameters (e.g., adhesion and aggregation rates) are precisely the information needed to assess various possible pathological situations and quantify their severity regarding CVD. Further, it was shown in Chopard et al. (2017) that, by hand-tuning the parameters of the mathematical model, the deposition patterns observed for a set of healthy volunteers in the Impact-R can be reproduced.

Assuming that these parameters can determine the severity of CVD, how do we estimate the adhesion and aggregation rates of given patients by a clinical test? The determination of these adhesion and aggregation rates by hand-tuning is clearly not a solution, as we need to search the high-dimensional parameter space of the mathematical model, which becomes extremely expensive and time consuming. We further note that this search has to be repeated for each patient and thus requires a powerful numerical approach. In this work, we resolve the question of estimating the parameters using Bayesian uncertainty quantification. Due to its complex stochastic nature, the numerical model for platelet deposition does not have a tractable likelihood function. We use Approximate Bayesian Computation (ABC), a likelihood-free inference scheme, with an optimal application of HPC (Dutta et al., 2017a) to provide a Bayesian way to estimate adhesion and aggregation rates given the deposition patterns observed in the Impact-R of platelets collected from a patient. Obviously, the clinical applicability of the proposed technique to provide a new platelet function test remains to be explored, but the numerical model (Chopard et al., 2017) and the inference scheme proposed here bring the technical elements together to build a new class of medical tests.

In section 2 we introduce the necessary background knowledge about the platelet deposition model, whereas section 3 recalls the concept of Bayesian inference and introduces the HPC framework of ABC used in this study. Then, in section 4, we illustrate the results of the parameter determination for the platelet deposition model using the ABC methodology, collectively for seven patients. Clearly, the same methodology can be used to determine the parameter values for each individual patient in a similar manner for a CVD clinical test. Finally, in section 5 we conclude the paper and discuss its impact from a biomedical perspective.

### 2. BACKGROUND AND SCIENTIFIC RELEVANCE

The Impact-R (Shenkman et al., 2008) is a well-known platelet function analyzer. It is a cylindrical device filled with whole blood from a donor. Its lower end is a fixed disk, serving as a deposition surface, on which platelets adhere and aggregate. The upper end of the Impact-R cylinder is a rotating cone, creating an adjustable shear rate in the blood. Due to this shear rate, platelets move toward the deposition surface, where they adhere or aggregate. Platelets aggregate next to already deposited platelets, or on top of them, thus forming clusters whose size increases with time. This deposition process has been successfully described with a mathematical model in Chopard et al. (2015); Chopard et al. (2017).

The numerical model (coined M in what follows) requires five parameters that specify the deposition process and are relevant for a bio-medical understanding of the phenomena. In short, the blood sample in the Impact-R device contains an initial number Nplatelet(0) of non-activated platelets per µℓ and a number Nact−platelet(0) of pre-activated platelets per µℓ. Initially, both types of platelets are assumed to be uniformly distributed within the blood. Due to the process known as shear-induced diffusion, platelets hit the deposition surface. Upon such an event, an activated platelet will adhere with a probability that depends on its adhesion rate, pAd, that we would like to determine. Platelets that have adhered on the surface are the seed of a cluster that can grow due to the aggregation of the other platelets reaching the deposition surface. We denote by pAg the rate at which new platelets will deposit next to an existing cluster. We also introduce pT, the rate at which platelets deposit on top of an existing cluster. An important observation made in Chopard et al. (2015); Chopard et al. (2017) is that albumin, which is abundant in blood, competes with platelets for deposition. This observation is compatible with results reported in different experimental settings (Sharma et al., 1981; Remuzzi and Boccardo, 1993; Fontaine et al., 2009). As a consequence, the number of aggregation clusters and their size tend to saturate as time goes on, even though there is still a large number of platelets in suspension in the blood.

To describe this process in the model, two extra parameters, pF, the deposition rate of albumin, and aT, a factor that accounts for the decrease of platelets adhesion and aggregation on locations where albumin has already deposited, were introduced. The numerical model is described in full detail in Chopard et al. (2015); Chopard et al. (2017). Here we simply repeat the main elements. Due to the mixing in the horizontal direction, it was assumed that the activated platelets (AP), non-activated platelets (NAP) and albumin (Al) in the bulk can be described by a 1D diffusion equation along the vertical axis z

$$
\partial\_t \rho = D \partial\_z^2 \rho \qquad J = -D \text{grad}\,\rho \tag{1}
$$

where ρ is the density of either AP, NAP or Al, and J and D are, respectively, the flux of particles and the shear-induced diffusion coefficient. Upon reaching a boundary layer above the deposition substrate, adhesion and aggregation will take place according to

$$
\dot{N} = -J(0, t)\Delta S - p\_d N(t) \tag{2}
$$

where N is the number of particles in the boundary layer, ΔS a surface element on the deposition surface, and pd is the deposition rate, which evolves during time and varies across the substrate, according to the deposition history. For the deposition process, particles are considered as discrete entities that can attach to any position of the grid representing the deposition surface, as sketched in **Figure 1**. In this figure, the gray levels illustrate the density of albumin already deposited in each cell. The picture also illustrates the adhesion, aggregation, and vertical deposition along the z-axis. On the left panel, activated platelets (gray side disks) deposit first. Then in the second panel, non-activated platelets (white side disks) aggregate next to an already formed cluster. Both pre-activated and non-activated platelets can deposit on top of an existing cluster.

The deposition rules are the following. An albumin that reaches the substrate at time t deposits with a probability P(t) which depends on the local density ρal(t) of already deposited Al. We assume that P is proportional to the remaining free space in the cell,

$$P(t) = p\_F(\rho\_{\text{max}} - \rho\_{al}(t)),\tag{3}$$

where pF is a parameter and ρmax is determined by the constraint that at most 100,000 albumin particles can fit in a deposition cell of area ΔS = 5 µm<sup>2</sup>, corresponding to the size of a deposited platelet (obtained as the smallest variation of cluster area observed with the microscope).

An activated platelet that hits a platelet-free cell deposits with a probability Q, where Q decreases as the local concentration ρal of albumin increases. We assumed that

$$Q = p\_{Ad} \exp(-a\_T \rho\_{al}),\tag{4}$$

where pAd and aT are parameters. This expression can be justified by the fact that a platelet needs more free space than an albumin to attach to the substrate, due to their size difference. In other words, the probability of having enough space for a platelet decreases roughly exponentially with the density of albumin in the substrate. This can be validated with a simple deposition model on a grid, where small and large objects compete for deposition.

Once an activated platelet has deposited, it is the seed of a new cluster that grows further due to the aggregation of further platelets. In our model, AP and NAP can deposit next to already deposited platelets. From the above discussion, the aggregation probability R is assumed to be

$$R = p\_{A\text{g}} \exp(-a\_T \rho\_{al}),\tag{5}$$

with pAg another parameter.
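The three deposition rules of Equations (3)-(5) can be written down directly. The following is a minimal sketch, not the authors' implementation: the function names and all numerical parameter values are hypothetical, and ρal is assumed to be counted in albumin particles per cell so that ρmax equals the 100,000-particle capacity quoted above.

```python
import math

# Assumption: rho_al is counted in albumin particles per deposition cell, so
# rho_max is the 100,000-particle capacity quoted in the text.
RHO_MAX = 100_000


def albumin_deposition_prob(p_F, rho_al, rho_max=RHO_MAX):
    """Equation (3): proportional to the free space left in the cell."""
    return p_F * (rho_max - rho_al)


def adhesion_prob(p_Ad, a_T, rho_al):
    """Equation (4): adhesion of an activated platelet on a platelet-free cell."""
    return p_Ad * math.exp(-a_T * rho_al)


def aggregation_prob(p_Ag, a_T, rho_al):
    """Equation (5): aggregation next to an already deposited platelet."""
    return p_Ag * math.exp(-a_T * rho_al)


# Hypothetical parameter values, chosen only to exercise the functions.
p_F, p_Ad, p_Ag, a_T = 1e-6, 0.5, 0.2, 1e-4
for rho_al in (0, 10_000, 50_000, RHO_MAX):
    print(rho_al,
          albumin_deposition_prob(p_F, rho_al),
          adhesion_prob(p_Ad, a_T, rho_al),
          aggregation_prob(p_Ag, a_T, rho_al))
```

As albumin fills a cell, all three probabilities decrease: the albumin rule linearly down to zero at full capacity, the two platelet rules exponentially.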

The above deposition probabilities can also be expressed as deposition rates over the given simulation time step Δt = 0.01 s (see Chopard et al., 2017 for details), hence giving a way to couple the diffusion Equation (1) with the 2D discrete deposition process sketched in **Figure 1**. Particles that did not deposit at time t are re-injected in the bulk and contribute to the boundary condition of Equation (1) at z = 0.
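To illustrate the bulk transport of Equation (1), the sketch below advances a 1D density profile with an explicit finite-difference step. It is an illustration under stated assumptions only: the discretization scheme, the zero-flux boundaries (the actual model couples z = 0 to the deposition process), and the values of `D` and `dz` are hypothetical; only the time step of 0.01 s is taken from the text.

```python
import numpy as np


def diffuse_step(rho, D, dz, dt):
    """One explicit step of d(rho)/dt = D d2(rho)/dz2 with zero-flux ends."""
    r = D * dt / dz**2
    if r > 0.5:
        raise ValueError("explicit scheme is unstable for D*dt/dz**2 > 0.5")
    # Edge padding mirrors the end values, which enforces zero flux there.
    padded = np.pad(rho, 1, mode="edge")
    return rho + r * (padded[2:] - 2.0 * rho + padded[:-2])


# Hypothetical bulk parameters (arbitrary units); dt is the 0.01 s step
# quoted in the text.
D = 1.0e-3
dz = 0.1
dt = 0.01

rho = np.zeros(50)
rho[25] = 1.0  # all particles start in one cell
for _ in range(1000):
    rho = diffuse_step(rho, D, dz, dt)

print(rho.sum(), rho.max())
```

With zero-flux boundaries the total number of particles is conserved, which is a convenient sanity check before a sink term for deposition is added at z = 0.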

To the best of our knowledge, except for Chopard et al. (2015); Chopard et al. (2017) there is no model in the literature that describes quantitatively the proposed in-vitro experiment. The closest approach is that of Affeld et al. (2013) but albumin is not included, and the role of pre-activated and non-activated platelets is not differentiated. Also, we are not aware of any other study than ours that reports both the amount of platelets in suspension as a function of time and those on the deposition surface.

The validity of the proposed numerical model has been explored in detail in Chopard et al. (2017). This validation is based on the fact that the model, using hand-tuned parameters, can reproduce the time-dependent experimental observations very well. We refer the readers to Chopard et al. (2017) for a complete discussion. Here we briefly recall the main elements that demonstrate the excellent agreement of the model and the simulations. We reproduce **Figure 2** from Chopard et al. (2017), showing the visual similarity between the actual and simulated deposition pattern. In the validation study, the evolution of the number of clusters, their average size and the numbers of pre-activated and non-activated platelets still in suspension matched quantitatively with the experimental measurements at times 20, 60, 120, and 300 s. In addition, a very good agreement between the simulated deposition pattern and the experiment was also found by comparing the distributions of the areas and volumes of the aggregates.

Note that the validation reported in Chopard et al. (2017) was done using manually estimated parameters. As the main goal of this research is to propose an inference scheme to learn the parameters in a data-driven manner, a validation for the model and the inference scheme is reported in **Figure 6** below, using the inferred posterior distribution, which also includes a quantification of the prediction error.

For the purpose of the present study, the model M is parametrized in terms of the five quantities introduced above, namely the adhesion rate pAd, the aggregation rates pAg and pT, the deposition rate of albumin pF, and the attenuation factor aT. Some additional parameters of the model, specifically the shear-induced diffusion coefficient and the thickness of the boundary layer (Chopard et al., 2017), are assumed here to be known. Collectively, we define

$$\boldsymbol{\theta} = (p\_{Ag}, p\_{Ad}, p\_T, p\_F, a\_T).$$

If the initial values for Nplatelet(0) and Nact−platelet(0), as well as the concentration of albumin, are known from the experiment, we can forward simulate the deposition of platelets over time using model M for the given values of these parameters θ = θ<sup>∗</sup>:

$$\mathcal{M}[\boldsymbol{\theta} = \boldsymbol{\theta}^{\*}] \to \left\{ \left( \mathcal{S}\_{\text{agg-clust}}(t), \mathcal{N}\_{\text{agg-clust}}(t), \mathcal{N}\_{\text{platelet}}(t), \mathcal{N}\_{\text{act-platelet}}(t) \right) : t = 0, \dots, T \right\}, \tag{6}$$

where Sagg−clust(t), Nagg−clust(t), Nplatelet(t), and Nact−platelet(t) are, respectively, the average size of the aggregation clusters, their number per mm<sup>2</sup>, and the number of non-activated and pre-activated platelets per µℓ still in suspension at time t.

The Impact-R experiments have been repeated with the whole blood obtained from seven donors and the observations were made at times 0, 20, 60, 120, and 300 s. At these five time points, (Sagg−clust(t), Nagg−clust(t), Nplatelet(t), Nact−platelet(t)) are measured. Let us call the observed dataset collected through experiment

$$\boldsymbol{x}^{0} \equiv \left\{ \left( \mathcal{S}^{0}\_{\text{agg-clust}}(t), \mathcal{N}^{0}\_{\text{agg-clust}}(t), \mathcal{N}^{0}\_{\text{platelet}}(t), \mathcal{N}^{0}\_{\text{act-platelet}}(t) \right) : t = 0\ \text{s}, \dots, 300\ \text{s} \right\}. \tag{7}$$

By comparing the number and size of the deposition aggregates obtained from the in-vitro experiments with the computational results obtained by forward simulation from the numerical model (see **Figure 2** for an illustration), the model parameters were manually calibrated by a trial and error procedure in Chopard et al. (2017). Due to the complex nature of the model and the high-dimensional parameter space, this manual determination of the parameter values is subjective and time consuming.

However, if the parameters of the model could be learned more rigorously with an automated data-driven methodology, we could immensely improve the performance of these models and bring this scheme forward as a new clinical test for platelet functions. To this aim, here we propose to use ABC for Bayesian inference of the parameters. By applying Bayesian inference in this context, we can not only estimate the model parameters automatically and efficiently, but also perform parameter uncertainty quantification in a statistically sound manner and determine whether the provided solution is unique.

#### 3. BAYESIAN INFERENCE

We can quantify the uncertainty of the unknown parameter θ by a posterior distribution p(θ|x) given the observed dataset x = x<sup>0</sup>. The posterior distribution is obtained by Bayes' theorem as

$$p(\boldsymbol{\theta}|\boldsymbol{x}) = \frac{\pi(\boldsymbol{\theta})p(\boldsymbol{x}|\boldsymbol{\theta})}{m(\boldsymbol{x})},\tag{8}$$

where π(θ), p(x|θ) and m(x) = ∫ π(θ)p(x|θ)dθ are, respectively, the prior distribution on the parameter θ, the likelihood function, and the marginal likelihood. The prior distribution π(θ) provides a way to inform the learning of the parameters with prior knowledge, which is commonly available from medical knowledge of cardiovascular diseases. If the likelihood function can be evaluated, at least up to a normalizing constant, then the posterior distribution can be approximated by drawing a sample of parameter values from the posterior distribution using (Markov chain) Monte Carlo sampling schemes (Robert and Casella, 2005). For the simulator-based models considered in section 2, the likelihood function is difficult to compute as it requires solving a very high dimensional integral. In subsection 3.1, we illustrate how ABC performs Bayesian inference for models whose likelihood function is not available in closed form or is not feasible to compute.

### 3.1. Approximate Bayesian Computation

ABC allows us to draw samples from the approximate posterior distribution of the parameters of simulator-based models in the absence of a likelihood function, and hence to perform approximate statistical inference (e.g., point estimation, hypothesis testing, model selection etc.) in a data-driven manner. In a fundamental Rejection ABC scheme, we simulate from the model M(θ) a synthetic dataset x<sup>sim</sup> for a parameter value θ and measure the closeness between x<sup>sim</sup> and x<sup>0</sup> using a pre-defined discrepancy function d(x<sup>sim</sup>, x<sup>0</sup>). Based on this discrepancy measure, ABC accepts the parameter value θ when d(x<sup>sim</sup>, x<sup>0</sup>) is less than a pre-specified threshold value ε.
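The Rejection ABC scheme described above can be sketched in a few lines. The example below is a toy illustration, not the platelet model and not the ABCpy API: the simulator (a Gaussian with unknown mean), the uniform prior, the discrepancy (absolute difference of sample means), and the threshold are all hypothetical choices made for the sake of a runnable example.

```python
import numpy as np


def rejection_abc(simulate, sample_prior, distance, x_obs, eps, n_samples, rng,
                  max_tries=200_000):
    """Keep parameter draws whose simulated data lie within eps of x_obs."""
    accepted = []
    for _ in range(max_tries):
        theta = sample_prior(rng)          # draw a candidate from the prior
        x_sim = simulate(theta, rng)       # forward-simulate a synthetic dataset
        if distance(x_sim, x_obs) < eps:   # accept if close enough to the data
            accepted.append(theta)
            if len(accepted) == n_samples:
                break
    return np.array(accepted)


# Toy stand-ins (hypothetical): Gaussian simulator with unknown mean theta.
def simulate(theta, rng):
    return rng.normal(theta, 1.0, size=50)


def sample_prior(rng):
    return rng.uniform(-5.0, 5.0)


def distance(x_a, x_b):
    return abs(x_a.mean() - x_b.mean())


rng = np.random.default_rng(0)
x_obs = simulate(2.0, rng)  # "observed" data generated with theta = 2
posterior = rejection_abc(simulate, sample_prior, distance, x_obs,
                          eps=0.2, n_samples=200, rng=rng)
print(posterior.mean(), posterior.std())
```

Most prior draws are rejected here, which is the inefficiency that motivates the more elaborate ABC algorithms when each forward simulation is expensive.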

As the Rejection ABC scheme is computationally inefficient, a large group of ABC algorithms has been developed to explore the parameter space more efficiently (Marin et al., 2012). As pointed out in Dutta et al. (2017a), these ABC algorithms consist of four fundamental steps:


These four steps are repeated until the weighted set of parameters, interpreted as the approximate posterior distribution, is "sufficiently close" to the true posterior distribution. The steps (1) and (4) are usually quite fast, compared to steps (2) and (3), which are the computationally expensive parts.

These ABC algorithms can be generally classified into two groups based on the decision rule in step (2). In the first group, we simulate x<sup>sim</sup> using the perturbed parameter and accept it if d(x<sup>sim</sup>, x<sup>0</sup>) < ε, an adaptively chosen threshold. Otherwise we continue until we get an accepted perturbed parameter. For the second group of algorithms, we do not have this "explicit acceptance" step but rather a probabilistic one. Here we accept the perturbed parameter with a probability that depends on ε; if it is not accepted, we keep the present value of the parameter. The algorithms belonging to the "explicit acceptance" group are RejectionABC (Tavaré et al., 1997) and PMCABC (Beaumont, 2010), whereas the algorithms in the "probabilistic acceptance" group are SMCABC (Del Moral et al., 2012), RSMCABC (Drovandi and Pettitt, 2011), APMCABC (Lenormand et al., 2013), SABC (Albert et al., 2015), and ABCsubsim (Chiachio et al., 2014). For an "explicit acceptance" to occur, it may take different amounts of time for different perturbed parameters (more repeated steps are needed if the proposed parameter value is distant from the true parameter value). Hence the first group of algorithms is inherently imbalanced. We notice that an ABC algorithm with "probabilistic acceptance" does not have a similar issue of imbalance, as a probabilistic acceptance step takes approximately the same amount of time for each parameter.

The generation of x<sup>sim</sup> from the model, for a given parameter value, usually takes up huge amounts of computational resources (e.g., 10 min for the platelets deposition model in this paper). Hence, we want to choose an algorithm that converges quickly to the posterior distribution with a minimal number of required forward simulations. For this work we choose Simulated Annealing ABC (SABC), which uses a probabilistic decision rule in Step (2) and needs fewer forward simulations than other algorithms, as shown in Albert et al. (2015). As all tasks of SABC in Step (2) can be run independently, in our recent work (Dutta et al., 2017a) we have adapted SABC for HPC environments. Our implementation is available in the Python package ABCpy and shows linear scalability.

We further note that the parallelization schemes in ABCpy were primarily meant for inferring parameters from models for which forward simulation takes almost equal time for any value of θ. Due to the complex stochastic nature of the numerical model, the forward simulation time for different values of θ can be quite variable. To solve this imbalance in the forward simulation, in addition to the imbalance reported for ABC algorithms, we use a new dynamic allocation scheme for MPI developed in Dutta et al. (2017b).

#### 3.2. Dynamic Allocation for MPI

Here we briefly discuss how a dynamic allocation strategy for map-reduce provides better balancing of ABC algorithms compared to a straightforward allocation approach.

In the straightforward approach, the allocation scheme initially distributes m tasks to n executors, sends the map function to each executor, which in turn applies the map function, one after the other, to its m/n map tasks. This approach is visualized in **Figure 3**, where a chunk represents the set of m/n map tasks. For example, if we want to draw 10,000 samples from the posterior distribution and we have n = 100 cores available, at each step of SABC we create groups of 100 parameters and each group is assigned to one individual core.

On the other hand, the dynamic allocation scheme initially distributes k < m tasks to the k executors, sends the map function to each executor, which in turn applies it to the single task available. In contrast to the straightforward allocation, the executor requests a new map task as soon as the old one is terminated. This clearly results in a better balance of the work. The dynamic allocation strategy is an implementation of the well-known greedy algorithm for job-shop scheduling, whose overall processing time (makespan) can be shown to be at most twice the best possible makespan (Graham, 1966).
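The difference between the two allocation strategies can be illustrated with a small simulation of task run times. This is a sketch only and not the MPI implementation of ABCpy: the function names, the log-normal task durations, and the numbers of tasks and executors are hypothetical.

```python
import heapq
import random


def static_makespan(durations, n_exec):
    """Straightforward allocation: each executor gets one contiguous chunk."""
    chunk = -(-len(durations) // n_exec)  # ceil(m / n)
    return max(sum(durations[i:i + chunk])
               for i in range(0, len(durations), chunk))


def dynamic_makespan(durations, n_exec):
    """Dynamic allocation: the next task goes to the first executor to free up."""
    finish = [0.0] * n_exec  # min-heap of executor finishing times
    heapq.heapify(finish)
    for d in durations:
        earliest = heapq.heappop(finish)
        heapq.heappush(finish, earliest + d)
    return max(finish)


# Hypothetical workload: 1000 tasks with highly variable run times.
rng = random.Random(0)
durations = [rng.lognormvariate(0.0, 1.0) for _ in range(1000)]
n_exec = 20

static = static_makespan(durations, n_exec)
dynamic = dynamic_makespan(durations, n_exec)
# No schedule can beat the average load or the longest single task.
lower_bound = max(sum(durations) / n_exec, max(durations))
print(static, dynamic, lower_bound)
```

Because the optimal makespan is at least `lower_bound`, comparing `dynamic` against `2 * lower_bound` gives a quick empirical check of the factor-of-two guarantee quoted above.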

This approach is illustrated in **Figure 3**, reproduced from Dutta et al. (2017b). The unbalanced behavior is apparent if we visualize the run time of the individual map tasks on each executor. In **Figure 4**, the processing time of the individual map tasks is shown for an ABC algorithm performing inference on a weather prediction model, reported in Dutta et al. (2017b). Each row corresponds to an executor (or rank) and each bar corresponds to the total time spent on all tasks assigned to the respective rank (row) for one map call. For the straightforward allocation strategy, one can easily verify that most of the ranks finish their map tasks in half the time of the slowest rank. This clearly leads to large inefficiencies. Conversely, using the dynamic allocation strategy, the work is more evenly distributed across the ranks. The imbalance is not a problem that can be overcome easily by adding resources; rather, with an increasing number of executors, speed-up and efficiency can drop drastically compared to the dynamic allocation strategy. For a detailed description and comparison, we direct readers to Dutta et al. (2017b).

#### 3.3. Posterior Inference

Using SABC within the HPC framework implemented in ABCpy (Dutta et al., 2017a), we draw Z = 5000 samples approximating the posterior distribution p(θ|x<sup>0</sup>), while keeping all the tuning parameters of SABC fixed at the default values suggested in the ABCpy package, except the number of steps and the acceptance rate cutoff, which were chosen as 30 and 1e−4, respectively. The parallelized SABC algorithm, using HPC, makes it possible to perform the computation in 5 h [using 140 nodes with 36 cores each of the Piz Daint Cray architecture (Intel Broadwell + NVidia TESLA P100)], which would have been impossible with a sequential algorithm. To perform SABC for the platelet deposition model, the summary statistics extracted from the dataset, the discrepancy measure between the summary statistics, the prior distribution of the parameters, and the perturbation kernel used to explore the parameter space for inference are described next.

#### Summary Statistics

Given a dataset, x ≡ {(Sagg−clust(t), Nagg−clust(t), Nplatelet(t), Nact−platelet(t)): t = 0 s., . . . , 300 s.}, we compute an array of summary statistics.

$$\mathcal{F}: \mathbf{x} \to (\boldsymbol{\mu}, \boldsymbol{\sigma}, \mathbf{ac}, \mathbf{c}, \mathbf{cc})$$

defined as follows,


The summary statistics, described above, are chosen to capture the mean values, variances, and the intra- and inter- dependence of different variables of the time-series over time.
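As an illustration of how such an array of summaries might be computed, the sketch below uses only the Python standard library. The precise definitions are assumptions: lag-1 autocorrelation for ac, same-time pairwise correlation for c, and lag-1 pairwise cross-correlation for cc, chosen so that four variables yield 4, 6, and 6 values respectively. The function names and the toy series are hypothetical.

```python
from itertools import combinations
from statistics import fmean, pvariance


def pearson(u, v):
    """Pearson correlation of two equally long, non-constant sequences."""
    mu_u, mu_v = fmean(u), fmean(v)
    cov = sum((a - mu_u) * (b - mu_v) for a, b in zip(u, v))
    norm = (sum((a - mu_u) ** 2 for a in u) * sum((b - mu_v) ** 2 for b in v)) ** 0.5
    return cov / norm


def summary_statistics(series, lag=1):
    """series: list of 4 equally long time series, one list per variable."""
    mu = [fmean(s) for s in series]
    sigma = [pvariance(s) for s in series]
    # Assumed definition: lag-1 autocorrelation of each variable (4 values).
    ac = [pearson(s[:-lag], s[lag:]) for s in series]
    pairs = list(combinations(range(len(series)), 2))  # 6 pairs for 4 variables
    # Assumed definition: same-time correlation of each pair (6 values).
    c = [pearson(series[i], series[j]) for i, j in pairs]
    # Assumed definition: lag-1 cross-correlation of each pair (6 values).
    cc = [pearson(series[i][:-lag], series[j][lag:]) for i, j in pairs]
    return mu, sigma, ac, c, cc


# Toy time series standing in for the four observed variables.
series = [
    [float(t) for t in range(10)],
    [float(t * t) for t in range(10)],
    [10.0 - t for t in range(10)],
    [(-1.0) ** t for t in range(10)],
]
mu, sigma, ac, c, cc = summary_statistics(series)
```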

#### Discrepancy Measure

Assuming the above summary statistics contain the most essential information about the likelihood function of the simulator-based model, we compute the Bhattacharyya coefficient (Bhattacharyya, 1943) for each of the variables present in the time-series using their mean and variance, and Euclidean distances between the different inter- and intra-correlations computed over time. Finally, we take a mean of these discrepancies, such that the discrepancies between each of the summaries are equally weighted in the final discrepancy measure. The discrepancy measure between two datasets, x<sup>1</sup> and x<sup>2</sup>, can be specified as,

$$\begin{split} d(\mathbf{x}^1, \mathbf{x}^2) &\equiv d(\mathcal{F}(\mathbf{x}^1), \mathcal{F}(\mathbf{x}^2)) \\ &= \frac{1}{8} \sum\_{l=1}^4 (1 - \exp(-\rho(\mu\_l^1, \mu\_l^2, \sigma\_l^1, \sigma\_l^2))) \\ &+ \frac{1}{2} \sqrt{\frac{1}{16} \left( \sum\_{l=1}^4 (a c\_l^1 - a c\_l^2)^2 + \sum\_{l=1}^6 (c\_l^1 - c\_l^2)^2 + \sum\_{l=1}^6 (c c\_l^1 - c c\_l^2)^2 \right)}, \end{split}$$

where

$$\rho(\mu^1, \mu^2, \sigma^1, \sigma^2) = \frac{1}{4} \log\left( \frac{1}{4} \left( \frac{\sigma^1}{\sigma^2} + \frac{\sigma^2}{\sigma^1} + 2 \right) \right) + \frac{1}{4} \frac{(\mu^1 - \mu^2)^2}{\sigma^1 + \sigma^2}$$

is the Bhattacharyya coefficient (Bhattacharyya, 1943) and 0 ≤ exp(−ρ(•)) ≤ 1. Further, we notice that the value of the discrepancy measure is always bounded in the closed interval [0, 1].

#### Prior

We consider independent uniform distributions for the parameters with a pre-specified range for each of them: p<sub>Ag</sub> ∼ U(5, 20), p<sub>Ad</sub> ∼ U(50, 150), p<sub>T</sub> ∼ U(0.5e−3, 3e−3), p<sub>F</sub> ∼ U(0.1, 1.5), and a<sub>T</sub> ∼ U(0, 10).

#### Perturbation Kernel

To explore the parameter space of θ = (p<sub>Ag</sub>, p<sub>Ad</sub>, p<sub>T</sub>, p<sub>F</sub>, a<sub>T</sub>) ∈ [5, 20] × [50, 150] × [0.5e−3, 3e−3] × [0.1, 1.5] × [0, 10], we consider a five-dimensional truncated multivariate Gaussian distribution as the perturbation kernel. The SABC inference scheme centers the perturbation kernel at the sample it is perturbing and updates the variance-covariance matrix of the perturbation kernel based on the samples learned from the previous step.
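The ingredients of this section (the bounded Bhattacharyya term of the discrepancy measure, the uniform priors, and a truncated Gaussian perturbation) can be sketched with the Python standard library. This is an illustration under stated assumptions, not the ABCpy implementation: the kernel below uses independent components with hypothetical widths instead of the adaptive covariance matrix of SABC, truncation is done by rejection sampling, and σ in ρ is read as a variance.

```python
import math
import random

# Prior ranges from the text, in the order (pAg, pAd, pT, pF, aT).
BOUNDS = [(5.0, 20.0), (50.0, 150.0), (0.5e-3, 3e-3), (0.1, 1.5), (0.0, 10.0)]


def sample_prior(rng):
    """One draw from the independent uniform priors."""
    return [rng.uniform(lo, hi) for lo, hi in BOUNDS]


def in_support(theta):
    """True if every component lies inside its prior range."""
    return all(lo <= t <= hi for t, (lo, hi) in zip(theta, BOUNDS))


def perturb(theta, std, rng, max_tries=10_000):
    """Truncated Gaussian perturbation centered at theta, by rejection sampling.

    Simplification: independent components with standard deviations `std`.
    """
    for _ in range(max_tries):
        proposal = [rng.gauss(t, s) for t, s in zip(theta, std)]
        if in_support(proposal):
            return proposal
    raise RuntimeError("no proposal fell inside the prior support")


def bhattacharyya_rho(mu1, mu2, var1, var2):
    """rho of the Discrepancy Measure subsection (sigma read as a variance)."""
    return (0.25 * math.log(0.25 * (var1 / var2 + var2 / var1 + 2.0))
            + 0.25 * (mu1 - mu2) ** 2 / (var1 + var2))


rng = random.Random(1)
theta = sample_prior(rng)
# Hypothetical kernel widths: 5% of each prior range.
std = [0.05 * (hi - lo) for lo, hi in BOUNDS]
theta_new = perturb(theta, std, rng)
# One bounded term of the discrepancy, for made-up means and variances.
term = 1.0 - math.exp(-bhattacharyya_rho(1.0, 1.5, 0.2, 0.3))
```

Rejection sampling keeps only proposals inside the prior box, which yields draws from the Gaussian truncated to that box at the cost of some discarded proposals near the boundaries.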

#### 3.4. Parameter Estimation

Given the experimentally collected platelet deposition dataset x<sup>0</sup>, our main interest is to estimate a value for θ. In decision theory, the Bayes estimator minimizes the posterior expected loss, E<sub>p(θ|x<sup>0</sup>)</sub>(L(θ, •) | x<sup>0</sup>), for an already chosen loss function L. If we have Z samples (θ<sub>i</sub>)<sub>i=1</sub><sup>Z</sup> from the posterior distribution p(θ|x<sup>0</sup>), the Bayes estimator can be approximated as,

$$\hat{\boldsymbol{\theta}} = \operatorname\*{arg\,min}\_{\boldsymbol{\theta}} \frac{1}{Z} \sum\_{i=1}^{Z} \mathcal{L}(\boldsymbol{\theta}\_i, \boldsymbol{\theta}). \tag{9}$$

As we consider the Euclidean loss function L(θ, θ̂) = (θ − θ̂)<sup>2</sup>, the approximate Bayes estimator can be shown to be the posterior mean,

$$\hat{\boldsymbol{\theta}} = E\_{p(\boldsymbol{\theta}|\mathbf{x}^0)}(\boldsymbol{\theta}) \approx \frac{1}{Z} \sum\_{i=1}^{Z} \boldsymbol{\theta}\_i.$$
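The statement that the sample mean minimizes the averaged squared loss can be checked numerically. The sketch below is a minimal standard-library illustration; the four two-dimensional "posterior draws" and the function names are made up for the example.

```python
from statistics import fmean


def bayes_estimate(samples):
    """Component-wise mean of a list of parameter vectors."""
    return [fmean(column) for column in zip(*samples)]


def mean_squared_loss(candidate, samples):
    """Monte Carlo estimate of the posterior expected squared Euclidean loss."""
    return fmean(
        sum((c - s) ** 2 for c, s in zip(candidate, theta)) for theta in samples
    )


# Hypothetical posterior draws of two parameters, for illustration only.
samples = [[1.0, 10.0], [2.0, 14.0], [3.0, 12.0], [6.0, 20.0]]
theta_hat = bayes_estimate(samples)
loss_at_mean = mean_squared_loss(theta_hat, samples)
# Any other candidate gives a larger averaged loss.
loss_shifted = mean_squared_loss([theta_hat[0] + 0.1, theta_hat[1]], samples)
```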

### 4. INFERENCE ON EXPERIMENTAL DATASET


The performance of the inference scheme described in section 3 is reported here for a collective dataset created from the experimental study of platelet deposition of seven blood donors. The collective dataset was created by a simple average of Sagg−clust(t), Nagg−clust(t), Nplatelet(t), and Nact−platelet(t) over the seven donors at each time-point t. In **Figure 5**, we show the Bayes estimate (black-solid) and the marginal posterior distribution (black-dashed) of each of the five parameters computed using 5000 samples drawn from the posterior distribution p(θ|x<sup>0</sup>) using SABC. For comparison, we also plot the manually estimated values of the parameters (gray-solid) from Chopard et al. (2017). We notice that the Bayes estimates are close to the manually estimated values of the parameters and that the manually estimated values also have a high posterior probability. This shows that, by means of ABC, we can estimate the parameters of the platelet deposition model and quantify their uncertainty, obtaining estimates that are as good as the manually estimated ones, if not better.

Next we perform a posterior predictive check to validate our model and inference scheme. The main goal here is to analyze the degree to which the experimental data deviate from the data generated from the inferred posterior distribution of the parameters. Hence we want to generate data from the model using parameters drawn from the posterior distribution. To do so, we first draw 100 parameter samples from the inferred approximate posterior distribution and simulate 100 datasets, each using a different parameter sample. We call these simulated datasets the predicted dataset from our inferred posterior distribution and present the mean predicted dataset (blue-solid) compared with the experimental dataset (black-solid) in **Figure 6**. Note that since we are dealing with the posterior distribution, we can also quantify uncertainty in our predictions. We plot the 1/4-th and 3/4-th quantiles (red-dashed) and the minimum and maximum (gray-dashed) of the predicted dataset at each time-point to get a sense of the uncertainty in the prediction. Here we see a very good agreement between the mean predicted dataset and the experimentally observed one, with the 1/4-th and 3/4-th quantiles of the prediction being very tight. This shows a very good prediction performance of the numerical model of platelet deposition and the proposed inference scheme.
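The posterior predictive check described above follows a simple recipe: draw parameters, simulate, and summarize the simulations at each time-point. The sketch below illustrates it with the Python standard library. The `toy_simulator` function and the synthetic posterior draws are hypothetical stand-ins for the platelet deposition simulator and the SABC output.

```python
import random
from statistics import fmean, quantiles


def toy_simulator(theta, times, rng):
    """Hypothetical stand-in for the platelet deposition simulator."""
    rate, noise = theta
    return [rate * t + rng.gauss(0.0, noise) for t in times]


def predictive_bands(datasets):
    """Mean, quartiles, and extremes across simulated datasets at each time-point."""
    bands = []
    for values in zip(*datasets):  # all simulations at one time-point
        q1, _, q3 = quantiles(values, n=4)
        bands.append({"mean": fmean(values), "q1": q1, "q3": q3,
                      "min": min(values), "max": max(values)})
    return bands


rng = random.Random(0)
times = [0, 60, 120, 180, 240, 300]
# Synthetic posterior sample; in practice these would be the SABC draws.
posterior = [(rng.uniform(0.9, 1.1), 2.0) for _ in range(5000)]
thetas = rng.sample(posterior, 100)  # 100 parameter samples
predicted = [toy_simulator(theta, times, rng) for theta in thetas]
bands = predictive_bands(predicted)
```

Plotting `bands` against `times` together with the observed series gives the kind of comparison summarized by the mean, quartile, and min/max curves.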

Additionally, to illustrate the strength of having a posterior distribution for the parameters, we compute and show the posterior correlation matrix between the five parameters in **Figure 7**, highlighting a strong negative correlation between (p<sub>F</sub>, a<sub>T</sub>) and strong positive correlations between (p<sub>F</sub>, p<sub>Ag</sub>) and (p<sub>F</sub>, p<sub>T</sub>). A detailed investigation of this correlation structure would be needed to understand it better, but generally it may point toward: (a) the stochastic nature of the considered model for platelet deposition and (b) the fact that the deposition process is an antagonistic or synergetic combination of the mechanisms proposed in the model.

Note finally that, since the posterior distribution is the joint probability distribution of the five parameters, we can also compute any higher-order moments, skewness, etc. of the parameters for a detailed statistical investigation of the natural phenomenon.
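Both the correlation matrix and higher-order moments follow directly from the posterior samples. A minimal standard-library sketch is given below; the five two-parameter draws are invented for illustration and only mimic a negative dependence such as the one reported for (p<sub>F</sub>, a<sub>T</sub>).

```python
from statistics import fmean


def correlation_matrix(samples):
    """Pearson correlation matrix of posterior samples (rows are draws)."""
    columns = [list(col) for col in zip(*samples)]
    means = [fmean(col) for col in columns]
    centered = [[x - m for x in col] for col, m in zip(columns, means)]
    norms = [sum(x * x for x in col) ** 0.5 for col in centered]
    return [[sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
             for cj, nj in zip(centered, norms)]
            for ci, ni in zip(centered, norms)]


def skewness(values):
    """Third central moment divided by the cubed standard deviation."""
    values = list(values)
    m = fmean(values)
    m2 = fmean((x - m) ** 2 for x in values)
    m3 = fmean((x - m) ** 3 for x in values)
    return m3 / m2 ** 1.5


# Invented draws of two parameters; the second decreases as the first grows.
samples = [(0.2, 9.0), (0.5, 7.5), (0.8, 5.0), (1.1, 3.5), (1.4, 1.0)]
corr = correlation_matrix(samples)
skew_first = skewness(s[0] for s in samples)
```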

### 5. CONCLUSIONS

Here, we have demonstrated that approximate Bayesian computation (ABC) can be used to automatically explore the parameter space of the numerical model simulating the deposition of platelets subject to a shear flow, as proposed in Chopard et al. (2015) and Chopard et al. (2017). We also notice the good agreement between the manually tuned parameters and the Bayes estimates, obtained without the subjectivity and tedium of manual tuning. This approach can be applied patient per patient, in a systematic way, without the bias of a human operator. In addition, the approach is computationally fast enough to provide results in an acceptable time for contributing to a new medical diagnosis, by giving clinical information that no other known method can provide. The clinical relevance of this approach is still to be explored and our next step will be to apply our approach at a personalized level, with a cohort of patients with known pathologies. The possibility of designing a new platelet functionality test as proposed here is the result of combining different techniques: advanced microscopic observation techniques, bottom-up numerical modeling and simulations, recent data-science developments, and high performance computing (HPC).

Additionally, the ABC inference scheme provides us with a posterior distribution of the parameters given the observed dataset, which is much more informative about the underlying process. The posterior correlation structure shown in **Figure 7** may not have a direct biophysical interpretation, though it suggests some underlying and unexplored stochastic mechanism for further investigation. Finally, we note that, although the manual estimates achieve a very high posterior probability, they are different from the Bayes estimates learned using ABC. The departure reflects a different estimation of the quality of the match between experimental observation and simulation results. As the ABC algorithms are dependent on the choice of the summary statistics and the discrepancy measures, the parameter uncertainty quantified by SABC in section 4 and the Bayes estimates computed are dependent on the assumptions in section 3.3 regarding their choice. Fortunately, there are recent works on the automatic choice of summary statistics and discrepancy measures in the ABC setup (Gutmann et al., 2017), and incorporating some of these approaches in our inference scheme is a promising direction for future research in this area.



### ETHICS STATEMENT

This study conforms with the Declaration of Helsinki and its protocol was approved by the Ethics Committee of CHU de Charleroi (comité d'éthique OM008). All volunteers gave their written informed consent.

### DATA AVAILABILITY

The codes used to simulate the platelets deposition processes and to infer the process parameters from data can be downloaded from: https://github.com/eth-cscs/abcpy-models/tree/master/BiologicalScience/PlateletsDeposition.

### AUTHOR CONTRIBUTIONS

RD, BC, and AM design of the research. RD performed research. KZB and FD experimental data collection. RD and BC writing of the paper. AM, KZB, JL, FD, and AM contribution to the writing. BC and JL design and coding of the numerical forward simulation model.

### FUNDING

RD and AM are supported by Swiss National Science Foundation Grant No. 105218\_163196 (Statistical Inference on Large-Scale Mechanistic Network Models). We thank CADMOS for providing computing resources at the Swiss Super Computing Center. We acknowledge partial funding from the European Union Horizon 2020 research and innovation programme for the CompBioMed project (http://www.compbiomed.eu/) under grant agreement 675451.

### ACKNOWLEDGMENTS

We thank Dr. Marcel Schoengens, CSCS, ETH Zürich for help regarding HPC services to run ABCpy on supercomputers. We thank CHU Charleroi for supporting the experimental work used in this study.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer JB and handling Editor declared their shared affiliation.

Copyright © 2018 Dutta, Chopard, Lätt, Dubois, Zouaoui Boudjeltia and Mira. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Gaussian Process Regressions for Inverse Problems and Parameter Searches in Models of Ventricular Mechanics

Paolo Di Achille<sup>1</sup> , Ahmed Harouni <sup>2</sup> , Svyatoslav Khamzin3,4, Olga Solovyova3,4 , John J. Rice<sup>1</sup> and Viatcheslav Gurev <sup>1</sup> \*

*<sup>1</sup> Healthcare and Life Sciences Research, IBM T.J. Watson Research Center, Yorktown Heights, NY, United States, <sup>2</sup> IBM Research Almaden, San Jose, CA, United States, <sup>3</sup> Ural Federal University, Yekaterinburg, Russia, <sup>4</sup> Institute of Immunology and Physiology, Ural Branch of the Russian Academy of Sciences (UB RAS), Yekaterinburg, Russia*

#### Edited by:

*Raimond L. Winslow, Johns Hopkins University, United States*

#### Reviewed by:

*Pablo Lamata, King's College London, United Kingdom; Martyn P. Nash, University of Auckland, New Zealand*

> \*Correspondence: *Viatcheslav Gurev vgurev@us.ibm.com*

#### Specialty section:

*This article was submitted to Computational Physiology and Medicine, a section of the journal Frontiers in Physiology*

Received: *25 January 2018* Accepted: *09 July 2018* Published: *14 August 2018*

#### Citation:

*Di Achille P, Harouni A, Khamzin S, Solovyova O, Rice JJ and Gurev V (2018) Gaussian Process Regressions for Inverse Problems and Parameter Searches in Models of Ventricular Mechanics. Front. Physiol. 9:1002. doi: 10.3389/fphys.2018.01002*

Patient specific models of ventricular mechanics require the optimization of their many parameters under the uncertainties associated with imaging of cardiac function. We present a strategy to reduce the complexity of parametric searches for 3-D FE models of left ventricular contraction. The study employs automatic image segmentation and analysis of an image database to gain geometric features for several classes of patients. Statistical distributions of geometric parameters are then used to design parametric studies investigating the effects of: (1) passive material properties during ventricular filling, and (2) infarct geometry on ventricular contraction in patients after a heart attack. Gaussian Process regression is used in both cases to build statistical models trained on the results of biophysical FEM simulations. The first statistical model estimates unloaded configurations based on either the intraventricular pressure or the end-diastolic fiber strain. The technique provides an alternative to the standard fixed-point iteration algorithm, which is more computationally expensive when used to unload more than 10 ventricles. The second statistical model captures the effects of varying infarct geometries on cardiac output. For training, we designed high resolution models of non-transmural infarcts including refinements of the border zone around the lesion. This study is a first effort in developing a platform combining HPC models and machine learning to investigate cardiac function in heart failure patients with the goal of assisting clinical diagnostics.

Keywords: LV mechanics, FEM, infarct model, unloaded configuration, kriging, inverse optimization, statistical learning

### 1. INTRODUCTION

Multi-scale models of cardiac mechanics, although promising (e.g., Kerckhoffs et al., 2007; Nordsletten et al., 2011; Gurev et al., 2015; Land et al., 2017), have found limited applications for diagnosis and treatment. To reach the levels of accuracy needed to assist clinical decisions, models need to overcome major complications related to accessing clinical data, constraining unknown parameters, and coping with computational complexity. Some of the uncertainties associated with patient-specific cardiac models can be partially addressed with increased public access to large clinical datasets (Fonseca et al., 2011) and to high performance computing resources (Towns et al., 2014). Sophisticated finite element (FE) biomechanical simulations can be combined with machine learning techniques to translate parametric studies into efficient statistical models of virtual patient populations. Once an upfront computational cost is paid for training, the coupled effects of varying model parameters can be explored almost in real time, facilitating the solution of the optimization and inverse estimation problems that are required to personalize models for specific patients.

This paper discusses statistical models based on a machine learning technique called Gaussian Process (GP) regression, also known as kriging (Rasmussen and Williams, 2006). After training a "surrogate" of the more expensive FE models, GP regression can be used to assist optimization algorithms, even in complex cases where objective functionals cannot be easily differentiated (Booker et al., 1999; Abramson et al., 2009). More recently, GP regression has also been used in cardiovascular modeling, where it has found application in both fluid and solid mechanics (Marsden et al., 2008; Sankaran and Marsden, 2011; Pérez et al., 2016).

Recent developments in medical imaging techniques have opened new opportunities for cardiac modeling to augment image-based biomarkers from CT, MRI, and ultrasound scans (Lamata et al., 2014). As the accuracy and availability of imaging modalities continue to improve, there is a growing need for novel strategies that exploit the capabilities of multi-scale models to enhance diagnostic tools. We present a systematic analysis of the Sunnybrook Cardiac MRI database, a public collection of cine-MRIs (Radau et al., 2009). Statistics gathered from the database were used to design two parametric studies investigating the passive behavior of the myocardium upon inflation and the effects of infarct on cardiac performance.

In the first parametric study, we developed a novel strategy to estimate the unloaded configuration (needed to initialize both passive and active FEM simulations) given either the end-diastolic intraventricular pressure or the end-diastolic fiber strain. The new method relies on solving multiple forward problems to train a regression model from which unloaded configurations can be inferred for ventricles with arbitrary shapes. Although such a problem could alternatively be solved with the fixed-point iteration method (Sellier, 2011; Genet et al., 2015), our approach has some advantages. Specifically, our method can be easily applied in situations where the intraventricular pressure is not directly known (but could be inferred, for example, from the fiber strain), or where the unloaded geometry is one of the unknown parameters of an optimization problem.

The second example integrates machine learning and multiscale modeling in a systematic parametric study investigating the effects of infarct on simulated cardiac performance. Location, size, and transmural depth of the infarct were chosen as input variables of a GP regression model predicting changes in simulated stroke volume due to the scar. This work exploited the capabilities of our in-house solver and an automated workflow to run 40 simulations of infarcts with varying shapes and locations. After training on the results of FE simulations, the GP regression model provides a useful representation for the analysis of complex effects. Non-transmural infarcts were simulated with high numerical accuracy.

### 2. METHODS

### 2.1. Cine-MRI Segmentations and Parameterization via Idealized Models

Publicly available imaging datasets from the Sunnybrook Cardiac MRI database (Radau et al., 2009) were systematically processed to establish boundaries and proper feature distribution for parametric exploration. The Sunnybrook database gathered 45 cine-MRI scans collected from healthy subjects (N, n = 9), patients with ventricular hypertrophy (HYP, n = 12), and patients affected by heart failure both in presence and absence of myocardial infarction (HF-I, n = 12 and HF-NI, n = 12, respectively). For each scan, we considered only the short axis stack series, which provided ∼10–15 axial slices per left ventricle (LV) and 20 frames per cardiac cycle. Average voxel sizes were (1.36 ± 0.057 mm) × (1.36 ± 0.057 mm) × (8.8 ± 1.0 mm) in the left-right, anterior-posterior, and apical-basal directions, respectively.

An in-house multi-atlas image processing technique (Xie et al., 2015) was used to co-register the axial slices of each dataset and then segment the LV boundaries. The first 2 columns of **Figure 1A** show the procedure applied to a representative 3-D image from the database. Outputs were labeled voxels marking the LV blood pool (shown in white semi-transparent overlay) and the ventricular wall (shown in red). The low resolution in the apical-basal direction typical of cine-MRI short axis views introduced segmentation artifacts that prevented direct use in FEM models. We therefore performed a further parameterization step (see third column) to approximate LV geometries as truncated prolate spheroids, as initially proposed by Streeter and Hanna (1973) and more recently revisited by Pravdin et al. (2014). According to such a scheme, the endocardial and epicardial profiles of an idealized axisymmetric LV were described by the following relations

$$\begin{array}{l} \rho\_{\text{epi}} = R\_b \left[ e \cos \psi + (1 - e)(1 - \sin \psi) \right] \\ \zeta\_{\text{epi}} = Z \left( 1 - \sin \psi \right) \\ \rho\_{\text{endo}} = (R\_b - L) \left[ e \cos \psi + (1 - e)(1 - \sin \psi) \right] \\ \zeta\_{\text{endo}} = (Z - H)(1 - \sin \psi) + H \end{array} \tag{1}$$

linking the radial (ρ) and axial (ζ) coordinates of the epicardial and endocardial boundaries to the angle variable ψ ∈ [ψ<sub>0</sub>, π/2]. In the equations above, the idealized geometry is defined by 6 parameters: the outer radius at base, R<sub>b</sub>; the length of the longitudinal semi-axis of the outer spheroid, Z; the ventricular wall thicknesses at base and apex, L and H, respectively; the sphericity/conicity of the spheroid, e ∈ [0, 1]; and, finally, the truncation angle, ψ<sub>0</sub>. **Figure 1B** shows a schematic of an idealized LV annotated with geometric descriptions of the parameters.
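To make the parameterization concrete, the relations in (1) can be sampled directly. The following is a minimal standard-library sketch; the function name and the numerical parameter values are hypothetical and are not taken from the Sunnybrook statistics.

```python
import math


def lv_profiles(Rb, Z, L, H, e, psi0, n_points=50):
    """Sample the epicardial and endocardial profiles of Equation (1).

    Returns two lists of (rho, zeta) pairs for psi in [psi0, pi/2].
    """
    epi, endo = [], []
    for k in range(n_points):
        psi = psi0 + (math.pi / 2 - psi0) * k / (n_points - 1)
        shape = e * math.cos(psi) + (1 - e) * (1 - math.sin(psi))
        epi.append((Rb * shape, Z * (1 - math.sin(psi))))
        endo.append(((Rb - L) * shape, (Z - H) * (1 - math.sin(psi)) + H))
    return epi, endo


# Hypothetical parameter values (lengths in mm, angle in radians).
epi, endo = lv_profiles(Rb=35.0, Z=80.0, L=10.0, H=8.0, e=0.8, psi0=0.0)
```

With ψ<sub>0</sub> = 0 the first sample lies at the base, where the radii equal R<sub>b</sub> and R<sub>b</sub> − L, and the last sample lies on the axis at ψ = π/2, where the two profiles are separated axially by the apical thickness H.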

In order to describe the segmentation results in terms of the idealized models described above, we implemented an ad hoc optimization procedure to find sets of parameters ξ = {R<sub>b</sub>, Z, L, H, e, ψ<sub>0</sub>} that would best match the MRI segmentations (I<sub>MR</sub>). Each iteration involved first generating a binary 3-D image I<sub>ξ</sub> marking the LV volume defined by ξ, and then evaluating an objective function J defined as

$$J(I\_{\xi}, I\_{\text{MR}}) = 1 - \frac{1}{2} \left( \frac{C\_{\xi} \cap C\_{\text{MR}}}{C\_{\xi} \cup C\_{\text{MR}}} + \frac{W\_{\xi} \cap W\_{\text{MR}}}{W\_{\xi} \cup W\_{\text{MR}}} \right), \tag{2}$$

where C<sub>ξ</sub> and C<sub>MR</sub> indicate the ventricular cavity regions in the idealized and MR segmentation images, respectively; and W<sub>ξ</sub> and W<sub>MR</sub> similarly indicate corresponding ventricular wall volumes. In other words, J ∈ [0, 1] provides a measure of dissimilarity between a "synthetic" segmentation I<sub>ξ</sub> generated for any given ξ and the actual MRI processing results I<sub>MR</sub>, with J = 0 for a perfect overlap. The "Nelder-Mead" algorithm available in SciPy was used to carry out the optimization up to convergence for every image dataset included in the database.
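The objective in (2) is one minus the average intersection-over-union of the cavity and wall regions. The sketch below illustrates it with the Python standard library, representing each labeled region as a set of voxel indices; the 2-D discs are a hypothetical stand-in for the 3-D label images, and the function names are assumptions.

```python
def jaccard(a, b):
    """Intersection over union of two voxel sets (1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def objective(cavity_xi, wall_xi, cavity_mr, wall_mr):
    """Equation (2): one minus the mean overlap of cavity and wall regions."""
    return 1.0 - 0.5 * (jaccard(cavity_xi, cavity_mr) + jaccard(wall_xi, wall_mr))


def disc(radius, center=(0, 0)):
    """Hypothetical 2-D stand-in for a labeled region: voxels inside a circle."""
    cx, cy = center
    r = int(radius) + 1
    return {(x, y)
            for x in range(cx - r, cx + r + 1)
            for y in range(cy - r, cy + r + 1)
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2}


# "MR" segmentation: cavity of radius 10 inside an outer boundary of radius 15.
cavity_mr = disc(10)
wall_mr = disc(15) - cavity_mr
J_same = objective(cavity_mr, wall_mr, cavity_mr, wall_mr)

# Same shape shifted by 3 voxels: the overlap is partial, so J increases.
cavity_shifted = disc(10, (3, 0))
wall_shifted = disc(15, (3, 0)) - cavity_shifted
J_shifted = objective(cavity_shifted, wall_shifted, cavity_mr, wall_mr)
```

In a fitting loop, a function mapping ξ to this objective could be passed to `scipy.optimize.minimize` with `method="Nelder-Mead"`, which needs no derivatives and therefore suits an objective built from voxel counts.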

The relations in (1) do not include any parameters accounting for the rigid translation and rotations that LVs normally experience during a cardiac cycle. To overcome this limitation and to improve fitting results, each objective function evaluation was preceded by a rigid transformation step aimed at aligning the idealized model to the target segmented geometry. Specifically, we first estimated the main longitudinal axis of the segmented ventricle as the best-fit direction aligning the centers of gravity of the LV segmented axial slices. We then rigidly transformed the idealized models to let the longitudinal axes and the centers of gravity of the two geometries coincide. **Figure 1C** shows overlapped optimization results and corresponding MRI segmentation for a representative cine-MRI frame after rigid motion correction.

#### 2.2. Passive Material Properties

To assess whether the inverse estimation method presented in this work would generalize to describe other constitutive behaviors (e.g., from future experiments on animal and human tissues, or from novel modeling developments), we considered 3 sets of material parameters (and related functional formulations) from the literature that describe experimental findings on canine, swine, and human ventricle biomechanics. Usyk et al. (2000) fitted a Fung-type orthotropic strain energy function to experiments on canine models

$$\begin{split} W\_U = \frac{C}{2} \left( \exp(Q) - 1 \right), \quad Q &= b\_{\rm ff} E\_{\rm ff}^2 + b\_{\rm ss} E\_{\rm ss}^2 + b\_{\rm nn} E\_{\rm nn}^2 \\ &+ b\_{\rm fs} \left( E\_{\rm fs}^2 + E\_{\rm sf}^2 \right) + b\_{\rm fn} \left( E\_{\rm fn}^2 + E\_{\rm nf}^2 \right) \\ &+ b\_{\rm ns} \left( E\_{\rm ns}^2 + E\_{\rm sn}^2 \right), \end{split} \tag{3}$$

where E<sub>ij</sub> (i, j = f, s, n) are components of the Green-Lagrange strain tensor expressed in a reference frame locally aligned along the fiber direction (f), the orthogonal direction spanning the myocardial sheet (s), and the cross-fiber direction (n). Values for the C and b<sub>ij</sub> (i, j = f, s, n) coefficients are reported in **Table 1**.
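Equation (3) can be evaluated directly once the strain components are known. The sketch below is a standard-library illustration; the function name is an assumption and the coefficients are placeholders, not the Table 1 values used in the study.

```python
import math


def fung_energy(E, C, b):
    """Strain energy W_U of Equation (3).

    E: dict of Green-Lagrange components keyed "ff", "fs", "sf", ...
    b: dict of coefficients keyed "ff", "ss", "nn", "fs", "fn", "ns".
    """
    Q = (b["ff"] * E["ff"] ** 2 + b["ss"] * E["ss"] ** 2 + b["nn"] * E["nn"] ** 2
         + b["fs"] * (E["fs"] ** 2 + E["sf"] ** 2)
         + b["fn"] * (E["fn"] ** 2 + E["nf"] ** 2)
         + b["ns"] * (E["ns"] ** 2 + E["sn"] ** 2))
    return 0.5 * C * (math.exp(Q) - 1.0)


# Placeholder coefficients for illustration only (see Table 1 for the study values).
C = 1.0
b = {"ff": 8.0, "ss": 6.0, "nn": 3.0, "fs": 12.0, "fn": 3.0, "ns": 3.0}
keys = ("ff", "ss", "nn", "fs", "sf", "fn", "nf", "ns", "sn")
E_zero = dict.fromkeys(keys, 0.0)
E_fiber = {**E_zero, "ff": 0.1}  # 10% Green-Lagrange strain along the fiber
W_fiber = fung_energy(E_fiber, C, b)
```

The energy vanishes in the undeformed state (Q = 0) and grows exponentially with the weighted squared strains, which is what makes the fiber direction stiffer when b<sub>ff</sub> is the largest normal coefficient.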

The remaining 2 constitutive behaviors here considered followed the constitutive law based on the invariants of the right Cauchy-Green strain tensor **C** proposed by Holzapfel and Ogden (2009),

$$\begin{split} W\_{HO} &= \frac{a}{2b} \left\{ \exp\left[b(I\_1 - 3)\right] \right\} + \sum\_{i = \text{ff,ss}} \frac{a\_i}{2b\_i} \left\{ \exp\left[b\_i(I\_{4i} - 1)^2\right] - 1 \right\} \\ &+ \frac{a\_{\text{fs}}}{2b\_{\text{fs}}} \left\{ \exp\left[b\_{\text{fs}}I\_{8\text{fs}}^2\right] - 1 \right\}, \end{split} \tag{4}$$

where I<sub>1</sub> = tr **C** is the first invariant of **C**, here applied as the argument of an exponential term; I<sub>4i</sub> = **v**<sub>i</sub> · (**C** · **v**<sub>i</sub>), i = ff, ss, is the fourth invariant of **C**, which corresponds to the squared stretch of a line element oriented along the fiber (**v**<sub>ff</sub>) or sheet (**v**<sub>ss</sub>) directions; finally, I<sub>8fs</sub> = **f**<sub>0</sub> · (**C** · **s**<sub>0</sub>) is the eighth invariant of **C**, which captures the effects of strain coupling. Equation (4) has been shown to describe well experiments on pig ventricles (Dokos et al., 2002), and more recently the biaxial and triaxial tests conducted on human myocardial tissue by Sommer et al. (2015). Among the best-fit values reported in the literature, we selected material parameters for (4) from Wang et al. (2013) (W<sub>HO</sub><sup>W</sup>, fitted to experiments on swine models) and Gültekin et al. (2016) (W<sub>HO</sub><sup>G</sup>, fitted to experiments on human tissue). The coefficients for all considered material properties are reported in **Table 1**.

TABLE 1 | Sets of material properties considered in the study.

*W<sub>U</sub> is expressed in terms of components of the Green-Lagrange strain tensor* **E***, while W<sub>HO</sub><sup>{W,G}</sup> depends on invariants of the right Cauchy-Green tensor* **C***.*
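Equation (4) can be evaluated from a deformation gradient by first forming **C** and its invariants. The following standard-library sketch follows the equation as printed (the isotropic term has no −1, so the energy of the undeformed state is a/(2b)); the function names and the material parameters are placeholders for illustration and do not reproduce Table 1.

```python
import math


def right_cauchy_green(F):
    """C = F^T F for a 3x3 deformation gradient given as nested lists."""
    return [[sum(F[k][i] * F[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def bilinear(u, C, v):
    """u . (C . v) for 3-vectors u, v and a 3x3 matrix C."""
    return sum(u[i] * C[i][j] * v[j] for i in range(3) for j in range(3))


def holzapfel_ogden_energy(F, f0, s0, p):
    """W_HO of Equation (4) for unit fiber (f0) and sheet (s0) directions."""
    C = right_cauchy_green(F)
    I1 = C[0][0] + C[1][1] + C[2][2]
    I4 = {"ff": bilinear(f0, C, f0), "ss": bilinear(s0, C, s0)}
    I8fs = bilinear(f0, C, s0)
    W = p["a"] / (2 * p["b"]) * math.exp(p["b"] * (I1 - 3))
    for i in ("ff", "ss"):
        W += p["a_" + i] / (2 * p["b_" + i]) * (
            math.exp(p["b_" + i] * (I4[i] - 1) ** 2) - 1)
    W += p["a_fs"] / (2 * p["b_fs"]) * (math.exp(p["b_fs"] * I8fs ** 2) - 1)
    return W


# Placeholder material parameters, for illustration only.
p = {"a": 1.0, "b": 5.0, "a_ff": 10.0, "b_ff": 10.0,
     "a_ss": 2.0, "b_ss": 5.0, "a_fs": 0.5, "b_fs": 5.0}
f0, s0 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
lam = 1.1  # isochoric stretch of 10% along the fiber
stretch = [[lam, 0.0, 0.0], [0.0, lam ** -0.5, 0.0], [0.0, 0.0, lam ** -0.5]]
W_rest = holzapfel_ogden_energy(identity, f0, s0, p)
W_stretch = holzapfel_ogden_energy(stretch, f0, s0, p)
```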

### 2.3. FEM Models of LV Passive Biomechanics

High-resolution FEM simulations of LV biomechanics are at the core of the parameter exploration and inverse estimation strategies presented in this work. To cope with the complexities of the mechanical behavior of the myocardium, we employed a recently validated numerical solver suitable for dealing with incompressible hyperelastic material laws such as those in (3) and (4) (Gurev et al., 2015), extended to use stabilized P1/P1 finite elements. These capabilities are necessary for infarct simulations, where capturing sufficient detail in the border zone around the lesion is pivotal (see section 2.6). The solution algorithm also allows multi-scale effects, and we used the TriSeg ODE-based model with human parameters to drive myofilament active contraction (Lumens et al., 2009; Gurev et al., 2015). Coupling between cellular and tissue mechanics occurred at the Gauss point level.

To handle the relatively large number of simulations needed to train statistical models, we developed an automatic workflow to construct high-resolution computational domains from any given set of geometric parameters ξ describing LV anatomy. In this pipeline, analytical models built according to (1) were first converted to 3-dimensional triangulated surfaces, and then to solid meshes of several hundred thousand tetrahedral elements. Nodes at the base of the ventricle were prevented from moving axially, while epicardial nodes in the vicinity of the base (i.e., closer than 3 mm) were fully locked to prevent rigid motions. Boundary traction effects from the pericardial membrane and the right ventricle were neglected, and intraventricular pressure was uniformly applied at the endocardial surface in quasi-static steps. The vector **v**<sub>ff</sub> of alignment of myocardial fibers varies heterogeneously along the radial direction of the myocardium (McCulloch, 1999; Humphrey, 2002). Without specific measurements for the patients in the database, we relied on a rule-based approach to assign fiber directions, linearly varying their angle with respect to the circumferential direction from 90° at the endocardial surface (i.e., longitudinally aligned) to −60° at the epicardium.
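The rule-based fiber assignment amounts to a linear interpolation of the fiber angle across the wall. A minimal sketch follows; the normalized transmural coordinate, the function names, and the assumption that the fiber lies in the local tangent plane of the wall are illustrative choices, not details of the solver.

```python
import math


def fiber_angle_deg(depth, endo_deg=90.0, epi_deg=-60.0):
    """Fiber angle relative to the circumferential direction.

    depth: normalized transmural position, 0 at the endocardium and 1 at the epicardium.
    """
    return endo_deg + (epi_deg - endo_deg) * depth


def fiber_vector(depth):
    """Unit fiber vector as (circumferential, longitudinal) components.

    Assumes the fiber lies in the local tangent plane of the wall.
    """
    alpha = math.radians(fiber_angle_deg(depth))
    return math.cos(alpha), math.sin(alpha)


# Angles at the endocardium, midwall, and epicardium.
angles = [fiber_angle_deg(d) for d in (0.0, 0.5, 1.0)]
```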

The mechanical equilibrium equations were solved in parallel on the Cognitive Computing Cluster (CCC), a hybrid high performance shared resource developed at IBM Research deploying both Intel and Power8 nodes. Active infarct simulations required ∼10 times more resources than passive models, and were run on the Uran Supercomputer hosted by UB RAS and Ural Federal University. Outputs of the simulations were nodal displacement vector fields, and components of stress and strain tensors defined at the element Gauss points. To also relate predictions to the strain-dependent length activation of the sarcomere, we evaluated the stretch in the fiber direction, defined as

$$
\lambda\_{\text{ff}} = \sqrt{\mathbf{v}\_{\text{ff}} \cdot \mathbf{C} \cdot \mathbf{v}\_{\text{ff}}} \tag{5}
$$

where **v**<sub>ff</sub> is the vector aligned along the myofiber direction (as described above), and **C** is the right Cauchy-Green strain tensor. As a representative scalar of each loading state, we also averaged λ<sub>ff</sub> at midwall, which we defined as a tissue slab located between 40 and 60% of the LV wall thickness and between 45 and 55% of the apex-base distance.

#### 2.4. Parameterization of FEM Results

A key aspect of the inverse unloading method presented in this work is the re-parameterization of FEM simulation results in terms of the same geometric parameters employed to process the Sunnybrook database. A 2-step optimization procedure was implemented to fit idealized models of LV anatomy to the deformed configurations predicted by the FEM analyses upon varying loading conditions. First, optimal values for R<sub>b</sub>, Z, e, and ψ<sub>0</sub> were found to minimize the average nodal distance between the profile of an idealized epicardium and the corresponding boundary obtained from an FE mesh warped according to the simulation results. Second, a similarly defined nodal distance measure was used to quantify discrepancies between endocardial profiles in order to adjust the remaining L and H parameters. The 2 steps were re-iterated until reaching convergence. An alternative monolithic approach where the 6 parameters were optimized at the same time was also evaluated, but proved to be less computationally efficient.

#### 2.5. Statistical Learning of LV Unloading

Bulk processing the Sunnybrook cine-MRI image datasets provided information on expected anatomical variability among patients. As part of our inverse unloading estimation strategy, we leveraged database statistics to define a 6-D parametric space that enclosed all likely LV unloaded configurations. More specifically, we reasoned that the parametric study should conservatively admit and explore large variations in ventricle geometries, since the unloaded state might differ significantly from any of the imaged configurations. Limits of the parametric space were therefore defined to encompass variations of more than 3 standard deviations from the average beginning of diastole (BoD) state, which we chose as the most reasonable guess lacking the measurements needed for better estimates (e.g., Xi et al., 2013). More details on the subdivision of the cardiac cycle into its phases are reported in the **Supplemental Material**. **Figure 2A** shows pairs of limit parameter values and corresponding LV cross-sections representing the maximum allowed variations of each of the 6 geometric features. In drawing the profiles, only one of the 6 parameters was changed while keeping the remaining 5 at their mid-range values. Unloaded configurations admitted to our study were, therefore, intermediate states of the low- and high-parameter geometries shown in **Figure 2A** in gray and black tones, respectively. The statistical distribution of

FIGURE 2 | Design of training sets for the 2 statistical models: LV unloading (A–C) and infarct shape effects (D–F). (A) Pairs of LV cross-sections representing extreme geometries limiting parameter space dimensions. Gray (black) cross-sections correspond to extreme negative (positive) variations of one of the geometric parameters, with the remaining 5 parameters kept at mid-range values. (B) Projection of the 6-D parametric space onto a 3-D cube obtained by neglecting the last 3 dimensions (H, *e*, and Ψ<sub>0</sub>). Spherical glyphs indicate locations of 600 sampling points chosen via Latin hypercube sampling from a normal distribution centered on the average LV geometry and with a doubled standard deviation compared to that of the complete Sunnybrook database. (C) Cross-section of the parameter space for LV unloading showing combined variations of the R<sub>b</sub> and Z parameters. (D) Similar to (A), but showing pairs of FE meshes including infarct regions with extreme shapes. The lightest tone of gray indicates the healthy region, the darkest tone indicates the infarct, and the intermediate one marks the refined border zone. (E) 3-D projection of the 4-D parameter space defining infarct shape, obtained by neglecting the ΔLong. dimension. Similarly to (B), spheres indicate locations of 40 sampling points chosen uniformly in the allowed parameter range. (F) Mid-range slice of the 3-D projection showing representative FE meshes accounting for combined variations of longitudinal location and transmural depth of the infarct.

BoD states was also used to design an efficient probing scheme for the parametric space defined above. As is now common (Marsden, 2014), we used Latin hypercube sampling to select 600 points (i.e., 100 times the number of parameters) from a normal distribution centered on the average BoD state and with doubled standard deviation compared to that of the Sunnybrook database. Samples falling beyond the limits defined in **Figure 2A** were projected onto the closest admissible point. A cloud of chosen probing locations is shown within a unitary 3-D projection of the parameter space in **Figure 2B**. For this representation, the H, e, and Ψ<sub>0</sub> dimensions were neglected. **Figure 2C** further shows a mid-slice of the parametric cube exploring geometries corresponding to coupled variations of the R<sub>b</sub> and Z parameters.
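The sampling scheme described above can be sketched as follows. The example works in standardized units, where each parameter has database mean 0 and database standard deviation 1, so the sampling standard deviation is 2 and the limits are placed at ±3; these units, the function name, and the seed are assumptions made for illustration. Component-wise clipping is used as the projection onto the closest admissible point, which holds when the admissible region is a box.

```python
import numpy as np
from statistics import NormalDist

def latin_hypercube_normal(n, mean, std, lower, upper, seed=0):
    """Latin hypercube sample from independent normals, clipped to box limits."""
    rng = np.random.default_rng(seed)
    mean = np.asarray(mean, dtype=float)
    std = np.asarray(std, dtype=float)
    inv_cdf = np.vectorize(NormalDist().inv_cdf)
    samples = np.empty((n, mean.size))
    for j in range(mean.size):
        # One point per equal-probability stratum, visited in shuffled order
        u = (rng.permutation(n) + rng.random(n)) / n
        u = np.clip(u, 1e-12, 1.0 - 1e-12)     # keep inv_cdf away from 0 and 1
        samples[:, j] = mean[j] + std[j] * inv_cdf(u)
    # Project samples outside the admissible box onto its boundary
    return np.clip(samples, lower, upper)

n_params = 6
points = latin_hypercube_normal(
    n=100 * n_params,                 # 600 probing points
    mean=np.zeros(n_params),          # average BoD state (standardized)
    std=2.0 * np.ones(n_params),      # doubled database standard deviation
    lower=-3.0, upper=3.0)            # hypothetical limits of the parameter space
```

Clipping concentrates the out-of-range samples on the box faces, so the tails of the sampled marginals are thicker at the limits than those of an untruncated normal distribution.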

For each of the 600 sampled ventricle geometries we ran passive inflation simulations for inner LV pressures ranging between 0 and 5 kPa. Results were processed as described in section 2.4 to find optimal geometric parameters for 100 intermediate loading configurations (i.e., differing by 0.05 kPa). These best-fit parameters constituted the training set for GP regression models mapping loaded configurations to their corresponding unloaded state. Overall, we optimized 100 statistical models (one for each considered inner pressure), and fitted 2 additional GP regressions for unloading the inflated configurations for which the midwall fiber stretch reached the values of 10 and 15%.
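The regression step for a single inner pressure can be illustrated with the minimal NumPy sketch below. It is not the implementation used in this work: the kernel is a squared exponential with fixed hyperparameters (the study optimized the models), the `toy_inflation` map is a hypothetical stand-in for a forward FEM run followed by re-parameterization, and the geometries are random points in a normalized 6-D cube.

```python
import numpy as np

def rbf_kernel(A, B, length_scale):
    """Squared-exponential covariance between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-0.5 * d2 / length_scale ** 2)

class GPRegressor:
    """GP posterior mean on centered targets, with fixed hyperparameters."""
    def __init__(self, length_scale=2.0, noise=1e-6):
        self.length_scale, self.noise = length_scale, noise

    def fit(self, X, Y):
        self.X, self.mean = X, Y.mean(axis=0)
        K = rbf_kernel(X, X, self.length_scale) + self.noise * np.eye(len(X))
        L = np.linalg.cholesky(K)
        self.alpha = np.linalg.solve(L.T, np.linalg.solve(L, Y - self.mean))
        return self

    def predict(self, X_new):
        return self.mean + rbf_kernel(X_new, self.X, self.length_scale) @ self.alpha

def toy_inflation(unloaded):
    """Hypothetical 'forward FEM + re-parameterization' at one fixed pressure."""
    return unloaded + 0.1 * unloaded ** 2 + 0.05 * np.roll(unloaded, 1, axis=1)

rng = np.random.default_rng(0)
unloaded_train = rng.random((200, 6))           # sampled unloaded geometries
loaded_train = toy_inflation(unloaded_train)    # their loaded counterparts

# Inverse map: loaded 6-parameter geometry -> unloaded 6-parameter geometry
gp = GPRegressor().fit(loaded_train, unloaded_train)

unloaded_test = rng.random((50, 6))
predicted = gp.predict(toy_inflation(unloaded_test))
max_error = float(np.max(np.abs(predicted - unloaded_test)))
```

Note that only forward evaluations are needed to build the training pairs; the inverse map is learned by swapping the roles of inputs and outputs, and one such model would be fitted per pressure level.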

#### 2.6. Statistical Learning of Infarct Shape and Size on LV Performance

With our solver capable of handling high-resolution tetrahedral meshes, we explored the effects of different infarct shapes and locations on simulated LV cardiac cycles. The lesions were parameterized according to 4 features: longitudinal position (Long. ∈ [0, 1]), indicating whether an infarct was closer to the base (Long. = 0) or apex (Long. = 1); circumferential extension (ΔCirc. ∈ [0, π]), indicating the portion of circumference (measured in radians) occupied by the infarct; longitudinal extension (ΔLong. ∈ [0, 1]), indicating the fraction of longitudinal cross-section harboring a lesion; and wall depth (Depth ∈ [0, 1]), indicating the transmural extension of the infarct, with the maximum value of 1 indicating a fully transmural lesion. **Figure 2D** shows computational domains reconstructed from limit cases of the infarct parameterization. As in section 2.5, Latin hypercube sampling was used to efficiently probe the parameter space. Our sample size was 40 points (i.e., 10 times the number of parameters), and we assumed a uniform distribution of parameters across the admissible range. Also, to restrict our attention to the effects of infarct without the added complications introduced by changing LV geometry, all lesions were inserted into the same baseline LV from patient I-02. Infarct effects were simulated by deactivating active contraction in the lesion regions, while maintaining the same passive material properties. Similar to **Figures 2B,C**, **Figures 2E,F** show projections of selected samples onto the considered parameter space of infarct lesions. More details on the general procedure followed to mesh infarcted regions of arbitrary shapes are available in the **Supplemental Material**.

## 3. RESULTS

Once enhanced with rigid motion correction, the 6-parameter description of LV geometry was able to capture anatomical and kinematic features from the Sunnybrook MRI scans. Median values of the similarity functional J(I<sup>ξ</sup>, I<sub>MR</sub>) averaged for each category of patients were 0.29 for N, 0.30 for HYP, 0.23 for HF-NI, and 0.19 for HF-I. **Figure 3** shows average group traces of best-fit geometric parameters (see Equation 1) over the course of a normalized cardiac cycle. Certain trends agreed well with known morphologic features of cardiac disease. Patients affected by heart failure (i.e., from the HF-I and HF-NI categories) presented on average the most dilated ventricles, as indicated by the largest R<sub>b</sub> values, and the smallest cyclical variations in both e and Ψ<sub>0</sub>, probably due to myocardial dysfunction. Hypertrophic patients, on the other hand, maintained the highest L values throughout the cycle (L = 12 mm on average) and showed a large systolic thickening (L = 15 mm at peak systole for HYP patients). Only N subjects contracted more visibly, with an average 54% increase in L from diastole to systole. N and HYP subjects overall exhibited the largest changes in truncation angles. Other parameterization findings were less intuitive. For all LVs, contraction in the longitudinal direction was captured mainly by varying Ψ<sub>0</sub> rather than Z, which instead remained close to constant throughout the cycle. Also, the dynamic pattern of e observed in HF patients was peculiar. For example, 10 out of 12 HF-I subjects exhibited increased e at systole compared to diastole, while the opposite was typically observed in the N and HYP categories of patients. The combined behavior of e and Ψ<sub>0</sub> also differed between the 2 classes of HF patients: in the presence of an infarct, both e and Ψ<sub>0</sub> were smaller in magnitude, indicating that HF-NI ventricles tended to be more spherical than the HF-I ones.
**Table 2** reports best fit sets of geometric parameters for all 45 patients at both beginning and end of diastole (BoD and EoD, respectively).

The distribution of LV shapes at BoD (see **Table 2**) was pivotal in designing our admissible parameter space, both for establishing range limits and for choosing the frequency of allowed variations. **Figure 4A** shows 3 representative unloaded configurations out of the 600 selected to probe the space. Each geometry was first discretized into a computational domain (see meshes below the idealized profiles) and then inflated with inner pressures up to 5 kPa. Also shown are color-coded distributions of the first invariant of the Green-Lagrange strain tensor (I1E). Strain fields were visibly larger in the LVs endowed with W<sup>U</sup> material properties (i.e., those on the first row of each subgroup) than in those endowed with W<sup>G</sup><sub>HO</sub> (i.e., those on the second row of each subgroup). While the parametric study extensively explored combined effects of LV geometric features on deformation, the subsequent processing step converted results back to the 6-parameter description (see profiles in gray above and below strain results). Out of the chosen 600 probing profiles, 67 exhibited incompatible features that prevented completion of the FEM simulations (e.g., a disproportionately large L and negative Ψ<sub>0</sub> in a ventricle with minimum R<sub>b</sub>), and were therefore excluded from the analysis. **Figure 4B** shows violin plots of geometric parameter distributions for ventricles at the BoD configuration

from the database, at the assumed unloaded configuration, and at 10 deformed configurations for pressures ranging from 0.5 to 5.0 kPa. The BoD distributions (see plots in black tone, leftmost sector) clearly reflected the categories of the database. For example, the violin plot of the R<sub>b</sub> parameter (first row) indicated a bimodal distribution, as expected given the sharp differences in ventricle radius between HF patients and the others. By design, the sampled unloaded configurations followed a normal distribution allowing large variations (see plots in lightest gray tone, second sector from the left). Some hard limits on admissible parameter values were enforced to reduce the number of incompatible geometries selected (see section 2.5). The effects of these hard limits were noticeable particularly within the L, e, and Ψ<sub>0</sub> distributions (see last 3 rows), the tails of which were thickened by assimilating parameters beyond the allowed range.


Finally, the distributions of loaded configurations (see plots in intermediate gray tones, three rightmost sectors) showed the evolution of geometric parameters upon pressurization, which followed the expected behavior for incompressible hyperelastic materials. For example, the R<sub>b</sub> parameter increased relatively fast at low pressures, but dilation then progressively slowed, reflecting the exponential increase in stiffness. The thickness parameters L and H decreased upon pressurization (consistent with incompressibility), while the Ψ<sub>0</sub> parameter distributions were the most sensitive to pressure. Finally, the material properties could be ranked in order of increasing stiffness as W<sup>U</sup>, W<sup>G</sup><sub>HO</sub>, and W<sup>W</sup><sub>HO</sub>, as shown by changes in mean values from the distributions (see white lines inside the violin plots).

The computational cost of optimizing a GP regression to a few hundred training points (∼1 CPU min) is negligible compared



FIGURE 5 | Unloading via kriging and comparison to the fixed point iteration method. (A) Unloading procedure applied to a representative case (NI-14, unloading pressure P = 1 kPa, and W<sup>G</sup><sub>HO</sub> material properties) for which a statistical model trained on 75 arbitrary ventricles best matched the unloading results obtained via the fixed point iteration method. While the fixed point iteration method required meshing of the ventricles in the loaded configuration and iterative updates (middle row), the statistical method allowed the unloaded geometry to be inferred directly from the 6 parameters describing the end-diastolic (loaded) configuration (bottom row). Top row is similar to bottom row, but shows the result obtained after training a statistical model on results from the full parametric study of 500+ LVs. The rightmost column shows overlapped cross-sections of unloaded LVs obtained via the fixed point iteration method (dashed boundary) and 2 statistical models (solid gray tones). (B) Similar to (A), but applied to another representative case (I-07, unloading pressure P = 2 kPa, and W<sup>U</sup> material properties) for which the statistical learning method (with n<sub>train</sub> = 75) yielded the worst overlap with fixed point iteration results (Dice score of 0.90). In this case, increasing the training set size led to improved results (Dice score of 0.96).

to that of running even a single passive high-resolution simulation. To optimize the use of computational resources, we therefore sought the minimum training set size that ensured satisfactory accuracy in estimating the unloaded configurations for all patients in the dataset. **Figures 5A,B** show cases where predictions by GP regression compared best (see **Figure 5A**) and least well (see **Figure 5B**) to the configurations predicted via fixed point iteration for a relatively small training size (n<sub>train</sub> = 75). As starting (loaded) configurations, we chose geometries from the database at EoD (see first column in both panels), and from these we inferred corresponding unloaded configurations assuming inner LV pressures of either 1 or 2 kPa. Results from the 2 methods were compared in terms of Dice score between unloaded profiles (see **Supplemental Material** for details on Dice score computations). According to our analysis, n<sub>train</sub> = 75 was the minimum training set size ensuring Dice scores greater than or equal to 0.90 for all cases considered (i.e., including all the LV geometries, both EoD inner pressures, and the 3 sets of material properties). From the last column of **Figure 5B** one can appreciate how even a Dice score of 0.90 corresponds to a visibly good match between the GP regression prediction (see LV in gray tone) and the corresponding geometry obtained via fixed point iteration (see overlapped dashed line). Small mismatches could be observed even in cases with high Dice score in regions close to the base of the LV (e.g., see last column of **Figure 5A**). These artifacts could be attributed to the zero-displacement boundary condition applied to epicardial elements within 3 mm from the base in the fixed point optimizations. Note that the fixed point iteration method required discretization of the EoD domain and repeated mesh deformation steps (see middle row in both panels).
In contrast, after GP regression training the unloaded configurations could be inferred almost in real time. As another advantage, the GP regression method eliminates potential issues introduced by iteratively warping the mesh (e.g., element degeneration) in the fixed point iteration method. The top row in both panels shows unloaded profiles inferred from GP regressions trained on the full set of simulation results. In the best match case shown in panel A, results were essentially the same as for n<sub>train</sub> = 75, while the Dice score increased by 0.06 when we expanded the training dataset from n<sub>train</sub> = 75 to 533 in the worst match case (see **Figure 5B**).
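For reference, the Dice score between two regions A and B is 2|A ∩ B| / (|A| + |B|). The sketch below evaluates it on binary masks; rasterizing the unloaded LV profiles onto a common pixel grid is an assumption made here, since the exact procedure is described only in the Supplemental Material.

```python
import numpy as np

def dice_score(mask_a, mask_b):
    """Dice overlap between two binary masks of identical shape."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0                  # two empty masks are treated as identical
    return float(2.0 * np.logical_and(a, b).sum() / total)

# Toy masks: 2 pixels each, 1 pixel shared
a = np.array([1, 1, 0, 0])
b = np.array([0, 1, 1, 0])
partial_overlap = dice_score(a, b)  # 2 * 1 / (2 + 2) = 0.5
full_overlap = dice_score(a, a)     # identical masks give 1.0
```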

**Figure 6** plots average Dice scores comparing GP regressions to fixed point iteration. The 3 rows in **Figure 6A** show results for different sets of material properties at an unloading pressure of 1 kPa (first column) or 2 kPa (second column). As expected, increasing training sizes generally yielded better Dice scores, although little improvement was observed beyond n<sub>train</sub> = 75. Also reported are average Dice scores quantifying the overlap between fixed point iteration and the BoD or EoD configurations, as well as the average overlap with the OptD configuration, which was chosen as the imaged diastolic configuration that best matched the unloaded geometry. White dashed lines overlaid on the bars indicate the lower 10th-percentile Dice score observed for predictions from GP regressions.

Additional GP regression models were trained to handle situations where intraventricular pressure is unknown, but can be estimated by indirect measurements such as the fiber strain at midwall (see section 2.3). **Table 3** reports unloaded geometries for all patients in the Sunnybrook database under the assumptions of W<sup>U</sup> material properties and end-diastolic

fiber stretches of either 1.10 (λ<sub>ff</sub><sup>10%</sup>) or 1.15 (λ<sub>ff</sub><sup>15%</sup>). Outputs of the procedure included end-diastolic LV pressure values corresponding to the target fiber stretches in the loaded configurations. **Figure 6B** reports the accuracy of GP regression predictions measured in terms of Dice score against predictions via the fixed point iteration method.

Other than being used for inverse problems, GP regressions are ideal as tools for rapidly exploring multi-dimensional parameter spaces. As a proof of concept for this usage, we show preliminary results for a parametric study of the effect of infarct location and shape on cardiac performance as assessed by stroke volume (SV). **Figure 7A** shows color maps of simulated SV over 2-D slices of the 4-D parametric space. Also shown are projections onto each slice of the probing locations composing the full training set (see black dots). Each plot isolates the combined effects of 2 out of the 4 parameters used to define infarct shape and location. As expected, increases in lesion sizes yielded significant drops in SV. The maximum combined effect was reached by increasing both circumferential and transmural extension. Starting from a baseline failing LV with SV = 49 mL, GP regression predicted a drop down to SV = 21 mL at maximum depth and circumferential extension. **Figure 7B** shows 5-fold cross-validation for evaluating progressive convergence of GP regression for increasing training sizes. Average relative discrepancies between SV values from simulations and corresponding predictions from GP regression progressively decreased to 6% for a maximum training size of 40 simulations.
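The cross-validation measure can be sketched as below. Everything in this example is an assumption made for illustration: the 40 samples are random points in a normalized 4-D cube, the SV response is a synthetic function chosen only so that it spans the 49 to 21 mL range quoted above, and the GP uses a fixed squared-exponential kernel rather than optimized hyperparameters.

```python
import numpy as np

def gp_predict(X_train, y_train, X_test, length_scale=1.0, noise=1e-4):
    """GP posterior mean with a squared-exponential kernel (fixed hyperparameters)."""
    def kernel(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-0.5 * d2 / length_scale ** 2)
    mu = y_train.mean()
    K = kernel(X_train, X_train) + noise * np.eye(len(X_train))
    return mu + kernel(X_test, X_train) @ np.linalg.solve(K, y_train - mu)

def kfold_relative_error(X, y, n_folds=5, seed=0):
    """Mean relative error of held-out predictions over n_folds folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    errors = []
    for held_out in folds:
        train = np.setdiff1d(np.arange(len(X)), held_out)
        pred = gp_predict(X[train], y[train], X[held_out])
        errors.append(np.abs(pred - y[held_out]) / np.abs(y[held_out]))
    return float(np.mean(np.concatenate(errors)))

rng = np.random.default_rng(1)
# Columns: normalized (Long., dCirc., dLong., Depth), each in [0, 1]
X = rng.random((40, 4))
# Synthetic SV in mL: 49 without a lesion, 21 at maximum dCirc. and Depth
sv = 49.0 - 28.0 * X[:, 1] * X[:, 3]
cv_error = kfold_relative_error(X, sv)
```

Repeating the call on nested subsets of the samples (e.g., the first 10, 20, 30, and 40 points) would give a convergence curve of the kind reported in **Figure 7B**.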

**Figure 8A** compares in detail 2 simulations from the training set characterized by different infarct morphologies. While INF<sub>16</sub> (on the left) harbored a non-transmural basal infarct, the lesion in INF<sub>30</sub> was larger, more apical, and fully transmural. The high level of mesh refinement within and surrounding the infarct (see regions in darkest and intermediate gray tones, respectively)


TABLE 3 | Unloaded geometries inferred via GP regression assuming EoD fiber stretches at midwall of either 1.10 (λ<sub>ff</sub><sup>10%</sup>) or 1.15 (λ<sub>ff</sub><sup>15%</sup>) and the W<sup>U</sup> set of material properties.

*See text for more details*

FIGURE 7 | Statistical model of infarct shape and location effects on simulated SV. (A) Color-coded distribution of SV as predicted by kriging on 6 slices of the 4-D parametric hyperspace. Each plot shows combined effects of variations of 2 parameters on simulated SV, as shown by the scale bar (values outside the range are truncated). Darker (lighter) color tones indicate stronger (weaker) impairment due to infarct. Dots represent projections of the probing points onto the slice plane. (B) 5-fold cross-validation to assess performance of the statistical model for varying training sizes n<sub>train</sub>. Relative error on simulated SV predictions approached 6% for the maximum training set size (n<sub>train</sub> = 40).

required our solver's capability to handle high-resolution tetrahedral meshes. **Figure 8B** compares simulated PV loops for the 2 models described above. As expected, INF<sub>30</sub> (see dashed line), which harbored a larger lesion, exhibited a stronger impairment in simulated cardiac performance. The PV loops

show the weaker contraction generated by INF<sub>30</sub> despite an increase in end-diastolic volume (i.e., SV = 40 mL and SV = 32 mL for INF<sub>16</sub> and INF<sub>30</sub>, respectively).

loops showing smaller SV for the largest lesion INF<sub>30</sub> (dashed line), as expected.

## 4. DISCUSSION

Numerous computational models of LV mechanics have been developed over the years to better understand LV function in normal and diseased hearts, with the ultimate goal of assisting personalized diagnostics and treatment. Available models differ both in the biophysical detail they include and in their anatomical representation. In the simplest form, left ventricular function can be captured by a time-varying elastance model, where a single time-varying ODE couples the evolution of intraventricular pressure and volume over the course of a cycle (Suga and Sagawa, 1972; Stergiopulos et al., 1996). At the other end of the complexity scale, models of LV mechanics incorporate phenomenological or biophysical descriptions of muscle contraction at the microscopic level, while at the same time capturing the cardiac anatomy in detail on high-resolution computational domains (e.g., Guccione et al., 1995; Kerckhoffs et al., 2007; Göktepe and Kuhl, 2010; Baillargeon et al., 2014; Sundnes et al., 2014; Gurev et al., 2015; Augustin et al., 2016). Although these highly refined 3D models provide valuable information, they entail high computational costs. To improve computational efficiency, models with intermediate levels of complexity have been based on simplifying assumptions on ventricular geometry and structure (Arts et al., 1979; Beyar and Sideman, 1984; Lumens et al., 2009). For prolate spheroid geometries and passive mechanics simulations, stress distributions from other low-order models can match FEM results well while running faster than real time (Moulton and Secomb, 2013, 2014; Moulton et al., 2017).

Significant reductions in computational costs can be similarly achieved by training machine learning models on the results of suitably sampled biophysical simulations. As a proof of concept, in this paper we applied GP regression, a popular supervised learning technique, to 2 problems of interest in cardiac mechanics modeling. First, 600 LV geometries described by a 6-parameter (R<sub>b</sub>, L, Z, H, Ψ<sub>0</sub>, e) prolate spheroid were extracted randomly from a conservatively defined parameter space. For each geometry, a forward simulation was run to trace ventricle geometries upon inflation at progressively larger intraventricular pressures. GP regression models then allowed unloaded configurations to be inferred from sets of 6 parameters defining the loaded geometries and either their corresponding intraventricular pressure or their fiber strain at midwall. For the second statistical model, we built a GP regression between parameters characterizing the location and shape of an infarct and the corresponding stroke volumes predicted by high-resolution simulations accounting for the presence of the lesion.

### 4.1. Ventricular Shape Analysis

The Sunnybrook Cardiac MRI database was the primary source of imaging data for this study. Conventional analyses of the segmentations from such a database have either extracted features directly from images (e.g., Chumarnaya et al., 2016) or used finite element models to analyze ventricular shapes and build statistical classifiers of patient disease (e.g., Piras et al., 2017). A geometric description with fewer parameters is better suited for parameterizing the geometry of ventricles in regressions trained on biophysical simulation results. Therefore, instead of finite element models, we adopted a 6-parameter description (Streeter and Hanna, 1973; Pravdin et al., 2014) to approximate ventricular geometry. In spite of its simplicity, this approach was able to capture some of the shape features and biomarkers that have been previously extracted using conventional finite element models (e.g., Zhang et al., 2014). In particular, ventricular sphericity (e) separated ventricles with and without myocardial infarction in patients with heart failure (see HF-I and HF-NI traces in **Figure 3**). The 6-parameter model analysis also captured higher average wall thickness in hypertrophic hearts and the highest relative dynamic thickening in normal patients. To partially compensate for the limits of a fully axisymmetric parameterization, we accounted for possible rigid rotations and translations to better align parameterized and segmented ventricles throughout the cardiac cycle. This ensured good overall fitting results, especially for the failing hearts, which proved to be more symmetric. Nonetheless, the methods presented here could be readily extended to non-axisymmetric parameterizations, such as those based on non-uniform rational B-splines, at the expense of extending the parameter space to additional dimensions.

Out of the several views provided in the Sunnybrook database, we restricted our analyses to short-axis stack series, which have the disadvantage of providing relatively low resolution in the coronal planes. As a result, some artifacts were particularly evident close to the apex of the ventricle, where the segmentation and subsequent parameterization were sometimes unable to correctly resolve the apical thickness, especially in the thinner failing LVs. Not surprisingly then, the H parameter showed the largest relative standard deviations within the same cardiac cycle for all patients, indicating that apex parameterization accuracy could likely be improved by registering and merging multiple MRI views.

### 4.2. Ventricular Unloading

Standard FE simulations need to be initialized from an unloaded state, which cannot be directly extracted from images because ventricles are pressurized in all of the configurations imaged by cine-MRI or CT scans. Given material properties and inner LV pressure, iterative approaches such as the fixed point iteration method allow the unloaded configuration to be estimated by progressively correcting a loaded state (Sellier, 2011; Genet et al., 2015). Nonetheless, due to their large computational cost and added complexity, these techniques are not typically incorporated into sophisticated optimization schemes proposed to estimate model parameters from images (Asner et al., 2016, 2017; Nasopoulou et al., 2017). To ensure feasibility, many modeling studies tend instead to use representative loaded configurations (i.e., at beginning or end of diastole) as approximations for the unknown unloaded state. As shown by our analyses, this could significantly bias results, since BoD and EoD configurations tend to match the profiles of unloaded geometries poorly (see **Figure 6**). GP regression models of unloading can help circumvent some of the limitations associated with iterative methods and enable larger parameter search studies. Somewhat surprisingly, even a training set of n<sub>train</sub> = 75 forward simulations was sufficient to ensure good inverse estimation results. LV profiles inferred from the statistical model matched those obtained via fixed point iteration with Dice scores of at least 0.90 under 2 loading pressures and for 3 different sets of material properties. Considering that in our experience 7–10 iterations, each requiring a forward simulation, are needed to reach convergence via fixed point iteration, the 75 forward simulations needed to prepare an accurate statistical model might then entail a computational cost comparable to unloading 7–10 ventricles with the standard method.
Unlike fixed point iteration, our strategy also requires an additional step of re-parameterizing simulation results in a format that can be handled by the machine learning model. The computational cost of re-parameterizing is often negligible (on the order of a few CPU minutes), and after training the statistical model can be further interrogated to unload additional geometries at essentially no computational cost.
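The fixed point update discussed above corrects the current estimate of the unloaded node positions by the mismatch between the simulated and the imaged loaded positions. The sketch below illustrates the idea on a toy problem; `forward_inflation` is a hypothetical closed-form stand-in for a full forward FE solve, and the node positions, pressure, and tolerance are arbitrary illustrative values.

```python
import numpy as np

def forward_inflation(X, pressure):
    """Hypothetical stand-in for a forward FE solve: maps unloaded radial node
    positions X (mm) to loaded positions, with displacement decaying with radius."""
    return X + 0.5 * pressure / X

def unload_fixed_point(x_loaded, pressure, tol=1e-8, max_iter=100):
    """Estimate unloaded positions with the update X <- X - (phi(X) - x_loaded)."""
    X = x_loaded.copy()                         # start from the loaded state
    for n_solves in range(1, max_iter + 1):
        residual = forward_inflation(X, pressure) - x_loaded
        if np.max(np.abs(residual)) < tol:
            return X, n_solves                  # n_solves counts forward solves
        X = X - residual
    raise RuntimeError("fixed point iteration did not converge")

X_true = np.array([20.0, 25.0, 30.0])           # unknown unloaded positions
x_loaded = forward_inflation(X_true, pressure=2.0)
X_est, n_solves = unload_fixed_point(x_loaded, pressure=2.0)
```

Each pass through the loop costs one forward solve, which is why unloading a single ventricle this way is comparable in cost to generating several training points for the statistical model.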

In addition to morphology, estimating the unloaded configuration relies on knowledge of the loading conditions and of the material properties of the myocardium. To fully characterize the material behavior of cardiac tissue, sophisticated experiments are required to reproduce in vitro the principal strain modes experienced by the heart during the cycle. The most extensive dataset on the passive behavior of the human myocardium is provided by Sommer et al. (2015). This work confirms how the micro-architecture of myocardial sheets leads to complex nonlinear anisotropic behavior combined with a persisting viscoelastic response. Although viscoelastic effects were neglected in this work, we considered material properties based on the triaxial experiments of Sommer et al. (2015) as well as 2 other sets of constitutive behaviors based on experiments on animal models (Usyk et al., 2000; Wang et al., 2013; Gültekin et al., 2016). Our unloading procedure proved to work well for all of these sets of material properties.

In the form presented herein, our method for ventricular unloading required building a new training dataset and subsequently a new GP regression model for each set of material properties considered. Nonetheless, for future applications, the input parametric space could be extended to additional dimensions to also account for variations in material properties. While more training simulations would likely be needed to reach the desired convergence, the presented approach could still prove convenient for material property identification based on strain energy functions with a reduced number of parameters (e.g., Nasopoulou et al., 2017), especially in cases where large high-performance machines are available to tackle the required computational cost in a distributed fashion.

The diastolic fiber strain at midwall, the constitutive law, and the shape of the ventricles at end-diastole are sufficient to uniquely unload geometries either via fixed point iteration or GP regression. In this paper, we proposed to constrain end-diastolic fiber stretch to account for scenarios where diastolic pressure in the ventricles is not known. Animal model experiments suggest that end-diastolic fiber strain varies within a relatively small range in several circumstances (e.g., see Ross et al., 1971). Inspired by studies on inverse stress identification (Miller et al., 2010; Miller and Lu, 2013), we therefore tried to find the unloaded ventricular shape without solving for ventricular pressure. This was also motivated by the fact that unloading by strain would yield the same unloaded configuration independently of a homogeneous scaling of the constitutive law (i.e., predicted end-diastolic pressures would scale accordingly). To illustrate the potential of such an approach, we additionally computed Dice scores between unloaded ventricles with 10% diastolic fiber strain using different constitutive laws. Our results (Dice scores of 0.90±0.05 for W<sup>U</sup> vs. W<sup>G</sup><sub>HO</sub>, 0.85±0.03 for W<sup>U</sup> vs. W<sup>W</sup><sub>HO</sub>, and 0.96±0.03 for W<sup>G</sup><sub>HO</sub> vs. W<sup>W</sup><sub>HO</sub>, respectively) suggested strong similarity between unloaded ventricles endowed with the Gültekin and Wang material behaviors, which followed the same Holzapfel-Ogden functional formulation.
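The scaling argument can be made explicit with a short derivation. It is only a sketch under assumptions not spelled out in the text: quasi-static equilibrium without body forces, a pressure follower load on the endocardial surface, and homogeneous displacement or traction-free conditions on the remaining boundaries. The symbols **P** (first Piola-Kirchhoff stress), **F** (deformation gradient), J = det **F**, **N** (reference unit normal), and the scale factor α are introduced here for illustration.

$$
\begin{aligned}
&\text{Original law } W: && \operatorname{Div}\mathbf{P} = \mathbf{0}, \quad \mathbf{P} = \frac{\partial W}{\partial \mathbf{F}}, \quad \mathbf{P}\,\mathbf{N} = -p\,J\,\mathbf{F}^{-T}\mathbf{N} \ \ \text{on } \Gamma_{\text{endo}} \\
&\text{Scaled law } \alpha W: && \operatorname{Div}(\alpha\mathbf{P}) = \alpha\,\operatorname{Div}\mathbf{P} = \mathbf{0}, \quad (\alpha\mathbf{P})\,\mathbf{N} = -(\alpha p)\,J\,\mathbf{F}^{-T}\mathbf{N} \ \ \text{on } \Gamma_{\text{endo}}
\end{aligned}
$$

The same deformation therefore balances the scaled law at pressure αp, so the fiber stretch of Equation 5 and the inferred unloaded shape are unchanged, while the predicted end-diastolic pressure scales by α. For an incompressible material, the pressure-like Lagrange multiplier scales by α as well.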

### 4.3. Modeling of Infarct Mechanics

Two main factors increase the complexity of ischemia and myocardial infarction models. The first one is the need to account for the progressive changes in passive and active material properties that are triggered by the lesion and driven by tissue damage recovery and remodeling (Holmes et al., 2005). The second one is the more complex numerical framework required to handle the large finite element meshes needed to accurately capture realistic infarct shapes. In the past, only a few studies have simulated non-transmural infarcts (Leong et al., 2015; Duchateau et al., 2016; Leong et al., 2017), while most models have either simulated infarcts with simplified morphologies or allowed infarct/ischemic regions to cross the finite element boundaries (e.g., Mazhari et al., 2000; Jie et al., 2010; Wenk et al., 2011; Mojsejenko et al., 2015). Here, we present a model of non-transmural infarct that has refined elements in the border region of the infarct. To handle the large finite element meshes that result from such a refinement, we use an iterative solver for the large system of linearized equations with an efficient preconditioner (Gurev et al., 2015). To briefly summarize our results, the 2 main parameters affecting simulated SV were the transmural and circumferential extensions of the lesion, while the location of the infarct played a minor role. Our models of infarct and the corresponding statistical model are still at a preliminary stage of development, and were presented here mainly to demonstrate the concept of integration between statistical and physical models.

### 4.4. Summary

This work shows two applications of GP regression in modeling ventricular heart mechanics. First, we present a strategy to estimate the ventricular unloaded configuration given material properties and intraventricular pressure (or alternatively fiber strain at midwall). Once an upfront computational cost (amounting to ∼10 applications of a conventional iterative method) is paid for training, GP regression models allow the estimation of an unlimited number of unloaded geometries at no additional cost. The method is therefore suitable for analyses involving large numbers of patients, such as those collected in publicly available databases. Second, we use GP regression as a convenient tool to explore the results of a parametric study investigating coupled effects of infarct shape and location. Although this is just a proof-of-concept study, these preliminary results demonstrate the power of the approach: we were able to characterize a large variation in infarct location and size, including non-transmural infarcts with highly complex meshes that are computationally demanding to solve.

### AUTHOR CONTRIBUTIONS

PD designed the study, analyzed results, and wrote the manuscript; AH provided the image segmentations and supervised the image processing pipeline; SK performed the parametric infarct simulations; OS supervised the infarct simulations; JR wrote the manuscript and supervised the project; VG designed the study, wrote the manuscript, and supervised the project. All authors agree to be accountable for the content of the work.

### FUNDING

Supported by the Program No. 27 of the Presidium of the RAS, the Decree of the Government of the Russian Federation No. 211 of 16/03/2013.

### REFERENCES


### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fphys.2018.01002/full#supplementary-material


**Conflict of Interest Statement:** PD, AH, JR, and VG were employed by IBM Research.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Di Achille, Harouni, Khamzin, Solovyova, Rice and Gurev. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Effect of Cell Morphology on the Permeability of the Nuclear Envelope to Diffusive Factors

Alberto García-González<sup>1\*†</sup>, Emanuela Jacchetti<sup>2†</sup>, Roberto Marotta<sup>3</sup>, Marta Tunesi<sup>2,4</sup>, José F. Rodríguez Matas<sup>2†</sup> and Manuela T. Raimondi<sup>2†</sup>

<sup>1</sup> Laboratori de Càlcul Numèric, E.T.S. de Ingenieros de Caminos, Canales y Puertos, Universitat Politècnica de Catalunya (UPC BarcelonaTech), Barcelona, Spain, <sup>2</sup> Department of Chemistry, Materials and Chemical Engineering "Giulio Natta," Politecnico di Milano, Milan, Italy, <sup>3</sup> Electron Microscopy Facility, Istituto Italiano di Tecnologia, Genoa, Italy, <sup>4</sup> Unità di Ricerca Consorzio INSTM, Politecnico di Milano, Milan, Italy

#### Edited by:

Mariano Vázquez, Barcelona Supercomputing Center, Spain

#### Reviewed by:

Chiara Giverso, Politecnico di Torino, Italy

Hermann Frieboes, University of Louisville, United States

> \*Correspondence: Alberto García-González berto.garcia@upc.edu

†These authors have contributed equally to this work.

#### Specialty section:

This article was submitted to Computational Physiology and Medicine, a section of the journal Frontiers in Physiology

Received: 30 March 2018 Accepted: 25 June 2018 Published: 13 July 2018

#### Citation:

García-González A, Jacchetti E, Marotta R, Tunesi M, Rodríguez Matas JF and Raimondi MT (2018) The Effect of Cell Morphology on the Permeability of the Nuclear Envelope to Diffusive Factors. Front. Physiol. 9:925. doi: 10.3389/fphys.2018.00925

A recent advance in understanding stem cell differentiation is that the cell is able to translate its morphology, i.e., roundish or spread, into a fate decision. We hypothesize that strain states in the nuclear envelope (NE) cause changes in the structure of the nuclear pore complexes. This induces significant changes in the NE's permeability to the traffic of the transcription factors involved in stem cell differentiation which are imported into the nucleus by passive diffusion. To demonstrate this, we set up a numerical model of the transport of diffusive molecules through the nuclear pore complex (NPC), on the basis of the NPC deformation. We then compared the predictions of the model for two different cell configurations, with roundish and spread nuclear topologies, with those measured on cells cultured in both configurations. To measure the geometrical features of the NPC, we used electron tomography to reconstruct three-dimensional portions of the envelope of cells cultured in both configurations. We found non-significant differences in both the shape and size of the transmembrane ring of single pores with envelope deformation. In the numerical model, we thus assumed that the changes in pore complex permeability, caused by the envelope strains, are due to variations in the opening configuration of the nuclear basket, which in turn modifies the porosity of the pore complex mainly on its nuclear side.
To validate the model, we cultured cells on a substrate shaped as a spatial micro-grid, called the "nichoid," which is nanoengineered by two-photon laser polymerization, and induces a roundish nuclear configuration in cells adhering to the nichoid grid, and a spread configuration in cells adhering to the flat substrate surrounding the grid. We then measured the diffusion through the nuclear envelope of an inert green-fluorescent protein, by fluorescence recovery after photobleaching (FRAP). Finally, we compared the diffusion times predicted by the numerical model for roundish vs. spread cells, with the measured times. Our data show that cell stretching modulates the characteristic time needed for the nuclear import of a small inert molecule, GFP, and the model predicts a faster import of diffusive molecules in the spread compared to roundish cells.

Keywords: nuclear pore complex, passive diffusion, nuclear envelope permeability, stem cell differentiation, finite element modeling, scanning transmission electron microscopy, confocal microscopy

## INTRODUCTION

The mechanobiological cues guiding stem cell fate are currently being intensely explored in vivo (Rompolas et al., 2013) and in vitro (Nava et al., 2012). In vitro, they can be modulated through substrate stiffness, surface nanotopography, microgeometry, and extracellular forces. For example, the culture of mesenchymal stem cells (MSCs) on substrates with tuned elasticity (Swift et al., 2013), or with a size and geometry constraint (Nathan et al., 2011; Tseng et al., 2012), results in an alteration in cell spreading, leading to major remodeling of the cellular cytoskeleton. This remodeling, in turn, alters the nuclear shape, mediated by the traction transmitted to the nucleus by the filamentous actin cytoskeleton (Badique et al., 2013). However, exactly how alterations in nuclear shape are transduced into stem cell fate is unknown.

Here, we hypothesize that strain states in the nuclear envelope (NE) cause changes in the structure of the nuclear pore complex (NPC). This would lead to a significant change in the permeability of the nuclear envelope to the traffic of those transcription factors involved in stem cell differentiation which are very small and thus imported in the nucleus through the NPCs by simple passive diffusion. The molecular weight of these diffusive molecules has been estimated to be lower than 40 kDa (Paine et al., 1975; Ribbeck and Görlich, 2001) but can reach dimensions up to 70 kDa (Wei et al., 2003; Cardarelli and Gratton, 2010; Bizzarri et al., 2012).

The multiprotein structure of an NPC is detailed in **Figure 1**. NPCs are assemblies of several coaxial rings and structures with 8-fold rotational symmetry (Goldberg and Allen, 1996; Beck et al., 2004; Löschberger et al., 2012), named according to their spatial location: (1) the cytoplasmic ring (CR) and filaments on the cytoplasmic side; (2) the spoke ring (SR) and transmembrane ring, which provide stiffness and stability to the complete NPC, in the nuclear envelope; (3) the nuclear ring (NR), which is attached to the lamina, the nuclear basket, and the distal ring (DR) on the nuclear side. For a detailed description of the structure of the NPC, see the review by Garcia et al. (2016).

NPCs pose efficient barriers to big inert objects (Mohr et al., 2009) and regulate the protein translocation between the cytoplasm and cell nucleus, thus suppressing an intermixing of the contents of the two compartments in order to control cell life and regulate gene expression, as in cell differentiation. Small proteins and molecules can pass unassisted through the NPC by passive diffusion. This translocation process becomes increasingly restricted as the particle size increases (Paine et al., 1975; Wei et al., 2003). Passive diffusion becomes very inefficient when approaching an upper molecular weight limit of around 40–70 kDa. Thus, larger proteins are let into the nucleus by an NPC selective receptor on the FG-domain, which recognizes a specific import motif (called the nuclear localization signal) expressed by the cargos. This process of protein translocation, named facilitated translocation, is often associated with an input of metabolic energy, thus enabling transport also against a concentration gradient (Paine et al., 1975; Ribbeck and Görlich, 2001; Naim et al., 2007).

According to the basic principles of mass transport, the nuclear flux of small transcription factors occurring by passive diffusion should be proportional to their concentration gradient across the NPC, by a coefficient related to the dimension of the pore lumen. Variable diameters have been observed in the NPC, likely made possible by large-scale rearrangements of double-ring protein subcomplexes (Bui et al., 2013). Such large-scale rearrangements are now believed to be biologically significant only for the transport of huge macromolecular cargoes.

To the best of our knowledge, no one has yet hypothesized a role for the pore dimensional variations in regulating the purely diffusive nuclear fluxes of signaling molecules, such as transcription factors, including those involved in stem cell differentiation. This work defines one of these mechanisms using an advanced mechanobiology model based on the integration of a computational model of protein nuclear diffusion with nuclear deformation, with direct measurements on the cells of nuclear import flows of small diffusive proteins.

Computational modeling of nuclear diffusion-deformation phenomena entails coupling structural mechanics models for the NE and NPC with diffusion equations for the transcription factors, which is an essentially unexplored field. Few published examples of numerical simulations address the mechanics of the NPC and its effect on nucleocytoplasmic transport (reviewed in Garcia et al., 2016). At the cell scale, our group developed a finite element simulation of passive diffusive fluxes from the cytoplasm to the nucleus, accounting for nuclear deformation (Nava et al., 2015a). This model coupled nuclear diffusion with local NE deformation in transient conditions, through a strain-dependent diffusion coefficient. At the nanoscale, numerical simulations based on molecular dynamics predicted a cargo trajectory through an NPC by interaction with the FG-domain of an NPC selective receptor (Moussavi-Baygi et al., 2011). This model also supports the hypothesis that the mechanical response of the NPC may affect the diffusion of cargos and smaller molecules through the nuclear pore.

A major challenge in calibrating these numerical models is the direct measurement, in cells, of the nuclear import flows of small diffusive proteins. The study of protein mobility, or of protein translocation between different compartments of live cells (such as nucleocytoplasmic translocation), was made possible by the discovery and development of fluorescent proteins (FPs) (Chalfie, 1994; Tsien, 1998). FPs are a class of genetically encodable proteins derived from sea organisms and, in particular, from the jellyfish Aequorea victoria.

Using molecular biology techniques and commercial scanning microscopes, FPs can be tagged to any protein of interest. In addition, fluorescence microscopy can visualize, localize and track proteins in live cells and also reveal the extensive networks of protein-protein interactions that regulate cell processes (Lippincott-Schwartz et al., 2003). Fluorescence recovery after photobleaching (FRAP) is particularly useful in assessing the dynamic and biochemical properties of intracellular proteins in a single or multiple cell compartment (Sprague and McNally, 2005). FRAP was originally conceived in 1974 by Peters et al. (1974) and is very useful for studying protein mobility because it is only based on the change in optical properties, whereas the dynamics and biochemistry of the molecules of interest are not perturbed. FRAP, along with other optical fluorescence microscopy techniques, has been widely used to study and understand passive and active diffusion mechanisms through the NPC (Wei et al., 2003; Yang et al., 2004; Cardarelli and Gratton, 2010; Bizzarri et al., 2012).

In this work, we hypothesized that strain states in the NE cause changes in the structure of the NPC, thus in turn causing a significant change in the permeability of the NE to the traffic of transcription factors that are imported into the nucleus by passive diffusion. To quantify this effect, we set up a numerical model of the interaction between the NPC and the NE. We measured geometrical parameters of the NPC size/shape on reconstructed three-dimensional (3D) portions of the NE, in both roundish and spread cell configurations, by applying electron tomography (ET) analysis on cultured cells. We set up a computational model of the NPC-NE mechanical interaction in which the changes in NPC permeability due to the NE strains are caused by variations in the opening configuration of the nuclear basket, which in turn modifies the porosity of the NPC nuclear side in the NE.

To validate this model, we cultured cells in a substrate nanoengineered by two-photon laser polymerization which can maintain roundish cell nuclei due to the isotropic adhesion of cells to a 3D micro-lattice, called the "nichoid" (Raimondi, 2013). Cells adhering to the flat substrate surrounding the individual nichoids adhered in standard spread conditions to the flat 2D surface and showed spread nuclei. We transfected untagged GFP protein into MSCs grown in both roundish and spread conditions.

Our aim was to quantify, with FRAP experiments, how cell morphology affects the nuclear envelope permeability and hence the nucleocytoplasmic exchange of transcription factors. Finally, we compared the diffusion time constants predicted by the numerical model for roundish vs. spread cells with the constants measured by FRAP.

### MATERIALS AND METHODS

### Experimental Protocols for NE Reconstruction by Scanning TEM (STEM)

#### Cell Culture

MSCs were isolated from the bone marrow of adult rats (Zoja et al., 2012). Cells were isolated and cultured in alpha-MEM medium supplemented with 20% fetal bovine serum (FBS), 1% L-glutamine (2 mM), penicillin (10 units/ml), and streptomycin (10 µg/ml) at 37°C and in 5% CO<sub>2</sub> (Euroclone, Italy). The culture medium was changed every 2–3 days and cells were used at stages 1–3 after thawing. The animal protocols used in this study comply with the institutional protocols for ethical use currently in force.

#### Sample Preparation for STEM Analysis

MSCs were plated (20,000 cells/cm<sup>2</sup>, n = 3) on glass coverslips (13 mm diameter) or 35 mm Petri dishes. One day after plating, the culture medium was removed and cells were washed with phosphate buffered saline. To model the deformed (spread) configuration, MSCs were fixed for 2 h at room temperature with 1.5% glutaraldehyde in 0.1 M sodium cacodylate (pH 7.2), detached by scraping, centrifuged to recover the pellet, kept overnight at 4°C in 1.5% glutaraldehyde in 0.1 M sodium cacodylate and finally rinsed in 0.1 M sodium cacodylate (pH 7.2). To model the undeformed (roundish) configuration, MSCs were detached with trypsin, centrifuged to recover the pellet, fixed overnight with 1.5% glutaraldehyde in 0.1 M sodium cacodylate, and rinsed in 0.1 M sodium cacodylate.

#### STEM Analysis

After chemical fixation, MSCs in the spread and roundish configurations were washed several times in 0.1 M sodium cacodylate (pH 7.2), post-fixed in 1% osmium tetroxide in distilled water for 2 h and stained overnight at 4°C in an aqueous 0.5% uranyl acetate solution. After several washes in distilled water, the samples were dehydrated in a graded ethanol series and embedded in EPON resin. Sections of about 70 nm were cut with a diamond knife (DIATOME) on a Leica EM UC6 ultramicrotome. Transmission electron microscopy (TEM) images were collected with an FEI Tecnai G2 F20 (FEI Company, The Netherlands). EM tomography was performed in scanning TEM (STEM) mode, using a high-angle annular dark-field (HAADF) detector on 400 nm thick sections of MSCs in both spread and roundish configurations. The tilt series were acquired over a ±60° tilt range. The resulting images had a pixel size of 1.85 nm, as shown in **Figure 2**. The tomograms were computed with IMOD (version 4.8.40) (Kremer et al., 1996). Isosurface-based segmentation and three-dimensional visualization on unbinned and unfiltered tomograms were performed using Amira (FEI Visualization Science Group, Bordeaux, France).

#### Nuclear Envelope 3D Reconstruction

The open-source image processing software IMOD (Kremer et al., 1996), developed at the University of Colorado and specialized in tomographic reconstruction, was used to segment the STEM images. Segmentation was performed manually on each slice. This process was guided by first locating the heterochromatin, which lies very close to the membrane on the nuclear side (**Figure 2**). **Figure 3A** shows a typical slice segmentation detailing the location of several nuclear pores in the membrane. This process was followed for each slice, as shown in **Figure 3B**. The nuclear envelope was then reconstructed by linear interpolation of the segmentation between consecutive slices (**Figure 3C**).

Once the 3D reconstruction of the NE had been built, the geometrical data of the pores were measured directly using IMOD. Since the pore section is slightly elliptical, the area of each NPC was obtained from its two main diameters, which were measured as pixel-by-pixel distances using IMOD. Additional post-processing of the pore dimensions was performed in Matlab R2017b. Because the distances were measured between pixels of the stacked segmented slices, the measured main diameters are the closest approximations to the real diameters that the limited resolution of the STEM images allows. To obtain an accurate approximation of the pore area, a total of 16 and 19 pores were identified in the reconstructions and measured for the spread and roundish configurations, respectively.
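The area calculation described above can be sketched as follows, assuming the pore cross-section is treated as an ellipse whose axes are the two measured main diameters, so that the area is π·d<sub>1</sub>·d<sub>2</sub>/4. The 1.85 nm pixel size is the value reported for the STEM images; the function name and the diameter values are hypothetical and serve only as an illustration, not as measured data.

```python
import math
import statistics

PIXEL_SIZE_NM = 1.85  # pixel size of the STEM images reported in the text


def pore_area_nm2(d1_px, d2_px, pixel_size_nm=PIXEL_SIZE_NM):
    """Area (nm^2) of an elliptical pore from its two main diameters in pixels."""
    d1_nm = d1_px * pixel_size_nm
    d2_nm = d2_px * pixel_size_nm
    return math.pi * d1_nm * d2_nm / 4.0


# Hypothetical diameter pairs in pixels, for illustration only.
diameters_px = [(48, 44), (50, 46), (47, 45)]
areas = [pore_area_nm2(d1, d2) for d1, d2 in diameters_px]
print(statistics.mean(areas), statistics.stdev(areas))
```

Because each diameter is an integer number of pixels, the uncertainty of a single diameter is on the order of one pixel, which is why the measurement is averaged over several pores.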

### Experimental Protocols to Analyse the Diffusive Process on Cells

#### Cell Culture on Flat and 3D Substrates

To recreate the two cell morphologies, spread and roundish, cells were seeded on a chambered 160 µm-thick cover glass (Lab-Tek II, Thermo Scientific-Nunc) patterned with 3D "nichoid" structures fabricated using an organic-inorganic photoresist (SZ2080) by two-photon laser polymerization (Raimondi, 2013). In each chamber well, three niches were arranged in a triangular pattern, at a relative distance of 200 µm. Individual niches were 30 µm high and 90 × 90 µm<sup>2</sup> in transverse dimensions. They consisted of a lattice with interconnected lines, comprising a complex structure with pores of a graded size (**Figure 4**). The lines had a uniform spacing of 15 µm in the vertical direction, and a graded spacing of 10, 20, and 30 µm in the transverse direction. Each niche was surrounded by four outer confinement walls, made up of horizontal rods spaced by 7.5 µm, resulting in small gaps of 2 µm, which allow the diffusion of nutrients but prevent the cells from escaping outside the niche (Nava et al., 2015b).

FIGURE 3 | STEM Cell segmentation of the Nuclear Envelope and Pores. (A) Cell electron tomography with Nuclear Envelope segmentation (green). (B) Segmented cell tomographies for 3D reconstruction. (C) Single-slice segmentation of the NE (blue-left). 3D reconstruction (blue-right).

Before cell seeding, samples were washed three times in deionized water, incubated overnight in 70% ethanol, washed three times in sterile deionized water and irradiated with UV light for at least 1 h. The samples were then treated with 0.01% poly-L-lysine solution (Sigma-Aldrich, Italy) to improve cell adhesion, and again washed three times with sterile deionized water. Once dry, 20 × 10<sup>3</sup> MSCs were seeded in each chamber. The day after, the cells were transiently transfected with untagged GFP protein (pmaxGFP, Lonza, Switzerland).

#### Cell Transfection

Cells were transiently transfected with GFP plasmid (pmaxGFP, Lonza, Switzerland) using the jetPRIME reagent (Polyplus, USA). A solution consisting of 0.5 µg of DNA, 25 µl of jetPRIME buffer and 1.12 µl of jetPRIME reagent was prepared and kept at RT for 15 min. Cells were incubated with the transfection solution added to 400 µl of antibiotic-free medium (alpha-MEM, 20% FBS, 1% L-Glutamine; Euroclone, Italy). After 4 h, the solution was replaced with the complete medium. The day after, the medium was replaced with a DMEM phenol-red free medium (Lonza, Switzerland) containing 10% FBS, 1% Pen/Strep, 1% L-Glutamine. Nuclei were stained with 1 µM DRAQ5 fluorescent probe (ThermoFisher, Italy) 10 min before the measurements.

#### Fluorescence Recovery After Photobleaching (FRAP)

FRAP measurements were performed with a confocal laser scanning microscope (Leica SP8, Germany) equipped with an argon laser and a white-light laser, a 63X PlanApo oil-immersion objective (NA 1.4) and an incubator chamber. To identify the cell nucleus and choose the best plane to perform the FRAP measurement, DRAQ5 dye was detected using 8% of the Leica white-light laser (excitation 633 nm, emission 650–750 nm). For each cell, a region of interest identifying the section of the nucleus on which the FRAP measurement was later taken was recorded to calculate the area. To acquire GFP emission, the argon laser, set to 70% of its full power, was used at 0.2% (excitation wavelength 488 nm, emission wavelength 500–580 nm). Photobleaching of nuclear GFP was achieved by a single-point (non-scanning) bleach near the center of the nucleus with the 488 nm laser at full (100%) power. In flat cells, the time required to photobleach most of the nuclear fluorescence without destroying too much of the cytosolic fluorescence was 3–5 s. In the case of cells grown in the niche, the maximum photobleaching time was 100 ms to avoid bleaching the GFP protein present in the cytoplasm.

Fluorescence recovery was measured by starting a time-lapse acquisition within a few hundred milliseconds (382 ms) after the bleaching, acquiring 20 images every 191 ms and then 90 images every 6 s. Image size was 256 × 256 pixels and the scan speed was 700 Hz. Pinhole size was set to 3.0 Airy units, corresponding to a z resolution of 2.3 µm. Ten acquisitions each were performed for cells grown on a flat surface (spread cells) and for cells grown in the 3D scaffold (roundish cells). The recovery of the fluorescence was evaluated for about 10 min, which is enough time to observe a fluorescence intensity plateau lasting a few minutes in the recovery curve. At this plateau, the exchange of dark (bleached) and bright (fluorescent) protein between the cytosol and the nucleus no longer changes the measured signal. The curve associated with the image background was subtracted from each acquisition. To remove the intrinsic loss of fluorescence due to the imaging process, the nuclear fluorescence data were normalized to the total cell GFP-fluorescence intensity calculated with a ROI located on the cell edge (**Figures 5A,B**). Data were also normalized by the average value of the nuclear fluorescence intensity calculated over the last 30 s of the measurement. The curves obtained were then shifted to start at zero of the graph.
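The acquisition schedule and the first normalization steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: the slow series is assumed to start one 6 s interval after the last fast frame, the background is assumed to be available frame by frame, and the function names are hypothetical. The final shift of the curves to the origin is left out because the text does not specify along which axis it was applied.

```python
def acquisition_times(delay_s=0.382, n_fast=20, dt_fast_s=0.191,
                      n_slow=90, dt_slow_s=6.0):
    """Frame times (s) after the bleach pulse; chaining of the series is assumed."""
    fast = [delay_s + k * dt_fast_s for k in range(n_fast)]
    slow = [fast[-1] + (k + 1) * dt_slow_s for k in range(n_slow)]
    return fast + slow


def normalize_recovery(times, nucleus, whole_cell, background, tail_s=30.0):
    """Subtract background, correct for imaging bleaching, scale to the plateau."""
    # Steps 1 and 2: background subtraction, then ratio to the whole-cell signal.
    ratio = [(nuc - bkg) / (cell - bkg)
             for nuc, cell, bkg in zip(nucleus, whole_cell, background)]
    # Step 3: divide by the mean over the last `tail_s` seconds of the recording.
    t_end = times[-1]
    tail = [r for t, r in zip(times, ratio) if t >= t_end - tail_s]
    plateau = sum(tail) / len(tail)
    return [r / plateau for r in ratio]


times = acquisition_times()
print(len(times), times[-1] / 60.0)  # number of frames, duration in minutes
```

Under this chaining assumption the series contains 110 frames and spans roughly 9 min, in line with the "about 10 min" of recovery quoted in the text.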

The fluorescence signal was assumed to be proportional to the GFP concentration and described by the function:

$$F\left(t\right) = F^{\infty} + \left(F^{0} - F^{\infty}\right)e^{-\frac{t}{t\_{1}}}\tag{1}$$

The averaged and normalized fluorescence recovery in the cell nucleus of the spread and roundish cells was calculated and fitted (Origin Pro software) to a single exponential function using the following equation:

$$y = y\_0 + A\_1 e^{-\frac{t}{t\_1}} \tag{2}$$

where t<sub>1</sub> is the characteristic time (time constant) of the protein translocation from the cytosol to the nucleus, A<sub>1</sub> is the difference between the nuclear fluorescence after the bleaching and the nuclear fluorescence at the end of the recovery, which corresponds to the fraction of protein involved, and y<sub>0</sub> is the fluorescence background (see **Table 2**).

… in one roundish cell (red line). The GFP protein ratio bleached during the measurement and the ratio of fluorescent GFP protein recovered into the cell nucleus are highlighted.
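To make the role of the three fit parameters concrete, the sketch below estimates them for a single-exponential recovery of the form y = y<sub>0</sub> + A<sub>1</sub>·exp(−t/t<sub>1</sub>). It is not the fitting procedure of Origin Pro: as a simplifying assumption, y<sub>0</sub> is taken as the mean of the final plateau and the time constant is obtained from a straight-line fit of ln|y − y<sub>0</sub>| against time. The function name, the thresholds and the synthetic curve are hypothetical.

```python
import math


def fit_single_exponential(times, values, tail_s=30.0, min_fraction=0.05):
    """Log-linear estimate of (y0, A1, t1) for y = y0 + A1 * exp(-t / t1)."""
    # Plateau level: mean of the samples in the last `tail_s` seconds.
    t_end = times[-1]
    tail = [v for t, v in zip(times, values) if t >= t_end - tail_s]
    y0 = sum(tail) / len(tail)
    amp0 = values[0] - y0
    if amp0 == 0.0:
        raise ValueError("curve is flat, no time constant can be estimated")
    sign = 1.0 if amp0 > 0 else -1.0
    # Keep only samples that are still clearly away from the plateau.
    pts = [(t, math.log(sign * (v - y0)))
           for t, v in zip(times, values)
           if sign * (v - y0) > min_fraction * abs(amp0)]
    if len(pts) < 2:
        raise ValueError("not enough samples before the plateau")
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_l = sum(l for _, l in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (l - mean_l) for t, l in pts)
    slope = sxy / sxx                      # equals -1 / t1
    intercept = mean_l - slope * mean_t    # equals ln|A1|
    return y0, sign * math.exp(intercept), -1.0 / slope


# Synthetic, noise-free recovery curve with a 60 s time constant.
demo_t = [2.0 * k for k in range(301)]
demo_y = [1.0 - 0.8 * math.exp(-t / 60.0) for t in demo_t]
y0_fit, a1_fit, t1_fit = fit_single_exponential(demo_t, demo_y)
print(y0_fit, a1_fit, t1_fit)
```

For a recovery curve A<sub>1</sub> is negative, since the fluorescence right after bleaching is below the plateau. With noisy measurements a nonlinear least-squares fit of all three parameters is preferable, because the logarithm amplifies noise near the plateau.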

### Numerical Modeling of the Passive Diffusion Stretch Dependency Through the NE

Mechanical stretching of the nuclear lamina network (LN) plays a vital role in our hypothesis of stretch-dependent passive diffusion across the NE through the NPCs. Remodeling of the cytoskeletal fibers in the cytoplasm (i.e., actin-myosin contraction) induces lamina deformation. Because the NPC is directly linked to the lamina at the nuclear ring (NR), the NPC structure deforms there, so that the NR opens or closes depending on the nuclear deformation. When the NR opens, the flux increases, since the effective area through the nuclear basket becomes larger, which leads to a faster exchange of solutes. In addition, an increased flux through the NPCs of calcium released from the endoplasmic reticulum also increases the effective area of the distal ring (Stoffler et al., 1999a).

It thus seems logical to suggest that the permeability of the NE increases due to the increase in the NPC's effective transport area (because of the mechanotransduction to the lamina network-NPC assembly) in the nuclear basket and the distal ring (DR). In this section, we propose a stretch-dependent model of the NE permeability, φ<sub>NE<sub>i</sub></sub>. The model depends upon the local Green-Lagrange deformation tensor of the NE. **Figure 6** summarizes the main aspects of the calculation of the local permeability, and thus the local diffusion coefficient.

The local diffusion coefficient D<sub>NE<sub>i</sub></sub> along the NE that enters Fick's laws is calculated as the product of the diffusion coefficient of the GFP in the cytoplasm (assumed to be free diffusion), D<sub>cyto</sub>, and the local permeability φ<sub>NE<sub>i</sub></sub>. To calculate the local permeability, as shown in **Figure 6**, we first calculated the local deformation at every point of the NE surface, assuming that the nuclear envelope was isotropic and subject to a biaxial plane-stress distribution. We then used these values to calculate the effective transport area through the NPC by modifying the surface area of the basket. Finally, the local permeability is the ratio between the effective area of transport and the total area corresponding to a single NPC. The results predicted with this numerical model are compared with the experiments described in section Confocal Analysis and Results of the GFP Transport Measurement.

Numerical simulations of the diffusion of GFP between the nucleus and cytoplasm were carried out in two different ellipsoidal configurations of the nucleus, roundish (cells proliferating in the niche) and flat (cells growing in the flat environment outside the niche), see **Figure 4**. The dimensions of the ellipsoidal main axes are taken from the experimental analysis previously reported by our group (Nava et al., 2015a).

#### Multiscale Numerical Model of Stretch-Dependent Diffusion for the Nucleocytoplasmic Exchange of Solutes

FIGURE 6 | Stretch-dependent permeability model. Local permeability of the nuclear membrane varies according to the degree of deformation of the nuclear envelope separating the cytoplasm (gray) from the nucleus (light violet). The orange arrows in the biaxial stretching illustration show the Lamina Network. The bottom panel depicts a typical nuclear envelope permeability distribution for cells with roundish and spread shapes.

In order to determine the strain field in the NE, it is assumed that nucleus deformation occurs at a constant volume, as reported in Nava et al. (2015a). In addition, it is assumed that the stress-free configuration of the nucleus corresponds to a sphere, whereas the deformed configuration is an ellipsoid. The mapping between the sphere and the ellipsoid surfaces can be written as

$$\begin{array}{l} x = \frac{a}{R}X, \\ y = \frac{b}{R}Y, \\ z = \frac{c}{R}Z, \end{array} \tag{3}$$

where R is the radius of the reference sphere, a, b, and c, are the semi-axes of the deformed ellipsoid, X, Y, Z are the coordinates of the points in the nucleus in the reference sphere, and x, y, z are the coordinates of the points in the nucleus in the deformed configuration. This mapping can be parametrized in terms of spherical coordinates θ (polar angle) and ϕ (azimuthal angle)

$$\begin{array}{ll} x = a \cos \theta \sin \varphi, & X = R \cos \theta \sin \varphi, \\ y = b \sin \theta \sin \varphi, & Y = R \sin \theta \sin \varphi, \\ z = c \cos \varphi, & Z = R \cos \varphi. \end{array} \tag{4}$$

As already noted, the parametrization in Equation (4) provides a one-to-one mapping between the reference and deformed configuration.

The in-plane deformation of the nuclear envelope between the reference sphere and the deformed ellipsoid can be calculated using standard continuum mechanics theory from the exact mapping described in Equations (3) and (4). In this regard, the principal in-plane Green-Lagrange deformations are given as

$$\begin{aligned} E\_1 &= \mathbf{t}\_\theta \cdot \left( \mathbf{E} \cdot \mathbf{t}\_\theta \right), \\ E\_2 &= \mathbf{t}\_\varphi \cdot \left( \mathbf{E} \cdot \mathbf{t}\_\varphi \right), \end{aligned} \tag{5}$$

where **t**<sub>θ</sub> and **t**<sub>ϕ</sub> are tangent vectors along the polar and azimuthal direction in the reference sphere, respectively

$$\mathbf{t}\_{\theta} = \begin{pmatrix} -\sin\theta \\ \cos\theta \\ 0 \end{pmatrix}, \mathbf{t}\_{\varphi} = \begin{pmatrix} \cos\theta \cos\varphi \\ \sin\theta \cos\varphi \\ -\sin\varphi \end{pmatrix}, \tag{6}$$

and **E** is the Green-Lagrange deformation tensor

$$\mathbf{E} = \frac{1}{2} \left( \mathbf{F}^t \mathbf{F} - \mathbf{I} \right), \tag{7}$$

with **F** = ∂**x**/∂**X** the deformation gradient obtained from the mapping in Equation (3). Substituting in Equation (5) results in the following expression for the principal in-plane Green-Lagrange deformations

$$\begin{aligned} E\_1 &= \frac{1}{2}\left[\left(\frac{a^2}{R^2} - 1\right) \sin^2 \theta + \left(\frac{b^2}{R^2} - 1\right) \cos^2 \theta\right], \\ E\_2 &= \frac{1}{2}\left[\left(\frac{a^2}{R^2} - 1\right) \cos^2 \theta \cos^2 \varphi + \left(\frac{b^2}{R^2} - 1\right) \sin^2 \theta \cos^2 \varphi + \left(\frac{c^2}{R^2} - 1\right) \sin^2 \varphi\right]. \end{aligned} \tag{8}$$
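The in-plane deformations can also be evaluated numerically by applying Equations (5)–(7) directly to the diagonal deformation gradient of the mapping in Equation (3). The sketch below does this with the standard library only. The reference radius R = (abc)<sup>1/3</sup> follows from the constant-volume assumption for a sphere mapped onto an ellipsoid; the function names and the semi-axes used in the example are hypothetical.

```python
import math


def reference_radius(a, b, c):
    """Radius of the stress-free sphere with the same volume as the ellipsoid."""
    return (a * b * c) ** (1.0 / 3.0)


def in_plane_strains(a, b, c, theta, phi):
    """E1 = t_theta.E.t_theta and E2 = t_phi.E.t_phi for F = diag(a, b, c) / R."""
    radius = reference_radius(a, b, c)
    # Diagonal entries of E = 0.5 * (F^T F - I), Equation (7).
    e_diag = [0.5 * ((s / radius) ** 2 - 1.0) for s in (a, b, c)]
    # Tangent vectors of the reference sphere, Equation (6).
    t_theta = (-math.sin(theta), math.cos(theta), 0.0)
    t_phi = (math.cos(theta) * math.cos(phi),
             math.sin(theta) * math.cos(phi),
             -math.sin(phi))
    e1 = sum(e * t * t for e, t in zip(e_diag, t_theta))
    e2 = sum(e * t * t for e, t in zip(e_diag, t_phi))
    return e1, e2


# Hypothetical flattened nucleus: semi-axes in micrometres, illustration only.
e1, e2 = in_plane_strains(8.0, 8.0, 1.0, theta=0.3, phi=math.pi / 2)
print(e1, e2, e1 + e2)  # the sum is the trace used for the pore opening
```

For a sphere (a = b = c) both deformations vanish, which gives a quick sanity check of any implementation.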

**Figure 7** shows the "Lamina-NR-basket-DR" assembly considered in the model. Since the radius of curvature of the NE is much larger than the nuclear pore dimensions (a ratio of about 100:1), the pore in the nuclear lamina can be modeled as a plate with a circular hole under biaxial stress, which allows for an analytic solution (Mal and Singh, 1991). We also assume that the NE deformation induces an equibiaxial state of stress/deformation on every pore in which the stress is proportional to the trace of the in-plane Green-Lagrange deformation, i.e., tr(**E**<sub>i</sub>) = E<sub>1</sub> + E<sub>2</sub>. Note that, in the case of plane stress, the trace of the in-plane Green-Lagrange tensor in small deformations is proportional to the relative area change. Following the solution in Mal and Singh (1991), the change in the nuclear ring radius, Δr (see **Figure 7**), is given by

$$
\Delta r = 2 \frac{\operatorname{tr}(\mathbf{E}\_i)}{(1 - \nu)} r\_0,\tag{9}
$$

where tr(**E**<sub>i</sub>) is the trace of the local in-plane Green-Lagrange deformation tensor, ν is the Poisson ratio of the lamina, assumed to be 0.3, and r<sub>0</sub> is the undeformed NR radius. Hence, the radius of the NR after deformation is

$$r\_{npc\_i} = r\_0 + \Delta r = r\_0 \left( 1 + 2 \frac{\operatorname{tr}(\mathbf{E}\_i)}{(1 - \nu)} \right). \tag{10}$$
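Equations (9) and (10) amount to a single scaling of the undeformed radius. A minimal sketch follows, using the NR radius (0.04 µm) and Poisson ratio (0.3) quoted in the text as defaults; the function name and the 5% strain trace are hypothetical, chosen only to illustrate the arithmetic.

```python
def deformed_nr_radius(trace_E, r0=0.04, nu=0.3):
    """Equations (9)-(10): r = r0 + 2 * tr(E) / (1 - nu) * r0, in micrometres."""
    delta_r = 2.0 * trace_E / (1.0 - nu) * r0
    return r0 + delta_r

# Hypothetical local strain trace of 0.05 (illustration only).
r = deformed_nr_radius(0.05)  # about 0.0457 micrometres
```

With these numbers the ring radius grows by a factor 1 + 0.1/0.7, i.e., roughly 14%, which shows how strongly the factor 2/(1 - ν) amplifies a modest in-plane strain.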

With these calculated radii of the NR in the deformed configurations, it is possible to obtain the lateral surface of the nuclear basket, S<sub>cone<sub>i</sub></sub> (see **Figure 7**), and thus the effective area for the transport of solutes through a single pore as

$$A\_{npc\_i}(\mathbf{E}\_i) = A\_{DR} + \left[ S\_{cone\_i}(\mathbf{E}\_i) - (1 - A\_e) \, S\_{cone\_0} \right],\tag{11}$$

where A<sub>DR</sub> is the area of the Distal Ring, S<sub>cone<sub>0</sub></sub> is the value of the lateral surface of the nuclear basket in the undeformed configuration, and A<sub>e</sub> is a surface correction factor accounting for the pillars connecting the NR and DR, which reduce the effective area of transport. In the model, A<sub>e</sub> is set to 0, which implies that the lateral surface of the basket is closed in the undeformed configuration. Once the effective transport surface area has been computed, the local permeability and the local diffusion coefficient can be readily calculated as:

$$\phi\_{\rm NE\_i} = \frac{A\_{\rm npc\_i}(\mathbf{E}\_i)}{A\_i} = \frac{A\_{\rm npc\_i}(\mathbf{E}\_i)}{A\_{\rm NE}} N\_P = \frac{A\_{\rm npc\_i}(\mathbf{E}\_i)}{A\_{\rm NE}} \rho\_{\rm npc} A\_{\rm NE\_0}. \tag{12}$$

where A<sub>i</sub> = A<sub>NE</sub>/N<sub>P</sub> is the area of the NE corresponding to a single NPC, A<sub>NE</sub> is the total area of the nuclear envelope, N<sub>P</sub> is the total number of pores, ρ<sub>npc</sub> = N<sub>P</sub>/A<sub>NE<sub>0</sub></sub> is the pore density (number of pores per unit area), and A<sub>NE<sub>0</sub></sub> is the zero-stress (spherical) surface area of the NE. With the expression of the permeability in Equation (12), the effective nuclear membrane diffusion coefficient can be calculated as:

$$D\_{\rm NE\_i} = R\phi\_{\rm NE\_i} D\_{\rm cyto},\tag{13}$$

where D<sub>cyto</sub> is the GFP-FRAP free diffusion coefficient and R is a "traffic resistance" parameter (not to be confused with the sphere radius R of Equation 4) that accounts for the resistance to the trafficking of high molecular weight cargos through the pore (pores are always full of molecules passing through them). The final permeability is therefore reduced by this resistance. Note, however, that R is considered to be the same for the roundish and spread configurations. In our case, we found that a 96.1% flux resistance was optimal to fit the numerical model to the experimental results.
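The chain from Equation (11) to Equation (13) can be sketched as follows. The basket geometry is not spelled out in the text, so the code assumes a truncated cone whose axial height equals the basket length; this assumption, all function names, the deformed envelope area, and the numerical value used for the traffic parameter are illustrative rather than taken from this work.

```python
import math

def cone_lateral_area(r_nr, r_dr, length):
    """Lateral area of a truncated cone (assumed basket geometry)."""
    slant = math.hypot(length, r_nr - r_dr)
    return math.pi * (r_nr + r_dr) * slant

def effective_pore_area(r_nr, r_nr0, r_dr, length, A_e=0.0):
    """Equation (11): A_DR + [S_cone_i - (1 - A_e) * S_cone_0]."""
    A_dr = math.pi * r_dr ** 2
    S_i = cone_lateral_area(r_nr, r_dr, length)
    S_0 = cone_lateral_area(r_nr0, r_dr, length)
    return A_dr + (S_i - (1.0 - A_e) * S_0)

def local_diffusion(A_npc, A_ne, A_ne0, rho_npc, R_traffic, D_cyto):
    """Equations (12)-(13): phi = A_npc / A_NE * rho_npc * A_NE0, D = R * phi * D_cyto."""
    phi = A_npc / A_ne * rho_npc * A_ne0
    return phi, R_traffic * phi * D_cyto

# Pore dimensions quoted in the text, in micrometres.
r0, r_dr, length = 0.04, 0.0, 0.075
A_closed = effective_pore_area(r0, r0, r_dr, length)   # undeformed pore: 0.0
A_open = effective_pore_area(0.045, r0, r_dr, length)  # hypothetical stretched pore

# A_ne and R_traffic below are placeholders for illustration.
phi, D_ne = local_diffusion(A_open, A_ne=300.0, A_ne0=290.8,
                            rho_npc=10.0, R_traffic=0.039, D_cyto=31.0)
```

With A<sub>e</sub> = 0 and a closed Distal Ring, the undeformed pore contributes no transport area, so in this sketch any permeability arises solely from stretching of the nuclear ring.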

The presented model is used to compute local values of the diffusion coefficient D<sub>NE<sub>i</sub></sub> to be included in a finite element model of the passive nuclear transport (see **Figure 6**, bottom panel). As can be seen, the finite element models of diffusion consist of a symmetric octant of a solid ellipsoidal stem cell (created using Comsol Multiphysics), one for the roundish and another for the spread configuration. These models were meshed with a total of 291,420 hexahedral elements and 302,236 nodes for the roundish cell, and 462,264 elements with 477,468 nodes for the spread cell. We divided the FE workflow into three main parts: (i) an external thin layer of elements represents the nuclear envelope, in which the calculated values of D<sub>NE<sub>i</sub></sub> were assigned to each of the elements (accounting for the permeability of the NE-NPC); (ii) the full nucleus and cytoplasm volumes, in which free diffusion was considered, complete the model, which simulates (run in Abaqus 6.14-1) the transport of GFP from the cytoplasm to the nucleus through the NE until equilibrium is reached; (iii) finally, the simulation results are post-processed using Python-Matlab to calculate the total concentration in the nucleus vs. time.
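The post-processing script itself is not reproduced here. One plausible reading of step (iii), offered only as a sketch, is a volume-weighted average of the element concentrations inside the nucleus at each output time; the function name, data layout, and numbers below are hypothetical.

```python
def nuclear_concentration(conc_by_step, volumes):
    """Volume-weighted mean concentration over nuclear elements, one value per time step."""
    total_volume = sum(volumes)
    return [sum(c * v for c, v in zip(step, volumes)) / total_volume
            for step in conc_by_step]

# Hypothetical data: three nuclear elements, three output times.
volumes = [1.0, 2.0, 1.0]
conc_by_step = [[0.0, 0.0, 0.0],
                [0.4, 0.2, 0.0],
                [0.5, 0.5, 0.5]]
curve = nuclear_concentration(conc_by_step, volumes)
```

Weighting by element volume matters for meshes like the ones described above, where the thin envelope layer and the bulk elements can differ greatly in size.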

Since the numerical finite element model is meant to fit the experimental results, the different parameters in the model were selected to be of the same order of magnitude as reported in the literature (Stoffler et al., 1999b; Beck et al., 2004; Moussavi-Baygi et al., 2011; Maimon et al., 2012; Adams and Wente, 2013; Bui et al., 2013; Eibauer et al., 2015). In particular, the SR radius was taken as 0.01 µm, the initial NR radius as 0.04 µm, the DR radius as 0.0 µm (since the DR is assumed not to be influenced by mechanical deformations of the NE), and the basket length as 0.075 µm. In addition, the value of the GFP-FRAP free diffusion coefficient was taken as D<sub>cyto</sub> = 31 µm<sup>2</sup>/s (Baum et al., 2014), and the nuclear pore density as ρ<sub>npc</sub> = 10 pores/µm<sup>2</sup> (Bizzarri et al., 2012), with which we obtain a total of 2908 NPCs/nucleus.
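The pore count and pore density quoted above jointly fix the zero-stress envelope area through ρ<sub>npc</sub> = N<sub>P</sub>/A<sub>NE<sub>0</sub></sub>. The short calculation below inverts that relation and, assuming a perfectly spherical reference nucleus, derives the implied sphere radius; the variable names are illustrative and the radius is a derived consistency check, not a value reported in the text.

```python
import math

rho_npc = 10.0   # pores per square micrometre (from the text)
n_pores = 2908   # NPCs per nucleus (from the text)

# Zero-stress NE area implied by rho_npc = N_P / A_NE0, in square micrometres.
A_ne0 = n_pores / rho_npc
# Radius of a sphere with that surface area, in micrometres.
R_sphere = math.sqrt(A_ne0 / (4.0 * math.pi))
```

This gives an area of 290.8 µm<sup>2</sup> and a radius of about 4.81 µm.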

#### RESULTS

#### Nuclear Envelope 3D Reconstruction

**Table 1** shows the pore diameters and areas obtained from the 3D reconstruction. It is worth mentioning that, in line with the STEM, the diameters measured correspond to the distance at the SR level since it is only possible to visualize the NE rather than the NPC itself. The mean diameter values show a higher deformation of the pore area in the spread cells compared to the roundish cells. These differences in diameter between the spread and roundish configurations are shown in **Figure 8A**. Despite this change in diameter, both the spread and roundish configurations show similar pore area values (see **Table 1**), with a higher dispersion of values in the spread cells, as shown in **Figure 8B**. The difference in the pore area between the roundish and spread configurations was tested with a paired, two-sided signed rank test that found no statistical differences (p = 0.19). These results reinforce the strong hypothesis that the effective area of diffusive transport relies on the NR-basket-DR assembly due to the deformation of the NE-Lamina Network (directly linked to the NR). Thus, the SR and the transmembrane ring become the main substructures on which the main stiffness of the whole NPC depends.

### Confocal Analysis and Results of the GFP Transport Measurement

One day after MSC transient transfection with GFP protein, FRAP experiments on cell nuclei were performed. GFP-transfected cell images are reported in **Figures 4B,C**, which show images acquired before the FRAP experiment of a cell grown in the niche, and of a spread cell adhered to the glass substrate shown in **Figure 4A**. The pictures show that the cell morphology changes drastically depending on the environment in which the cell is grown: the flat glass substrate (2D) or the niches (3D).

TABLE 1 | Diameters of the NPCs of both roundish and spread configurations.

Data are reported as mean and standard deviations.

TABLE 2 | Parameters of the mono-exponential function used to fit the fluorescence recovery curve on spread and roundish cells: t<sub>1</sub> is the characteristic time of the GFP protein translocation from cytosol to the nucleus; A<sub>1</sub> corresponds to the fraction of protein involved in the exchange; and y<sub>0</sub> is the fluorescence background.

FRAP experiments were performed as reported in the Materials and Methods (see section entitled FRAP) and representative images of the nuclear fluorescence recovery are shown in **Figures 5A,B**. The graph in **Figure 5C** shows the relative curves of the fluorescence recovery in the cell nucleus. Each of these curves shows the initial value of nuclear fluorescence, to which the curves were normalized, the bleaching time, and the recovery of the nuclear fluorescence over time. As shown in the graph, although GFP does not interact with other cellular components, the recovery of the fluorescence does not reach the initial intensity because, during the bleaching time, many GFP molecules (inside and outside the nucleus) are irreversibly bleached. In particular, in the 3D cell configuration, it is not possible to reach a very low level of fluorescence in the nucleus (80% of bleaching) without destroying the cellular fluorescence. The bleaching time therefore needs to be reduced from a few seconds (for the spread cells) to 100 ms, and the total recovery is calculated considering only 30% of the initial fluorescence.

**Figure 9** shows the fit of the recovery curves of the spread and roundish cells. The bleached area in the two populations is on average A<sub>spread</sub> = 123 ± 34 µm<sup>2</sup> and A<sub>roundish</sub> = 40 ± 13 µm<sup>2</sup>. The curves are well fitted with a mono-exponential function, as demonstrated by the statistical analysis (reduced χ<sup>2</sup> = 0.979 for spread cells and reduced χ<sup>2</sup> = 0.946 for roundish cells). The parameters extracted from the fits are reported in **Table 2**, which highlights the characteristic diffusion time of the GFP translocation between the cytosol and cell nucleus of the spread and roundish cells (t<sub>spread</sub> = 56 ± 2.6 s and t<sub>roundish</sub> = 26 ± 2.5 s).
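The exact functional form of the fit is not written out in the text. A common mono-exponential recovery model with the three parameters named in Table 2 is y(t) = y<sub>0</sub> + A<sub>1</sub>(1 - exp(-t/t<sub>1</sub>)), and the sketch below assumes that form. It is not the fitting routine used in this work: for each trial t<sub>1</sub> it solves the linear least-squares problem for y<sub>0</sub> and A<sub>1</sub> in closed form, scans t<sub>1</sub> on a logarithmic grid, and refines the best value by golden-section search. The data are synthetic and noise-free.

```python
import math

def linear_part(times, values, t1):
    """Best y0, A1 and squared error for a fixed t1 (ordinary least squares)."""
    g = [1.0 - math.exp(-t / t1) for t in times]
    n = len(times)
    sg, sy = sum(g), sum(values)
    sgg = sum(x * x for x in g)
    sgy = sum(x * y for x, y in zip(g, values))
    A1 = (n * sgy - sg * sy) / (n * sgg - sg * sg)
    y0 = (sy - A1 * sg) / n
    sse = sum((y0 + A1 * x - y) ** 2 for x, y in zip(g, values))
    return y0, A1, sse

def fit_recovery(times, values, t_lo=1.0, t_hi=200.0, n_grid=200, n_refine=60):
    """Fit y0 + A1 * (1 - exp(-t / t1)): grid scan over t1, then golden-section search."""
    grid = [t_lo * (t_hi / t_lo) ** (k / (n_grid - 1)) for k in range(n_grid)]
    best = min(range(n_grid), key=lambda k: linear_part(times, values, grid[k])[2])
    lo = grid[max(best - 1, 0)]
    hi = grid[min(best + 1, n_grid - 1)]
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(n_refine):
        m1 = hi - ratio * (hi - lo)
        m2 = lo + ratio * (hi - lo)
        if linear_part(times, values, m1)[2] < linear_part(times, values, m2)[2]:
            hi = m2
        else:
            lo = m1
    t1 = 0.5 * (lo + hi)
    y0, A1, _ = linear_part(times, values, t1)
    return y0, A1, t1

# Synthetic curve using the roundish-cell time constant quoted in the text (26 s);
# the background (0.1) and amplitude (0.6) are arbitrary illustration values.
times = [2.0 * k for k in range(61)]
values = [0.1 + 0.6 * (1.0 - math.exp(-t / 26.0)) for t in times]
y0, A1, t1 = fit_recovery(times, values)
```

Separating the linear parameters from t<sub>1</sub> reduces the fit to a one-dimensional search, which avoids the need for starting guesses for y<sub>0</sub> and A<sub>1</sub>. With real, noisy data a weighted fit and an uncertainty estimate on t<sub>1</sub> would also be needed.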

### Numerical Simulations of Stretch Dependent Diffusion of GFP

**Figure 10** shows the finite element simulation results of the recovery of GFP by the stretch-dependent diffusion model previously described for both spread (blue) and roundish (red) configurations of the nucleus. The faster recovery of the spread compared to the roundish nucleus is clear, since the level of deformation in the NE of the spread nucleus is larger than in the roundish nucleus, and the NE is therefore more permeable (see **Figure 6**, bottom panel). According to the results in **Figure 10**, the corresponding characteristic times for the spread and roundish configurations were found to be t<sub>1,spread</sub> = 17.2 s and t<sub>1,roundish</sub> = 25.4 s, which were very similar to those obtained experimentally. These results were obtained using the aforementioned structural/dimensional values of the pores and the corresponding permeability. The difference in recovery times is only due to the degree of modulation that the deformation of the NE exerts on the NE permeability.

#### DISCUSSION


To the best of our knowledge, there are no papers in the literature that specifically use computational mechanics and numerical analyses to demonstrate strain-dependent passive diffusion through the NE. Instead, the focus has been on the mechanisms that govern the active transport of cargoes through the NPCs; see, for example, Moussavi-Baygi et al. (2011), Azimi and Mofrad (2013), and Mahboobi et al. (2015). The work by Nava et al. (2015b) treated the passive diffusion of solutes between the nucleus and the cytoplasm as strain-dependent. In their analysis, the whole nucleus is deformed and assumed to be a permeable material. In our literature search we found no other studies on this topic.

The multiscale numerical model presented in this paper is thus the first attempt to directly analyse the passive diffusion of small molecules through the deformed NPCs (nano-level) at the nuclear envelope scale (micro-level) by including a strain-dependent variable permeability barrier in the NE. Our results highlight the potential of our numerical model to describe the passive transport through the nuclear membrane, that is, the passive diffusive flux of small molecular weight particles. Note that, since the DR diameter is a variable parameter of the numerical model, it may also account for calcium effects on distal ring opening. However, in our numerical simulations, we set the DR diameter to 0 µm in order to analyze only the mechanical dependency of the passive diffusion. This is because the DR is chemically opened/activated by a calcium flux through the NPC (Stoffler et al., 1999a; Wang and Clapham, 1999).

An important limitation of our model is the use of small deformation theory for the NPC and isotropic linear elastic behavior for the NE-lamina network. We assumed such mechanical properties due to the lack of available data for the NE-lamina-NPC assembly. We also consider this numerical model as a first attempt to demonstrate our hypothesis, and we believe that more complex material properties should not qualitatively modify our final results to any great extent. However, these assumptions require further research in order to obtain more accurate results that would reinforce our final hypothesis.

Fluorescence recovery after photobleaching belongs to a class of measures based on photoperturbation. This means that only the optical properties of the protein of interest are changed and, after the perturbation, the protein redistribution in space is monitored in time-lapse. This class of measures is also known as ensemble-averaging: the results are obtained over a relatively long time (from hundreds of milliseconds to minutes) and reflect the average behavior of many molecules. This means that the measure masks fast diffusion processes or hides the properties of sub-populations. In general, the results need to be coupled with a mathematical model to help in the data interpretation.

Usually, in FRAP experiments, a high concentration of the protein of interest is expressed in a live cell fused with a fluorescent protein (GFP, for example). A small area in a single cell compartment, i.e., the region of interest, is permanently bleached by strong laser illumination, and the redistribution of the fluorescence in the entire cell is monitored by low-intensity excitation (as in **Figures 5A,B**). If the protein of interest is immobile, the bleached area will remain dark. On the other hand, if it is mobile, fluorescent and bleached protein are exchanged between the region of interest and the rest of the cell.

In order to study the mobility of a nuclear protein, due to the confocal/wide-field microscopy set-up, the bleaching takes place in a cylindrical volume of a few microns along the z axis, which includes the cell nucleus and also the cytoplasm. This involves the destruction of the fluorescence of a small portion of the protein in the cell cytoplasm. However, this does not affect the measure because in cells grown on a 2D flat surface (like our spread cells), these cytosolic bleached volumes are very small, since the nucleus generally fills the space between the upper and lower plasma membranes.

This technique has been used to evaluate the protein redistribution between two different cell compartments, i.e., between the cell cytoplasm and nucleus. In this experimental configuration, as in our experiments, a photobleached area as wide as possible within the nucleus was used to ensure that the entire nucleus was photobleached, and the nuclear intensity recovery, as a consequence of the protein transport between the cytoplasm and the nucleus, was monitored. In this case, the prolonged GFP fluorescence recovery of the nuclear compartment (tens of seconds), compared to the GFP free diffusion in the nucleus or in the cell cytoplasm (2 s) (Lippincott-Schwartz et al., 2001; Wei et al., 2003; Sprague and McNally, 2005; Bizzarri et al., 2012), is due to the restricted diffusion across the nuclear envelope, which is in line with the diffusion through the open NPCs (∼0.01 of total NE surface area, Wei et al., 2003). This is also shown in our work, in the graphs in **Figure 5C**.

Our results also show that our experimental conditions (the long bleaching time performed on spread cells, and the large volume of cytoplasm above and below the nucleus in the roundish cells) induced a high ratio of fluorescent protein disruption. This is also supported by the fact that we are working with a single GFP, which is a non-toxic inert protein that does not interact with nuclear and cytosolic components and therefore does not show an immobile fraction during the FRAP measures. As is evident from our results on the GFP translocation, the characteristic time between the cytoplasm and the nucleus of the spread cells is comparable with those in the literature for cells grown on a standard flat substrate such as a glass coverslip (Wei et al., 2003; Sunn et al., 2005; Bizzarri et al., 2012). At the same time, we were unable to compare the results obtained on roundish cells because there are no similar experiments in the literature performed on cells grown on 3D scaffolds.

A comparison of the characteristic recovery times of the nuclear fluorescence of these two cell populations led to an unexpected result: the fluorescence recovery was faster for the roundish cells than for the spread cells. A careful evaluation revealed that we were evaluating the fluorescence recovery on a single plane of the cell (3 µm in thickness), in which the area of the nucleus differed greatly between spread cells and roundish cells (A<sub>spread</sub>/A<sub>roundish</sub> = 3). This means that a larger number of particles have to translocate and therefore it takes a longer time for the GFP to diffuse over the area of the spread cell nucleus. However, an experimental analysis performed on MSCs grown on a flat glass substrate and in the nichoid did not show a significant difference in the nuclear volume (Nava et al., 2015a), which suggests that the number of proteins that translocate from the cytosol to the cell nucleus in FRAP experiments does not strongly influence the measure. Other factors that may affect the recovery time therefore need to be considered, such as a strong modulation of the number of pores, or a reduction/increment in the effective nuclear surfaces.

None of the results presented in this manuscript i.e., the NPC spoke ring area via STEM analysis, the numerical parametric finite element diffusion model and FRAP experiments with the confocal microscopy, contradict the hypothesis that the deformation/strain of the nuclear envelope induces structural modifications in the NPCs and thus directly affects the passive diffusion of molecules. These results can be directly linked to the existence of small diameter secondary channels through the NPC that may allow small molecules such as ions to pass from the cytosol to the nucleus. In the case of a full blockage of the NPC due to high trafficking and deformation, it therefore allows the ions to open the distal ring and thus, to increase the flux through the NPC. We already mentioned this in a previous paper (Garcia et al., 2016) and referenced the works of (Maimon et al., 2012; Eibauer et al., 2015) which showed such secondary channels.

A major limitation of the work discussed in the present paper is that we were forced to use two different techniques to maintain roundish cells in the experiments. The cells were fixed in suspension to keep them roundish for the STEM reconstructions used to estimate the NPC dimensions, and they were cultured in the nichoid substrate for the FRAP measurement of the GFP nuclear import. In fact, the nichoid substrate is made of a fragile polymer that cannot be sectioned for STEM preparation without it being destroyed. Moreover, cells cannot be measured by FRAP for nuclear import flows while in suspension.

Reducing the cell adhesion sites by limiting the area of the adhesion substrate available for integrin binding, which is a similar approach to suspension culture, is a widely-accepted method used to induce a roundish cell morphology in mechanobiological studies (Badique et al., 2013). However, reducing the adhesion sites to maintain cells in a roundish morphology is likely to down-regulate the activation of mechanobiological transcription factors and other signaling molecules linked to the pathways activated by focal adhesions. Thus, inducing cell adhesion to a 3D scaffold is preferable to limiting the cell adhesion sites, for mechanobiology investigations. However, here we did not measure the activation or nuclear imports of transcription factors or signaling molecules, we only measured the nuclear imports of the GFP protein, expressed in the cell following transfection regardless of the cell morphology. In designing the experiments, we basically assumed that nuclear pore activation was primarily affected by NE local strains induced by nuclear deformation, regardless of the means used by the cell to adhere to its environment.

Another important limitation of our study is that in the nichoid culture model, the mechanical properties of the adhesion substrate were different for the spread and roundish cells. Spread cells adhered to glass, with a Young's modulus of around 80 GPa, while the photo-polymerized nichoid micro-lattice has a Young's modulus in the order of 0.138 GPa, i.e., three orders of magnitude less stiff than glass. The stiffness of a substrate to which the cell adheres is known to correlate significantly with the fate of several stem cell types, including MSCs, thanks to pioneering demonstrations by the research groups of Discher and Engler. It could thus be argued that the differences between the roundish and spread cells that we measured by FRAP in terms of nuclear flows are related to differences in the adhesion substrate stiffness. However, our previous findings using the nichoid cell culture model (Nava et al., 2015b) suggest, in addition to the stiffness theory, that the substrate stiffness combines with the substrate architecture in generating an adhesion configuration for the cell, which can be either isotropic (roundish) or very far from isotropic (spread), which correlates with the shape of the cell's nucleus.

We deduced that the level of nuclear isotropy induced by the combination of stiffness and architecture of the adhesion substrate, and not the substrate stiffness itself, was indeed the primary parameter correlating with the cell fate. In order to move from correlation to causation, in this work we introduced the hypothesis that, when the cell spreads, a primary mechanism activating the master switch between cell programs is the NPC stretch activation leading to a sudden increase in the permeability of the NE to purely diffusive signaling molecules. Our modeling approach, far from negating the primary role of substrate stiffness on cell fate, integrates substrate stiffness with its 3D architecture, thus providing a mechanistic interpretation of this effect, which is well corroborated by in vivo measurements of changes in diffusive nuclear flows due to nuclear morphology.

In future work, we will test our hypothesis on the key transcription factors involved in MSC differentiation. Many of these molecules are in the range 40–70 kDa, which can diffuse freely (without consuming chemical energy) through NPCs. For example, the molecular weight of MyoD, a key myogenic transcription factor, is in the range 34–45 kDa. The molecular weight of Cbfa1 (also called Runx2), a transcriptional activator of osteoblast differentiation, is 55 kDa. Thus, these key transcription factors may diffuse freely through the NPCs. Thanks to the mechanobiology model developed here, we will be able to computationally predict their nuclear import flows on the basis of their molecular weight, and we will be able to interpret and validate these predictions with FRAP measurements on cells cultured in the nichoid model. To perform FRAP measurements, we will fuse the transcription factors with an inert fluorescent protein, such as GFP which is only 27 kDa in size, enabling us to still fall within the 70 kDa limit in the overall molecular weight of the fused protein, for NPC translocation based on passive diffusion. In fact, we selected GFP in this work because of its very limited size, as it falls well below the 40 kDa lower limit known for the passive diffusion of molecules through NPCs.

We also plan to characterize, in cells of a different morphology, the activation of gene expression induced by the nuclear translocation of the transcription factors of interest. However, a quantitative correlation between NE permeability and the up-regulation of gene expression is not expected, because up-regulation very much depends on the degree of chromatin packing influencing DNA accessibility to the chemical binding of the transcription factors.

In conclusion, here we have proposed a fundamental mechanism which uses nuclear mechanics to orchestrate the response of progenitor cells to the architectural properties of the extracellular environment. Our data show that cell stretching modulates the characteristic time needed for the nuclear import of a small inert molecule, GFP. What still needs to be proven is whether this modulation effect is due to an opening of the distal ring. We also still need to prove that a transcription factor with a comparable size to GFP would be subjected to the modulation effect that we found for GFP.

If further verified on specific transcription factors involved in MSC differentiation, this idea could contribute directly to the definition of better differentiation protocols for MSCs, primarily based on guiding the spontaneous tendency of stem cells to differentiate in culture by the mechanical cues provided by "physically" biomimetic culture niches. A new research field that could be impacted by our hypothesis is the fate control of induced pluripotent stem (iPS) cells. The iPS technology consists in converting adult somatic cells, usually fibroblasts or epithelial cells, to a pluripotent phenotype using genetic engineering. Despite the high potential of iPS cells to revolutionize medicine, to date there are very few successful re-differentiation protocols toward mature phenotypes for these cells. Neurobiology is the only field where there are stable differentiation protocols. Our hypothesis could produce the knowledge and technology, the nichoid culture substrate, to direct the differentiation of iPS cells to lineages other than neural and potentially enable iPS cells to be applied in the clinical field.

The mechanobiological model of this work could also be used to compare the different nuclear mechanosensing responses in physiological vs. pathological states. For example, in cancer, it is believed that the expression of the malignant phenotype is due, at least in part, to a malfunctioning of the cell mechanoregulatory circuit. If our central hypothesis is verified, unconventional cell properties correlating the nuclear membrane structure to its permeability (including structural proteins of the cytoskeleton, the nucleus, the nuclear membrane, and the nuclear pore complexes) could become crucial new targets in cancer research.

### AUTHOR CONTRIBUTIONS

AG-G and JR: hypothesis development, numerical modeling, pre and post-processing of the numerical analysis, post-processing of the experimental STEM images, discussion and conclusions, drafting manuscript; EJ and MR: hypothesis development, experimental analysis design, pre and post-processing of the experiments, discussion and conclusions, drafting manuscript; RM: STEM imaging, writing the protocols; MT: cell culture, writing the protocols.

### FUNDING

This project received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No. 646990-NICHOID). These results reflect only the authors' views and the ERC is not responsible for any use that may be made of the information contained.

### ACKNOWLEDGMENTS

We are very grateful to our colleagues Prof. G. Cerullo, Dr. R. Osellame, and Dr. T. Zandrini for their contribution to the development of the bioengineered niches by 2PP fabrication. Part of this work was carried out in ALEMBIC, an advanced microscopy laboratory established by IRCCS Ospedale San Raffaele and Università Vita-Salute San Raffaele. We are also grateful to Barbara Bonandrini, Marina Figliuzzi, and Andrea Remuzzi at IRCCS M. Negri Institute for Pharmacological Research for providing us with the MSCs used in the study.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 García-González, Jacchetti, Marotta, Tunesi, Rodríguez Matas and Raimondi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Enabling Detailed, Biophysics-Based Skeletal Muscle Models on HPC Systems

Chris P. Bradley<sup>1</sup>, Nehzat Emamy<sup>2,3</sup>, Thomas Ertl<sup>3,4</sup>, Dominik Göddeke<sup>3,5</sup>, Andreas Hessenthaler<sup>3,6</sup>, Thomas Klotz<sup>3,6</sup>, Aaron Krämer<sup>3,5</sup>, Michael Krone<sup>3,4</sup>, Benjamin Maier<sup>2,3</sup>, Miriam Mehl<sup>2,3</sup>, Tobias Rau<sup>3,4</sup> and Oliver Röhrle<sup>3,6</sup>\*

<sup>1</sup> Auckland Bioengineering Institute, University of Auckland, Auckland, New Zealand, <sup>2</sup> Institute for Parallel and Distributed Systems, University of Stuttgart, Stuttgart, Germany, <sup>3</sup> Stuttgart Centre for Simulation Sciences, University of Stuttgart, Stuttgart, Germany, <sup>4</sup> Visualization Research Center of the University of Stuttgart, University of Stuttgart, Stuttgart, Germany, <sup>5</sup> Institute for Applied Analysis and Numerical Simulation, University of Stuttgart, Stuttgart, Germany, <sup>6</sup> SimTech Research Group on Continuum Biomechanics and Mechanobiology, Institute of Applied Mechanics (CE), University of Stuttgart, Stuttgart, Germany

#### Edited by:

Alfons Hoekstra, University of Amsterdam, Netherlands

#### Reviewed by:

Pras Pathmanathan, United States Food and Drug Administration, United States

Mark Potse, Inria Bordeaux - Sud-Ouest Research Centre, France

> \*Correspondence: Oliver Röhrle roehrle@simtech.uni-stuttgart.de

#### Specialty section:

This article was submitted to Computational Physiology and Medicine, a section of the journal Frontiers in Physiology

Received: 08 November 2017 Accepted: 11 June 2018 Published: 12 July 2018

#### Citation:

Bradley CP, Emamy N, Ertl T, Göddeke D, Hessenthaler A, Klotz T, Krämer A, Krone M, Maier B, Mehl M, Rau T and Röhrle O (2018) Enabling Detailed, Biophysics-Based Skeletal Muscle Models on HPC Systems. Front. Physiol. 9:816. doi: 10.3389/fphys.2018.00816

Realistic simulations of detailed, biophysics-based, multi-scale models often require very high resolution and, thus, large-scale compute facilities. Existing simulation environments, especially for biomedical applications, are typically designed to allow for high flexibility and generality in model development. Flexibility and model development, however, are often a limiting factor for large-scale simulations. Therefore, new models are typically tested and run on small-scale compute facilities. By using a detailed biophysics-based, chemo-electromechanical skeletal muscle model and the international open-source software library OpenCMISS as an example, we present an approach to upgrade an existing muscle simulation framework from a moderately parallel version toward a massively parallel one that scales both in terms of problem size and in terms of the number of parallel processes. For this purpose, we investigate different modeling, algorithmic and implementational aspects. We present improvements addressing both numerical and parallel scalability. In addition, our approach includes a novel visualization environment which is based on the MegaMol framework and is capable of handling large amounts of simulated data. We present the results of a number of scaling studies at the Tier-1 supercomputer HazelHen at the High Performance Computing Center Stuttgart (HLRS). We improve the overall runtime by a factor of up to 2.6 and achieve good scalability on up to 768 cores.

Keywords: skeletal muscle mechanics, biophysical modeling, multi-scale modeling, scalability, high-performance computing, numerical efficiency, visualization

## 1. INTRODUCTION

Even "simple" tasks like grabbing an object involve highly coordinated actions of our musculoskeletal system. At the core of such coordinated movements are voluntary contractions of skeletal muscles. Understanding the underlying mechanism of recruitment and muscle force generation is a challenging task and subject to much research (e.g., Kandel et al., 2000; MacIntosh et al., 2006). One of the few non-invasive and clinically available diagnostic tools to obtain insights into the functioning (or dysfunctioning) of the neuromuscular system is electromyographic (EMG) recording, i.e., measuring the activation-induced, resulting potentials on the skin surface (e.g., Merletti and Parker, 2004). Conclusions on the neuromuscular system are often drawn from results obtained through signal processing, although such signal processing techniques typically ignore the underlying muscular structure. Further limitations of (surface) EMG measurements are, for example, that they only capture activity from muscle parts close to the surface. This leads to difficulties in identifying, for example, cross-talk (e.g., Farina et al., 2005). Moreover, an EMG often only records weak signals due to layers of adipose tissue, and, in some cases, is restricted to isometric contractions. Hence, to obtain more holistic insights into the neuromuscular system, computational models can be employed (for a review see e.g., Mesin, 2013). Such models need to capture much of the electro-mechanical properties of skeletal muscle tissue and the interaction between neural recruitment and muscular contraction.

The contractile behavior of skeletal muscle tissue has been extensively modeled using lumped-parameter models such as Hill-type skeletal muscle models (e.g., Zajac, 1989), continuum-mechanical skeletal muscle models (e.g., Johansson et al., 2000; Blemker et al., 2005; Röhrle and Pullan, 2007; Böl and Reese, 2008), or multi-scale, chemo-electromechanical skeletal muscle models (e.g., Röhrle et al., 2008, 2012; Hernández-Gascón et al., 2013; Heidlauf and Röhrle, 2013). To predict the resulting EMG of a particular stimulation, there exist analytical models (e.g., Dimitrov and Dimitrova, 1998; Farina and Merletti, 2001; Mesin and Farina, 2006) with short compute times, or numerical approaches (e.g., Lowery et al., 2002; Mesin and Farina, 2006; Mordhorst et al., 2015, 2017). For realistic muscle geometries, however, numerical methods are almost unavoidable. The chemo-electromechanical models as proposed by Röhrle et al. (2012), Heidlauf and Röhrle (2013, 2014), or Heidlauf et al. (2016) are particularly well-suited to incorporate many structural and functional features of skeletal muscles. They embed one-dimensional computational muscle fibers within a three-dimensional skeletal muscle model and associate them with a particular motor unit. Moreover, those models can be directly linked to motor neuron models either phenomenologically (e.g., Heckman and Binder, 1991; Fuglevand et al., 1993) or biophysically (e.g., Cisi and Kohn, 2008; Negro and Farina, 2011) to further investigate the relationship between neural and mechanical behavior. The desired degree of detail and complexity within these models requires the coupling of different physical phenomena on different temporal and spatial scales, e.g., models describing the mechanical or electrical state of the muscle tissue on the organ scale and the bio-chemical processes on the cellular scale (cf. section 2.1).

Being able to take into account all these different processes on different scales requires a flexible multi-scale, multi-physics computational framework and significant compute power. The availability of computational resources restricts the number of individual muscle fibers that can be considered within a skeletal muscle. The chemo-electromechanical models as implemented within the international open-source library OpenCMISS (e.g., Bradley et al., 2011; Heidlauf and Röhrle, 2013; Mordhorst et al., 2015) allow general muscle geometries with about 1,000 embedded computational muscle fibers. As most skeletal muscles, however, have significantly more fibers (ranging from several thousands to more than a million; McCallum, 1898; Feinstein et al., 1955), the embedded muscle fibers geometrically represent only a selection from the actual muscle fibers located in their geometrical vicinity. While simulations with 1,000 fibers or fewer can potentially provide some insights into the neuromuscular system, some effects, such as the motor unit recruitment over the full range of motor units and muscle fibers and their implication on the resulting EMG, cannot be estimated unless a detailed and realistic model with a realistic number of muscle fibers is simulated. This full model allows us to estimate the accuracy of "reduced" models by comparing them to the output of the detailed full "benchmark" model. Unless such comparisons are carried out, it is hard to make predictions on how additional details such as, for example, more fibers or functional units (motor units) affect the overall outcome, both in terms of muscle force generation and in terms of computed EMG signals.

Highly optimized and highly parallel software exists in the community for biomechanical applications, e.g., for chemo-electromechanical heart models (Xia et al., 2012; Lafortune et al., 2012; Gurev et al., 2015; Colli Franzone et al., 2015). Skeletal muscle tissue and cardiac muscle tissue share many similarities with respect to the underlying microstructure. Therefore, similar simulation techniques can be utilized both for heart models and skeletal muscle models. However, significant differences exist with respect to recruitment and action potential propagation between cardiac and skeletal muscle tissue. Whilst there is a homogeneous and continuous spreading of the action potential across a three-dimensional myocardium, skeletal muscle exhibits highly heterogeneous recruitment and action potential propagation: essentially, each muscle fiber can be recruited independently, leading to complex potential fields. Moreover, there exist feedback mechanisms, e.g., afferent feedback, that directly modulate recruitment. To simulate such complex physiological behavior, one requires flexible computing frameworks and a careful analysis of different parallelization strategies for specific applications like skeletal muscle recruitment.

Most multi-purpose computational frameworks for biomedical applications, such as OpenCMISS, are developed to provide flexibility using parallel simulation environments, but are typically not designed for highly parallel simulations on Tier-1 supercomputers. This flexibility is achieved through standards like CellML (e.g., Lloyd et al., 2004) and FieldML (e.g., Christie et al., 2009). The respective frameworks are utilized to enhance existing multi-physics models for a wide range of (bioengineering) applications. Most computational frameworks are designed to be run by biomedical researchers on small-sized compute clusters. While they typically can be compiled on large-scale HPC compute clusters such as HazelHen at the HLRS in Stuttgart, they often are not capable of exploiting the full potential of the hardware for a number of reasons. Moreover, simulation run time is typically considered less important than model complexity and output. Hence, typical simulations of biomedical applications are not necessarily optimized for numerical efficiency, parallel scalability, the exploitation of novel algorithms, or file I/O. In this paper, we demonstrate how one can exploit analysis tools, suitable numerical techniques, and coupling strategies to obtain an efficient chemo-electromechanical skeletal muscle model that is suitable to be run on a large-scale HPC infrastructure. The model is thus capable of running with a sufficient resolution and number of muscle fibers to provide the required high-resolution details. Once large-scale simulations of biomedical applications have been solved with a high degree of detail, most specialized visualization tools such as OpenCMISS-Zinc can no longer handle the large amount of simulation data. Dedicated visualization tools for large-scale visualizations are required. In this work, the MegaMol framework (Grottel et al., 2015) has been adapted to visualize the different biophysical simulation parameters and the resulting EMG.

### 2. MODEL AND METHODS

#### 2.1. The Multi-Scale Skeletal Muscle Model

Before outlining the model in full detail, we first provide a brief overview of some relevant anatomical and physiological characteristics of skeletal muscles. From an anatomical point of view, skeletal muscles are a hierarchical system. Starting from its basic unit, the so-called sarcomere, several sarcomeres arranged in-series and in-parallel constitute a cylindrically shaped myofibril. Several myofibrils arranged in-parallel make up a skeletal muscle fiber and multiple muscle fibers form a fascicle. All the fascicles together constitute an entire muscle and these fascicles are connected together through the extracellular matrix (ECM). From a physiological point of view, several fibers are controlled by a single lower motor neuron through nervous axons. The entire unit consisting of the lower motor neuron, the axons and the respective fibers that are innervated by the axons, is referred to as a motor unit. The motor unit is the smallest unit within a skeletal muscle that can voluntarily contract. The lower motor neuron sends rate-coded impulses called action potentials (AP) to all fibers belonging to the same motor unit (neural stimulation). Moreover, motor units are activated in an orderly fashion, starting from the smallest, up to the largest (recruitment size principle). After a motor neuron stimulates a muscle fiber at the neuromuscular junction, an AP is triggered and propagates along the muscle fiber, resulting in a local activity (contraction). For more comprehensive insights into muscle physiology and anatomy, we refer to the book of MacIntosh et al. (2006).

As the focus of this research is on enabling the simulation of biophysically detailed skeletal models on HPC architectures, this section provides an overview of the multi-scale modeling framework of our chemo-electromechanical skeletal muscle model that is based on the work by Röhrle et al. (2012), Heidlauf and Röhrle (2013, 2014), and Heidlauf et al. (2016). These models can account for the main mechanical and electrophysiological properties of skeletal muscle tissue, including a realistic activation process and resulting force generation. These results are realized by linking multiple sub-models, describing different physical phenomena on different length and time scales. To reduce the computational costs, the different sub-models are simulated using different discretizations, i. e., spatial resolution and time-step size. Data are exchanged between the sub-models using homogenization and interpolation techniques. The link to neuromuscular recruitment, i.e., an entire neuromuscular model, is modeled using predefined stimulation trains for the fibers associated with individual motor units. This recruitment assumption can be replaced without any modifications with a biophysical motor neuron model (e.g., Cisi and Kohn, 2008; Negro and Farina, 2011).

#### 2.1.1. The 3D Continuum-Mechanical Muscle Model

The physiological working range of skeletal muscles includes large deformations. Therefore, we use a continuum-mechanical modeling approach that is based on the theory of finite elasticity to simulate the macroscopic deformations and stresses in the muscle tissue. In continuum mechanics, the placement function χ describes the motion of a material point, i.e., it assigns every material point with position **X** in the reference (non-deformed) domain Ω<sub>0</sub> ⊂ ℝ<sup>3</sup> at a time t<sub>0</sub> to a position **x** = χ(**X**, t) in the actual (deformed) domain Ω<sub>t</sub> ⊂ ℝ<sup>3</sup> at time t. The deformation of the body at a material point can be described by the deformation gradient tensor **F** := ∂χ/∂**X** = ∂**x**/∂**X**, which is defined as the partial derivative of the placement function χ with respect to the reference configuration. The local displacement is defined by the vector **u** = **x** − **X**.

The governing equation of the continuum mechanical model is the balance of linear momentum. Under the assumption of no acceleration (i.e., inertia forces vanish) and neglecting body forces, the balance of linear momentum in its local form can be written as

$$\operatorname{div} \mathbf{P} = \mathbf{0} \text{ in } \Omega_t \text{ for all } t, \tag{1}$$

where div(·) denotes the divergence operator and **P** is the first Piola-Kirchhoff stress-tensor. To solve the balance of linear momentum, one needs to define a constitutive equation that relates **P** to deformation. The constitutive equation describes the overall mechanical behavior of the muscle and can be divided into a passive and an active component. The latter represents the muscle's ability to contract and produce forces. In this work, we assume a superposition of the active and passive behavior, i. e., an additive split of **P**.

Passive skeletal muscle tissue is assumed to be hyperelastic and transversely isotropic. Consequently, the passive part of the first Piola-Kirchhoff stress tensor **P**<sub>passive</sub>(**F**, M) depends on the deformation gradient tensor **F** and a structure tensor M = **a**<sub>0</sub> ⊗ **a**<sub>0</sub>, which is defined by the muscle fiber direction **a**<sub>0</sub>. The isotropic part of the passive stress tensor assumes a Mooney-Rivlin material. It is enhanced by an additive anisotropic contribution accounting for the specific material properties in the muscle fiber direction **a**<sub>0</sub>.

The active force is generated on a microscopic scale, i.e., within a half-sarcomere (the smallest functional unit of a muscle) consisting of thin actin and thick myosin filaments. Based on geometrical considerations of the half-sarcomere structure, it is known that the active muscle force depends on the actual half-sarcomere length l<sub>hs</sub> (force-length relation) (Gordon et al., 1966). When a half-sarcomere is activated by calcium as a secondary messenger, actin and myosin filaments can form cross-bridges and produce forces (cross-bridge cycling). The active force state of the microscopic half-sarcomere is summarized in an activation parameter γ that enters the macroscopic constitutive equation. Furthermore, we assume that the active stress contribution acts only along the fiber direction **a**<sub>0</sub>. When considering only isometric or slow contractions, the active stress tensor **P**<sub>active</sub>(**F**, M, γ) can be defined as a function of the lumped activation parameter γ, the deformation gradient tensor **F**, and the structure tensor M. An additional force-length relationship needs to be included within **P**<sub>active</sub>.

Finally, we assume skeletal muscle tissue to be incompressible, which implies the incompressibility constraint det **F** = 1. The resulting first Piola-Kirchhoff stress tensor reads

$$\mathbf{P}(\mathbf{F}, \mathcal{M}, \gamma) = \mathbf{P}_{\text{passive}}(\mathbf{F}, \mathcal{M}) + \mathbf{P}_{\text{active}}(\mathbf{F}, \mathcal{M}, \gamma) - p\,\mathbf{F}^{-T}, \tag{2}$$

where p is the hydrostatic pressure, which enters the equation as a Lagrange multiplier enforcing the incompressibility constraint. The material parameters of the continuum-mechanical skeletal muscles are fitted to experimental data (Hawkins and Bey, 1994), and can be found in Heidlauf and Röhrle (2014).
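The kinematic quantities and the additive split of Equation (2) can be illustrated with a minimal sketch. The function names and the uniaxial test deformation are assumptions made for illustration only; the passive and active stress contributions are passed in as given tensors, because their constitutive forms and parameters are documented in Heidlauf and Röhrle (2014) rather than here.

```python
import numpy as np

def isochoric_uniaxial_stretch(lam):
    """Deformation gradient F for a stretch lam along x that preserves volume."""
    return np.diag([lam, lam ** -0.5, lam ** -0.5])

def fiber_stretch(F, a0):
    """Stretch of a line element initially aligned with the unit vector a0."""
    return float(np.linalg.norm(F @ a0))

def total_stress(P_passive, P_active, p, F):
    """Additive split of Equation (2): P = P_passive + P_active - p F^{-T}."""
    return P_passive + P_active - p * np.linalg.inv(F).T

# Illustrative values (not taken from the paper)
F = isochoric_uniaxial_stretch(1.2)
a0 = np.array([1.0, 0.0, 0.0])       # fiber direction along x
print(np.linalg.det(F))              # incompressibility: det F = 1 up to round-off
print(fiber_stretch(F, a0))          # stretch along the fiber direction
```

The hydrostatic pressure p is an unknown of the mixed formulation in the actual model; here it is simply an input argument.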

#### 2.1.2. The 1D Model for Action Potential Propagation

The electrical activity of skeletal muscles resulting from the local activity of all muscle fibers can be analyzed by measuring the extracellular potential. The bidomain model is a framework widely used in continuum mechanics to simulate the electrical activity of living tissues (Pullan et al., 2005). It is based on the assumption that the intracellular and extracellular spaces homogeneously occupy the same domain. The intracellular and extracellular spaces are electrically coupled by an electrical current I<sub>m</sub> flowing across the cell membrane, i.e.,

$$-\operatorname{div} \mathbf{q}_{\mathrm{i}} = \operatorname{div} \mathbf{q}_{\mathrm{e}} = A_{\mathrm{m}} I_{\mathrm{m}},$$

where **q**<sub>i</sub> and **q**<sub>e</sub> denote the current density in the intracellular and extracellular space, respectively, and A<sub>m</sub> is the fiber's surface-to-volume ratio. The muscle fiber membrane is nearly impermeable for ions and serves as a capacitor. However, ions can be transported through the membrane by ion channels and active ion pumps. This process can be mathematically described by the biophysically motivated modeling approach proposed by Hodgkin and Huxley (1952), which leads to the constitutive equation

$$I_{\mathrm{m}} = C_{\mathrm{m}} \frac{\partial V_{\mathrm{m}}}{\partial t} + I_{\mathrm{ion}}(\mathbf{y}, V_{\mathrm{m}}, I_{\mathrm{stim}}), \tag{3}$$

where V<sub>m</sub> is the transmembrane potential, C<sub>m</sub> is the capacitance of the muscle fiber membrane (sarcolemma) and I<sub>ion</sub> is the transmembrane-potential-dependent ionic current flowing through the ion channels and pumps. Further state variables are summarized in **y**, e.g., the states of different ion channels. I<sub>stim</sub> is an externally applied stimulation current, e.g., due to a stimulus from the nervous system. Assuming that the intracellular space and extracellular space show the same anisotropy, which is the case for 1D problems, the bidomain equations can be reduced to the monodomain equation. We thus use the one-dimensional monodomain equation in the domain Γ<sub>t</sub> ⊂ ℝ:

$$\frac{\partial V_{\mathrm{m}}}{\partial t} = \frac{1}{A_{\mathrm{m}} C_{\mathrm{m}}} \left( \frac{\partial}{\partial x} \left( \sigma_{\mathrm{eff}} \frac{\partial V_{\mathrm{m}}}{\partial x} \right) - A_{\mathrm{m}} I_{\mathrm{ion}}(\mathbf{y}, V_{\mathrm{m}}, I_{\mathrm{stim}}) \right). \tag{4}$$

Here, x denotes the spatial coordinate along a one-dimensional line, i.e., the fiber, and σ<sub>eff</sub> is the effective conductivity.

#### 2.1.3. The 0D Sub-cellular Muscle Model

The model proposed by Shorten et al. (2007) provides a basis to compute the lumped activation parameter γ, which is the link to the 3D continuum-mechanical muscle model. Its evolution is steered by the ionic current I<sub>ion</sub> of the 1D model. In more detail, the 0D sub-cellular muscle model contains a detailed biophysical description of the sub-cellular excitation-contraction coupling pathway. Specifically, it models the depolarization of the membrane potential in response to stimulation, the release of calcium from the sarcoplasmic reticulum (SR), which serves as a second messenger, and cross-bridge (XB) cycling. To do so, the Shorten model couples three sub-cellular models: A Hodgkin-Huxley-type model is utilized to simulate the electrical potentials and ion currents through the muscle-fiber membrane and the membrane of the T-tubule system. For calcium dynamics, a model of the SR membrane ryanodine receptor (RyR) channels (Ríos et al., 1993) is coupled to the electrical potential across the T-tubule membrane and models the release of calcium from the SR. Additionally, the calcium-dynamics model describes diffusion of calcium in the muscle cell, active calcium transport through the SR membrane via the SERCA pump (sarco/endoplasmic reticulum Ca<sup>2+</sup>-ATPase), binding of calcium to buffer molecules (e.g., parvalbumin or ATP), and binding of calcium to troponin enabling the formation of cross-bridges. The active force generation is simulated by solving a simplified Huxley-type model (Razumova et al., 1999), which is the basis for calculating the activation parameter γ.

All incorporated sub-cellular processes are modeled with a set of coupled ordinary differential equations (ODEs)

$$\frac{\partial \mathbf{y}}{\partial t} = G_{\mathbf{y}}(\mathbf{y}, V_{\mathrm{m}}, I_{\mathrm{stim}}), \tag{5}$$

where G<sub>**y**</sub> summarizes the right-hand side of all the ODEs associated with the state variables **y**, which number more than 50 in the case of the Shorten et al. model.

The final activation parameter γ is computed from the state variable vector **y** and the length and contraction velocity of the half-sarcomere, l<sub>hs</sub> and l̇<sub>hs</sub>. For isometric or very slow contractions, the contraction velocity can be neglected. Hence, following Razumova et al. (1999) and Heidlauf and Röhrle (2014), the activation parameter is calculated as

$$\gamma(\mathbf{y}, l_{\mathrm{hs}}) = f_{\text{f-l}}(l_{\mathrm{hs}}) \, \frac{A_2 - A_2^{\min}}{A_2^{\max} - A_2^{\min}}. \tag{6}$$

Here, the function f<sub>f-l</sub>(l<sub>hs</sub>) is the force-length relation for a cat skeletal muscle by Rassier et al. (1999), A<sub>2</sub> ∈ **y** is the concentration of post power-stroke cross-bridges, A<sub>2</sub><sup>max</sup> is the concentration of post power-stroke cross-bridges for a tetanic contraction (100 Hz stimulation after 500 ms stimulation), and A<sub>2</sub><sup>min</sup> is an offset parameter denoting the concentration of post power-stroke cross-bridges in the resting state.
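The normalization in Equation (6) can be sketched in a few lines. The parabolic force-length function below is a hypothetical placeholder (including its parameters `l_opt` and `width` and the normalized length units); it is not the relation of Rassier et al. (1999) used in the model.

```python
def force_length(l_hs, l_opt=1.0, width=0.5):
    """Hypothetical parabolic force-length relation, clipped at zero.

    Returns 1 at the (assumed) optimal half-sarcomere length l_opt and
    0 once |l_hs - l_opt| reaches width.
    """
    return max(0.0, 1.0 - ((l_hs - l_opt) / width) ** 2)

def activation(A2, l_hs, A2_min, A2_max):
    """Equation (6): force-length factor times the normalized A2 concentration."""
    return force_length(l_hs) * (A2 - A2_min) / (A2_max - A2_min)

# Illustrative values only: halfway between rest and tetanus, at optimal length
gamma = activation(A2=0.5, l_hs=1.0, A2_min=0.1, A2_max=0.9)
```

With this normalization, γ vanishes in the resting state (A<sub>2</sub> = A<sub>2</sub><sup>min</sup>) and is bounded by the force-length factor for a tetanic contraction.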

#### 2.1.4. Summary of the Full Model

In summary, the chemo-electromechanical behavior of a skeletal muscle is described by the following coupled equations:

$$\mathbf{0} = \operatorname{div} \mathbf{P}(\mathbf{F}, \mathcal{M}, \gamma(\mathbf{y}, l_{\mathrm{hs}})) \qquad \text{in } \Omega_t \text{ for all } t, \tag{7a}$$

$$\frac{\partial V_{\mathrm{m}}}{\partial t} = \frac{1}{A_{\mathrm{m}} C_{\mathrm{m}}} \left( \frac{\partial}{\partial x} \left( \sigma_{\mathrm{eff}} \frac{\partial V_{\mathrm{m}}}{\partial x} \right) - A_{\mathrm{m}} I_{\mathrm{ion}}(\mathbf{y}, V_{\mathrm{m}}, I_{\mathrm{stim}}) \right) \qquad \text{on all fibers } \Gamma_t, \tag{7b}$$

$$\frac{\partial \mathbf{y}}{\partial t} = G_{\mathbf{y}}(\mathbf{y}, V_{\mathrm{m}}, I_{\mathrm{stim}}) \qquad \text{at all sarcomere positions.} \tag{7c}$$

Realistic material parameters and muscle fiber directions, as well as appropriate boundary and initial conditions, need to be chosen (cf. section 3.1 for a particular example). Examples are Dirichlet boundary conditions for the three-dimensional, continuum-mechanical model, which describe the displacement of a tendon and, hence, of the skeletal muscle tissue as a result of motion, and the stimulus train I<sub>stim</sub>(t) for all fibers.

#### 2.2. Numerical Methods

To enable multi-scale skeletal muscle models, such as the ones described in section 2.1, to run efficiently and scalably on (large-scale) clusters, we first present the numerical methods as implemented in Heidlauf and Röhrle (2013) (section 2.2.1), followed by algorithmic optimizations aiming to achieve efficient and scalable code (section 2.2.2). To distinguish between the implementation of Heidlauf and Röhrle (2013) and the new optimized implementation, we denote the former as the baseline implementation.

#### 2.2.1. Discretization and Solvers

#### **2.2.1.1. Spatial discretization**

The sub-models of the multi-scale skeletal muscle model have significantly different characteristic time and length scales. To solve the overall model efficiently, different discretization techniques and resolutions are required for the sub-models. In Heidlauf and Röhrle (2013), as in this work, the continuum-mechanics model is solved via the finite element method using Taylor-Hood elements (i.e., a mixed formulation of tri-quadratic and tri-linear Lagrange basis functions to approximate the displacements and the hydrostatic pressure, respectively). The one-dimensional muscle fibers are represented by embedded, one-dimensional finite element meshes with linear Lagrange basis functions. **Figure 1** (left) shows the embedding of n<sub>y</sub> × n<sub>z</sub> discretised 1D fibers within the 3D muscle domain Ω<sub>0</sub> discretised with e<sub>x</sub> × e<sub>y</sub> × e<sub>z</sub> tri-quadratic finite elements, where e<sub>x</sub>, e<sub>y</sub>, and e<sub>z</sub> are the number of elements in the x, y, and z direction, respectively. Each node of the 1D fiber mesh serves as a sarcomere position where one instance of the sub-cellular model is calculated.

The different discretizations of the coupled multi-physics problem require data to be transferred between the different spatial discretizations. Within our model, the transfer of information from the microscopic scale to the macroscopic scale is realized via the activation parameter γ. The microscopic sarcomere forces γ provided by the monodomain model are projected to the macroscopic three-dimensional continuum-mechanics model (γ → γ̄). This homogenization is performed for all Gauss points in the 3D model by averaging the γ values of all monodomain model nodes nearest to the respective Gauss point. Similarly, the node positions of the one-dimensional computational muscle fibers are updated from the actual displacements **u** of the three-dimensional, continuum-mechanics model by interpolating the node positions via the basis functions of the three-dimensional model. Based on this step, the microscopic half-sarcomere lengths l<sub>hs</sub>(**x**) can be calculated.
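The homogenization step γ → γ̄ can be sketched as follows. This is one possible reading of the description above, under simplifying assumptions: positions are scalar coordinates along a single line, every fiber node is assigned to its nearest Gauss point, and the function name `homogenize` is hypothetical. The brute-force distance matrix is only meant for small illustrative inputs.

```python
import numpy as np

def homogenize(gauss_x, node_x, gamma):
    """Average gamma over the fiber nodes that are closest to each Gauss point."""
    # Index of the nearest Gauss point for every fiber node
    nearest = np.abs(node_x[:, None] - gauss_x[None, :]).argmin(axis=1)
    sums = np.bincount(nearest, weights=gamma, minlength=len(gauss_x))
    counts = np.bincount(nearest, minlength=len(gauss_x))
    # Gauss points without any assigned node get the value 0
    return np.where(counts > 0, sums / np.maximum(counts, 1), 0.0)

gauss_x = np.array([0.25, 0.75])
node_x = np.array([0.0, 0.1, 0.2, 0.8, 0.9])
gamma = np.array([1.0, 2.0, 3.0, 10.0, 20.0])
gamma_bar = homogenize(gauss_x, node_x, gamma)
```

In the three-dimensional setting the same averaging applies with Euclidean distances between node positions and Gauss point positions.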

#### **2.2.1.2. Time discretization**

To compute an approximate solution for Equation (7), the different characteristic time scales of the 3D, 1D and 0D problems can be exploited. The action potential propagates faster than the muscle deformation, and the sub-cellular processes evolve considerably faster than the diffusive action potential propagation. From a computational point of view, it is desirable to have common global time steps. To achieve this, we choose dt<sub>3D</sub>/N = dt<sub>1D</sub> = K · dt<sub>0D</sub> with N, K ∈ ℕ. Then, each discrete time is uniquely defined as t<sub>m,n,k</sub> := m · dt<sub>3D</sub> + n · dt<sub>1D</sub> + k · dt<sub>0D</sub>, with m ∈ ℕ, n = 0, …, N − 1 and k = 0, …, K − 1. Moreover, state values associated with time t<sub>m,n,k</sub> are denoted with the superscript (·)<sup>m,n,k</sup>. Employing different time steps requires a time splitting scheme. The baseline implementation in Heidlauf and Röhrle (2013) uses a first-order accurate Godunov splitting scheme, for which one time step of the three-dimensional equation including all sub-steps for the one-dimensional monodomain equation is given by:

	- a. For k = 0, . . . , K − 1 perform explicit Euler steps for Equation (7c) and the 0D portion of Equation (7b).
	- b. Set V<sub>m</sub><sup>m,n,0</sup> := V<sub>m</sub><sup>m,n,K</sup> and **y**<sup>m,n+1,0</sup> := **y**<sup>m,n,K</sup>.
	- c. Perform one implicit Euler step for the 1D portion of Equation (7b) to compute V<sub>m</sub><sup>m,n+1,0</sup>.

**Figure 1** (right) schematically depicts this algorithm.
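The sub-steps a. to c. can be illustrated with a minimal sketch of Godunov splitting on a single fiber. Several simplifications are assumptions made for this illustration and are not the baseline implementation: a finite-difference Laplacian with no-flux ends replaces the linear finite elements, the stimulus is omitted, `diff` stands for σ<sub>eff</sub>/(A<sub>m</sub>C<sub>m</sub>), and `toy_G` and `toy_I` are hypothetical linear kinetics instead of the Shorten model.

```python
import numpy as np

def laplacian_1d(n, dx):
    """Second-difference matrix with no-flux ends (dense, for small n only)."""
    L = np.zeros((n, n))
    i = np.arange(n - 1)
    L[i, i] -= 1.0
    L[i + 1, i + 1] -= 1.0
    L[i, i + 1] += 1.0
    L[i + 1, i] += 1.0
    return L / dx**2

def godunov_step(V, y, dt_1d, K, diff, L, G_y, I_ion, C_m=1.0):
    """K explicit Euler 0D sub-steps, then one implicit Euler diffusion step."""
    dt_0d = dt_1d / K                      # dt_1D = K * dt_0D
    for _ in range(K):
        dV = -I_ion(y, V) / C_m            # 0D portion of Equation (7b)
        dy = G_y(y, V)                     # Equation (7c)
        V, y = V + dt_0d * dV, y + dt_0d * dy
    A = np.eye(len(V)) - dt_1d * diff * L  # implicit Euler for the 1D portion
    return np.linalg.solve(A, V), y

def toy_G(y, V):                           # hypothetical kinetics
    return 0.1 * (V - y)

def toy_I(y, V):                           # hypothetical ionic current
    return 0.5 * V + y

n, N = 50, 10
L = laplacian_1d(n, dx=0.1)
V = np.zeros(n)
V[n // 2] = 1.0                            # initial depolarization at the fiber center
y = np.zeros(n)
for _ in range(N):                         # N steps of length dt_1D span one dt_3D
    V, y = godunov_step(V, y, dt_1d=0.01, K=5, diff=1.0, L=L, G_y=toy_G, I_ion=toy_I)
```

After the N sub-steps, the full scheme would evaluate γ, homogenize it, and solve the 3D mechanics problem (7a); those parts are omitted here.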

#### **2.2.1.3. Linear solvers**

The coupled time stepping algorithm described above contains two large systems of equations that need to be solved. The first one results from the 3D elasticity problem (7a) and the second one stems from an implicit time integration of the linear 1D diffusion problem of the fiber (7b). In Heidlauf and Röhrle (2013), the linear systems are obtained by applying Newton's method to the 3D and 1D problems and are solved using GMRES (Saad and Schultz, 1986) as implemented within the PETSc library (Balay et al., 1997, 2015).

FIGURE 1 | (Left) Schematic view of a 3D muscle domain that contains a given number of n<sub>x</sub> × n<sub>y</sub> muscle fibers per 3D partition, e<sub>x</sub> × e<sub>y</sub> × e<sub>z</sub> finite elements for the 3D model (7a), and s<sub>x</sub> nodes per fiber for (7b) and (7c). (Right) Schematic view of the multi-scale time stepping scheme based on a Godunov splitting of the monodomain equation.

#### 2.2.2. Algorithmic Optimizations

While section 2.2.1 describes the implementation as in Heidlauf and Röhrle (2013), in the following paragraphs we propose some algorithmic optimizations to improve numerical efficiency.

#### **2.2.2.1. Spatial discretization**

We optimize the interpolation and homogenization routines, and leave the spatial discretization as described in section 2.2.1 unchanged in this work: interpolation and homogenization steps involve the transfer of information between values at Gauss points of the 3D elements to nodes of the 1D fibers. To allow for a general domain decomposition later on, a mapping between the respective 3D and 1D finite elements is necessary. In Heidlauf and Röhrle (2013), the homogenization was achieved using a naive search over all locally stored fibers. This search was performed for each 3D element. We replace this approach, which exhibits quadratic complexity (in terms of the number of involved elements), with a calculation of linear complexity. This is achieved by calculating – in constant time – the indices of the 1D elements that are located inside a 3D element.
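The constant-time index calculation can be sketched for a structured setting. The sketch assumes, for illustration only, that every 3D element contains the same number of fibers in y and z (`f_y`, `f_z`) and the same number of 1D elements along the fiber direction x (`m_x`); the function names are hypothetical and do not correspond to OpenCMISS routines.

```python
def elements_1d_in_3d(ix, iy, iz, m_x, f_y, f_z):
    """Index ranges of the fibers and 1D elements inside 3D element (ix, iy, iz).

    Only integer arithmetic is needed, so the cost per 3D element is constant
    and no search over the locally stored fibers is required.
    """
    fibers_y = range(iy * f_y, (iy + 1) * f_y)
    fibers_z = range(iz * f_z, (iz + 1) * f_z)
    elems_x = range(ix * m_x, (ix + 1) * m_x)
    return fibers_y, fibers_z, elems_x

def element_3d_of(fiber_y, fiber_z, elem_x, m_x, f_y, f_z):
    """Inverse map: the 3D element that contains a given 1D element."""
    return elem_x // m_x, fiber_y // f_y, fiber_z // f_z

# Example: 3D element (1, 0, 1) with 3 1D elements per 3D element and 2 x 2 fibers
fy, fz, ex = elements_1d_in_3d(1, 0, 1, m_x=3, f_y=2, f_z=2)
```

Looping over all 3D elements with this map visits every 1D element exactly once, which gives the linear overall complexity stated above.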

#### **2.2.2.2. Second-order time stepping**

To reduce computational cost, we replace the first-order Godunov splitting with a second-order Strang splitting as proposed by, e.g., Qu and Garfinkel (1999). A higher order means that we advance from an O(dt) approach to an O(dt<sup>2</sup>) approach for a given step length dt in time. Second-order time-stepping schemes reduce the discretization error much faster with a decreasing time step size dt and thus, the required accuracy might be achieved using larger time steps. Along with the change of the splitting approach, we replace the explicit Euler method for Equation (7c) and the 0D portion of Equation (7b) with the method of Heun and employ an implicit Crank-Nicolson method for the diffusion part of Equation (7b). In contrast to the simpler Godunov splitting, Strang splitting uses three sub-steps per time step: a first step with length dt<sub>1D</sub>/2 for the 0D part, a second step with length dt<sub>1D</sub> for the diffusion, and a third step with length dt<sub>1D</sub>/2 again for the 0D part. The modified algorithm at time t<sub>m,0,0</sub> is given by:

1. For n = 0, . . . , N − 1 do


The explicit Heun step in 1.a. and 1.d. (see above) is given by:

$$\begin{bmatrix} \mathbf{y} \\ V_{\mathrm{m}} \end{bmatrix}^{*} = \begin{bmatrix} \mathbf{y} \\ V_{\mathrm{m}} \end{bmatrix}^{m,n,k} + dt_{\mathrm{0D}} \begin{bmatrix} G_{\mathbf{y}}(\mathbf{y}^{m,n,k}, V_{\mathrm{m}}^{m,n,k}, I_{\mathrm{stim}}) \\ -\frac{1}{C_{\mathrm{m}}} I_{\mathrm{ion}}(\mathbf{y}^{m,n,k}, V_{\mathrm{m}}^{m,n,k}, I_{\mathrm{stim}}) \end{bmatrix}, \tag{8a}$$

$$\begin{bmatrix} \mathbf{y} \\ V_{\mathrm{m}} \end{bmatrix}^{m,n,k+1} = \begin{bmatrix} \mathbf{y} \\ V_{\mathrm{m}} \end{bmatrix}^{m,n,k} + \frac{dt_{\mathrm{0D}}}{2} \begin{bmatrix} G_{\mathbf{y}}(\mathbf{y}^{m,n,k}, V_{\mathrm{m}}^{m,n,k}, I_{\mathrm{stim}}) + G_{\mathbf{y}}(\mathbf{y}^{*}, V_{\mathrm{m}}^{*}, I_{\mathrm{stim}}) \\ -\frac{1}{C_{\mathrm{m}}} \left( I_{\mathrm{ion}}(\mathbf{y}^{m,n,k}, V_{\mathrm{m}}^{m,n,k}, I_{\mathrm{stim}}) + I_{\mathrm{ion}}(\mathbf{y}^{*}, V_{\mathrm{m}}^{*}, I_{\mathrm{stim}}) \right) \end{bmatrix}. \tag{8b}$$

In 1.b., we solve the system resulting from the Crank-Nicolson time discretization of the diffusion part in Equation (7b):

$$V_{\mathrm{m}}^{m,n+1,0} = V_{\mathrm{m}}^{m,n,0} + \frac{dt_{\mathrm{1D}}}{2 A_{\mathrm{m}} C_{\mathrm{m}}} \left( \frac{\partial}{\partial x} \left( \sigma_{\mathrm{eff}} \frac{\partial V_{\mathrm{m}}^{m,n,0}}{\partial x} \right) + \frac{\partial}{\partial x} \left( \sigma_{\mathrm{eff}} \frac{\partial V_{\mathrm{m}}^{m,n+1,0}}{\partial x} \right) \right). \tag{9}$$
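The three sub-steps of the Strang splitting (half 0D step, full diffusion step, half 0D step) can be sketched as follows. The sketch is written under stated assumptions and is not the OpenCMISS implementation: the diffusion operator is passed as a small dense matrix `L`, `diff` stands for σ<sub>eff</sub>/(A<sub>m</sub>C<sub>m</sub>), `K_half` is a hypothetical name for the number of Heun sub-steps per half step, and `toy_rhs` is a hypothetical linear right-hand side with C<sub>m</sub> = 1 and no stimulus.

```python
import numpy as np

def heun(u, dt, rhs):
    """Heun's method: explicit Euler predictor, trapezoidal corrector."""
    k1 = rhs(u)
    u_star = u + dt * k1                       # predictor
    return u + 0.5 * dt * (k1 + rhs(u_star))   # corrector

def crank_nicolson(V, dt, diff, L):
    """Crank-Nicolson step, cf. Equation (9), for a discrete diffusion operator L."""
    I = np.eye(len(V))
    return np.linalg.solve(I - 0.5 * dt * diff * L, (I + 0.5 * dt * diff * L) @ V)

def strang_step(V, y, dt_1d, K_half, diff, L, rhs_0d):
    """Half 0D step, full diffusion step, half 0D step."""
    dt_0d = 0.5 * dt_1d / K_half
    u = np.stack([V, y])                       # row 0: V_m, row 1: one state variable
    for _ in range(K_half):
        u = heun(u, dt_0d, rhs_0d)
    u[0] = crank_nicolson(u[0], dt_1d, diff, L)
    for _ in range(K_half):
        u = heun(u, dt_0d, rhs_0d)
    return u[0], u[1]

def toy_rhs(u):
    """Hypothetical 0D kinetics: returns [dV/dt, dy/dt]."""
    V, y = u
    return np.stack([-(0.5 * V + y), 0.1 * (V - y)])

# Three-node fiber with no-flux ends (illustrative values only)
L = np.array([[-1.0, 1.0, 0.0], [1.0, -2.0, 1.0], [0.0, 1.0, -1.0]])
V, y = strang_step(np.array([1.0, 0.0, 0.0]), np.zeros(3), 0.1, 2, 1.0, L, toy_rhs)
```

Halving the step size of `heun` on a smooth test problem reduces the error by roughly a factor of four, which is the second-order behavior motivating the change of scheme.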

#### **2.2.2.3. Optimal complexity linear solver**

The GMRES solver is a robust choice for general sparse systems of linear equations but it does not exploit the symmetry, positive definiteness and tri-diagonal structure of the 1D diffusion system. For symmetric positive definite matrices the conjugate gradient (CG) solver (Hestenes and Stiefel, 1952) is an appropriate iterative solver. For tri-diagonal matrices one could even employ the simple Thomas algorithm (Thomas, 1949). To maintain flexibility, we currently replace the GMRES solver with a direct solver from the MUMPS library (Amestoy et al., 2001, 2006) that exploits the structure and exhibits optimal complexity for tridiagonal systems.
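For illustration, the Thomas algorithm mentioned above can be sketched in a few lines of Python. This is the textbook variant without pivoting, which is adequate for diagonally dominant or symmetric positive definite tridiagonal matrices such as those arising from the 1D diffusion step. It is not the MUMPS solver used in this work, and the function name is hypothetical.

```python
import numpy as np

def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system in O(n) operations.

    lower[i-1] = A[i, i-1], diag[i] = A[i, i], upper[i] = A[i, i+1]; requires n >= 2.
    """
    n = len(diag)
    c_prime = np.zeros(n - 1)
    d_prime = np.zeros(n)
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):                      # forward elimination
        denom = diag[i] - lower[i - 1] * c_prime[i - 1]
        if i < n - 1:
            c_prime[i] = upper[i] / denom
        d_prime[i] = (rhs[i] - lower[i - 1] * d_prime[i - 1]) / denom
    x = np.zeros(n)
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):             # back substitution
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x
```

Both sweeps touch each unknown once, so the work grows linearly with the number of 1D nodes.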

#### 2.3. Domain Partitioning and Parallelization

For parallelization, the computational domains must be partitioned appropriately. This is particularly challenging for multi-scale problems, as considered in this work, as the parallelization induces communication due to dependencies of local data on data in neighboring partitions. To motivate the discussion below, we briefly outline the main challenges in the scope of this work:


process is completely local as all 0D points are contained within the respective 3D element and reside on the same process.

#### 2.3.1. Pillar-Like Domain Decomposition

In the baseline implementation by Heidlauf and Röhrle (2013), the domain decomposition for parallel execution was hard-coded for only four processes, following a partitioning ensuring that entire fibers remain within the same partition at all times, which is anatomically motivated. Since all skeletal muscle fibers are, from an electrical point of view, independent of each other, this is also computationally attractive as no quantities in the 0D and 1D sub-models need to be exchanged between fibers. We extend the approach to an arbitrary number of processes, and keep the structure of partitioning the 3D and 1D meshes in the same way, such that quantities in the 3D, 1D and 0D models corresponding to the same spatial location are stored on the same process. This avoids unnecessary inter-process volume-communication between the sub-models.

#### 2.3.2. New Spatial Domain Decomposition

In addition to the extension of the pillar-like domain partitioning, we investigate a second approach with nearly cube-shaped partitions, cf. **Figure 2**. In contrast to partitioning strategies based on space-filling curves such as Schamberger and Wierum (2005), graph partitioning such as Miller et al. (1993) and Zhou et al. (2010), or problem-specific approaches such as the pillar-shaped partitioning, a cuboid partition has the advantage that the interaction of one cuboid partition with others is guaranteed to be planar and bounded by the maximum number of neighboring partitions, i. e., 3<sup>3</sup> − 1 = 26. This allows communication with reduced complexity and cost.

However, we cannot completely avoid obtaining sub-domains at the boundary of the computational domain that have fewer elements than other domains. Given a fixed number of available cores, we thus maximize the number of employed processes by adapting the number of sub-divisions in each axis direction corresponding to a factorization of the total number of processes. By carefully choosing the factorization, we reduce the impact of sub-optimal load-balancing in these 'nearly cuboid' partitioning cases. By introducing the additional constraint that each generated partition has to be larger than a specified "atomic" cuboid of elements, we can easily ensure that each process contains only entire fibers (pillar-like partition), a fixed number of fiber subdivisions (cube-like partition), or anything in between.
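The choice of a factorization can be illustrated with a small Python sketch. It is a simplified stand-in for the procedure described above, not the implementation used in this work. It enumerates all factorizations of the process count into three axis counts and selects the one with the smallest total interface area, measured in element faces on the cut planes. The function names and the `atomic` argument (minimum partition extent in elements per axis) are hypothetical.

```python
def factorizations(p):
    """Yield all ordered triples (px, py, pz) with px * py * pz == p."""
    for px in range(1, p + 1):
        if p % px:
            continue
        q = p // px
        for py in range(1, q + 1):
            if q % py == 0:
                yield px, py, q // py

def best_partitioning(n_elems, p, atomic=(1, 1, 1)):
    """Return (interface_area, (px, py, pz)) minimizing the cut-plane area.

    n_elems: number of 3D elements per axis; atomic: smallest allowed
    partition extent per axis (assumed constraint, in elements).
    """
    nx, ny, nz = n_elems
    max_parts = tuple(n // a for n, a in zip(n_elems, atomic))
    best = None
    for px, py, pz in factorizations(p):
        if px > max_parts[0] or py > max_parts[1] or pz > max_parts[2]:
            continue  # partitions would be smaller than the atomic cuboid
        # Each of the (px - 1) cut planes normal to x has ny * nz faces, etc.
        area = (px - 1) * ny * nz + (py - 1) * nx * nz + (pz - 1) * nx * ny
        if best is None or area < best[0]:
            best = (area, (px, py, pz))
    return best
```

Setting `atomic` to the full domain extent in the fiber direction, e.g., `atomic=(12, 1, 1)` for a domain of 12 × 12 × 12 elements, forces a pillar-like result, whereas the default admits cube-like partitions.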

In summary, based on the communication dependencies 1 and 2 as described at the beginning of this section, we enhance the original pillar-like domain partitioning in two ways: (i) we allow for an arbitrary number of processes instead of a fixed number of four processes and (ii) we introduce a new partitioning concept with nearly cuboid partitions that minimize the partitioning's surface area.

Note that when considering the simulation of realistic muscle geometries that cannot be discretized using rectangular elements, e.g., using unstructured meshes, a domain decomposition into pillar-like or nearly cuboid partitions is generally no longer feasible. The same is true for a skeletal muscle with complex muscle fiber distributions. In such a case, one cannot ensure that fibers are always contained within a single partition when using a pillar-like domain decomposition. However, the strategy to aim for minimal surface domains is always possible as it inherently involves cutting fibers at process boundaries.

Within this work, we assume that it is possible to create nearly optimal cube-shaped partitions.

### 2.4. Visualization of Muscle Simulations

Performing large-scale simulations is only the first step to gain an improved insight into the musculoskeletal system. Visual analysis and interactive exploration of the simulation data give the opportunity to investigate every facet of large and complex systems. General-purpose visualization tools like ParaView (Ahrens et al., 2005) or VisIt (Childs et al., 2012a) can only provide a first glimpse of such data sets. However, for the above-mentioned in-depth analysis, a tailored visualization tool is necessary. The standard visualization framework within the OpenCMISS software project is OpenCMISS-Zinc. This framework already offers a range of visualization techniques for muscle fiber data, for example, a convex hull calculation to construct a mesh geometry from point cloud data. However, OpenCMISS-Zinc lacks important features that are required to develop efficient visualizations intended to run on HPC systems. These missing features are, for example, a suitable platform for fast visualization prototyping, distributed rendering, or CPU-based visualization. The open-source visualization framework MegaMol (Grottel et al., 2015) fulfills these criteria and offers additional functionality and features that are valuable for this project. Therefore, we use MegaMol as the basis for improved musculoskeletal visualizations. For example, one additional feature is the infrastructure for brushing and linking that allows for developing interactive visual analytics applications. MegaMol also offers a built-in headless mode and a remote control interface, which is crucial for HPC-based in-situ rendering.

In-situ visualization is an alternative approach to traditional post-hoc data processing. The key idea is to process and visualize data on the HPC system while the simulation is running. Consequently, writing raw data to disk can be avoided completely. Since our new visualization tool is intended to cope with the visual analysis of large-scale muscle simulations, we require an architecture that allows us to employ this approach in the future. Childs et al. (2012b) identify three different approaches that are considered in-situ visualization. The first one is known as co-processing, where the visualization tool runs simultaneously with the simulation and accesses the simulation's memory for further processing and visualization. In the second approach, the visualization runs on separate nodes and communicates data via a network. This method is known as concurrent processing. The last possibility, the hybrid technique, directly accesses the simulation's memory and reduces the data to lower the network load while sending the data to visualization nodes. We are planning to add the first two methods (co-processing and concurrent processing) to our implementation. However, we cannot completely disregard the hybrid technique as we might first need to identify the workload of each node and the network traffic of a running large-scale simulation with in-situ visualization.

Interactive visualization typically uses graphics APIs like OpenGL to employ the GPU for rendering. GPU-accelerated rendering uses polygon rasterization, i. e., large numbers of triangles can be processed and rendered in parallel. All geometric objects that are rendered thus have to be represented by triangle meshes. This visualization approach is, for example, also used by OpenCMISS-Zinc. An alternative rendering approach to GPU-accelerated rasterization is ray tracing. Here, one or more view rays are computed for each pixel. Each ray is tested for intersection with the objects in the scene in order to find out which objects are visible at this pixel. Note that this approach can not only render triangles but also all objects that have a mathematical representation that can be used for computing the ray-object intersection (e. g., spheres or cylinders). Ray tracing is usually computed on the CPU and was traditionally only used for high-quality offline rendering due to its higher computational complexity. The combination of modern hardware and improved algorithms, however, enables interactive ray tracing, even on single desktop workstations.
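The ray-object intersection test at the core of ray tracing can be illustrated for a sphere. The following Python sketch solves the quadratic equation obtained by inserting the ray origin + t · direction into the implicit sphere equation. It is a textbook formulation for illustration only and makes no claim about how any particular ray tracing engine implements the test. The function name is hypothetical.

```python
import math

def ray_sphere_intersection(origin, direction, center, radius):
    """Return the smallest ray parameter t >= 0 at which the ray hits the sphere,
    or None if the ray misses it. The direction need not be normalized."""
    oc = [o - c for o, c in zip(origin, center)]
    a = sum(d * d for d in direction)
    b = 2.0 * sum(d * o for d, o in zip(direction, oc))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None                       # no real root: the ray misses the sphere
    sqrt_disc = math.sqrt(disc)
    for t in ((-b - sqrt_disc) / (2.0 * a), (-b + sqrt_disc) / (2.0 * a)):
        if t >= 0.0:                      # nearest hit in front of the ray origin
            return t
    return None                           # sphere lies entirely behind the origin
```

Because only a center and a radius are stored per object, no triangle mesh is needed, which is one reason why sphere and cylinder primitives suit large particle and fiber data sets.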

MegaMol offers GPU rendering (rasterization) and CPU ray tracing via a thin abstraction layer. The GPU rendering uses the OpenGL API, whereas the CPU rendering is based on the ray tracing engine OSPRay (Wald et al., 2017). In particular, the CPU-based ray tracing enables image synthesis on any computer, regardless of the availability of dedicated GPUs. This is especially important for HPC clusters, which are typically not equipped with GPUs: Currently, only two of the top-ten HPC systems in the Top500 list are GPU systems. Since ray tracing simulates the transport of light, it offers advanced rendering and shading methods (e. g., global illumination and ambient occlusion) that enhance the perception of depth. MegaMol is currently not optimized for HPC usage. However, it provides the necessary basic infrastructure for enabling distributed rendering on an HPC system. Furthermore, MegaMol is already capable of rendering discretized muscle fibers as continuous geometry. The visual quality and scalability obtained by MegaMol using integrated OSPRay ray tracing are discussed in section 3.4.

### 3. RESULTS

Before simulating realistic and complex models on HPC systems, it is essential to first analyse numerical complexity, i. e., scalability in terms of the size of the problem both for the baseline methods described in section 2.2.1 and our optimized methods presented in section 2.2.2. To avoid any geometrical effects stemming from realistic geometries, we perform the analysis on a test example introduced in section 3.1. As the old parallel code used only 4 cores, we restrict our analysis of the parallel scalability in section 3.3 to the proposed new parallelization strategies.

#### 3.1. Test Scenario

As a test scenario, we use a generic cubic muscle geometry (1 × 1 × 1 cm). The muscle fibers are aligned in parallel to one cube-edge (the x-direction). The discretization in space and time is carried out as described in sections 2.2.1 and 2.2.2. The discretization parameters will be specified for the respective experiments. For the material parameters for the continuum-mechanics model, the effective conductivity σeff, the surface-to-volume ratio Am, and the membrane capacity Cm, we use exactly the same values as reported in Heidlauf and Röhrle (2014).

To constrain the muscle, Dirichlet boundary conditions (zero displacement) are used to fix the following faces of the muscle cube: the left and the right faces (faces normal to the x-direction), the front face (face normal to the y-direction) and the bottom face (face normal to the z-direction). Further, no current flows over the boundary of the computational muscle fibers, i. e., zero Neumann boundary conditions are assumed at both muscle fiber ends. As far as the skeletal muscle recruitment is concerned, we consider an isometric single-twitch experiment by stimulating all fibers at their mid-points for t ∈ [0, 0.1 ms] with Istim(t) = 1200 µA/cm<sup>2</sup>. For all other t, Istim(t) is assumed to be 0.

### 3.2. Numerical Investigations

In the following, we present numerical experiments demonstrating, in particular, the increase in efficiency with the new second-order time discretization method. All runtimes are measured in serial, on an Intel® Core™ i5-4590 CPU (3.3 GHz, 32 GB RAM) for Secs. 3.2.1 and 3.2.2, and an Intel® Xeon™ E7-8880 v3 CPU (2.3 GHz, 504 GB RAM) for Secs. 3.2.3 and 3.2.4, using the OpenCMISS implementation.

#### 3.2.1. Time Discretization for the Sub-cellular Model

In a first step, we verify the convergence order of Heun's method experimentally. To this end, we restrict ourselves to the reaction term, i. e., step 1.a of the Godunov algorithm, but use Heun's method for Equation (7c) and the 0D portion of Equation (7b). The diffusion term is thus completely neglected. We use the test setup as presented in section 3.1. To compare the accuracy of Heun's method with an explicit Euler method, we compare the values of V<sub>m</sub> and Iion at a stimulated material point on a muscle fiber while varying the time step size dt0D. As a reference solution, we use the solution calculated with Heun's method for a very high resolution (K := dt1D/dt0D = 4096). We restrict ourselves to the time interval [0, dt1D], with dt1D = 0.5 µs. To compare the methods in terms of efficiency, we measure the related compute times. **Figure 3A** depicts the relative error depending on the number K of 0D time steps, while **Figure 3B** compares the CPU times necessary to reach a certain accuracy for the different solvers.

**Figure 3A** shows the expected first-order convergence for the explicit Euler method and second-order convergence for Heun's method. From an application point of view, however, efficient computation ("Which accuracy can be achieved in which runtime?") is more important than the order of convergence. Therefore, in order to reveal the potential of Heun's method in decreasing the runtime for a given required accuracy, we take into account the different computation time per step of the methods. **Figure 3B** shows that two Heun steps with dt0D = 2.5µs replace 50 forward Euler steps yielding a theoretical speedup of 12.5 for the 0D-solver. At the same time, the error decreases by a factor of approximately 3. All times are normalized with respect to the CPU-time of a single step of the Euler method (K = 1).
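The difference in convergence order can be reproduced on any smooth test problem. The following Python sketch uses a scalar model ODE rather than the sub-cellular model of this work, and estimates the observed order from two runs with n and 2n steps. The helper `rhs_eval_speedup` encodes one possible cost model that is an assumption of this sketch: the cost is taken to be proportional to the number of right-hand-side evaluations, i.e., one Heun step counts as two Euler steps. Under this assumption, replacing 50 Euler steps by two Heun steps gives 50/(2 · 2) = 12.5. All function names are hypothetical.

```python
import math

def integrate(f, y0, t_end, n_steps, method):
    """Integrate y' = f(t, y) on [0, t_end] with n_steps explicit Euler or Heun steps."""
    dt = t_end / n_steps
    t, y = 0.0, y0
    for _ in range(n_steps):
        k1 = f(t, y)
        if method == "euler":
            y = y + dt * k1
        elif method == "heun":
            k2 = f(t + dt, y + dt * k1)       # slope at the Euler predictor
            y = y + 0.5 * dt * (k1 + k2)      # trapezoidal corrector
        else:
            raise ValueError(f"unknown method: {method}")
        t += dt
    return y

def observed_order(f, y0, t_end, exact, n_steps, method):
    """Estimate the convergence order from the errors with n_steps and 2 * n_steps."""
    e_coarse = abs(integrate(f, y0, t_end, n_steps, method) - exact)
    e_fine = abs(integrate(f, y0, t_end, 2 * n_steps, method) - exact)
    return math.log2(e_coarse / e_fine)

def rhs_eval_speedup(euler_steps, heun_steps):
    """Speedup if cost is proportional to right-hand-side evaluations (assumption)."""
    return euler_steps / (2 * heun_steps)
```

For the model problem y' = −y with y(0) = 1, `observed_order` should return values close to 1 for `"euler"` and close to 2 for `"heun"`.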

#### 3.2.2. Time Discretization for the Muscle Fibers

In a second experiment, we verify the convergence order of the Strang splitting scheme, i. e., we couple 0D reaction and 1D diffusion. Again, the same test setup as above is considered except that we use a larger time interval [0, 0.1 ms] and vary the number, N, of 1D time steps. Based on the previous results for the isolated 0D problem, we choose K = 2 for the Strang-splitting scheme and K = 5 for the Godunov-splitting scheme. This ensures a comparable relative error for the 0D sub-problem while saving computational time. The reference solution is computed using a Strang-splitting scheme with dt1D = 0.25µs, yielding Vm(0.1 ms) ≈ −23.5219 mV.

**Figure 4A** shows the relative errors of V<sub>m</sub>(0.1 ms) at a stimulated sub-cell for the Godunov- and Strang-splitting schemes. Relative errors comparable to those of the Godunov scheme with dt1D = 0.5 µs are achieved for the Strang splitting scheme with dt1D = 2 or 4 µs. Qu and Garfinkel (1999) applied the Strang splitting scheme to the monodomain equation in cardiac conduction, using a different reaction term than in this work. However, it is not entirely clear whether second-order convergence is exhibited by their numerical experiments. For an electrocardiogram simulation, Sundnes et al. (2005) used the same scheme on the more general bidomain equation, achieving a nearly second-order scheme. In contrast to these works, our results show a true second-order error dependency. The resulting speedups are depicted in **Figure 4B** by arrows pointing from Godunov to Strang data points. There, the compute times are normalized with respect to the compute time of the Godunov scheme for dt1D = 0.5 µs.

Based on a relative error in V<sub>m</sub> of about 2 · 10<sup>−3</sup>, the improved time stepping scheme achieves a speedup of 7.54, if the accuracy requirement is weakened slightly. If the error constraint is not weakened, we still obtain a speedup of 3.89. Note that, for more restrictive error limits, the speedup achieved with a second-order scheme will be even higher due to the higher convergence order.

#### 3.2.3. Solving the Linear Systems of Equations in the 1D Model

In a further experiment, which solves a 1D diffusion problem, we consider a single fiber inside one 3D element for the time interval t ∈ [0, 3 ms]. The Godunov splitting scheme is employed with time step sizes dt1D = 5 · 10<sup>−3</sup> ms and dt0D = 10<sup>−4</sup> ms, as the experiment is largely independent of the splitting scheme. We compare the restarted GMRES solver (restart after 30 iterations) against the CG solver and a direct solver from the MUMPS library. **Figure 5** shows the expected reduction in the runtime for the CG and direct solvers. Although the direct solver has a higher runtime for a small number of 1D elements, it requires the lowest runtime for finer discretizations and shows a linear complexity with the number of 1D elements.

#### 3.2.4. Runtime Analysis During Serial Execution of the Full Model

In previous sections we considered subproblems of the computational model. In this section we measure the overall effect of the combined improvements. A complete single-twitch scenario as described in section 3.1 is simulated for a time span of [0, 1 ms]. We compare all numerical and algorithmic improvements of this paper against the baseline setting of Heidlauf and Röhrle (2013).

The 3D spatial discretization comprises 8 Taylor-Hood finite elements containing 36 muscle fibers (n<sub>x</sub> = n<sub>y</sub> = 6) in total. For the baseline setting using the Godunov splitting scheme the time steps are set to dt3D = 1 ms, dt1D = 5 · 10<sup>−4</sup> ms and dt0D = 10<sup>−4</sup> ms, i. e., N = 2000 and K = 5. For the Strang splitting scheme the values are dt3D = 1 ms, dt1D = 4 · 10<sup>−3</sup> ms and dt0D = 2 · 10<sup>−3</sup> ms, i. e., N = 250 and K = 2. For the baseline setting the linear system of equations arising from the 1D problem is solved using a restarted GMRES solver with a restart after 30 iterations and relative residual tolerance of 10<sup>−5</sup>. The improved simulation uses the direct solver as described in section 2.2.2.3. To solve the 3D problem, Newton's method from the PETSc library is used with a relative and absolute tolerance of 10<sup>−8</sup> and a backtracking line search approach with a maximum number of 40 iterations.

To assess problem size scalability, we vary the number of 1D elements along each muscle fiber and measure the runtimes of the simulation components. Note that the number of sub-cellular model instances is changed accordingly.

The results depicted in **Figure 6** provide the following insights: (i) The majority of the runtime is spent solving the 0D problem. (ii) The portion of runtime spent solving the 3D problem is negligible. This is due to the low number of 3D finite elements for the mechanics problem. Realistic models would, however, require a finer resolution of the 3D problem. (iii) The runtime for the other computational components increases approximately linearly with the number of fiber elements. This indicates a good scaling behavior with respect to problem size. (iv) The computations of the macroscopic variable lhs from the fiber nodes, the homogenized activation parameter γ̄ (homogenization), as well as lhs (interpolation) have almost no impact on the overall computational time. However, interpolation is more time consuming as it involves simultaneously traversing the fiber and the 3D meshes, whereas homogenization requires only a single averaging operation for each Gauss point of the 3D elements.

#### 3.3. Parallel Scaling Experiments

In the following we conduct parallel scalability experiments to investigate the behavior of the simulation on highly parallel compute clusters. All experiments are conducted on Hazel Hen, the Tier-1 supercomputer at the High Performance Computing Center Stuttgart (HLRS). A dual-socket node of this Cray XC40 contains two Intel® Haswell E5-2680v3 processors with base frequency of 2.5 GHz, maximum turbo frequency of 3.3 GHz, 12 cores each and 2 hyperthreads per core, leading to a total number of 48 possible threads per node. We present the results of a strong scaling experiment (Experiment #1) and weak scaling experiments (Experiments #2 and #3) as well as an investigation of partitioning strategies (Experiment #4).

#### 3.3.1. Strong Scaling Measurements—Experiment #1

Strong scaling investigates the runtime for a fixed problem size with respect to different process counts. **Figure 7** depicts strong scaling results for the specified problem with 13,824 1D elements. Taking the first runtime measurement with 12 processes (T<sub>12</sub>) as reference, the parallel efficiency for a process count p is computed from the runtime T<sub>p</sub> as E<sub>p</sub> = (T<sub>12</sub>/T<sub>p</sub>) · (12/p) and visualized in the bottom plot of **Figure 7**. It can be seen that the 0D model solver shows a good parallel efficiency of more than 80%, whereas the parallel efficiencies for the 3D solver and the 1D solver drop below 50 and 30%, respectively. This matches the fact that the half-sarcomere sub-models (0D) are completely independent of each other whereas the solutions of 3D and 1D problems require communication.
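Parallel efficiency relative to a reference run is the measured speedup divided by the factor by which the process count was increased, so that perfect strong scaling yields a value of 1. A minimal Python sketch, with a hypothetical function name and with the 12-process reference as default, reads:

```python
def parallel_efficiency(t_ref, t_p, p, p_ref=12):
    """Efficiency of a run with p processes and runtime t_p, relative to a
    reference run with p_ref processes and runtime t_ref."""
    speedup = t_ref / t_p            # how much faster the run became
    ideal_speedup = p / p_ref        # how much faster it could ideally become
    return speedup / ideal_speedup
```

For example, halving the runtime while doubling the process count gives an efficiency of 1, whereas an unchanged runtime on twice the processes gives 0.5.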

#### 3.3.2. Weak Scaling Measurements—Experiment #2

For weak scaling, the problem size is increased proportional to the number of processes. Thus, invariants are the number of elements per process and the overall shape of the computational domain. Here, we show weak scaling for both partitioning strategies: partitioning only in y- and z-direction, i.e., pillar-like partitioning, and cuboid partitioning. We start with 24 processes on a single node of Hazel Hen with an initial partition consisting of p<sub>x</sub> × p<sub>y</sub> × p<sub>z</sub> = 1 × 6 × 4 = 24 subdivisions for both pillar-like and cuboid partitioning. Each partition contains e<sub>x</sub> × e<sub>y</sub> × e<sub>z</sub> = 2 × 2 × 2 = 8 3D elements per MPI rank. Further, we ensure that each 3D element contains 2 × 2 fibers in x-direction with three 1D elements per fiber, i.e., 12 1D elements per 3D element. Hence, the initial problem is made up of 24 × 8 = 192 elements and 12 × 8 × 4 = 384 fibers.
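The bookkeeping behind these numbers can be written down as a short Python sketch. It derives the problem size from the number of partitions per axis, assuming, as stated above, 2 × 2 × 2 elements per rank, 2 × 2 fibers per element cross-section and three 1D elements per fiber and 3D element. The function name and the dictionary keys are hypothetical.

```python
def weak_scaling_problem_size(partitions, elems_per_rank=(2, 2, 2),
                              fibers_per_elem=(2, 2), elems_1d_per_elem=3):
    """Derive element and fiber counts from the partition layout (px, py, pz)."""
    px, py, pz = partitions
    ex, ey, ez = elems_per_rank
    nx, ny, nz = px * ex, py * ey, pz * ez          # 3D elements per axis
    # Fibers run in x-direction, so they are counted in the y-z cross-section.
    fibers = (ny * fibers_per_elem[0]) * (nz * fibers_per_elem[1])
    elems_1d_per_fiber = nx * elems_1d_per_elem
    return {
        "ranks": px * py * pz,
        "elems_3d": nx * ny * nz,
        "fibers": fibers,
        "elems_1d_per_fiber": elems_1d_per_fiber,
        "elems_1d": fibers * elems_1d_per_fiber,
    }
```

Calling `weak_scaling_problem_size((1, 6, 4))` reproduces the initial configuration with 24 ranks, 192 3D elements and 384 fibers.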

In the series of measurements for the two partitioning strategies, further subdivisions are defined such that the pillar-like or cuboid partitioning structure is maintained. The refinements are obtained by refining by a factor of 2 first in the x-direction, then in the z-direction, and then in the y-direction before repeating the process. For the cuboid partitioning, we fix the number of 3D elements that each MPI rank contains to be 2 × 2 × 2. For the pillar-like partitioning, the constraint is that each sub-domain spans over all three-dimensional elements in the x-direction, whose number varies with increasing problem size. Therefore, the number of elements per MPI rank in y- and z-direction is halved for each refinement in an alternating way. This way, we double the number of partitions while maintaining the constant number of eight 3D elements per MPI rank. By allocating 24 processes on the 24 cores of each node (no hyperthreading), we scale from 1 to 32 nodes, i. e., from 24 to 768 cores. **Table 1** provides the details on the partitioning and the number of three-dimensional and one-dimensional elements.

(solid lines); projected runtimes for optimal scaling, T<sub>p,opt</sub> = T<sub>12</sub> · 12/p (dashed thin lines). (Bottom) Parallel efficiency E<sub>p</sub> = T<sub>p,opt</sub>/T<sub>p</sub>.

The results in **Figure 8** show that the solver for the 3D model has a slightly higher computational time for the pillar-like partitioning compared to the cuboid partitioning. This is expected as the partition boundaries are larger and induce more communication. For the 1D problem solver, pillars are better as fibers are not subdivided between multiple cores and no communication is needed. The reduced benefit from a cuboid partitioning is due to the fact that the time spent on communication is rather dominant compared to the time needed to solve the rather small problem, e. g., only 3 e<sub>x</sub> = 6 1D elements of a fiber are locally stored in each partition. This should improve as one chooses larger sub-problem sizes, i. e., increases the number of nodes per fiber.

Theoretically, the time needed to solve the 0D problem should not be affected by the domain decomposition. However, due to cache effects, the runtime for a cuboid partitioning is slightly higher. Overall, this leads to a higher total computational time for cuboid partitioning compared to the pillar-like partitioning. This conclusion is, however, only valid for the chosen scenario and for the relatively low number of cores. Note that extending these scaling experiments to larger numbers of cores is currently limited due to memory duplications in the current code. These need to be eliminated first before conducting further scaling studies.

#### 3.3.3. Weak Scaling Measurements – Experiment #3

While the somewhat artificial setting in experiment #2 yields perfect pillar-like or cuboid partitions, experiment #3 addresses a more realistic setup, where we increase the number of processes more smoothly, i.e., by less than a factor of two in each step. With this, it is not possible anymore to choose perfect cuboid or pillar-like partitions. Thus, we identify reasonable parameters by solving an optimization problem that trades the targeted aspect ratio of sub-domain shape against process counts.


Note that the combination of the number of processes and the number of elements leads to partitions at the boundary of the computational domain that potentially have fewer elements than interior partitions. Compared to the previous example, the number of 3D elements per process is here only approximately constant, with the pillar-like partitions getting closer to constant size than the cuboid ones. The numbers of processes and the dimensions of the computational domain are listed in **Table 2**. **Figure 9** presents the runtime results.

As already discussed above, the ODE solver for the 0D problem (yellow line) requires the majority of the runtime. This is followed by the solution times for the 1D (red line) and 3D (green line) sub-problems. The blue lines depict the duration of the interpolation and homogenization between the node positions of the 1D fibers and the 3D mesh. It can be seen that the computational times stay nearly constant for increasing problem size. As in the previous experiment, the 3D solver performs better for cuboid partitioning whereas the 1D solver is faster for pillar-like partitions. In this scenario, the cuboid partitioning slightly outperforms the pillar-like partitioning, as expected.

As before, the memory consumption appears to be a weakness. Therefore, additional tests investigating the memory consumption per process at the end of the runtime were carried out. The memory consumption for the presented scenario is plotted in **Figure 10** with respect to the overall number of 1D elements. Also the average number of ghost layer elements per process is depicted. Ghost layer elements are copies of elements adjacent to the partition of a process, i.e., they belong to the subdomain of a neighboring process. They are used as data buffers for communication. We observe that the average number of ghost elements per process for the 3D problem is higher for pillar-like partitions (dashed black line) than for the cuboid partitions (solid black line). A sharp increase of memory consumption (magenta lines) is observed independent of the partitioning scheme. This is due to duplications of global data on each process, which will be eliminated in future work. Compared to this effect, the difference between the number of ghost elements needed for the two partitioning strategies is negligible.

#### 3.3.4. Dependency Between Runtime and Partition Shape – Experiment #4

In our fourth scaling test, the dependency of the solver of the 3D continuum-mechanical problem on the partitioning strategy is investigated. We analyse how different domain decomposition approaches, in particular approaches other than the previously discussed pillar-like and cuboid partitioning schemes, affect the runtime. A test case with 144 × 12 × 12 three-dimensional elements is considered. The setup, otherwise, is as described in section 3.1. To reduce the contributions of the 0D/1D subproblem and focus on the performance of the 3D components, we include in each 3D element only two 1D fiber elements. The domain is decomposed into a constant number of 144 partitions by axis-aligned cut planes in all possible ways. To distinguish between the different partitioning variants, we compute the average boundary surface area between the partitions for each variant and relate this to runtime. The results are presented in **Figure 11**. The smallest average surface area between the partitions, which corresponds to the first data point in **Figure 11**, is obtained for a partitioning with 144 partitions with 4 × 6 × 6 elements each. The highest average surface area between the partitions, which is the last data point within **Figure 11**, is obtained for 144 partitions with 1 × 12 × 12 elements each. All experiments are run on 12 nodes of Hazel Hen with 12 processes per node. It can be seen that only the time needed to solve the 3D continuum-mechanical problem increases monotonically with respect to the average surface area between the partitions, i. e., depends on the partitions' shape. This is expected. Further, the runtime ratio of the 3D solver between the partitioning with the smallest and largest average surface area is 1 : 4.3.

FIGURE 9 | Weak scaling measurements (experiment #3): Runtimes for different model components. The results for cuboid and pillar-like partitions are depicted by solid and dashed lines, respectively. Different runtime components are encoded in colors, i. e., the total runtime in black, the 0D solver in yellow, the 1D solver in red, the 3D solver in green, the interpolation in light blue and the homogenization in dark blue.

FIGURE 10 | Weak scaling measurements (experiment #3): Total memory consumption per process at the end of the runtime. The total memory consumption is depicted in magenta and the average number of 3D ghost layer elements per process in black. Again, the solid lines represent cuboid partitioning and the dashed lines pillar-like partitioning.

#### 3.4. Visualization Results

In this section, we describe the results obtained using our new ray-tracing-based visualization within the MegaMol system. Our goal is to demonstrate the capabilities of our rendering approach for the interactive visualization of complex, real-world simulation data sets. Therefore, we used data from previous simulations to showcase these visualization capabilities, in particular, data from the Tibialis Anterior simulation performed by Heidlauf and Röhrle (2013). Analyzing and optimizing existing code for HPC infrastructures is best performed with test cases for which the geometry has a minimal influence. Under this consideration, the cuboid muscle test case introduced in section 2.1.4 would have been an obvious choice. However, in contrast to the Tibialis Anterior data, the cuboid muscle test case is too small and simple to demonstrate the full capabilities of our new visualization approach for complex geometries.

Our test data set consists of 3,600 fibers, which are discretized into a total of 144,000 1D elements. The consecutive elements along each fiber are connected via tubes to visualize the fibers. **Figure 12** shows a rendering created by MegaMol (Grottel et al., 2015) using our integration of the CPU ray tracing engine OSPRay (Wald et al., 2017). Color is used to illustrate values of the elements, in this case the local membrane potential. The interactive ray tracing offers very high image quality, including global illumination effects that increase the perception of spatial details. This is especially visible with the shadows between fibers, which help to perceive the distance between them as well as deformations of the individual fibers with respect to their neighbors. That is, our visualization approach not only delivers publication-quality images, which is often not possible for interactive visualization of large data using classical rendering approaches, but it is also beneficial for the visual analysis of local details as well as the overall spatial impression of the data.

To test the scaling behavior of our OSPRay integration into MegaMol, we measured the rendering performance of four different-sized systems. We used synthetic data sets ranging from 10<sup>6</sup> to 1.4 · 10<sup>9</sup> elements rendered as sphere geometries. Spheres are the most basic visualization primitive and can be rendered very fast; therefore, they are typically used as a baseline case for performance tests using large data sets. We also compare the CPU ray tracing performance with a GPU-based ray casting, which is a fast and efficient way to render large numbers of particles (e.g., Reina and Ertl, 2005). The CPU ray tracing uses a P-k-d tree by Wald et al. (2015) for fast ray traversal. This tree is a memory-efficient hierarchical data structure used for space partitioning. All measurements were executed on a single desktop PC at a resolution of 1280 × 720 pixels. **Figure 13** shows the results obtained by different Intel CPUs for the OSPRay rendering compared to the GPU rendering on a high-end Nvidia consumer graphics card (Nvidia Titan XP). As can be observed, the GPU-based rendering outperforms the CPU-based ray tracing only for the smallest test case. For more than 10<sup>7</sup> spheres, the OSPRay ray tracing clearly outperforms the GPU rendering. This result agrees with our earlier findings presented in Rau et al. (2017).

FIGURE 13 | Average rendering performance (frames per second, FPS) for four different data sets measured on four different CPU architectures (blue, green, red, cyan; triangle markers). For reference, the rendering performance of a GPU-based ray casting measured on a high-end GPU is provided (violet; circle marker).

In summary, the CPU-based ray tracing approach that we chose is superior to classical GPU-based rendering not only in terms of image quality but also in terms of scalability for very large data sets. This is important for the visual analysis of HPC simulation data, which constantly increases in size as well as complexity due to improvements in simulation codes as well as the availability of faster HPC hardware. Our results demonstrate that real-time ray tracing is a viable solution nowadays for rendering large muscle fiber simulation data sets compared to classical rasterization-based approaches. It delivers not only superior image quality, which is beneficial for visual analysis, but also higher rendering performance even on single desktop PCs.

### 4. DISCUSSION

Using models to gain new insights into the complex physiological or anatomical mechanisms of biological tissue, or to better interpret and understand experimentally measured data, requires accurate and detailed models of the underlying mechanisms. This can lead very quickly to highly complex and computationally extremely demanding models. Software packages such as OpenCMISS are designed to build up computational models for a variety of complex biomechanical systems, e.g., for the chemoelectromechanical behavior of skeletal muscles after recruitment, the mechanics of the heart, the functioning of the lung, etc. Such software packages might already run within a parallel computing environment, but are not necessarily optimized to run large-scale simulations on large-scale systems such as HazelHen, the Tier-1 system in Stuttgart. Thus, before being able to exploit the full capabilities of supercomputers, they have to be analyzed and optimized to achieve good scaling properties. Ideally, they scale perfectly, meaning that the simulation of a problem twice as large on twice as many nodes/cores requires the same runtime as the original setup.
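The notion of scaling used here can be made concrete with a small sketch. The function names and all timings below are hypothetical and only illustrate the usual definitions of strong-scaling efficiency (fixed problem size) and weak-scaling efficiency (problem size growing with the core count, as in the "twice as large on twice as many cores" criterion).

```python
def strong_scaling_efficiency(t_ref, n_ref, t, n):
    """Fixed problem size: the ideal runtime shrinks like n_ref / n."""
    return (t_ref * n_ref) / (t * n)


def weak_scaling_efficiency(t_ref, t):
    """Problem size grows with the core count: the ideal runtime stays constant."""
    return t_ref / t


# Hypothetical timings in seconds, for illustration only (not measured data).
eff_strong = strong_scaling_efficiency(t_ref=100.0, n_ref=1, t=25.0, n=4)
eff_weak = weak_scaling_efficiency(t_ref=100.0, t=125.0)
print(eff_strong, eff_weak)
```

An efficiency of 1 corresponds to perfect scaling; values below 1 quantify the loss due to communication, load imbalance, or serial code sections.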

Within this paper, we have demonstrated that the chemoelectromechanical multi-scale skeletal muscle model as introduced in section 2.1 and implemented in OpenCMISS is capable of running significant large-scale model setups in a parallel compute environment. We have simulated the deformation of a skeletal muscle in which 34,560 randomly activated fibers are discretized with 103,680 1D elements. Due to the algorithmic optimizations, a meaningful compute time reduction was achieved. Further, by utilizing a standard test case, we have been able to show good strong and weak scaling properties for a small number of compute nodes. For the partitioning of the domain, two different approaches have been considered: a pillar-like partition along fiber directions and a minimal-surface partitioning. The solution times of the 3D and the 1D solver mainly depend on the domain partitioning. The 1D solver profits from pillar-like partitions, while the 3D solver exhibits lower runtimes for cube-like minimal-surface partitions. In addition to its advantage in terms of communication complexity for large numbers of parallel processes, the minimal-surface domain decomposition strategy investigated here is generalizable to arbitrary geometry settings, even for unstructured meshes, based, for example, on graph-partitioning methods.
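The communication argument can be illustrated with a toy count of the element faces that one subdomain shares with its neighbors. This is only a sketch under strong assumptions that are not taken from the paper: a structured n × n × n element grid, fibers aligned with the z axis, and an interior subdomain. The function name `interface_area` is hypothetical.

```python
def interface_area(n, px, py, pz):
    """Number of element faces one interior subdomain shares with its
    neighbors when an n x n x n grid is split into px x py x pz blocks."""
    sx, sy, sz = n // px, n // py, n // pz  # subdomain size in elements
    area = 0
    if px > 1:
        area += 2 * sy * sz  # two faces normal to x
    if py > 1:
        area += 2 * sx * sz  # two faces normal to y
    if pz > 1:
        area += 2 * sx * sy  # two faces normal to z
    return area


n = 64
pillar = interface_area(n, 8, 8, 1)  # 64 pillars along the assumed fiber axis z
cube = interface_area(n, 4, 4, 4)    # 64 cube-like blocks
print(pillar, cube)
```

For the same number of processes, the cube-like blocks expose fewer shared faces, which is consistent with the lower communication volume attributed to minimal-surface partitions, whereas the pillars keep every fiber on a single process.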

However, for more realistic large-scale simulations, further aspects concerning the model, algorithms, implementation, and visualization need to be considered: a more complicated chemo-electromechanical model that includes, for example, the mechanical behavior of titin (Heidlauf et al., 2016, 2017) and further important biophysical details such as metabolism, a biophysical recruitment model (Heidlauf et al., 2013), and a feedback mechanism from the spindles and the Golgi tendon organs to the neuromuscular system; simulation and visualization of the surface EMG to further test motor unit decomposition algorithms; novel or custom-tailored efficient numerical schemes for new model components and their coupling with the existing ones; or the integration of chemo-electromechanical modeling approaches to extend forward simulations using continuum-mechanics musculoskeletal system models (Röhrle et al., 2017) in order to drive them not only through optimization (Valentin et al., 2018) but also by means of neural recruitment, and hence obtain a deeper insight into neuromuscular recruitment principles.

Our goal is to set up large-scale simulations for a single chemo-electromechanical skeletal muscle model with a realistic number of fibers (e.g., about 300,000) of realistic length. The results of these simulations need to be visualized and analyzed, for which we extend MegaMol to offer novel, comprehensive visualizations that allow users to interactively explore the complex behavior of muscle fiber simulation data. We will validate our simulation by comparing the simulated surface EMG of a muscle with experimental data obtained via non-invasive and clinically available diagnostic tools. Finally, our simulations can serve as a new tool to investigate the interplay of the underlying complex and coupled mechanisms leading from neural stimulation to force generation.

### AUTHOR CONTRIBUTIONS

All authors have equally contributed to the conception and design of the work, data analysis and interpretation, drafting of the article, and critical revision of the article. Hence, the author list appears in alphabetical order. In addition NE, TK, AK, BM, and TR have conducted the simulations and summarized their results. All authors fully approve the content of this work.

#### FUNDING

This research was funded by the Baden-Württemberg Stiftung as part of the DiHu project of the High Performance Computing II program, the Intel® Parallel Computing Center program, and the Deutsche Forschungsgemeinschaft (DFG) as part of the International Graduate Research Group on Soft Tissue Robotics – Simulation-Driven Concepts and Design for Control and Automation for Robotic Devices Interacting with Soft Tissues (GRK 2198/1).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Bradley, Emamy, Ertl, Göddeke, Hessenthaler, Klotz, Krämer, Krone, Maier, Mehl, Rau and Röhrle. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Reduced Numerical Approximation of Reduced Fluid-Structure Interaction Problems With Applications in Hemodynamics

Claudia M. Colciago and Simone Deparis\*

MATH, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland

This paper deals with fast simulations of the hemodynamics in large arteries by considering a reduced model of the associated fluid-structure interaction problem, which in turn allows an additional reduction in terms of the numerical discretization. The resulting method is both accurate and computationally cheap. This goal is achieved by means of two levels of reduction: first, we describe the model equations with a reduced mathematical formulation which allows us to write the fluid-structure interaction problem as a Navier-Stokes system with non-standard boundary conditions; second, we employ numerical reduction techniques to further and drastically lower the computational costs. The non-standard boundary condition is of a generalized Robin type, with boundary mass and boundary stiffness terms accounting for the arterial wall compliance. The numerical reduction is obtained by coupling two well-known techniques: the proper orthogonal decomposition and the reduced basis method, in particular the greedy algorithm. We start by reducing the numerical dimension of the problem at hand with a proper orthogonal decomposition and we measure the system energy with specific norms; this allows us to take into account the different orders of magnitude of the state variables, the velocity and the pressure. Then, we introduce a strategy based on a greedy procedure which aims at enriching the reduced discretization space with low offline computational costs. As an application, we consider a realistic hemodynamics problem with a perturbation in the boundary conditions and we show the good performance of the reduction techniques presented in the paper. The results obtained with the numerical reduction algorithm are compared with those obtained by a standard finite element method. The gains obtained in terms of CPU time are of three orders of magnitude.

Keywords: fluid-structure interaction, Navier-Stokes equations, reduced order modeling, proper orthogonal decomposition, reduced basis method, hemodynamics

#### Edited by:

Mariano Vázquez, Barcelona Supercomputing Center, Spain

#### Reviewed by:

Alexey Goltsov, Abertay University, United Kingdom; Juan Carlos Cajas García, Barcelona Supercomputing Center, Spain

> \*Correspondence: Simone Deparis simone.deparis@epfl.ch

#### Specialty section:

This article was submitted to Systems Biology, a section of the journal Frontiers in Applied Mathematics and Statistics

> Received: 15 January 2018 Accepted: 16 May 2018 Published: 29 June 2018

#### Citation:

Colciago CM and Deparis S (2018) Reduced Numerical Approximation of Reduced Fluid-Structure Interaction Problems With Applications in Hemodynamics. Front. Appl. Math. Stat. 4:18. doi: 10.3389/fams.2018.00018

### 1. INTRODUCTION

When modeling hemodynamics phenomena in big arteries, the resulting model is a complex unsteady fluid-dynamics system, usually coupled with a structural model for the vessel wall. In specific cases, suitable assumptions can be made to reduce the complexity of the model equations. In particular, when the displacement is small, the moving domain can be linearized around a reference steady configuration and the dynamics of the vessel motion can be embedded in the equations for the blood flow. In this way we obtain a Reduced Fluid-Structure Interaction (RFSI) formulation where a Navier-Stokes system in a fixed fluid domain is supplemented by a Robin boundary condition that represents a surrogate of the structure model.

Although the RFSI model is faster than fully three-dimensional (3D) models where the structure is solved separately, the numerical computation of one heartbeat is still expensive: the resolution of an entire heartbeat, which typically lasts one physical second, takes on the order of hours of computational time on a supercomputer. A big challenge in realistic applications is to achieve a real-time resolution of fluid-structure interaction problems. In particular, in hemodynamics applications, this would make real-time diagnosis possible. Nevertheless, the great variability of patient-specific data requires the parametrization of the model with respect to many physical and geometrical quantities. Moreover, as recalled above, the complexity of hemodynamic phenomena requires a mathematical description with complex unsteady models that are difficult to solve in real time. The RFSI model is already a simpler version of the fully 3D FSI system; a further reduction of the physical model would result in an inaccurate estimation of specific outputs, e.g., the wall shear stress when using a rigid wall model [1, 2]. Thus, to further reduce numerical costs, in this work we focus on the reduction of the discretization space. In realistic applications, the finite element space has on the order of 10<sup>6</sup> degrees of freedom. The aim is to construct a discretization space such that the number of degrees of freedom is reduced to less than 100, and then to be able to solve one heartbeat in 1 s.

In the past few years, due to their relevance in realistic applications, a lot of interest has been devoted to discretization reduction techniques for parametrized Partial Differential Equation (PDE) problems (e.g., [3–6]). These techniques aim to define a suitable reduced order model which can be solved with marginal computational costs for different values of the model parameters. Reduced order models are therefore important in the many-query context, when a parametrized model has to be solved for different values of the parameter, and in real-time problems, when the solution has to be computed with marginal computational costs. To obtain a suitable reduced order model, we typically start from a problem written in a high-fidelity approximation framework, e.g., using the finite element method. The dimension of the discretized system is then drastically reduced through suitable projection operators. The construction of these projection operators is the core of the reduced order technique. Another key concept in the reduction framework is the subdivision of the computational costs into two stages: an offline stage, expensive but performed once, and an online stage, real time and performed each time new values of the model parameters are considered. During the offline stage the projection space is generated by a reduced basis of functions of the high-fidelity approximation space.
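The offline/online splitting can be sketched on a toy problem. Everything in the block below is an assumption made for illustration: the affine operator A(µ) = A0 + µ A1, the load vector, and the basis obtained by orthonormalizing a few snapshots (a stand-in for the POD and greedy constructions); none of it is the RFSI discretization itself.

```python
import numpy as np

rng = np.random.default_rng(0)
n, N = 200, 5  # high-fidelity and reduced dimensions (illustrative values)

# Hypothetical affine high-fidelity operator A(mu) = A0 + mu * A1 and load f.
A0 = 4.0 * np.eye(n) + np.diag(-np.ones(n - 1), 1) + np.diag(-np.ones(n - 1), -1)
A1 = np.eye(n)
f = rng.standard_normal(n)

# Offline stage (expensive, performed once): compute snapshots, build an
# orthonormal basis V, and project the parameter-independent operators.
mus_train = np.linspace(0.1, 10.0, N)
snapshots = np.column_stack([np.linalg.solve(A0 + mu * A1, f) for mu in mus_train])
V, _ = np.linalg.qr(snapshots)  # shape (n, N)
A0_N, A1_N, f_N = V.T @ A0 @ V, V.T @ A1 @ V, V.T @ f


def solve_online(mu):
    """Online stage: assemble and solve an N x N system only."""
    u_N = np.linalg.solve(A0_N + mu * A1_N, f_N)
    return V @ u_N  # lift the reduced coefficients back to the full space


mu_new = 3.3  # parameter value not used for the snapshots
u_rb = solve_online(mu_new)
u_hf = np.linalg.solve(A0 + mu_new * A1, f)
rel_err = np.linalg.norm(u_rb - u_hf) / np.linalg.norm(u_hf)
print(rel_err)
```

The online cost depends only on N, not on n, because the affine structure lets all n-dimensional products be precomputed offline.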

Reduced order models applied to the Burgers equation parametrized with respect to the Péclet number are considered in Yano et al. [7] and Nguyen et al. [8]. Other applications of reduced basis techniques to fluid problems can be found, e.g., in [9–18] and in the recent volume [6].

The aim of this work is to propose a suitable discretization reduction algorithm that can be applied to an RFSI problem. The work is organized as follows. In section 2 we present the partial differential equations that we are interested in solving. We propose a possible parametrization of the unsteady equations with respect to time-varying data and with respect to a perturbation of the boundary data. In section 3 we then present how the standard proper orthogonal decomposition algorithm can be applied to the problem at hand in order to generate a suitable reduced space. Moreover, we propose a way to improve the quality of the reduced approximation based on a greedy procedure. Finally, in section 4 we apply the reduction algorithms presented to a realistic hemodynamic problem. Conclusions follow.

### 2. MODEL EQUATION

Blood in large vessels can be modeled as an incompressible viscous fluid governed by the well-known Navier-Stokes equations (e.g., [19, 20]). With [0, T] denoting the temporal interval of interest, the Navier-Stokes system reads as follows:

$$\begin{cases} \rho_f \dfrac{\partial \mathbf{u}}{\partial t} + \rho_f (\mathbf{u} \cdot \nabla) \mathbf{u} - \nabla \cdot \boldsymbol{\sigma}^{nS} = \mathbf{0} & \text{in } \Omega \times [0, T], \\ \nabla \cdot \mathbf{u} = 0 & \text{in } \Omega \times [0, T], \end{cases}$$

where **u** and p are the velocity and pressure of the blood, respectively, and ρ<sub>f</sub> is its density. We denote by **σ**<sup>nS</sup> the Cauchy stress tensor

$$
\boldsymbol{\sigma}^{nS} = \mu_f(\nabla \mathbf{u} + (\nabla \mathbf{u})^T) - p\mathbf{I},
$$

with **I** being the identity tensor and µ<sub>f</sub> the blood dynamic viscosity. Let Ω denote the domain of interest, in our case the lumen of the vessels where we are interested in computing the dynamics of blood. Due to the compliant vessel wall, Ω should be time dependent. Considering that the wall displacement is relatively small with respect to the arterial diameter, we assume Ω to be fixed, which allows us to reduce the computational complexity otherwise generated by a moving domain. Nevertheless, to retrieve the physical effect of the wall compliance, we introduce a non-rigid boundary condition on the lateral surface of the lumen. The condition is derived from a three-dimensional linear isotropic elastic model condensed as a two-dimensional membrane [1, 21]. Denoting by Γ the lateral surface (i.e., the fluid-structure interface) and by **d**<sub>s</sub> the wall displacement, the stress-strain relation of this membrane, Π<sub>Γ</sub>(**d**<sub>s</sub>), can be written as:

$$\begin{split} \Pi_{\Gamma}(\mathbf{d}_s) &= h_s \frac{E_s \nu_s}{(1 - 2\nu_s)(1 + \nu_s)} \, \text{tr} \left( \frac{\nabla_{\Gamma} \mathbf{d}_s + \left( \nabla_{\Gamma} \mathbf{d}_s \right)^{T}}{2} \right) \\ &\quad + h_s \frac{E_s}{2(1 + \nu_s)} \left( \nabla_{\Gamma} \mathbf{d}_s + \left( \nabla_{\Gamma} \mathbf{d}_s \right)^{T} \right), \end{split}$$

where ∇<sub>Γ</sub>**d**<sub>s</sub> is the tangential gradient of **d**<sub>s</sub>, E<sub>s</sub> is the structural Young modulus, ν<sub>s</sub> is the Poisson's ratio and h<sub>s</sub> is the material thickness. All the physical parameters of the structure are assumed homogeneous in space.

Let us now suppose that the boundary ∂Ω is divided into three non-intersecting parts such that ∂Ω = Γ ∪ Γ<sub>D</sub> ∪ Γ<sub>N</sub>. Γ<sub>D</sub> is the Dirichlet boundary, typically the inflow of a vessel, and Γ<sub>N</sub> is the Neumann boundary, typically the outflows. We introduce the Hilbert space V = H<sup>1</sup>(Ω; Γ) = {v ∈ H<sup>1</sup>(Ω) : v|<sub>Γ</sub> ∈ H<sup>1</sup>(Γ)} and the corresponding vector space **V** = [V]<sup>3</sup>. Moreover, we introduce a suitable pair of standard finite element spaces **V**<sub>h</sub> and Q<sub>h</sub> such that **V**<sub>h</sub> ⊂ **V** and Q<sub>h</sub> ⊂ L<sup>2</sup>(Ω), and such that they form a stable pair of finite element spaces for the Navier-Stokes equations. We set **X**<sub>h</sub> := **V**<sub>h</sub> × Q<sub>h</sub>. We define a time interval of interest [t<sub>0</sub>, T] and we divide it into subintervals [t<sub>n</sub>, t<sub>n+1</sub>] for n = 0, ..., N<sub>T</sub> − 1 such that t<sub>0</sub> < t<sub>1</sub> < t<sub>2</sub> < ... < t<sub>N<sub>T</sub></sub> = T and t<sub>n+1</sub> − t<sub>n</sub> = Δt; let us define 𝒩<sub>T</sub> = {0, 1, ..., N<sub>T</sub> − 1, N<sub>T</sub>} as the collection of all the temporal indexes n. For a generic function φ(t) we use φ<sup>n</sup> := φ(t<sub>n</sub>). Finally, we define the operators D(·) and D<sub>Γ</sub>(·) as follows:

$$D(\mathbf{v}) = \frac{\nabla \mathbf{v} + (\nabla \mathbf{v})^T}{2} \quad \text{and} \quad D_{\Gamma}(\mathbf{v}) = \frac{\nabla_{\Gamma} \mathbf{v} + (\nabla_{\Gamma} \mathbf{v})^T}{2} \quad \forall \mathbf{v} \in \mathbf{V},$$

where ∇(·) is the standard gradient operator and ∇<sub>Γ</sub>(·) is the tangential component of the gradient with respect to the surface Γ.

The RFSI model as presented in Colciago et al. [1] is an unsteady Navier-Stokes model set on a fixed domain with generalized Robin boundary conditions (for similar models see, e.g., [21–24]). Let us introduce the velocity and pressure unknowns [**u**<sub>h</sub>, p<sub>h</sub>] and the corresponding test functions [**v**<sub>h</sub>, q<sub>h</sub>]. Although the RFSI model lives in a fixed domain, it is necessary to define an auxiliary variable **d**<sub>s,h</sub> which stands for the displacement of the arterial wall. Using a backward Euler finite difference method for the time derivatives, the fully discrete weak formulation of the RFSI problem is written as follows:

for each n = 0, ..., N<sub>T</sub> − 1, find [**u**<sub>h</sub><sup>n+1</sup>, p<sub>h</sub><sup>n+1</sup>] ∈ **X**<sub>h</sub> such that **u**<sub>h</sub><sup>n+1</sup> = **g**<sub>D</sub><sup>n+1</sup> on Γ<sub>D</sub> and

$$a_0([\mathbf{u}_h^{n+1}, p_h^{n+1}], [\mathbf{v}_h, q_h]) + a_1(\mathbf{u}_h^n, \mathbf{u}_h^{n+1}, \mathbf{v}_h) = F_0(\mathbf{v}_h; \mathbf{g}_N^{n+1}) + F_{\mathbf{u}}(\mathbf{v}_h; \mathbf{u}_h^n) + F_{\mathbf{d}_s}(\mathbf{v}_h; \mathbf{d}_{s,h}^n) \quad \forall [\mathbf{v}_h, q_h] \in \mathbf{X}_h, \tag{1}$$

where

$$\begin{aligned} a_0([\mathbf{u}_h^{n+1}, p_h^{n+1}], [\mathbf{v}_h, q_h]) &= \int_{\Omega} \Big( \rho_f \frac{\mathbf{u}_h^{n+1}}{\Delta t} \cdot \mathbf{v}_h + \big(2\mu_f D(\mathbf{u}_h^{n+1}) - p_h^{n+1} \mathbf{I}\big) : \nabla \mathbf{v}_h + q_h \nabla \cdot \mathbf{u}_h^{n+1} \Big) \, d\Omega \\ &\quad + \int_{\Gamma} \Big( \frac{h_s \rho_s}{\Delta t} \mathbf{u}_h^{n+1} \cdot \mathbf{v}_h + h_s \Delta t \, \Pi_{\Gamma}(\mathbf{u}_h^{n+1}) : \nabla_{\Gamma} \mathbf{v}_h \Big) \, d\Gamma, \\ a_1(\mathbf{u}_h^n, \mathbf{u}_h^{n+1}, \mathbf{v}_h) &= \int_{\Omega} \rho_f (\mathbf{u}_h^n \cdot \nabla) \mathbf{u}_h^{n+1} \cdot \mathbf{v}_h \, d\Omega, \\ F_0(\mathbf{v}_h; \mathbf{g}_N^{n+1}) &= \int_{\Gamma_N} \mathbf{g}_N^{n+1} \cdot \mathbf{v}_h \, d\Gamma, \\ F_{\mathbf{u}}(\mathbf{v}_h; \mathbf{u}_h^n) &= \int_{\Omega} \frac{\rho_f}{\Delta t} \mathbf{u}_h^n \cdot \mathbf{v}_h \, d\Omega + \int_{\Gamma} \frac{h_s \rho_s}{\Delta t} \mathbf{u}_h^n \cdot \mathbf{v}_h \, d\Gamma, \\ F_{\mathbf{d}_s}(\mathbf{v}_h; \mathbf{d}_{s,h}^n) &= -\int_{\Gamma} h_s \Pi_{\Gamma}(\mathbf{d}_{s,h}^n) : \nabla_{\Gamma} \mathbf{v}_h \, d\Gamma, \end{aligned} \tag{2}$$

with **d**<sub>s,h</sub><sup>n+1</sup> = **d**<sub>s,h</sub><sup>n</sup> + Δt **u**<sub>h</sub><sup>n+1</sup>, where ρ<sub>s</sub> represents the density of the solid. The functions **g**<sub>N</sub>(**x**, t) and **g**<sub>D</sub>(**x**, t) are sufficiently regular functions that stand for the Neumann and Dirichlet boundary data, respectively. Finally, the problem should be equipped with suitable initial conditions that, without any loss of generality, we suppose to be equal to zero.

As said before, the RFSI problem (1) is a linearized Navier-Stokes problem on a fixed domain with a non-standard boundary condition on the interface Γ. In particular, it is a generalized Robin boundary condition that contains both a mass and a stiffness boundary term to mimic the presence of a compliant arterial wall surrounding the fluid domain (see [25] for more details on the analysis of partial differential equations with generalized Robin boundary conditions). We remark that **d**<sub>s,h</sub><sup>n</sup> does not represent a problem unknown, since it is reconstructed as a weighted sum of the velocities at different time instants.
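The reconstruction of the wall displacement from the velocities can be sketched as follows. The velocity values and the time step are hypothetical; the block only shows that the recursive update and the weighted sum over time levels coincide when the initial displacement is zero.

```python
import numpy as np

dt = 1e-3  # hypothetical time step

# Hypothetical wall velocities u^1, ..., u^4 at three interface nodes
# (one row per time level); not data from the paper.
u = np.array([[0.0, 1.0, 2.0],
              [0.5, 1.0, 1.5],
              [1.0, 1.0, 1.0],
              [1.5, 1.0, 0.5]])

# Recursive update d^{n+1} = d^n + dt * u^{n+1}, starting from d^0 = 0.
d = np.zeros(u.shape[1])
for u_next in u:
    d = d + dt * u_next

# Equivalent closed form: dt times the sum of all stored velocities.
d_sum = dt * u.sum(axis=0)
print(d, d_sum)
```

Because the displacement is a by-product of the velocity history, it adds no degrees of freedom to the linear system solved at each time step.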

#### 2.1. Boundary Condition

Problem (1) is endowed with a Dirichlet velocity boundary condition on the inlet surface Γ<sub>D</sub>. Given the inlet velocity data **g**<sub>D</sub>(**x**, t), at the time instant t<sub>n+1</sub> we impose:

$$\mathbf{u}\_h^{n+1} = \mathbf{g}\_D^{n+1} \qquad \text{on } \Gamma\_D. \tag{3}$$

The Neumann boundary condition D(**u**<sub>h</sub><sup>n+1</sup>)**n** = **g**<sub>N</sub> is imposed weakly on Γ<sub>N</sub>. The solution of problems (1)–(3) depends on the time variable t through the inlet and outlet data, **g**<sub>D</sub>(**x**, t) and **g**<sub>N</sub>(**x**, t). We suppose that

$$\mathbf{g}_D(\mathbf{x},t) = \sigma_1(t)\,\widetilde{\mathbf{g}}_D(\mathbf{x}) \quad \text{and} \quad \mathbf{g}_N(\mathbf{x},t) = \sigma_2(t)\,\widetilde{\mathbf{g}}_N(\mathbf{x}), \tag{4}$$

that is we separate the contribution of the space and temporal variables in the inlet and outlet data. In realistic applications, the separation of variables (4) often derives directly from modeling choices. If at the outlet we prescribe an average normal stress, no spatial variability is involved in the boundary condition data **g**N. At the inlet, the Dirichlet data is imposed by means of a velocity profile; typically Poiseuille or Womersley profiles are chosen in hemodynamics applications [20]. The separation of variables in **g**D(**x**, t) in this case is straightforward.
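As an illustration of the separated form (4) for the inlet data, the sketch below multiplies a Poiseuille-type spatial profile by a scalar time factor. The pulsatile function `sigma_1`, the vessel radius, and the peak velocity are assumptions chosen for the example, not values used in the paper.

```python
import numpy as np


def poiseuille_profile(r, radius, u_max=1.0):
    """Space-only factor: parabolic axial velocity on a circular inlet."""
    return u_max * (1.0 - (r / radius) ** 2)


def sigma_1(t, period=1.0):
    """Time-only factor: hypothetical pulsatile scaling, not patient data."""
    return 0.5 * (1.0 - np.cos(2.0 * np.pi * t / period))


def g_D(r, t, radius=0.01):
    """Inlet data written as (time factor) x (space factor)."""
    return sigma_1(t) * poiseuille_profile(r, radius)


r = np.linspace(0.0, 0.01, 5)          # radial sample points (assumed units: m)
g_tilde = poiseuille_profile(r, 0.01)  # spatial profile, computed once
g_half = sigma_1(0.5) * g_tilde        # rescaled by a scalar at t = 0.5
print(g_half)
```

Only the scalar factor changes from one time step to the next, which is what allows a single spatial profile to be reused at every time level.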

Assumption (4) allows us to write an affine decomposition of the operators defined in (2) with respect to σ<sub>1</sub>(t) and σ<sub>2</sub>(t). With respect to the latter we have:

$$F_0([\mathbf{v}_h, q_h]; \sigma_2^{n+1}, \widetilde{\mathbf{g}}_N) = \sigma_2^{n+1} \int_{\Gamma_N} \widetilde{\mathbf{g}}_N \cdot \mathbf{v}_h \, d\Gamma.$$

The non homogeneous Dirichlet boundary condition (3) is not directly included in the variational form (1). In order to write the affine decomposition with respect to the parameter σ1(t), a suitable choice to embed condition (3) into the variational formulation has to be made. In the literature two possible approaches are proposed: a strong imposition, using a lifting function or suitable Lagrange multipliers [17], and a weak imposition adding suitable penalty variational terms [2, 26]. Due to the fact that the Dirichlet data can be written in the form (4), a single time independent lifting function can be constructed and properly weighted by a scalar in order to represent the lifting at each temporal instant.

We explain how problem (1) is modified when a lifting function for the Dirichlet condition (3) is introduced. Let us directly consider the fully discretized formulation (1). We define the time-independent lifting function **Rg̃**<sub>D</sub> : ℝ<sup>3</sup> → ℝ<sup>3</sup> such that **Rg̃**<sub>D</sub> ∈ **V**<sub>h</sub> and

$$\mathbf{R}\widetilde{\mathbf{g}}_D(\mathbf{x}) = \widetilde{\mathbf{g}}_D(\mathbf{x}) \quad \text{on } \Gamma_D \quad \text{and} \quad \mathbf{R}\widetilde{\mathbf{g}}_D(\mathbf{x}) = \mathbf{0} \quad \text{on } \partial\Omega \setminus \Gamma_D.$$

At the time level t<sub>n+1</sub>, the lifting function of the data **g**<sub>D</sub><sup>n+1</sup> = σ<sub>1</sub><sup>n+1</sup> **g̃**<sub>D</sub> reads **Rg**<sub>D</sub><sup>n+1</sup> = σ<sub>1</sub><sup>n+1</sup> **Rg̃**<sub>D</sub>. Then, for each t<sub>n+1</sub>, we introduce the following change of variable:

$$
\widetilde{\mathbf{u}}\_h^{n+1} = \mathbf{u}\_h^{n+1} - \mathbf{R} \mathbf{g}\_D^{n+1}.\tag{5}
$$

We define the space **X**<sub>h,Γ<sub>D</sub></sub> as

$$\mathbf{X}_{h,\Gamma_D} := \big(\mathbf{V}_h \cap [H^1_{\Gamma_D}(\Omega)]^d\big) \times Q_h,$$

and we observe that, since the lifting function vanishes on Γ,

$$\mathbf{d}_{s,h}^{n+1} = \sum_{s=0}^{n+1} \Delta t \, \mathbf{u}_h^s = \sum_{s=0}^{n+1} \Delta t \, \widetilde{\mathbf{u}}_h^s \quad \text{on } \Gamma.$$

#### 2.1.1. Affine Decomposition

Using the definitions of the functionals as in (2), we are now ready to write the affine decomposition of problem (1) with respect to the temporal parameters σ<sub>1</sub>(t) and σ<sub>2</sub>(t). We remark that the lifting function **Rg̃**<sub>D</sub> does not depend on the time variable, thus the problem parameters at a fixed time level can be gathered in the following vector:

$$(\mu^{n+1})^T := [\mu\_0, \mu\_1, \mu\_2] := [\sigma\_1^{n+1}, \sigma\_2^{n+1}, \sigma\_1^n]. \tag{6}$$

One single time step of finite element approximation of the RFSI problem can be written under the form:

for each n = 0, ..., N<sub>T</sub> − 1, find **Ũ**<sub>h</sub><sup>n+1</sup> ∈ **X**<sub>h,Γ<sub>D</sub></sub> such that

$$a(\widetilde{\mathbf{U}}_{h}^{n+1}, \mathbf{W}_{h}; \widetilde{\mathbf{U}}_{h}^{n}, \boldsymbol{\mu}^{n+1}) = F(\mathbf{W}_{h}; \widetilde{\mathbf{U}}_{h}^{n}, \mathbf{d}_{s,h}^{n}, \boldsymbol{\mu}^{n+1}) \quad \forall \mathbf{W}_{h} \in \mathbf{X}_{h, \Gamma_{D}}, \tag{7}$$

where

$$\begin{aligned} a(\widetilde{\mathbf{U}}_{h}^{n+1}, \mathbf{W}_{h}; \widetilde{\mathbf{U}}_{h}^{n}, \boldsymbol{\mu}^{n+1}) &:= a_{0}(\widetilde{\mathbf{U}}_{h}^{n+1}, \mathbf{W}_{h}) + \mu_{2} \, a_{1}(\mathbf{R} \widetilde{\mathbf{g}}_{D}, \widetilde{\mathbf{U}}_{h}^{n+1}, \mathbf{W}_{h}) + a_{1}(\widetilde{\mathbf{U}}_{h}^{n}, \widetilde{\mathbf{U}}_{h}^{n+1}, \mathbf{W}_{h}), \\ F(\mathbf{W}_{h}; \widetilde{\mathbf{U}}_{h}^{n}, \mathbf{d}_{s,h}^{n}, \boldsymbol{\mu}^{n+1}) &:= \mu_{1} F_{0}(\mathbf{W}_{h}; \widetilde{\mathbf{g}}_{N}) + F_{\mathbf{u}}(\mathbf{W}_{h}; \widetilde{\mathbf{U}}_{h}^{n}) + \mu_{2} F_{\mathbf{u}}(\mathbf{W}_{h}; \mathbf{R} \widetilde{\mathbf{g}}_{D}) + F_{\mathbf{d}_s}(\mathbf{W}_{h}; \mathbf{d}_{s,h}^{n}) \\ &\quad - \mu_{0} \, a_{0}(\mathbf{R} \widetilde{\mathbf{g}}_{D}, \mathbf{W}_{h}) - \mu_{0} \, a_{1}(\widetilde{\mathbf{U}}_{h}^{n}, \mathbf{R} \widetilde{\mathbf{g}}_{D}, \mathbf{W}_{h}) - \mu_{2} \mu_{0} \, a_{1}(\mathbf{R} \widetilde{\mathbf{g}}_{D}, \mathbf{R} \widetilde{\mathbf{g}}_{D}, \mathbf{W}_{h}). \end{aligned} \tag{8}$$

Since we use a semi-implicit treatment of the convective term, the formulation of the RFSI problem at one single time instant t<sub>n+1</sub> can be interpreted as a linear steady problem parametrized with respect to **µ**<sup>n+1</sup>, **Ũ**<sub>h</sub><sup>n</sup> and **d**<sub>s,h</sub><sup>n</sup>.
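As a consistency check on the convective contributions, the following sketch expands the trilinear form under the change of variable (5), assuming only that a<sub>1</sub> is linear in each of its first two arguments and that the lifting at the two time levels is scaled by µ<sub>2</sub> = σ<sub>1</sub><sup>n</sup> and µ<sub>0</sub> = σ<sub>1</sub><sup>n+1</sup>:

$$\begin{aligned} a_1(\mathbf{u}_h^n, \mathbf{u}_h^{n+1}, \mathbf{v}_h) &= a_1(\widetilde{\mathbf{u}}_h^n + \mu_2 \mathbf{R}\widetilde{\mathbf{g}}_D, \; \widetilde{\mathbf{u}}_h^{n+1} + \mu_0 \mathbf{R}\widetilde{\mathbf{g}}_D, \; \mathbf{v}_h) \\ &= a_1(\widetilde{\mathbf{u}}_h^n, \widetilde{\mathbf{u}}_h^{n+1}, \mathbf{v}_h) + \mu_2 \, a_1(\mathbf{R}\widetilde{\mathbf{g}}_D, \widetilde{\mathbf{u}}_h^{n+1}, \mathbf{v}_h) \\ &\quad + \mu_0 \, a_1(\widetilde{\mathbf{u}}_h^n, \mathbf{R}\widetilde{\mathbf{g}}_D, \mathbf{v}_h) + \mu_2 \mu_0 \, a_1(\mathbf{R}\widetilde{\mathbf{g}}_D, \mathbf{R}\widetilde{\mathbf{g}}_D, \mathbf{v}_h). \end{aligned}$$

The first two terms contain the unknown at the new time level and stay on the left-hand side; the last two involve only known quantities and move to the right-hand side with a minus sign, which matches the convective terms appearing in the affine form of the operators.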

Furthermore, we can introduce a parameter in the inlet flow rate function representing a small perturbation with respect to a reference value: the inlet flow rate function (4) is then modified as

$$\mathbf{g}_D(\mathbf{x},t;\alpha) = \theta(\alpha,t)\,\sigma_1(t)\,\widetilde{\mathbf{g}}_D(\mathbf{x}),\tag{9}$$

where α ∈ 𝒟, with 𝒟 being the set of admissible values of α. The same affine decomposition (8) holds with the following modification: the parameter vector becomes (**µ**<sup>n+1</sup>)<sup>T</sup> := [µ<sub>0</sub>, µ<sub>1</sub>, µ<sub>2</sub>, µ<sub>3</sub>] := [σ<sub>1</sub><sup>n+1</sup>, σ<sub>2</sub><sup>n+1</sup>, σ<sub>1</sub><sup>n</sup>, θ<sup>n</sup>(α)], and in (8) we substitute µ<sub>0</sub> with µ<sub>3</sub>µ<sub>0</sub> and µ<sub>2</sub> with µ<sub>2</sub>µ<sub>3</sub>.

### 3. NUMERICAL REDUCTION

In this section we briefly introduce some of the basic concepts of the reduced basis method that are useful to our purpose. For more details on the reduced basis theory we refer the interested reader to, e.g., Rozza et al. [5], Hesthaven et al. [27], and Quarteroni et al. [28]. We already introduced **Ũ**<sub>h</sub><sup>n+1</sup>, which at each time instant is a high-fidelity approximation of the exact solution and is computed as a finite element solution on a sufficiently fine mesh. The solutions **Ũ**<sub>h</sub><sup>n+1</sup> of problem (7) are, in general, expensive to obtain from the computational point of view, since in realistic applications the finite element space has on the order of 10<sup>6</sup> degrees of freedom and the complexity of the geometrical domain does not always allow for the generation of structured meshes. We conclude that, due to the size of the finite element problem, a real-time computation would be impossible to achieve.

As in the standard reduced basis theory, we state the following assumption: the family of solutions **Ũ**<sub>h</sub><sup>n+1</sup> = **Ũ**<sub>h</sub><sup>n+1</sup>(**µ**<sup>n+1</sup>) obtained for different realizations of the parameters belongs to a low-dimensional manifold ℳ<sub>h</sub><sup>µ</sup>. The aim of the reduction techniques is to find a suitable approximation of the manifold ℳ<sub>h</sub><sup>µ</sup> through the construction of a low-dimensional space **X**<sub>N</sub> ⊂ **X**<sub>h,Γ<sub>D</sub></sub>. The dimension N of the reduced space needs to be orders of magnitude lower than the dimension of the finite element space. The reduced approximation of the RFSI problem reads:

given **Ũ**<sub>N</sub><sup>0</sup> = **Ũ**<sub>h</sub><sup>0</sup>, for each n = 0, ..., N<sub>T</sub> − 1, find **Ũ**<sub>N</sub><sup>n+1</sup> ∈ **X**<sub>N</sub> such that

$$a(\widetilde{\mathbf{U}}_N^{n+1}, \mathbf{W}_N; \widetilde{\mathbf{U}}_N^n, \boldsymbol{\mu}^{n+1}) = F(\mathbf{W}_N; \widetilde{\mathbf{U}}_N^n, \mathbf{d}_{s,N}^n, \boldsymbol{\mu}^{n+1}) \quad \forall \mathbf{W}_N \in \mathbf{X}_N,\tag{10}$$

where a(·, ·) and F(·) are defined as in (8).
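At the algebraic level, a reduced problem of the form (10) amounts to a Galerkin projection of the high-fidelity system onto the reduced basis. The following is a minimal sketch under simplifying assumptions: a single steady linear system stands in for one time step, and the names `A_h`, `f_h`, `V` and `solve_reduced` are hypothetical and do not come from the authors' code.

```python
import numpy as np

def solve_reduced(A_h, f_h, V):
    """Galerkin projection of A_h u = f_h onto span(V).

    V stores the reduced basis column-wise. Returns the reduced
    coefficients and the solution lifted back to the full space.
    """
    A_N = V.T @ A_h @ V              # N x N reduced operator
    f_N = V.T @ f_h                  # reduced right-hand side
    coeffs = np.linalg.solve(A_N, f_N)
    return coeffs, V @ coeffs

# Toy data (assumed, for illustration only): SPD operator and a
# solution that lies exactly in the reduced space.
rng = np.random.default_rng(0)
n_dofs, N = 200, 5
A_h = np.diag(np.linspace(1.0, 10.0, n_dofs))
V, _ = np.linalg.qr(rng.standard_normal((n_dofs, N)))
u_true = V @ rng.standard_normal(N)
f_h = A_h @ u_true
coeffs, u_rb = solve_reduced(A_h, f_h, V)
```

When the high-fidelity solution lies in the span of the basis, as in this toy setup, the reduced solution reproduces it up to round-off; in general the quality depends on how well the basis approximates the solution manifold.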

#### 3.1. Proper Orthogonal Decomposition

We apply a discretization reduction to the RFSI problem (7) based on the Proper Orthogonal Decomposition (POD) method. Here we only detail the specific choices made in relation to the problem at hand; for more details about POD applied to fluid problems we refer the reader to, e.g., Rowley [29] and Willcox and Peraire [30].

We define a subset of temporal indexes $\mathcal{N}\_S \subset \mathcal{N}\_T$ with cardinality $N\_S$ and consider the solutions of problem (1) at the time instants $t\_{n\_S}$ for $n\_S \in \mathcal{N}\_S$. The solutions $\widetilde{\mathbf{U}}\_h^{n\_S}$, called snapshots, represent our starting point for the POD analysis. Since the RFSI problem (7) is a saddle point problem in two variables (velocity and pressure) with different characteristic orders of magnitude, we split the POD into two eigenvalue decompositions: one for the velocity and one for the pressure [31]. We measure the energy associated with the snapshots using the following scalar products: for the velocity, we set

$$(\mathbf{u}\_{\boldsymbol{h}}, \mathbf{v}\_{\boldsymbol{h}})\_{\mathbf{V}} := (\mathbf{u}\_{\boldsymbol{h}}, \mathbf{v}\_{\boldsymbol{h}})\_{\mathbf{H}^{1}(\boldsymbol{\Omega})} + (\mathbf{u}\_{\boldsymbol{h}}, \mathbf{v}\_{\boldsymbol{h}})\_{\mathbf{H}^{1}(\boldsymbol{\Gamma})},$$

$$\forall \mathbf{u}\_{\boldsymbol{h}}, \mathbf{v}\_{\boldsymbol{h}} \in \mathbf{V}\_{\boldsymbol{h}} \subset \mathbf{V} (= \mathbf{H}^{1}\_{\Gamma\_{D}}(\boldsymbol{\Omega}; \boldsymbol{\Gamma})), \tag{11}$$

and for the pressure,

$$(p\_h, q\_h)\_{Q} := (p\_h, q\_h)\_{L^2(\Omega)}, \quad \forall p\_h, q\_h \in Q\_h \subset Q := L^2(\Omega). \tag{12}$$

Then, we compute the two Gramian matrices

$$G\_{ij}^{\mathbf{u}} = (\mathbf{u}\_h^i, \mathbf{u}\_h^j)\_{\mathbf{V}} \quad \text{and} \quad G\_{ij}^{p} = (p\_h^i, p\_h^j)\_{Q} \qquad \forall i, j \in \mathcal{N}\_{S} \tag{13}$$

and we perform the eigenvalue decompositions of $G^{\mathbf{u}}$ and $G^{p}$, obtaining the pairs $(\lambda\_k^{\mathbf{u}}, \boldsymbol{\zeta}\_k^{\mathbf{u}})$ and $(\lambda\_k^{p}, \boldsymbol{\zeta}\_k^{p})$, where $\lambda\_k^{\mathbf{u}}, \lambda\_k^{p} \in \mathbb{R}$ and $\boldsymbol{\zeta}\_k^{\mathbf{u}}, \boldsymbol{\zeta}\_k^{p} \in \mathbb{R}^{N\_S}$ are the $k$-th eigenvalues and eigenvectors of the velocity and pressure Gramian matrices, respectively, for $k = 1, \ldots, N\_S$. Fixing the same tolerance for both the velocity and pressure decompositions, we select the first $N^{\mathbf{u}}$ and $N^{p}$ eigenpairs such that:

$$\frac{\sum\_{j=1}^{N^{\mathbf{u}}} \lambda\_j^{\mathbf{u}}}{\sum\_{k=1}^{N\_{S}} \lambda\_k^{\mathbf{u}}} \ge 1 - tol \quad \text{and} \quad \frac{\sum\_{j=1}^{N^{p}} \lambda\_j^{p}}{\sum\_{k=1}^{N\_{S}} \lambda\_k^{p}} \ge 1 - tol,\tag{14}$$

respectively. The $j$-th velocity eigenfunction $\boldsymbol{\phi}\_j^{\mathbf{u}} \in \mathbf{V}\_h$ is reconstructed using the linear combination:

$$\boldsymbol{\phi}\_{j}^{\mathbf{u}} = \frac{1}{\lambda\_{j}^{\mathbf{u}}} \sum\_{n\_{S} \in \mathcal{N}\_{S}} [\boldsymbol{\zeta}\_{j}^{\mathbf{u}}]\_{n\_{S}} \mathbf{u}\_{h}^{n\_{S}}, \quad \text{for } j = 1, \ldots, N^{\mathbf{u}}.$$

Similarly for $\phi\_j^{p} \in Q\_h$, for $j = 1, \ldots, N^{p}$. We remark that, since the velocity basis functions are linear combinations of solutions of problem (7), they all verify $\int\_{\Omega} q\_h \nabla \cdot \boldsymbol{\phi}\_j^{\mathbf{u}} = 0$, $\forall q\_h \in Q\_h$, for $j = 1, \ldots, N^{\mathbf{u}}$. Thus, the linear system induced by the bilinear form $a(\cdot, \cdot)$ as in (2) would be singular if we considered the functional spaces generated by the velocity modes $\boldsymbol{\phi}\_j^{\mathbf{u}}$ and the pressure modes $\phi\_j^{p}$ alone. One possibility often employed in the context of the Navier-Stokes equations is to restrict the system and solve the problem only for the velocity unknown (see e.g., [32]). Unfortunately, this is not possible for problem (2). The generalized boundary condition applied on $\Gamma$ derives from a structural model whose solution is driven by the pressure condition set on the external boundary in the structural model (see [1, 22]). If we solve the reduced system without taking the pressure variable into account, we cannot recover the velocity on the boundary $\Gamma$, nor the output functionals that depend on these values (e.g., wall shear stress). For these reasons, following Rozza and Veroy [33], for each selected pressure mode $\phi\_j^{p}$, we define the corresponding supremizer function $\sigma\_j \in \mathbf{V}\_h$ as the solution of the following problem:

$$(\sigma\_j, \mathbf{v}\_h)\_{\mathbf{V}} = \int\_{\Omega} \phi\_j^{p} \nabla \cdot \mathbf{v}\_h \, d\Omega \quad \forall \mathbf{v}\_h \in \mathbf{V}\_h, \quad \text{for } j = 1, \ldots, N^{p}. \tag{15}$$

We then add them to the POD basis. The POD reduced space $\mathbf{X}\_N^{POD}$ associated with the RFSI model is spanned by the basis functions $\{\boldsymbol{\psi}\_j\}\_{j=1}^{N^{\mathbf{u}} + 2 N^{p}}$, $\boldsymbol{\psi}\_j \in \mathbf{X}\_h$, defined as follows:

$$\begin{aligned} \boldsymbol{\psi}\_{j} &= [\boldsymbol{\phi}\_{j}^{\mathbf{u}}, 0]^{T} & \text{ for } j = 1, \ldots, N^{\mathbf{u}}, \\ \boldsymbol{\psi}\_{N^{\mathbf{u}} + j} &= [\mathbf{0}, \phi\_{j}^{p}]^{T} & \text{ for } j = 1, \ldots, N^{p}, \text{ and } \\ \boldsymbol{\psi}\_{N^{\mathbf{u}} + N^{p} + j} &= [\boldsymbol{\phi}\_{j}^{\sigma}, 0]^{T} & \text{ for } j = 1, \ldots, N^{p}, \end{aligned} \tag{16}$$

where $\boldsymbol{\phi}\_j^{\sigma}$, for $j = 1, \ldots, N^{p}$, are the orthonormalized supremizer functions $\sigma\_j$, obtained with a Gram-Schmidt algorithm with respect to the scalar product $(\cdot, \cdot)\_{\mathbf{V}}$.
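The Gramian-based construction in (13) and (14) can be sketched for a single field as follows. This is only an illustration, not the authors' implementation: the function `pod_modes`, the snapshot matrix `S` and the inner-product matrix `X` are hypothetical names, and the modes are scaled by $1/\sqrt{\lambda\_j}$ so that they come out orthonormal in the chosen inner product, which is an assumption of the sketch and differs from the prefactor printed in the reconstruction formula above.

```python
import numpy as np

def pod_modes(S, X, tol):
    """Method of snapshots for one field.

    S   : (n_dofs, N_S) matrix whose columns are the snapshots
    X   : (n_dofs, n_dofs) SPD matrix defining the scalar product
    tol : energy tolerance, as in criterion (14)
    Returns the selected modes and all eigenvalues (descending).
    """
    G = S.T @ (X @ S)                      # Gramian, G_ij = (s_i, s_j)_X
    lam, zeta = np.linalg.eigh(G)          # eigh returns ascending order
    lam, zeta = lam[::-1], zeta[:, ::-1]   # reorder to descending
    lam = np.clip(lam, 0.0, None)          # remove round-off negatives
    energy = np.cumsum(lam) / np.sum(lam)
    # smallest N such that the retained energy ratio is >= 1 - tol
    N = min(int(np.searchsorted(energy, 1.0 - tol)) + 1, lam.size)
    # assumed scaling 1/sqrt(lambda): modes are X-orthonormal
    modes = S @ zeta[:, :N] / np.sqrt(lam[:N])
    return modes, lam

# Toy snapshots with known singular values 10, 1, 0.1 (assumed data)
rng = np.random.default_rng(0)
Q1, _ = np.linalg.qr(rng.standard_normal((50, 3)))
Q2, _ = np.linalg.qr(rng.standard_normal((20, 3)))
S = Q1 @ np.diag([10.0, 1.0, 0.1]) @ Q2.T
X = np.eye(50)
modes, lam = pod_modes(S, X, tol=1e-3)
```

In the setting of the text, the routine would be called twice, once with the velocity snapshots and the scalar product (11) and once with the pressure snapshots and the scalar product (12), using the same tolerance.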

#### 3.2. Greedy Enrichment

The bottleneck of the POD procedure is the computation of the high-fidelity solutions $\widetilde{\mathbf{U}}\_h^{n}$ necessary to build the correlation matrix: we have to solve a finite element problem $N\_T$ times. Moreover, if we choose $N\_S = N\_T$, the Gramian matrix becomes large and its eigenvalue decomposition becomes too expensive. We can envision two situations where we would like to improve the quality of the approximation obtained with the POD reduced space without changing the snapshot sample. For example, if $N\_S$ is five times smaller than $N\_T$, the information carried by the snapshot sample refers to only 20% of the entire set of truth solutions. Is it possible to improve the quality of the reduced approximation without increasing the number of snapshots selected? In another scenario, suppose that a perturbation parameter $\alpha$ is introduced in the unsteady problem (7), as proposed in (9), and that the snapshots are computed for a specific value $\alpha = \alpha\_1$. We would like to generate a reduced space that also suitably approximates the truth solutions for other values of $\alpha$, without recomputing all the high-fidelity snapshots.

With these two scenarios in mind, we propose a strategy to improve the quality of the reduced approximation based on a greedy algorithm. For references to standard greedy algorithms applied to parametrized PDEs see e.g., Hesthaven et al. [27] and Quarteroni et al. [28].

We introduce another solution $\mathbf{U}\_{N,h}^{n+1}$ that solves an intermediate problem between (7) and (10): find $\mathbf{U}\_{N,h}^{n+1} \in \mathbf{X}\_h$ such that

$$a(\mathbf{U}\_{N,h}^{n+1}, \mathbf{W}\_h; \widetilde{\mathbf{U}}\_N^n, \boldsymbol{\mu}^{n+1}) = F(\mathbf{W}\_h; \widetilde{\mathbf{U}}\_N^n, \mathbf{D}\_N^n, \boldsymbol{\mu}^{n+1}) \quad \forall \mathbf{W}\_h \in \mathbf{X}\_{h, \Gamma\_D}. \tag{17}$$

We notice that problem (17) is set in the high-fidelity approximation framework, but the right-hand side and the advection field are defined as in (10). In fact, in (7) these terms are evaluated using the truth solution $\widetilde{\mathbf{U}}\_h^{n}$, while in (17) they are evaluated using the reduced solution $\widetilde{\mathbf{U}}\_N^{n}$, as in problem (10). Considering the error between $\widetilde{\mathbf{U}}\_N^{n}$ and $\widetilde{\mathbf{U}}\_h^{n}$ in a generic norm $\|\cdot\|\_\*$, the following triangle inequality holds:

$$\begin{aligned} \|\widetilde{\mathbf{U}}\_N^{n+1} - \widetilde{\mathbf{U}}\_h^{n+1}\|\_\* &= \|\widetilde{\mathbf{U}}\_N^{n+1} - \mathbf{U}\_{N,h}^{n+1} + \mathbf{U}\_{N,h}^{n+1} - \widetilde{\mathbf{U}}\_h^{n+1}\|\_\* \\ &\le \|\widetilde{\mathbf{U}}\_N^{n+1} - \mathbf{U}\_{N,h}^{n+1}\|\_\* + \|\mathbf{U}\_{N,h}^{n+1} - \widetilde{\mathbf{U}}\_h^{n+1}\|\_\*. \end{aligned}$$

The greedy procedure that we propose focuses on the first contribution $\|\widetilde{\mathbf{U}}\_N^{n+1} - \mathbf{U}\_{N,h}^{n+1}\|\_\*$. Subtracting problem (10) from (17) allows us to state a Galerkin orthogonality result:

$$a(\mathbf{U}\_{N,h}^{n+1} - \widetilde{\mathbf{U}}\_N^{n+1}, \mathbf{W}\_h; \,\widetilde{\mathbf{U}}\_N^n, \mu^{n+1}) = 0.$$

We assume that the dual norm of the residual can be used as an indicator of the error $\|\mathbf{U}\_N^{n+1} - \mathbf{U}\_h^{n+1}\|\_{\mathbf{X}}$. In particular, at each time level $t\_{n+1}$, we consider

$$r\_N^{n+1}(\mathbf{W}\_h) := F(\mathbf{W}\_h; \widetilde{\mathbf{U}}\_N^n, \mathbf{D}\_N^n, \mu^{n+1}) - a(\widetilde{\mathbf{U}}\_N^{n+1}, \mathbf{W}\_h; \widetilde{\mathbf{U}}\_N^n, \mu^{n+1}) \tag{18}$$

and its associated dual norm $\|r\_N^{n+1}(\mathbf{W}\_h)\|\_{\mathbf{X}'}$.
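In a discrete setting, the dual norm of a residual functional is commonly evaluated through its Riesz representative: if `r` collects the residual tested against each finite element basis function and `X` is the symmetric positive definite matrix of the norm, then the dual norm equals $\sqrt{r^T X^{-1} r}$. The sketch below illustrates this for a generic linear system; `A_h`, `f_h`, `X` and `residual_dual_norm` are hypothetical names, and a dense solve is used only for brevity.

```python
import numpy as np

def residual_dual_norm(A_h, f_h, u, X):
    """Dual norm of the residual r = f_h - A_h u with respect to X.

    The Riesz representative solves X * riesz = r, and the dual
    norm is sqrt(r^T riesz) = sqrt(r^T X^{-1} r).
    """
    r = f_h - A_h @ u
    riesz = np.linalg.solve(X, r)
    return float(np.sqrt(r @ riesz))

# Tiny example with assumed data
A_h = np.diag([2.0, 4.0])
f_h = np.array([2.0, 4.0])
X = np.diag([4.0, 4.0])
norm_at_zero = residual_dual_norm(A_h, f_h, np.zeros(2), X)
norm_at_solution = residual_dual_norm(A_h, f_h, np.ones(2), X)
```

For large problems one would typically factorize `X` once and reuse the factorization at every time step, rather than calling a dense solver each time.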

Having defined all the necessary quantities, we can now present the steps to be performed to enrich the POD basis with a greedy algorithm. First, we perform a POD on the snapshots $\widetilde{\mathbf{U}}\_h^{n\_S}$, for $n\_S \in \mathcal{N}\_S$, and construct the reduced space $\mathbf{X}\_N^{POD}$. Then, we start the greedy enrichment by setting $\mathbf{X}\_N = \mathbf{X}\_N^{POD}$:


$$n^\* = \arg\max\_{n \in \mathcal{N}\_T} \|r\_N^n(\mathbf{W}\_h)\|\_{\mathbf{X}'}.$$


**Remark.** The functions that are added to the space $\mathbf{X}\_N$ in step 5 are derived from $\mathbf{U}\_{N,h}^{n^\*}$ and not from the truth solution $\widetilde{\mathbf{U}}\_h^{n^\*}$. We have no guarantee that $\mathbf{U}\_{N,h}^{n^\*}$ is close to $\widetilde{\mathbf{U}}\_h^{n^\*}$ or that it belongs to the low dimensional manifold $\mathcal{M}\_h^{\mu}$ of the truth solutions. We also remark that, even though we are trying to reduce the error $\widetilde{\mathbf{U}}\_N^{n+1} - \mathbf{U}\_{N,h}^{n+1}$, to date we have no proof that the algorithm converges. In fact, we cannot theoretically prove that

$$\|\widetilde{\mathbf{U}}\_N^{n+1} - \widetilde{\mathbf{U}}\_h^{n+1}\|\_\* \le \|\widetilde{\mathbf{U}}\_{N-1}^{n+1} - \widetilde{\mathbf{U}}\_h^{n+1}\|\_\*. \tag{19}$$

Because of this lack of theoretical convergence results, we stop the greedy enrichment procedure after a fixed number of solutions $N\_{max}$ chosen a priori, instead of using a tolerance on the a posteriori error estimator. Nevertheless, in the next section we show some numerical evidence that the greedy enrichment is able to improve the quality of the approximation space.
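The ingredients described in this section (reduced time marching, residual-based selection of the worst time index, one intermediate high-fidelity solve as in (17), and a fixed number of enrichment steps) can be put together in a rough sketch. It is written for a single-field linear toy problem of the form $(M/\Delta t + K)\,u^{n+1} = (M/\Delta t)\,u^{n} + f^{n+1}$, uses the Euclidean norm in place of the dual norm, and adds one function per iteration instead of a velocity, pressure and supremizer triplet. All names are hypothetical and the loop is an assumption-laden illustration, not the authors' algorithm.

```python
import numpy as np

def run_reduced(A, Mdt, F, V, u0):
    """March the reduced problem in time and return the lifted states."""
    A_N = V.T @ A @ V                       # reduced operator
    states = [u0.copy()]                    # initial state taken as given
    for f in F:
        rhs = Mdt @ states[-1] + f          # data from previous reduced state
        states.append(V @ np.linalg.solve(A_N, V.T @ rhs))
    return np.array(states)

def greedy_enrich(A, Mdt, F, V, u0, n_max):
    """Add up to n_max functions selected by the largest residual norm."""
    for _ in range(n_max):
        U = run_reduced(A, Mdt, F, V, u0)
        # residual at step n+1, evaluated with reduced data at step n
        res = [np.linalg.norm(Mdt @ U[n] + F[n] - A @ U[n + 1])
               for n in range(len(F))]
        n_star = int(np.argmax(res))
        # intermediate problem: full-order solve with reduced data
        w = np.linalg.solve(A, Mdt @ U[n_star] + F[n_star])
        for _pass in range(2):              # Gram-Schmidt, two passes
            w = w - V @ (V.T @ w)
        nrm = np.linalg.norm(w)
        if nrm < 1e-12:                     # nothing new to add
            break
        V = np.column_stack([V, w / nrm])
    return V

# Toy problem (assumed data): 1D Laplacian stiffness, identity mass
rng = np.random.default_rng(1)
n_dofs, dt, n_steps = 30, 0.01, 40
K = 2.0 * np.eye(n_dofs) - np.eye(n_dofs, k=1) - np.eye(n_dofs, k=-1)
Mdt = np.eye(n_dofs) / dt
A = Mdt + K
b1, b2 = rng.standard_normal(n_dofs), rng.standard_normal(n_dofs)
F = [np.sin(2 * np.pi * (k + 1) * dt / 0.4) * b1
     + np.cos(2 * np.pi * (k + 1) * dt / 0.4) * b2 for k in range(n_steps)]
V0, _ = np.linalg.qr(rng.standard_normal((n_dofs, 2)))
V_enriched = greedy_enrich(A, Mdt, F, V0, np.zeros(n_dofs), n_max=3)
```

Consistently with the remark above, nothing in this loop guarantees that the true error decreases monotonically, which is why the sketch stops after a fixed number of additions rather than on a tolerance.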

### 4. APPLICATION TO A FEMOROPOPLITEAL BYPASS

#### 4.1. Application and Motivation

Atherosclerotic plaques often occur in the femoral arteries. The obstruction of the blood flow results in a lower perfusion of the lower limbs, and the most common symptom of this disease is intermittent claudication, which affects 4% of people over the age of 55 years [34]. Different medical treatments are possible to restore the physiological blood circulation. In critical cases, the stenosis is treated with a surgical intervention that bypasses the obstruction using a graft, providing an alternative path for the blood flow. The bypass creates a side-to-end anastomosis between the graft and the upstream artery (before the occlusion) and an end-to-side anastomosis with the distal downstream part. In particular, the design of the end-to-side anastomosis affects the flow downstream of the bypass and provokes remodeling phenomena inside the arterial wall. The arteries adapt their size in order to maintain a certain level of shear stress, which results in a thickening of the intima layer and in an increased risk of thrombus formation. The arterial wall remodeling is in fact linked with hemodynamic factors such as the wall shear stress magnitude and direction. Moreover, velocity profiles and flow separation have been investigated when studying the bypass end-to-side anastomosis [35, 36]. Studies with idealized geometrical models have been proposed in order to define an optimal design for the anastomosis [37]. Nevertheless, the geometry of the vessel is one of the most important factors affecting the pattern of the wall shear stress, so patient-specific data are required to analyse each particular case.

We focus our attention on a patient-specific femoropopliteal bypass performed with a venous graft bridging the circulation from the femoral artery to the popliteal one. As a domain of interest we select the end-to-side anastomosis (see **Figure 1**). The geometry was reconstructed from CT-scan images as detailed in Marchandise et al. [38], and the inlet and outlet flow rates are taken from the experimental data in Marchandise et al. [38]. We compute the Reynolds number as $Re = \frac{4 \rho\_f Q\_{in}}{\pi D \mu}$, where $\rho\_f$ is the blood density, $Q\_{in}$ the inlet flow rate, $D$ the vessel diameter and $\mu$ the blood viscosity. The average Reynolds number ranges from 144 to 380 (at the systolic peak), in agreement with the values provided in Loth et al. [36].
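For completeness, the flow-rate form of the Reynolds number used here follows from the usual definition based on the mean velocity $U$, under the assumption (not stated in the text) of a circular cross-section of diameter $D$:

$$Re = \frac{\rho\_f U D}{\mu}, \qquad U = \frac{Q\_{in}}{\pi D^2 / 4} \quad \Longrightarrow \quad Re = \frac{\rho\_f D}{\mu} \cdot \frac{4 Q\_{in}}{\pi D^2} = \frac{4 \rho\_f Q\_{in}}{\pi D \mu}.$$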

#### 4.2. Test Case

#### 4.2.1. Application of the POD Algorithm

In this section we investigate the behavior of the POD and the greedy enriched POD algorithms on a case representing the femoropopliteal bypass application where the finite element resolution is performed on a coarse mesh. The use of a coarse grid allows us to lower the offline computational costs and, thus, to test and compare several reduced basis approximations. Since we are interested in the realistic application of the femoropopliteal bypass, the physical parameters and boundary data are patient-specific. The coarse mesh is composed of 5,823 tetrahedra and 1,309 vertices. To obtain the high fidelity solutions of the RFSI model we use standard P<sup>1</sup>+Bubble-P<sup>1</sup> finite elements, for a total of 22,702 degrees of freedom. The boundary conditions are periodic with a period of 0.8 s (one heartbeat). We set the solution at time $t = 0$ equal to zero. To remove the dependence on this initial condition, we simulate an entire heartbeat and focus on the solutions obtained for the subsequent heartbeat. Thus, to test the POD reduction algorithm, we compute the high fidelity numerical solutions for a time lapse corresponding to the second heartbeat, from $t\_0 = 0.8$ s to $t\_{N\_T} = 1.6$ s, with a time step $\Delta t = 0.001$ s, for a number of time intervals $N\_T = 800$. We denote with the superscript $n \in \mathcal{N}\_T$, varying from 0 to $N\_T$, the sequence of computed solutions:

$$\mathbf{U}\_h^n \approx \mathbf{U}\_h(t\_n) \quad \text{where } t\_0 = 0.8, t\_1 = t\_0 + \Delta t, t\_2 = t\_0 + 2\Delta t,$$

$$t\_3 = t\_0 + 3\Delta t, \dots, t\_{N\_T} = 1.6\text{s}.$$

We save the finite element solutions every five time steps and we use the superscript $n\_S \in \mathcal{N}\_S$, $n\_S = 5k$, with $k = 1, \ldots, N\_S$ ($N\_S = 160$), to address the stored functions, which represent the snapshot sample:

$$\mathbf{U}\_{h}^{n\_{S}} \approx \mathbf{U}\_{h}(t\_{n\_{S}}) \quad \text{where } t\_{5} = 0.805, t\_{10} = 0.810, t\_{15} = 0.815,$$

$$t\_{20} = 0.820, \dots, t\_{n\_{N\_{S}}} = 1.6 \text{s}.$$

We then compute the POD starting from the 160 snapshots $\mathbf{U}\_h^{n\_S}$, $n\_S \in \mathcal{N}\_S$, which represent 20% of the finite element solutions computed for the second heartbeat. To check the quality of the reduced space approximations, we monitor the following errors:

• relative error of the velocity at time $t\_{n\_S}$ and corresponding space-time error:

$$\varepsilon\_{N}(\mathbf{u}^{n\_{S}}) := \frac{\|\mathbf{u}\_{N}^{n\_{S}} - \mathbf{u}\_{h}^{n\_{S}}\|\_{\mathbf{V}}}{\|\mathbf{u}\_{h}^{n\_{S}}\|\_{\mathbf{V}}} \quad \text{and}$$

$$E\_{N}(\mathbf{u}) := \frac{\left(\sum\_{n\_{S} \in \mathcal{N}\_{S}} \|\mathbf{u}\_{N}^{n\_{S}} - \mathbf{u}\_{h}^{n\_{S}}\|\_{\mathbf{V}}^{2}\right)^{1/2}}{\left(\sum\_{n\_{S} \in \mathcal{N}\_{S}} \|\mathbf{u}\_{h}^{n\_{S}}\|\_{\mathbf{V}}^{2}\right)^{1/2}};\tag{20}$$

• relative error of the pressure at time $t\_{n\_S}$ and corresponding space-time error:

$$\begin{split} \varepsilon\_{N}(p^{n\_{S}}) &:= \frac{\|p\_{N}^{n\_{S}} - p\_{h}^{n\_{S}}\|\_{L^{2}(\Omega)}}{\|p\_{h}^{n\_{S}}\|\_{L^{2}(\Omega)}} \quad \text{and} \\ E\_{N}(p) &:= \frac{\left(\sum\_{n\_{S} \in \mathcal{N}\_{S}} \|p\_{N}^{n\_{S}} - p\_{h}^{n\_{S}}\|\_{L^{2}(\Omega)}^{2}\right)^{1/2}}{\left(\sum\_{n\_{S} \in \mathcal{N}\_{S}} \|p\_{h}^{n\_{S}}\|\_{L^{2}(\Omega)}^{2}\right)^{1/2}};\end{split} \tag{21}$$

• space-time dual norm of the residual scaled with respect to the space-time norm of the global solution

$$R\_N(\mathbf{U}) := \left(\frac{N\_S}{N\_T}\right)^{1/2} \frac{\left(\sum\_{n \in \mathcal{N}\_T} \|r\_N^n(\mathbf{W}\_h)\|\_{\mathbf{X}'}^2\right)^{1/2}}{\left(\sum\_{n\_S \in \mathcal{N}\_S} \|\mathbf{U}\_h^{n\_S}\|\_{\mathbf{X}}^2\right)^{1/2}}.\tag{22}$$
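The per-snapshot and space-time relative errors of the form (20) and (21) can be evaluated as in the following sketch, where the norm is induced by a symmetric positive definite matrix `X`. The array layout (one snapshot per row) and the names `x_norm` and `relative_errors` are assumptions made for illustration.

```python
import numpy as np

def x_norm(v, X):
    """Norm induced by the SPD matrix X."""
    return float(np.sqrt(v @ (X @ v)))

def relative_errors(U_N, U_h, X):
    """Per-snapshot relative errors and the space-time relative error.

    U_N, U_h : arrays of shape (n_snapshots, n_dofs) holding the reduced
               and high-fidelity solutions at the same time instants.
    """
    diff = np.array([x_norm(a - b, X) for a, b in zip(U_N, U_h)])
    ref = np.array([x_norm(b, X) for b in U_h])
    eps = diff / ref                                   # one value per snapshot
    E = np.sqrt(np.sum(diff**2)) / np.sqrt(np.sum(ref**2))
    return eps, float(E)

# Toy check with assumed data: a uniform 10% perturbation
U_h = np.array([[3.0, 4.0], [0.0, 5.0]])
U_N = 1.1 * U_h
eps, E = relative_errors(U_N, U_h, np.eye(2))
```

The same routine applies to velocity and pressure separately by passing the matrix of the corresponding scalar product, (11) or (12).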

We build a sequence of POD reduced spaces with decreasing values of the tolerance $tol$ and we compute the aforementioned indicators for each of the reduced spaces generated. The space-time errors are reported in **Table 1**. In particular, we show: the number of selected velocity modes (#**u** basis); the number of selected pressure modes (#p basis); the total number of basis functions composing the reduced space (# basis = #**u** basis + 2 × #p basis); the space-time errors and residuals as defined above. Since the problem at hand is unsteady and the solution at a time instant $t\_n$ depends on the solutions at the previous instants, the POD model errors $E\_N(\mathbf{u})$ and $E\_N(p)$ are not bounded from above by the fixed tolerance; they are, however, of the same order of magnitude (see **Table 1**). We notice that, even if $\|r\_N^n(\mathbf{W}\_h)\|\_{\mathbf{X}'}$ does not represent an upper bound for the error, the numerical results suggest that it can be used as an indicator of $\|\mathbf{U}\_N^{n\_S} - \mathbf{U}\_h^{n\_S}\|\_{\mathbf{X}}$ (see **Figure 2**). The apparent strong correlation between the dual norm of the residual and the global error norm is probably due to the strong contribution of the mass term in the unsteady formulation. Indeed, if we choose a time step of 0.001, the mass matrix is multiplied by a factor of $10^3$. We remark that the magnitudes of the absolute errors for the velocity span from $10^{-1}$ to $10^{2}$, and the associated velocity solution norms are of order $10^2$ to $10^3$. For the pressure, we have absolute errors of order $10^0$ to $10^3$, while the solution norms are of order $10^3$ to $10^5$. Thus, in absolute terms the global error is mostly determined by the pressure error.

#### 4.2.2. Application of the Greedy Enriched POD Algorithm

The POD algorithm provided satisfactory results and we were able to reduce the approximation space dimension from $10^5$ to 10-100. In this section we compare the greedy enrichment algorithm with the POD one, in order to understand whether using basis functions other than POD modes provides the same quality of reduced approximation. Thus, we compare the magnitude of the reduced approximation errors obtained using a reduced space generated through a standard POD algorithm with those obtained using the POD coupled with the greedy enrichment introduced in section 3.2. We recall that the snapshot sample represents a subset of the time instants we solve in the unsteady simulation: we store only 20% of the computed time-instant solutions. As there is no error bound available, we use the dual norm of the residual as a surrogate. This is a rough approximation, also because the real error includes time integration effects, while the dual norm of the residual can only represent a space error. Of course, we do not expect the greedy enriched POD to perform better; on the contrary, it can have (and actually has) limitations.

**Remark.** We are interested in simulating a fluid-dynamics phenomenon with cyclic inputs. Typically, in hemodynamics applications, we are interested in several heartbeats. Thus, instead of performing the greedy search on a single heartbeat only, we exploit as much as we can the information on the truth solutions coming from the snapshots. For each snapshot $\widetilde{\mathbf{U}}\_h^{n\_S}$, we perform a simulation that starts from the initial time $t\_{n\_S}$ and ends at $t\_{n\_S + N\_T} = t\_{n\_S} + 0.8$. We define a vector index $\mathbf{n} = (n\_T, n\_S)$ with $n\_T = n\_S + n$, such that $\widetilde{\mathbf{U}}\_h^{\mathbf{n}} = \widetilde{\mathbf{U}}\_h^{n\_S, n\_T}$ is the approximate solution at time $t\_{n\_T}$ obtained starting from the initial condition $\widetilde{\mathbf{U}}\_h^{n\_S}$. We define the set of indexes $\mathcal{N} = \{(n\_T, n\_S): n\_T = n\_S + n,\ n \in \mathcal{N}\_T \text{ and } n\_S \in \mathcal{N}\_S\}$. The generalization of the greedy enrichment presented in Section 3.2 is straightforward: it suffices to substitute $n$ with $\mathbf{n}$. In particular, the selection of the worst approximated index $n^\*$ in the greedy enrichment can be generalized as follows:

$$\mathbf{n}^\* = \arg\max\_{\mathbf{n}\in\mathcal{N}} \|r\_{N}^{\mathbf{n}}(\mathbf{W}\_h)\|\_{\mathbf{X}'}.$$


FEM solutions obtained on a coarse mesh, bypass application. Second column: number of selected velocity modes (#u basis). Third column: number of selected pressure modes (#p basis). Fourth column: total number of basis functions composing the reduced space (# basis = #u basis + 2 × #p basis).

In order to initialize the greedy enrichment algorithm we compute a POD basis fixing the tolerance tol = 1e−5 (50 velocity modes, 3 pressure modes, 3 supremizers). To compare the POD approximation with the greedy enriched one, we augment the initial reduced space with two strategies. On the one hand, we apply the greedy enrichment and, at each iteration, we add the triplet of functions selected by the largest dual norm of the residual in space. On the other hand, we augment the basis by adding, at each iteration, one POD mode for the velocity and one POD mode for the pressure with its associated supremizer. In both cases, each iteration increases the dimension of the reduced space by three. The results obtained using only POD modes are displayed in black and labeled POD, while the results obtained with the greedy enrichment are shown in red and labeled Greedy enriched POD (see **Figure 3**).

From **Figure 3**, we note that the errors decrease more slowly with the greedy enrichment algorithm than when adding POD modes. Nevertheless, both the space-time pressure error and the residuals are comparable when adding POD modes or greedy basis functions (see **Figures 3B,C**). On the contrary, the velocity error decreases much more slowly with the greedy enrichment than with additional POD modes. We recall, however, that the residual is mostly related to the pressure error component.

#### 4.3. Realistic Case

#### 4.3.1. Application of the POD Algorithm

In this section we perform a discretization reduction of the RFSI model applied to the femoropopliteal bypass case, where the high fidelity approximations are computed using a fine mesh. As before, a parabolic velocity profile is imposed at the inlet section and a mean pressure condition at the outlet. The P<sup>1</sup>+Bubble-P<sup>1</sup> discretization yields 1,410,475 degrees of freedom on the fine mesh. We first test the discretization reduction using a standard POD procedure: we compute the high fidelity numerical solutions for two heartbeats with a time step $\Delta t = 0.001$ and we store the ones related to the second heartbeat every five time steps. Thus, $\mathcal{N}\_T = \{1, 2, 3, \ldots, 800\}$ and $\mathcal{N}\_S = \{5, 10, 15, \ldots, 800\}$. We compute the Gramian matrices associated with the 160 snapshots $\mathbf{U}\_h^{n\_k}$, separating the velocity and pressure components. We denote by $\lambda\_k^{\mathbf{u}}$ and $\lambda\_k^{p}$, for $k = 1, \ldots, N\_S$, the eigenvalues associated with the decomposition of the correlation matrices of the velocity and pressure, respectively (see section 3.1). In both cases they decrease exponentially fast. The eigenvalues $\lambda\_k^{p}$ associated with the pressure snapshots (**Figure 4B**) decrease faster than the ones associated with the velocity (**Figure 4A**). Thus, fixing the same tolerance, we expect fewer pressure modes than velocity modes to be selected.

We compute the POD reduced spaces using three different values for the tolerance: tol ∈ {1e−4, 1e−5, 1e−6}. As was done in section 4.2, in **Table 2** we record the selected number of modes and we compute the space-time errors $E\_N(\mathbf{u})$ and $E\_N(p)$ of the velocity and pressure, respectively. By taking advantage of the generated reduced space, at each time iteration we solve the reduced system and we compute a linear functional of the approximate solution that evaluates the outlet flow rate. We record the computational time associated with the offline and online computations in **Table 3**. The resolution of the reduced problem combined with the evaluation of a linear output functional is performed in almost real time: using 35 basis functions we solve 1.6 physical seconds in 0.8 computational seconds, while a ten-heartbeat simulation (8 physical seconds) using the POD space with 54 basis functions takes 12.6 s on a notebook. Performing the same simulation with the high fidelity model would have taken around 40 h on 256 processors of a supercomputer. The offline costs of the POD reduction (without the snapshot generation) are reported in the third column of **Table 3**. We remark that most of this time is spent in the generation of the structures for the residual evaluation.

TABLE 2 | Number of basis functions and space-time errors for the velocity and pressure.

FEM solutions obtained on a fine mesh, bypass application.

TABLE 3 | CPU time $\mathbf{X}\_N^{POD}$: offline computational costs for the generation of the POD reduced spaces (without the finite element computations) on 512 processors; CPU time 2HB - RB: online computational time corresponding to the simulation of 2 heartbeats (2HB) on a personal laptop; CPU time 2HB - FE: finite element computational time corresponding to the simulation of 2 heartbeats (2HB) on 256 processes on a supercomputer.

Note that the POD model errors $E\_N(\mathbf{u})$ and $E\_N(p)$ decrease significantly when increasing the number of basis functions, as reported in **Table 2**. Once again, we notice that the dual norm of the residual $\|r\_N^{n+1}(\mathbf{W}\_h)\|\_{\mathbf{X}'}$ is a good indicator of the approximation error $\|\mathbf{U}\_N^{n\_k} - \mathbf{U}\_h^{n\_k}\|\_{\mathbf{X}}$ (see **Figure 5**).

In the femoropopliteal bypass application, we are also interested in measuring the errors on the outputs of interest. With $\boldsymbol{\sigma}^{n\_S}$ the stress tensor and $\mathbf{n}$ the normal vector to the surface $\Gamma$, we compute the wall shear stress as $\boldsymbol{\tau}^{n\_S} := \boldsymbol{\sigma}^{n\_S}\mathbf{n} - (\boldsymbol{\sigma}^{n\_S}\mathbf{n} \cdot \mathbf{n})\mathbf{n}$ and we also consider the averaged wall shear stress on a generic area $A$: $\tau\_A^{n\_S} = \frac{1}{A} \int\_A |\boldsymbol{\tau}^{n\_S}| \, dA$. We remark that, to properly estimate the selected output of interest, we need accurate high fidelity solutions with a mesh refined at the wall, as shown by Marchandise et al. [38]; the fine mesh used in this work is similar to the fine one used in that paper. Reducing the dimension of the finite element space does not lead to the same results as reducing the degrees of freedom through the POD decomposition. In fact, the wall shear stress values computed using a coarse finite element space considerably underestimate the values obtained with the fine grid (see **Figure 6**), while the results obtained with the POD reduced approximation mostly overlap with the ones computed with the fine finite element discretization.
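The wall shear stress definition above removes the normal component from the traction vector. A minimal sketch of this computation is given below; the per-facet, piecewise-constant averaging in `area_averaged_wss` is an assumption of the sketch, as are the function names and the sample stress tensor.

```python
import numpy as np

def wall_shear_stress(sigma, n):
    """Tangential part of the traction: sigma n - (sigma n . n) n."""
    n = n / np.linalg.norm(n)          # make sure the normal is a unit vector
    traction = sigma @ n
    return traction - (traction @ n) * n

def area_averaged_wss(sigmas, normals, areas):
    """Area-weighted mean of |tau| over a set of wall facets.

    Assumes stress and normal are constant on each facet.
    """
    mags = np.array([np.linalg.norm(wall_shear_stress(s, n))
                     for s, n in zip(sigmas, normals)])
    areas = np.asarray(areas, dtype=float)
    return float(np.sum(mags * areas) / np.sum(areas))

# Assumed sample data: pressure 2 and shear 0.5 on a wall with normal e_y
sigma = np.array([[-2.0, 0.5, 0.0],
                  [0.5, -2.0, 0.0],
                  [0.0, 0.0, -2.0]])
normal = np.array([0.0, 1.0, 0.0])
tau = wall_shear_stress(sigma, normal)
tau_A = area_averaged_wss([sigma, sigma], [normal, normal], [1.0, 3.0])
```

By construction the resulting vector is orthogonal to the normal, so the isotropic pressure part of the stress does not contribute to it.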

#### 4.3.2. Percentage of Flow Coming From the Occluded Artery

In many cases, the artery is not completely occluded and a residual flow is still provided by the original vessel. Studying the distribution of fluid-dynamic quantities could be important to identify, for example, whether it would be better to surgically close the original artery. Our approach allows us to study, at low computational cost, the variation of the flow and wall shear stress for different values of the flow percentage coming from the occluded artery.

Here the percentage of flow coming from the occluded artery is an additional parameter. We solve the high fidelity model for two extreme cases: $\mu\_1 = 0\%$ (full occlusion) and $\mu\_2 = 50\%$ of flow. Using a POD algorithm with $tol = 10^{-5}$, we construct the reduced space $\mathbf{X}\_{N\_1}^{POD,\mu\_1}$ of dimension $N\_1$, associated with the snapshots computed with $\mu\_1$. We obtain $N\_1 = 54$, where 48 basis functions are velocity modes, 3 are pressure modes and 3 are supremizers. Then, considering the snapshots associated with $\mu\_2$, we build a second POD reduced space $\mathbf{X}\_{N\_2}^{POD,\mu\_2}$ of dimension $N\_2$, also fixing $tol = 10^{-5}$. We obtain $N\_2 = 55$, where 47 basis functions are velocity modes, 4 are pressure modes and 4 are supremizers. Finally, we check that the basis functions are linearly independent and orthonormalize the basis of $\mathbf{X}\_{N\_2}^{POD,\mu\_2}$ with respect to $\mathbf{X}\_{N\_1}^{POD,\mu\_1}$. Indeed, none of the basis functions obtained for $\mu\_2 = 0.5$ is a linear combination of the basis functions related to $\mu\_1 = 0$, and the final reduced space $\mathbf{X}\_N$ has dimension $N = N\_1 + N\_2 = 109$.
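Merging two reduced bases with a linear independence check can be done with a Gram-Schmidt sweep in the chosen scalar product, as sketched below. The function `merge_bases`, the relative drop tolerance and the sample vectors are hypothetical; the first basis is assumed to be already orthonormal with respect to `X`.

```python
import numpy as np

def merge_bases(V1, V2, X, tol=1e-10):
    """Orthonormalize the columns of V2 against V1 in the X scalar product.

    Columns of V2 that are (numerically) linear combinations of the
    functions already collected are dropped.
    """
    basis = [np.array(V1[:, j], dtype=float) for j in range(V1.shape[1])]
    for j in range(V2.shape[1]):
        w = np.array(V2[:, j], dtype=float)
        w0 = np.sqrt(w @ (X @ w))            # norm before orthogonalization
        for _pass in range(2):               # two passes for stability
            for b in basis:
                w = w - (b @ (X @ w)) * b
        nrm = np.sqrt(w @ (X @ w))
        if nrm > tol * w0:                   # keep only independent directions
            basis.append(w / nrm)
    return np.column_stack(basis)

# Assumed sample data in R^4 with the Euclidean scalar product
e = np.eye(4)
V1 = e[:, :2]                                            # e1, e2
V2 = np.column_stack([e[:, 0] + e[:, 2], e[:, 1], e[:, 3]])
V = merge_bases(V1, V2, np.eye(4))
```

In this toy example the second column of `V2` already lies in the span of `V1` and is discarded, whereas in the application described above all functions of the second basis turn out to be independent of the first.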

We then choose a parameter µ<sup>3</sup> = 25% of flow coming from the occluded artery. We focus on the velocity profiles near the systolic peak and in the early diastole (see **Figure 7**) and the averaged wall shear stress (see **Figure 8**). We compare the target outputs obtained using the finite element approximation (label: FEM 25%) and the reduced ones (label: POD 25%). Moreover, we display the selected quantities also for the high fidelity solutions obtained with µ<sup>1</sup> = 0 (label: FEM 0%), µ<sup>2</sup> = 0.5 (label: FEM 50%) in order to clarify how the system dynamics changes when different values of the percentage are considered.

During the systolic phase the reduced solutions reproduce the high fidelity ones well, while during the diastole the differences are more visible, in particular at some locations such as near the anastomosis between the arterial vessel and the bypass graft (see **Figure 7**). We remark that the velocity profiles for different values of the parameter differ significantly from each other. Our reduced solution approximates well the value of the wall shear stress associated with $\mu\_3$ (see **Figure 8**). We can appreciate the good agreement between the reduced and the high fidelity results, even when the values of wall shear stress associated with $\mu\_3$ are not between those associated with $\mu\_1$ and $\mu\_2$ (see **Figure 8B**). Moreover, the approximation of the wall shear stress using a finite element approximation with a coarse grid leads to a consistent underestimation of its values (see **Figure 8C**).

#### 4.3.3. Application of the Greedy Enriched Algorithm With Perturbed Data

In this section we apply the greedy enrichment in the case of perturbed boundary data. In particular, as in (9), we introduce a parameter in the inlet flow rate function representing a small perturbation with respect to a reference value. The perturbation function θ(α, t) is defined as follows:

$$\theta(\alpha, t) = 1 + \alpha \sin\left(\frac{2\pi t}{0.8}\right)$$

where α is assumed to vary between 0 and 0.2. Thus, the maximum relative difference with respect to the original flow rate is 20%. We denote by **Ũ**<sup>n</sup><sub>∗</sub>(α), with ∗ = {h}, {N} or {N, h}, the numerical solutions at the time instant t<sup>n</sup> that depend on the parameter α, and by r<sup>n</sup><sub>N</sub>(**W**<sub>h</sub>; α) the residuals. In the perturbed case, the algorithm steps in section 3.2 are modified as follows. First we perform a POD algorithm fixing α = α<sub>1</sub> = 0.0; the resulting reduced space is denoted by **X**<sup>α1,POD</sup><sub>N</sub>, where the superscript α<sub>1</sub> denotes the choice of the α parameter. Then, we set α<sub>2</sub> = 0.2:


The essential modification is that the initial POD is computed for α = α<sub>1</sub> = 0, while the greedy enrichment is performed fixing α = α<sub>2</sub> = 0.2. The resulting reduced space **X**<sub>N</sub> aims to represent a suitable space of approximation for both values of α. In the parametrized case, by using greedy enrichment we aim at saving a part of the offline computational costs: indeed, in a standard POD-Greedy procedure (see [39]), each new evaluation of the parameter α requires the computation of the associated finite element solutions for each time instant n ∈ NT. In our application, this would require about 8 h on 256 processors. Instead, during the greedy enrichment, we perform only one finite element resolution for a single time step, while the remaining computations are dedicated to reduced basis structures.
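The perturbation function θ(α, t) defined above can be evaluated with a few lines of Python. This is only an illustrative sketch: it assumes that θ multiplies a reference inlet flow rate, and the helper `perturbed_flow_rate` and the sampling grid are hypothetical names introduced here, not part of the original implementation.

```python
import numpy as np

PERIOD = 0.8  # s, period appearing in the definition of theta(alpha, t)


def theta(alpha, t):
    """Perturbation factor theta(alpha, t) = 1 + alpha * sin(2*pi*t / 0.8)."""
    return 1.0 + alpha * np.sin(2.0 * np.pi * t / PERIOD)


def perturbed_flow_rate(q_ref, alpha, t):
    """Hypothetical helper: scale a reference flow rate q_ref(t) by theta.

    Assumes the perturbation acts multiplicatively on the reference value.
    """
    return theta(alpha, t) * q_ref


# Sample one period and measure the largest relative deviation for alpha = 0.2.
t = np.linspace(0.0, PERIOD, 801)
max_rel_diff = np.max(np.abs(theta(0.2, t) - 1.0))
print(f"maximum relative difference for alpha = 0.2: {max_rel_diff:.3f}")
```

For α = 0 the factor is identically one, so the reference flow rate is recovered, while α = 0.2 gives the 20% maximum relative difference quoted in the text.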

We test the greedy enrichment algorithm by initializing it with two different starting POD reduced spaces: in one case we consider the modes selected with tol = 10<sup>−4</sup> (35 POD basis functions) and in the other one we consider the POD modes corresponding to tol = 10<sup>−5</sup> (54 POD basis functions). In the first case, we enrich the space **X**<sup>α1,POD</sup><sub>35</sub> by adding 8 triplets selected by the greedy algorithm; we obtain the reduced space **X**<sub>59</sub>. In the second case, starting from **X**<sup>α1,POD</sup><sub>54</sub>, we enrich the space by adding 12 triplets, obtaining **X**<sub>90</sub>. All the errors and residuals computed and shown below refer to the solutions obtained with α<sub>2</sub> = 0.2. In particular, in **Table 4** we report the velocity and pressure errors generated by the greedy enriched reduced spaces as well as the ones obtained with the standard POD ones. Moreover, we compute the space-time dual norm of the residual, scaled by the solution norm (sixth column of **Table 4**).

We note that the space-time velocity error does not decrease significantly either when adding greedy basis functions or when increasing the number of selected POD modes. For the pressure, the greedy enrichment reduces the error more than adding POD modes does. The space-time dual norm of the residual is also smaller for the greedy enriched space than for the POD ones.

TABLE 4 | Number of basis functions and space-time errors for the velocity and pressure.

Femoropopliteal bypass application in which the high fidelity solutions are obtained using a finite element approximation on a fine mesh.

Regarding the offline costs, to generate the space **X**<sub>59</sub> starting from **X**<sup>0,POD</sup><sub>35</sub>, we perform 8 iterations of the greedy enrichment algorithm: this takes 82 min on 512 processors, where most of the time is devoted to the generation of the reduced structures for the residual evaluation. We remark that computing a standard POD reduced space for the parameter evaluation corresponding to α = 0.2 would require about 8 h on 256 processors for the finite element computation of two periods, plus about 1 h on 512 processors for the generation of the reduced space itself.
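To put the two offline costs on a common scale, the wall-clock times above can be converted to core-hours. The sketch below assumes that cost is simply wall time multiplied by processor count, which is a crude measure introduced here for illustration; the function name `core_hours` is hypothetical.

```python
def core_hours(minutes, processors):
    """Crude cost measure: wall-clock time (in hours) times processor count."""
    return minutes / 60.0 * processors


# Greedy enrichment: 8 iterations, 82 min on 512 processors.
greedy_cost = core_hours(82, 512)

# Standard POD for alpha = 0.2: about 8 h on 256 processors for the finite
# element solve, plus about 1 h on 512 processors to build the reduced space.
pod_cost = core_hours(8 * 60, 256) + core_hours(60, 512)

print(f"greedy enrichment: {greedy_cost:.0f} core-hours")
print(f"standard POD:      {pod_cost:.0f} core-hours")
print(f"ratio POD/greedy:  {pod_cost / greedy_cost:.1f}")
```

Under this assumption the greedy enrichment costs roughly 700 core-hours against 2560 core-hours for a new POD space, a saving of a factor of about 3.7.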

To explain why we obtain better results for the pressure than for the velocity, we investigate the absolute values of the velocity, pressure and global solution errors and compare them to the dual norms of the residuals (see **Figure 9**). Since the velocity and pressure norms have different magnitudes (10–10<sup>2</sup> for the velocity and 10<sup>3</sup>–10<sup>5</sup> for the pressure), the absolute pressure errors are larger than the velocity ones, even if the relative errors are lower. The greedy procedure selects the worst approximated time instant based on the dual norms of the residuals, and these quantities are indicators of the global absolute errors. Since the latter are mostly due to the pressure error, this can explain why the greedy enrichment provides better results for the pressure than for the velocity.

#### 5. CONCLUSIONS

In this work we presented an application of reduced order modeling to an RFSI problem, that is, an unsteady Navier-Stokes problem with generalized Robin boundary conditions. We detailed how an affine decomposition with respect to boundary data varying in time can be obtained under suitable hypotheses. Moreover, we presented and detailed how the POD can be applied to the RFSI problem in order to take into account the different orders of magnitude of the variables. We discussed the introduction of the supremizer functions inside the reduced basis, necessary to include the pressure in the reduced system. Afterwards, we proposed an enrichment of the POD reduced basis based on a greedy algorithm. All the algorithms presented were then numerically tested on a realistic hemodynamics problem. We tested the POD and greedy enrichment algorithm on two cases: a test case, where the finite element solution is obtained with a coarse grid, and a fine case, where the finite element space has on the order of 10<sup>6</sup> degrees of freedom. The results showed the good performance of the POD reduction algorithm on the RFSI problem, also with respect to the evaluation of a specific hemodynamics target output (wall shear stress). Moreover, we provided numerical evidence of how the reduced approximation can be improved using the greedy enrichment algorithm, in particular regarding the pressure error. The different behavior of the velocity and pressure errors is due to the use of the dual norm of the residual as an indicator of the global solution error. Indeed, since we do not have suitable a-posteriori error estimators, one for the velocity and one for the pressure variables, we measure the dual norm of the residual as a surrogate estimator. Since the pressure variable and the corresponding error are two orders of magnitude greater than the velocity ones, the residual is in practice an indicator of the pressure errors. Nevertheless, even in the absence of theoretical results, numerical experiments showed that the greedy enrichment is able to improve the quality of the reduced approximation, allowing us to save computational time. The development of suitable a-posteriori error estimators for the pressure and velocity in the case of the RFSI problem would be required to improve the performance of the greedy enrichment.

FIGURE 9 | Dual norms of the residuals and norms of the global errors with respect to time for different choices of the POD tolerance tol. Femoropopliteal bypass application in which the high fidelity solutions are obtained using a finite element approximation on a fine mesh. (A) 54 Basis - POD. (B) 78 Basis - POD. (C) 90 Basis - Greedy enriched POD.

#### AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

#### ACKNOWLEDGMENTS

The authors acknowledge support of the European Research Council under the Advanced Grant ERC-2008-AdG 227058, Mathcard, Mathematical Modeling and Simulation of the Cardiovascular System, and of the FNS project 200021E-168311, Domain-Decomposition-Based Fluid Structure Interaction Algorithms for Highly Nonlinear and Anisotropic Elastic Arterial Wall Models in 3D. Moreover, we thank Prof. Bernard Haasdonk and Prof. Alfio Quarteroni for the insightful discussion. We gratefully acknowledge the Swiss National Supercomputing Center (CSCS) for providing the CPU resources for the numerical simulations under the projects IDs s475 and s796.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer, JC, and handling Editor declared their shared affiliation.

Copyright © 2018 Colciago and Deparis. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Validation of Patient-Specific Cerebral Blood Flow Simulation Using Transcranial Doppler Measurements

Derek Groen<sup>1</sup>\*, Robin A. Richardson<sup>2</sup>, Rachel Coy<sup>3</sup>, Ulf D. Schiller<sup>4,5</sup>, Hoskote Chandrashekar<sup>6</sup>, Fergus Robertson<sup>6</sup> and Peter V. Coveney<sup>2,3</sup>

*<sup>1</sup> Department of Computer Science, Brunel University London, London, United Kingdom, <sup>2</sup> Centre for Computational Science, University College London, London, United Kingdom, <sup>3</sup> Centre for Mathematics and Physics in the Life Sciences and Experimental Biology, University College London, London, United Kingdom, <sup>4</sup> Department of Materials Science and Engineering, Clemson University, Clemson, SC, United States, <sup>5</sup> School of Health Research, Clemson University, Clemson, SC, United States, <sup>6</sup> Lysholm Department of Neuroradiology, National Hospital for Neurology and Neurosurgery, University College London, London, United Kingdom*

#### Edited by:

*Alfons Hoekstra, University of Amsterdam, Netherlands*

#### Reviewed by:

*Tim David, University of Canterbury, New Zealand Jacopo Biasetti, CorWave SA, France*

> \*Correspondence: *Derek Groen derek.groen@brunel.ac.uk*

#### Specialty section:

*This article was submitted to Computational Physiology and Medicine, a section of the journal Frontiers in Physiology*

Received: *11 January 2018* Accepted: *24 May 2018* Published: *19 June 2018*

#### Citation:

*Groen D, Richardson RA, Coy R, Schiller UD, Chandrashekar H, Robertson F and Coveney PV (2018) Validation of Patient-Specific Cerebral Blood Flow Simulation Using Transcranial Doppler Measurements. Front. Physiol. 9:721. doi: 10.3389/fphys.2018.00721*

We present a validation study comparing results from a patient-specific lattice-Boltzmann simulation to transcranial Doppler (TCD) velocity measurements in four different planes of the middle cerebral artery (MCA). As part of the study, we compared simulations using a Newtonian and a Carreau-Yasuda rheology model. We also investigated the viability of using downscaled velocities to reduce the required resolution. Simulations with unscaled velocities predict the maximum flow velocity with an error of less than 9%, independent of the rheology model chosen. The accuracy of the simulation predictions worsens considerably when simulations are run at reduced velocity, as is for example the case when inflow velocities from healthy individuals are used on a vascular model of a stroke patient. Our results demonstrate the importance of using directly measured and patient-specific inflow velocities when simulating blood flow in MCAs. We conclude that localized TCD measurements together with predictive simulations can be used to obtain flow estimates with high fidelity over a larger region, and reduce the need for more invasive flow measurement procedures.

Keywords: lattice-Boltzmann, middle cerebral artery, computational fluid dynamics, transcranial Doppler, high performance computing, blood flow, validation study

### 1. INTRODUCTION

Computational fluid dynamics (CFD) has been widely applied by researchers to model blood flow in cerebral arteries and specifically within aneurysms (Cebral et al., 2011; Miura et al., 2013; Mountrakis et al., 2013; Byrne et al., 2014; Ouared et al., 2016). There is considerable interest in exploring the correlation between the dynamical properties of blood flow and clinical outcomes, with the long-term aim to provide a personalized, predictive simulation approach for aneurysm formation, growth, and/or rupture (Jou et al., 2008; Bernabeu et al., 2013; Xiang et al., 2014). When performing such simulations it is essential that computational models are able to deliver a realistic prediction of patient-specific flow velocities.


A range of simulation studies have been performed using patient-specific flow measurements derived from phase contrast magnetic resonance angiography (pc-MRA, see e.g., Boussel et al., 2008). However, Marzo et al. (2011) found that using this type of measurement provides limited accuracy benefits in comparison with modeled boundary conditions. The use of CFD in combination with transcranial Doppler (TCD) velocity measurements has been less extensively researched (see e.g., Hassan et al., 2004), primarily because reliable TCD measurements can only be made in a limited subset of the cerebral arteries. In addition, TCD measurements with handheld devices may contain errors if held at an incorrect angle (e.g., an underprediction of approximately 1.6% if the angle is off by 10 degrees). However, the excellent time resolution of TCD allows for a more reliable detection of peak velocities, and helps to establish more precise pulsatile flow profiles. Indeed, the maximum velocity values found by TCD are frequently around 30% higher than those found through pc-MRA (Chang et al., 2011; Meckel et al., 2013). In addition, TCD is non-invasive, unlike pc-MRA, and both are widely applied in clinical settings today.

Blood consists of blood cells which reside within a liquid medium known as blood plasma. Blood has a viscosity which decreases under shear flow (shear-thinning), unlike water which exhibits a constant Newtonian viscosity regardless of shear strain rate. Many well-known CFD studies of cerebrovascular blood flow are performed using a Newtonian fluid model (e.g., Cebral et al., 2011; Miura et al., 2013; Byrne et al., 2014), though recent research has found that such an assumption could lead to over-estimation of wall shear stresses (WSS) in cerebral arteries and aneurysms (Bernsdorf and Wang, 2009; Xiang et al., 2011; Khan et al., 2016). As a result, it can also alter the outcome of related diagnostic techniques such as three-band diagram analysis (Bernabeu et al., 2013), a technique proposed by Chatzizisis et al. (2008) to compare WSS at a specific location, over a period of time, to a set of pathological threshold values.

Existing CFD studies of cerebrovascular flow frequently derive inflow velocities not from the specific patient of interest, but from healthy subjects (e.g., Miura et al., 2013; Byrne et al., 2014) or idealized pulsatile profiles (Womersley flow, e.g., Castro et al., 2006; Alnæs et al., 2007; Cebral et al., 2011). However, blood flow velocities in middle cerebral arteries (MCA) from healthy subjects are typically much lower than those from stroke patients or patients suffering from hypertension. In this context Venugopal et al. (2007) found that mean WSS properties of simulations at Reynolds numbers (Re) below 200 do not correspond in any linear way to WSS properties of simulations at Re = 340–675. Itani et al. (2015) investigated how the mean, maximum, and minimum wall shear stress changes when a patient is subject to different levels of exercise intensity. They also found a non-linear relation between maximum inflow velocity and extracted WSS.

In this work, we simulate blood flow in a patient-specific MCA model using patient-specific TCD measurements as inflow boundary conditions, and compare our predictions against clinical measurements at four locations. Our simulations employ the lattice-Boltzmann method at high resolution, a technique which has been shown by Jain et al. (2016), among others, to be particularly well-suited for simulating cerebrovascular and aneurysmal blood flow. We perform simulations imposing the measured velocity from the individual patient at the inlet, and investigate how the choice of rheology model affects the predicted flow velocities throughout the MCA. In addition, we report on the accuracy of velocity predictions when running simulations with downscaled inlet velocities, and rescaling the velocities obtained from the measurement planes.

### 2. MATERIALS AND METHODS

To perform our simulations, we use the HemeLB software (Groen et al., 2013; Nash et al., 2014) for lattice Boltzmann simulations of blood flow in cerebral arteries. The lattice Boltzmann method (LBM) is a highly scalable simulation approach which uses a discretized kinetic model on a regular lattice to reproduce the dynamics of incompressible fluid flow. The LBM can be interpreted as a numerical solver for the Navier-Stokes equation with the advantage that it algorithmically separates the nonlinearity from the non-locality. Specific boundary conditions are applied to create accurate representations of fluid flow near vessel walls, as well as inflow and outflow boundaries. In our case, we adopt a 3-dimensional LBM which propagates fluid flow in 19 directions per grid point (D3Q19) using a BGK collision operator (see e.g., Succi, 2001 for details). For the boundary conditions, we used the Bouzidi model (Bouzidi et al., 2001) to represent flow interactions with the vessel walls.

Patient-specific inflow conditions were obtained from TCD measurements performed at the National Hospital for Neurology and Neurosurgery (NHNN) using the Doppler BoxX (with a handheld device) from the DWL company. We used rotational angiography data from NHNN to obtain imaging data from the same patient. TCD measurements were recorded for at least six cardiac cycles each in the right MCA, consecutively at depths of 49, 54, 57, 59, and 63 mm away from the temple area (see **Figure 1** for the location of the TCD validation planes in the 3D model, **Table 1** for the velocity measurements, and **Figure 2** for the TCD image measurement at the inflow boundary). The Doppler BoxX provides a flow direction indication at all depths whenever a measurement is made. In our case, this feature enabled us to hold the TCD device such that the flow was observable in the right MCA, as well as the right Anterior Communicating Artery (ACA). This is important, because retaining such a tight orientation minimizes TCD measurement errors caused by holding the device at a wrong angle. In addition, to align the TCD measurements precisely with the corresponding planes of flow direction in the simulation domain, we performed a triangulation and an angle correction with respect to the perpendicular flow direction (see **Table 2** for our triangulation results). The maximum velocity at the inflow boundary, extracted from the TCD data, was 1.50 m/s.

Extracted cardiac cycle lengths vary for each cardiac cycle and each measurement. The patient is known to have an existing aneurysm in the MCA on the opposite (left) side, within which the velocity magnitudes could not be clearly resolved using TCD due to its unfavourable orientation. We segmented the images using VMTKlab (vmtklab.orobix.com), and voxelized the 3D model using the HemeLB setup tool. The resulting geometry has one inflow region and five outflow regions: two small ones at the top near the inflow boundary, two larger ones at the bottom, and the largest one left of the 49 mm plane (see **Figure 1B**).

TABLE 1 | Overview of measured and simulated flow characteristics in the MCA, as well as relative differences between measurement and simulation.

*In rows 1, 2, and 3 we report the mean, maximum and minimum cardiac cycle length extracted from the TCD velocity measurements, respectively. In row 4 we provide the maximum velocity (vmax) as measured in the TCD data, and in rows 5 and 7 we present vmax for our (full velocity) HemeLB simulations with the Newtonian and the CY rheology models, respectively. Relative differences (dr) between the TCD measurements and each of the respective two HemeLB simulations are in rows 6 and 8. We use the velocity obtained from TCD as the inflow condition for our simulation.* \**Indicates simulation values which are preset (boundary conditions).*

The 2D inflow profiles were reconstructed from the 1D TCD data by mapping a parabolic profile to the non-circular inlets. This parabolic inlet profile has the original velocity from the 1D TCD data mapped to the centre of the inlet (the lattice site which is furthest from any wall), and 0 velocity values mapped to wall boundary sites. The velocity magnitude of a given lattice site is then calculated using a parabolic equation, which depends on the distance of the lattice site to the nearest vessel wall site in the inlet plane (0 for wall sites, 1 for the site in the centre, and values in between for other sites).
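The mapping described above can be sketched as follows. The text does not give the exact parabolic equation, so this example assumes the Poiseuille-like form v = v_centre · (1 − (1 − d)²), where d is the normalized wall distance (0 at wall sites, 1 at the centre site); the function name and the sample values are illustrative only.

```python
import numpy as np


def parabolic_inlet_speed(d, v_centre):
    """Velocity magnitude at inlet sites from normalized wall distance.

    d        : normalized distance to the nearest wall site in the inlet
               plane (0 at the wall, 1 at the site furthest from any wall).
    v_centre : centre-line velocity taken from the 1D TCD signal (m/s).

    Assumed form: a parabola in (1 - d), vanishing at the wall and equal to
    v_centre at the centre. The source does not specify the exact equation.
    """
    d = np.clip(np.asarray(d, dtype=float), 0.0, 1.0)
    return v_centre * (1.0 - (1.0 - d) ** 2)


# Example: wall site, mid-way site and centre site at the 1.50 m/s peak.
speeds = parabolic_inlet_speed([0.0, 0.5, 1.0], v_centre=1.50)
print(speeds)
```

Because the profile depends only on the distance to the nearest wall, it can be applied to non-circular inlets without fitting a radius.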

The boundary conditions in the lattice Boltzmann method were implemented as follows. To set the reconstructed velocity profile $\vec{u}\_{\rm TCD}(\vec{x}\_{\rm in}, t)$ at the inlet, we use a method introduced by Ladd (1994). A simple bounce-back boundary condition is augmented with a momentum term that results in a time-dependent Dirichlet condition for the velocity

$$
\vec{u}(\vec{x}\_{\rm in}, t) = \vec{u}\_{\rm TCD}(\vec{x}\_{\rm in}, t). \tag{1}
$$

At the outlet, we employed an open boundary condition in terms of a mixed Dirichlet-Neumann boundary condition (Nash et al., 2014)

$$
\vec{u}\_p(\vec{x}\_{\text{out}}, t) = 0,\tag{2}
$$

$$
\hat{n} \cdot \nabla \vec{u}\_n(\vec{x}\_{\text{out}}, t) = 0,\tag{3}
$$

where $\hat{n}$ is the normal vector of the outlet plane, and $\vec{u}\_p$ and $\vec{u}\_n$ are the in-plane and normal components of the outlet velocity, respectively. The gradient in Equation (3) is taken as the first-order finite difference approximation on the lattice Boltzmann grid. In the implementation by Nash et al. (2014), the density $\rho(\vec{x}\_{\text{out}}, t) = \rho\_0$ at the outlet is prescribed in order to determine the unknown fluid variables. It is worth noting that prescribing the density at the outlet fixes the static pressure through the ideal gas equation of state. However, this does not constrain the dynamic pressure, which varies over a cardiac cycle as shown in **Figure 3**.

FIGURE 2 | Raw TCD input image of the measured velocity at a depth of 63 mm (inflow boundary plane). The measured velocity at the selected depth (63 mm) is given at the top, while the general flow direction at all depths is given at the bottom, either toward the device (red) or away from it (blue). We observe a change in flow direction around a depth of 67 mm, which is at the junction between the right MCA and the right ACA.

TABLE 2 | Triangulation points, input, output, and measurement plane locations (and orientations where applicable) in the simulation domain, used to calculate the angle correction.

The shear-thinning behavior of blood is modeled using the Carreau-Yasuda (CY) model which employs the expression (Boyd et al., 2007; Bernabeu et al., 2013)

$$\frac{\eta(\dot{\gamma}) - \eta\_{\infty}}{\eta\_0 - \eta\_{\infty}} = \left( 1 + (\lambda \dot{\gamma})^a \right)^{\frac{n-1}{a}} \tag{4}$$

to account for the dependence of the dynamic viscosity η on the shear rate γ̇. Here, η<sub>0</sub> and η<sub>∞</sub> are the asymptotic values at zero and infinite shear rate, and a, n, λ are empirical material parameters that describe the shear-thinning curve. The CY model represents a smooth transition between Newtonian behavior at η<sub>0</sub> and η<sub>∞</sub>.
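Equation (4) can be rearranged for the viscosity and evaluated directly. The following is a minimal sketch, not HemeLB code: the parameter values are those reported from Boyd et al. (2007) in section 2.1 of this paper, while the function name and the chosen shear rates are illustrative.

```python
import numpy as np

# Carreau-Yasuda parameters as reported in section 2.1 (Boyd et al., 2007).
ETA_0 = 0.16      # Pa.s, viscosity at zero shear rate
ETA_INF = 0.0035  # Pa.s, viscosity at infinite shear rate
LAMBDA = 8.2      # s
A = 0.64
N = 0.2128


def carreau_yasuda_viscosity(shear_rate):
    """Dynamic viscosity (Pa.s) from Equation (4) for shear rate(s) in 1/s."""
    shear_rate = np.asarray(shear_rate, dtype=float)
    factor = (1.0 + (LAMBDA * shear_rate) ** A) ** ((N - 1.0) / A)
    return ETA_INF + (ETA_0 - ETA_INF) * factor


# Viscosity over several decades of shear rate: it falls monotonically from
# ETA_0 toward ETA_INF, which is the shear-thinning behavior described above.
rates = np.logspace(-3, 4, 8)
for rate, eta in zip(rates, carreau_yasuda_viscosity(rates)):
    print(f"shear rate {rate:10.3f} 1/s -> viscosity {eta:.5f} Pa.s")
```

The two Newtonian limits appear at the ends of the range: the viscosity approaches η<sub>0</sub> as the shear rate tends to zero and η<sub>∞</sub> at very high shear rates.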

The HemeLB simulations were performed on the ARCHER supercomputer at EPCC in Edinburgh, United Kingdom, and the SuperMUC supercomputer at LRZ in Garching, Germany. We used between 1,536 and 24,768 cores, depending on the chosen resolution.

### 2.1. Choice of Lattice Boltzmann Parameters


Our lattice Boltzmann model uses a D3Q19 lattice with the Lattice Bhatnagar-Gross-Krook (LBGK) collision model (Bhatnagar et al., 1954). The relaxation parameters are set to yield the dynamic viscosity of blood of η = 0.004 Pa·s. The parameters used in the CY model are η<sub>0</sub> = 0.16 Pa·s, η<sub>∞</sub> = 0.0035 Pa·s, λ = 8.2 s, a = 0.64 and n = 0.2128 as given by Boyd et al. (2007) and previously adopted by Bernabeu et al. (2013). In our full-resolution, full-velocity simulations, we used a voxel size of 10 µm, a time step size of 0.28 µs, and a geometry consisting of 174,738,326 fluid sites. The simulations ran for 21.43 million time steps, which corresponds to 5 s of simulated time following a one-second "warmup" period (during which the inlet flow speed is increased gradually from rest in order to avoid flow instability or shockwaves resulting from a step change). The Reynolds number of our full-velocity simulation is approximately 966, based on a characteristic diameter of 2.4 mm with the highest measured peak velocity of 1.61 m/s.
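For readers unfamiliar with the D3Q19 LBGK scheme, the sketch below builds the standard D3Q19 velocity set and weights and applies one BGK collision at a single lattice site, in lattice units. It is a generic textbook illustration and not HemeLB's implementation; all function names are hypothetical.

```python
import itertools

import numpy as np


def d3q19():
    """Return the 19 lattice velocities and weights of the D3Q19 model."""
    velocities, weights = [], []
    for c in itertools.product((-1, 0, 1), repeat=3):
        speed_sq = sum(k * k for k in c)
        if speed_sq == 0:
            weight = 1.0 / 3.0    # rest particle
        elif speed_sq == 1:
            weight = 1.0 / 18.0   # 6 axis-aligned neighbours
        elif speed_sq == 2:
            weight = 1.0 / 36.0   # 12 face-diagonal neighbours
        else:
            continue              # cube corners are not part of D3Q19
        velocities.append(c)
        weights.append(weight)
    return np.array(velocities), np.array(weights)


def bgk_equilibrium(rho, u, c, w):
    """Second-order equilibrium distribution (lattice units, c_s^2 = 1/3)."""
    cu = c @ u
    return rho * w * (1.0 + 3.0 * cu + 4.5 * cu**2 - 1.5 * (u @ u))


def bgk_collide(f, tau, c, w):
    """Relax the populations f at one site toward equilibrium (BGK)."""
    rho = f.sum()
    u = (f @ c) / rho
    return f - (f - bgk_equilibrium(rho, u, c, w)) / tau


c, w = d3q19()
print(len(w), "directions, weights sum to", w.sum())
```

The collision step conserves mass and momentum at each site by construction; the relaxation time `tau` alone sets the viscosity of the simulated fluid.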

We also performed simulations at reduced velocity and resolution, multiplying the velocities by 50 or 25%, as well as with increased voxel sizes of 20 and 40 µm. We discuss the implications of using this type of velocity scaling in detail in the next subsection.

#### 2.2. Velocity Scaling

The LBM is valid in the incompressible regime and introduces compressibility errors that scale quadratically in the Mach number Ma = U/c<sub>s</sub>, where U is the flow velocity and c<sub>s</sub> the speed of sound. The cardiac flow is characterized by the Reynolds number Re = UD/ν and the Womersley number α = (ωD<sup>2</sup>/ν)<sup>1/2</sup>, where D is the vessel diameter, ν = η/ρ is the kinematic viscosity, and ω is the angular frequency of the oscillating flow due to the cardiac cycle. In terms of the simulation parameters, the kinematic viscosity of the lattice BGK model and the speed of sound are given by

$$\nu = \frac{1}{3} \left( \hat{\tau} - \frac{1}{2} \right) \frac{\left( \Delta x \right)^2}{\Delta t},\tag{5}$$

$$
c\_s = \frac{1}{\sqrt{3}} \frac{\Delta x}{\Delta t},\tag{6}
$$

where τ̂ is the dimensionless relaxation parameter of the BGK model, and Δx and Δt are the discrete lattice spacing and time step, respectively. Based on the Reynolds and Mach numbers, we have the following relation for the dimensionless relaxation parameter

$$
\hat{\tau} - \frac{1}{2} = \sqrt{3} \frac{D}{\Delta x} \frac{Ma}{Re} . \tag{7}
$$

Linear stability requires τ̂ > 0.5, which guarantees a positive viscosity. However, it is mandatory to keep the Mach number small in order to reduce compressibility errors and make the system less prone to instabilities due to density fluctuations. In the standard diffusive scaling, convergence is achieved by reducing the Mach number while keeping the Reynolds number constant. This implies (Δx)<sup>2</sup> ∼ Δt. Thus, reducing the Mach number by means of increasing resolution results in an increase in computational costs due to the cubic scaling of volume.
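As a numerical illustration of Equations (5)-(7), the sketch below plugs in the full-resolution parameters from section 2.1. Two inputs are assumptions made here and not stated in the text: a blood density of 1000 kg/m³, and a characteristic diameter of 2.4 mm, chosen because it reproduces the Reynolds number of roughly 966 quoted in section 2.1.

```python
import math

ETA = 0.004   # Pa.s, dynamic viscosity (section 2.1)
RHO = 1000.0  # kg/m^3, ASSUMED density (not given in the text)
DX = 10e-6    # m, voxel size
DT = 0.28e-6  # s, time step
U = 1.61      # m/s, highest measured peak velocity
D = 2.4e-3    # m, ASSUMED characteristic diameter (reproduces Re ~ 966)
STEPS = 21.43e6

nu = ETA / RHO                        # kinematic viscosity
tau = 0.5 + 3.0 * nu * DT / DX**2     # Equation (5) solved for tau
c_s = DX / (math.sqrt(3.0) * DT)      # Equation (6)
mach = U / c_s
reynolds = U * D / nu

# Equation (7) should give the same relaxation parameter.
tau_from_eq7 = 0.5 + math.sqrt(3.0) * (D / DX) * mach / reynolds

print(f"tau = {tau:.4f} (Equation 7: {tau_from_eq7:.4f})")
print(f"c_s = {c_s:.2f} m/s, Ma = {mach:.3f}, Re = {reynolds:.0f}")
print(f"simulated time = {STEPS * DT:.2f} s")
```

Under these assumptions the relaxation parameter is about 0.534, only slightly above the stability limit of 0.5, and the peak Mach number is about 0.08. The 21.43 million steps correspond to about 6 s, matching the 5 s of simulated time plus the one-second warmup.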

Therefore, some authors have been tempted to use lower flow velocities, e.g., from healthy subjects (Miura et al., 2013; Byrne et al., 2014), in order to maintain stable simulations at a larger voxel size Δx. The ratio of the reduced velocity U′ and the original velocity U is denoted by a scaling factor s. This leads to a scaling relation

$$s = \frac{U'}{U} = \frac{\nu' \text{Re}' D}{\nu \text{Re} D'} = \frac{\alpha^2 \omega' D' \text{Re}'}{\alpha'^2 \omega D \text{Re}},\tag{8}$$

where the prime denotes the quantities associated with the scaled velocity U′. If one insists on a fixed system size D′ = D and cardiac cycle length ω′ = ω, Equation (8) reduces to s = (α/α′)<sup>2</sup> Re′/Re, so for s ≠ 1 it is not possible to fix both the Womersley number and the Reynolds number at the same time. The simulation is therefore performed in a flow regime different to that of the full velocity simulation. In section 3.2, we demonstrate that this can significantly impact the simulated flow patterns.
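The consequence of Equation (8) can be made concrete with a short calculation. The sketch below is illustrative only: the function name and the ratio arguments are hypothetical, and the full-velocity Reynolds number of 966 is taken from section 2.1.

```python
import math


def scaled_flow_numbers(re_full, s, nu_ratio=1.0, d_ratio=1.0, omega_ratio=1.0):
    """Reynolds number and Womersley ratio after scaling the velocity by s.

    nu_ratio, d_ratio and omega_ratio are the primed-to-unprimed ratios of
    kinematic viscosity, diameter and angular frequency (all 1 by default).
    Returns (Re', alpha'/alpha), using Re = U*D/nu and
    alpha = (omega * D^2 / nu)^(1/2).
    """
    re_scaled = re_full * s * d_ratio / nu_ratio
    alpha_ratio = d_ratio * math.sqrt(omega_ratio / nu_ratio)
    return re_scaled, alpha_ratio


# Same fluid, geometry and heart rate: Re drops with s, alpha is unchanged.
for s in (0.5, 0.25):
    re_scaled, alpha_ratio = scaled_flow_numbers(966.0, s)
    print(f"s = {s}: Re' = {re_scaled:.1f}, alpha'/alpha = {alpha_ratio:.2f}")

# Restoring Re by lowering the viscosity changes the Womersley number instead.
re_fixed, alpha_ratio = scaled_flow_numbers(966.0, 0.5, nu_ratio=0.5)
print(f"nu' = nu/2: Re' = {re_fixed:.1f}, alpha'/alpha = {alpha_ratio:.3f}")
```

With the viscosity unchanged, halving or quartering the velocity gives Reynolds numbers of about 483 and 242. Compensating through the viscosity recovers Re = 966 but raises the Womersley number by a factor of √2.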

#### 3. RESULTS AND DISCUSSION

We present results from three types of simulation. First, we compare our full velocity and full resolution (10 micron voxel size) simulations against the TCD measurements on the same patient. Second, we present the results from simulations at reduced velocity and reduced resolution, and compare them both with results from our full-scale simulations and with the TCD measurements. Third, we compare the results of simulations using a Newtonian rheology model to simulations using a non-Newtonian (Carreau-Yasuda) rheology model (Abraham et al., 2005).

### 3.1. Validating Full Velocity Haemodynamics Predictions Against Measurements

In **Table 1** we present the maximum velocity vmax as measured with TCD and the simulation results for all four measurement planes. Our simulations predict the flow velocity with a relative error of less than 9% in all cases. In **Figure 4** we present a direct comparison of our TCD measurements in the four planes over time, and our velocity predictions derived using HemeLB at the same locations. We observe good agreement between the simulation results and the measured TCD profile. The differences can be ascribed to the limitations of our approach (see section 3.4) and uncertainties in the measurements, including phase misalignments due to the sequential nature of the TCD measurement.

In **Figures 5A–D**, we present the two-dimensional velocity profiles extracted from the simulation at the four measurement planes. These profiles were extracted at the peak systole of the second cycle, corresponding to a velocity at the inlet of approximately 1.42 m/s. The figures show how the profile changes along the flow through the MCA. Compared to the inflow profile, the velocity profile at 59 mm is already substantially different, as a high velocity region is visible on the left side of the artery. The profiles at 57 and 54 mm show a strong concentration of (high) velocity near the top, which is presumably due to the bend present in that region of the artery, while a bend in the opposite direction just before the 49 mm plane is the likely cause of the more evenly distributed velocities there at peak systole (**Figure 5E**). In **Figures 5E,F** we show the calculated wall shear stress (WSS) across the MCA at peak systole and diastole (at 2.18 s). The front in **Figures 5E,F** corresponds to the left side in **Figures 5A–D**. We observe a WSS of >40 Pa during the systole in at least three locations. The WSS at the subsequent diastole (**Figure 5F**) remains relatively high at the location near the second outlet at the top, which indicates that this location could be susceptible to the formation of a new aneurysm.

### 3.2. Full vs. Reduced Velocity Simulations

In this section we compare the velocity profiles at peak systole from simulations at 10 µm voxel size and full velocity with those at reduced velocity and/or increased voxel size. Reduced velocity and resolution runs are attractive because they are cheaper, faster to run, and more likely to become computationally tractable in a clinical setting. For example, at time of writing, a full velocity run across five cardiac cycles costs approximately £4200 on the ARCHER supercomputer (EPCC, 2017), whereas a run at 50% velocity and the same resolution costs £2100 and a run at 50% velocity and 20 µm voxel size costs £500 to perform. However, reduced velocity simulations have a lower Reynolds number which affects a wide range of flow properties. In this study we have performed runs at 50% velocity (Re ∼ 483) and 25% velocity (Re ∼ 242).

We compare our simulation results at full velocity and resolution with those at reduced velocity and resolution in **Figure 6** and **Table 3**. When we reduce the inflow velocity by 50%, the maximum inflow velocity at the inlet is 0.75 m/s (not an uncommon value for healthy volunteers) (Bishop et al., 1986) instead of 1.50 m/s (not an uncommon value for stroke patients) (Manno et al., 1998). We multiply the extracted velocities from our reduced velocity runs by two for simulations at 50% inflow velocity, and by four for simulations at 25% velocity. When comparing the full inflow velocity runs with those at 50%, we already observe major differences in the extracted velocities. Here the comparisons at all four locations feature velocity differences of more than 0.4 m/s, and more than 30% of the maximum absolute flow velocity extracted in the corresponding plane. For the planes at 49 and 57 mm we see very large velocity differences near the vessel wall. This is likely due to the much higher Reynolds number of the flow in the full velocity run. When we compare the rescaled 50% velocity runs to the TCD measurements, the velocities differ by up to 15.5%, which is almost twice as large as the 8.8% maximum difference between TCD measurements and full velocity runs.

The results of the 50% velocity run with 20 µm voxel size are almost identical to the one with 10 µm voxel size, with only very small differences in all the velocity planes. However, the run with 25% velocity is considerably less accurate, with absolute velocity differences up to 0.75 m/s, in particular close to the vessel walls. These errors are still smaller close to the inflow boundary at 59 mm, but dominate the overall result in the validation planes that are beyond the bifurcation with lenticulostriate arteries.
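The rescaling and comparison used here can be illustrated with a minimal sketch. The function names and the sample velocities are hypothetical and are not taken from the study's data; the sketch also assumes that the relative difference is normalised by the maximum absolute velocity of the full velocity run in the same plane.

```python
def rescale_velocities(velocities, velocity_fraction):
    """Scale velocities from a reduced-velocity run back to full-velocity units.

    velocity_fraction is the inflow scaling of the run (0.5 for a 50% run,
    0.25 for a 25% run), so the multiplier is 2 or 4 respectively.
    """
    factor = 1.0 / velocity_fraction
    return [v * factor for v in velocities]


def plane_differences(full_run, reduced_run, velocity_fraction):
    """Return (max absolute difference, same difference relative to the
    maximum absolute velocity of the full-velocity run) for one plane."""
    rescaled = rescale_velocities(reduced_run, velocity_fraction)
    max_abs_diff = max(abs(f - r) for f, r in zip(full_run, rescaled))
    v_max = max(abs(f) for f in full_run)
    return max_abs_diff, max_abs_diff / v_max


# Hypothetical samples (m/s) along one measurement plane, for illustration only.
full = [0.20, 0.90, 1.40, 1.10, 0.30]
half = [0.15, 0.50, 0.65, 0.45, 0.25]  # from a run at 50% inflow velocity

abs_diff, rel_diff = plane_differences(full, half, velocity_fraction=0.5)
```

With real data, `full` and `half` would hold the velocities extracted at matching points of the same validation plane.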

We conclude that simulations with reduced velocities affect the accuracy of the results significantly. This is particularly important because realistic velocities close to the wall are essential to obtain accurate wall shear stress estimates. We find that no such estimates can be reliably made for half velocity runs.

### 3.3. Comparing Rheology Models

To compare different rheology models, we performed simulations on our MCA geometry using a Carreau-Yasuda (CY) rheology model (Abraham et al., 2005). When the CY model was adopted, Bernabeu et al. found important differences in the WSS and Three-Band-Diagram analysis outcome for the MCA under "healthy human" flow conditions. Here we focus on differences in velocities obtained from the two rheology models, as we are interested in comparing simulation predictions to TCD measurements.

FIGURE 6 | Absolute difference in flow velocity between the run with Newtonian rheology at 10 µm resolution and 100% velocity and the other runs, for each of the four validation planes. Comparisons are made with runs at 10 µm and 50% velocity (Left column), 20 µm and 50% velocity (Middle), and 20 µm and 25% velocity (Right), respectively. The velocities in reduced velocity runs are multiplied by 2 (for the 50% velocity runs) or 4 (for the 25% velocity runs). The snapshots were made at the second peak systole (at 1.44 s), with differences in m/s.

The difference in flow velocity between the Newtonian rheology model and the CY rheology model at peak systole is shown in **Figure 7**. We observe differences in velocity of up to 0.12 m/s in three of the four validation planes, and a difference of up to 0.21 m/s in a highly concentrated central region in the 54 mm measurement plane. In all cases the velocity differences are largest in regions where the absolute velocity is relatively small in the Newtonian rheology results, cf. **Figure 5**, while only smaller differences exist in regions where the velocity is relatively large. These results suggest that the choice of using either a CY or Newtonian rheology model has little effect on $v_{\max}^{\mathrm{sim}}$ in all our comparisons (see **Table 3**).

The difference between the Newtonian and the CY rheology model for 50% reduced velocity is shown in **Figure 8** at peak systole. As noted above, velocity extractions from runs at 50% velocity are multiplied by 2 to enable a direct comparison with full velocity runs. The difference in velocities between the 50% runs is considerably smaller than for 100% velocity runs, reaching at most 0.05 m/s in any of the measurement planes. The velocity difference is largest close to the arterial wall, but is in all cases much smaller than the velocity mismatch introduced by the velocity reduction (see **Figure 6**, left column). This is in line with the finding that the choice of the rheology model has a small effect, and that in the reduced velocity runs the impact of scaling down the velocity on accuracy is the dominating factor.

*We present the velocity scaling used in each run in column 1 (100% for a full velocity run), the rheology model used in column 2, the voxel size used in column 3 (10 µm for a full resolution run), the extracted peak velocity in each of the measurement planes in columns 4 to 7, and the relative difference in peak velocity compared to TCD measurements for each plane in columns 8 to 11. As a reference, we provide $v_{\max}^{\mathrm{TCD}}$ for each of the planes in the bottom row.*

### 3.4. Limitations of our Study

The main limitations of our validation study are related to data acquisition, model construction and simulation constraints.

Regarding TCD measurements, phase misalignments are common when directly comparing simulation results to these measurements, due to differences in apparent cardiac cycle length between the consecutively measured TCD planes (see **Figure 4**). Furthermore, due to the proprietary nature of the TCD numerical data, numerical velocity values were extracted semi-automatically from JPEG images obtained with the Doppler BoxX software, which may introduce small transcription errors of up to 0.0064 m/s due to the resolution of the images. The measurement quality and level of background noise can vary with different measurements, as different depths are subject to varying levels of occlusion and wave propagation interference.

In the area of segmentation it is particularly challenging to accurately reproduce the small lenticulostriate arteries originating near the origin of the MCA (Kang et al., 2012). These arteries are not always clearly captured in the medical imaging data, and many existing haemodynamics models of MCAs do not include them, while our geometry contains two of these arteries. However, omitting them altogether can lead to velocity overestimations in the remainder of the MCA. In our model we were able to resolve the lenticulostriate arteries to a limited extent after extensive segmentation efforts.

Due to the one-dimensional nature of the TCD measurement, we used a parabolic inflow velocity profile and fitted it to the non-circular shapes of the inflow boundaries (see section 2). Real inflow velocity profiles can vary depending on the morphology of the arterial network, as shown for example by Takeuchi and Karino (2010). Regarding the outlets, a more physiologically accurate choice of boundary condition would take into account the downstream peripheral resistance. However, such an approach introduces additional patient-specific parameters. For the purposes of the validation conducted in this study we intentionally limit the complexity of the model and thus use a simple mixed Neumann-Dirichlet boundary condition.

Furthermore, our simulation model is based on a rigid geometry and does not include elastic deformations of the vessel. In the case of blood flow in cerebral aneurysms, Dempere-Marco et al. (2006) found that incorporating wall motion has relatively little effect on the WSS. Understanding the dynamical response of arterial walls in the MCA, on a patient-specific level, is a particularly challenging area of research. However, recent studies show promising results that should soon allow us to examine these processes (Oubel et al., 2010; Vanrossomme et al., 2015).

### 3.5. Future Work

There is a range of factors that we seek to incorporate in our model as part of our future research. Firstly, we aim to develop techniques to create more realistic inflow profiles by using simulation data of arteries upstream from the patient's MCA. Secondly, we seek to enhance our model by incorporating mechanisms for arterial wall deformations and damage. Such mechanisms are highly complex and very difficult to measure experimentally, and therefore modelling them is a particularly challenging area of research. Thirdly, we seek to provide more realistic outflow properties by extending our geometry to arteries further downstream. This could be accomplished for example by investigating how existing (1D) resistance models could be accurately applied within the context of complex 3D simulation models, or by attempting to simulate the full human brain in 3D over realistic time scales, and using patient-specific flow conditions.

### 4. CONCLUSIONS

We have conducted a validation study comparing flow velocities from patient-specific lattice-Boltzmann simulations to clinical TCD measurements in the MCA. As part of the study, we analyzed simulation results obtained at reduced velocities and variable resolution. Moreover, we investigated the impact of using the Carreau-Yasuda rheology model compared to a Newtonian rheology model.

We achieved very good agreement of the maximum velocity between our full patient-specific velocity simulation results and TCD measurements, with an error of less than 9% independent of the choice of rheology model. Simulating blood flow at reduced velocities, for example by scaling down the velocity or using velocity measurements from healthy subjects, is attractive because the simulation runs are computationally cheaper and deliver results faster. However, we found that scaling down the flow velocities leads to substantially larger errors, and an accurate comparison between simulations and TCD measurements is no longer achieved. Adopting a CY rheology model instead of a Newtonian one results in small changes in maximum velocities in the planes and in our validation, whereas substantial flow velocity differences are observed near the arterial wall and in the resulting WSS. However, the CY rheology model does not enable a significant improvement when the velocity is already scaled down (e.g., by using inflow profiles of healthy volunteers or reduced velocity Womersley profiles), as errors caused by this velocity scaling then dominate the overall accuracy. **Figures 7**, **8** suggest that a Newtonian rheology model may be a justifiable approximation for MCA simulations at lower (i.e., < 0.75 m/s) peak flow, but that this could quickly become problematic for the higher flows typically recorded in unhealthy patients (in which 1.5 m/s is not unusual).

Computational haemodynamics predictions that accurately match patient-specific TCD measurements are likely to become an important asset in clinical settings and pave the way to using computer models in the process of clinical decision making (Fenner et al., 2008; Sadiq et al., 2008). Compared to clinical measurements alone, patient-specific simulations allow us to obtain information about a much wider range of flow properties, such as detailed flow velocity characteristics and wall shear stress estimates. In addition, simulations can help predict flow velocity in areas that have not been directly measured, and thereby help reduce the number and intensity of invasive measurements that need to be performed. Here we have shown that a combination of non-invasive TCD measurements with haemodynamics simulations can lead to accurate predictions of blood flow velocity throughout the MCA. The ability to make these accurate predictions constitutes an important step in making computational haemodynamics a viable approach for assessing intracranial blood flow.

#### ETHICS STATEMENT

The patient-specific data (3D angiography, TCD measurements) used in this study was available on the shelf and did not contain identifiable information. The segmented geometry has already been published (Itani et al., 2015). The present study involved only secondary analysis of de-identified data that is not linked to the subjects from whom it was originally collected.

#### REFERENCES

Abraham, F., Behr, M., and Heinkenschloss, M. (2005). Shape optimization in steady blood flow: a numerical study of non-newtonian effects. Comput. Methods Biomech. Biomed. Eng. 8, 127–137. doi: 10.1080/10255840512331388588

### AUTHOR CONTRIBUTIONS

DG conceived the study, while DG and RR carried out the simulations, performed the validation comparison, and wrote the manuscript. US advised on the choice of simulation parameters and contributed to writing the manuscript. RC segmented the medical images and extracted the TCD velocity profiles from the measurement images, with help from DG, US, and PC. FR obtained the original angiography images, while HC performed the TCD measurements. Both FR and HC advised on the medical aspects of the manuscript. PC coordinated the study and helped draft the manuscript. All authors gave final approval for publication.

### FUNDING

This work received funding from the EU FP7 CRESTA project (grant no. 287703), the EU H2020 projects ComPat (grant no. 671564) and CompBioMed (grant no. 675451), the EPSRC-funded 2020 Science Programme (EP/I017909/1), and the Qatar National Research Fund (NPRP), project No. 5-792-2-238. RC is supported through doctoral training grant SP/08/004 from the British Heart Foundation (BHF), through the UCL CoMPLEX doctoral training programme.

#### ACKNOWLEDGMENTS

We thank Aditya Jitta for his efforts in analyzing a preliminary version of our MCA simulation model. We are grateful to Rupert Nash, Miguel Bernabeu, and Timm Krüger for useful discussions pertaining to the simulation parameters. We thank Ann Warner for her help in obtaining angiographic data, and Sebastian Schmieschek for assisting with the supervision of RC. Access to the ARCHER supercomputer at EPCC in the UK was provided by the UK Consortium on Mesoscopic Engineering Sciences (EP/L00030X/1). In addition, we are grateful to the Leibniz Rechenzentrum (LRZ) in Germany for providing access to the SuperMUC supercomputer.

#### SUPPLEMENTAL DATA

We provide the source data for our publication via Figshare under doi: 10.17633/rd.brunel.5001962.


a model of the middle cerebral artery. J. R. Soc. Interface Focus 3:20120094. doi: 10.1098/rsfs.2012.0094


simulation approach for vascular blood flow. J. Comput. Sci. 9, 150–155. doi: 10.1016/j.jocs.2015.04.008


cerebal aneurysm hemodynamics to inflow boundary conditions. J. Neurosurg. 106, 1051–1060. doi: 10.3171/jns.2007.106.6.1051


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Groen, Richardson, Coy, Schiller, Chandrashekar, Robertson and Coveney. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# A Highly Automated Computational Method for Modeling of Intracranial Aneurysm Hemodynamics

Jung-Hee Seo<sup>1</sup>, Parastou Eslami<sup>1</sup>, Justin Caplan<sup>2</sup>, Rafael J. Tamargo<sup>2</sup> and Rajat Mittal<sup>1</sup>\*

<sup>1</sup> Department of Mechanical Engineering, Johns Hopkins University, Baltimore, MD, United States, <sup>2</sup> Department of Neurosurgery, Johns Hopkins Medicine, Baltimore, MD, United States

Intracranial aneurysms manifest in a vast variety of morphologies and their growth and rupture risk are subject to patient-specific conditions that are coupled with complex, non-linear effects of hemodynamics. Thus, studies that attempt to understand and correlate rupture risk to aneurysm morphology have to incorporate hemodynamics, and at the same time, address a large enough sample size so as to produce reliable statistical correlations. In order to perform accurate hemodynamic simulations for a large number of aneurysm cases, automated methods to convert medical imaging data to simulation-ready configurations with minimal (or no) human intervention are required. In the present study, we develop a highly-automated method based on the immersed boundary method framework to construct computational models from medical imaging data. The key idea is the direct use of voxelized contrast information from the 3D angiograms to construct a level-set based computational "mask" for the hemodynamic simulation. Appropriate boundary conditions are provided to the mask and the dynamics of blood flow inside the vessels and aneurysm is simulated by solving the Navier-Stokes equations on the Cartesian grid using the sharp-interface immersed boundary method. The present method does not require body-conformal surface/volume mesh generation or other intervention for model clean-up. The viability of the proposed method is demonstrated for a number of distinct aneurysms derived from actual, patient-specific data.

Keywords: cerebral aneurysm, hemodynamics, computational fluid dynamics, immersed boundary method, automatic segmentation

### INTRODUCTION

An aneurysm is a pathological, localized, balloon-like bulge in the wall of a blood vessel. Although aneurysms can occur in any vessel, intracranial aneurysm (ICA) (or cerebral aneurysm) and abdominal aortic aneurysm (AAA) are most common and clinically significant. Intracranial aneurysms can present incidentally (i.e., unruptured) or may present in the form of aneurysmal subarachnoid hemorrhage (aSAH) following intradural rupture. The overall incidence of aSAH in the Western world is 6–8 per 100,000 people per year (Zacharia et al., 2010) and mortality rates from aSAH are nearly 50%. Of the patients who do survive, less than 60% will return to a neurologic baseline allowing them to function independently (Zacharia et al., 2010). Given the complexity of treatment, and the care required for survivors with devastating neurologic injury, the cost of aSAH is staggering.

Aneurysms considered to be at high risk of rupture are usually treated by surgical intervention such as clipping of the aneurysm itself or implementation of a prosthetic graft to prevent rupture.

#### Edited by:

Mariano Vázquez, Barcelona Supercomputing Center, Spain

#### Reviewed by:

Paolo Di Achille, IBM Research (United States), United States Lucy T. Zhang, Rensselaer Polytechnic Institute, United States

#### \*Correspondence:

Rajat Mittal mittal@jhu.edu

#### Specialty section:

This article was submitted to Computational Physiology and Medicine, a section of the journal Frontiers in Physiology

Received: 11 December 2017 Accepted: 15 May 2018 Published: 12 June 2018

#### Citation:

Seo J-H, Eslami P, Caplan J, Tamargo RJ and Mittal R (2018) A Highly Automated Computational Method for Modeling of Intracranial Aneurysm Hemodynamics. Front. Physiol. 9:681. doi: 10.3389/fphys.2018.00681


Such surgical interventions, however, bring their own risks such as bleeding, stroke, and vessel spasms (Raaymakers et al., 1998; Tomasello et al., 1998). Introduction of rupture-prevention devices (e.g., stent graft) can cause thrombosis and increase thrombo-embolic risk (International Study of Unruptured Intracranial Aneurysms Investigators, 1998; Song et al., 2004). Thus, prompt and accurate stratification of risk is the key to making sound clinical decisions about surgical intervention. Significant hurdles however exist in developing accurate risk stratification metrics that are grounded in the biomechanics of aneurysm growth and these have stymied not only our ability to understand aneurysm growth, but also, effective clinical interventions for this devastating condition.

Aneurysms manifest in a vast variety of morphologies (shapes, sizes, orientations and locations) and their growth is also subject to patient-specific conditions of age, gender, flow rate, blood pressure, heart rate, etc. Thus, any statistical correlation that is used for risk stratification should be based on a large sample size, likely O(10<sup>4</sup>), that can cover the vast parameter space associated with aneurysms and generate reliable statistical correlations. While patient-specific morphology and conditions are the primary determinants of growth and rupture risk, the connection between these factors and risk is highly complex due to the intervening non-linear effects of hemodynamics and vessel wall structural dynamics. Thus, current clinical guidelines for aneurysm treatment, which are based primarily on morphology (e.g., aneurysm diameter, Desai et al., 2010), have low sensitivity and specificity (Juvela et al., 1993; International Study of Unruptured Intracranial Aneurysms Investigators, 1998; Wiebers, 2003; Desai et al., 2010). Risk stratification approaches that go beyond morphology and incorporate biomechanics could transform the treatment of aneurysms.

Physics-based computational models of aneurysm biomechanics (i.e., hemodynamics and/or structural mechanics) (Cebral et al., 2005a; Valencia et al., 2008; Castro et al., 2009) hold great promise in this context. In particular, hemodynamics is essential to the estimation of aneurysm rupture risk not only because hemodynamics is the key intermediary between morphology and vessel wall mechanics but also because hemodynamic metrics are very sensitive to the geometrical and flow conditions (Cebral et al., 2005b; Valencia et al., 2013). Fortunately, modern imaging modalities [Computed Tomography Angiography (CTA) and 3D Rotational Angiography (3DRA)] provide inputs that are suitable and generally sufficient for computational fluid dynamics modeling. The primary limitation of the current approaches, however, is that they are not designed to scale to the large sample sizes that are necessary for developing insights and reliable statistical correlations/metrics.

In order to perform hemodynamic simulations for a large number of aneurysm cases, pipe-lined (Cebral et al., 2005a) and automated methods to convert medical imaging data to simulation-ready configurations with minimal (or no) human intervention are required. Currently, most simulations of aneurysm hemodynamics are performed with the finite-volume or finite-element methods (Shojima et al., 2004; Cebral et al., 2005b, 2015; Valencia et al., 2008, 2013; Castro et al., 2009; McGah et al., 2014; Valen-Sendstad and Steinman, 2014) that require surface and volume meshes. Commercial CFD software based on the finite volume/element method is also often employed (Meng et al., 2006; McGah et al., 2014; Valen-Sendstad and Steinman, 2014), for which the segmented vessel/aneurysm geometry and surface/volume meshes need to be provided. Most of the current segmentation and simulation methods that involve surface and volume mesh generation necessitate substantial human intervention. The traditional approach consists of the following steps: (i) segmentation of the lumen from the angiogram data, (ii) 3D model generation, (iii) cleaning-up and truncation of the model (e.g., cutting out the vessels outside the region of interest), (iv) surface mesh generation, and (v) volume mesh generation. Open-source or commercial software can be employed for each step, but substantial human intervention is still required to interface the steps and determine the parameters. Thus, the traditional approach may not be adequate to deal with the large number of individual cases envisioned here. Furthermore, conventional computational fluid dynamics simulation methodologies can be quite sensitive to the quality of the segmentation and grid generation, and this may necessitate attention to the quality of the segmented geometry and computational grid (Valen-Sendstad and Steinman, 2014).

In the present study, a highly-automated method based on the immersed boundary method (Mittal and Iaccarino, 2005) framework is proposed to construct computational models directly and rapidly from medical imaging data. The key idea is the direct use of voxelized contrast information from the 3D angiograms to construct a level-set based computational "mask" for the simulation. In this way, 3D, simulation-ready models of the vessel of interest can be constructed automatically and rapidly, and no body-conformal grids (surface and volume) need to be generated for the flow simulation.

An immersed boundary method based on the "masking function" on the Cartesian grid has previously been applied to aneurysm hemodynamics (Mikhal and Geurts, 2014), but the method employed a simple volume penalization, and the geometry was represented by a set of Cartesian voxels. Better representation of the aneurysm/vessel geometry on the Cartesian grid can be achieved by using a level-set function. Level-set function based methods have been used for the simulation of aneurysm hemodynamics on the Cartesian grid using a lattice Boltzmann method (LBM) (He et al., 2009; Závodszky and Paál, 2013) and a boundary data immersion method (Otani et al., 2018). The latter method, however, still employed a surface mesh to construct the level-set function. In the present study, we employ the masking function for the automatic segmentation of the vessel/aneurysm from the medical imaging data. The vessel/aneurysm boundaries are then represented by the level-set function constructed directly from the contrast information. A previously developed and validated hemodynamic flow solver based on the sharp-interface immersed boundary method (Mittal et al., 2008) is adopted for modeling aneurysm hemodynamics on the Cartesian grid. In this paper, we report the key components of the highly automated simulation procedure using the immersed boundary method: a 3D region-growing technique for automatic vessel segmentation and cleaning, a level-set based immersed boundary flow simulation module, and a suitable method for post-processing the data. The present method has been tested with a small sample set of patient data.

### MATERIALS AND METHOD

### Procedure of Highly-Automated Hemodynamic Modeling

The overall procedure of the highly-automated hemodynamic modeling using the 3D angiogram data is the following:


Each of these steps is described in detail in the following sections.

### Region of Interest and Boundary Conditions

The first step required for the hemodynamic modeling is to specify the region of interest (ROI) around the target aneurysm. By visualizing the angiogram, the user should identify the target aneurysm of interest and set the Cartesian ROI around it (see **Figure 1**). This can be done by specifying the range of spatial indices (i, j, k) for the Cartesian ROI domain, for example, i<sub>min</sub> ≤ i ≤ i<sub>max</sub>, j<sub>min</sub> ≤ j ≤ j<sub>max</sub>, and k<sub>min</sub> ≤ k ≤ k<sub>max</sub>. Once the ROI is set, there could be a number of vessels that intersect with the ROI boundaries. Boundaries of the flow domain can easily be identified during the automatic segmentation phase using the spatial indices of the Cartesian ROI domain. For example, the voxels masked for the flow domain at the Cartesian ROI boundaries (i<sub>min</sub>, i<sub>max</sub>, j<sub>min</sub>, j<sub>max</sub>, k<sub>min</sub>, k<sub>max</sub>) generate the boundaries of the flow domain. The flow direction (inflow or outflow) in each of these intersecting vessels needs to be identified. For inflow vessels, the user can specify the flow rate and/or flow-rate waveform if available. The user also sets a seed point in the lumen region that is connected to the target aneurysm and a threshold contrast intensity (I<sub>0</sub>) for the region growing operation.
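As a rough illustration of this step, the following sketch crops a voxel volume to an index-based ROI and lists the masked voxels that lie on the six ROI faces, which are the candidates for inflow and outflow boundaries. It is not the authors' implementation: the function names `crop_to_roi` and `roi_face_voxels` are hypothetical, and nested Python lists stand in for the image arrays.

```python
def crop_to_roi(volume, i_range, j_range, k_range):
    """Return the sub-volume volume[i][j][k] for inclusive index ranges
    (imin, imax), (jmin, jmax), (kmin, kmax)."""
    (imin, imax), (jmin, jmax), (kmin, kmax) = i_range, j_range, k_range
    return [[row[kmin:kmax + 1] for row in plane[jmin:jmax + 1]]
            for plane in volume[imin:imax + 1]]


def roi_face_voxels(mask):
    """Group masked voxels (M = 1) by the ROI face on which they lie.

    Indices are local to the ROI, so index 0 corresponds to imin, jmin or kmin.
    A voxel on an edge or corner of the ROI is reported for every face it touches.
    """
    ni, nj, nk = len(mask), len(mask[0]), len(mask[0][0])
    faces = {name: [] for name in ("imin", "imax", "jmin", "jmax", "kmin", "kmax")}
    for i in range(ni):
        for j in range(nj):
            for k in range(nk):
                if mask[i][j][k] != 1:
                    continue
                if i == 0:
                    faces["imin"].append((i, j, k))
                if i == ni - 1:
                    faces["imax"].append((i, j, k))
                if j == 0:
                    faces["jmin"].append((i, j, k))
                if j == nj - 1:
                    faces["jmax"].append((i, j, k))
                if k == 0:
                    faces["kmin"].append((i, j, k))
                if k == nk - 1:
                    faces["kmax"].append((i, j, k))
    return faces


# Illustrative 3 x 3 x 3 mask: a straight one-voxel-wide vessel along i.
example_mask = [[[1 if (j == 1 and k == 1) else 0 for k in range(3)]
                 for j in range(3)] for i in range(3)]
example_faces = roi_face_voxels(example_mask)
```

In this toy case the vessel crosses only the `imin` and `imax` faces, so the user would label one of them as inflow and the other as outflow.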

### Automatic Segmentation and Cleaning

Once the seed point and the threshold intensity (I<sub>0</sub>) are set, the 3D region growing operation is performed to automatically segment the lumen of interest and perform clean-up. The region growing runs on the Cartesian voxel space of the 3D angiogram data and each voxel of the imaging data serves as a Cartesian fluid cell. Starting from the seed point, the edge cells grow to neighboring Cartesian cells if the intensity of the cell (I) satisfies the criterion I > I<sub>0</sub>. Additional criteria based on the gradient of intensity, e.g., ΔI < ΔI<sub>max</sub>, can also be employed. The choice of threshold can affect the size of the segmented vessel/aneurysm and, subsequently, the simulation results. The user may reset the threshold after checking the morphology of the segmented vessel/aneurysm. For this reason, the threshold value may need to be chosen by a trained expert. The mask function (M) is set to 1 for the growing region and 0 otherwise (see **Figure 2**). The process continues until no further growth is possible, and the connected lumen region is segmented based on the masking function (M = 1). The flow simulation is performed only for the volume where M = 1, and thus the remaining region where M = 0, including lumen volumes that are not connected to the target aneurysm, is automatically cleaned up. A key element in the present method is that, unlike the conventional body-conformal numerical methods (finite-difference, finite-volume, or finite-element) that are commonly used in hemodynamic simulations (Soto et al., 2004; Updegrove et al., 2017), no surface mesh is generated for the segmented lumen. This alleviates complexities associated with mesh quality, and enhances the automation and robustness of the simulation tool.
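The region growing described above can be sketched as a breadth-first flood fill. This is a minimal illustration and not the authors' code: the function name `region_grow` is hypothetical, growth through the six face-sharing neighbours is an assumption (the text does not state the connectivity), and the optional intensity-gradient criterion is omitted.

```python
from collections import deque


def region_grow(intensity, seed, threshold):
    """Return a mask M (1 inside the grown region, 0 elsewhere).

    intensity : nested list indexed as intensity[i][j][k]
    seed      : (i, j, k) index of a voxel inside the lumen
    threshold : contrast intensity I0; a voxel is accepted if I > I0
    """
    ni, nj, nk = len(intensity), len(intensity[0]), len(intensity[0][0])
    mask = [[[0] * nk for _ in range(nj)] for _ in range(ni)]

    si, sj, sk = seed
    if intensity[si][sj][sk] <= threshold:
        return mask  # the seed itself fails the criterion: nothing to grow

    mask[si][sj][sk] = 1
    queue = deque([seed])
    # Assumed 6-connectivity: neighbours that share a voxel face.
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

    while queue:
        i, j, k = queue.popleft()
        for di, dj, dk in offsets:
            a, b, c = i + di, j + dj, k + dk
            inside = 0 <= a < ni and 0 <= b < nj and 0 <= c < nk
            if inside and mask[a][b][c] == 0 and intensity[a][b][c] > threshold:
                mask[a][b][c] = 1
                queue.append((a, b, c))
    return mask


# Illustrative 1 x 1 x 5 volume: two bright segments separated by a dark voxel.
example_intensity = [[[10, 10, 0, 10, 10]]]
example_mask = region_grow(example_intensity, seed=(0, 0, 0), threshold=5)
```

Here the bright voxels beyond the dark gap are not reachable from the seed, so they stay at M = 0. This is the automatic clean-up of disconnected lumen volumes described in the text.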

### Level-Set Function and Wall Boundary Condition

The Cartesian grid based on the voxel space of the 3D angiogram also serves as the grid for the flow simulation. In order to define the lumen wall boundary, which is not conformal to the Cartesian grid, a level-set function φ is defined by using the intensity (I) information:

$$\phi\_0(\mathbf{i}, \mathbf{j}, \mathbf{k}) = \mathbf{I}(\mathbf{i}, \mathbf{j}, \mathbf{k}) - \mathbf{I}\_0,\tag{1}$$

where (i, j, k) are grid indices. If the angiogram is noisy, low-pass spatial filtering can be employed as a preprocessing step to smooth the level-set function φ. A filtering scheme for a 3 × 3 × 3 stencil is given by

$$\phi(\mathbf{i}, \mathbf{j}, \mathbf{k}) = \sum\_{\mathbf{p} = -1}^{1} \sum\_{\mathbf{q} = -1}^{1} \sum\_{\mathbf{r} = -1}^{1} \left(\frac{1}{2}\right)^{3} \left(\frac{1}{2}\right)^{\left(|\mathbf{p}| + |\mathbf{q}| + |\mathbf{r}|\right)} \phi\_0(\mathbf{i} + \mathbf{p}, \mathbf{j} + \mathbf{q}, \mathbf{k} + \mathbf{r}). \tag{2}$$
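Equations (1) and (2) can be sketched in a few lines of plain Python. The function names are hypothetical, and leaving the outermost layer of voxels unfiltered is an assumption made here for brevity; the text does not say how the stencil is treated at the edge of the grid. The 27 filter weights sum to one, so a uniform field is left unchanged.

```python
def level_set_from_intensity(intensity, threshold):
    """Eq. (1): phi0(i, j, k) = I(i, j, k) - I0."""
    return [[[value - threshold for value in row] for row in plane]
            for plane in intensity]


def smooth_level_set(phi0):
    """Eq. (2): weighted 3 x 3 x 3 low-pass filter applied to phi0.

    Interior voxels are filtered; voxels on the outer layer of the grid are
    copied unchanged (an assumption of this sketch).
    """
    ni, nj, nk = len(phi0), len(phi0[0]), len(phi0[0][0])
    phi = [[list(row) for row in plane] for plane in phi0]
    for i in range(1, ni - 1):
        for j in range(1, nj - 1):
            for k in range(1, nk - 1):
                total = 0.0
                for p in (-1, 0, 1):
                    for q in (-1, 0, 1):
                        for r in (-1, 0, 1):
                            weight = 0.5 ** 3 * 0.5 ** (abs(p) + abs(q) + abs(r))
                            total += weight * phi0[i + p][j + q][k + r]
                phi[i][j][k] = total
    return phi


# Illustrative check: a single spike of 64 at the centre of a 3 x 3 x 3 grid.
spike = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
spike[1][1][1] = 64.0
smoothed = smooth_level_set(spike)
```

The centre weight is (1/2)<sup>3</sup> = 1/8, so the filtered centre value of the spike is 64/8 = 8.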

The lumen wall surface is defined by φ = 0, and φ > 0 for the hemodynamic flow region (**Figure 3A**). To apply the boundary condition on the lumen wall, the distance from the Cartesian grid point to the wall location is required and this is computed automatically as shown in **Figure 3B** by

$$\mathbf{d}\_{\mathbf{x}} = \frac{\phi}{\partial \phi / \partial \mathbf{x}}, \quad \mathbf{d}\_{\mathbf{y}} = \frac{\phi}{\partial \phi / \partial \mathbf{y}}, \ \mathbf{d}\_{\mathbf{z}} = \frac{\phi}{\partial \phi / \partial \mathbf{z}}, \tag{3}$$

where d<sub>x</sub>, d<sub>y</sub>, and d<sub>z</sub> are the distances to the wall in the x, y, and z directions, respectively. Once these distances are calculated, the wall boundary condition for the flow simulation is imposed as follows. Since the flow equations are solved on the Cartesian grid, the boundary conditions for the flow velocities are applied by imposing the cell face velocity, U<sub>BC</sub>, as shown in **Figure 4**. The value of U<sub>BC</sub> is obtained by interpolation/extrapolation between the velocity on the wall, u<sub>w</sub>, and the velocity in the flow region, u<sub>i</sub>. In the x-direction, for example, for the no-slip, stationary wall (u<sub>w</sub> = 0), if the distance from the Cartesian grid point to the wall, d<sub>x</sub>, is smaller than half of the grid spacing, Δx/2, U<sub>BC</sub> is given by the linear interpolation:

$$\mathbf{U}\_{\rm BC} = \mathbf{u}\_{\rm i} \left( 1 - \frac{\Delta \mathbf{x}}{2 \mathbf{d}\_{\rm x}} \right). \tag{4}$$

If $d\_x > \Delta x/2$, $U\_{BC}$ is calculated by a ghost fluid method (Fedkiw et al., 1999). First, the adjacent grid point outside the flow region is identified and marked as a ghost point. To find the velocity on the ghost point, $u\_{GC}$, the image point in the flow region is found by mirroring the ghost point with respect to the wall position. Note that the distance from the ghost point to the wall is the same as the distance from the wall to the image point. For the no-slip, stationary wall ($u\_w = 0$), therefore, $u\_{GC}$ is given by

$$\mathbf{u}\_{\rm GC} = -\mathbf{u}\_{\rm IM},\tag{5}$$

where $u\_{IM}$ is the velocity on the image point, which can be found by interpolation using the flow velocities on the Cartesian grid points:

$$\mathbf{u}\_{\rm IM} = \mathbf{u}\_{\rm i} + \frac{\mathbf{u}\_{\rm i+1} - \mathbf{u}\_{\rm i}}{\Delta \mathbf{x}} (\Delta \mathbf{x} - 2\mathbf{d}\_{\rm x}).\tag{6}$$

Finally, $U\_{BC}$ is given by

$$\mathbf{U}\_{\rm BC} = \frac{1}{2} (\mathbf{u}\_{\rm GC} + \mathbf{u}\_{\rm i}) = \frac{\mathbf{u}\_{\rm i} - \mathbf{u}\_{\rm i+1}}{\Delta \mathbf{x}} \left( \frac{\Delta \mathbf{x}}{2} - \mathbf{d}\_{\rm x} \right). \tag{7}$$
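Equations (3) to (7) can be combined into a small one-dimensional sketch for the x-direction. The function names, the central-difference estimate of ∂φ/∂x, and the argument layout are assumptions made for illustration; only the formulas themselves come from the text.

```python
def wall_distance(phi_minus, phi_center, phi_plus, dx):
    """Equation (3) in x: d_x = phi / (d phi / d x).

    Assumption of this sketch: the derivative is estimated with a
    central difference over the two neighbouring grid points.
    """
    dphi_dx = (phi_plus - phi_minus) / (2.0 * dx)
    if dphi_dx == 0.0:
        return float("inf")  # level set is flat in x: no wall crossing along x
    return phi_center / dphi_dx


def face_velocity_bc(u_i, u_ip1, d_x, dx):
    """Cell-face velocity U_BC next to a no-slip, stationary wall (u_w = 0).

    u_i   : velocity at the fluid grid point adjacent to the wall
    u_ip1 : velocity at the next grid point further into the flow
    d_x   : distance from the grid point to the wall (positive)
    dx    : grid spacing
    """
    if d_x <= 0.0:
        raise ValueError("d_x must be positive for a fluid grid point")
    if d_x < 0.5 * dx:
        # Equation (4): linear profile through u_w = 0 at the wall and u_i
        return u_i * (1.0 - dx / (2.0 * d_x))
    # Ghost fluid branch, Equations (5)-(7)
    u_im = u_i + (u_ip1 - u_i) / dx * (dx - 2.0 * d_x)  # image point, Eq. (6)
    u_gc = -u_im                                         # ghost point, Eq. (5)
    return 0.5 * (u_gc + u_i)                            # face value, Eq. (7)
```

When the wall lies exactly on the cell face ($d\_x = \Delta x/2$), both branches return zero, so the two formulas join continuously.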

### Inflow Velocity Profile

The flow simulations in the proposed method are performed by imposing the flow velocity in the inflow vessels. It is assumed that the inflow velocity is aligned with the vessel centerline. The radial distribution of the velocity profile is, in general, specified as a combination of a steady parabolic profile and an oscillatory Womersley profile:

$$\mathbf{u}(\mathbf{r}, \mathbf{t}) = \mathbf{u}\_{\text{parabolic}}(\mathbf{r}) + \mathbf{u}\_{\text{oscillation}}(\mathbf{r}, \mathbf{t}; \text{Wo}), \tag{8}$$

where the steady flow profile is prescribed as $u\_{\text{parabolic}}(r) = U\_0(1 - r^2/R^2)$ and the oscillatory profile $u\_{\text{oscillation}}$ is determined in terms of the Womersley number $\text{Wo} = R\sqrt{2\pi f\_0 \rho/\mu}$ (Loudon and Tordesillas, 1998), where $f\_0$ is the heart rate (Hz), R is the radius of the vessel, and ρ and µ are the density and Newtonian viscosity of blood. The oscillatory profile can also be constructed by superposing a number of Fourier modes at harmonics of $f\_0$ to model more realistic inflow profiles (Cebral et al., 2005a; Valencia et al., 2008). The radius of the artery is available directly from the segmentation. The other parameters needed are $U\_0$ and the heart rate $f\_0$. Both may be provided by the user based on patient data or, in the absence of this information, simulations may be carried out for a range of these parameters.
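As a quick check on the magnitudes involved, the Womersley number and the steady part of Equation (8) can be evaluated directly. The function names and the numerical values in the usage lines (vessel radius, heart rate, blood density, and viscosity) are illustrative assumptions, not parameters taken from the study.

```python
import math


def womersley_number(radius, heart_rate_hz, density, viscosity):
    """Wo = R * sqrt(2*pi*f0*rho/mu), with all quantities in SI units."""
    return radius * math.sqrt(2.0 * math.pi * heart_rate_hz * density / viscosity)


def parabolic_profile(r, radius, u0):
    """Steady part of Equation (8): U0 * (1 - r^2/R^2), zero outside the lumen."""
    if r >= radius:
        return 0.0
    return u0 * (1.0 - (r / radius) ** 2)


# Illustrative (assumed) values: R = 2 mm, f0 = 1 Hz, rho = 1060 kg/m^3, mu = 3.5 mPa.s
wo = womersley_number(2.0e-3, 1.0, 1060.0, 3.5e-3)
u_half_radius = parabolic_profile(1.0e-3, 2.0e-3, 0.5)
```

Since Wo grows linearly with R, doubling the vessel radius at a fixed heart rate doubles the Womersley number.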

Because the inflow vessels are not always normal to the boundary of the ROI, the following prescription is applied. First, the local unit vessel centerline vector, $\vec{\mathbf{s}}$, is determined by calculating the vessel center point, $\vec{\mathbf{x}}\_c$, using the masking function, M(i, j, k). For example, if the inflow boundary is at the ROI boundary $k = k\_{\min}$, the vessel center points are calculated at each k index near the boundary by

$$\vec{\mathbf{x}}\_{\mathbf{c}}(\mathbf{k}) = \frac{\sum\_{\mathbf{i} = \mathbf{i}\_{\text{min}}}^{\mathbf{i}\_{\text{max}}} \sum\_{\mathbf{j} = \mathbf{j}\_{\text{min}}}^{\mathbf{j}\_{\text{max}}} \mathbf{M}(\mathbf{i}, \mathbf{j}, \mathbf{k}) \cdot \vec{\mathbf{x}}(\mathbf{i}, \mathbf{j}, \mathbf{k})}{\sum\_{\mathbf{i} = \mathbf{i}\_{\text{min}}}^{\mathbf{i}\_{\text{max}}} \sum\_{\mathbf{j} = \mathbf{j}\_{\text{min}}}^{\mathbf{j}\_{\text{max}}} \mathbf{M}(\mathbf{i}, \mathbf{j}, \mathbf{k})},\tag{9}$$

where $\vec{\mathbf{x}}$ is the grid center coordinate. The local centerline vector, $\vec{\mathbf{s}}$, is then obtained from $\vec{\mathbf{s}} = \vec{\mathbf{x}}\_c(k\_{\min} + \Delta k) - \vec{\mathbf{x}}\_c(k\_{\min})$, normalized to unit length. Once the vessel center point is found at the inflow boundary, the in-plane radius vector, $\vec{\mathbf{R}}$, can be defined at any grid point inside the inflow vessel lumen (see **Figure 5**) by $\vec{\mathbf{R}} = \vec{\mathbf{x}} - \vec{\mathbf{x}}\_c$. To prescribe a radial velocity profile, the radius vector normal to the vessel centerline vector is computed by a vector rejection:

$$\vec{\mathbf{R}}'(\mathbf{i}, \mathbf{j}, \mathbf{k}) = \vec{\mathbf{R}}(\mathbf{i}, \mathbf{j}, \mathbf{k}) - (\vec{\mathbf{R}}(\mathbf{i}, \mathbf{j}, \mathbf{k}) \cdot \vec{\mathbf{s}}) \vec{\mathbf{s}}.\tag{10}$$

The radial inflow velocity profile can be specified using this radius vector. For example, a fully developed, parabolic profile for steady flow is given by

$$\vec{\mathbf{U}}(\mathbf{i}, \mathbf{j}, \mathbf{k}) = \mathbf{U}\_0 \left[ 1 - \left( \frac{|\vec{\mathbf{R}}'|}{\mathbf{R}'\_{\text{max}}} \right)^2 \right] \vec{\mathbf{s}},\tag{11}$$

where $R'\_{\max}$ is the maximum value of $|\vec{\mathbf{R}}'|$ over the inflow boundary. A more realistic, time-dependent velocity profile can also be employed by using Equation (8).
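The inflow prescription of Equations (9) to (11) amounts to a centroid, a normalized difference, and a vector rejection. The sketch below uses plain tuples and dictionaries keyed by (i, j); all function names and the data layout are assumptions made for illustration.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def unit(v):
    norm = math.sqrt(dot(v, v))
    if norm == 0.0:
        raise ValueError("zero-length vector")
    return tuple(x / norm for x in v)


def center_point(mask, coords):
    """Equation (9): mask-weighted mean of grid-center coordinates in one k-plane.

    mask   : dict {(i, j): 0 or 1}
    coords : dict {(i, j): (x, y, z)}
    """
    total = sum(mask.values())
    if total == 0:
        raise ValueError("no lumen voxels in this plane")
    return tuple(sum(mask[ij] * coords[ij][a] for ij in mask) / total for a in range(3))


def centerline_vector(x_c_boundary, x_c_inside):
    """Unit vector s from the center point at k_min to the one at k_min + dk."""
    return unit(tuple(b - a for a, b in zip(x_c_boundary, x_c_inside)))


def radial_rejection(x, x_c, s):
    """Equation (10): component of R = x - x_c normal to the unit centerline vector s."""
    R = tuple(a - b for a, b in zip(x, x_c))
    along = dot(R, s)
    return tuple(r - along * c for r, c in zip(R, s))


def parabolic_inflow(x, x_c, s, r_max, u0):
    """Equation (11): parabolic speed in |R'|, directed along s."""
    r_prime = radial_rejection(x, x_c, s)
    speed = u0 * (1.0 - dot(r_prime, r_prime) / r_max ** 2)
    return tuple(speed * c for c in s)
```

Using the rejection rather than the in-plane radius keeps the profile axisymmetric about the vessel axis even when the vessel crosses the ROI boundary obliquely.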

#### Immersed Boundary Flow Solver

The hemodynamic simulation is performed by solving the incompressible Navier-Stokes equations on the Cartesian grid using the immersed boundary method (Mittal and Iaccarino, 2005). In the present study, a Newtonian fluid assumption is employed and the governing equations for the hemodynamic flow are given by

$$\nabla \cdot \vec{\mathbf{U}} = 0, \quad \rho \frac{\partial \vec{\mathbf{U}}}{\partial t} + \rho (\vec{\mathbf{U}} \cdot \nabla) \vec{\mathbf{U}} + \nabla \mathbf{P} = \mu \nabla^2 \vec{\mathbf{U}}, \tag{12}$$

where $\vec{\mathbf{U}}$ is the flow velocity vector, P is the pressure, and ρ and µ are the density and dynamic viscosity of the blood, respectively. The equations are discretized using second-order finite-difference methods in time and space. The flow solver used in this study is a modified version of the immersed boundary, incompressible flow solver ViCar3D (Mittal et al., 2008). The solver has been extensively validated for a variety of laminar/turbulent flows (Mittal et al., 2008; Vedula et al., 2014), and employed for a wide range of studies of cardiac hemodynamics, including modeling of left ventricular (LV) hemodynamics with natural (Zheng et al., 2012; Seo et al., 2014) and prosthetic mitral valves (Choi et al., 2014), the role of ventricular trabeculae in LV hemodynamics (Vedula et al., 2016), and LV thrombus formation (Seo et al., 2016). The solver is also fully parallelized using a message passing interface (MPI) library, and the performance scales well up to O(1000) processors. As mentioned above, the Cartesian voxel space of the 3D angiogram can be used directly as the Cartesian grid for the flow simulation. For 3D angiograms, the voxel size is about 0.2∼0.3 mm, and this is adequate as the grid spacing for the flow simulation. The flow equations are solved only for the lumen region identified by the masking function, M, and the lumen wall boundary condition is prescribed by the level-set function method described in section Level-Set Function and Wall Boundary Condition.

The procedure for solving Equation (12) is as follows. The second equation of Equation (12) (the momentum equation) is discretized on the Cartesian grid using the second-order finite difference method without the pressure gradient term, and integrated in time using the second-order Crank-Nicolson method to obtain the intermediate velocity field. Applying the continuity equation (the first equation of Equation 12) yields a Poisson equation for the pressure, which is solved using a parallelized bi-conjugate gradient method. Finally, the intermediate velocity field is corrected by adding the pressure gradient term to advance the solution over one time step. A more detailed solution procedure can be found in Mittal et al. (2008).
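The three steps of this procedure can be written compactly as a fractional-step update. The following is a sketch rather than the exact ViCar3D discretization: the intermediate velocity $\vec{\mathbf{U}}^{\*}$, the time-level superscripts n and n+1, the time step Δt, and the Crank-Nicolson treatment of the convective term are notation and assumptions introduced here for illustration.

$$\begin{split} \rho \frac{\vec{\mathbf{U}}^{\*} - \vec{\mathbf{U}}^{n}}{\Delta t} + \frac{\rho}{2} \left[ (\vec{\mathbf{U}}^{\*} \cdot \nabla) \vec{\mathbf{U}}^{\*} + (\vec{\mathbf{U}}^{n} \cdot \nabla) \vec{\mathbf{U}}^{n} \right] &= \frac{\mu}{2} \left( \nabla^2 \vec{\mathbf{U}}^{\*} + \nabla^2 \vec{\mathbf{U}}^{n} \right), \\ \nabla^2 \mathbf{P}^{n+1} &= \frac{\rho}{\Delta t} \nabla \cdot \vec{\mathbf{U}}^{\*}, \\ \vec{\mathbf{U}}^{n+1} &= \vec{\mathbf{U}}^{\*} - \frac{\Delta t}{\rho} \nabla \mathbf{P}^{n+1}. \end{split}$$

Taking the divergence of the last line and substituting the Poisson equation gives $\nabla \cdot \vec{\mathbf{U}}^{n+1} = 0$, which is how the correction step enforces the continuity equation.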

#### Post-processing

Flow-induced forces on the lumen wall (pressure and viscous shear stress) are considered important factors for characterizing the aneurysm rupture risk. Since the pressure gradient in the wall normal direction is usually set to 0 (∂P/∂n = 0), the pressure on the lumen wall can easily be calculated by simple interpolation using the values on Cartesian fluid cells near the wall. The viscous shear stress involves velocity gradients, and is thus calculated as follows. On the Cartesian fluid cell adjacent to the wall, the velocity gradients are calculated by using the boundary velocities at the cell faces, as shown in **Figure 6**:

$$\begin{split} \frac{\partial \mathbf{u}}{\partial \mathbf{x}} &\approx \frac{\mathbf{U}\_{\mathbf{i}+1/2} (1 - \mathbf{M}\_{\mathbf{i}+1,\mathbf{j},\mathbf{k}}) + \mathbf{U}\_{\mathbf{i}-1/2} (1 - \mathbf{M}\_{\mathbf{i}-1,\mathbf{j},\mathbf{k}}) + (\mathbf{M}\_{\mathbf{i}+1,\mathbf{j},\mathbf{k}} - \mathbf{M}\_{\mathbf{i}-1,\mathbf{j},\mathbf{k}}) \mathbf{u}\_{\mathbf{i},\mathbf{j},\mathbf{k}}}{\Delta \mathbf{x}/2}, \\ \frac{\partial \mathbf{u}}{\partial \mathbf{y}} &\approx \frac{\mathbf{U}\_{\mathbf{j}+1/2} (1 - \mathbf{M}\_{\mathbf{i},\mathbf{j}+1,\mathbf{k}}) + \mathbf{U}\_{\mathbf{j}-1/2} (1 - \mathbf{M}\_{\mathbf{i},\mathbf{j}-1,\mathbf{k}}) + (\mathbf{M}\_{\mathbf{i},\mathbf{j}+1,\mathbf{k}} - \mathbf{M}\_{\mathbf{i},\mathbf{j}-1,\mathbf{k}}) \mathbf{u}\_{\mathbf{i},\mathbf{j},\mathbf{k}}}{\Delta \mathbf{y}/2}, \\ \frac{\partial \mathbf{u}}{\partial \mathbf{z}} &\approx \frac{\mathbf{U}\_{\mathbf{k}+1/2} (1 - \mathbf{M}\_{\mathbf{i},\mathbf{j},\mathbf{k}+1}) + \mathbf{U}\_{\mathbf{k}-1/2} (1 - \mathbf{M}\_{\mathbf{i},\mathbf{j},\mathbf{k}-1}) + (\mathbf{M}\_{\mathbf{i},\mathbf{j},\mathbf{k}+1} - \mathbf{M}\_{\mathbf{i},\mathbf{j},\mathbf{k}-1}) \mathbf{u}\_{\mathbf{i},\mathbf{j},\mathbf{k}}}{\Delta \mathbf{z}/2}, \end{split} \tag{13}$$

where Δx, Δy, and Δz are the grid spacings, $M\_{i,j,k}$ is the masking function value, and subscripts denote grid indices. The wall normal vector is given by the gradient of the level-set function as

$$\vec{\mathbf{n}} = \frac{\nabla \phi}{|\nabla \phi|}. \tag{14}$$

For incompressible flow, the normal gradient of the wall normal velocity component on the stationary wall is supposed to be zero. It is found, however, that this is not guaranteed numerically for the present method, because the normal gradient is not directly calculated on the wall. This numerical error scales with the grid spacing (Δx). The viscous wall shear stress is therefore calculated by using the tangential velocity gradient in the wall normal direction as

$$\vec{\tau}\_{\text{w}} = \mu \frac{\partial \vec{\mathbf{u}}\_{\text{t}}}{\partial \mathbf{n}} = \mu \left( \frac{\partial \vec{\mathbf{u}}}{\partial \mathbf{n}} - \frac{\partial \vec{\mathbf{u}}\_{\text{n}}}{\partial \mathbf{n}} \right) = \mu \left\{ \nabla \vec{\mathbf{u}} \cdot \vec{\mathbf{n}} - \left( (\nabla \vec{\mathbf{u}} \cdot \vec{\mathbf{n}}) \cdot \vec{\mathbf{n}} \right) \vec{\mathbf{n}} \right\}, \tag{15}$$

where $\nabla \vec{\mathbf{u}}$ is the velocity gradient tensor, and the subscripts t and n denote the directions tangent and normal to the wall, respectively. The wall shear stress value is stored on the Cartesian fluid cell nearest to the wall and, in post-processing, the value is projected onto the wall.
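Equations (14) and (15) translate into a short routine once the velocity gradient tensor is available at a near-wall cell. In this sketch the function names and the index convention `grad_u[a][b] = ∂u_a/∂x_b` are assumptions; the tensor itself would come from Equation (13).

```python
import math


def wall_normal(grad_phi):
    """Equation (14): unit wall normal from the level-set gradient."""
    norm = math.sqrt(sum(g * g for g in grad_phi))
    if norm == 0.0:
        raise ValueError("level-set gradient vanishes")
    return tuple(g / norm for g in grad_phi)


def wall_shear_stress(grad_u, n, mu):
    """Equation (15): mu * (du/dn minus its wall-normal part).

    grad_u[a][b] is assumed to hold d u_a / d x_b.
    """
    # Directional derivative of the velocity vector along the wall normal
    dudn = [sum(grad_u[a][b] * n[b] for b in range(3)) for a in range(3)]
    # Normal component, which should vanish on a stationary wall but may not numerically
    normal_part = sum(dudn[a] * n[a] for a in range(3))
    return tuple(mu * (dudn[a] - normal_part * n[a]) for a in range(3))
```

Subtracting the normal part means that a spurious nonzero ∂u_n/∂n, the numerical error discussed above, does not leak into the computed shear stress.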

#### Patient-Specific Cases

The developed simulation method is tested with patient-specific angiogram data. A total of seven anonymized patient-specific cases are selected from the Johns Hopkins University Intracranial Aneurysm Database (JHUIAD) so as to provide a range of aneurysm morphologies. The 3D angiograms for these cases are shown in **Figure 7**. The aneurysms are categorized into 3 types (fusiform, saccular, sidewall), and the size parameter, SR (size ratio: the ratio of aneurysm maximal length to the parent vessel diameter; Rahman et al., 2010) is listed in **Table 1** for these cases.

#### RESULTS

The developed method has been applied to the set of seven patient-specific cases shown in **Figure 7**. The cases include 1 fusiform aneurysm at a branching (case A), 2 saccular aneurysms located at a bifurcation (cases C and D), and 4 sidewall saccular aneurysms (cases B, E–G). For the given angiogram data, an experienced neurosurgeon performed the manual procedures described in section Region of Interest and Boundary Conditions. The neurosurgeon set the appropriate threshold intensity value and ROI, and provided the region growing seed point and the flow direction. The threshold intensity values differ for each case based on the overall contrast of the images, and are chosen for the best representation of the morphology. The Cartesian ROI is determined so as to include a sufficient length of the vessels both upstream and downstream of the target aneurysm. For cases with strong curvature of the vessel upstream of the aneurysm, the ROI is extended to include the upstream curved vessels. This is done so as to incorporate the effects of complex upstream flow on the aneurysm, and to minimize the artifacts due to the truncation of the domain. The lumen regions are then automatically identified by using the region-growing algorithm described in section Automatic Segmentation and Cleaning, and the results are presented in **Figure 8** for the sample cases. This shows that the present algorithm is capable of identifying the aneurysm and connected vessels for various types of cerebral aneurysms.

The proposed simulation procedure and level-set based, immersed boundary flow solver have been applied to the prepared patient-specific aneurysm data shown in **Figure 8**. The computational domain covers the ROI and the 3D voxel space is directly used as the Cartesian grid for the flow simulations. The computations employed up to about 2 million Cartesian grid points with an isotropic resolution of 0.2∼0.27 mm, depending on the ROI size and the voxel resolution. For the present flow simulations, a steady inflow velocity of 0.5 m/s, which is in the range of patient-specific blood flow speeds reported in a previous study (Valencia et al., 2008), is applied to all cases. Flow simulations are performed for 5 s of real time, which takes about 3 h with 48 CPU cores on the MARCC (Maryland Advanced Research Computing Center) cluster for each case. The flow simulation results are presented in **Figure 9**, where the flow patterns are visualized via streamlines. Overall, the streamlines are tangent to the axial direction of the vessels, but as one can see in the figure, the curved vessels generate swirling flow patterns in the streamwise direction, i.e., streamwise vorticity (see **Figures 9B,D–G**). The wall shear stress (WSS) is then computed in post-processing by the method described in section Post-processing. The computed magnitude of WSS on the lumen wall surface is shown in **Figure 10**. Note that the boundary is not represented by a surface mesh but by the iso-contour of the level set function, φ = 0. Since the simulations are not performed with patient-specific inflow conditions, the overall magnitude of the WSS results may be outside the physiological range. Thus, only a comparative analysis of the WSS for the various cases is appropriate here. For most cases, the WSS magnitude is low on the aneurysm wall, and high on the aneurysm neck and the walls of the parent vessel. For some cases, however (cases E–G), locally high WSS values are observed on the aneurysm wall. The results show that the present level-set based immersed boundary flow solver can resolve the hemodynamics for cerebral aneurysms with a wide range of shapes.

TABLE 1 | Types and size parameters for the patient-specific aneurysm cases.

SR, the ratio of aneurysm maximal length to the parent vessel diameter.

FIGURE 8 | Automatically segmented aneurysm and vessel geometries using the present region-growing algorithm for the sample cases (A–G). Target aneurysm and inflow direction are marked.

The hemodynamic metrics normalized suitably with the inflow velocity are listed in **Table 2** for all cases. The WSS on the aneurysm wall is normalized by the inflow velocity, and its maximum (max), average over the aneurysm wall (avg), and variation (var) are calculated. In order to assess the overall flow strength inside the aneurysm, the normalized velocity magnitude is averaged over the volume inside the aneurysm and presented in **Table 2** as well. The results are discussed in the following section.
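The wall metrics can be sketched as a small reduction over the WSS magnitudes sampled on the aneurysm wall, using the normalization $|\vec{\tau}\_w|/(\rho U\_0^2)$ given with **Table 2**. The function name is hypothetical, and two choices are assumptions rather than statements of the authors' procedure: the samples are treated as equally weighted (no area weighting), and "variation" is taken to be the population variance.

```python
from statistics import mean, pvariance


def normalized_wss_metrics(wss_magnitudes, rho, u0):
    """Max, average, and variance of |tau_w| / (rho * U0^2) over wall samples.

    Assumptions of this sketch: equal weight per sample, and 'var'
    interpreted as the population variance.
    """
    scale = rho * u0 ** 2
    tau_star = [t / scale for t in wss_magnitudes]
    return {
        "max": max(tau_star),
        "avg": mean(tau_star),
        "var": pvariance(tau_star),
    }
```

If the wall samples represent unequal surface areas, an area-weighted average would be the natural replacement for `mean`.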


### DISCUSSION

In the present study, a highly automated method to perform hemodynamic modeling of cerebral aneurysms using patient-specific angiogram data has been proposed. The key idea is the direct use of voxelized contrast information from the 3D angiograms to construct a level-set function for the flow simulation with the immersed boundary method on a Cartesian grid. In this approach, the target aneurysm and vessels of interest can be segmented automatically, and no body-conformal surface/volume meshes need to be generated for the flow simulation. Cartesian grid methods for the simulation of aneurysm hemodynamics were reported in previous studies for the simple volume penalization method using the masking function (Mikhal and Geurts, 2014), the lattice Boltzmann method (He et al., 2009; Závodszky and Paál, 2013), and the boundary data immersion method (Otani et al., 2018). In the present study, we employ the masking function approach for the automatic segmentation of the vessel/aneurysm from the medical imaging data, and the wall boundaries are represented by the level-set function constructed directly from the image intensity information. For the simulation of aneurysm hemodynamics on the Cartesian grid, a well validated, "sharp-interface" immersed boundary method (Mittal et al., 2008) is adopted, and this solver can provide high resolution, high fidelity flow simulation results because the boundary conditions are imposed precisely on the identified wall location. By employing the present method, the manual operations required of a user to conduct hemodynamic simulations with patient-specific angiogram data can be minimized, and this should, in principle, enable us to scale up the hemodynamic modeling to a very large number of sample cases.

The method developed in this study has been tested for a set of seven patient-specific cases picked from the Johns Hopkins University Intracranial Aneurysm Database (JHUIAD). Although the sample size in the current study is small, the cases involve a variety of aneurysm morphologies, sizes, and locations (see **Figure 7** and **Table 1**). The developed algorithm successfully segmented various types of aneurysm and connected vessels within the ROI automatically, as shown in **Figure 8**. The hemodynamic simulations for each case are then also performed automatically by the level-set based immersed boundary flow solver, and the results are presented in **Figures 9**, **10** and **Table 2**.

The present simulation results show that the values and the distribution of the wall shear stress (WSS) are very different for each patient-specific case. It should be noted that, although the peak WSS values in **Figure 10** are in the range of reported values (Shojima et al., 2004), the current simulations are not performed with patient-specific inflow conditions, and thus the WSS values could be over-predicted (McGah et al., 2014). Thus, a comparative analysis of the WSS is warranted here. The present simulation results show that the aneurysms formed around the vessel branching/bifurcation (A, C, D) are exposed to low WSS in general. On the other hand, for the aneurysms on the sidewall of high curvature vessels (B, E–G), a higher WSS is observed, especially in a local region of the aneurysm wall. These observations are in line with previous computational studies (Castro et al., 2009; Valen-Sendstad and Steinman, 2014; Cebral et al., 2015). For bifurcation aneurysms, the flow inside the aneurysm is relatively weak (normalized average velocity magnitude: 0.003∼0.04), and high values of WSS are observed only around the aneurysm neck and on the walls of the parent vessels (see **Figures 10A,C,D**; Shojima et al., 2004; Valencia et al., 2008; Castro et al., 2009). The WSS values on the aneurysm wall are consistently higher for the sidewall aneurysms than for those at a bifurcation. This is because the high curvature of the vessel upstream of the aneurysm results in a more complex flow pattern and allows stronger flow inside the aneurysm, as one can see in **Figures 9B,E–G**. Locally high WSS is observed at the locations where the flow impinges on or attaches to the aneurysm wall (Shojima et al., 2004; Castro et al., 2009; Cebral et al., 2015). The average velocity magnitude inside the aneurysm listed in **Table 2** clearly shows this trend. The normalized average velocity magnitude for the sidewall aneurysms (0.13∼0.2) is about an order of magnitude higher than that for the bifurcation aneurysms. The increase of flow strength and WSS for a saccular aneurysm on the sidewall of a curved vessel was reported in a previous study (Meng et al., 2006). The present simulations show that the strong curvature of the upstream vessel can also affect the aneurysm hemodynamics (see **Figures 9B,E,G**). This implies that the ROI for the aneurysm hemodynamics simulation needs to be carefully chosen to include the effects of the upstream vessel.

$\tau\_w^{\*} = |\vec{\tau}\_w| / (\rho U\_0^2)$: wall shear stress normalized by the inflow into the artery. Subscripts max, avg, and var denote the maximum, average, and variation over the aneurysm wall, respectively. $(|\vec{\mathbf{u}}| / U\_0)\_{\text{avg}}$: normalized magnitude of velocity averaged over the aneurysm volume.

For all the cases (A–G), the WSS varies significantly, over a range of an order of magnitude, due to the different flow characteristics (normalized average WSS: 0.00062∼0.025). For the present cases, the WSS is not correlated with the size parameter (SR). However, once the aneurysms are categorized by type or location, the WSS values in the same category show a similar order of magnitude. For example, the saccular aneurysms on the sidewall (cases B, E–G) present higher average WSS values (0.012∼0.025), while the bifurcation aneurysms (A, C, D) show lower values (0.00062∼0.0047). This suggests that a proper categorization of aneurysm morphology is essential for reliable statistical analysis, and it also emphasizes the need for a large number of samples. Automation of the processes from patient imaging to hemodynamic modeling and post-processing, such as is presented here, would enable the scaling up of these models to very large sample sizes. Quantitative information regarding the hemodynamics of aneurysms obtained and analyzed for tens of thousands of cases could lead to fresh insights and new metrics regarding the factors that are responsible for aneurysm growth and rupture.

While the present study demonstrates that the method described here is capable of conducting simulations of aneurysm hemodynamics with very limited human intervention, the method has some limitations. First, there are still a significant number of user-defined features and actions, such as the determination of segmentation criteria and the ROI size and the identification of inflow/outflow vessels, and these should be reduced to further automate the process. This could be accomplished by employing advanced image processing algorithms and methods such as machine learning. Second, the current method employs a fully developed inflow velocity profile, but if the upstream vessel has high curvature, a fully developed profile may not be valid. This issue could be addressed in a number of ways, including by setting the ROI to avoid high curvature at the inflow boundary. For the outflow boundary condition, a traction-free condition is used in the present simulations. For more realistic hemodynamic modeling, the downstream boundary could employ lumped-element models, which are quite well established in cardiovascular modeling (Esmaily-Moghadam et al., 2013; Min et al., 2015). Finally, in the present method, the voxel spacing of the angiogram data is directly used as the Cartesian grid spacing for the flow simulation. However, this grid resolution may not be sufficient, especially for the smaller vessels. A simple resampling method based on the subdivision of the voxels can be employed to increase the Cartesian grid resolution for the flow simulation.

### AUTHOR CONTRIBUTIONS

J-HS and RM conception and design of research. JC and RT prepared and provided the data. J-HS and PE performed computations. J-HS and PE analyzed data. J-HS, PE, JC, and RM interpreted results of computations. J-HS prepared figures. J-HS and RM drafted manuscript. J-HS, RM, JC, and RT edited and revised manuscript. J-HS, PE, RM, JC, and RT approved final version of manuscript.

### FUNDING

This work has been supported by the Johns Hopkins University IDIES Seed Funding Program. Support from NSF grants CBET-1511200 and IIS-1344772 is also acknowledged.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Seo, Eslami, Caplan, Tamargo and Mittal. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Towards a Computational Framework for Modeling the Impact of Aortic Coarctations Upon Left Ventricular Load

Elias Karabelas<sup>1</sup>\*, Matthias A. F. Gsell<sup>1</sup>, Christoph M. Augustin<sup>1,2</sup>, Laura Marx<sup>1</sup>, Aurel Neic<sup>1</sup>, Anton J. Prassl<sup>1</sup>, Leonid Goubergrits<sup>3,4</sup>, Titus Kuehne<sup>3,4</sup> and Gernot Plank<sup>1</sup>\*

<sup>1</sup> Computational Cardiology Laboratory, Institute of Biophysics, Medical University of Graz, Graz, Austria, <sup>2</sup> Shadden Research Group, Department of Mechanical Engineering, University of California, Berkeley, Berkeley, CA, United States, <sup>3</sup> Department of Congenital Heart Disease/Pediatric Cardiology, German Heart Institute Berlin, Berlin, Germany, <sup>4</sup> Institute for Imaging Science and Computational Modeling in Cardiovascular Medicine, Charité - University Medicine Berlin, Berlin, Germany

#### Edited by:

Mariano Vázquez, Barcelona Supercomputing Center, Spain

#### Reviewed by:

Joakim Sundnes, Simula Research Laboratory, Norway

Chris Patrick Bradley, University of Auckland, New Zealand

#### \*Correspondence:

Elias Karabelas, elias.karabelas@medunigraz.at

Gernot Plank, gernot.plank@medunigraz.at

#### Specialty section:

This article was submitted to Computational Physiology and Medicine, a section of the journal Frontiers in Physiology

Received: 15 December 2017

Accepted: 26 April 2018

Published: 28 May 2018

#### Citation:

Karabelas E, Gsell MAF, Augustin CM, Marx L, Neic A, Prassl AJ, Goubergrits L, Kuehne T and Plank G (2018) Towards a Computational Framework for Modeling the Impact of Aortic Coarctations Upon Left Ventricular Load. Front. Physiol. 9:538. doi: 10.3389/fphys.2018.00538

Computational fluid dynamics (CFD) models of blood flow in the left ventricle (LV) and aorta are important tools for analyzing the mechanistic links between myocardial deformation and flow patterns. Typically, the use of image-based kinematic CFD models prevails in applications such as predicting the acute response to interventions which alter LV afterload conditions. However, such models are limited in their ability to analyze any impacts upon LV load or key biomarkers known to be implicated in driving remodeling processes, as LV function is not accounted for in a mechanistic sense. This study addresses these limitations by reporting on progress made toward a novel electro-mechano-fluidic (EMF) model that represents the entire physics of LV electromechanics (EM) based on first principles. A biophysically detailed finite element (FE) model of LV EM was coupled with an FE-based CFD solver for moving domains using an arbitrary Lagrangian-Eulerian (ALE) formulation. Two clinical cases of patients suffering from aortic coarctations (CoA) were built and parameterized based on clinical data under pre-treatment conditions. For one patient case, simulations under post-treatment conditions after geometric repair of CoA by a virtual stenting procedure were compared against pre-treatment results. Numerical stability of the approach was demonstrated by analyzing mesh quality and solver performance under the large deformations of the LV blood pool. Further, computational tractability and compatibility with clinical time scales were investigated by performing strong scaling benchmarks up to 1536 compute cores. The overall cost of the entire workflow for building, fitting and executing EMF simulations was comparable to those reported for image-based kinematic models, suggesting that EMF models show potential of evolving into a viable clinical research tool.

Keywords: cardiac mechanics, computational fluid dynamics, finite element model, arbitrary Lagrangian-Eulerian formulation, patient-specific modeling, translational cardiac modeling, total heart function

## 1. INTRODUCTION

CFD models of blood flow in the LV and aorta are important tools for analyzing the mechanistic links between myocardial deformation and flow patterns. Typically, such models are either driven by prescribed flow profiles measured in the LV outflow tract or the aortic root (Goubergrits et al., 2013; Ralovich et al., 2015; Andersson et al., 2017), or by image-based kinematic models (Doenst et al., 2009; Schenkel et al., 2009; Mihalef et al., 2011; Seo et al., 2013; Chnafa et al., 2014; Su et al., 2016) built from segmentation of 4D medical imaging datasets. While such models have proven to be valuable for analyzing the hemodynamic status quo of a patient or for predicting changes in hemodynamics in the aorta secondary to an intervention such as aortic valve repair (Kelm et al., 2017) or stenting of a coarctation (Goubergrits et al., 2015), they are inherently limited in their ability to assess cardiac function, as the biophysics driving myocardial activation and deformation is not taken into consideration in the model formulation. EMF models that capture the entire physics of a heartbeat based on first principles show promise to overcome this limitation (Crozier et al., 2016a) by rendering feasible the assessment of all essential myocardial parameters, which are known to be key factors driving ventricular remodeling and disease progression. Thus EMF models may offer, in principle, the potential of predicting longer term outcomes beyond changes in the acute response to therapies.

However, due to a number of factors such as the inherent complexity of multiphysics models, the large-scale motion and complex deformation of the myocardial walls, and the significant computational burden, these models pose substantial methodological challenges. For LV EMF models and similar applications, methods to overcome the problem of large-scale deformations can be roughly classified into two categories: ALE formulations using a moving fluid mesh (Tang et al., 2008, 2010; Nordsletten et al., 2011; Vázquez et al., 2015; de Vecchi et al., 2016) and immersed boundary (IB) methods (Vigmond et al., 2008a; Seo and Mittal, 2013; Choi et al., 2015). While ALE formulations often rely on severe simplifications or automatic remeshing strategies (Long et al., 2013), IB methods are more versatile as the moving wall of the ventricle is not explicitly tracked. However, IB and all related non-boundary-fitting methods have a reduced accuracy for the solution near the fluid-solid interface due to interpolation errors, pose severe implementation challenges, and require additional degrees of freedom on interface cut elements, all of which contribute to significantly higher computational costs (van Loon et al., 2007).

In this study, we report on the progress made toward a novel EMF model of the human LV that is entirely based on first principles and that copes with large deformations, i.e., ejection fractions (EFs) beyond 60%, without requiring remeshing or IB principles. Validated in silico models, taken from a recent clinical modeling study in which a cohort of in silico EM LV and aorta models of patients suffering from aortic valve disease (AVD) and/or CoA (Augustin et al., 2016a) were built, served as the kinematic driver for a computational model of hemodynamics in the LV cavity and aorta. A hybrid two-stage modeling approach was adopted with regard to hemodynamics. First, the afterload imposed by the circulatory system onto the LV was represented by a lumped model of afterload and coupled to an EM model of LV and aorta to compute LV kinematics. Subsequently, a full-blown CFD model with moving domain boundaries based on an ALE formulation was unidirectionally or weakly coupled to the EM model using the kinematics of its endocardial surface as input. We show validation results for two selected clinical CoA cases under pre-treatment conditions and compare pre-treatment and post-treatment simulation results for one patient case in which the CoA was geometrically repaired by a virtual stenting procedure. Further, we demonstrate numerical feasibility of the implemented approach by analyzing changes in mesh quality and their impact upon solver performance under the large deformations of the LV blood pool mesh, and also provide strong scaling benchmarking results for a range of 96–1,536 compute cores. The overall cost of the entire workflow for building, fitting and execution of EMF simulations is ≈ 48 h, which is comparable to plain image-based kinematic driver models (Mittal et al., 2016).

## 2. METHODS

The methodology to develop a coupled model of cardiac and cardiovascular hemodynamics based on an ALE formulation is structured as follows.


### 2.1. Clinical Data Acquisition and Model Generation

Hemodynamic data of two patients with clinical indication for catheterization due to CoA, all preceding a cardiac magnetic resonance study, were acquired before and after CoA treatment by stent implant, see **Table 1**. CoA treatment indicators included an echocardiographically measured peak systolic pressure gradient across the stenotic region of > 20 mmHg and/or arterial hypertension. The study was approved by the institutional research ethics committee following the ethical guidelines of the 1975 Declaration of Helsinki. Written informed consent was obtained from the participants' guardians. Acquired data are summarized in **Table 1**.

TABLE 1 | CoA patient characteristics from MRI and invasive catheter pressure recordings including end-diastolic volume (EDV), end-systolic volume (ESV), stroke volume (SV), ejection fraction (EF), heart rate (HR), cardiac output (CO), diastolic and systolic pressures recorded in the aorta or estimated from cuff measurements ($P\_{\mathrm{ao/cuff,dia}}$ and $P\_{\mathrm{ao/cuff,sys}}$), mean arterial pressure (MAP) computed from pressure recorded invasively in the aorta or estimated from $P\_{\mathrm{cuff,dia}}$ and $P\_{\mathrm{cuff,sys}}$, and aortic valve opening pressure $P\_{\mathrm{open}}$ determined from invasive pressure recordings.
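The volumetric indices listed in the caption of Table 1 are linked by standard definitions (SV = EDV − ESV, EF = SV/EDV, CO = SV · HR). The following minimal sketch only illustrates these relations; the function names are hypothetical, and any numbers passed in are illustrative rather than patient data from this study.

```python
def stroke_volume(edv_ml: float, esv_ml: float) -> float:
    """Stroke volume SV = EDV - ESV, in mL."""
    return edv_ml - esv_ml


def ejection_fraction(edv_ml: float, esv_ml: float) -> float:
    """Ejection fraction EF = SV / EDV, returned as a fraction in [0, 1]."""
    return stroke_volume(edv_ml, esv_ml) / edv_ml


def cardiac_output(edv_ml: float, esv_ml: float, hr_bpm: float) -> float:
    """Cardiac output CO = SV * HR, converted from mL/min to L/min."""
    return stroke_volume(edv_ml, esv_ml) * hr_bpm / 1000.0
```

For instance, `ejection_fraction(120.0, 48.0)` evaluates the EF for a hypothetical ventricle with an EDV of 120 mL and an ESV of 48 mL.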

#### 2.1.1. MRI Acquisition and Post Processing

MR imaging was done with a whole body 1.5 Tesla MR scanner Achieva R 3.2.2.0 using a five-element cardiac phased-array coil (Philips Medical System, Best, Netherlands). Three MRI sequences were used further in our study: (i) flow-sensitive four-dimensional (4D) velocity-encoded magnetic resonance imaging (4D VEC-MRI), (ii) three-dimensional (3D) anatomical imaging of the whole heart (3DWH) during diastasis, and (iii) 4D gapless short axis Cine MRI.

4D VEC-MRI of the thorax was performed using an anisotropic 4D segmented k-space phase contrast gradient echo sequence. Retrospective electrocardiographic gating was used without navigator gating of respiratory motion in order to minimize acquisition time. Sequence parameters were: acquired voxel 2.5 × 2.5 × 2.5 mm; reconstructed voxel 1.7 × 1.7 × 2.5 mm; repetition time 3.5 ms; echo time 2.2 ms; flip angle 5°; 25 reconstructed cardiac phases; number of signal averages 1. High velocity encoding (3–6 m/s) in all three directions was used in order to avoid phase wraps in the presence of coarctation and associated secondary flow. Flow measurements were completed with automatic correction of concomitant phase errors. Postprocessing for analysis of flow rates across the aortic valve was carried out with GTFlow 1.6.8 software<sup>1</sup> (Gyrotools, Zurich, Switzerland).

The 3DWH exemplary sequence parameters were: acquired voxel 0.66 × 0.66 × 3.2 mm; reconstructed voxel 0.66 × 0.66 × 1.6 mm; repetition time 4.0 ms; echo time 2.0 ms; flip angle 90°; and number of signal averages 3.

Short-axis Cine imaging data were acquired with sequence parameters: 16 slices, with an acquisition resolution of 0.86 × 0.86 × 6.0 mm, repetition time 4.24 ms, echo time 2.12 ms, flip angle 60° and 25 automatically reconstructed cardiac phases which were used to determine LV volume traces. The noncompact myocardium as well as papillary muscles were counted toward blood pool volume.

MRI-based pressure mapping, which allows noninvasive assessment of the relative pressures in a vessel by solving the pressure Poisson equation (PPE), was done with MevisFlow<sup>2</sup>. Briefly, the PPE can be derived from the Navier–Stokes equations by taking the divergence of the momentum Equation (26), see Gresho and Sani (1987) and Krittian et al. (2012) for more details. The processing and analysis pipeline of the pressure mapping consists of the following four steps.

Relative pressure maps are represented with zero pressure located at the center of the CoA (narrowest location). A 3D mask based on 3DWH data was used due to its better spatial resolution compared to 4D VEC-MRI data. Correction of velocity data (step ii) was done in order to minimize noise and aliasing artifacts originating from multiple sources.

#### 2.1.2. Invasive Catheter Recordings

During catheterization, pressure was recorded over the cardiac cycle in the ascending aorta and the LV before treatment and repeated in the ascending aorta after an interventional treatment procedure was performed. Pressures were recorded simultaneously at three predefined locations (LV, ascending aorta, and descending aorta) and the femoral artery during catheterization. Patients were sedated by intravenous administration of a bolus of midazolam (0.1–0.2 mg/kg, max. 5 mg), followed by a bolus of propofol (1–2 mg/kg, as needed) and continuous infusion of propofol (1–2 mg/kg, as needed). Pressure measurements were taken with senior cardiologists present. Pigtail catheters (Cordis, Warren, NJ, USA) of 5-6F were connected to pressure transducers (Becton-Dickinson, Franklin Lakes, NJ, USA). Routinely, patients received balloon angioplasty with or without additional placement of a stent in order to treat a given stenosis by removing the narrowing of the vessel and thus the pressure gradient. To reduce duration of catheterization, pressures were measured post-treatment only in the ascending aorta. The Schwarzer hemodynamic analysis system (Schwarzer, Heilsbronn, Germany) was used to amplify, acquire, and analyze pressure signals.

<sup>1</sup>http://www.gyrotools.com/products/gt-flow.html

<sup>2</sup>https://www.mevis.fraunhofer.de/en/solutionpages/mevisflow-non-invasiveinteractive-exploration-of-in-vivo-hemodynamics.html

#### 2.1.3. Anatomical FE Model Generation

Multi-label segmentation of the LV myocardium, LV blood pool, left atrium (LA) and aortic cavities was done at the DHZB using 3DWH data and the ZIB Amira software<sup>3</sup> (Stalling et al., 2005). The segmentations were smoothed and upsampled to a 0.1 mm isotropic resolution using a variational smoothing method (Crozier et al., 2016a). The resulting high resolution multi-label segmentation was meshed using CGAL<sup>4</sup> (The CGAL Project, 2017), giving a global mesh $\Omega^{0}\_{\mathrm{s,total}}$ consisting of tetrahedral elements. Here, $(\bullet)^{0}$ denotes the mechanical reference configuration at end-diastolic pressure. The mesh was subdivided into various subdomains corresponding to predefined labels which are summarized in Equation (3). We write

$$\Omega^{0}\_{\text{s,total}} = \bigcup\_{i \in I} \Omega^{0}\_{\text{s},i} \tag{1}$$

with the index set

$$I := \{ \text{lv, ao, cushion, av, mv, lvbp, aobp} \}, \tag{2}$$

see **Figures 1E–G** for illustration. The elements in the index set are abbreviations for the following labels

$$\begin{aligned} \text{lv} & \leftrightarrow \text{Myocardium}, \\ \text{ao} & \leftrightarrow \text{Aortic wall}, \\ \text{cushion} & \leftrightarrow \text{Elastic cushion}, \\ \text{av} & \leftrightarrow \text{Aortic valve}, \\ \text{mv} & \leftrightarrow \text{Mitral valve}, \\ \text{lvbp} & \leftrightarrow \text{Left ventricular blood pool}, \\ \text{aobp} & \leftrightarrow \text{Aortic blood}. \end{aligned} \tag{3}$$

With this, we define the following submeshes

$$
\Omega^0\_{\mathrm{s}} := \Omega^0\_{\mathrm{s,total}} \backslash \left( \Omega^0\_{\mathrm{s,lvbp}} \cup \Omega^0\_{\mathrm{s,aobp}} \right), \tag{4}
$$

$$
\Omega^0\_{\rm s,bp} = \widetilde{\Omega}^0\_f := \Omega^0\_{\rm s,av} \cup \Omega^0\_{\rm s,lvbp} \cup \Omega^0\_{\rm s,aobp}, \tag{5}
$$

where $\Omega^0\_{\mathrm{s}}$ is the solid domain and $\Omega^0\_{\mathrm{s,bp}}$ is the unsmoothed blood pool domain used for extracting a smoothed CFD mesh, see **Figures 1E,F**. For later use, we define the following surfaces

$$
\Gamma^{0}\_{\mathrm{s,N}} := \partial \left( \left( \Omega^{0}\_{\mathrm{s,lv}} \cup \Omega^{0}\_{\mathrm{s,av}} \cup \Omega^{0}\_{\mathrm{s,mv}} \right) \cap \Omega^{0}\_{\mathrm{s,lvbp}} \right), \tag{6}
$$

$$
\Gamma^{0}\_{\text{s,H}} := \partial \Omega^{0}\_{\text{s}} \backslash \left(\Gamma^{0}\_{\text{s,N}} \cup \Gamma^{0}\_{\text{s,D}}\right), \tag{7}
$$

$$
\Gamma^0\_{\rm s,bp} := \partial \Omega^0\_{\rm s,bp} \backslash \Gamma^0\_{\rm s,D}, \tag{8}
$$

<sup>3</sup>https://amira.zib.de

<sup>4</sup>http://www.cgal.org

where $\Gamma^0\_{\mathrm{s,D}}$ denote the cutoff faces as indicated by blue lines in **Figure 1**; $\Gamma^0\_{\mathrm{s,N}}$ are surfaces subject to pressure; and $\Gamma^0\_{\mathrm{s,H}}$ are surfaces with homogeneous Neumann boundary conditions. In order to avoid numerical difficulties with non-smooth, jagged boundaries, the surface of the mechanical blood pool domain $\Gamma^0\_{\mathrm{s,bp}}$ was extracted and smoothed using the VMTK toolbox<sup>5</sup> (Antiga et al., 2008). The smoothed surface, $\Gamma^0\_{\mathrm{f,wall}}$, was used to define the boundary of the fluid domain reference configuration, $\Omega^0\_{\mathrm{f}}$, for volumetric FE meshing using ANSYS ICEM CFD<sup>6</sup>. Refined boundary layers were included in this process to better resolve sharp gradients in the vicinity of $\Gamma^0\_{\mathrm{f,wall}}$ occurring during simulation of hemodynamics. The various processing stages for building EM and CFD models are illustrated in **Figures 1, 4**, respectively.
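The label-based subdivision of Equations (1)–(5) can be illustrated with a small sketch that selects elements by tag. It assumes a hypothetical per-element array of label strings (the actual mesh data structures used by the authors are not described here) and returns boolean masks for the solid domain of Equation (4) and the unsmoothed blood pool of Equation (5).

```python
import numpy as np

# Index set I from Equation (2).
LABELS = ["lv", "ao", "cushion", "av", "mv", "lvbp", "aobp"]


def split_domains(element_labels):
    """Return (solid_mask, blood_pool_mask) for a list of per-element labels.

    solid_mask      : elements of the total mesh minus lvbp and aobp (Eq. 4)
    blood_pool_mask : elements tagged av, lvbp or aobp (Eq. 5)
    """
    labels = np.asarray(element_labels)
    unknown = set(labels.tolist()) - set(LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    solid_mask = ~np.isin(labels, ["lvbp", "aobp"])
    blood_pool_mask = np.isin(labels, ["av", "lvbp", "aobp"])
    return solid_mask, blood_pool_mask
```

Note that, following Equations (4) and (5) as written, elements labeled `av` belong to both the solid domain and the unsmoothed blood pool domain.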

### 2.2. Electromechanical Model

#### 2.2.1. Electrophysiology of the LV

A recently developed reaction-eikonal (R-E) model (Neic et al., 2017) was employed to generate electrical activation sequences which serve as a trigger for active stress generation in cardiac tissue. The hybrid R-E model combines a standard reaction-diffusion (R-D) model based on the monodomain equation with an eikonal model. Briefly, the eikonal equation is given as

$$\begin{cases} \sqrt{\nabla\_{\mathbf{X}} t\_{\mathrm{a}}^{\top} \, \mathbf{V} \, \nabla\_{\mathbf{X}} t\_{\mathrm{a}}} = 1 & \text{in } \Omega^{0}\_{\mathrm{s,lv}}, \\ t\_{\mathrm{a}} = t\_{0} & \text{on } \Gamma^{0}\_{\mathrm{s},\ast}, \end{cases} \tag{9}$$

where $\nabla\_{\mathbf{X}}$ is the gradient with respect to the end-diastolic reference configuration $\Omega^0\_{\mathrm{s,lv}}$; $t\_{\mathrm{a}}$ is a positive function describing the wavefront arrival time at location $\mathbf{X} \in \Omega^0\_{\mathrm{s,lv}}$; and $t\_0$ are initial activations at locations $\Gamma^0\_{\mathrm{s},\ast} \subseteq \Gamma^0\_{\mathrm{s,N}}$. The symmetric positive definite 3 × 3 tensor $\mathbf{V}(\mathbf{X})$ holds the squared velocities $v\_{\mathrm{f}}(\mathbf{X})$, $v\_{\mathrm{s}}(\mathbf{X})$, $v\_{\mathrm{n}}(\mathbf{X})$ associated with the tissue's eigenaxes, referred to as fiber, $\mathbf{f}\_0$, sheet, $\mathbf{s}\_0$, and sheet normal, $\mathbf{n}\_0$, orientations. The arrival time function $t\_{\mathrm{a}}(\mathbf{X})$ was subsequently used in a modified monodomain R-D model given as

$$
\beta \text{C}\_{\text{m}} \frac{\partial V\_{\text{m}}}{\partial t} = \nabla\_{\text{X}} \cdot \sigma\_{\text{i}} \nabla\_{\text{X}} V\_{\text{m}} + I\_{\text{foot}} - \beta I\_{\text{ion}}, \tag{10}
$$

where an arrival time dependent foot current, $I\_{\mathrm{foot}}(t\_{\mathrm{a}})$, was added which is designed to mimic subthreshold electrotonic currents to produce a physiological foot of the action potential. The key advantage of the R-E model is its ability to compute activation sequences at much coarser spatial resolutions that are not afflicted by the spatial undersampling artifacts leading to conduction slowing or even numerical conduction block as observed in standard R-D models. Ventricular EP was represented by the ten Tusscher–Noble–Noble–Panfilov model of the human ventricular myocyte (ten Tusscher et al., 2004). As indicated in Equations (9, 10), activation sequences and electrical source distribution in the LV were computed in its end-diastolic configuration $\Omega^0\_{\mathrm{s,lv}}$, that is, any effects of deformation upon electrotonic currents remained unaccounted for.

<sup>5</sup>http://www.vmtk.org

<sup>6</sup>http://www.ansys.com/Services/training-center/platform/introduction-toansys-icem-cfd-Hexa
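To make the role of the velocity tensor in Equation (9) concrete, the following sketch builds $\mathbf{V}$ from an orthonormal fiber frame and checks that a planar wavefront satisfies the eikonal equation. It is an illustration only, not the R-E solver of Neic et al. (2017); the function names are hypothetical and any conduction velocities passed in are illustrative values.

```python
import numpy as np


def velocity_tensor(f0, s0, n0, vf, vs, vn):
    """V = vf^2 f0(x)f0 + vs^2 s0(x)s0 + vn^2 n0(x)n0 for an orthonormal frame."""
    f0, s0, n0 = (np.asarray(a, dtype=float) for a in (f0, s0, n0))
    return (vf**2 * np.outer(f0, f0)
            + vs**2 * np.outer(s0, s0)
            + vn**2 * np.outer(n0, n0))


def plane_wave_speed(V, direction):
    """Speed c of a planar front travelling along `direction`: c = sqrt(d^T V d)."""
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)
    return float(np.sqrt(d @ V @ d))


def eikonal_residual(V, grad_ta):
    """Residual sqrt(grad^T V grad) - 1 of Equation (9) for a given gradient of t_a."""
    g = np.asarray(grad_ta, dtype=float)
    return float(np.sqrt(g @ V @ g) - 1.0)
```

For a planar front with arrival time $t\_{\mathrm{a}}(\mathbf{X}) = \mathbf{d} \cdot \mathbf{X} / c$, the gradient is $\mathbf{d}/c$, so the residual vanishes exactly when $c$ equals `plane_wave_speed(V, d)`; along the fiber direction this reduces to $c = v\_{\mathrm{f}}$.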

#### 2.2.2. Active and Passive Mechanics in the LV and Aorta

The deformation of the heart is governed by imposed external loads such as pressure in the cavities or from surrounding tissue and active stresses intrinsically generated during contraction. Tissue properties of the LV myocardium and the aorta are characterized as a hyperelastic, nearly incompressible, anisotropic material with a non-linear stress-strain relationship. Mechanical deformation was described by Cauchy's equation of motion under stationary equilibrium assumptions leading to a quasi-static boundary value problem

$$-\nabla\_{\mathbf{X}} \cdot \mathbf{F} \mathbf{S}(\mathbf{d}\_{\mathbf{s}}, t) = \mathbf{0} \quad \text{in } \Omega^0\_{\mathbf{s}}, \tag{11}$$

for $t \in [0, T]$, where $\mathbf{d}\_{\mathrm{s}}$ is the unknown displacement; $\mathbf{F}$ is the deformation gradient; $\mathbf{S}$ is the second Piola–Kirchhoff stress tensor; and $(\nabla\_{\mathbf{X}} \cdot)$ denotes the divergence operator in the Lagrange reference configuration. Homogeneous Dirichlet boundary conditions

$$\mathbf{d}\_{\mathbf{s}} = \mathbf{0} \quad \text{on} \quad \Gamma^{0}\_{\mathbf{s,D}},\tag{12}$$

homogeneous Neumann boundary conditions

$$\mathbf{F}\mathbf{S}(\mathbf{d}\_{\mathbf{s}}, t) \, \mathbf{n}\_{\mathrm{s},0} = \mathbf{0} \quad \text{on} \quad \Gamma^{0}\_{\mathrm{s,H}},\tag{13}$$

and inhomogeneous Neumann boundary conditions

$$\mathbf{F}\mathbf{S}(\mathbf{d}\_{\mathbf{s}}, t) \, \mathbf{n}\_{\mathrm{s},0} = p(t) \, J \, \mathbf{F}^{-\top}(\mathbf{d}\_{\mathbf{s}}, t) \, \mathbf{n}\_{\mathrm{s},0} \quad \text{on} \quad \Gamma^0\_{\mathrm{s,N}} \tag{14}$$

were imposed, where $\mathbf{n}\_{\mathrm{s},0}$ is the outward unit normal vector; $p(t)$ is the pressure; and $J = \det \mathbf{F}$. For the sake of clarity, boundary conditions are illustrated in **Figure 1C**.

The total stress $\mathbf{S}$ was additively decomposed according to

$$\mathbf{S} = \mathbf{S}\_{\rm pas} + \mathbf{S}\_{\rm act},\tag{15}$$

where $\mathbf{S}\_{\rm pas}$ and $\mathbf{S}\_{\rm act}$ refer to the passive and active stresses, respectively. Passive stresses were modeled based on the constitutive equation

$$\mathbf{S}\_{\rm pas} = 2 \frac{\partial \Psi(\mathbf{C})}{\partial \mathbf{C}} \tag{16}$$

given a hyper-elastic strain-energy function $\Psi$ and the right Cauchy–Green strain tensor $\mathbf{C} = \mathbf{F}^{\top}\mathbf{F}$. Two different strain-energy functions were used for characterizing passive mechanical behavior in the LV and the aorta. In the LV, where the underlying mesh $\Omega^0\_{\mathrm{s,lv}}$ and fiber orientations $(\mathbf{f}\_0, \mathbf{s}\_0, \mathbf{n}\_0)$ are the same as for the EP model, section 2.2.1, the transversely isotropic constitutive relation

$$
\Psi\_{\rm Guc}(\mathbf{C}) = \frac{\kappa}{2} \left( \log J \right)^2 + \frac{C\_{\rm Guc}}{2} \left[ \exp(\mathcal{Q}) - 1 \right]. \tag{17}
$$

by Guccione et al. (1995) was employed. Here, the term in the exponent is

$$\begin{aligned} \mathcal{Q} = {} & b\_{\text{f}} (\mathbf{f}\_{0} \cdot \overline{\mathbf{E}} \mathbf{f}\_{0})^{2} + b\_{\text{t}} \left[ (\mathbf{s}\_{0} \cdot \overline{\mathbf{E}} \mathbf{s}\_{0})^{2} + (\mathbf{n}\_{0} \cdot \overline{\mathbf{E}} \mathbf{n}\_{0})^{2} + 2(\mathbf{s}\_{0} \cdot \overline{\mathbf{E}} \mathbf{n}\_{0})^{2} \right] \\ & + 2b\_{\text{fs}} \left[ (\mathbf{f}\_{0} \cdot \overline{\mathbf{E}} \mathbf{s}\_{0})^{2} + (\mathbf{f}\_{0} \cdot \overline{\mathbf{E}} \mathbf{n}\_{0})^{2} \right] \end{aligned} \tag{18}$$

and $\overline{\mathbf{E}} = \frac{1}{2} (\overline{\mathbf{C}} - \mathbf{I})$ is the modified isochoric Green–Lagrange strain tensor, where $\overline{\mathbf{C}} := J^{-2/3}\mathbf{C}$. Default values of $b\_{\mathrm{f}} = 18.48$, $b\_{\mathrm{t}} = 3.58$, and $b\_{\mathrm{fs}} = 1.627$ were used. The parameter $C\_{\mathrm{Guc}}$ was varied for the different cases, see **Table 2**. In the aorta $\Omega^0\_{\mathrm{s,ao}}$, unlike in previous studies (Augustin et al., 2014), we refrained from assigning fiber structures, since our efforts were primarily focused on modeling the biomechanics of the LV and, to a lesser degree, the aorta. Thus, in the absence of information on structural anisotropy, an isotropic model due to Demiray (1972) was used

$$\Psi\_{\rm Dem}(\mathbf{C}) := \frac{\kappa}{2} \left( \log J \right)^2 + \frac{a}{2b} \left\{ \exp \left[ b \left( \text{tr}(\overline{\mathbf{C}}) - 3 \right) \right] - 1 \right\}. \tag{19}$$

The parameter $\widetilde{C} = \frac{a}{2b}$ was chosen such that $\widetilde{C}$ = 3,000 kPa in the aorta, $\widetilde{C}$ = 30,000 kPa for valves, and $\widetilde{C}$ = 300 kPa for the elastic cushion. The bulk modulus $\kappa$, which serves as a penalty parameter to enforce nearly incompressible material behavior, was chosen as $\kappa$ = 650 kPa in both Equations (17, 19). For the elastic cushion a value of $\kappa$ = 100 kPa was used.
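As a rough illustration of how the two strain-energy functions in Equations (17)–(19) can be evaluated for a given deformation gradient, consider the sketch below. It is not the authors' FE implementation: the function names are hypothetical, stresses are not computed, and the values of `c_guc` (case-specific, Table 2) and of the Demiray exponent `b` must be supplied by the caller. The Demiray prefactor $a/(2b)$ is passed as a single argument `c_tilde`. Units are assumed to be kPa.

```python
import numpy as np


def isochoric_kinematics(F):
    """Return J = det F, C_bar = J^(-2/3) F^T F and E_bar = (C_bar - I) / 2."""
    F = np.asarray(F, dtype=float)
    J = np.linalg.det(F)
    C_bar = J ** (-2.0 / 3.0) * (F.T @ F)
    E_bar = 0.5 * (C_bar - np.eye(3))
    return J, C_bar, E_bar


def psi_guccione(F, f0, s0, n0, c_guc, kappa=650.0,
                 b_f=18.48, b_t=3.58, b_fs=1.627):
    """Strain energy of Equations (17)-(18) with the default b-values of the text."""
    J, _, E = isochoric_kinematics(F)
    f0, s0, n0 = (np.asarray(v, dtype=float) for v in (f0, s0, n0))
    e_ff, e_ss, e_nn = f0 @ E @ f0, s0 @ E @ s0, n0 @ E @ n0
    e_sn, e_fs, e_fn = s0 @ E @ n0, f0 @ E @ s0, f0 @ E @ n0
    Q = (b_f * e_ff**2
         + b_t * (e_ss**2 + e_nn**2 + 2.0 * e_sn**2)
         + 2.0 * b_fs * (e_fs**2 + e_fn**2))
    return 0.5 * kappa * np.log(J) ** 2 + 0.5 * c_guc * (np.exp(Q) - 1.0)


def psi_demiray(F, c_tilde, b, kappa=650.0):
    """Strain energy of Equation (19) with c_tilde = a / (2 b)."""
    J, C_bar, _ = isochoric_kinematics(F)
    return (0.5 * kappa * np.log(J) ** 2
            + c_tilde * (np.exp(b * (np.trace(C_bar) - 3.0)) - 1.0))
```

Both functions vanish for the undeformed state $\mathbf{F} = \mathbf{I}$, which is a quick sanity check on any implementation of these models.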

A simplified phenomenological contractile model was used to represent active stress generation (Niederer S. A. et al., 2011). Owing to its small number of parameters and its direct relation to clinically measurable quantities such as peak pressure, $p\_{\mathrm{lv}}$, and the maximum rate of rise of pressure, $\mathrm{d}p\_{\mathrm{lv}}/\mathrm{d}t\_{\max}$, this model is fairly easy to fit and thus very suitable for being used in clinical EM modeling studies. Briefly, the active stress transient is given by

$$S\_{\mathrm{a}}(t,\lambda) = S\_{\text{peak}} \, \phi(\lambda) \tanh^2\left(\frac{t\_{\mathrm{s}}}{\tau\_{\mathrm{c}}}\right)\tanh^2\left(\frac{t\_{\text{dur}}-t\_{\mathrm{s}}}{\tau\_{\mathrm{r}}}\right), \quad \text{for } 0 < t\_{\mathrm{s}} < t\_{\text{dur}},\tag{20}$$

with

$$\phi = \tanh(\text{ld}(\lambda - \lambda\_0)), \quad \tau\_{\mathrm{c}} = \tau\_{\mathrm{c}\_0} + \text{ld}\_{\text{up}}(1 - \phi), \quad t\_{\mathrm{s}} = t - t\_{\mathrm{a}} - t\_{\text{emd}} \tag{21}$$

and $t\_{\mathrm{s}}$ is the onset of contraction; $\phi(\lambda)$ is a non-linear length-dependent function in which $\lambda$ is the fiber stretch and $\lambda\_0$ is the lower limit of fiber stretch below which no further active tension is generated; $t\_{\mathrm{a}}$ is the local activation time from Equation (9); $t\_{\mathrm{emd}}$ is the EM delay between the onsets of electrical depolarization and active stress generation; $S\_{\mathrm{peak}}$ is the peak isometric tension; $t\_{\mathrm{dur}}$ is the duration of the active stress transient; $\tau\_{\mathrm{c}}$ is the time constant of contraction; $\tau\_{\mathrm{c}\_0}$ is the baseline time constant of contraction; $\mathrm{ld}\_{\mathrm{up}}$ is the length-dependence of $\tau\_{\mathrm{c}}$; $\tau\_{\mathrm{r}}$ is the time constant of relaxation; and $\mathrm{ld}$ is the degree of length dependence. Thus, active stresses in this simplified model are only length-dependent, but dependence on fiber velocity, $\dot{\lambda}$, is ignored. Unlike in previous studies (Niederer S. A. et al., 2011), we set the non-linear length-dependent function $\phi(\lambda) = 1$ for the whole simulation. The active stress tensor in the reference configuration $\Omega^0\_{\mathrm{s,lv}}$ induced in fiber direction $\mathbf{f}\_0$ is defined as

$$\mathbf{S}\_{\rm act} = S\_{\mathrm{a}} \left(\mathbf{f}\_{0} \cdot \mathbf{C} \mathbf{f}\_{0}\right)^{-1} \mathbf{f}\_{0} \otimes \mathbf{f}\_{0},\tag{22}$$

with $S\_{\mathrm{a}}$ defined in Equation (20). This active stress involves a scaling by $\lambda^{2} = \mathbf{f}\_{0} \cdot \mathbf{C}\mathbf{f}\_{0}$, see Pathmanathan and Whiteley (2009) for details.
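The active stress transient of Equations (20, 21) and the tensor of Equation (22) can be sketched as follows. This is a minimal illustration under the setting $\phi(\lambda) = 1$ stated above (exposed as a default argument); the function names are hypothetical and all parameter values must be supplied by the caller, since the fitted values are case-specific.

```python
import math

import numpy as np


def active_stress(t, t_a, t_emd, t_dur, s_peak, tau_c0, tau_r,
                  ld_up=0.0, phi=1.0):
    """Scalar active stress S_a of Equation (20); zero outside 0 < t_s < t_dur."""
    t_s = t - t_a - t_emd                      # time since onset, Equation (21)
    if not (0.0 < t_s < t_dur):
        return 0.0
    tau_c = tau_c0 + ld_up * (1.0 - phi)       # equals tau_c0 when phi = 1
    rise = math.tanh(t_s / tau_c) ** 2
    decay = math.tanh((t_dur - t_s) / tau_r) ** 2
    return s_peak * phi * rise * decay


def active_stress_tensor(s_a, F, f0):
    """S_act = S_a (f0 . C f0)^(-1) f0 (x) f0 with C = F^T F, Equation (22)."""
    F = np.asarray(F, dtype=float)
    f0 = np.asarray(f0, dtype=float)
    C = F.T @ F
    lam2 = f0 @ C @ f0                         # squared fiber stretch
    return s_a / lam2 * np.outer(f0, f0)
```

With consistent time units for `t`, `t_a`, `t_emd`, `t_dur`, `tau_c0` and `tau_r`, the transient rises from zero after the electromechanical delay, stays below `s_peak`, and returns to zero at `t_dur`.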

#### 2.2.3. Mechanical and Hemodynamic Afterload Models

Hydrostatic pressures in the LV, $p\_{\mathrm{lv}}$, and the proximal aorta, $p\_{\mathrm{ao}}$, were modeled using a 3-element Windkessel model (Westerhof et al., 1971), and the system of PDEs (11) was linked to this lumped model of the arterial system, see **Figure 2**. The models were coupled by a diode (aortic valve) which opens at the end of the isovolumetric contraction (IVC) phase when the pressure in the LV cavity, $p\_{\mathrm{lv}}$, exceeds the pressure in the proximal aorta, $p\_{\mathrm{ao}}$, and closes at the end of ejection when $p\_{\mathrm{lv}}$ drops below $p\_{\mathrm{ao}}$ and the flow $q\_{\mathrm{lv}}$ starts to reverse. In its open state the aortic valve was modeled as a linear resistor, $R\_{\mathrm{av}}$, in series with the characteristic impedance of the aorta, $Z\_{\mathrm{c}}$. During ejection, the pressure in the LV was then computed by the Windkessel equation

$$\frac{\mathrm{d}p\_{\mathrm{lv}}}{\mathrm{d}t} = \frac{1}{\mathrm{C}} \left( 1 + \frac{\mathrm{Z}\_{\mathrm{c}} + \mathrm{R}\_{\mathrm{av}}}{\mathrm{R}} \right) q\_{\mathrm{lv}} + (\mathrm{Z}\_{\mathrm{c}} + \mathrm{R}\_{\mathrm{av}}) \frac{\mathrm{d}q\_{\mathrm{lv}}}{\mathrm{d}t} - \frac{1}{\mathrm{RC}} p\_{\mathrm{lv}},\tag{23}$$

which predicts the rate of change of pressure in the LV as a function of flow $q\_{\mathrm{lv}}$ out of the LV into the aorta. The resistor $R$ represents peripheral arterial resistance placed in parallel with a capacitor $C$, representing vascular compliance.

TABLE 2 | Fitted parameters for EM Model.

A similar form of Equation (23) was also used to estimate the pressure in the aorta, pao. In this case, there is no additional resistance due to an outlet valve and hence Rav is omitted. Balancing of the PDE (11) and the ODE (23) was achieved by recasting Equation (11) as a saddle point problem, see Gurev et al. (2015) and Hirschvogel et al. (2017).
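For readers who want to see where Equation (23) comes from, the following short derivation is a sketch. It assumes that the ejection-phase Windkessel can be written in the two-equation form of Equations (24, 25), with $Z\_{\mathrm{c}} + R\_{\mathrm{av}}$ as the series resistance and an auxiliary distal pressure $p\_{\mathrm{d}}$ across the parallel $R$–$C$ pair. The symbol $p\_{\mathrm{d}}$ is borrowed from Equation (24) for illustration only.

$$\begin{aligned} C \frac{\mathrm{d}p\_{\mathrm{d}}}{\mathrm{d}t} + \frac{p\_{\mathrm{d}}}{R} &= q\_{\mathrm{lv}}, \qquad p\_{\mathrm{lv}} = (Z\_{\mathrm{c}} + R\_{\mathrm{av}}) \, q\_{\mathrm{lv}} + p\_{\mathrm{d}}, \\ \frac{\mathrm{d}p\_{\mathrm{lv}}}{\mathrm{d}t} &= (Z\_{\mathrm{c}} + R\_{\mathrm{av}}) \frac{\mathrm{d}q\_{\mathrm{lv}}}{\mathrm{d}t} + \frac{\mathrm{d}p\_{\mathrm{d}}}{\mathrm{d}t} \\ &= (Z\_{\mathrm{c}} + R\_{\mathrm{av}}) \frac{\mathrm{d}q\_{\mathrm{lv}}}{\mathrm{d}t} + \frac{1}{C} \left( q\_{\mathrm{lv}} - \frac{p\_{\mathrm{lv}} - (Z\_{\mathrm{c}} + R\_{\mathrm{av}}) \, q\_{\mathrm{lv}}}{R} \right) \\ &= \frac{1}{C} \left( 1 + \frac{Z\_{\mathrm{c}} + R\_{\mathrm{av}}}{R} \right) q\_{\mathrm{lv}} + (Z\_{\mathrm{c}} + R\_{\mathrm{av}}) \frac{\mathrm{d}q\_{\mathrm{lv}}}{\mathrm{d}t} - \frac{1}{RC} \, p\_{\mathrm{lv}}. \end{aligned}$$

The last line agrees term by term with Equation (23); setting $R\_{\mathrm{av}} = 0$ gives the corresponding form for the aortic pressure.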

For CFD simulations, hydrostatic pressures at artificial aortic fluid outlets were modeled using a similar 3-element Windkessel model as in Equation (23) that was rewritten in the form of the following differential algebraic equations for outlet $i$

$$C\_i \frac{\mathrm{d}p\_{\mathrm{d},i}}{\mathrm{d}t} + \frac{p\_{\mathrm{d},i}}{R\_i} = q\_i,\tag{24}$$

$$p\_{\text{wk},i} = Z\_i q\_i + p\_{\text{d},i},\tag{25}$$

see Fouchet-Incaux (2014) and Bertoglio et al. (2017) for more details. During ejection the Windkessel pressure $p\_{\mathrm{wk}}$ at an outlet was then applied as an outflow boundary condition for the fluid flow model, see section 2.5.5. In Equations (24, 25), $C\_i$ represents compliance, $Z\_i$ impedance, and $R\_i$ resistance of the peripheral arteries for the respective aortic outlet and $q\_i$ denotes the flux through this outlet. Fitting of the parameters involved will be discussed in section 2.5.5.
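A time-discrete version of the outlet model in Equations (24, 25) can be sketched as below. The backward Euler scheme is an assumption made here for illustration; the time integration actually used for the Windkessel in this study is not stated in this section, and the function names are hypothetical.

```python
def windkessel_step(p_d, q, dt, R, C, Z):
    """Advance Equations (24, 25) by one backward Euler step.

    p_d : distal pressure at the previous time step
    q   : flux through the outlet at the new time step
    Returns (p_d_new, p_wk_new).
    """
    # C (p_new - p_old) / dt + p_new / R = q, solved for p_new
    p_d_new = (C * p_d / dt + q) / (C / dt + 1.0 / R)
    p_wk_new = Z * q + p_d_new                 # Equation (25)
    return p_d_new, p_wk_new


def simulate_outlet(q_values, dt, R, C, Z, p_d0=0.0):
    """Return the Windkessel pressure trace for a sequence of outlet fluxes."""
    p_d, trace = p_d0, []
    for q in q_values:
        p_d, p_wk = windkessel_step(p_d, q, dt, R, C, Z)
        trace.append(p_wk)
    return trace
```

For a constant flux $q$, the distal pressure relaxes toward $R q$ with time constant $RC$, so the outlet pressure approaches $(R + Z) q$; this limit is a convenient check when fitting the parameters.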

### 2.3. Fluid Flow Model

Human blood in larger vessels such as the LV or the aorta complies with the assumptions of an incompressible, isothermal, Newtonian and single-phase liquid (Nichols et al., 2011). Let $\Omega\_{\mathrm{f}} \subset \mathbb{R}^{3}$ denote the fluid domain, then the evolution of flow is governed by the incompressible Navier–Stokes equations

$$\rho\_{\mathrm{f}} \left( \frac{\partial}{\partial t} \mathbf{u}\_{\mathrm{f}} + \mathbf{u}\_{\mathrm{f}} \cdot \nabla\_{\mathbf{x}} \mathbf{u}\_{\mathrm{f}} \right) - \nabla\_{\mathbf{x}} \cdot \sigma\_{\mathrm{f}}(\mathbf{u}\_{\mathrm{f}}, p\_{\mathrm{f}}) = \mathbf{0} \qquad \text{in } \Omega\_{\mathrm{f}}, \tag{26}$$

$$\nabla\_{\mathbf{x}} \cdot \mathbf{u}\_{\mathrm{f}} = 0 \qquad \text{in } \Omega\_{\mathrm{f}}, \tag{27}$$

$$\mathbf{u}\_{\mathrm{f}} = \mathbf{0} \qquad \text{on } \Gamma\_{\text{noslip}}, \tag{28}$$

$$\mathbf{u}\_{\mathrm{f}} = \mathbf{g}\_{\mathrm{f}} \qquad \text{on } \Gamma\_{\text{inflow}}, \tag{29}$$

$$\sigma\_{\mathrm{f}} \mathbf{n}\_{\mathrm{f}} - \rho\_{\mathrm{f}} \beta \left(\mathbf{u}\_{\mathrm{f}} \cdot \mathbf{n}\_{\mathrm{f}}\right)\_{-} \mathbf{u}\_{\mathrm{f}} = -p\_{\mathrm{wk}} \mathbf{n}\_{\mathrm{f}} \qquad \text{on } \Gamma\_{\text{outflow}}, \tag{30}$$

$$\left. \mathbf{u}\_{\mathrm{f}} \right|\_{t=0} = \mathbf{u}\_0, \tag{31}$$

where $\mathbf{u}\_{\mathrm{f}}$ denotes fluid velocity in m/s; $p\_{\mathrm{f}}$ is fluid pressure in Pa; $\rho\_{\mathrm{f}}$ is the density of blood, given as 1,060 kg/m<sup>3</sup>; $\sigma\_{\mathrm{f}}$ is the fluid stress tensor in Pa, defined as $-p\_{\mathrm{f}} \mathbf{I} + \mu\_{\mathrm{f}} \left( \nabla\_{\mathbf{x}} \mathbf{u}\_{\mathrm{f}} + (\nabla\_{\mathbf{x}} \mathbf{u}\_{\mathrm{f}})^{\top} \right)$, with dynamic viscosity of blood $\mu\_{\mathrm{f}}$ given as 0.004 Pa s; $\mathbf{g}\_{\mathrm{f}}$, in m/s, is a velocity inlet; $p\_{\mathrm{wk}}$, in Pa, is the Windkessel pressure solution to Equations (24, 25); $\mathbf{u}\_0$, in m/s, refers to the initial condition; $\mathbf{n}\_{\mathrm{f}}$ is the outward unit normal of the fluid domain; and $(\nabla\_{\mathbf{x}})$ is the gradient and $(\nabla\_{\mathbf{x}} \cdot)$ is the divergence operator in the fluid domain $\Omega\_{\mathrm{f}}$. The sets $\Gamma\_{\text{noslip}}$, $\Gamma\_{\text{inflow}}$, and $\Gamma\_{\text{outflow}}$ denote the complementary subsets of $\Gamma\_{\mathrm{f}} := \partial \Omega\_{\mathrm{f}}$ and we assume that $|\Gamma\_{\text{outflow}}| > 0$. Note that Equation (29) is given only for the sake of completeness but was not used in this study, as the inflow of blood into the aorta is driven by the motion of the LV, thus avoiding the need for prescribing an inflow profile as is necessary in models which consider the aorta in isolation. For $p\_{\mathrm{wk}} \equiv 0$, boundary condition Equation (30) is referred to as the directional do-nothing boundary condition, see Esmaily Moghadam et al. (2011) and Braack et al. (2014), and the term

$$(\mathbf{u}\_{\mathbf{f}} \cdot \mathbf{n}\_{\mathbf{f}})\_{-} := \frac{1}{2} \left( \mathbf{u}\_{\mathbf{f}} \cdot \mathbf{n}\_{\mathbf{f}} - |\mathbf{u}\_{\mathbf{f}} \cdot \mathbf{n}\_{\mathbf{f}}| \right) \tag{32}$$

is added for backflow stabilization. A value of $\beta > \frac{1}{2}$ was assumed to guarantee stability of the system. However, in practical applications values of $\beta \leq \frac{1}{2}$ were also used without causing numerical issues, see Esmaily Moghadam et al. (2011). In the presence of multiple outlets, outflow boundary conditions as given in Equation (30) were prescribed at each of the outlets.
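The effect of the negative-part operator in Equation (32) is easy to see pointwise: it is zero wherever the fluid leaves the domain and active only where flow re-enters through an outlet. The sketch below evaluates the stabilization term at a single boundary point; the function names are hypothetical, and density and $\beta$ are left as required arguments so that no particular value is implied.

```python
import numpy as np


def negative_part(un):
    """(u . n)_- = (u . n - |u . n|) / 2, Equation (32)."""
    un = np.asarray(un, dtype=float)
    return 0.5 * (un - np.abs(un))


def backflow_term(u, n, rho, beta):
    """Pointwise stabilization term rho * beta * (u . n)_- * u from Equation (30)."""
    u = np.asarray(u, dtype=float)
    n = np.asarray(n, dtype=float)
    return rho * beta * negative_part(u @ n) * u
```

For outflow (`u @ n > 0`) the term vanishes and the condition reduces to the plain traction condition; for backflow it adds a contribution that grows quadratically with the velocity magnitude.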

#### 2.3.1. Extension to Moving Geometries

For time-dependent fluid domains, i.e., $\Omega\_{\mathrm{f}} = \Omega^{t}\_{\mathrm{f}}$, Equations (26–31) need to be modified to account for the domain movement. This requires linking the equations governing fluid dynamics, posed in an Eulerian coordinate frame, with the structural mechanics equations, posed in a Lagrangian reference frame. This is achieved by using the ALE formulation which combines both Lagrangian and Eulerian formulations in a generalized description, see Bazilevs et al. (2013, section 1.3) and Hirt et al. (1974). Similar to structural mechanics, a reference fluid configuration $\Omega^{0}\_{\mathrm{f}} \subset \mathbb{R}^{3}$ is used which we identify with the mesh generated at the end-diastolic state, see section 2.1.3. The coordinate system of the Eulerian frame is denoted by $\mathbf{x}$ and the reference coordinate system is denoted by $\mathbf{X}$. Their relation is given by the ALE mapping $\mathbf{x} = \mathbf{X} + \mathbf{d}\_{\mathrm{f}}(t, \mathbf{X})$. Here, $\mathbf{d}\_{\mathrm{f}}(t, \mathbf{X})$ refers to an arbitrary, not necessarily physical, displacement of points to track the deformation of the fluid domain. Using this ALE mapping the time-dependent moving fluid domain is represented as

$$\boldsymbol{\Omega}\_{\rm f}^{t} \coloneqq \left\{ \mathbf{x} : \mathbf{x} = \mathbf{X} + \mathbf{d}\_{\rm f}(t, \mathbf{X}), \,\forall \mathbf{X} \in \Omega\_{\rm f}^{0} \right\}. \tag{33}$$

Further, we define the fluid domain velocity **w**<sup>f</sup> as

$$\mathbf{w}\_{\mathrm{f}} := \left. \frac{\partial}{\partial t} \mathbf{d}\_{\mathrm{f}} \right|\_{\mathbf{X}}, \tag{34}$$

where $\left. \frac{\partial}{\partial t} (\cdot) \right|\_{\mathbf{X}}$ is the derivative with respect to $t$ with $\mathbf{X}$ being fixed, and the moving interface between fluid and solid domain as

$$
\Gamma^t\_{\text{f,mov}} := \partial \Omega^t\_{\text{f}} \backslash \bigcup\_{i=1}^{n\_{\text{outlets}}} \Gamma^t\_{\text{f,outflow},i} \tag{35}
$$

where $\Gamma^{t}\_{\mathrm{f,outflow},i}$ are the individual aortic outlets. The fluid displacement at this point remains unknown and will be specified in section 2.3.3. Combining these concepts, an ALE description of the Navier–Stokes equations can be derived, see e.g., Bazilevs et al. (2013) and Förster et al. (2006),

$$\rho\_{\mathrm{f}} \left( \left. \frac{\partial}{\partial t} \mathbf{u}\_{\mathrm{f}} \right|\_{\mathbf{X}} + (\mathbf{u}\_{\mathrm{f}} - \mathbf{w}\_{\mathrm{f}}) \cdot \nabla\_{\mathbf{x}} \mathbf{u}\_{\mathrm{f}} \right) - \nabla\_{\mathbf{x}} \cdot \sigma\_{\mathrm{f}}(\mathbf{u}\_{\mathrm{f}}, p\_{\mathrm{f}}) = \mathbf{0} \qquad \text{in } \Omega^{t}\_{\mathrm{f}}, \tag{36}$$

$$\nabla\_{\mathbf{x}} \cdot \mathbf{u}\_{\mathrm{f}} = 0 \qquad \text{in } \Omega^{t}\_{\mathrm{f}}, \tag{37}$$

$$\mathbf{u}\_{\mathrm{f}} = \mathbf{g}\_{\rm mov} \qquad \text{on } \Gamma^{t}\_{\mathrm{f,mov}}, \tag{38}$$

$$\sigma\_{\mathrm{f}}(\mathbf{u}\_{\mathrm{f}}, p\_{\mathrm{f}}) \mathbf{n}\_{\mathrm{f}} - \rho\_{\mathrm{f}} \beta \left( (\mathbf{u}\_{\mathrm{f}} - \mathbf{w}\_{\mathrm{f}}) \cdot \mathbf{n}\_{\mathrm{f}} \right)\_{-} \mathbf{u}\_{\mathrm{f}} = -p\_{\text{wk},i} \mathbf{n}\_{\mathrm{f}} \qquad \text{on } \Gamma^{t}\_{\mathrm{f,outflow},i}, \tag{39}$$

$$\left. \mathbf{u}\_{\mathrm{f}} \right|\_{t=0} = \mathbf{u}\_0. \tag{40}$$

Along $\Gamma^{t}\_{\mathrm{f,mov}}$ we imposed equality between fluid velocity and the velocity of the moving surfaces. Boundary condition (Equation 39) is the ALE equivalent of the outflow stabilization in Equation (30), see Bazilevs et al. (2013, section 8.4.2.3). Details on how domain movement and velocity were chosen in our application will be discussed later in sections 2.3.3 and 2.5.5.
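The ALE mapping of Equation (33) and the domain velocity of Equation (34) translate directly into operations on nodal arrays. The sketch below is illustrative only: the function names are hypothetical, nodal coordinates are assumed to be stored as an array of shape (number of nodes, 3), and the time derivative is approximated by a backward difference as in Equation (44).

```python
import numpy as np


def ale_map(X, d_f):
    """Current node positions x = X + d_f(t, X), Equation (33)."""
    return np.asarray(X, dtype=float) + np.asarray(d_f, dtype=float)


def domain_velocity(d_new, d_old, dt):
    """Backward-difference approximation of w_f = d(d_f)/dt at fixed X."""
    return (np.asarray(d_new, dtype=float) - np.asarray(d_old, dtype=float)) / dt


def convective_velocity(u_f, w_f):
    """Relative velocity u_f - w_f that advects momentum in Equation (36)."""
    return np.asarray(u_f, dtype=float) - np.asarray(w_f, dtype=float)
```

On the moving wall the fluid velocity equals the wall velocity, so `convective_velocity` vanishes there and no fluid crosses the interface.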

#### 2.3.2. Variational Formulation of the Navier–Stokes Equations

Following Bazilevs et al. (2007), Bazilevs et al. (2013), and Pauli and Behr (2017), the discrete variational formulation of the ALE Equations (36)–(40) can be stated in the following abstract form: find $\mathbf{u}^{\mathrm{h}}\_{\mathrm{f}} \in [S^{1}\_{h,\mathbf{g}}(\mathcal{T}\_{N})]^{3}$, $p^{\mathrm{h}}\_{\mathrm{f}} \in S^{1}\_{h}(\mathcal{T}\_{N})$ such that for all $\mathbf{v}^{\mathrm{h}} \in [S^{1}\_{h,\mathbf{0}}(\mathcal{T}\_{N})]^{3}$ and for all $q^{\mathrm{h}} \in S^{1}\_{h}(\mathcal{T}\_{N})$

$$A\_{\rm NS}(\mathbf{v}^{\mathrm{h}}, q^{\mathrm{h}}; \mathbf{u}\_{\mathrm{f}}^{\mathrm{h}}, p\_{\mathrm{f}}^{\mathrm{h}}) + S\_{\rm VMS}(\mathbf{v}^{\mathrm{h}}, q^{\mathrm{h}}; \mathbf{u}\_{\mathrm{f}}^{\mathrm{h}}, p\_{\mathrm{f}}^{\mathrm{h}}) = F\_{\rm NS}(\mathbf{v}^{\mathrm{h}}), \tag{41}$$

with the classical bilinear form of the Navier–Stokes equations

$$\begin{aligned} A\_{\rm NS}(\mathbf{v}^{\mathrm{h}}, q^{\mathrm{h}}; \mathbf{u}\_{\mathrm{f}}^{\mathrm{h}}, p\_{\mathrm{f}}^{\mathrm{h}}) := {} & \rho\_{\mathrm{f}} \int\_{\Omega^{t}\_{\mathrm{f}}} \mathbf{v}^{\mathrm{h}} \cdot \left( \frac{\partial}{\partial t} \mathbf{u}\_{\mathrm{f}}^{\mathrm{h}} + \left( \mathbf{u}\_{\mathrm{f}}^{\mathrm{h}} - \mathbf{w}\_{\mathrm{f}}^{\mathrm{h}} \right) \cdot \nabla\_{\mathbf{x}} \mathbf{u}\_{\mathrm{f}}^{\mathrm{h}} \right) \mathrm{d}\mathbf{x} \\ & + \int\_{\Omega^{t}\_{\mathrm{f}}} \varepsilon(\mathbf{v}^{\mathrm{h}}) : \sigma\_{\mathrm{f}}(\mathbf{u}\_{\mathrm{f}}^{\mathrm{h}}, p\_{\mathrm{f}}^{\mathrm{h}}) \, \mathrm{d}\mathbf{x} + \int\_{\Omega^{t}\_{\mathrm{f}}} q^{\mathrm{h}} \, \nabla\_{\mathbf{x}} \cdot \mathbf{u}\_{\mathrm{f}}^{\mathrm{h}} \, \mathrm{d}\mathbf{x} \\ & - \rho\_{\mathrm{f}} \beta \sum\_{i=1}^{n\_{\rm outlets}} \int\_{\Gamma^{t}\_{\mathrm{f,outflow},i}} \left( (\mathbf{u}\_{\mathrm{f}}^{\mathrm{h}} - \mathbf{w}\_{\mathrm{f}}^{\mathrm{h}}) \cdot \mathbf{n}\_{\mathrm{f}} \right)\_{-} \mathbf{v}^{\mathrm{h}} \cdot \mathbf{u}\_{\mathrm{f}}^{\mathrm{h}} \, \mathrm{d}s\_{\mathbf{x}}, \end{aligned} \tag{42}$$

the bilinear form $S\_{\rm VMS}$, which is explained later in Equation (45), and the right-hand side contribution

$$F\_{\rm NS}(\mathbf{v}^{\mathrm{h}}) := -\sum\_{i=1}^{n\_{\rm outlets}} p\_{\text{wk},i} \int\_{\Gamma^{t}\_{\mathrm{f,outflow},i}} \mathbf{v}^{\mathrm{h}} \cdot \mathbf{n}\_{\mathrm{f}} \, \mathrm{d}s\_{\mathbf{x}}. \tag{43}$$

In Equation (42), $\varepsilon$ is the strain-rate tensor and $\mathbf{w}^{\mathrm{h}}\_{\mathrm{f}}$ is the discrete counterpart of the fluid domain velocity $\mathbf{w}\_{\mathrm{f}}$, i.e.,

$$\mathbf{w}\_{\mathrm{f}}^{\mathrm{h}}(t^{n+1}, \mathbf{X}) = \frac{\mathbf{d}\_{\mathrm{f}}(t^{n+1}, \mathbf{X}) - \mathbf{d}\_{\mathrm{f}}(t^{n}, \mathbf{X})}{\Delta t}. \tag{44}$$

The FE function space $S^{1}\_{h,\ast}(\mathcal{T}\_{N})$ is the conformal trial space of piecewise linear, globally continuous basis functions over a decomposition $\mathcal{T}\_{N}$ of $\Omega^{t}\_{\mathrm{f}}$ into $N$ simplicial elements constrained by $\mathbf{v}^{\mathrm{h}} = \ast$ on essential boundaries. The FE function space $S^{1}\_{h}(\mathcal{T}\_{N})$ denotes the same space without constraints. For further details we refer to Brenner and Scott (2007) and Steinbach (2007).

From a mathematical point of view, the Navier–Stokes equation can be seen as a multidimensional convection–diffusion equation with pressure acting as a Lagrangian multiplier of the incompressibility constraint. In the common case where velocity and pressure are retained as unknowns, as above, the Ladyzhenskaya–Babuška–Brezzi (LBB) condition has to be satisfied by the velocity and pressure spaces (Donea and Huerta, 2003). A violation of the LBB condition may lead to pressure oscillations. Stabilization techniques allowing the circumvention of the LBB condition exist and have been extensively studied (see for example Hughes et al., 1986; Franca and Hughes, 1988; Douglas and Wang, 1989; Bochev et al., 2006). However, with increasing Reynolds number the Navier–Stokes equations become convection dominated. This requires increasingly finer mesh resolutions to accurately resolve finer flow details which, eventually, renders numerical solution in this form computationally intractable. As a remedy, one can resort to using turbulence models. In particular, in this study the residual based variational multiscale turbulence model (RBVMS), see Hughes (1995), Bazilevs et al. (2007), Bazilevs et al. (2013), and Pauli and Behr (2017), was employed, which acts as both a stabilization and a turbulence model. The underlying main idea is to split the unknown solution into resolvable (coarse) and unresolvable (fine) scales by the FE approximation, where the finer scale details are taken into account based on element residuals. For details on the derivation we refer to Bazilevs et al. (2007). The term $S\_{\rm VMS}$ in Equation (41) denotes the bilinear form of the RBVMS formulation and reads as

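$$\begin{split} S\_{\rm VMS}(\mathbf{v}^{\rm h}, q^{\rm h}; \mathbf{u}\_{\rm f}^{\rm h}, p\_{\rm f}^{\rm h}) &:= \frac{1}{\rho\_{\rm f}} \sum\_{l=1}^{n\_{\rm el}} \int\_{\tau\_{l}} \tau\_{\rm MOM} \left( \rho\_{\rm f} \left(\mathbf{u}\_{\rm f}^{\rm h} - \mathbf{w}\_{\rm f}^{\rm h}\right) \cdot \nabla\_{\mathbf{x}} \mathbf{v}^{\rm h} + \nabla\_{\mathbf{x}} q^{\rm h} \right) \cdot \mathbf{r}\_{\rm MOM}(\mathbf{u}\_{\rm f}^{\rm h}, p\_{\rm f}^{\rm h}) \,\mathrm{d}\mathbf{x} \\ &+ \sum\_{l=1}^{n\_{\rm el}} \int\_{\tau\_{l}} \tau\_{\rm CONT} \, \nabla\_{\mathbf{x}} \cdot \mathbf{v}^{\rm h} \, \nabla\_{\mathbf{x}} \cdot \mathbf{u}\_{\rm f}^{\rm h} \,\mathrm{d}\mathbf{x} \\ &- \sum\_{l=1}^{n\_{\rm el}} \int\_{\tau\_{l}} \tau\_{\rm MOM} \, \mathbf{v}^{\rm h} \cdot \left( \nabla\_{\mathbf{x}} \mathbf{u}\_{\rm f}^{\rm h} \, \mathbf{r}\_{\rm MOM}(\mathbf{u}\_{\rm f}^{\rm h}, p\_{\rm f}^{\rm h}) \right) \mathrm{d}\mathbf{x} \\ &- \frac{1}{\rho\_{\rm f}} \sum\_{l=1}^{n\_{\rm el}} \int\_{\tau\_{l}} \tau\_{\rm MOM}^{2} \, \boldsymbol{\varepsilon}(\mathbf{v}^{\rm h}) : \left( \mathbf{r}\_{\rm MOM}(\mathbf{u}\_{\rm f}^{\rm h}, p\_{\rm f}^{\rm h}) \otimes \mathbf{r}\_{\rm MOM}(\mathbf{u}\_{\rm f}^{\rm h}, p\_{\rm f}^{\rm h}) \right) \mathrm{d}\mathbf{x}, \end{split} \tag{45}$$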

where the vector **r**MOM is defined as

$$\mathbf{r}\_{\rm MOM}(\mathbf{u}\_{\rm f}^{\rm h}, p\_{\rm f}^{\rm h}) := \rho\_{\rm f} \left( \frac{\partial}{\partial t} \mathbf{u}\_{\rm f}^{\rm h} + \left( \mathbf{u}\_{\rm f}^{\rm h} - \mathbf{w}\_{\rm f}^{\rm h} \right) \cdot \nabla\_{\mathbf{x}} \mathbf{u}\_{\rm f}^{\rm h} \right) - \nabla\_{\mathbf{x}} \cdot \sigma\_{\rm f}(\mathbf{u}\_{\rm f}^{\rm h}, p\_{\rm f}^{\rm h}). \tag{46}$$

The definition of the parameters τ<sub>MOM</sub> and τ<sub>CONT</sub> according to Pauli and Behr (2017) is given by

$$\tau\_{\rm MOM} := \min \left\{ \left( \frac{4}{\Delta t^2} + (\mathbf{u}\_{\rm f}^{\rm h} - \mathbf{w}\_{\rm f}^{\rm h}) \cdot \mathbf{G} (\mathbf{u}\_{\rm f}^{\rm h} - \mathbf{w}\_{\rm f}^{\rm h}) \right)^{-\frac{1}{2}}, \frac{\rho\_{\rm f} C\_{\rm M}}{\mu\_{\rm f} \sqrt{\mathbf{G} : \mathbf{G}}} \right\}, \tag{47}$$

with Δt being the time step size and **G** := (∂**ξ**/∂**x**)<sup>⊤</sup> **K** (∂**ξ**/∂**x**), where ∂**ξ**/∂**x** denotes the Jacobian of the mapping from a physical FE to the reference FE. The tensor **K** is defined as

$$\mathbf{K} := \frac{1}{2\sqrt[3]{2}} \begin{pmatrix} 3 & -1 & -1 \\ -1 & 3 & -1 \\ -1 & -1 & 3 \end{pmatrix} \tag{48}$$

and the constant C<sub>M</sub> = 0.0285. Further, the stabilization parameter τ<sub>CONT</sub> is defined as

$$\tau\_{\rm CONT} := \frac{1}{\tau\_{\rm MOM} \, \mathbf{g}\_{\rm f} \cdot \mathbf{g}\_{\rm f}},\tag{49}$$

$$g\_{{\rm f},i} := \sum\_{j=1}^{3} \left(\frac{\partial \boldsymbol{\xi}}{\partial \mathbf{x}}\right)\_{ji}. \tag{50}$$

#### 2.3.3. EM-Based Kinematic Driver Model

Displacements computed with the EM model were used to prescribe the kinematics of the blood pool mesh, which in turn was used for simulating hemodynamics in the CFD model. This was achieved by imposing **g**<sub>mov</sub> = ∂**d**<sub>s</sub>/∂t in Equation (38). Since the surface of the reference CFD blood pool mesh, ∂Ω<sup>0</sup><sub>f</sub>, is not conformal with the surface of the reference EM blood pool mesh, Ω<sup>0</sup><sub>s,bp</sub>, and the overlap of the two surfaces is imperfect due to smoothing of ∂Ω<sup>0</sup><sub>f</sub> and remeshing of Ω<sup>0</sup><sub>f</sub>, a direct transfer of displacements between the two surfaces is not readily feasible. As a remedy, we proceeded as follows. After solving the EM problem, the subset of displacements **d̃**<sub>s</sub> that form the endocardial interface with the blood pool, Γ<sup>0</sup><sub>s,bp</sub>, were extracted from the solution **d**<sub>s</sub> defined at Ω<sup>0</sup><sub>s</sub>. Since the mesh interface between Ω<sup>0</sup><sub>s</sub> and Ω<sup>0</sup><sub>s,bp</sub> is conformal, the extracted displacements can be applied as inhomogeneous time-varying Dirichlet boundary conditions to the blood pool mesh Ω<sup>0</sup><sub>s,bp</sub> to solve a linear elastic problem given as

$$-\nabla\_{\mathbf{X}} \cdot \sigma(\mathbf{d}\_s(t)) = \mathbf{0} \qquad\qquad\qquad\text{in } \Omega^0\_{s, \text{bp}},\tag{51}$$

$$\mathbf{d}\_{\mathbf{s}}(t) = \widetilde{\mathbf{d}}\_{\mathbf{s}}(t) \qquad \qquad \text{on } \partial \Omega^{0}\_{\mathbf{s,bp}}, \tag{52}$$

where the stress and strain tensors are

$$\sigma(\mathbf{d}\_{\mathbf{s}}) := \frac{E}{1+\nu} \left( \frac{\nu}{1-2\nu} \nabla\_{\mathbf{X}} \cdot \mathbf{d}\_{\mathbf{s}} \, \mathbb{I} + \boldsymbol{\varepsilon}(\mathbf{d}\_{\mathbf{s}}) \right), \tag{53}$$

$$\boldsymbol{\varepsilon}(\mathbf{d}\_{\mathbf{s}}) := \frac{1}{2} \left( \nabla\_{\mathbf{X}} \mathbf{d}\_{\mathbf{s}} + \left( \nabla\_{\mathbf{X}} \mathbf{d}\_{\mathbf{s}} \right)^{\top} \right), \tag{54}$$

the constant E is Young's modulus in kPa and the constant ν is the dimensionless Poisson's ratio in the range (−1, 0.5). Combining the solutions **d**<sub>s</sub> computed for Ω<sup>0</sup><sub>s</sub> and Ω<sup>0</sup><sub>s,bp</sub> yields displacements **d**<sub>s</sub> for Ω<sup>0</sup><sub>s,total</sub>. Since ∂Ω<sup>0</sup><sub>f</sub> is fully embedded in this domain, Ω<sup>0</sup><sub>s,total</sub> can be used as a hanging background mesh for interpolating displacements onto the blood pool mesh, Ω<sup>0</sup><sub>f</sub>, used for CFD simulations. However, for reasons of mesh quality, interpolation is applied solely on the boundary of Ω<sup>0</sup><sub>f</sub> itself, and to find the interior displacement field the same linear elastic problem, Equations (51–54), is solved for **d**<sub>f</sub> instead of **d**<sub>s</sub>.
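The constitutive law of Equations (53, 54) can be illustrated with a small pointwise evaluation. The sketch below is not part of the original solver; the function name and the representation of the displacement gradient as a nested list are assumptions made for illustration only.

```python
def linear_elastic_stress(grad_d, E, nu):
    """Stress tensor of Equation (53) for a 3x3 displacement gradient.

    grad_d : nested list, grad_d[i][j] = d(d_i)/d(X_j)
    E      : Young's modulus (the stress is returned in the same unit)
    nu     : Poisson's ratio; nu = -1 and nu = 0.5 are excluded because
             the prefactors 1/(1+nu) and 1/(1-2*nu) become singular there
    """
    if not -1.0 < nu < 0.5:
        raise ValueError("Poisson's ratio must lie in (-1, 0.5)")
    # Equation (54): symmetric part of the displacement gradient.
    eps = [[0.5 * (grad_d[i][j] + grad_d[j][i]) for j in range(3)]
           for i in range(3)]
    # Divergence of the displacement equals the trace of its gradient.
    div_d = sum(grad_d[i][i] for i in range(3))
    volumetric = nu / (1.0 - 2.0 * nu) * div_d
    return [[E / (1.0 + nu) * ((volumetric if i == j else 0.0) + eps[i][j])
             for j in range(3)] for i in range(3)]
```

As a plausibility check, a uniaxial strain state with lateral contraction, `grad_d = diag(e, -nu*e, -nu*e)`, yields an axial stress of `E*e` and vanishing lateral stresses, as expected for Hooke's law.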

In both patient cases studied, ejection fractions were large, leading to a substantial deformation of the blood pool mesh Ω<sup>t</sup><sub>f</sub>. To maintain mesh quality under such large deformations, the parameters E and ν governing stiffness and incompressibility of the material were altered accordingly. Initially, fixed values E<sub>0</sub> and ν<sub>0</sub> were chosen, while the subsequent modification of E and ν was guided by a combination of the following two strategies.


### 2.4. Numerical Solution

Spatio-temporal discretization of all PDEs and the solution of the arising systems of equations relied upon the Cardiac Arrhythmia Research Package (CARP), see Vigmond et al. (2003). Numerical details on FE discretization (Rocha et al., 2011) and solution of EP (Vigmond et al., 2008b; Neic et al., 2012, 2017) and EM (Augustin et al., 2016b) have been discussed in detail elsewhere. FE discretization and solution of the Navier–Stokes equations were implemented recently using the same numerical framework, which was extended to account for nonlinear saddle-point problems arising from the discretized CFD equations.

Two time discretization schemes were implemented and compared for the applications in mind, and a computationally cheap semi-implicit scheme, modified from Forti (2016, section 1.4.2), showed similar results to the more expensive fully-implicit generalized-α method (Jansen et al., 2000). Hence, all results in section 3 were obtained using the semi-implicit scheme; to advance from time step t<sup>n</sup> to t<sup>n+1</sup>, only a linear block system needs to be solved, where each block depends on data from the previous time step only. Solvers for the block system were taken from the PETSc library (Balay et al., 1997, 2016a,b). We used a right-preconditioned flexible GMRES method with PETSc fieldsplit preconditioning (Silvester et al., 2001; Elman et al., 2008), which in turn uses BoomerAMG (Van Emden and Yang, 2002) to approximate sub-block inverses. While the time step size for mechanics and CFD was the same, Δt<sub>mech</sub> = Δt<sub>CFD</sub> = 0.5 ms, it was significantly smaller for EP, where Δt<sub>EP</sub> = 25 µs.

The implementation of the CFD solvers has been subjected to various validation procedures against standard CFD benchmarks (Schäfer et al., 1996). All simulations were executed at the national HPC computing facility ARCHER in the United Kingdom using 384 and 768 cores for EM and CFD simulations, respectively.

### 2.5. Model Parameterization

#### 2.5.1. Electrophysiology

Electrical activation sequences were indirectly parameterized using the QRS complex of a given patient's ECG as guidance. Unlike in previous studies (Augustin et al., 2016a), we refrained from a detailed parameterization which aimed at reproducing the QRS complex of the ECG for a given patient by finding appropriate locations and timings for the main fascicles of the cardiac conduction system in the LV. Rather, default locations and timings were used which yielded a total activation time within the physiological range.

#### 2.5.2. Passive Biomechanics

The LV myocardium was characterized as a hyperelastic, nearly incompressible, transversely isotropic material with a nonlinear stress–strain relationship (Guccione et al., 1995). Orthotropic material axes were aligned with the local fiber, sheet and sheet normal directions. To remove rigid body motion, homogeneous displacement boundary conditions were applied by fixing the terminal rims of the clipped brachiocephalic, left common carotid and left subclavian arteries as well as the clipped rim of the aorta descendens, see **Figure 1**. The model was stabilized by resting the LV apex on an elastic cushion of which the bottom face was rigidly anchored also by applying homogeneous displacement boundary conditions.

The constitutive model was fitted to recorded clinical data as previously reported with minor modifications (Augustin et al., 2016a). The passive biomechanical model governed by the strain-energy function given in Equation (17) was fitted to approximate the end-diastolic pressure-volume relation (EDPVR). Due to limitations in the recorded data we refrained from directly fitting the model to the recorded pressure and volume data. Rather, only one data pair, EDV and end-diastolic pressure (EDP), was used to fit the stress-free residual volume to the empirical Klotz relation (Klotz et al., 2007) by adjusting the isotropic scaling parameter C<sub>Guc</sub> in Equation (17). As the model anatomy was built from a segmented 3DWH MRI scan acquired during diastasis, the FE model was inflated to increase the volume of the cavity by the difference between the volume at mid diastasis and the EDV. Using the end-diastolic geometry, default material parameters and the recorded EDP, an initial guess of the stress-free reference configuration was computed by unloading the model using a backward displacement method (Sellier, 2011; Bols et al., 2013; Krishnamurthy et al., 2013). The unloading procedure was repeated with varying trial material parameters, C<sub>Guc</sub>, until the difference between the unstressed LV volume of the model and the prediction of the Klotz relation was less than 5 %.
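The stopping criterion of the unloading loop can be sketched as follows. The sketch assumes the commonly quoted single-beat estimate of the unloaded volume from the Klotz relation, V<sub>0</sub> = EDV · (0.6 − 0.006 · EDP) with EDP in mmHg; both this coefficient set and the choice of measuring the 5 % deviation relative to the Klotz prediction are assumptions that should be checked against Klotz et al. (2007) and the authors' implementation. All function names and the numerical values in the usage example are hypothetical.

```python
def klotz_unloaded_volume(edv_ml, edp_mmhg):
    """Empirical unloaded volume V0 in mL (assumed form of the Klotz estimate)."""
    return edv_ml * (0.6 - 0.006 * edp_mmhg)


def unloading_converged(v0_model_ml, edv_ml, edp_mmhg, tol=0.05):
    """True if the model's unstressed volume is within tol of the Klotz value.

    The deviation is taken relative to the Klotz prediction (an assumption).
    """
    v0_klotz = klotz_unloaded_volume(edv_ml, edp_mmhg)
    return abs(v0_model_ml - v0_klotz) / v0_klotz < tol


# Hypothetical example: EDV = 120 mL, EDP = 10 mmHg.
v0_target = klotz_unloaded_volume(120.0, 10.0)
accepted = unloading_converged(66.0, 120.0, 10.0)
```

In the workflow described above, such a check would be evaluated after each unloading run, with C<sub>Guc</sub> adjusted until it returns true.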

#### 2.5.3. Active Stresses

Parameters of the active stress model were fitted during the IVC and ejection phases. During IVC the LV volume was held constant (Gurev et al., 2015) and the parameters of the active stress model given in Equation (20), namely the rate of contraction, τ<sub>c</sub>, and the peak active stress, S<sub>peak</sub>, were manually adjusted to fit the maximum rate of rise of pressure, (dP/dt)<sub>max</sub>, and the peak pressure, p<sub>lv</sub>.

#### 2.5.4. Afterload

When the LV pressure p<sub>lv</sub> exceeded the aortic pressure, p<sub>ao</sub>, ejection was initiated by connecting the LV model with the lumped 3-element Windkessel model (Westerhof et al., 1971). Volume traces recorded from a given patient during ejection were used as input to compute aortic pressure traces by solving Equation (23). The two types of data were not recorded simultaneously, as volume traces were computed from Cine MRI scans and pressure traces were recorded later invasively by catheterization. Volume and pressure traces were synchronized in time by aligning the onset of ejection of the volume trace V<sub>lv</sub>(t) with the instant of opening of the aortic valve in the pressure trace p<sub>ao</sub>(t). In those cases where heart rates were markedly different between the two measurements, volume traces were scaled in time to adjust LV ejection time (LVET) to the duration of ejection in the pressure traces, that is, the time elapsed between opening and closing of the aortic valve, as these two instants in time were clearly identifiable in all traces p<sub>ao</sub>(t), see **Figure 3**. Moreover, volume traces were offset to ensure that the model volume based on the segmentation of the 3DWH scan acquired during diastasis matched the Cine-MRI based volume trace at mid diastasis. The parameter space of the Windkessel model, comprising the characteristic impedance of the aorta, Z<sub>c</sub>, as well as the resistance, R, and compliance, C, of the arterial system, was sampled using a recently developed stochastic sampling approach (Crozier et al., 2016b).

Box constraints were used to restrict the search space of parameter sweeps. In particular, we used reported measurements in humans to define the mean values and restricted the search space for each parameter to fall within ±20 % around the mean. Due to high-frequency errors introduced by the pressure transducer we refrained from computing norms ||p<sub>ao,meas</sub> − p<sub>ao,fit</sub>|| to quantify the deviations of fitted from measured pressure and opted for manual selection using three criteria: aortic peak pressure, p<sub>ao</sub>, closing pressure of the aortic valve, and exponential decay of p<sub>ao</sub> during diastole. For the sake of fitting Z<sub>c</sub> we assumed p<sub>ao</sub> ≈ p<sub>lv</sub> since transvalvular pressure gradients in all patients were very minor.

#### 2.5.5. CFD Boundary Conditions

The validated EM models yield the time-dependent displacement fields, **d**<sub>s</sub>, which were transferred onto the fluid domain to drive simulations of blood flow in the LV and aorta as described in section 2.3.3, yielding **d**<sub>f</sub>(t, **x**) defined on the whole CFD mesh. **Figure 4G** shows a summary of the boundary conditions. On the boundary Γ<sup>t</sup><sub>f,mov</sub> a Dirichlet boundary condition enforcing the mesh velocity **w**<sub>f</sub><sup>h</sup> is applied. On each aortic outlet Γ<sub>f,outflow,i</sub>(t) a 3-element Windkessel model as described in section 2.2.3 is attached. Further, the stabilization parameter β in Equation (39) was set to 0.2. Estimation of the input parameters for the hemodynamical Windkessel equations relied on an extension of the simple hydraulic analog of Ohm's law. Given the patient-specific MAP, CO, and a percentage α<sub>i</sub> of total CO running through the outlet, the resistance R<sub>i</sub> was estimated as

$$R\_i \approx \frac{\text{MAP}}{\alpha\_i \text{CO}}.\tag{55}$$

The percentages α<sub>i</sub> were obtained either by measurement or by applying Murray's law (Murray, 1926). The impedances Z<sub>i</sub> were chosen as 5 % of R<sub>i</sub>, and the compliances C<sub>i</sub> were chosen such that R<sub>i</sub>C<sub>i</sub> ≈ 1,000 ms. To keep the semi-implicit character of the CFD system, the Windkessel equations were solved with a semi-implicit backward Euler method using the flow q<sub>i</sub><sup>n</sup> through the aortic outlet from the previous time step as input.

FIGURE 3 | (A) Invasive clinical recordings from cases 28-Pre and 44-Pre. Top: Recorded aortic pressure P<sub>ao</sub> (black curve) and recorded LV pressure P<sub>LV</sub> (blue curve). Marked with dashed lines are systolic pressure P<sub>sys</sub>, mean arterial pressure MAP, and diastolic pressure P<sub>dia</sub>. Center: Volume change in the LV, V<sub>LV</sub>, in red, ranging from end-diastolic volume EDV to end-systolic volume ESV. Bottom: LV flow Q<sub>LV</sub> in orange with marked peak flow Q<sub>peak</sub>. (B) Comparison of EM simulations and clinical data. Upper part shows a comparison of the LV model in end-diastolic (colored opaquely blue) and end-systolic configuration (colored by displacement). Lower part shows comparison of clinical (colored blue) and simulated PV loops (colored red). The dashed orange curve shows the ideal Klotz curve, while the green curve shows the simulated Klotz curve, with the volume of the stress-free unloaded configuration marked as V<sub>0</sub>.
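The outlet parameterization and the lagged time stepping described in section 2.5.5 can be sketched as below. The Windkessel equations of the paper are not reproduced in this section, so the sketch assumes the standard 3-element form, in which the outlet pressure is p = Z·q + p<sub>d</sub> and the distal pressure obeys C·dp<sub>d</sub>/dt + p<sub>d</sub>/R = q. Function names, units and the numerical values in the example are hypothetical.

```python
def windkessel_parameters(map_mmhg, co_ml_per_ms, alpha,
                          rc_ms=1000.0, z_fraction=0.05):
    """Return (R, Z, C) for one outlet.

    R follows Equation (55), Z is 5 % of R and C is chosen such that
    R*C = 1,000 ms. Units: R and Z in mmHg*ms/mL, C in mL/mmHg.
    alpha is the flow fraction through the outlet (0 < alpha <= 1).
    """
    R = map_mmhg / (alpha * co_ml_per_ms)
    Z = z_fraction * R
    C = rc_ms / R
    return R, Z, C


def windkessel_step(p_d_old, q_prev, R, Z, C, dt_ms):
    """Advance the distal pressure by one backward Euler step.

    The flow q_prev is taken from the previous CFD time step, so the update
    is explicit in the flow and implicit in the pressure (assumed form).
    Returns (p_d_new, p_outlet).
    """
    p_d_new = (C / dt_ms * p_d_old + q_prev) / (C / dt_ms + 1.0 / R)
    p_outlet = Z * q_prev + p_d_new
    return p_d_new, p_outlet


# Hypothetical outlet carrying half of a cardiac output of 0.09 mL/ms
# (5.4 L/min) at a mean arterial pressure of 90 mmHg.
R, Z, C = windkessel_parameters(90.0, 0.09, 0.5)
p_d, p_out = windkessel_step(90.0, 90.0 / R, R, Z, C, dt_ms=0.5)
```

With a constant flow equal to p<sub>d</sub>/R the distal pressure stays unchanged, which is a convenient sanity check for the update formula.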

Moving wall boundary Γ<sup>t</sup><sub>f,mov</sub> colored in orange, outlet boundaries Γ<sup>t</sup><sub>f,outflow,i</sub> colored in blue with attached illustration of the 3-element Windkessel models.

## 3. RESULTS

### 3.1. Building Electromechanical Kinematic Driver Models

Using a previously developed automated workflow (Crozier et al., 2016a), anatomical FE models of LV and aorta were built for patient cases 28-Pre and 44-Pre based on segmented imaging data acquired under pre-treatment conditions. **Figure 1** illustrates the key processing steps and the resulting FE model for case 28-Pre. For case 28-Pre the CoA was repaired by a virtual dilatation procedure applied to the segmented image data with the aim of restoring normal cross-sectional areas. Subsequently, a new FE mesh, referred to as 28-Post, was generated, which was essentially identical to 28-Pre, with the only difference being the anatomical adjustment of the CoA in the aortic arch to the target post-treatment anatomy after stenting, see **Figure 5**.

Passive biomechanical properties, afterload and active stress models of cases 28-Pre and 44-Pre were parameterized using clinically recorded pressure and volume data under pre-treatment conditions, see **Figure 3A**. The final fitted parameters are summarized in **Table 2**. The goodness of fit of both integrated EM models was verified by standard PV loop analysis as shown in **Figure 3B**. Results of a quantitative comparison with clinically derived metrics including EF, EDV and ESV, CO, and peak systolic pressure are summarized in **Table 3**.

### 3.2. Blood Pool FE Modeling for CFD

Conformal FE blood pool meshes were extracted from EM FE meshes, surfaces were smoothed and used for volumetric remeshing with increased spatial resolution including boundary layers. The corresponding workflow is illustrated in **Figure 4**.

Kinematics of the EM model were transferred to the CFD blood pool mesh and the result is illustrated in terms of displacements **d**<sub>s</sub>, **d**<sub>f</sub> in **Figure 6II**. Due to the large EF of about 65 % for both 28-Pre and 44-Pre, the blood pool underwent a significant deformation. However, using a combination of element quality and ν-Volume based stiffening with an initial Young's modulus E<sub>0</sub> = 100 kPa and Poisson's ratio ν<sub>0</sub> = 0.3, sufficient element quality was preserved throughout the entire ejection phase and numerical instabilities could be avoided. **Figure 6I** shows the 80th percentile of bad element quality against the number of linear iterations required for convergence for the 28-Pre case. The quality of elements was calculated with the same quality indicator (Freitag and Knupp, 2002; Kanchi and Masud, 2007) as described in section 2.3.3 but was rescaled to the interval [0, 1], with the best element quality being 0 and the worst element quality being 1. The modest increase in iteration numbers of the iterative preconditioned GMRES solver provides indirect evidence of sufficiently preserved mesh quality (see **Figure 6**). Spatially, most lower quality elements were located in the CFD boundary layer.

superimposed with fluid mesh displacement **d**<sub>f</sub> on Ω<sup>0</sup><sub>f</sub>.

### 3.3. Numerical CFD Benchmarks

The implementation of the Navier–Stokes solver was verified by solving a set of standardized benchmark problems, see Schäfer et al. (1996). Computational performance was evaluated by strong scaling experiments in which the post-treatment hemodynamics simulation of case 28-Post was repeated with varying numbers of cores ranging from 96 to 1,536. Details on computational complexity and costs are summarized in **Table 4**. For temporal discretization a time step of Δt = 0.5 ms was used to simulate the ejection phase lasting for 208 ms. The overall discrete system comprised 5,177,056 degrees of freedom and was solved over 416 time steps. Strong scaling results are summarized in **Figure 7**. Efficient strong scaling behavior was observed up to 768 cores, with parallel efficiency slowly degrading from 100 % at 96 cores down to 55 % at 768 cores. Scalability stalled when doubling the core count to 1,536, which reduced the degrees of freedom per parallel partition to 3,386. Parallel efficiency dropped to 27 %, which is attributed to the unfavorable ratio between local compute work and communication.
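For readers unfamiliar with the metric, strong scaling efficiency relative to a reference run is usually computed as in the sketch below. This is the common textbook definition and is assumed, not confirmed, to be the one used for **Figure 7**; the wall-clock times in the example are invented purely for illustration and are not measurements from this study.

```python
def speedup(time_ref, time):
    """Speedup of a run relative to the reference run."""
    return time_ref / time


def parallel_efficiency(cores_ref, time_ref, cores, time):
    """Strong scaling efficiency: achieved speedup divided by ideal speedup."""
    return speedup(time_ref, time) / (cores / cores_ref)


# Invented timings: 800 s on 96 cores, 200 s on 768 cores.
eff = parallel_efficiency(96, 800.0, 768, 200.0)
```

In this invented example an eightfold increase in cores yields only a fourfold speedup, which corresponds to an efficiency of 50 %.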

### 3.4. Simulating Cardiac and Cardiovascular Hemodynamics

Hemodynamics in the LV and aorta was simulated using the EM simulations as a kinematic driver. Flow rates through various aortic cross sections and outflow orifices were calculated as the integral over measured fluxes through the cross-sectional plane for both 4D VEC MRI and simulated flow data. At the locations of interest, ε<sub>DSC</sub>, ε<sub>BCA</sub>, ε<sub>LCA</sub>, and ε<sub>LSCA</sub>, denoting cross sections in the aorta descendens and the orifices of the brachiocephalic, left carotid and left subclavian artery, respectively, relative flows were computed from 4D VEC MRI data as fractions α<sub>i</sub> expressed in percent of the total peak flow through the aorta ascendens as determined over the plane ε<sub>ASC</sub>. For those planes of interest where measurements were not feasible due to noise, flow percentages were estimated based on Murray's law. Flow curves during ejection at selected cross sections are shown in **Figures 8A,E**. MAP and computed mean flow through each outlet orifice were used to determine the parameters of the coupled Windkessel models of afterload in Equations (24, 25), see **Table 5**. In the 28-Pre case this resulted in flow splits of α<sub>i</sub> ≈ 23, 51.3, 12.83, and 12.83 % whereas in the 44-Pre case the flow split ratios were α<sub>i</sub> ≈ 5.68, 57.45, and 34.01 % for ε<sub>DSC</sub>, ε<sub>BCA</sub>, ε<sub>LCA</sub>, and ε<sub>LSCA</sub>, respectively.

Displacement **d**<sub>s</sub> on Ω<sup>0</sup><sub>s</sub>

Shown are the number of elements (NE), number of vertices (NV), average edge length h in µm, degrees of freedom for displacement (DOF), degrees of freedom for velocity (DOFU), degrees of freedom for pressure (DOFP).
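Where the flow fractions α<sub>i</sub> discussed in this section could not be measured, Murray's law was used. In its classical form the law states that flow through a vessel scales with the cube of its radius. A minimal sketch of how flow fractions could be derived from outlet radii under that assumption is given below; the exponent of 3, the function name and the example radii are assumptions, and the exact procedure used by the authors is not specified in the text.

```python
def murray_flow_fractions(radii, exponent=3.0):
    """Split a total flow among outlets in proportion to radius**exponent."""
    if not radii or any(r <= 0.0 for r in radii):
        raise ValueError("radii must be a non-empty list of positive numbers")
    weights = [r ** exponent for r in radii]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical example: one outlet with twice the radius of the other.
fractions = murray_flow_fractions([2.0, 1.0])
```

The fractions sum to one by construction, so they can be used directly as α<sub>i</sub> in Equation (55).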

For the CFD analysis a time step of Δt = 0.5 ms was used. The ejection phases of the EM simulations were chosen as time horizons for the CFD simulation, which lasted from t = 90 ms to t = 302 ms in the 28-Pre case and from t = 70 ms to t = 329 ms in the 44-Pre case, yielding 424 and 518 time steps, respectively. The Windkessel parameters for each outlet, calculated as described in section 2.5.5, are summarized in **Table 5**. Pressure p<sub>f</sub> along the centerline s<sub>c</sub> and fluxes through the planes ε<sub>DSC</sub>, ε<sub>LSC</sub>, ε<sub>BCA</sub>, and ε<sub>ASC</sub> were computed at the instant of peak flow in the aorta ascendens and compared against measured data, which were pressures derived from Pressure–Poisson mapping (see **Figure 8D**) and 4D VEC MRI fluxes. For case 28-Pre, pressure drops were calculated from the pressure values at the intersections of the centerline with ε<sub>DSC</sub> and ε<sub>ASC</sub>, respectively. In addition, we calculated the average pressure over the aforementioned planes. Both approaches yielded a simulated pressure drop across the CoA of ≈ 29.2 mmHg, which agreed well with the clinically estimated pressure drop of ≈ 30 mmHg. Furthermore, we calculated the fluxes through the various planes and compared them against the clinically estimated fluxes. A quantitative comparison of fluxes is given in **Table 6**. **Figures 8C,G,H** show velocity profiles at peak flow conditions. **Figures 8B,F** show the pressure along the centerlines, the velocity field **v**<sub>f</sub> through the plane ε<sub>ASC</sub>, and the position of all planes used for evaluating fluxes. Supplementary Materials 1, 2 contain videos of the time evolution of the velocity distribution for cases 28-Pre and 44-Pre.
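The step counts and the agreement of the pressure drops quoted above follow from simple arithmetic, reproduced here as a short check. The helper names are illustrative only.

```python
def n_time_steps(t_start_ms, t_end_ms, dt_ms):
    """Number of time steps needed to cover [t_start, t_end] with step dt."""
    return round((t_end_ms - t_start_ms) / dt_ms)


def relative_difference(simulated, reference):
    """Absolute deviation of a simulated value relative to a reference."""
    return abs(simulated - reference) / abs(reference)


steps_28_pre = n_time_steps(90.0, 302.0, 0.5)    # 424
steps_44_pre = n_time_steps(70.0, 329.0, 0.5)    # 518
drop_deviation = relative_difference(29.2, 30.0)  # about 0.027, i.e. 2.7 %
```

The simulated pressure drop therefore deviates from the clinical estimate by less than 3 %.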

### 3.5. Post-treatment Simulations

Simulations of case 28-Pre were repeated on the geometry of case 28-Post using almost the same set of parameters, see **Table 2**. Only S<sub>peak</sub> was slightly adjusted, which resulted in a better peak pressure value in the LV. The geometry of case 28-Post was almost identical to case 28-Pre, with the only exception being the virtual repair of the CoA anatomy. In this scenario only pre- and post-treatment simulations were compared to evaluate their relative differences in terms of pressure and flow velocities. The results are shown in **Figure 9**. Pressure drops were calculated as in section 3.4 for both scenarios. For 28-Pre we calculated a pressure drop of ≈ 29.2 mmHg, while for 28-Post a pressure drop of ≈ 14.15 mmHg was calculated.

## 4. DISCUSSION

In this study, we report on the progress made toward a novel EMF model of the human LV that is entirely based on first principles and as such, in principle, is able to represent all cause-effect relationships with full biophysical detail. Unlike in the majority of cardiac CFD studies, where the use of image-based kinematic driver models prevails, EM LV and aorta models of CoA patients were employed to serve as a kinematic driver to a computational model of hemodynamics in the LV cavity and aorta. A hybrid two-stage modeling approach was adopted with regard to hemodynamics, in which the EM and CFD models are executed sequentially. First, in the EM simulations the afterload imposed by the circulatory system upon the LV was represented by a lumped model to compute LV kinematics. These EM models were carefully fitted to available clinical data to replicate important clinical metrics characterizing hemodynamic and biomechanical work performed by the LV (Gsell et al., under review). In a subsequent step, a full-blown ALE-based CFD model with moving domain boundaries was unidirectionally or weakly coupled to the EM model. The motion of the fluid domain was driven by the kinematics of the EM model. Kinematics was transferred from the EM mesh onto the CFD blood pool mesh by generating a combined kinematic model comprising LV, valve, aortic structure and a conformal blood pool mesh which served as a hanging background mesh for interpolation. The higher resolution blood pool CFD mesh with refined boundary layers was fully immersed in the EM background mesh. Kinematics was transferred by interpolation only onto the surface of the CFD blood pool mesh and extended into the volume of the blood pool by solving a linear solid mechanics problem.

FIGURE 8 | CFD results. (A,E) show the given clinical measurements for flow through different planes. The planes are depicted in (B,F). (B,F) also depict the pressure along the centerlines at peak flow conditions at t = 167 ms and t = 142 ms, respectively. (C) shows velocity streamlines at peak flow. (D) shows the relative pressure map from the Pressure–Poisson mapping used for validating the pressure drop in our simulations. (G,H) show velocity streamlines at peak flow and t = 200 ms for case 44-Pre.

TABLE 6 | Comparison of clinically estimated flow rates and simulated flow rates through the various planes for cases 28-Pre and 44-Pre.

We show validation results for two selected clinical CoA cases under pre-treatment conditions and compare between pretreatment and post-treatment for one patient case in which the CoA was anatomically modified by a virtual stenting procedure. Further, we demonstrate numerical tractability of the implemented approach by providing strong scaling benchmark results. The overall cost of the entire work flow for building, fitting and execution of EMF simulations is comparable to plain image-based kinematic driver models (Mittal et al., 2016), suggesting that the proposed methodology may be, in principle, compatible with clinical time scales.

### 4.1. Biomechanical Modeling vs. Image-Based Kinematics

Modalities such as CMR and cardiac CT, on the other hand, provide excellent spatial resolution. CMR has an in-plane resolution of 1.5 × 1.5 mm, but more limited through-plane resolution (typically about 8 mm), while CT is capable of isotropic spatial resolution on the sub-millimeter scale (≈ 0.5 mm) and clear delineation of trabeculae and lumen boundaries. CMR has the advantage of higher temporal resolution (30–50 ms), while temporal resolution in CT depends on the scanning system (50–200 ms). This is orders of magnitude lower than the temporal resolution required for the flow simulation (≈ 1,000 phases per cardiac cycle), and appropriate interpolation methods need to be employed to create CFD-ready models. This stage of model generation has been very difficult to automate, and remains the biggest bottleneck for patient-specific cardiac flow modeling. Compared to pure image-based kinematic approaches, our model is able to compute, e.g., the spatio-temporal distribution of wall stresses, power density, the length of diastolic intervals available for myocardial perfusion, O<sub>2</sub> consumption, and metabolic supply/demand ratios. The variations of all these parameters in response to a changed afterload, as well as many other biomarkers of physiological interest, can be derived, which is not feasible with image-based models.

### 4.2. Kinematic Transfer to CFD Blood Pool Model

Both patients modeled in this study featured healthy EFs of > 60 %, that is, EF was ≈ 65 % in both cases. At such high EFs the wall motion of the LV is significant, leading to substantial reductions in the LV blood pool volume. IB methods (Vigmond et al., 2008a; Seo and Mittal, 2013; Choi et al., 2015) are known to cope more conveniently with the large deformation of the CFD blood pool (Quarteroni et al., 2017). IB methods and other non-boundary-fitting methods rely on a fixed fluid mesh, and the moving wall of the ventricle is not explicitly tracked. The coupling between the CFD mesh and the structure is performed via Dirac delta functions (IB) or Lagrange multipliers (fictitious domain methods) and is usually realized by introducing additional degrees of freedom on interface cut elements. While mesh generation is only necessary prior to computation, fixed mesh methods typically require adaptive mesh refinement or modifications (Wang and Liu, 2004) to obtain reasonable accuracy for the solution near the fluid-solid interface.

In contrast, ALE algorithms capture the fluid-solid interface more accurately, are in general stable and easy to implement, introduce no extra degrees of freedom, and have comparatively low computational costs (Tallec and Mouro, 2001; van Loon et al., 2007). However, it is often assumed that unstructured FE approaches, as implemented in this paper, critically depend on automatic remeshing strategies (Long et al., 2013) to keep mesh quality within acceptable bounds (Mittal et al., 2016). Our study demonstrates that this may not necessarily be the case. While the mesh quality decreased with deformation over the course of ejection, the linear elastic deformation of the CFD blood pool mesh combined with the quality-based stiffening approach prevented the degeneration of any elements. The number of elements in which element quality degraded noticeably was very small. As illustrated in **Figure 6**, virtually all elements of reduced quality were located in the higher resolution boundary layer of the CFD blood pool mesh. According to the element quality metric used, an element quality of 1 refers to a fully degenerated element of zero volume. Despite the significant compression of the blood pool mesh, not a single element was deformed to this degree. Even when applying a stricter threshold where element quality is deemed poor if the quality indicator is > 0.8, which is not critical from a numerical point of view, the number of elements in this range remained small at < 0.8 % (**Figure 6**). The worst element quality observed in the entire mesh was 0.9994. Using a threshold of > 0.95, where element quality may be sufficiently poor to impact more notably on solver performance, only 24 out of 2,506,987 elements were found. Nonetheless, an increase in the number of linear iterations required for convergence was observed, which is likely to be linked to the gradual degradation of element quality. The number of iterations per solver step increased from ≈ 17 iterations during early ejection up to ≈ 80 iterations during late ejection. While the more than fourfold increase in linear iterations negatively impacted overall solver performance and rendered simulations computationally more expensive, the complexity of automatic remeshing was avoided. We consider this to be of pivotal importance, as automatic remeshing in combination with an MPI parallel FE solver is definitely feasible, but highly non-trivial to implement robustly and efficiently.

### 4.3. Computational Feasibility

Computational feasibility of human scale cardiac simulations by using strongly scalable numerical implementations has been demonstrated previously for electrophysiology (Niederer S. et al., 2011) and mechanics (Augustin et al., 2016b). More recently, we reported on a novel reaction-eikonal model which reduces the cost of EM simulations significantly by alleviating constraints imposed by reaction-diffusion models upon mesh resolution (Neic et al., 2017). In this study, this recent reaction-eikonal approach was used for simulating EM using the same FE grid with an average resolution of ≈1 mm for both EP and mechanics. Such lower resolutions suffice for solving for mechanics with sufficient accuracy (Land et al., 2015). The overall reduction in terms of nodes and degrees of freedom reduces the compute cost substantially, rendering simulations in desktop environments feasible. Using 96 cores, EM simulations of a full cardiac cycle only lasted ≈180 min which facilitated sufficiently short simulation cycles for efficient model fitting. The entire workflow for building and parameterizing one patient-specific EM model is feasible within a day.

Owing to the higher resolution of the blood pool mesh and the presence of a refined boundary layer, the number of nodes and degrees of freedom was higher than for EM simulations: around 350,000/1,500,000 nodes/degrees of freedom for case 28-Pre and 400,000/1,700,000 nodes/degrees of freedom for case 44-Pre. To assess strong scaling properties of our CFD solver implementation, the resolution was further increased to 1,300,000/5,000,000 nodes/degrees of freedom for case 28-Post to cover a wider range of core counts. Strong scaling efficiency leveled off when doubling from 768 to 1,536 cores. The local compute load with 1,536 cores was 900/2,600 nodes/degrees of freedom per core. The patient simulations were performed using 384 cores, resulting in a load per core of about 900/2,700 nodes/degrees of freedom. At these resolutions CFD simulations were executed in ≈ 40 min, suggesting that compatibility with clinical time frames will be achievable.
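To make the notion of strong scaling efficiency concrete, the following minimal sketch computes the parallel efficiency relative to the smallest core count for a fixed problem size. The function name and, in particular, the wall-clock times are hypothetical illustration values and are not measurements from this study.

```python
def strong_scaling_efficiency(cores, runtimes):
    """Parallel efficiency relative to the smallest core count.

    efficiency(N) = (T_ref * N_ref) / (T_N * N); 1.0 means ideal scaling.
    """
    n_ref, t_ref = cores[0], runtimes[0]
    return [(t_ref * n_ref) / (t * n) for n, t in zip(cores, runtimes)]


# Hypothetical wall-clock times in seconds (NOT measurements from the study).
cores = [192, 384, 768, 1536]
runtimes = [4000.0, 2100.0, 1150.0, 900.0]

for n, eff in zip(cores, strong_scaling_efficiency(cores, runtimes)):
    print(f"{n:5d} cores: efficiency = {eff:.2f}")
```

In such a table, a leveling off shows up as a marked drop in efficiency at the largest core count, where the work per core becomes too small to amortize communication.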

### 4.4. Limitations

In the presented modeling approach numerous simplifying assumptions were made which may affect the biophysical fidelity of the model. In particular, while the aorta was taken into account as a solid structure in the EM simulations, its biomechanical description was simplified by assuming isotropic behavior, that is, the fibrous organization of aortic walls remained unaccounted for (Augustin et al., 2014). Further, as our main focus was on the EM of the LV and, to a much lesser degree, on the aorta, the aortic lumen remained unpressurized and, in the absence of distensibility measurements of the aortic wall, parameters of the passive biomechanics model used for the aortic wall were not fitted. Thus the model of the aorta does not respond to the rise in pressure during ejection with an adequate distension ΔV of its lumen. In the CFD simulations ΔV ≈ 0 translates into a stiff aorta of low compliance, which may cause a bias toward overestimation of the computed pressure fields. Further, the influence of the aortic valve upon blood flow was not taken into account. Rather, it was assumed that with the start of ejection the aortic valve is in its fully open configuration, which allows blood flow over the entire orifice area and in which the valve does not influence the blood flow out of the LV in a significant way. Since only CoA patients who showed no indications of AVD were modeled, this simplifying assumption may be well justified.

A potential main strength of the presented modeling approach, the ability to predict the biomechanical response of the LV to changed flow patterns in the aorta, was not exploited. Due to the weak FSI coupling, the immediate feedback of altered flow or changed pressure gradients in the aorta on LV biomechanics was ignored. In our current modeling approach any such feedback must be mediated through changes in the parameterization of the lumped afterload model. However, owing to regulatory mechanisms at the level of the circulatory system, this is not directly predictable with the modeling setup used in this study, as flow distribution through the four outlets will be influenced by factors which cannot be accounted for in a model comprising only LV, aorta and lumped outflow impedances. In any case, one cannot assume that the computed changes in pressure gradients across a CoA translate directly into a reduction in LV peak pressure. Independently of the modeling approach taken, be it a strongly or weakly coupled FSI model, a lumped model of systemic regulation is likely to be necessary to predict altered LV loading under post-treatment conditions (Arts et al., 2005; Lumens et al., 2009). Compared to a fully coupled FSI model, our approach is limited in the sense that CFD simulations do not influence the behavior of the EM model. However, in many clinical settings CFD simulations in the aortic arch and LV with image-based kinematics prevail.

Image-based kinematic models can only depict the status quo of a patient. With our personalized EM model, based on first principles, we can run simulations in which the motion is altered simply by changing input parameters. The altered motion is then reflected in the CFD simulation. Examples would include changes in heart beats, infarcts or LBBB conditions.

In this work, the effect of stenting was only accounted for by a geometric change in the computational geometry and an ad hoc adjustment of the lumped model parameters. In future studies, we intend to use a 1-D model of the arterial tree coupled to a 0-D lumped model at the aortic outlets, thus being able to account for the effect of stenting in a more detailed fashion, see for example Quarteroni et al. (2017). As a first step toward our ultimate goal of a fully coupled FSI model that is based entirely on first principles, we will add the dynamic fluid pressure (ρ<sub>f</sub>/2) |**u**<sub>f</sub>|<sup>2</sup> to the pressure of the lumped model (0-D or 1-D). This results in a spatio-temporal pressure inside the LV and the aorta. To incorporate the dynamic feedback of fluid upon structure, we will iterate between a CFD solving step and an EM solving step within each time step to guarantee a converged solution.
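To give a feeling for the magnitude of the dynamic pressure term, the sketch below evaluates half the fluid density times the squared velocity magnitude for a few flow speeds and converts the result to mmHg. The blood density of 1060 kg/m³ and the chosen speeds are assumptions made for illustration only; they are not values reported in this study.

```python
PA_PER_MMHG = 133.322  # pascals per millimetre of mercury


def dynamic_pressure_mmhg(speed_m_s, density_kg_m3=1060.0):
    """Dynamic pressure 0.5 * rho_f * |u_f|^2, converted from Pa to mmHg.

    The default density is an assumed, typical value for blood.
    """
    return 0.5 * density_kg_m3 * speed_m_s ** 2 / PA_PER_MMHG


# Illustrative flow speeds in m/s (hypothetical, not taken from the study).
for speed in (1.0, 2.0, 4.0):
    print(f"|u_f| = {speed:.1f} m/s -> {dynamic_pressure_mmhg(speed):.1f} mmHg")
```

Because the term grows quadratically with velocity, it is small for slow flow but may become comparable to clinically relevant pressure gradients in a high-velocity jet.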

### 5. CONCLUSION

Biophysically detailed models of LV EM can be efficiently built and parameterized with clinical data to be considered a viable option for patient-specific simulation. Similar to image-based kinematic models, such biophysics-based EM models can be used as a kinematic driver for simulating cardiac and vascular hemodynamics. The cost of model building and execution is comparable between the two approaches. Biophysical EM models offer the significant advantage of being based entirely on first principles and, as such, may allow predictions of how interventions that alter pressure and flow patterns affect LV performance. In contrast, image-based kinematics modeling may provide a more accurate representation of blood pool motion, at least under pre-treatment conditions or post-treatment conditions secondary to interventions which do not influence LV kinematics in a significant way.

### AUTHOR CONTRIBUTIONS

EK and GP contributed to the conception and design of the study; LG and TK acquired and processed clinical data; EK, MG, AN, and CA developed numerical methodology; AP contributed by conceiving modeling workflows and FE meshing; LM and MG developed parameterization of the electromechanical model; EK, MG, CA, and GP analyzed and interpreted simulation data; EK, CA, and GP drafted the article; EK, CA, MG, LG, and GP critically revised the article. All authors contributed to manuscript revision, read, and approved the submitted version.

### FUNDING

This research was supported by the grants F3210-N18 and I2760-B30 from the Austrian Science Fund (FWF), the EU grant CardioProof agreement 611232 and a BioTechMed award to GP, and a Marie Skłodowska–Curie fellowship (GA 750835) to CA. We acknowledge PRACE for awarding us access to resource ARCHER based in the UK at EPCC (grant CAMEL) and the Vienna Scientific Cluster VSC-3.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fphys.2018.00538/full#supplementary-material


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Karabelas, Gsell, Augustin, Marx, Neic, Prassl, Goubergrits, Kuehne and Plank. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Computational Evaluation of Cochlear Implant Surgery Outcomes Accounting for Uncertainty and Parameter Variability

Nerea Mangado<sup>1</sup>, Jordi Pons-Prats<sup>2</sup>, Martí Coma<sup>2</sup>, Pavel Mistrík<sup>3</sup>, Gemma Piella<sup>1</sup>, Mario Ceresa<sup>1</sup> and Miguel Á. González Ballester<sup>1,4</sup>\*

<sup>1</sup> BCNMedTech, Universitat Pompeu Fabra, Barcelona, Spain, <sup>2</sup> International Center for Numerical Methods in Engineering, Barcelona, Spain, <sup>3</sup> Med-EL, Innsbruck, Austria, <sup>4</sup> ICREA, Barcelona, Spain

#### Edited by:

Mariano Vázquez, Barcelona Supercomputing Center, Spain

#### Reviewed by:

Mark Potse, Inria Bordeaux-Sud-Ouest Research Centre, France Anuj Agarwal, Signal Solutions LLC, United States

> \*Correspondence: Miguel Á. González Ballester ma.gonzalez@upf.edu

#### Specialty section:

This article was submitted to Computational Physiology and Medicine, a section of the journal Frontiers in Physiology

Received: 14 December 2017 Accepted: 18 April 2018 Published: 23 May 2018

#### Citation:

Mangado N, Pons-Prats J, Coma M, Mistrík P, Piella G, Ceresa M and González Ballester MÁ (2018) Computational Evaluation of Cochlear Implant Surgery Outcomes Accounting for Uncertainty and Parameter Variability. Front. Physiol. 9:498. doi: 10.3389/fphys.2018.00498

Cochlear implantation (CI) is a complex surgical procedure that restores hearing in patients with severe deafness. The successful outcome of the implanted device relies on a group of factors, some of them unpredictable or difficult to control. Uncertainties in the electrode array position and the electrical properties of the bone make it difficult to accurately compute the current propagation delivered by the implant and the resulting neural activation. In this context, we use uncertainty quantification methods to explore how these uncertainties propagate through all the stages of CI computational simulations. To this end, we employ an automatic framework that spans from the finite element generation of CI models to the assessment of the neural response induced by the implant stimulation. To estimate the confidence intervals of the simulated neural response, we propose two approaches. First, we encode the variability of the cochlear morphology among the population through a statistical shape model. This allows us to generate a population of virtual patients using Monte Carlo sampling and to assign to each of them a set of parameter values according to a statistical distribution. The framework is implemented and parallelized in a High Throughput Computing environment that makes it possible to maximize the available computing resources. Second, we perform a patient-specific study to evaluate the computed neural response and seek the optimal post-implantation stimulus levels. Considering a single cochlear morphology, the uncertainty in tissue electrical resistivity and surgical insertion parameters is propagated using the Probabilistic Collocation method, which reduces the number of samples to evaluate. Results show that bone resistivity has the highest influence on CI outcomes. In conjunction with the variability of the cochlear length, the worst outcomes are obtained for small cochleae with high resistivity values. However, the effect of the surgical insertion length on the CI outcomes could not be clearly observed, since its impact may be concealed by the other considered parameters. Whereas the Monte Carlo approach implies a high computational cost, Probabilistic Collocation presents a suitable trade-off between precision and computational time. Results suggest that the proposed framework has great potential to help in both surgical planning decisions and in the audiological setting process.

Keywords: cochlear implant, surgical outcomes prediction, automatic framework, uncertainty analysis, finite element models, computational modeling, Monte Carlo, probabilistic collocation method

## 1. INTRODUCTION

Computational models have shown the potential to predict the performance of implantable devices, providing valuable information to guide pre-operative decisions, assisting surgical planning and supporting implant optimization processes. Although they are not yet used in daily clinical practice, they have provided promising results for the prediction of cochlear implantation (CI) outcomes (Kalkman et al., 2014; Ceresa et al., 2015; Malherbe et al., 2015; Nogueira et al., 2016). CI is a surgical procedure that aims at restoring functional hearing via an implanted device that electrically stimulates the auditory nerves. Over the last decades, technological advances have helped to significantly improve speech perception in implanted patients. Yet, some cases show suboptimal results, and we contend that this is partly due to a lack of appropriate surgical planning tools.

Advanced computational modeling and simulations could help to guide and assist pre- and post-operative decisions to optimize the surgical outcome. However, computational studies that consider a set of pre-defined parameters may lead to inaccurate results since they do not account for the inherent uncertainty of model parameters, or the large inter-patient variability. This uncertainty and parameter variability have been shown to affect CI outcomes (Finley et al., 2008; van der Marel et al., 2014). Patient-specific cochlear anatomy has been identified as one of the main factors that determine the intra-cochlear electrode array (EA) position (van der Marel et al., 2014). However, it presents a large variability across patients, leading to a high variation in the EA intra-cochlear position (Finley et al., 2008; van der Marel et al., 2014; Venail et al., 2015) and a broad range of post-operative speech perception scores (Yukawa et al., 2004). Low scores may be the consequence of confused pitch perception or loss of some frequency range due to a mismatch between the electrode location and the frequency distribution of the adjacent auditory nerve fibers (ANF) (Rebscher et al., 2008). This makes it harder for the patient to adapt to the CI and, consequently, reduces the possible implant benefits (Rebscher et al., 2008; van der Marel et al., 2014).

Geometrical aspects, such as surgical insertion depth, are not the only factors affecting CI success. Both geometry and electrical properties of the tissues determine the voltage spread throughout the inner ear. A change in these parameters alters the potential distribution, which is critical to evoke the desired neural response. Tissue electrical resistivity values employed in computational CI models were originally obtained from animal data, and they are still used nowadays (Hanekom and Hanekom, 2016). Nonetheless, electrical properties of bone tissue exhibit the largest variability in humans (Hanekom and Hanekom, 2016). Specifically, bone electrical resistivity has been shown to be easily modified by changes in density, which is affected by the chemical composition or some diseases, such as osteosclerosis (Mens et al., 1999). Although the electrical resistivity of the bone has been adapted to a more precise value according to recent studies (Mens et al., 1999; Rattay et al., 2001a; Malherbe et al., 2015), its value cannot be obtained accurately in patients. Hence, the effect of bone tissue on neural excitation profiles remains uncertain.

Despite the large number of techniques employed to study parameter variability and uncertainties in finite element (FE) models (Mangado et al., 2016b), the Monte Carlo (MC) method is the most popular because it makes it easy to generate a set of models and to compute an FE analysis for each of them. However, in some studies the associated computational cost is prohibitive when a large set of samples is evaluated, and thus less computationally expensive methods are required. In this work, we propose to reduce the computational cost of our study using the Probabilistic Collocation method (PCM), which, without modifying the numerical formulation of the FE model, allows evaluating the system outcomes with a reduced number of samples.

Our aim is to study the outcomes of CI computational models considering parameter uncertainty and variability for the prediction of neural response to support optimization processes for surgical planning and implant design. To this end, we make use of our framework for the complete functional assessment of CI (Mangado et al., 2016a), and we combine it with uncertainty quantification methods. First, we study the CI outcomes in a virtual population using the MC method. Due to the large amount of time required for such an uncertainty quantification study, a High Throughput Computing (HTC) environment is used to considerably reduce the overall time of computational analysis. Second, we focus on the implant performance in a patient-specific case using PCM. This reduction of the time required for the study allows us to seek the optimal stimulus levels delivered by the implanted electrode (a highly time-consuming process), thus providing a favorable setup for programming the implant in the given patient during the post-intervention procedure.

## 2. MATERIALS AND METHODS

In this section, first a brief description of the computational framework employed for the evaluation of CI models is introduced (section 2.1). The automatic framework consists of three main blocks: (1) the generation of the computational models, (2) their functional assessment and (3) the evaluation of their outcome. Then, the identification and characterization of the different sources of uncertainty and variability are presented (section 2.2). Finally, uncertainty quantification methods to propagate parameter variability and uncertainty through the CI simulations to the system output are described (section 2.3).

### 2.1. Computational Framework for CI Assessment

#### 2.1.1. CI Computational Model Generation

The first block of the framework is composed of a statistical shape model (SSM), a virtual insertion algorithm and a three-dimensional full model of the head. The SSM is a compact representation learned from a training population of the shapes extracted from imaging data. It encodes the shape variability in the population by a small set of weights modulating the contribution of the main modes of variation around the mean shape (Cootes and Taylor, 1995) (**Figure 1** Step 1). By modulating these weights within a limited range, the mean shape of the cochlea is deformed so that anatomically plausible cochlear morphologies are obtained (further implementation details are given by Mangado et al., 2016a; Gerber et al., 2017). Therefore, we can obtain a set of cochlear surfaces, each of them created from a different combination of the scalar weights (**Figure 2**). Here, this set of surfaces is referred to as a population of virtual patients. The surgical trajectory of the EA insertion is computed via our surgical planning software based on the open source simulation framework SOFA (Allard et al., 2007). This surgical trajectory is matched to the centerline of the EA mesh by using a parallel transport frame algorithm (Mangado et al., 2016a), which allows the EA mesh to be adapted geometrically to the obtained insertion trajectory for a given virtual patient (**Figure 1** Step 2). The parametrization of the virtual EA insertion gives control over the insertion depth (Mangado et al., 2016a). Cochlear anatomies of two virtual patients with two different insertion depths are shown in **Figures 3A,B**. The EA is based on the Med-EL Flex28 design, with 12 electrodes numbered from E1 to E12. The virtual patient's cochlea and the virtually inserted array are coupled with a generalized model of the brain, scalp and skull. For the subsequent FE simulations, all the elements are transformed into a single volumetric mesh of approximately 2 · 10<sup>6</sup> tetrahedral elements free of intersections (**Figure 1** Step 3) (Mangado et al., 2017a).
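The way a statistical shape model generates new anatomies from a few weights can be sketched as follows. This is a minimal toy example of a PCA-based point distribution model in the spirit of Cootes and Taylor (1995); the random training data, the array sizes and the function name `generate_shape` are assumptions for illustration and do not reproduce the cochlear SSM used in the framework.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy training set: 20 "shapes", each 50 three-dimensional points flattened
# to 150 values. Real shapes would come from segmented imaging data.
training = rng.normal(size=(20, 150))

mean_shape = training.mean(axis=0)
# PCA via SVD of the centered data; the rows of vt are the modes of variation.
_, singular_values, vt = np.linalg.svd(training - mean_shape, full_matrices=False)
std_per_mode = singular_values / np.sqrt(training.shape[0] - 1)


def generate_shape(weights):
    """Deform the mean shape; weights are given in standard deviations per mode."""
    weights = np.asarray(weights, dtype=float)
    k = weights.size
    return mean_shape + (weights * std_per_mode[:k]) @ vt[:k]


# One virtual anatomy from three weights (values chosen arbitrarily).
virtual_patient = generate_shape([1.5, -0.5, 2.0]).reshape(-1, 3)
print(virtual_patient.shape)
```

Setting all weights to zero returns the mean shape, and restricting the weights to a limited range keeps the generated shapes close to the training population.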

#### 2.1.2. CI Functional Assessment

The second block encompasses the simulations of the electrical field and the ANF model for the assessment of the evoked neural response. The potential distribution is computed by the FE method (**Figure 1** Step 4) considering a monopolar configuration according to the stimulation strategy used by the implant design: one intra-cochlear electrode is set as the active source, while the return is defined as the reference electrode located on the scalp (Mangado et al., 2017a). In the current work, the intra-cochlear electrode delivers a biphasic cathodic-first pulse of 100 µs, similar to previously reported studies (Rattay et al., 2001a,b), with an intensity of 350 µA.

The neural response provoked by the activation of the intra-cochlear electrodes is computed by the ANF model (**Figure 1** Step 5). This multi-compartment fiber model reproduces the active behavior of the neural cell membrane according to ionic channel kinetics (Hodgkin and Huxley, 1952), adjusted to the human temperature to fit the temporal behavior of the human ANF (Rattay et al., 2001a, 2013). The neural activity is considered as a single spike induced by the depolarization of the neuron, which generates an action potential that is propagated through the ANF. The external stimulation used to initiate this neural response corresponds to the potential value obtained by the FE simulation at the specific spatial location (Rattay et al., 2001a,b). These locations are equal to the ANF compartment coordinates, modeled according to the 3D model of the patient's cochlea and considering the human ANF morphology (Mangado et al., 2016a, 2017a). The model includes 334 nerve fiber bundles. As the human cochlea has approximately 30,000 nerve fibers, each fiber bundle represents approximately 90 neural fibers, retaining enough frequency resolution. **Figures 3G–J** show examples of four different neural responses for the presented examples.

#### 2.1.3. CI Outcome Evaluation

The third block of the framework assesses the implant performance. Here, the patient's neural response is evaluated by an activation map (Mangado et al., 2017a), where rows represent the frequency bandwidth of each ANF bundle and columns the electrode delivering the stimulus (see **Figure 4**). A target activation map (**Figure 4A**) describes the ideal excitation according to the tonotopic map of the cochlea, selectively stimulating the desired ANF. This tonotopic map provides a specific pitch perception according to the location of the evoked ANF, capturing high frequencies at the base and low frequencies at the apex of the cochlea (Greenwood, 1990; Stakhovskaya et al., 2007).
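As an illustration of a tonotopic map, the sketch below evaluates the Greenwood (1990) place-frequency function with the constants commonly quoted for the human cochlea (A = 165.4, a = 2.1, k = 0.88, with position expressed as a fraction of the length measured from the apex). Whether the framework uses exactly this parametrization is an assumption; the function name is hypothetical.

```python
def greenwood_frequency_hz(x_from_apex, A=165.4, a=2.1, k=0.88):
    """Characteristic frequency at relative position x (0 = apex, 1 = base).

    Greenwood's function F = A * (10**(a * x) - k) with the commonly quoted
    human constants; using these defaults here is an assumption.
    """
    return A * (10.0 ** (a * x_from_apex) - k)


# Low frequencies at the apex, high frequencies at the base.
for x in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"x = {x:.2f} -> {greenwood_frequency_hz(x):9.1f} Hz")
```

With these constants the map spans roughly 20 Hz at the apex to about 20 kHz at the base, which matches the usual range of human hearing.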

FIGURE 4 | Activation maps for (A) the desired and (B) the actual neural response, and (C) mismatch map computed in a randomly generated virtual patient. Each electrode on the array is numbered, from the tip (E1) to the base of the array (E12). The actual activation map is split and evaluated according to the stimulation found in the half turn of the cochlea where the mid target frequency is located at the middle of the cochlea section evaluated (D). The activation at the rest of the cochlea (E) is considered as cross-turn stimulation. Local performance score for E6 (F) and local cross-turn score for E2 (G). Activation profiles of both electrodes are highlighted in blue in their corresponding maps.

The actual activation map computed by the computational framework (**Figure 4B**) is then compared with this target map, which leads to a mismatch map (**Figure 4C**). We propose a set of measures using this mismatch map to quantify the neural response to assess the final CI outcome of the patient. We evaluate the global implant performance by the neural activation specificity (true negative rate). We also evaluate two local effects: the frequency selectivity and the cross-turn stimulation (**Figures 4D,E**). The frequency selectivity defines the mismatch between excited frequencies due to a non-focused current stimulation. We refer to this measure as the local performance score. Cross-turn stimulation corresponds to the excitation of the ANF that are located half a turn further from the desired frequency bandwidth. Therefore, the second local measure, named cross-turn stimulation score, evaluates the nonselective ANF activation (**Figures 4F,G**).
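The comparison of a target and an actual activation map can be sketched with binary arrays. The tiny maps below (6 fiber bundles by 3 electrodes) and the sign convention of the mismatch map are hypothetical; only the definition of specificity as the true negative rate, and of sensitivity as the true positive rate, is standard.

```python
import numpy as np


def specificity_and_sensitivity(target, actual):
    """True negative rate and true positive rate of `actual` against `target`."""
    target = np.asarray(target, dtype=bool)
    actual = np.asarray(actual, dtype=bool)
    tp = np.sum(target & actual)    # desired fibers that fired
    tn = np.sum(~target & ~actual)  # undesired fibers that stayed silent
    fp = np.sum(~target & actual)   # undesired fibers that fired
    fn = np.sum(target & ~actual)   # desired fibers that stayed silent
    return tn / (tn + fp), tp / (tp + fn)


# Toy maps: rows are fiber bundles, columns are electrodes (1 = activated).
target = np.array([[1, 0, 0], [1, 0, 0], [0, 1, 0],
                   [0, 1, 0], [0, 0, 1], [0, 0, 1]])
actual = np.array([[1, 0, 0], [1, 1, 0], [0, 1, 0],
                   [0, 1, 1], [0, 0, 1], [0, 0, 0]])

# Assumed convention: +1 unwanted activation, -1 missed activation, 0 agreement.
mismatch = actual - target
specificity, sensitivity = specificity_and_sensitivity(target, actual)
print(mismatch)
print(f"specificity = {specificity:.3f}, sensitivity = {sensitivity:.3f}")
```

In this toy case two undesired and one missed activation give a specificity of 10/12 and a sensitivity of 5/6.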

To compute these two scores, the activation map is split into two: one analyzing the half turn of the cochlea where the center corresponds to the mid target frequency, and another representing the activation at the rest of the cochlea (i.e., cross-turn stimulation) (see **Figures 4D,E**, respectively). We consider that the target bandwidth of each electrode has a modified Gaussian distribution and, given an activation map, assign positive and negative values to acceptable (up to 3 mm of bandwidth) and non-acceptable activation, respectively (see **Figure 4F**). A frequency bandwidth broader than 3 mm would imply a change in tone and a confusing pitch for the patient (Mistrík and Jolly, 2016). Therefore, cross-turn stimulation areas are penalized. This leads to a performance measure, one for each electrode, where the mid value corresponds to zero stimulation, the maximum to the ideal activation profile and the minimum to the inverse profile, i.e., the activation of all non-desired ANF exclusively. The described performance measure is applied to both maps, yielding for each virtual patient a value of local performance and cross-turn stimulation score for each electrode (**Figures 4F,G**). For interpretation, both scores are mapped to the range (0, 100)% (for further details, see Mangado et al., 2017a).

Post-implantation stimulus comprises the stimulation threshold, T-level, and the maximum amplitude of stimulation, C-level. T-level defines the amplitude at which the first neural response within the desired target bandwidth is obtained. The desired target bandwidth is defined according to the EA design. C-level is here considered to be reached when the maximum recruitment of ANF within the desired target bandwidth is accomplished, while minimizing the cross-turn stimulation and avoiding frequency overlap. Therefore, C-level corresponds to the stimulation level of each electrode that provides the highest values of both specificity and sensitivity of the mismatch map.

### 2.2. Uncertainty and Variability Characterization

Uncertainty and variability sources considered in the current study were the insertion depth of the EA, the cochlear anatomy and the bone electrical resistivity. The EA insertion depth was characterized by a normal distribution with mean µ = 27 mm and standard deviation σ = 1 mm to cover the possible range found in the population. This mean value was reported previously in our computational model—with this cochlear anatomy—to be the most reliable to obtain the best CI outcome, and therefore, considered as the target depth (Mangado et al., 2017b). For the patient-specific study, we considered a standard deviation of 0.5 mm related to the inherent uncertainty due to the surgical insertion procedure.

Since the active stimulation range of the EA design is 23.1 mm, the minimum insertion depth was defined as 24.1 mm (active stimulation range plus 1 mm of the tip of the EA) to ensure a full insertion, that is, all electrode contacts of the EA inside the cochlea. The insertion depth was measured from the round window. When a large insertion depth was sampled for a cochlear anatomy with small dimensions, we took the deepest insertion allowed by the cochlear duct. **Figure 3** shows an example of a small (Virtual patient A) and a large cochlea (Virtual patient B), whose Organ of Corti lengths differ by 5.5 mm, with their shortest and longest possible insertions.

We characterized the variability of the cochlear anatomy by modifying the weights of the first three principal components of the SSM (see section 2.1.1). These weights were sampled from normal distributions with mean and standard deviation of 0 and 1, respectively, with maximum values of ±3. This avoids obtaining unrealistic shapes with high deformations, while ensuring plausibility of the shape anatomy. For higher standard deviation values, the generated cochlea presents a larger deformation (see **Figure 2**). The size of the cochlea was described by the length of the osseous spiral lamina, an inner structure located between the Organ of Corti (around 33 mm) and the modiolus wall (around 15 mm) (Stakhovskaya et al., 2007; Rask-Andersen et al., 2012; Venail et al., 2015), visible on our model and µCT images (Rask-Andersen et al., 2012; Martin et al., 2016). In the patient-specific study, the morphology was considered a known factor, defined as the mean shape of the SSM, with a length of the osseous spiral lamina of 25.3 mm.

Based on recent studies reporting the influence of bone resistivity in CI models (Malherbe et al., 2016), we defined the bone resistivity parameter as normally distributed, with values µ = 65.0 Ω·m and σ = 21.6 Ω·m. These values were obtained by matching electric field profiles to clinical data in a small number of computational models considering a broad range of bone resistivity values (Nelson et al., 2008; Tang et al., 2012; Malherbe et al., 2016).
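The characterization above can be turned into a sampling routine along the following lines. This is a minimal sketch: the function name, the use of clipping to enforce the ±3 limit on the SSM weights, and the small positive floor on the resistivity are assumptions, and the anatomy-dependent upper limit on the insertion depth is not modeled here.

```python
import numpy as np

rng = np.random.default_rng(42)


def sample_virtual_patients(n):
    """Draw n parameter sets for the population study (illustrative only)."""
    # SSM weights: standard normal, limited to +/-3. Clipping is an assumption;
    # redrawing out-of-range values would be an alternative.
    ssm_weights = np.clip(rng.normal(0.0, 1.0, size=(n, 3)), -3.0, 3.0)
    # Insertion depth in mm: mean 27, standard deviation 1, and never shorter
    # than the 24.1 mm required for a full insertion.
    depth_mm = np.maximum(rng.normal(27.0, 1.0, size=n), 24.1)
    # Bone resistivity: mean 65.0, standard deviation 21.6, in the units of
    # the text above. The positive floor of 1.0 is a hypothetical safeguard.
    resistivity = np.maximum(rng.normal(65.0, 21.6, size=n), 1.0)
    return ssm_weights, depth_mm, resistivity


weights, depth_mm, resistivity = sample_virtual_patients(1000)
print(weights.shape, depth_mm.min(), resistivity.mean())
```

Each row of the three arrays then defines one virtual patient that is passed through the automatic framework of section 2.1.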

### 2.3. Uncertainty and Variability Propagation and Quantification

We considered two different non-intrusive approaches, which did not modify the described CI framework. The first study used MC sampling to generate a population of virtual patients according to the variability of the cochlear anatomy and the uncertainty sources described in section 2.2. The second study used both MC sampling and PCM to evaluate the neural response in a patient-specific case.

The analysis via MC was performed as a set of individual evaluations that did not depend on each other, so it is easily parallelizable. This allowed us to use an HTC environment called HTCondor, which makes it easy to create a grid of computers, maximizing the amount of available computing resources (Thain et al., 2005). MC sampling was implemented in an HTCondor pool (8 nodes and 40 cores), on both Windows and Linux platforms, to evaluate a large set of patients using our automatic framework (section 2.1). Nonetheless, the MC sampling technique still required a large number of simulations, and hence a high computational cost, to obtain a satisfactory accuracy. For this reason, to drastically reduce the number of samples, the second study explored the use of PCM to assess the neural response in a patient-specific case, while accounting for the uncertainty sources.
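Because the MC evaluations are mutually independent, they can be dispatched to any pool of workers. The sketch below uses a local thread pool from the Python standard library purely as a stand-in for a distributed scheduler such as HTCondor; `evaluate_sample` is a hypothetical placeholder returning a dummy value, not the actual simulation pipeline.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate_sample(sample):
    """Placeholder for one full pipeline run (meshing, FE solve, nerve model)."""
    depth_mm, resistivity = sample
    return depth_mm / resistivity  # dummy output, NOT a physical model


# Hypothetical parameter sets: (insertion depth, bone resistivity).
samples = [(27.0, 65.0), (26.2, 80.1), (27.9, 43.5), (25.4, 60.0)]

# No sample depends on another, so all of them can run concurrently.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(evaluate_sample, samples))

print(results)
```

The same pattern carries over to a job scheduler, where each sample becomes one queued job and the results are collected once all jobs have finished.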

PCM (Loeven and Bijl, 2008) is a numerical technique to solve stochastic differential equations using (Lagrange) polynomial interpolation and Gaussian quadrature. We used PCM to approximate our model's response, treated as a random field, as a weighted sum of N<sub>p</sub> Lagrange polynomial functions of the uncertain input parameters. Let f(**x**, ω) be the random field, a function of the (deterministic) **x** and the random variable ω, expanded as:

$$f(\mathbf{x}, \omega) \approx \sum_{i=1}^{N_p} f_i(\mathbf{x}) \cdot L_i(\xi(\omega)) \tag{1}$$

where f<sub>i</sub>(**x**) is the value of f(**x**, ω) evaluated at the interpolation point ω<sub>i</sub> (called a collocation point), ξ is the random basis (chosen so that the uncertain input parameter is a linear transformation of ξ), and L<sub>i</sub> is the Lagrange interpolating polynomial chaos of order n = N<sub>p</sub> − 1 corresponding to ω<sub>i</sub> (i.e., L<sub>i</sub>(ξ(ω)) passes through the N<sub>p</sub> collocation points, with L<sub>i</sub>(ξ(ω<sub>j</sub>)) = δ<sub>ij</sub>) (Loeven et al., 2007).
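The interpolation property stated above (each Lagrange polynomial equals one at its own collocation point and zero at the others) can be checked with a short sketch. The node values and function names are illustrative assumptions; the three nodes used here are those of a three-point Gauss-Hermite rule for a standard normal variable.

```python
import math

def lagrange_basis(nodes, i, xi):
    """Evaluate the i-th Lagrange polynomial defined on `nodes` at the point xi."""
    value = 1.0
    for j, node in enumerate(nodes):
        if j != i:
            value *= (xi - node) / (nodes[i] - node)
    return value

def interpolate(nodes, values, xi):
    """Evaluate the expansion sum_i f_i * L_i(xi) for nodal values f_i."""
    return sum(f_i * lagrange_basis(nodes, i, xi) for i, f_i in enumerate(values))

# Illustrative collocation points (roots of He_3(x) = x^3 - 3x).
nodes = [-math.sqrt(3.0), 0.0, math.sqrt(3.0)]

# Kronecker delta property: L_i(node_j) is 1 if i == j and 0 otherwise.
for i in range(len(nodes)):
    print([lagrange_basis(nodes, i, node) for node in nodes])

# Three nodes reproduce any quadratic response exactly.
values = [x * x + 1.0 for x in nodes]
print(interpolate(nodes, values, 0.5))
```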

The statistics (mean and variance) are obtained by a Galerkin projection on the polynomial basis, with the collocation points calculated as the points of the Gaussian quadrature (i.e., for each uncertain parameter, the N<sub>p</sub> collocation points correspond to the N<sub>p</sub> roots of the polynomial basis) (Webster et al., 1996; Loeven and Bijl, 2008). When multiple uncertain parameters are considered, the collocation points are obtained from tensor products of one-dimensional points and a total of (n + 1)<sup>p</sup> runs (rather than n + 1) are needed, where n is the order of the approximation and p the number of uncertain parameters. The mean and variance in the case of two stochastic variables are approximated as:

$$\mu = \sum_{i=1}^{N_p} \sum_{j=1}^{N_p} f_{ij}(\mathbf{x}) \cdot k_i \cdot k_j \tag{2}$$

$$\sigma^2 = \sum_{i=1}^{N_p} \sum_{j=1}^{N_p} (f_{ij}(\mathbf{x}) - \mu)^2 \cdot k_i \cdot k_j, \tag{3}$$

where k<sub>i</sub> and k<sub>j</sub> are the weights of the corresponding collocation points ω<sub>i</sub> and ω<sub>j</sub> that compose the random event ω, and f<sub>ij</sub>(**x**) is the solution of f(**x**, ω) evaluated at ω<sub>i</sub> and ω<sub>j</sub>. Here, we considered a second-order polynomial for the Gaussian quadrature and, therefore, three collocation points (n + 1) were required for each random variable. Two sources of uncertainty were defined, and thus N<sub>p</sub><sup>2</sup> = 9 model runs were computed. The same uncertainty characterization was employed using MC sampling to create a set of 250 samples and evaluate the accuracy obtained with PCM.
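The weighted sums for the mean and variance with two uncertain parameters can be sketched as below. This is a toy illustration under stated assumptions, not the authors' code: both inputs are assumed to be independent and normally distributed, so a three-point Gauss-Hermite rule is used, and `model` is a hypothetical cheap function standing in for one simulation.

```python
import math
from itertools import product

# Three-point Gauss-Hermite rule for a standard normal variable
# (roots of He_3(x) = x^3 - 3x); the weights sum to 1.
NODES = (-math.sqrt(3.0), 0.0, math.sqrt(3.0))
WEIGHTS = (1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0)

def pcm_mean_variance(model, means, stds):
    """Approximate the mean and variance of model(a, b) for two normal inputs."""
    rule = list(zip(NODES, WEIGHTS))
    runs = []
    # Tensor product of the 1-D rule: (n + 1)^p = 3^2 = 9 model runs.
    for (xi_i, k_i), (xi_j, k_j) in product(rule, repeat=2):
        a = means[0] + stds[0] * xi_i  # input is a linear transformation of xi
        b = means[1] + stds[1] * xi_j
        runs.append((model(a, b), k_i * k_j))
    mu = sum(f_ij * w for f_ij, w in runs)
    var = sum((f_ij - mu) ** 2 * w for f_ij, w in runs)
    return mu, var, len(runs)

def model(a, b):
    """Hypothetical response; linear, so the exact statistics are known."""
    return a + 2.0 * b

mu, var, n_runs = pcm_mean_variance(model, means=(1.0, 2.0), stds=(0.5, 0.25))
print(mu, var, n_runs)
```

For this linear toy response the exact mean is 5 and the exact variance is 0.5, which the nine-run rule reproduces up to rounding. For an expensive non-linear model, the rule is only an approximation whose quality depends on the polynomial order.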

#### 3. RESULTS

#### 3.1. Virtual Population Study

Preliminary results obtained from a population of 300 virtual patients showed a high impact of the bone resistivity variability, which masked the impact of the variability and uncertainty of the other parameters on the patient's neural response. Very low global performance values were related to the activation of (1) all ANF due to the vast spread of excitation or (2) very few ANF due to a highly focused potential distribution. No relevant effects were found regarding the remaining uncertainty and variability sources. These widely spread CI outcomes are likely due to the wide range of variability in bone resistivity (Kalkman et al., 2015; Malherbe et al., 2016).

We thus created a second population of 1,000 virtual patients, divided into three groups, each of which treated the bone resistivity as a fixed input parameter. The first group (Group 1) comprised 500 virtual patients with a bone resistivity equal to the mean value of 65.0 Ω · m (section 2.2). The other two groups, with 250 virtual patients each, had a resistivity of −σ (Group 2) and +σ (Group 3) from the mean, with σ = 4.5 Ω · m according to previously reported values (Mens et al., 1999; Rattay et al., 2001a; Frijns et al., 2009; Kalkman et al., 2014; Malherbe et al., 2015). We also used this mean and standard deviation to characterize bone resistivity uncertainty in the patient-specific study (section 3.2).

The population of virtual patients had an average length of 25.3 ± 1.1 mm and the final insertion depths were 26.7 ± 0.8, 26.9 ± 0.8, and 26.9 ± 0.9 mm for Groups 1, 2, and 3, respectively. **Figure 5** shows the CI outcomes for the three virtual populations of patients, with a global performance score (specificity) of 0.75 ± 0.06 (Group 1), 0.71 ± 0.05 (Group 2), and 0.67 ± 0.06 (Group 3).

**Figure 6** represents the global performance according to the shape variability of all virtual patients. The graphics show a clear effect of the bone resistivity on the outcome. In general, lower bone resistivity values led to better global performance measures. Group 3 presented no clear variation related to the morphology. Although the impact of each mode of variation individually was not evident, global performance slightly increased as the second mode took values above the mean. Better results were obtained when the value of the first mode was above 1 standard deviation from the mean, and the third mode, below the mean.

The relation between the global performance and the cochlear length was almost linear: the longer the cochlea, the higher the performance (see **Figure 7**). The effect of the bone resistivity can also be seen; results improved for the longest cochleae with low resistivity values (**Figure 7A**). Although the insertion depth did not seem to have as large an impact as the bone resistivity, some groups with similar behavior were identified (see **Figure 7B**). Short cochleae with short insertion depth showed the worst results (**Figure 7C**). Although the deepest insertions did not provide the best results in all anatomies, the best outcomes (with a global performance score above 0.8) were obtained for insertions deeper than 26 mm in cochleae with a length of the spiral lamina larger than 26.5 mm.

**Figure 8** presents the neural response of the three sets of populations of virtual patients with regard to local effects. Apical electrodes performed worse than basal ones, in terms of higher non-focal and non-selective activation, with higher spread of excitation and cross-turn stimulation (**Figure 8A**). Medial electrodes showed cross-turn scores similar to those of apical ones, but presented better local performance scores (more focused ANF recruitment). In total, 34% of all electrodes presented a local performance score higher than 80%, while less than 9% of all cases obtained a score below 50% and none below 45%. Cross-turn stimulation scores fell within [70, 95%] in 80% of the cases. Some outliers (2%) presented the lowest scores, below 60%, and 13% obtained scores above 95%.

On average, Group 3 obtained the worst performance values due to the higher non-desired ANF excitation and broader spread. Group 2 presented better results in terms of cross-turn stimulation and slightly better in local performance than Group 1. However, for the apical electrodes, Group 2 presented worse local performance score due to the high non-focused activation and missed target frequencies. Group 2 showed slightly narrowed bandwidth, but less non-focused activation, obtaining an overall better performance.

The impact of the insertion depth was also evaluated in terms of local effects. Insertions deeper than 27 mm obtained the best results for apical electrodes (highest values above 90% in E1–E4), although they did not provide such good outcomes in the basal part, missing some target frequencies due to the misaligned electrodes. Group 1 did not show a relevant relationship between the insertion and the local performance. Likewise, cross-turn stimulation was not clearly influenced by the insertion depth, although some of the better results corresponded to insertions between 27 and 28 mm. Some outliers (lowest scores) were found to correspond to the smallest cochleae (below 24 mm), where the short distances between turns produced a large amount of evoked ANF at non-desired locations. Results of local effects according to the length of the spiral lamina provided similar information, as shown in **Figure 7**; the smaller the cochlea, the worse the results.

Regarding the computational cost, each patient took 5.1 ± 1.2 h. However, using the HTC environment allowed parallelizing the simulations so that the whole population took <1,010 h (i.e., effective average of 1 h per patient).

#### 3.2. Patient-Specific Case Study

**Figures 9A,B** show the global behavior of the patient's neural response using the MC approach. In line with the results presented above, as the bone resistivity decreases, the spread of excitation is narrowed. This causes more focused activation and avoids non-desired stimulation (high specificity values). However, if the spread is too narrow, it may not be able to activate the desired bandwidth (low sensitivity values; see **Figure 9C**). Bone electrical resistivity has an effect on the neural response, while no impact of the insertion depth is observed.

CI global specificity and sensitivity measures were 0.72 ± 0.36 and 0.74 ± 0.35 for the PCM approach, and 0.72 ± 0.04 and 0.75 ± 0.08 for MC. Similarly to the population study, **Tables 1**, **2** show the worst results on the basal and medial electrodes, in terms of local performance and cross-turn stimulation. Both scores showed patterns similar to the ones found in the population study (**Figures 8A,B**). Despite the higher standard deviation obtained when using PCM, mean values did not differ by more than 3%, providing an acceptable approximation of the mean behavior. Although the MC approach showed less variance, its computational time reached 1,100 h, while PCM took 96% less (36 h). The use of higher-order polynomials was also evaluated. Results from second- to sixth-order polynomials (from 9 to 49 samples, respectively) obtained specificity values that differed by <1%. Mean values obtained were 0.723, 0.724, 0.724, 0.725, 0.719, and 0.720 for polynomials of increasing order up to the sixth, while the mean value using MC was 0.727. Local score values differed depending on their position on the array; however, overall differences were <5.5%, with a minimum of 0.01%. The required computational time increased steeply with the number of collocation points: from 15 to 218 h for first and sixth order, respectively.

Results showed that mean T-levels were approximated with values of 240 ± 59 µA and 251 ± 32 µA computed by PCM and MC, respectively. Both approaches presented similar trends regarding each electrode's T-level: lower thresholds at the apex (E1–E4) and higher at the first turn (E8–E11). Threshold mean values differed by at most 55 µA in the worst case (E4), while the best approximation was <5 µA (E1, E2, E3, E12). Likewise, C-levels presented lower values at the apex of the cochlea, while the highest values were obtained at the medial part.

Mean C-level was 355 ± 71 µA for the PCM approach, in concordance with the behavior observed in **Figure 8B**, where in order to avoid cross-turn stimulation at the apex and medial part, lower amplitudes are required. This post-implantation level could not be computed for the MC approach, due to the unfeasible required computational time. Post-implantation stimulus levels—mean values—for a patient-specific case are shown in **Table 3**. Mean values for the C-level stimulus were evaluated in an average patient (mean cochlear shape, insertion and bone resistivity), obtaining global performance measures of 0.80 and 0.72 for sensitivity and specificity, respectively.

FIGURE 7 | Relation between the global performance and the length of the cochlea (A,B) in all the virtual population and (C) in each group of patients.

## 4. DISCUSSION AND CONCLUSIONS

This work aimed to assess parameter variability and uncertainty using a computational framework for the modeling and evaluation of CI. To this end, we employed uncertainty quantification methods and the developed automatic framework to functionally evaluate the implant in terms of neural excitation. We used an HTC environment to reduce the computational effort of the uncertainty study while evaluating the range of variability in the population.

TABLE 1 | Local performance score.


TABLE 2 | Cross-turn stimulation score.


Initial results showed that 53% of the virtual population obtained global performance measures in terms of specificity within the range [0.70, 0.80], and almost 10% above 0.80. This performance was related to a low rate of false positives, highly desirable in order to avoid confusing pitch for the patients.

Specificity values below 0.5 were related to wider spread of excitation and ANF recruitment due to an increase of bone resistivity, which combined with small cochlear dimensions, caused a considerable amount of non-selective stimulation. This is in line with the findings presented by Tang et al. (2012) and Malherbe et al. (2015). Indeed, results showed the large impact of the bone resistivity over the neural response: as it increases, CI outcomes worsen (i.e., lower performance measure, higher cross-turn stimulation and broader excited pitch). This behavior can be explained by the tendency of the currents to leak from the cochlear structure when the surrounding bone presents a low resistivity value. In those cases, a reduction of the current density and a narrower spread of excitation are observed (Malherbe et al., 2015). As the current leaks, higher post-implantation stimulus levels are required to reach the desired excited pitch (Frijns et al., 2009). In agreement with the findings reported by Tang et al. (2012) and Malherbe et al. (2015), our results showed that consequently, for high resistivity values (absence of bone conduction) lower stimulus intensity should be employed.

Morphology of the cochlea has also shown an impact on the neural response, as suggested by van der Marel et al. (2014). The first modes of variation of the SSM can be roughly related to the morphology of the inner ear: the variation in general size, the dimension of the spiral radius and the rotation of the cochlea over the rest of the inner ear (the vestibular canals), for the first, second and third mode, respectively (see **Figure 2**). The second mode has the largest influence on the CI outcomes. When it increases, the electrodes are further from the ANF (basal part distances from the modiolus), obtaining a more selective ANF recruitment and better performance measures (**Figure 6**).

The surgical length of insertion has always been a controversial aspect of the CI procedure. In clinical practice, a high variability of insertion depth has been reported (Gstoettner et al., 2004; Rebscher et al., 2008; Franke-Trieger et al., 2014; Kalkman et al., 2014; van der Marel et al., 2014), which varies according to the implant design, target intracochlear position (closer to the modiolus or the lateral wall) and target frequencies (shorter EAs focus on high frequencies, while longer ones cover the whole frequency range). Despite the wide range of reported results, some authors found no significant influence on the patient's speech perception (Van Der Marel et al., 2015), while others highlighted the insertion depth as a key factor, since it directly affects the alignment between frequency and cochlear location (Dorman et al., 1997; Finley et al., 2008; Mangado et al., 2017b). We found that the impact of the insertion depth was subtle, and mainly observed at the base of the cochlea. This was caused by the narrow spread of excitation, which missed some target frequencies.

TABLE 3 | Post-implantation stimulus levels for a patient-specific case using PCM.

Although the computational quantification of the implant performance has not been attempted before, local effects have been previously reported. As suggested by Frijns et al. (2001) and Briaire and Frijns (2006), we observed that electrode contacts in the last cochlear turn presented cross-turn stimulation at the base of the cochlea, caused by the tightly coiled geometry of the cochlea at the apex. In addition, medial and basal electrodes showed cross-turn stimulation, identified to be related to the excitation of lower pitches. This could be explained by the use of a high impulse intensity, which, combined with the low bone conduction, generates wider current fields that excite a high amount of non-selective ANF. Indeed, we observed that a wider excitation area tends to appear at the apex, as indicated by van der Beek et al. (2012) and Biesheuvel et al. (2016), which limits the spatial selectivity at the apex (Briaire and Frijns, 2006). Results agreed with reported excited pitches for similar computational conditions: lateral electrodes produced similar excitation pitch for bandwidths of 4 mm, i.e., E7 and E10 generated a pitch of 800–1,500 Hz and 2,100–4,400 Hz, respectively, in concordance with 900–1,700 Hz and 2,000–4,000 Hz reported by Kalkman et al. (2014). These variations could be explained by a slight difference in the angular insertion depth. However, frequency bandwidths wider than 3 mm should be avoided since they imply a change of one octave in frequency, causing a highly confusing pitch and therefore a large impact on CI outcomes (Mistrík and Jolly, 2016). To avoid this, in clinical practice optimal stimulus amplitudes are sought to reach the desired pitch at each electrode location.

Results showed that lower amplitudes were required at apical electrodes, in line with Brill et al. (2009), Malherbe et al. (2013), Kalkman et al. (2014), and van der Beek et al. (2016). Predicted levels tended to decrease on the first electrodes, while increasing toward the base (Malherbe et al., 2013; van der Beek et al., 2016). Obtained T-levels can be compared with experimental measurements (eCAP thresholds): from 190 µA at the apex to 460 µA at the base for a Med-EL Flex28 array (Brill et al., 2009). These findings are also in agreement with previous computational studies, which found T-levels from 150 to 400 µA (Kalkman et al., 2014). However, they also reported relevant differences on these levels according to the geometrical description of the ANF, defined either as radial or oblique trajectories (Kalkman et al., 2014, 2015). The latter provided a better representation of the ANF by relating more accurately the peripheral process of each ANF with the position of its cell body in the spiral ganglion (Stakhovskaya et al., 2007; Kalkman et al., 2015). We believe that the improvement of such trajectories could explain some discrepancies of our results with the clinical data. In addition, previous studies defined the T-level and C-level as the stimulus required to evoke a bandwidth of 1 and 4 mm along the basilar membrane, respectively (Briaire and Frijns, 2006; Kalkman et al., 2014), based on experimental findings reported by Snel-Bongers et al. (2013). Although our proposed performance measures penalized the occurrence of cross-turn stimulation, including this information into our description could provide more reliable post-implantation levels.

The developed framework has a high computational cost, specifically when a large set of samples needs to be evaluated. The parallelization of the framework to conduct the population study using an HTC environment allowed processing all data more efficiently (4.9 times faster). Still, there is room for improvement. While providing a detailed description of the neural behavior in CI models, the implemented ANF model implied a high computational effort (Hanekom and Hanekom, 2016). Less expensive neural models, such as analytical or single-compartment models, could provide an alternative to reduce the required simulation time. Although these models have also been used for the generation of the action potential (Brette, 2015), they are less realistic and could imply some limitations on the CI assessment in patient-specific studies, such as in cases of ANF degeneration (Rattay et al., 2001a,b).

As for the uncertainty propagation approach, other sampling techniques could be used instead of MC to reduce the number of runs needed and, therefore, the overall required computational time (Berthaume et al., 2012). The appropriate number of samples to evaluate depends on each case study (Sarrazin et al., 2017), which makes it difficult to ensure the desired accuracy without conducting a prior dimensional analysis. The computational effort of the implemented framework hampers such analysis. However, our results are in line with previous findings; we thus consider the set of 250 samples evaluated an acceptable approximation.

Whereas PCM provided a trade-off between computational time and precision in the patient-specific case (compared to the mean obtained by the MC sampling approach), the population study involved more uncertainty sources, which implied an exponential increase in the computational time. For this reason, PCM is recommended only for studies with few uncertain parameters, since otherwise the benefit of using a considerably lower number of runs than MC would be reduced. Results using PCM had a larger standard deviation. Polynomials of order higher than 6 should be used to gain accuracy. However, the required computational time increases rapidly with the polynomial order, and therefore the advantage of using PCM to obtain the mean response of the system would be drastically reduced.

Additionally, other approaches for the uncertainty analysis can be employed, for instance, intrusive methods, which reformulate and solve the stochastic version of the deterministic FE model (Mangado et al., 2016b). They have been implemented successfully in electrical simulations considering as sources of uncertainty the tissue electrical properties (Geneser et al., 2008) or the behavior of the ionic channels that control cardiac contractions (Du and Du, 2016). Despite their limitations when considering geometrical aspects, they may provide faster solutions to assess patient-specific cases.

Although implant performance in CI has rarely been quantified computationally due to the several physiological effects involved, results suggest that the proposed framework provides reliable information regarding the behavior of the implanted cochlea, in concordance with previous computational and experimental findings. Further improvements include the use of trains of pulses as the electrical stimulus, thereby inducing a temporal neural response, as well as the evaluation of different stimulation protocols in terms of current focusing and selective neural recruitment. This study has analyzed the influence of EA insertion and bone resistivity uncertainty according to the variation of the cochlear morphology among the population. This information can help surgeons to select the surgical parameters to achieve the optimal outcome of CI (Finley et al., 2008; van der Marel et al., 2014). Moreover, this work may provide a powerful computational tool for implant design optimization purposes, as well as for the implant programming to establish the most suitable stimulation setting. Overcoming the limitations mentioned above would lead to a more precise and highly accurate computational tool for use in clinical practice.

### AUTHOR CONTRIBUTIONS

NM, JP-P, MCe, and MG: contributed conception and design of the study; NM, JP-P, and MCo: developed the uncertainty quantification study and its implementation in the HTC environment; NM: performed the computational studies, processed the data and obtained results; NM, MCe, PM, and MG: interpreted the results and discussed the resulting conclusions; NM: wrote the first draft of the manuscript and all authors contributed to the manuscript revision, read and approved the submitted version.

#### ACKNOWLEDGMENTS

This work was partly supported by the Spanish Ministry of Economy and Competitiveness under the Maria de Maeztu Units of Excellence Program (MDM-2015-0502), by the AGAUR grant 2016-PROD-00047, the European Union Seventh Framework Program (FP7/2007-2013), Grant agreement 304857, HEAR-EU project and the QUAES Foundation Chair for Computational Technologies for Healthcare.



**Conflict of Interest Statement:** Author PM was employed by company MED-EL, Austria.

The other authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Mangado, Pons-Prats, Coma, Mistrík, Piella, Ceresa and González Ballester. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Using Non-linear Homogenization to Improve the Performance of Macroscopic Damage Models of Trabecular Bone

#### Francesc Levrero-Florencio<sup>1,2</sup> and Pankaj Pankaj<sup>2</sup>\*

*<sup>1</sup> Computational Cardiovascular Science, Department of Computer Science, University of Oxford, Oxford, United Kingdom, <sup>2</sup> Institute for Bioengineering, School of Engineering, The University of Edinburgh, Edinburgh, United Kingdom*

Realistic macro-level finite element simulations of the mechanical behavior of trabecular bone, a cellular anisotropic material, require a suitable constitutive model; a model that incorporates the mechanical response of bone for complex loading scenarios and includes post-elastic phenomena, such as plasticity (permanent deformations) and damage (permanent stiffness reduction), which bone is likely to experience. Some such models have been developed by conducting homogenization-based multiscale finite element simulations on bone micro-structure. While homogenization has been fairly successful in the elastic regime and, to some extent, in modeling the macroscopic plastic response, it has remained a challenge with respect to modeling damage. This study uses a homogenization scheme to upscale the damage behavior from the tissue level (microscale) to the organ level (macroscale) and assesses the suitability of different damage constitutive laws. Ten cubic specimens were each subjected to 21 strain-controlled load cases for a small range of macroscopic post-elastic strains. Isotropic and anisotropic criteria were considered, density and fabric relationships were used in the formulation of the damage law, and a combined isotropic/anisotropic law with tension/compression asymmetry was formulated, based on the homogenized results, as a possible alternative to the currently used single scalar damage criterion. This computational study enhances the current knowledge on the macroscopic damage behavior of trabecular bone. By developing relationships of damage progression with bone's micro-architectural indices (density and fabric) the study also provides an aid for the creation of more precise macroscale continuum models, which are likely to improve clinical predictions.

Keywords: trabecular bone, multiscale modeling, parameter estimation, continuum damage, finite element method, homogenization, biomechanics, high performance computing

#### Edited by:

*Alfons Hoekstra, University of Amsterdam, Netherlands*

#### Reviewed by:

*Bradley John Roth, Oakland University, United States Kumari Sonal Choudhary, University of California, San Diego, United States*

> \*Correspondence: *Pankaj Pankaj pankaj@ed.ac.uk*

#### Specialty section:

*This article was submitted to Computational Physiology and Medicine, a section of the journal Frontiers in Physiology*

Received: *19 January 2018* Accepted: *27 April 2018* Published: *17 May 2018*

#### Citation:

*Levrero-Florencio F and Pankaj P (2018) Using Non-linear Homogenization to Improve the Performance of Macroscopic Damage Models of Trabecular Bone. Front. Physiol. 9:545. doi: 10.3389/fphys.2018.00545*

### 1. INTRODUCTION

The growth of the older population around the world in the last few decades has caused an increase in problems associated with deteriorated mechanical properties of bone; osteoporosis is the clearest example of one such condition.

Computer models have been extensively employed to evaluate the mechanical response of bone and bone-implant systems under a range of loading scenarios (Pankaj, 2013). Previous studies have assumed bone to be homogeneous (Completo et al., 2009; Conlisk et al., 2015), i.e., its properties do not vary from point to point in space, or heterogeneous (Helgason et al., 2008; Schileo et al., 2008; Tassani et al., 2011), i.e., its properties vary with location (these are typically assigned on the basis of greyscale values observed in micro-computed tomography scans). However, in the large majority of studies, bone is assumed to be linear elastic and isotropic, i.e., its properties at a certain point in space are the same in all directions. It is well recognized that the cellular microstructure of trabecular bone renders it anisotropic (Turner et al., 1990; Odgaard et al., 1997), i.e., properties at a point in space vary in different directions. Finite element (FE) analysis of the bone microstructure, in which the solid and pore phases are explicitly modeled, has been used to evaluate the homogenized anisotropic linear elastic properties of bone in the past two decades. Morphology-elasticity relationships that use bone density and fabric have also been established, with fabric typically measured through the mean intercept length (MIL) fabric tensor (Harrigan and Mann, 1984). These relationships establish links between density, fabric, and the components of the stiffness tensor (Zysset, 2003). More recently, some studies have attempted the evaluation of homogenized yield behavior (Cowin, 1986; Wolfram et al., 2012; Levrero-Florencio et al., 2016).

Homogenized FE models of the whole bone can include microstructural information at the continuum (macroscopic) level and can thus improve the assessment of the behavior of bone and bone-implant systems in clinical scenarios. Homogenization relies on averaging the strains and stresses over a representative volume element (RVE) of the considered material; it is the most widely used multiscale approach to study the macroscopic behavior of trabecular bone. Homogenization of an RVE in the post-elastic regime requires examining its response to a wide range of loading scenarios (Bayraktar et al., 2004; Levrero-Florencio et al., 2016, 2017a). It is important to note that, in experiments, it is not possible to test multiple load cases after a certain load threshold has been surpassed, because permanent deformations and/or damage caused during the first loading case will affect the behavior in subsequent loading cases. Therefore, computational means provide an attractive alternative. Nonetheless, the need for fine resolution to recreate a biofidelic geometry of the bone microstructure leads to micro-FE (µFE) systems of several tens of millions of degrees of freedom. The need to undertake multiple load cases, each in the non-linear regime, requires the use of high performance computing (HPC) platforms and software that can take advantage of them.
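The averaging step on which homogenization relies can be sketched as a volume-weighted mean of element quantities. The sketch below is only illustrative and is not the authors' FE code: the function name and the data layout (six Voigt components per solid element) are assumptions, and dividing by the total RVE volume reflects the assumption that the pore phase carries no stress.

```python
def volume_average(element_values, element_volumes, rve_volume):
    """Volume-average element quantities (Voigt notation) over an RVE.

    element_values  : list of 6-component lists, one per solid element
    element_volumes : list of element volumes
    rve_volume      : total RVE volume, including pores (assumed stress-free)
    """
    n_comp = len(element_values[0])
    totals = [0.0] * n_comp
    for value, volume in zip(element_values, element_volumes):
        for c in range(n_comp):
            totals[c] += value[c] * volume
    return [t / rve_volume for t in totals]

# Two solid elements of unit volume inside an RVE of volume 4
# (i.e., a bone volume fraction of 0.5); stresses in arbitrary units.
stresses = [[2.0, 0.0, 0.0, 0.0, 0.0, 0.0],
            [4.0, 0.0, 0.0, 0.0, 0.0, 0.0]]
macro_stress = volume_average(stresses, [1.0, 1.0], rve_volume=4.0)
print(macro_stress)
```

In a production µFE code this sum would run over tens of millions of elements distributed across processes, followed by a global reduction.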

Although the damage behavior of bone has been considered in a few studies (Keaveny et al., 1994a; Garcia et al., 2009; Shi et al., 2010; Schwiedrzik and Zysset, 2013; Lambers et al., 2014), there are apparent limitations to most of the employed mathematical formulations. For example, most macroscopic damage models of trabecular bone employ an isotropic damage evolution, i.e., a "basic," or single scalar isotropic formulation, as mentioned in Carol et al. (2002), and do not take into account that the development of damage may be related to the load case being considered (Levrero-Florencio et al., 2017a). The authors have previously conducted a series of uniaxial simulations which show that damage develops differently in tension−compression, and in normal−shear (Levrero-Florencio et al., 2017a).

This study has a number of aims. Firstly, it extends the study performed in Levrero-Florencio et al. (2017a) by adding 12 biaxial macroscopic cases in the normal strain space. The second aim is to examine the suitability of certain damage mechanisms by fitting different damage laws to the damaged macroscopic stiffness tensors. The study then investigates the possible relationships between the macroscopic damage behavior of trabecular bone and its density and fabric description, by including these micro-architectural indices as additional data in the fitting procedure. The data for these formulations are obtained computationally through homogenization-based multiscale simulations run on an HPC platform with an in-house developed parallel implicit FE code.

### 2. NOTATION

The mathematical operators defined in this section largely follow the notation used in Wu and Li (2008), Schwiedrzik et al. (2013), and Levrero-Florencio et al. (2016). Compact tensor notation is used throughout this study, with indicial notation within brackets being used in this section to clarify certain tensorial operations, or in specific sections where further clarification might be required.

As a general rule, scalars are denoted with Greek or Latin italic characters (e.g., λ or a, respectively); vectors, or first-order tensors, are denoted by Latin bold lower-case characters (e.g., **a**); second-order tensors are denoted with Greek or Latin bold upper-case characters (e.g., **σ** or **A**, respectively); and fourth-order tensors are denoted by Latin double-barred upper-case characters (e.g., 𝔸).

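Tensorial operations are denoted as follows. Single contraction of tensorial entities may appear as $\mathbf{a} \cdot \mathbf{b}$ ($a\_i b\_i$), $\mathbf{a} \cdot \mathbf{B}$ ($a\_i B\_{ij}$), $\mathbf{A}\mathbf{b}$ ($A\_{ij} b\_j$), or $\mathbf{A}\mathbf{B}$ ($A\_{ik} B\_{kj}$); note that the scalar product symbol (·) only appears when the first entity to be contracted is a first-order tensor. Double contraction of tensorial entities may appear as $\mathbf{A} : \mathbf{B}$ ($A\_{ij} B\_{ij}$), $\mathbb{A} : \mathbf{B}$ ($A\_{ijkl} B\_{kl}$), $\mathbf{A} : \mathbb{B}$ ($A\_{ij} B\_{ijkl}$), or $\mathbb{A} : \mathbb{B}$ ($A\_{ijmn} B\_{mnkl}$). Different tensor products are defined, which include $\mathbf{a} \otimes \mathbf{b}$ ($a\_i b\_j$), $\mathbf{A} \otimes \mathbf{B}$ ($A\_{ij} B\_{kl}$), $\mathbf{A} \overline{\otimes} \mathbf{B}$ ($A\_{ik} B\_{jl}$), $\mathbf{A} \underline{\otimes} \mathbf{B}$ ($A\_{il} B\_{jk}$), and $\mathbf{A} \overline{\underline{\otimes}} \mathbf{B} = \frac{1}{2}(\mathbf{A} \overline{\otimes} \mathbf{B} + \mathbf{A} \underline{\otimes} \mathbf{B})$ ($\frac{1}{2}[A\_{ik} B\_{jl} + A\_{il} B\_{jk}]$).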

Curly brackets {·} are used to represent vector projections of second-order tensors, such as

$$\begin{Bmatrix} \mathbf{A} \end{Bmatrix} = \begin{Bmatrix} A\_{11} & A\_{22} & A\_{33} & A\_{12} & A\_{13} & A\_{23} \end{Bmatrix}^{\mathrm{T}}.\tag{1}$$

Square brackets [·] are used, in conjunction with parentheses (·), to indicate priority in the order of mathematical operations; an important exception occurs when square brackets are used to represent the matrix projection of a fourth-order tensor, such as

$$[\mathbb{A}] = \begin{bmatrix} A\_{1111} & A\_{1122} & A\_{1133} & A\_{1112} & A\_{1113} & A\_{1123} \\ A\_{2211} & A\_{2222} & A\_{2233} & A\_{2212} & A\_{2213} & A\_{2223} \\ A\_{3311} & A\_{3322} & A\_{3333} & A\_{3312} & A\_{3313} & A\_{3323} \\ A\_{1211} & A\_{1222} & A\_{1233} & A\_{1212} & A\_{1213} & A\_{1223} \\ A\_{1311} & A\_{1322} & A\_{1333} & A\_{1312} & A\_{1313} & A\_{1323} \\ A\_{2311} & A\_{2322} & A\_{2333} & A\_{2312} & A\_{2313} & A\_{2323} \end{bmatrix} . \tag{2}$$

Double vertical bars ‖(·)‖ are used to represent the Frobenius norm of the matrix (·), such as the Frobenius norm of the following 3 × 3 symmetric matrix,

$$\|[\mathbf{A}]\| = \sqrt{A\_{11}^2 + A\_{22}^2 + A\_{33}^2 + 2A\_{12}^2 + 2A\_{13}^2 + 2A\_{23}^2}.\tag{3}$$
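As a brief illustration of Equations (1) and (3), the vector projection and the Frobenius norm of a symmetric second-order tensor may be sketched as follows. The function names and the numerical values are assumptions made for this example only; they are not taken from the code used in the study.

```python
import math


def vector_projection(a):
    """Return {A} = [A11, A22, A33, A12, A13, A23] for a symmetric 3x3 matrix (Eq. 1)."""
    return [a[0][0], a[1][1], a[2][2], a[0][1], a[0][2], a[1][2]]


def frobenius_norm_sym(a):
    """Frobenius norm of a symmetric 3x3 matrix from its six independent entries (Eq. 3)."""
    a11, a22, a33, a12, a13, a23 = vector_projection(a)
    # Off-diagonal entries appear twice in the full matrix, hence the factor 2.
    return math.sqrt(a11**2 + a22**2 + a33**2 + 2.0 * (a12**2 + a13**2 + a23**2))


# Illustrative symmetric matrix (hypothetical values).
A = [[1.0, 2.0, 0.0],
     [2.0, 3.0, 4.0],
     [0.0, 4.0, 5.0]]
norm_A = frobenius_norm_sym(A)
```

The factor of 2 on the off-diagonal terms is what makes the six-entry expression agree with the sum of squares over all nine entries.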

### 3. MATERIALS AND METHODS

#### 3.1. Computational Methods

This section follows the "Materials and Methods" section in Levrero-Florencio et al. (2017a). The authors used µCT images of trabecular bone samples to create detailed FE models, which ranged from 10 to 30 million elements, representing the solid phase of bone for cubic trabecular bone samples (which include both solid phase and pores) of size 5 mm. In the study conducted by Levrero-Florencio et al. (2017a), plasticity and damage were considered for the solid phase post-elastic properties and nine uniaxial strain cases were investigated (load cases 1 to 9 of **Table 1**) representing: three tensile cases (+ε<sub>11</sub>, +ε<sub>22</sub>, and +ε<sub>33</sub>), three compressive cases (−ε<sub>11</sub>, −ε<sub>22</sub>, and −ε<sub>33</sub>), and three shear cases (ε<sub>12</sub>, ε<sub>13</sub>, and ε<sub>23</sub>). The macroscopic damage behavior was studied by using an appropriate homogenization-based multiscale technique, which is explained later.

Trabecular bone is an anisotropic material; its anisotropy may be quantified with a fabric tensor, which indicates how the material is distributed directionally. The Mean Intercept Length (MIL) fabric tensor is used in this study because it is widely used in trabecular bone studies, and it performs slightly better than other fabric measures (Kabel et al., 1999; Zysset, 2003). The magnitude of an eigenvalue of the MIL fabric tensor denotes the proportion of material which is aligned in the direction expressed by the corresponding eigenvector. The fabric tensors are normalized so that their trace is equal to 3 (Zysset, 2003).

In this study, 10 out of the 12 samples employed in Levrero-Florencio et al. (2017a) were subjected to 12 additional biaxial strain cases in the normal strain space (**Table 1**, cases 10–21). Kinematic uniform boundary conditions (i.e., conditions in which displacements, or macroscopic strains, are controlled) were used for all analyses; these are known for providing an upper bound for the macroscopic stiffness tensor and macroscopic yield surface of trabecular bone (Wang et al., 2009; Panyasantisuk et al., 2015). An example of how boundary conditions are implemented can be seen in **Figure 1**, which corresponds to load case 4 in **Table 1**. The morphological indices of these samples are shown in **Table 2**. BV/TV stands for bone volume over total volume and it is a surrogate for density, DOA stands for degree of anisotropy and it is the ratio of the highest to the lowest eigenvalues of the MIL fabric tensor, and SMI stands for structure model index and it ranges from rod- (SMI = 3) to plate-shaped (SMI = 0) microstructure.

image of one of the used trabecular bone specimens; this particular sample led to a FE mesh of ∼21M degrees of freedom.

TABLE 2 | Morphological indices of the 10 used specimens.

The 10 samples were aligned with the MIL fabric tensor eigenvectors, with the eigenvalues sorted in descending order (m<sub>1</sub> > m<sub>2</sub> > m<sub>3</sub>). The samples were then meshed with trilinear hexahedra and subjected to the aforementioned strain-controlled load cases; the largest mesh consisted of ∼27M degrees of freedom, leading to square sparse stiffness matrices of size up to 27M×27M. The considered constitutive law at the tissue level was isotropic with coupled plasticity and damage (the former captures irrecoverable deformations while the latter accounts for stiffness reduction), meaning that damage and plasticity interact with each other and evolve at the same time; the considered yield surface was Drucker-Prager (Tai et al., 2006; Carnelli et al., 2010; Panyasantisuk et al., 2015) with yield values corresponding to 0.41% strain in tension and 0.83% strain in compression (Bayraktar and Keaveny, 2004). Linear isotropic hardening corresponding to 5% of the undamaged elastic slope (Wolfram et al., 2012; Sanyal et al., 2015) was used. At the tissue level, damage evolution was assumed to be isotropic and it was obtained from Schwiedrzik and Zysset (2013, 2015). The maximum damage was capped at 0.9 (90% isotropic stiffness reduction) to avoid numerical difficulties related to the loss of positive-definiteness of the stiffness matrix; this was performed by using

$$D(\varepsilon^{\mathrm{p}}) = D\_{\max} \left( 1 - e^{-k\_{\mathrm{p}} \varepsilon^{\mathrm{p}}} \right), \tag{4}$$

where ε<sup>p</sup> = ‖**ε**<sup>p</sup>‖ is the accumulated plastic strain, D<sub>max</sub> is the maximum damage, and k<sub>p</sub> is a parameter obtained from Schwiedrzik and Zysset (2015).
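The saturating form of Equation (4) can be illustrated with a short sketch. The function name is hypothetical, and the value of k<sub>p</sub> used below is an arbitrary placeholder chosen for illustration; it is not the value reported by Schwiedrzik and Zysset (2015).

```python
import math


def tissue_damage(eps_p, k_p, d_max=0.9):
    """Tissue-level damage as a function of accumulated plastic strain (Eq. 4).

    eps_p : accumulated plastic strain (non-negative)
    k_p   : rate parameter (placeholder value in the example below)
    d_max : maximum damage, capped at 0.9 as stated in the text
    """
    return d_max * (1.0 - math.exp(-k_p * eps_p))


K_P_EXAMPLE = 100.0  # hypothetical value, for illustration only
damage_curve = [tissue_damage(e, K_P_EXAMPLE) for e in (0.0, 0.005, 0.01, 0.05)]
```

Damage starts at zero, grows monotonically with the accumulated plastic strain, and approaches D<sub>max</sub> without exceeding it, which is what keeps the damaged stiffness positive-definite.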

The µFE simulations were run on a Cray XC30 supercomputer hosted by ARCHER (UK National Supercomputing Service), with an in-house version of ParaFEM (Smith et al., 2013; Levrero-Florencio et al., 2017b) which solves implicit quasi-static finite strain elastoplasticity problems in a highly scalable, message passing interface (MPI)-based parallel fashion. Each simulation took from 40 to 120 min when using 1,920 cores, depending on the considered load case, with biaxial compression-compression load cases taking the longest. In order to improve the convergence of the local (constitutive level, i.e., at each integration point) Newton–closest-point projection method (Newton-CPPM), two additional schemes were implemented: (a) a line search as in the primal-CPPM scheme described in Pérez-Foguet and Armero (2002) and (b) an improved trial predictor (Bićanić and Pearce, 1996; de Souza Neto et al., 2008). In the latter scheme, if the first Newton-CPPM fails to converge, it is restarted with the initial guess for the stress set to **σ**<sup>proj</sup>, which is the stress returned to the frozen yield surface, i.e., with no hardening or damage evolution. If these two mechanisms do not work, then, to ensure that a possible local lack of convergence does not influence the results of the µFE simulations, the lack of convergence of the CPPM scheme is broadcast to all MPI processes so that the time increment is halved. The initial, and maximum, step size corresponded to a macroscopic strain Frobenius norm of 0.1% and was allowed to decrease to a minimum of 0.001% if global (structural level, i.e., the global stiffness matrix) or local convergence was not achieved. The global solution scheme employed was Newton-Raphson, and a Jacobi (diagonally) preconditioned conjugate gradient method was used as the linear algebraic solver.
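For readers unfamiliar with the linear solver mentioned above, a minimal serial sketch of a Jacobi-preconditioned conjugate gradient iteration is given below. It uses a small dense matrix stored as nested lists and is meant only to illustrate the algorithm; it does not reflect the parallel implementation used in the study, and all names and values are assumptions.

```python
import math


def jacobi_pcg(a, b, tol=1e-10, max_iter=1000):
    """Solve a x = b for a symmetric positive-definite matrix a.

    The preconditioner is the inverse of the diagonal of a (Jacobi).
    """
    n = len(b)
    inv_diag = [1.0 / a[i][i] for i in range(n)]
    x = [0.0] * n
    r = b[:]                                   # residual for the zero initial guess
    z = [inv_diag[i] * r[i] for i in range(n)]
    p = z[:]
    rz = sum(ri * zi for ri, zi in zip(r, z))
    if rz == 0.0:                              # zero right-hand side
        return x
    b_norm = math.sqrt(sum(bi * bi for bi in b))
    for _ in range(max_iter):
        ap = [sum(a[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rz / sum(pi * api for pi, api in zip(p, ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        if math.sqrt(sum(ri * ri for ri in r)) <= tol * b_norm:
            break
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = sum(ri * zi for ri, zi in zip(r, z))
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


# Small illustrative system (hypothetical values).
a_example = [[4.0, 1.0, 0.0],
             [1.0, 3.0, 1.0],
             [0.0, 1.0, 2.0]]
b_example = [1.0, 2.0, 3.0]
x_example = jacobi_pcg(a_example, b_example)
```

The diagonal preconditioner needs only the diagonal of the stiffness matrix, so it adds very little storage or communication, which is one common reason for choosing it at large scale.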

The macroscopic elastic stiffness tensor was calculated at each time increment by using the homogenization procedure described by van Rietbergen et al. (1995, 1996), in which the macroscopic elastic stiffness tensor 𝔼 is

$$\mathbb{E} = \frac{1}{V} \int\_{\Omega} (1 - D\_{\mu}) \mathbb{E}\_{\mu} : \mathbb{M} \, \mathrm{d}V,\tag{5}$$

which, in a FE setting, is equivalent to

$$\mathbb{E} = \frac{1}{V} \sum\_{i=1}^{n\_{\text{els}}} \sum\_{j=1}^{n\_{\text{ips}}} (1 - D\_{\mu,ij}) \mathbb{E}\_{\mu,ij} : \mathbb{M}\_{ij} \det(\mathbf{J}\_{ij}) w\_j,\tag{6}$$

and where V is the volume of the cubic region (5×5×5 = 125 mm<sup>3</sup>), D<sub>µ</sub> is the damage at the solid phase, 𝔼<sub>µ</sub> is the solid phase undamaged stiffness tensor, n<sub>els</sub> is the total number of elements in the considered mesh, n<sub>ips</sub> is the number of integration points in a trilinear hexahedron, det(**J**<sub>ij</sub>) is the determinant of the Jacobian of the transformation from normal to natural coordinates, w<sub>j</sub> is the weight of the corresponding integration point, and 𝕄 is the local structure tensor, which relates the solid phase strain **ε**<sub>µ</sub> to the average strain tensor **ε**, such that

$$\boldsymbol{\varepsilon}\_{\mu} = \mathbb{M} : \boldsymbol{\varepsilon}.\tag{7}$$
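The discrete sum in Equation (6) can be sketched as a simple accumulation over integration points. The sketch below assumes that fourth-order tensors are stored as 6×6 matrices in a representation for which the double contraction reduces to an ordinary matrix product; this storage choice, the data layout, and all names and values are assumptions made for illustration.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def homogenized_stiffness(volume, integration_points):
    """Accumulate Equation (6).

    integration_points yields tuples (damage, e_mu, m_local, det_j, weight),
    one per integration point of every element (hypothetical data layout).
    """
    n = 6
    total = [[0.0] * n for _ in range(n)]
    for damage, e_mu, m_local, det_j, weight in integration_points:
        contribution = matmul(e_mu, m_local)
        scale = (1.0 - damage) * det_j * weight
        for i in range(n):
            for j in range(n):
                total[i][j] += scale * contribution[i][j]
    return [[value / volume for value in row] for row in total]


identity = [[1.0 if i == j else 0.0 for j in range(6)] for i in range(6)]
e_mu = [[2.0 * v for v in row] for row in identity]   # illustrative stiffness
points = [(0.0, e_mu, identity, 0.5, 1.0),            # undamaged half of the volume
          (0.5, e_mu, identity, 0.5, 1.0)]            # half-damaged other half
e_hom = homogenized_stiffness(1.0, points)
```

In this toy case the homogenized stiffness is the volume-weighted average of the locally degraded stiffness, i.e., 0.75 times the undamaged value.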

This tensor 𝕄 was determined by solving six completely linear FE systems for six macroscopic uniaxial strain cases (three tensile or compressive and three shear). For each of these cases, the tissue strains calculated represent one of the six columns of the matrix projection of 𝕄 (Hollister and Kikuchi, 1992). The assumption made was that the samples are aligned in their orthotropic axes as they were aligned with the MIL fabric tensor eigenvectors (Odgaard et al., 1997). Macroscopic strain points were defined by using the 0.2% strain criterion (Wolfram et al., 2012; Levrero-Florencio et al., 2017a), which was extended to define further 0.3, 0.4, and 0.5% strain levels. The corresponding damaged slope to calculate these strain points is determined at each time step, depending on the load case. The following is an example for the biaxial tensile case ε<sub>11</sub> = ε<sub>22</sub> > 0 (load case 10 in **Table 1**). Since the macroscopic strains are small, the assumption of linear kinematics can be considered at the macroscale; thus, the homogenized infinitesimal stress can be obtained through the macroscopic infinitesimal strain and the macroscopic stiffness tensor, such as

$$
\begin{aligned}
\begin{Bmatrix} \sigma\_{\text{hom},11} \\ \sigma\_{\text{hom},22} \\ \sigma\_{\text{hom},33} \\ \sigma\_{\text{hom},12} \\ \sigma\_{\text{hom},13} \\ \sigma\_{\text{hom},23} \end{Bmatrix} &= \begin{bmatrix} E\_{1111} & E\_{1122} & E\_{1133} & E\_{1112} & E\_{1113} & E\_{1123} \\ E\_{2211} & E\_{2222} & E\_{2233} & E\_{2212} & E\_{2213} & E\_{2223} \\ E\_{3311} & E\_{3322} & E\_{3333} & E\_{3312} & E\_{3313} & E\_{3323} \\ E\_{1211} & E\_{1222} & E\_{1233} & E\_{1212} & E\_{1213} & E\_{1223} \\ E\_{1311} & E\_{1322} & E\_{1333} & E\_{1312} & E\_{1313} & E\_{1323} \\ E\_{2311} & E\_{2322} & E\_{2333} & E\_{2312} & E\_{2313} & E\_{2323} \end{bmatrix} \begin{Bmatrix} \varepsilon\_{11} \\ \varepsilon\_{22} \\ 0 \\ 0 \\ 0 \\ 0 \end{Bmatrix} \\
&= \begin{Bmatrix} E\_{1111}\varepsilon\_{11} + E\_{1122}\varepsilon\_{22} \\ E\_{2211}\varepsilon\_{11} + E\_{2222}\varepsilon\_{22} \\ 0 \\ 0 \\ 0 \\ 0 \end{Bmatrix},
\end{aligned} \tag{8}
$$

where **σ**<sub>hom</sub> is the homogenized stress tensor, leading to

$$\|\boldsymbol{\sigma}\_{\text{hom}}\| = \sqrt{E\_{1111}^2 \varepsilon\_{11}^2 + E\_{1122}^2 \varepsilon\_{22}^2 + E\_{2211}^2 \varepsilon\_{11}^2 + E\_{2222}^2 \varepsilon\_{22}^2},\tag{9}$$

with the damaged slope being (note that in the considered biaxial cases |ε<sub>ii</sub>| = |ε<sub>jj</sub>|)

$$K\_{\rm dam} = \sqrt{E\_{1111}^2 + E\_{1122}^2 + E\_{2211}^2 + E\_{2222}^2}.\tag{10}$$
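The relation between Equations (9) and (10) can be checked numerically with a short sketch for the biaxial case in the 11–22 plane. The function names and the stiffness entries below are hypothetical and serve only to illustrate the expressions as written.

```python
import math


def stress_norm_biaxial(e, eps11, eps22):
    """Norm of the homogenized stress as written in Equation (9).

    e is the matrix projection of the stiffness tensor; only its upper-left
    2x2 block (E1111, E1122, E2211, E2222) enters this load case.
    """
    return math.sqrt(e[0][0]**2 * eps11**2 + e[0][1]**2 * eps22**2
                     + e[1][0]**2 * eps11**2 + e[1][1]**2 * eps22**2)


def damaged_slope_biaxial(e):
    """Damaged slope K_dam of Equation (10)."""
    return math.sqrt(e[0][0]**2 + e[0][1]**2 + e[1][0]**2 + e[1][1]**2)


# Illustrative 2x2 block of a damaged stiffness matrix (hypothetical values).
e_block = [[3.0, 1.0],
           [1.0, 2.0]]
eps = 0.002                      # equal strain magnitude in both directions
k_dam = damaged_slope_biaxial(e_block)
sigma_norm = stress_norm_biaxial(e_block, eps, eps)
```

When both strain components have the same magnitude, the stress norm of Equation (9) equals K<sub>dam</sub> multiplied by that magnitude, which is why K<sub>dam</sub> acts as a slope.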

#### 3.2. Theoretical Framework of Damage

The previously described µFE simulations, together with the homogenization-based multiscale procedure, were used to derive the damaged macroscopic stiffness tensors of the considered samples, for different load scenarios (**Table 1**) and load levels (0.2, 0.3, 0.4, and 0.5% strain norm). These stiffness tensors were used as data points for a minimization procedure (described in the following subsections), which was used to fit the macroscopic damage behavior to several theoretical damage models: single scalar isotropic formulation, three scalars anisotropic formulation, and isotropic/anisotropic combined formulation with tension/compression asymmetry.

Coupled damage and plasticity were considered for the µFE simulations. However, the focus of this study is on the macroscopic damage behavior of trabecular bone and therefore no plasticity is assumed at the macroscale. This is why, in the following, the total strain **ε** is used instead of the elastic strain **ε**<sup>e</sup>.

#### 3.2.1. Basic Concepts and Description of the Baseline Model

Let us consider the theoretical framework of elastic degradation by using state variables, from which the different damage constitutive models are derived (Carol et al., 1994, 2002; Murakami, 2012). The starting point of the theoretical framework is the assumption of a Helmholtz free energy potential per unit reference volume ψ of the considered material, from which the state equations are derived. The free energy potential may be expressed as

$$\begin{aligned} \psi(\boldsymbol{\varepsilon}, D\_k, R\_k) &= \psi^{\mathrm{e}}(\boldsymbol{\varepsilon}, D\_k) + \psi^{D}(R\_k) \\ &= \frac{1}{2} \boldsymbol{\varepsilon} : \mathbb{E}(\mathbb{E}\_0, D\_k) : \boldsymbol{\varepsilon} + \frac{1}{2} \sum\_{k=1}^l K\_k R\_k^2, \end{aligned} \tag{11}$$

where **ε** is the infinitesimal strain tensor; 𝔼 and 𝔼<sub>0</sub> are, respectively, the damaged and undamaged stiffness tensors; D<sub>k</sub> are a set of l scalar damage variables; and R<sub>k</sub> and K<sub>k</sub> are, respectively, a set of l variables and l parameters controlling the size and hardening of the (damage) dissipation potential functions F<sub>k</sub> (Equation 16).

The time derivative of Equation (11) yields

$$\dot{\psi} = \frac{\partial \psi}{\partial \boldsymbol{\varepsilon}} : \dot{\boldsymbol{\varepsilon}} + \sum\_{k=1}^{l} \frac{\partial \psi}{\partial D\_{k}} \dot{D}\_{k} + \sum\_{k=1}^{l} \frac{\partial \psi}{\partial R\_{k}} \dot{R}\_{k},\tag{12}$$

which, when used in the Clausius-Duhem inequality for isothermal processes

$$\boldsymbol{\sigma} : \dot{\boldsymbol{\varepsilon}} - \rho \dot{\psi} \geq 0,\tag{13}$$

gives rise to the dissipation inequality

$$\begin{aligned} \phi &= \left(\boldsymbol{\sigma} - \rho \frac{\partial \psi}{\partial \boldsymbol{\varepsilon}}\right) : \dot{\boldsymbol{\varepsilon}} - \sum\_{k=1}^{l} \rho \frac{\partial \psi}{\partial D\_k} \dot{D}\_k - \sum\_{k=1}^{l} \rho \frac{\partial \psi}{\partial R\_k} \dot{R}\_k \\ &= \sum\_{k=1}^{l} Y\_k \dot{D}\_k + \sum\_{k=1}^{l} B\_k \dot{R}\_k \ge 0, \end{aligned} \tag{14}$$

where ρ is the density of the considered material, $\boldsymbol{\sigma} = \rho \frac{\partial \psi}{\partial \boldsymbol{\varepsilon}}$, $Y\_k = -\rho \frac{\partial \psi}{\partial D\_k}$, and $B\_k = -\rho \frac{\partial \psi}{\partial R\_k} = K\_k R\_k$.

The evolution equations of D<sub>k</sub> and R<sub>k</sub> are derived from the corresponding dissipation potential functions F<sub>k</sub>, leading to

$$
\dot{D}\_k = \dot{\gamma}\_k \frac{\partial F\_k}{\partial Y\_k}; \quad \dot{R}\_k = \dot{\gamma}\_k \frac{\partial F\_k}{\partial B\_k}, \tag{15}
$$

where γ̇<sub>k</sub> are indeterminate multipliers. Since F<sub>k</sub> also delimit the undamaged region of the considered material, the non-negativeness of Equation (14) is assured (Murakami, 2012). Linear, a priori uncoupled, criteria for F<sub>k</sub> are considered in this study (each D<sub>k</sub> is related to a single F<sub>k</sub>), such that

$$F\_k(Y\_k, B\_k) = Y\_k - (B\_k + B\_{k,0}) = Y\_k - (K\_k R\_k + B\_{k,0}) \le 0,\tag{16}$$

where B<sub>k,0</sub> are the initial sizes of F<sub>k</sub>, i.e., when R<sub>k</sub> = 0. These linear functions are considered for the sake of simplicity, and also because data at additional strain points would be needed before more complex, non-linear, evolution expressions of the dissipation potentials could be taken into account.

Energy equivalence is adopted here since it automatically induces major symmetry in the stiffness and compliance tensors. This leads to

$$\begin{split} \psi^{\mathrm{e}}(\boldsymbol{\varepsilon}, D\_{k}) &= \frac{1}{2} \boldsymbol{\varepsilon} : \mathbb{E}(\mathbb{E}\_{0}, D\_{k}) : \boldsymbol{\varepsilon} = \frac{1}{2} \boldsymbol{\varepsilon} : \mathbb{M}^{\mathrm{T}}(D\_{k}) : \mathbb{E}\_{0} : \mathbb{M}(D\_{k}) : \boldsymbol{\varepsilon} \\ &= \frac{1}{2} \boldsymbol{\varepsilon}\_{\mathrm{eff}}(\boldsymbol{\varepsilon}, D\_{k}) : \mathbb{E}\_{0} : \boldsymbol{\varepsilon}\_{\mathrm{eff}}(\boldsymbol{\varepsilon}, D\_{k}), \end{split} \tag{17}$$

where 𝕄 is the fourth-order damage effect tensor, which depends on the considered damage formulation, and 𝔸<sup>T</sup> is defined so that (𝔸<sup>T</sup>)<sub>ijkl</sub> = A<sub>klij</sub>.
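The statement that energy equivalence induces major symmetry can be illustrated numerically. The sketch below assumes that fourth-order tensors are stored as 6×6 matrices in a representation for which the double contraction reduces to a matrix product and the transpose of Equation (17) to the matrix transpose; the function names and the stiffness values are hypothetical.

```python
def transpose(a):
    """Transpose of a matrix stored as nested lists."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def damaged_stiffness(m_eff, e0):
    """Damaged stiffness E = M^T : E0 : M implied by Equation (17)."""
    return matmul(matmul(transpose(m_eff), e0), m_eff)


# Illustrative symmetric undamaged stiffness (hypothetical values).
e0 = [[10.0 if i == j else (2.0 if i < 3 and j < 3 else 0.0) for j in range(6)]
      for i in range(6)]
d = 0.2                                        # example scalar damage
m_iso = [[(1.0 - d) if i == j else 0.0 for j in range(6)] for i in range(6)]
e_dam = damaged_stiffness(m_iso, e0)
```

For an isotropic damage effect tensor (1 − D) times the identity, the result reduces to (1 − D)² times the undamaged stiffness, and the damaged matrix remains symmetric whenever the undamaged one is.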

#### 3.2.2. Numerical Solution of the Damage Models

Equations (15, 16) are integrated with Backward Euler. Residual equations for each of the variables to be sought can be formulated, with a format similar to that of CPPM equations of computational plasticity (Armero and Pérez-Foguet, 2002; Pérez-Foguet and Armero, 2002), so that

$$\begin{Bmatrix} R\_{D,k} \\ R\_{R,k} \\ F\_k \end{Bmatrix} = \begin{Bmatrix} D\_{k,n+1} - D\_{k,n} - \Delta \gamma\_{k,n+1} \frac{\partial F\_k}{\partial Y\_k} \Big|\_{n+1} \\ R\_{k,n+1} - R\_{k,n} - \Delta \gamma\_{k,n+1} \frac{\partial F\_k}{\partial B\_k} \Big|\_{n+1} \\ Y\_{k,n+1} - (K\_k R\_{k,n+1} + B\_{k,0}) \end{Bmatrix} \tag{18}$$

where n stands for the nth time increment, and the vertical bar means "evaluated at".

The resulting set of non-linear equations (Equation 18) can be solved with a numerical scheme, for instance a Newton-Raphson approach. The first step is to calculate the Jacobian of the system, and therefore the residuals (Equation 18) are linearized, leading to (time subscripts are dropped for convenience from now onwards)

$$\begin{Bmatrix} 0 \\ 0 \\ 0 \end{Bmatrix} = \begin{Bmatrix} \mathrm{d}D\_{j} \left( \delta\_{jk} - \Delta \gamma\_{k} \frac{\partial}{\partial D\_{j}} \frac{\partial F\_{k}}{\partial Y\_{k}} \right) - \mathrm{d}\Delta \gamma\_{k} \frac{\partial F\_{k}}{\partial Y\_{k}} \\ \mathrm{d}R\_{k} - \mathrm{d}\Delta \gamma\_{k} \frac{\partial F\_{k}}{\partial B\_{k}} \\ \mathrm{d}D\_{j} \frac{\partial F\_{k}}{\partial D\_{j}} + \mathrm{d}R\_{k} \frac{\partial F\_{k}}{\partial R\_{k}} \end{Bmatrix},\tag{19}$$

where δ<sub>ij</sub> is the Kronecker delta (δ<sub>ij</sub> = 1 if i = j and δ<sub>ij</sub> = 0 if i ≠ j). The specific expressions for the derivatives of the Jacobian are presented for each of the considered damage models in the following sections.

The resulting Newton-Raphson scheme to solve for D<sub>k</sub>, R<sub>k</sub>, and Δγ<sub>k</sub> is

$$\begin{Bmatrix} D\_k \\ R\_k \\ \Delta \gamma\_k \end{Bmatrix}\_{m+1} = \begin{Bmatrix} D\_k \\ R\_k \\ \Delta \gamma\_k \end{Bmatrix}\_m - \begin{bmatrix} \delta\_{jk} - \Delta \gamma\_k \frac{\partial}{\partial D\_j} \frac{\partial F\_k}{\partial Y\_k} & 0 & -\frac{\partial F\_k}{\partial Y\_k} \\ 0 & 1 & -\frac{\partial F\_k}{\partial B\_k} \\ \frac{\partial F\_k}{\partial D\_j} & \frac{\partial F\_k}{\partial R\_k} & 0 \end{bmatrix}^{-1} \begin{Bmatrix} R\_{D,k} \\ R\_{R,k} \\ F\_k \end{Bmatrix}\_m,\tag{20}$$

where m stands for the mth iteration of the Newton-Raphson scheme.

#### 3.2.3. Damage Models

This section describes the three main models, and their variants, used in this study. The first two models, single scalar isotropic model (section 3.2.3.1) and three scalars anisotropic model (section 3.2.3.2) are mainly used to assess the BV/TV and fabric eigenvalue dependencies of macroscopic damage models of trabecular bone. The proposed model (section 3.2.3.3), we believe, is a considerable improvement upon the usually employed single scalar isotropic formulation.

#### **3.2.3.1. Single scalar isotropic formulation**

In this simple damage formulation a single scalar damage variable D equally affects all the components of the stiffness tensor, i.e., all directions are equally affected by damage. The damage effect tensor is

$$\mathbb{M} = (1 - D)\mathbb{I}\_{\text{sym}},\tag{21}$$

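where $\mathbb{I}\_{\text{sym}} = \mathbf{I} \overline{\underline{\otimes}} \mathbf{I}$ is the symmetric fourth-order identity tensor ($\frac{1}{2}[\delta\_{ik}\delta\_{jl} + \delta\_{il}\delta\_{jk}]$).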

The Helmholtz free energy potential for this model is

$$\psi(\boldsymbol{\varepsilon}, D, R) = \frac{1}{2}\boldsymbol{\varepsilon} : (1 - D)^2 \mathbb{E}\_0 : \boldsymbol{\varepsilon} + \frac{1}{2}KR^2,\tag{22}$$

which leads to the following expressions for the conjugate thermodynamic associated variables

$$\begin{aligned} \boldsymbol{\sigma} &= (1 - D)^2 \mathbb{E}\_0 : \boldsymbol{\varepsilon} \\ Y &= -\frac{1}{2} \boldsymbol{\varepsilon} : \frac{\partial \mathbb{E}}{\partial D} : \boldsymbol{\varepsilon} = \boldsymbol{\varepsilon} : (1 - D) \mathbb{E}\_0 : \boldsymbol{\varepsilon} \\ B &= KR \end{aligned} \tag{23}$$

and to the following expressions for the derivatives in Equation (20)

$$\begin{aligned} \frac{\partial}{\partial D} \frac{\partial F}{\partial Y} &= 0\\ \frac{\partial F}{\partial Y} &= 1\\ \frac{\partial F}{\partial D} = \frac{\partial Y}{\partial D} &= -\frac{1}{2}\boldsymbol{\varepsilon} : \frac{\partial^2 \mathbb{E}}{\partial D^2} : \boldsymbol{\varepsilon} = -\boldsymbol{\varepsilon} : \mathbb{E}\_0 : \boldsymbol{\varepsilon} \\ \frac{\partial F}{\partial R} &= -K. \end{aligned} \tag{24}$$

BV/TV dependence is included in this model by defining K = K<sub>0,iso</sub> ρ<sup>o</sup> and B = B<sub>0,iso</sub> ρ<sup>p</sup>, where ρ is the BV/TV of the considered sample, and o and p are the exponents expressing the BV/TV dependency.
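To make the numerical scheme concrete for this model, the following sketch applies the Newton-Raphson update of Equation (20) to the single scalar isotropic formulation, using the residuals of Equation (18) and the derivatives of Equation (24) with the signs as written. The scalar `w` stands for the contraction **ε** : 𝔼<sub>0</sub> : **ε**; the function names, the small linear solver, and the numerical values are assumptions made for illustration and do not reproduce the code used in the study.

```python
def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    n = 3
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def newton_isotropic(w, k_hard, b0, d_n=0.0, r_n=0.0, tol=1e-12, max_iter=20):
    """Return (D, R, delta_gamma) at the end of one increment.

    w      : eps : E0 : eps for the current total strain
    k_hard : hardening parameter K
    b0     : initial size of the dissipation potential
    """
    if (1.0 - d_n) * w - (k_hard * r_n + b0) <= 0.0:
        return d_n, r_n, 0.0                   # elastic state, nothing to solve
    d_f_dy, d_f_db = 1.0, -1.0                 # derivatives of Eq. 16
    d, r, dg = d_n, r_n, 0.0
    for _ in range(max_iter):
        res = [d - d_n - dg * d_f_dy,          # R_D of Eq. 18
               r - r_n - dg * d_f_db,          # R_R of Eq. 18
               (1.0 - d) * w - (k_hard * r + b0)]  # F of Eq. 18, Y from Eq. 23
        if max(abs(v) for v in res) < tol:
            break
        jac = [[1.0, 0.0, -d_f_dy],
               [0.0, 1.0, -d_f_db],
               [-w, -k_hard, 0.0]]             # dF/dD and dF/dR from Eq. 24
        step = solve3(jac, res)
        d, r, dg = d - step[0], r - step[1], dg - step[2]
    return d, r, dg


# Illustrative values (hypothetical): w = 2.0, K = 0.5, B0 = 1.0.
d_new, r_new, dg_new = newton_isotropic(2.0, 0.5, 1.0)
```

Because the residuals of this particular model are linear in D, R, and Δγ, the iteration converges after a single update; the more general anisotropic models require several iterations.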

#### **3.2.3.2. Three scalars anisotropic formulation**

In the anisotropic damage formulation a damage scalar for each principal direction of the sample is considered (D1, D2, and D3), meaning that each of these three orthogonal directions has a different damage behavior (as previously stated, these orthogonal directions are parallel to the axes of the cubic sample). Since the range of post-elastic strains applied to the sample is relatively small, it is assumed that no rotation of the orthotropic axes occurs. The damage effect tensor is

$$\mathbb{M} = (\mathbb{I}\_{\text{sym}} - \mathbb{D}),\tag{25}$$

where

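$$\left[\frac{\partial \mathbb{D}}{\partial D\_1}\right] = \begin{bmatrix} 1 & \alpha & \alpha & 0 & 0 & 0 \\ \alpha & 0 & 0 & 0 & 0 & 0 \\ \alpha & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \beta & 0 & 0 \\ 0 & 0 & 0 & 0 & \beta & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}; \quad \left[\frac{\partial \mathbb{D}}{\partial D\_2}\right] = \begin{bmatrix} 0 & \alpha & 0 & 0 & 0 & 0 \\ \alpha & 1 & \alpha & 0 & 0 & 0 \\ 0 & \alpha & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \beta & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & \beta \end{bmatrix}; \quad \left[\frac{\partial \mathbb{D}}{\partial D\_3}\right] = \begin{bmatrix} 0 & 0 & \alpha & 0 & 0 & 0 \\ 0 & 0 & \alpha & 0 & 0 & 0 \\ \alpha & \alpha & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & \beta & 0 \\ 0 & 0 & 0 & 0 & 0 & \beta \end{bmatrix},\tag{26}$$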

in which α and β are parameters which determine how the components of the stiffness tensor are affected by the different damage scalars.
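The structure of the three matrices in Equation (26) can be generated programmatically, which makes the role of α and β explicit. The sketch below assumes the component ordering 11, 22, 33, 12, 13, 23 of Equation (2), and that the damage tensor is the sum of each damage scalar multiplied by its derivative matrix; the function names and parameter values are hypothetical.

```python
def damage_derivative(k, alpha, beta):
    """6x6 matrix projection of dD/dD_k for k = 1, 2, 3 (Eq. 26)."""
    n = k - 1
    m = [[0.0] * 6 for _ in range(6)]
    for j in range(3):
        m[n][j] = alpha                       # coupling with the other normal components
        m[j][n] = alpha
    m[n][n] = 1.0                             # normal component of direction k
    shear_pairs = {3: (0, 1), 4: (0, 2), 5: (1, 2)}   # rows of the 12, 13, 23 terms
    for row, pair in shear_pairs.items():
        if n in pair:
            m[row][row] = beta                # shear terms sharing the index k
    return m


def damage_effect(d, alpha, beta):
    """Damage effect matrix M = I_sym - D (Eq. 25) for damage scalars d = [D1, D2, D3]."""
    m_eff = [[1.0 if i == j else 0.0 for j in range(6)] for i in range(6)]
    for k, d_k in enumerate(d, start=1):
        deriv = damage_derivative(k, alpha, beta)
        for i in range(6):
            for j in range(6):
                m_eff[i][j] -= d_k * deriv[i][j]
    return m_eff


# Illustrative parameters (hypothetical values).
dd1 = damage_derivative(1, 0.5, 0.25)
m_example = damage_effect([0.1, 0.0, 0.0], 0.5, 0.25)
```

With only D<sub>1</sub> active, the normal component along direction 1 and the two shear components involving index 1 are degraded, while the 23 shear component is left untouched.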

The Helmholtz free energy potential is

$$\psi(\boldsymbol{\varepsilon}, D\_k, R\_k) = \frac{1}{2}\boldsymbol{\varepsilon} : \mathbb{E}(\mathbb{E}\_0, D\_k) : \boldsymbol{\varepsilon} + \frac{1}{2} \sum\_{k=1}^3 K\_k R\_k^2,\tag{27}$$

which leads to the following expressions for the conjugate thermodynamic associated variables

$$\begin{aligned} \boldsymbol{\sigma} &= \mathbb{E} : \boldsymbol{\varepsilon} = [(\mathbb{I}\_{\text{sym}} - \mathbb{D}) : \mathbb{E}\_0 : (\mathbb{I}\_{\text{sym}} - \mathbb{D})] : \boldsymbol{\varepsilon} \\ Y\_k &= -\frac{1}{2} \boldsymbol{\varepsilon} : \frac{\partial \mathbb{E}}{\partial D\_k} : \boldsymbol{\varepsilon} \\ B\_k &= K\_k R\_k \end{aligned} \tag{28}$$

and to the following expressions for the derivatives in Equation (20)

$$\begin{aligned} \frac{\partial}{\partial D\_j} \frac{\partial F\_k}{\partial Y\_k} &= 0\\ \frac{\partial F\_k}{\partial Y\_k} &= 1\\ \frac{\partial F\_k}{\partial D\_j} &= \frac{\partial Y\_k}{\partial D\_j} = -\frac{1}{2} \boldsymbol{\varepsilon} : \frac{\partial}{\partial D\_j} \frac{\partial \mathbb{E}}{\partial D\_k} : \boldsymbol{\varepsilon} \\ \frac{\partial F\_k}{\partial R\_k} &= -K\_k \\ \frac{\partial \mathbb{E}}{\partial D\_k} &= -\left[\frac{\partial \mathbb{D}}{\partial D\_k} : \mathbb{E}\_0 : \mathbb{M} + \mathbb{M} : \mathbb{E}\_0 : \frac{\partial \mathbb{D}}{\partial D\_k}\right] \\ \frac{\partial}{\partial D\_j} \frac{\partial \mathbb{E}}{\partial D\_k} &= \frac{\partial \mathbb{D}}{\partial D\_k} : \mathbb{E}\_0 : \frac{\partial \mathbb{D}}{\partial D\_j} + \frac{\partial \mathbb{D}}{\partial D\_j} : \mathbb{E}\_0 : \frac{\partial \mathbb{D}}{\partial D\_k}. \end{aligned} \tag{29}$$

Fabric eigenvalue dependencies are included in this model by defining K<sub>k</sub> = K<sub>0,aniso</sub> m<sub>k</sub><sup>q</sup> and B<sub>k</sub> = B<sub>0,aniso</sub> m<sub>k</sub><sup>r</sup>, where m<sub>k</sub> is the MIL fabric eigenvalue corresponding to the kth orthotropic direction of the sample, and q and r are the exponents expressing the fabric eigenvalue dependency.

#### **3.2.3.3. Combined formulation with tension/compression asymmetry**

We propose a combined isotropic/anisotropic damage formulation, which consists of four damage scalars: a single scalar defines the isotropic part of the model (D<sub>iso</sub>); and three scalars define the anisotropic part of the model, one for each of the three orthotropic directions (D<sub>1</sub>, D<sub>2</sub>, and D<sub>3</sub>). As in the previous cases, the isotropic damage scalar equally affects all directions, while each of the three orthotropic damage scalars only affects its corresponding orthogonal direction. It is assumed that there is no rotation of the orthotropic axes. The tension/compression asymmetry is included in the damage effect tensor, such that

$$\mathbb{M} = \mathbb{I}\_{\text{sym}} - \mathbb{D}\_{\text{iso}} - \sum\_{i=1}^{3} \left[1 + \eta \,\mathrm{H}(-\mathbf{m}\_{i} \cdot \boldsymbol{\varepsilon}\, \mathbf{m}\_{i})\right] \mathbb{D}\_{\text{aniso},i},\tag{30}$$

where

$$\mathbb{D}\_{\text{iso}} = D\_{\text{iso}}\,\mathbb{I}\_{\text{sym}},\tag{31}$$

$$\mathbb{D}\_{\text{aniso},i} = \frac{\partial \mathbb{D}}{\partial D\_i} D\_i \tag{32}$$

with $\frac{\partial \mathbb{D}}{\partial D\_i}$ being defined in Equation (26), η is the parameter governing the tension/compression asymmetry, **m**<sub>i</sub> is the ith fabric tensor eigenvector, and H(·) is the Heaviside function defined as

$$\mathbf{H}(\cdot) = \begin{cases} 1 & \text{if } (\cdot) > 0 \\ 0 & \text{if } (\cdot) \le 0 \end{cases}. \tag{33}$$
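The tension/compression switch in Equation (30) acts through the normal strain along each fabric eigenvector. A minimal sketch of the scalar factor multiplying each anisotropic damage term is given below; the function names and the values of η and of the strain are hypothetical.

```python
def heaviside(x):
    """Heaviside function of Equation (33): 1 for x > 0, otherwise 0."""
    return 1.0 if x > 0.0 else 0.0


def normal_strain(m, eps):
    """Normal strain m . eps m along the unit vector m (eps is a 3x3 matrix)."""
    return sum(m[i] * eps[i][j] * m[j] for i in range(3) for j in range(3))


def asymmetry_factor(m, eps, eta):
    """Factor 1 + eta * H(-m . eps m) applied to the anisotropic damage term."""
    return 1.0 + eta * heaviside(-normal_strain(m, eps))


m1 = [1.0, 0.0, 0.0]                          # first fabric eigenvector (aligned sample)
eps_compression = [[-0.002, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
eps_tension = [[0.002, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
eta_example = 0.5                             # hypothetical asymmetry parameter
factor_c = asymmetry_factor(m1, eps_compression, eta_example)
factor_t = asymmetry_factor(m1, eps_tension, eta_example)
```

The factor equals 1 + η when the normal strain along the eigenvector is compressive and 1 otherwise, so a positive η amplifies the anisotropic damage effect in compression only.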

The Helmholtz free energy potential for this model is

$$\psi(\boldsymbol{\varepsilon}, D\_k, R\_k) = \frac{1}{2}\boldsymbol{\varepsilon} : \mathbb{E}(\mathbb{E}\_0, D\_k) : \boldsymbol{\varepsilon} + \frac{1}{2} \sum\_{k=1}^4 K\_k R\_k^2.\tag{34}$$

BV/TV and fabric eigenvalue dependencies are included in this model by defining K<sub>iso</sub> = K<sub>0,iso</sub> ρ<sup>o</sup>; K<sub>k,aniso</sub> = K<sub>0,aniso</sub> ρ<sup>t</sup> m<sub>k</sub><sup>q</sup>, k ∈ {1, 2, 3}; B<sub>iso</sub> = B<sub>0,iso</sub> ρ<sup>p</sup>; and B<sub>k,aniso</sub> = B<sub>0,aniso</sub> ρ<sup>u</sup> m<sub>k</sub><sup>r</sup>, k ∈ {1, 2, 3}, where o and p are the exponents expressing the BV/TV dependency of the isotropic part of the model; and t, u, q, and r are, respectively, the exponents expressing the BV/TV and fabric eigenvalue dependencies of the anisotropic part of the model. The rest of the expressions in the model are the same as those in section 3.2.3.2.
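The power-law dependencies on BV/TV and on the fabric eigenvalues can be written compactly as follows. This is a sketch of the expressions given above; the function name, argument order, and numerical values are assumptions made for illustration and are not fitted parameters from the study.

```python
def combined_parameters(rho, fabric, k0_iso, b0_iso, k0_aniso, b0_aniso,
                        o, p, t, u, q, r):
    """Return (K_iso, B_iso, [K_k,aniso], [B_k,aniso]) for one sample.

    rho    : BV/TV of the sample
    fabric : the three MIL fabric eigenvalues (m1, m2, m3)
    """
    k_iso = k0_iso * rho**o
    b_iso = b0_iso * rho**p
    k_aniso = [k0_aniso * rho**t * m**q for m in fabric]
    b_aniso = [b0_aniso * rho**u * m**r for m in fabric]
    return k_iso, b_iso, k_aniso, b_aniso


# Illustrative inputs (hypothetical values; fabric eigenvalues sum to 3).
params = combined_parameters(0.25, (1.5, 1.0, 0.5),
                             k0_iso=8.0, b0_iso=4.0, k0_aniso=2.0, b0_aniso=1.0,
                             o=2.0, p=1.0, t=1.0, u=0.0, q=2.0, r=1.0)
```

Setting an exponent to zero removes the corresponding dependency, which gives a simple way to compare variants of the model with and without BV/TV or fabric information.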

#### 3.3. Fitting of the Different Damage Laws

The different damage constitutive models described in the previous section are fitted to the macroscopic damage response obtained from the homogenization-based multiscale µFE simulations. The constitutive laws were fitted by using a particle swarm optimization scheme (particleswarm, MATLAB R2017b, MathWorks Inc.), followed by a gradient-based scheme (fmincon, MATLAB R2017b, MathWorks Inc.) to enhance the final tuning of the parameters, as it is assumed that when particleswarm finishes, the solution is already within the proximity of a minimum. The minimization problem is thus defined as

$$\min \sum\_{i=1}^{n} \left( \| [\mathbb{E}\_{\text{pred}}(\theta\_{s}) - \mathbb{E}\_{\mu \text{FE}}] \|\_{i} \right)^{2},\tag{35}$$

where n is the number of samples × load cases × strain levels, which means that the damage results for each sample, each considered load case, and each considered strain level (i.e., 0.2, 0.3, 0.4, and 0.5%) are used in the parameter fitting procedure; ‖[𝔼<sub>pred</sub>]‖ is the Frobenius norm of the matrix projection of the damaged stiffness tensor predicted by the considered theoretical damage model; ‖[𝔼<sub>µFE</sub>]‖ is the Frobenius norm of the matrix projection of the damaged stiffness tensor calculated through homogenization; and θ<sub>s</sub> are the s different parameters of the considered damage model.

This minimization problem (Equation 35) involves the fitting of parameters which govern the size of the damage dissipation potentials (i.e., the surface containing the elastic regime, in which damage does not develop; it is the damage analog to the yield surface in plasticity), and therefore the solution of the CPPM scheme may involve negative Δγ<sub>k</sub>, which are not physical solutions. The CPPM scheme is used in computational plasticity and/or damage contexts to solve the corresponding non-linear equations (Equation 20). If the loading state of a sample is found within the elastic regime (i.e., inside of the yield surface in a plasticity context, or inside the damage dissipation potential in a damage context), no equations need to be solved as plasticity and/or damage related quantities would not further develop. Thus, these undesired values of Δγ<sub>k</sub> will arise only if the loading state of the considered sample is not outside of the damage dissipation potential. To avoid such situations, the minimization problem is modified with a penalty term, such that

$$\min \sum\_{i=1}^{n} \left[ \left( \| [\mathbb{E}\_{\text{pred}}(\theta\_s) - \mathbb{E}\_{\mu \text{FE}}] \|\_i \right)^2 + \sum\_{k=1}^{l} \mathrm{H}(-\Delta \gamma\_{i,k}) K\_{\text{pen}} \left(\mathrm{e}^{|\Delta \gamma\_{i,k}|} - 1\right) \right],\tag{36}$$

where K<sub>pen</sub> is a large (penalty) constant.
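The penalized functional of Equation (36) can be sketched as a plain function of the misfit norms and the multipliers returned by the damage solver. The function name, the data layout, and the value of the penalty constant are assumptions made for this example; the study itself used MATLAB optimizers, and this sketch is not that code.

```python
import math


def penalized_objective(misfit_norms, delta_gammas, k_pen=1.0e6):
    """Evaluate Equation (36).

    misfit_norms : one Frobenius norm of [E_pred - E_muFE] per data point i
    delta_gammas : for each data point i, the list of multipliers for k = 1..l
    k_pen        : penalty constant (hypothetical default)
    """
    total = 0.0
    for norm_i, multipliers in zip(misfit_norms, delta_gammas):
        total += norm_i**2
        for dg in multipliers:
            if -dg > 0.0:                     # H(-delta_gamma) = 1: unphysical solution
                total += k_pen * (math.exp(abs(dg)) - 1.0)
    return total


# Two illustrative data points (hypothetical values).
value_ok = penalized_objective([3.0, 4.0], [[0.1], [0.0]])
value_penalized = penalized_objective([3.0, 4.0], [[-0.1], [0.0]])
```

The penalty vanishes as a negative multiplier approaches zero, so the functional stays continuous there, but its slope jumps, which is consistent with the use of a gradient-free method for the first stage of the fit.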

The initial choice of a solver not based on gradients is because the addition of this penalty term breaks the C<sup>1</sup> continuity of the functional to be minimized, and its global non-convexity is assumed a priori. The specific choice of particle swarm optimization over other methods not based on gradients, such as the genetic algorithm, is based on the superior computational efficiency of particle swarm optimization over the genetic algorithm (Panda and Padhy, 2008).

The goodness of the fitting procedure was analyzed with the standard error of the estimate (SEE). This is calculated as

$$\text{SEE}(\%) = 100 \frac{\sqrt{\sum\_{i=1}^{n} (\|\left[\mathbb{E}\_{\text{pred}} - \mathbb{E}\_{\mu \text{FE}}\right]\|\_{i})^2}}{\sqrt{\sum\_{i=1}^{n} (\|\left[\mathbb{E}\_{\text{pred}} - \mathbb{E}\_{0}\right]\|\_{i})^2}}. \tag{37}$$
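Equation (37) may be evaluated as sketched below, following the expression as written. Matrices are stored as nested lists of any square size; the function names and the one-entry example matrices are hypothetical and chosen so that the result can be verified by hand.

```python
import math


def frobenius_diff(a, b):
    """Frobenius norm of the difference of two matrices of equal size."""
    return math.sqrt(sum((x - y)**2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)))


def see_percent(e_pred, e_fe, e_0):
    """Standard error of the estimate in percent (Eq. 37).

    e_pred, e_fe, e_0 : lists with one matrix per data point i
    """
    numerator = sum(frobenius_diff(p, f)**2 for p, f in zip(e_pred, e_fe))
    denominator = sum(frobenius_diff(p, z)**2 for p, z in zip(e_pred, e_0))
    return 100.0 * math.sqrt(numerator) / math.sqrt(denominator)


# Single data point with 1x1 "matrices" (hypothetical values).
see_example = see_percent([[[2.0]]], [[[1.0]]], [[[4.0]]])
```

A value of zero indicates that the predicted and homogenized damaged stiffness tensors coincide for every data point.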

#### 4. RESULTS

#### 4.1. Evaluation of the µFE Results

For all load cases in **Table 1**, the considered samples were subjected to several strain levels, leading to different damage levels. The resulting macroscopic damaged stiffness tensors and the macroscopic strain Frobenius norms were measured at 0.2, 0.3, 0.4, and 0.5% strain levels by using the 0.2% strain criterion (Wolfram et al., 2012). This theoretically leads to damage and macroscopic strain Frobenius norms being evaluated, respectively, at 0–0.3% (with 0% being considered as macroscopic yield) macroscopic plastic strain Frobenius norms. The macroscopic strain Frobenius norms at 0.5% strain level for each load case are shown in **Figure 2** in the form of boxplots. It can be seen from this figure that within each group (T, C, S, or MA), higher macroscopic strain Frobenius norms correspond to compression-dominated load cases (load cases 4–6, 13, 17, and 21 in **Figure 2**).

Damage is evaluated by subtracting the damaged stiffness tensor from the undamaged stiffness tensor and calculating the Frobenius norm of its matrix projection (‖[E<sub>0</sub> − E<sub>dam</sub>]‖). The values of these norms for each of the considered load cases are shown in **Figure 3**; the damage shown corresponds to the 0.5% strain level. Due to the alignment of the samples and ordering of their fabric eigenvalues (m<sub>1</sub> > m<sub>2</sub> > m<sub>3</sub>), it can be seen from this figure that within each group (T, C, S, or MA), higher damage values are seen where the fabric tensor eigenvalues are the largest (i.e., load cases 1, 4, 7, and 10–13 in **Figure 3**). Moreover, higher damage values are also seen in load cases that are compression-dominated (load cases 4–6, 13, 17, and 21). These higher damage values in uniaxial compression, or in compression-dominated multi-axial load cases, compared to tension load cases indicate a possible tension/compression asymmetry in the damage behavior at the macroscopic level. It is important to mention that, although damage values were measured at the same strain levels according to the 0.2% strain criterion, the macroscopic strain Frobenius norms (**Figure 2**) were considerably larger in compression than in tension.

Multi-linear regressions in log-log space were performed to establish possible relationships between damage and the microarchitectural indices of the considered samples. These regressions were between ‖[E<sub>0</sub> − E<sub>dam</sub>]‖ at 0.5% strain level, BV/TV, fabric eigenvalues and macroscopic strain Frobenius norms, such as

$$\begin{aligned} \log(\|\left[\mathbb{E}\_0 - \mathbb{E}\_{\text{dam}}\right]\|) &= A + B \log(\text{BV/TV}) + C \log(m\_1) \\ &+ D \log(m\_2) + E \log(\|\mathfrak{s}\_0\|) \end{aligned} \tag{38}$$

where m<sub>1</sub> and m<sub>2</sub> are the fabric eigenvalues corresponding to directions 1 and 2 (only shear and multi-axial load cases have two directions); A, B, C, D, and E are the constants in the regression. These regressions were performed separately for the following sets of load cases: uniaxial tension, uniaxial compression, combined uniaxial tension and uniaxial compression, shear, and multi-axial load cases in normal strain space.

TABLE 3 | Results from the multi-linear regressions between ‖[E<sub>0</sub> − E<sub>dam</sub>]‖ at 0.5% strain level, BV/TV, fabric eigenvalues, and macroscopic strain Frobenius norms, in log-log space.

*Regressions were performed for uniaxial tension (T), uniaxial compression (C), combined uniaxial tension and uniaxial compression (T* ∪ *C), shear (S), and multi-axial (MA) in normal strain space.*

The results from these regressions can be seen in **Table 3**. **Table 3** shows that both BV/TV and fabric eigenvalues have a significant effect (p ≤ 0.05), and that damage expressed as per Equation (38) increases with the micro-architectural indices, with the slopes for BV/TV being substantially larger than those for the fabric eigenvalues. The coefficients of determination (R<sup>2</sup>) show that only the multi-linear model of the multi-axial load cases in normal strain space behaves poorly in comparison to the rest.
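The log-log regression of Equation (38) amounts to an ordinary least-squares fit on log-transformed quantities. The sketch below is a minimal pure-Python version that solves the normal equations; it returns only the point estimates A to E (no p-values or R<sup>2</sup>), and all function names are hypothetical. For uniaxial load cases, where only one direction applies, the m<sub>2</sub> column would simply be dropped.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_loglog(damage_norm, bvtv, m1, m2, strain_norm):
    """Least-squares estimates [A, B, C, D, E] for the model of Equation (38)."""
    rows = [[1.0, math.log(b), math.log(f1), math.log(f2), math.log(s)]
            for b, f1, f2, s in zip(bvtv, m1, m2, strain_norm)]
    y = [math.log(d) for d in damage_norm]
    k = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear(xtx, xty)
```

Since the fit is linear in log space, a slope such as B is the exponent of a power law in the original variables, which is why slopes for BV/TV and for the fabric eigenvalues can be compared directly.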

The component-wise fraction between the matrix projection of E<sub>0</sub> − E<sub>dam</sub> at 0.5% strain level and the matrix projection of E<sub>0</sub> (i.e., the i-th and j-th component of E<sub>0</sub> − E<sub>dam</sub> is divided by the i-th and j-th component of E<sub>0</sub>) leads to the 6 × 6 matrix with components

$$[\mathbf{D}]\_{ij} = \frac{[\mathbb{E}\_0 - \mathbb{E}\_{\text{dam}}]\_{ij}}{[\mathbb{E}\_0]\_{ij}}.\tag{39}$$

This matrix gives, for each sample and load case, the component-wise reduction of the stiffness coefficients relative to their undamaged values. The component-wise mean of [**D**]<sub>ij</sub> over all the considered samples was calculated and then normalized from 0 to 1 for each of the considered load cases, forming another 6 × 6 matrix (i.e., the i-th and j-th component of the new matrix is the mean of the [**D**]<sub>ij</sub> components of all the samples); the components of E<sub>0</sub> which are zero, i.e., the non-orthotropic coefficients, are ignored and not considered in the normalization. The resulting 21 normalized matrices are shown in **Figure 4**. These plots suggest that macroscopic damage in trabecular bone is anisotropic and dependent on the considered load case. In uniaxial tensile and compressive load cases, the normal components of the stiffness tensor which are related to the considered load case are the most affected ones (e.g., in the load case ε<sub>11</sub> > 0, components E<sub>1111</sub>, E<sub>1122</sub>, E<sub>1133</sub>, and the corresponding symmetric counterparts are more affected than the rest). In shear load cases, the corresponding shear component is the most affected one. In multi-axial load cases in normal strain space, the picture depends on the signs of the applied strains. In tension-tension and compression-compression load cases, the most affected components are the off-diagonal components related to the plane which is being loaded (e.g., in the load case ε<sub>11</sub> = ε<sub>22</sub> > 0, components E<sub>1122</sub> and E<sub>2211</sub> are more affected than the rest). In tension-compression/compression-tension load cases, the most affected components are the diagonal components related to the plane which is being loaded (e.g., in the load case ε<sub>11</sub> = −ε<sub>22</sub> > 0, components E<sub>1111</sub> and E<sub>2222</sub> are more affected than the rest).
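The post-processing described above (Equation 39, followed by averaging over samples and normalization) can be sketched as follows. This is a minimal illustration with hypothetical function names; ignored components are marked with `None`, and the normalization "from 0 to 1" is assumed here to be a min-max rescaling over the retained components, which the text does not state explicitly.

```python
def damage_ratio(e0, edam, tol=1e-12):
    """Component-wise (E0 - Edam)_ij / (E0)_ij; None where (E0)_ij is zero."""
    return [[(a - b) / a if abs(a) > tol else None for a, b in zip(r0, rd)]
            for r0, rd in zip(e0, edam)]

def mean_ratio(ratios):
    """Component-wise mean over samples, skipping ignored (None) entries."""
    n_rows, n_cols = len(ratios[0]), len(ratios[0][0])
    out = []
    for i in range(n_rows):
        row = []
        for j in range(n_cols):
            vals = [d[i][j] for d in ratios if d[i][j] is not None]
            row.append(sum(vals) / len(vals) if vals else None)
        out.append(row)
    return out

def normalise_01(matrix):
    """Min-max rescaling to [0, 1] over retained entries (assumed convention)."""
    vals = [v for row in matrix for v in row if v is not None]
    lo, hi = min(vals), max(vals)
    span = hi - lo
    return [[None if v is None else ((v - lo) / span if span > 0 else 0.0)
             for v in row] for row in matrix]
```

For one load case, `normalise_01(mean_ratio([damage_ratio(e0, ed) for e0, ed in samples]))` would give one normalized matrix of the kind plotted per load case, where `samples` is a hypothetical list of (undamaged, damaged) matrix pairs.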

### 4.2. Effect of BV/TV and MIL Fabric Tensor on the Damage Behavior

The effect of BV/TV and fabric on the macroscopic damage behavior of trabecular bone was assessed by (1) considering the single scalar isotropic damage model in section 3.2.3.1 with and without the effect of BV/TV and then comparing the respective values of SEE; and (2) considering the anisotropic damage model in section 3.2.3.2 with and without the effects of BV/TV and fabric eigenvalues and then comparing the respective values of SEE. In the anisotropic scenario, in the case in which fabric eigenvalues were not included, the order of fabric eigenvalues was randomized to maximize the effect of including fabric in the comparison (the ordering no longer corresponds to m<sub>1</sub> > m<sub>2</sub> > m<sub>3</sub>; the corresponding stiffness and strain tensors were reordered accordingly). The minimization scheme was run five times to ensure that a suboptimal solution was not chosen. This comparison is shown in **Table 4**.

Note that the values of SEE of the anisotropic cases are not considerably lower than those of the isotropic cases. This is because even if the damage is higher in the components related to the considered load case, all the components of the stiffness tensor are damaged, and ‖[E<sub>0</sub> − E<sub>dam</sub>]‖ takes into account the reduction of all the components of the stiffness tensor. The exponents that express BV/TV dependency are considerably larger than those expressing fabric eigenvalue dependency.

### 4.3. Macroscopic Damage Model for Trabecular Bone

A damage model which incorporates both isotropic/anisotropic damage progression and tension/compression asymmetry was implemented and its efficacy in evaluating the macroscopic damage behavior of trabecular bone was assessed. BV/TV and fabric eigenvalue dependencies were considered; BV/TV dependency was included in the isotropic part of the model while both BV/TV and fabric eigenvalue dependencies were included in the anisotropic part. Tension/compression asymmetry was included as shown in section 3.2.3.3. The SEE and the value of the parameters of the model are shown in **Table 5**.

This model reduces the SEE by more than 15% with respect to the single scalar isotropic model (SEE = 37.03%). Despite the 13 parameters, a considerably larger number in comparison with the two parameters of the isotropic model, the values of some of these parameters suggest that not all of them need to be considered. For instance, the value of B<sub>0,aniso</sub> is very small, which means that this parameter, together with the corresponding exponents expressing BV/TV and fabric eigenvalue dependencies (u and r), could be ignored, reducing the number of parameters to 10. It is important to point out the negative values of η and B<sub>0,iso</sub>.

uniaxial tension, (D–F) correspond to uniaxial compression, (G–I) correspond to shear, and (J–U) correspond to multi-axial in normal strain space.

### 5. DISCUSSION

The macroscopic damage behavior of trabecular bone has been researched in a few studies, but these are usually restricted to uniaxial load scenarios which only permit the assessment of stiffness reduction in the direction of loading (Keaveny et al., 1994b; Zioupos et al., 2008; Garcia et al., 2009). Consequently, these studies are unable to provide a comprehensive constitutive model that can be included in whole-bone simulations. This study investigated the possible relationship between damage at the tissue level and the macroscopic multi-axial damage behavior, by employing a homogenization-based multiscale approach to samples with a relatively wide range of BV/TV and fabric tensor eigenvalues, subjected to multiple loading scenarios. The macroscopic damage behavior of trabecular bone was approximated via different continuum damage models: isotropic and anisotropic; with and without BV/TV and fabric eigenvalue dependencies; and with and without tension/compression asymmetry. From the results, it can be concluded that the macroscopic damage behavior of trabecular bone has the following features: BV/TV and fabric eigenvalue dependencies; tension/compression asymmetry; and a combined isotropic/anisotropic behavior. The first two of these features are not unexpected, as they play a key role in the evaluation of elastic stiffness (Odgaard et al., 1997; Zysset, 2003); however, the last, previously unexplored feature indicates that damage in trabecular bone is best represented by using both isotropic and anisotropic damage variables. This is likely to be true for most cellular materials.

TABLE 4 | SEEs, BV/TV, and fabric eigenvalue exponents for the isotropic and anisotropic models.

*Models 1 and 2 are isotropic with and without BV/TV dependency, respectively; models 1, 2, and 3 are anisotropic and: (1) without BV/TV and fabric eigenvalue dependencies, (2) with BV/TV dependency only, and (3) with fabric eigenvalue dependency only.*

TABLE 5 | Value of the parameters and SEE of the combined isotropic/anisotropic model with tension/compression asymmetry.

This study assumed an isotropic model with coupled damage and plastic behavior at the tissue level, which was deemed appropriate as the isotropy assumption at this level is known to result in little to no error in macroscopic results (Cowin, 1997). Isotropic damage at the solid phase level leads to an anisotropic macroscopic damage response with a dependency on the considered load case (Levrero-Florencio et al., 2017a). The variation in the components of the stiffness tensor shows anisotropic damage which depends on the considered load case (**Figure 4**). Shi et al. (2010) suggested that there is a larger proportion of damaged tissue in the longitudinal trabeculae (direction of loading) for uniaxial load cases, which is in agreement with the results presented here, as the most damaged components of the macroscopic stiffness tensor are always the on-axis components. An issue which may make validation of these results very challenging is the use of kinematic uniform boundary conditions; these boundary conditions are extremely difficult, not to say impossible, to reproduce experimentally, especially for the more complex load cases. Most previous studies involving damage in trabecular bone have used isotropic models (Garcia et al., 2009; Schwiedrzik and Zysset, 2013), which may be acceptable for proportional loading scenarios, but not for changing loads or cyclic loading scenarios, such as those arising during physiological activities.

The results show that the macroscopic strain Frobenius norms were considerably larger in macroscopic compression than in macroscopic tension. This is important in the considered context of damage modeling as the thermodynamic stress-like variables governing damage evolution (Y<sup>k</sup> ) directly depend on the macroscopic strain values, which could explain the higher damage values in compression without the explicit need of modeling tension/compression asymmetry. However, this asymmetry is taken into account because it still leads to a better fit of the damage model and it only consists of one additional parameter. The fact that damage values are higher in compression-dominated load cases compared to tension load cases could be related to the more heterogeneous stress distributions at the solid phase level occurring during macroscopic compression, which includes tensile stresses at the tissue level due to bending and buckling of trabeculae (Stölken and Kinney, 2003). Another important factor to take into account is that the considered model at the tissue level is ductile (i.e., fracture is not incorporated). If fracture was considered at a critical damage threshold, the tension/compression asymmetry would probably be different as tissue damage is more diffused in compression than in tension (Lambers et al., 2014), and therefore a significant decrease of load carrying capacity would occur in tension.

Multi-linear regressions between ‖[E<sub>0</sub> − E<sub>dam</sub>]‖, BV/TV, fabric eigenvalues and macroscopic strain Frobenius norms (**Table 3**) show that both BV/TV and fabric eigenvalues are statistically significant. The coefficients of determination suggest that only the regression of ‖[E<sub>0</sub> − E<sub>dam</sub>]‖ for the multi-axial load cases in normal strain space behaved poorly in comparison to the others. The slopes of BV/TV are substantially higher than those of fabric eigenvalues, suggesting that BV/TV plays a more important role in these regressions; they also suggest that the higher the BV/TV and fabric eigenvalues, the higher the damage is. Results in Levrero-Florencio et al. (2017a) showed that the damage in the orthotropic coefficients of the macroscopic stiffness tensors does not have significant dependencies on BV/TV or fabric, for each of the considered load cases. In this study the Frobenius norm ‖[E<sub>0</sub> − E<sub>dam</sub>]‖ is used instead, which takes into account the damage of all the components of the macroscopic stiffness tensor. Therefore, the slopes and p-values in **Table 3** suggest that lower BV/TV samples have a more anisotropic damage behavior, in the sense that the longitudinal trabeculae are more damaged than the oblique ones, and that higher BV/TV samples have a more isotropic behavior, or are more damaged in general. Even if fabric eigenvalues have a significant effect on ‖[E<sub>0</sub> − E<sub>dam</sub>]‖, the considerably lower slopes suggest that their relevance is much lower than that of BV/TV.

The standard errors of the estimate (SEE) and the exponents with respect to BV/TV and fabric eigenvalues of five different damage models indicate that the SEEs are not substantially different across the considered models. This is because, despite the anisotropic damage behavior, all the components of the stiffness tensor are damaged (Levrero-Florencio et al., 2017a), suggesting that while a combined isotropic and anisotropic model is most suitable for simulating the macroscopic damage behavior of trabecular bone, an isotropic model is not necessarily poor. The SEEs of the models with dependencies are not substantially lower than those without the dependencies, suggesting that the considered BV/TV and fabric eigenvalue dependencies may not be needed. Nonetheless, the results of the multi-linear regressions (**Table 3**) show significance of BV/TV and fabric eigenvalues when modeling damage. Furthermore, since these five assessed damage formulations only partially model some of the features of the macroscopic damage behavior of trabecular bone mentioned earlier, the dependencies are maintained in the combined isotropic/anisotropic model with tension/compression asymmetry.

It is apparent that the model with a combined isotropic/anisotropic behavior and tension/compression asymmetry is a substantial improvement over the single scalar damage formulation since the SEE is reduced by more than 15% (**Table 5**). Nonetheless, it is important to mention that this model has 13 parameters instead of 2, though the value of the parameter B<sub>0,aniso</sub> indicates that this parameter and the associated exponents expressing BV/TV and fabric eigenvalue dependencies can be ignored. The negative value of η suggests that if tension-dominated cases had similar strains to those in compression-dominated cases, the damage values would be higher in tension, as a negative value of η implies crack-closure, which is expected as bone could be considered a quasi-brittle material (Hambli, 2013; Mayya et al., 2016). The negative value of B<sub>0,iso</sub> suggests that, when modeling the damage progression with a linear model, there is an initial presence of damage, which has been previously observed in Levrero-Florencio et al. (2017a) (the intercepts of the y-axis of the damage-accumulated plastic strain plots are not zero).

This study has a number of limitations. As previously mentioned, bone at the solid phase level is assumed to be ductile, i.e., while reduction in stiffness due to damage is included, fracture is not. This is perhaps appropriate for the considered level of loading, but it is indeed not applicable if large strains are applied, as complete fracture of trabeculae can occur. Nawathe et al. (2013) shows that ductile tissue behavior overestimates the experimental yield properties. Another limitation, previously stated in Levrero-Florencio et al. (2017a), is that although there is plenty of experimental data on uniaxial load cases (Keaveny et al., 1997; Bayraktar and Keaveny, 2004; Sanyal et al., 2012; Manda et al., 2016), these physical experiments do not allow evaluation of stiffness for samples subjected to different load cases and the effect of loading in one direction on the behavior in the others. Therefore, a study completely based on numerical simulations is the only alternative even though the results cannot be currently validated experimentally. The use of kinematic uniform boundary conditions in the µFE analyses could also be considered a limitation, as they are known for providing an upper bound of the stiffness tensor (Pahr and Zysset, 2008; Wang et al., 2009) or macroscopic yield (Panyasantisuk et al., 2015), and may also affect the damage morphology when compared to the in situ case (Daszkiewicz et al., 2017). We also assume that the orthotropic directions do not rotate during loading, which may be a valid assumption for the considered range of strains.

Use of a large number of load cases (21) and samples (10) shows that the evolution of the damaged macroscopic stiffness tensor is based on the loading history. By examining relationships between bone microstructural indices (such as BV/TV and fabric) with macroscopic damage constitutive laws, we show that the proposed combined isotropic/anisotropic damage law with tension/compression asymmetry is a viable superior alternative to the widely used single scalar isotropic damage formulation as it reduces the fitting error from 37 to 22%; it does, however, require specification of a larger number of material parameters. The relationships of damage progression with bone's microarchitectural indices (density and fabric) developed in this study provide an approach for the creation of macroscale continuum models that incorporate damage and will, therefore, improve clinical predictions of the behavior of bone and bone-implant systems.

### AUTHOR CONTRIBUTIONS

FL-F designed the study, performed the FE simulations and parameter fittings, and analyzed the data; PP contributed to the design of the study. Both authors contributed to the critical writing and revision of the manuscript.

#### FUNDING

This work was supported by funding from the Engineering and Physical Sciences Research Council grant EP/K036939/1. The authors gratefully acknowledge ARCHER, UK National Supercomputing Service, for access to their Cray XC30 supercomputer under the project "Modelling the nonlinear micromechanical behaviour of bone".

#### ACKNOWLEDGMENTS

We gratefully acknowledge Dr. Lee Margetts, from The University of Manchester, for his assistance with the implementation and development of the used version of ParaFEM.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Levrero-Florencio and Pankaj. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Left Ventricular Trabeculations Decrease the Wall Shear Stress and Increase the Intra-Ventricular Pressure Drop in CFD Simulations

Federica Sacco<sup>1,2</sup>\*, Bruno Paun<sup>2</sup>, Oriol Lehmkuhl<sup>1</sup>, Tinen L. Iles<sup>3</sup>, Paul A. Iaizzo<sup>3</sup>, Guillaume Houzeaux<sup>1</sup>, Mariano Vázquez<sup>1,4</sup>, Constantine Butakoff<sup>2</sup> and Jazmin Aguado-Sierra<sup>1</sup>\*

<sup>1</sup> Barcelona Supercomputing Center (BSC), Barcelona, Spain, <sup>2</sup> PhySense, ETIC, Universitat Pompeu Fabra, Barcelona, Spain, <sup>3</sup> Visible Heart Laboratory, Department of Surgery, University of Minnesota, Minneapolis, MN, United States, <sup>4</sup> IIIA - CSIC, Bellaterra, Spain

#### Edited by:

Timothy W. Secomb, University of Arizona, United States

#### Reviewed by:

Patrick Segers, Ghent University, Belgium; Gernot Plank, Medizinische Universität Graz, Austria

#### \*Correspondence:

Federica Sacco, federica.sacco@bsc.es; Jazmin Aguado-Sierra, jazmin.aguado@bsc.es

#### Specialty section:

This article was submitted to Computational Physiology and Medicine, a section of the journal Frontiers in Physiology

Received: 27 November 2017 Accepted: 13 April 2018 Published: 30 April 2018

#### Citation:

Sacco F, Paun B, Lehmkuhl O, Iles TL, Iaizzo PA, Houzeaux G, Vázquez M, Butakoff C and Aguado-Sierra J (2018) Left Ventricular Trabeculations Decrease the Wall Shear Stress and Increase the Intra-Ventricular Pressure Drop in CFD Simulations. Front. Physiol. 9:458. doi: 10.3389/fphys.2018.00458

The aim of the present study is to characterize the hemodynamics of left ventricular (LV) geometries to examine the impact of trabeculae and papillary muscles (PMs) on blood flow using high performance computing (HPC). Five pairs of detailed and smoothed LV endocardium models were reconstructed from high-resolution magnetic resonance images (MRI) of ex-vivo human hearts. The detailed model of one LV pair is characterized only by the PMs and a few large trabeculae, to represent the state-of-the-art level of endocardial detail. The other four detailed models instead include endocardial structures measuring ≥ 1 mm<sup>2</sup> in cross-sectional area. The geometrical characterizations were done using computational fluid dynamics (CFD) simulations with rigid walls and both constant and transient flow inputs on the detailed and smoothed models for comparison. These simulations do not represent a clinical or physiological scenario, but a characterization of the interaction of endocardial structures with blood flow. Steady flow simulations were employed to quantify the pressure drop between the inlet and the outlet of the LVs and the wall shear stress (WSS). Coherent structures were analyzed using the Q-criterion for both constant and transient flow inputs. Our results show that trabeculae and PMs increase the intra-ventricular pressure drop, reduce the WSS and disrupt the dominant single vortex usually present in the smoothed-endocardium models, generating secondary small vortices. Given that obtaining high-resolution anatomical detail is challenging in-vivo, we propose that the effect of trabeculations can be incorporated into smoothed ventricular geometries by adding a porous layer along the LV endocardial wall. Results show that a porous layer with a thickness of 1.2 · 10<sup>−2</sup> m and a porosity of 20 kg/m<sup>2</sup> on the smoothed-endocardium ventricle models approximates the pressure drops, vorticities and WSS observed in the detailed models.

Keywords: trabeculae, papillary muscles, left ventricular modeling, left ventricular hemodynamics, porosity

### 1. INTRODUCTION

Computational cardiac modeling has become important as a non-invasive modality to study the overall cardiac function (Trayanova, 2011; Taylor et al., 2013). Recently, regulatory bodies have been encouraging and supporting the use of in-silico modeling to reduce animal experimentation. Within this context, models of cardiac hemodynamics have yet to be improved. The majority of the hemodynamic cardiac computational simulations consider simplified geometries with smoothed endocardial surfaces (Doost et al., 2015; Khalafvand et al., 2015; Imanparast et al., 2016), mostly due to a lack of high-resolution, fast and safe in-vivo imaging techniques. It is also true that solving highly detailed models requires computationally expensive simulations that can only be carried out using HPC.

In reality, the heart anatomy is complex and all individuals have their own unique anatomies. The interior of the cardiac chambers is not smooth: it is populated by PMs, trabeculae of different sizes and false tendons (Gao et al., 2014). In the LV, PMs are the muscles responsible for properly positioning the chordae tendineae during systole to optimize mitral valve leaflet coaptation. Trabeculae are complex muscular structures, unique to a given human heart and mostly consisting of myocytes, that protrude from the endocardial wall into the interior of the ventricle and present a sponge-like structure. The primary role of the trabeculae in the overall cardiac function remains unknown, but they are often associated with the Purkinje network.

State-of-the-art LV CFD simulations employing detailed endocardial structure models have been created from either MRI or computed tomography (CT) in-vivo modalities (Chnafa et al., 2016; Lantz et al., 2016; Vedula et al., 2016), but they only incorporated PMs and a few large trabeculae. Chnafa et al. (2016) used 4D MR images to reconstruct the LV geometry, characterized only by PMs, and prescribed physiological deformations using numerical treatments. In this way the authors could study blood flow instabilities within the ventricular cavity and found that high-frequency flow fluctuations can be common in normal LVs. Both Vedula et al. (2016) and Lantz et al. (2016) also added a few large trabeculae together with the PMs in their LV geometries and studied the impact of these endocardial structures on the blood flow by comparing simulation results with smoothed-endocardium ventricles. Vedula et al. (2016) reconstructed the LV geometry from high-resolution 4D CT scans and applied a prescribed mesh deformation based on the immersed boundary method. The authors observed a "scrambling" of blood flow vortices produced by PMs and trabeculae, which caused the generation of deeper and more complex vortices that were not present in the smoothed model. In this way, trabeculations help diastolic filling and, during systole, they help ventricular washout by wringing the blood out from the apex. Lantz et al. (2016) extracted the LV endocardial surface and wall motion over time from 4D CT data. In contrast with Vedula et al. (2016), Lantz et al. (2016) did not observe any deep penetration of the mitral inflow jet toward the apex: the jet strongly interacted with the PMs and was diverted toward the outflow tract. However, both papers demonstrated that the detailed anatomy of the LV endocardium has an important influence on blood flow dynamics: in particular, particle tracking used by Lantz et al. (2016) demonstrated that blood flow interacted with trabeculae and PMs, creating vortices around the endocardial spaces between the trabeculations. More vortices appeared during diastole in the detailed LV as compared to the smoothed one, and PMs redirected blood flow and generated a large vortex, which was not present in the smoothed model. Finally, it was shown that the presence of trabeculations created a region where the flow appeared to be stagnant during five cardiac cycles, which is impossible to reproduce with smoothed endocardium models. While Vedula et al. (2016) and Lantz et al. (2016) considered the PMs and a few trabeculae, the level of detail and the amount of trabecular structures in their LV geometry reconstructions was not as high as in Kulp et al. (2011), who segmented detailed endocardial structures from high-resolution 4D CT data. The latter authors studied the interaction between trabeculations and the blood flow by deforming the initial 3D mesh in each following frame. Results showed how the complex endocardial surface caused the blood to move through the empty spaces between the trabeculations and fill these cavities during diastole.

In this paper we used highly detailed anatomical LV endocardium models to characterize the effects of trabeculae and PMs on blood flow using CFD simulations. Four detailed LV geometries were reconstructed from high-resolution imaging data of perfusion-fixed human hearts (2 male and 2 female), which were obtained at the Visible Heart® Laboratory (Atlas of Human Cardiac Anatomy, RRID:SCR\_015734). Detailed and smoothed endocardial models were reconstructed for each of the four hearts to quantify the differences between these two cases and thus characterize the impacts of PMs and trabeculae on ventricular hemodynamics. The level of detail in these reconstructions of the endocardial structures was, to the best of our knowledge, the highest ever achieved for this kind of study: the average size of the smallest structures reconstructed measures about 1 mm<sup>2</sup> in cross-sectional area.

A fifth (male) LV geometry, named control LV, was reconstructed from the same high-resolution image dataset of human hearts, together with its smoothed equivalent. This model was characterized only by PMs and a few large trabeculae, so that we could compare simulation results obtained from the highly detailed models described previously with those from an LV geometry similar in detail to those present in the literature (Kulp et al., 2011; Chnafa et al., 2016; Lantz et al., 2016; Vedula et al., 2016).

Through CFD simulations we aim to characterize the hemodynamics inside detailed vs. smoothed human ventricular anatomies by quantifying the trabecular volume, intra-ventricular pressure drop, WSS and vorticity within the LV cavities. Furthermore, we propose that a porous layer can be added to the LV endocardium to compensate for the absence of trabeculae within smoothed ventricular hemodynamic models. Our main findings show that the presence of trabeculae significantly alters the blood flow by increasing the intra-ventricular pressure drop, reducing the shear stress at the ventricular walls and generating multiple secondary vortices that are absent in smooth-walled ventricle simulations. Furthermore, our results show that a porous layer can indeed compensate for the absence of trabeculations inside simplified ventricular models by increasing the intra-ventricular pressure drop, by reducing the wall shear stress at the interface of the porous layer and by increasing the amount of vortex structures within the LV.

**Abbreviations:** LV, Left Ventricle; PMs, Papillary Muscles; HPC, High Performance Computing; MRI, Magnetic Resonance Imaging; CFD, Computational Fluid Dynamics; WSS, Wall Shear Stress; CT, Computed Tomography; BSC, Barcelona Supercomputing Center; PBS, Phosphate Buffered Saline; FE, Finite Element; FSI, Fluid Structure Interaction.

#### 2. MATERIALS AND METHODS

#### 2.1. Left Ventricular Models

The five LV models used in this work were reconstructed from high-resolution MR images obtained from in vitro perfusion of fixed human hearts. The research uses of these heart specimens have received appropriate approval from both the University of Minnesota's Institutional Review Board and the LifeSource Research Committee (Minnesota's non-profit procurement donation organization). The hearts were recovered from organ donors whose hearts were not viable for transplantation. Written informed consent was obtained from the donors' families, following the wishes of the donors. The database is open to public access.

DICOM datasets were acquired using a 3T Siemens scanner with 0.44 × 0.44 mm in-plane resolution and a slice thickness of 1 to 1.7 mm. The hearts were fixed with 10% formalin in phosphate buffered saline (PBS) solution for at least 24 h under 40–50 mmHg of pressure and then stored in 10% formalin. The DICOM datasets of the five hearts are shown in **Figure 1**.

Image segmentation was carried out with Fiji software (Fiji, RRID:SCR\_002285), using the maximum entropy-based thresholding algorithm (Qi, 2014), followed by endocardial surface reconstruction using the marching cubes algorithm in Seg3D (Seg3D, RRID:SCR\_002552). The relatively high contrast of the images guaranteed that the thresholding produced detailed endocardial models. The smoothed models were generated from the detailed geometries by manually deleting trabeculae and PMs and closing holes on the associated surfaces using ReMESH software (ReMESH, RRID:SCR\_015735). Autodesk Meshmixer (Autodesk Meshmixer, RRID:SCR\_015736) sculpting software was then used to adjust the smoothed endocardial surface so as to maintain the same outline for both the smoothed and detailed geometries.

The control LV, as a representative of a state of the art anatomical model, was reconstructed using a regularized region growing algorithm of ITK-SNAP (ITK-SNAP Medical Image Segmentation Tool, RRID:SCR\_002010) to get only large scale anatomical detail from the images. The algorithm allowed controlling the smoothness of the extracted contour making it easier to obtain the smoothed surface with just the PMs and a few large trabeculae (approximately 5 mm<sup>2</sup> in cross-sectional area). The obtained level of detail for the control LV was similar to the reported models used in recent publications on blood flow analysis in LV such as Vedula et al. (2016) and Chnafa et al. (2016).

In order to let the flow develop, a 50 mm long tube was attached at the inlet (corresponding to the mitral valve orifice) and a 70 mm tube at the outlet (corresponding to the aortic valve orifice). Each tube base matched exactly the corresponding valvular ring plane. Tubes were created for every given LV model using ParaView (ParaView, RRID:SCR\_002516).

The resulting surface meshes were uniformly remeshed using ReMESH and then volumetric tetrahedral meshes were generated using an isosurface-stuffing-based algorithm (Labelle and Shewchuk, 2007) with an in-house mesher developed at the Barcelona Supercomputing Center (BSC). The volumetric meshes had adaptive element size, with volumes varying from 10<sup>−7</sup> mm<sup>3</sup> to 1.9 · 10<sup>−2</sup> mm<sup>3</sup>, with an average size of 5.7 · 10<sup>−5</sup> mm<sup>3</sup>. Wireframe zoomed images of the tetrahedral meshes can be found in Figure S1.

The five LV anatomies, both smoothed and detailed, are shown in **Figure 2**. A more detailed view is reported in Figure S2. The medical histories and related information can be found in Table 1 of the Supplementary Materials. The ventricular volume of each mesh is reported in Table 2 of the Supplementary Materials.

#### 2.2. Hemodynamic Simulations

To carry out the CFD simulations, the walls were defined as rigid, with no-slip boundary conditions. For the outlet, a stabilizing boundary condition employed a baseline pressure of 10.7 kPa (80 mmHg, a normal end-diastolic arterial pressure) plus an outflow resistance (Bazilevs et al., 2009). Blood viscosity was set to 0.0035 kg/(m · s) and density to 1,060 kg/m<sup>3</sup>.
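As a quick consistency check of the values quoted above, the following minimal sketch converts the outlet baseline pressure from mmHg to kPa and derives the kinematic viscosity ν = μ/ρ that appears in Equation (1). The function and variable names are illustrative only and are not part of the Alya solver.

```python
# Unit checks for the boundary-condition and fluid parameters quoted in the text.
MMHG_TO_PA = 133.322  # pascals per millimetre of mercury (standard conversion)


def mmhg_to_kpa(p_mmhg: float) -> float:
    """Convert a pressure from mmHg to kPa."""
    return p_mmhg * MMHG_TO_PA / 1000.0


def kinematic_viscosity(mu: float, rho: float) -> float:
    """Kinematic viscosity nu = mu / rho, in m^2/s for SI inputs."""
    return mu / rho


outlet_pressure_kpa = mmhg_to_kpa(80.0)   # baseline outlet pressure from the text
nu = kinematic_viscosity(0.0035, 1060.0)  # blood viscosity and density from the text
print(f"{outlet_pressure_kpa:.1f} kPa, nu = {nu:.2e} m^2/s")
```

The converted pressure rounds to the 10.7 kPa stated in the text, and the kinematic viscosity is about 3.3 · 10<sup>−6</sup> m<sup>2</sup>/s.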

Hemodynamic simulations solving continuity and Navier-Stokes equations for incompressible flows were run on the MareNostrum 4 supercomputer (MareNostrum, RRID:SCR\_015737) and on Archer (ARCHER, RRID:SCR\_015854), UK supercomputer, using Alya, the BSC's in-house, parallel multi-physics, HPC solver (Houzeaux et al., 2009; Vazquez et al., 2016).

Simulations were carried out using a low-dissipation finite element (FE) strategy described below. Solving the Navier-Stokes equations for a fluid domain Ω bounded by Γ = ∂Ω within the time interval (t<sub>0</sub>, t<sub>f</sub>) consists of finding a velocity **u** and a kinematic pressure p such that Equations (1, 2) are satisfied, where ν is the kinematic viscosity, **f** is the vector of external body forces and **S**(**u**) is the rate-of-strain tensor.

$$
\partial_t \mathbf{u} + (\mathbf{u} \cdot \nabla) \mathbf{u} - 2\nu \nabla \cdot \mathbf{S}(\mathbf{u}) + \nabla p - \mathbf{f} = \mathbf{0} \qquad \text{in } \Omega \times (t_0, t_f) \tag{1}
$$

$$\nabla \cdot \mathbf{u} = 0 \qquad \text{in } \Omega \times (t_0, t_f) \tag{2}$$

To obtain the weak (variational) formulation of the Navier-Stokes Equations (1, 2), we introduced the spaces of vector functions **V**<sub>D</sub> = **H**<sup>1</sup><sub>D</sub>(Ω), **V**<sub>0</sub> = **H**<sup>1</sup><sub>0</sub>(Ω) and Q = L<sup>2</sup>(Ω)/ℝ. L<sup>2</sup>(Ω) is the space of square-integrable functions, H<sup>1</sup>(Ω) is the subspace of L<sup>2</sup>(Ω) formed by functions whose derivatives also belong to L<sup>2</sup>(Ω), H<sup>1</sup><sub>D</sub>(Ω) is the subspace of H<sup>1</sup>(Ω) whose functions satisfy the Dirichlet boundary conditions on Γ, H<sup>1</sup><sub>0</sub>(Ω) is the subspace of H<sup>1</sup>(Ω) whose functions are zero on Γ, and **H**<sup>1</sup><sub>D</sub>(Ω) and **H**<sup>1</sup><sub>0</sub>(Ω) are their vector counterparts in a two- or three-dimensional space. (·, ·) denotes the standard L<sup>2</sup> inner product. For the evolutionary case, **V**<sub>t</sub> ≡ L<sup>2</sup>(t<sub>0</sub>, t<sub>f</sub>; **V**<sub>D</sub>) and Q<sub>t</sub> ≡ D′(t<sub>0</sub>, t<sub>f</sub>; Q) were introduced, where L<sup>p</sup>(t<sub>0</sub>, t<sub>f</sub>; X) is the space of time-dependent functions in a normed space X such that ∫<sub>t<sub>0</sub></sub><sup>t<sub>f</sub></sup> ‖f‖<sup>p</sup><sub>X</sub> dt < ∞, 1 ≤ p < ∞, and Q<sub>t</sub> consists of mappings whose Q-norm is a distribution in time. The weak form of problem (Equations 1, 2) with the boundary conditions is then: find **u** ∈ **V**<sub>t</sub>, p ∈ Q<sub>t</sub> such that Equation (3) is satisfied for every (**v**, q) ∈ **V**<sub>0</sub> × Q.

$$\begin{aligned} \left(\partial_t \mathbf{u}, \mathbf{v}\right) &+ \left(\mathbf{u} \cdot \nabla \mathbf{u}, \mathbf{v}\right) + 2\nu \left(\mathbf{S}(\mathbf{u}), \nabla \mathbf{v}\right) - \left(p, \nabla \cdot \mathbf{v}\right) + \left(q, \nabla \cdot \mathbf{u}\right) \\ &- \left(\mathbf{f}, \mathbf{v}\right) = 0. \end{aligned} \tag{3}$$

Moreover, in the previous equations the convective form of the non-linear term, reported in Equation (4), was used, which is the most frequent choice in computational practice. Using Equation (2), other forms of the non-linear term can be derived, which are identical at the continuous level but have different properties at the discrete level. In Equation (5) we consider the energy, momentum and angular momentum conserving (EMAC) form recently proposed in Charnyi et al. (2017). A non-incremental fractional-step method was used for pressure stabilization. This allows the use of finite element pairs that do not satisfy the inf-sup condition, such as the equal-order interpolation for velocity and pressure used in this work. An energy-conserving explicit Runge-Kutta method recently proposed by Capuano et al. (2017), along with an eigenvalue-based time-step estimator (Trias and Lehmkuhl, 2011), was used to integrate the set of equations in time. This methodology, recently proposed by Lehmkuhl et al. (2017), follows the principles of Verstappen and Veldman (2003), generalized for unstructured finite volumes by Jofre et al. (2013) and Trias et al. (2014), but in a FEM framework. The presented methodology has been successfully validated and benchmarked against other popular CFD approaches and experimental data for bioengineering flows in Koullapis et al. (2017).

$$NL_{conv}\left(\mathbf{u}\right) = \mathbf{u} \cdot \nabla \mathbf{u} \tag{4}$$

$$NL_{emac}\left(\mathbf{u}\right) = 2\mathbf{S}\left(\mathbf{u}\right)\mathbf{u} + \left(\nabla \cdot \mathbf{u}\right)\mathbf{u} \tag{5}$$
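The following short identity is not part of the original text. It is a sketch of why the forms in Equations (4) and (5) coincide at the continuous level, assuming the usual definition **S**(**u**) = ½(∇**u** + ∇**u**<sup>T</sup>) with (∇**u**)<sub>ij</sub> = ∂<sub>j</sub>u<sub>i</sub>:

$$
2\mathbf{S}(\mathbf{u})\mathbf{u} = (\nabla \mathbf{u})\mathbf{u} + (\nabla \mathbf{u})^{T}\mathbf{u} = \mathbf{u} \cdot \nabla \mathbf{u} + \tfrac{1}{2}\nabla |\mathbf{u}|^{2}
$$

$$
NL_{emac}(\mathbf{u}) = NL_{conv}(\mathbf{u}) + \tfrac{1}{2}\nabla |\mathbf{u}|^{2} + (\nabla \cdot \mathbf{u})\mathbf{u}
$$

When Equation (2) holds, the last term vanishes and the remaining gradient term can be absorbed into the pressure gradient, so that the pressure variable solved for with this form is expected to be the modified pressure p − ½|**u**|<sup>2</sup> rather than the kinematic pressure p. At the discrete level ∇ · **u** is only approximately zero, which is where the two forms start to differ.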

We performed the following simulations:


$$\nu\left(t\right) = \begin{cases} \frac{A_E}{2} \left(1 + \cos\left(\frac{2\pi\left(t - t_{\bar{p},E}\right)}{t_{1,E} - t_{0,E}}\right)\right) & t_{0,E} \le t \le t_{1,E} \\ 0 & t_{1,E} < t < t_{0,A} \\ \frac{A_A}{2} \left(1 + \cos\left(\frac{2\pi\left(t - t_{\bar{p},A}\right)}{t_{1,A} - t_{0,A}}\right)\right) & t_{0,A} \le t \le t_{1,A} \end{cases} \tag{6}$$
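The piecewise raised-cosine profile of Equation (6) can be sketched in a few lines. This is a minimal illustration, not the solver's implementation: the function name, the default choice of each peak time as the midpoint of its wave, and all numerical values are assumptions made here, since the source does not list the amplitudes or timings.

```python
import math


def ea_wave(t, a_e, t0_e, t1_e, a_a, t0_a, t1_a, tp_e=None, tp_a=None):
    """Evaluate the piecewise raised-cosine inflow of Equation (6) at time t."""
    # Assumption: each peak time defaults to the midpoint of its wave,
    # so that the wave starts and ends at zero.
    if tp_e is None:
        tp_e = 0.5 * (t0_e + t1_e)
    if tp_a is None:
        tp_a = 0.5 * (t0_a + t1_a)
    if t0_e <= t <= t1_e:  # E wave
        phase = 2.0 * math.pi * (t - tp_e) / (t1_e - t0_e)
        return 0.5 * a_e * (1.0 + math.cos(phase))
    if t0_a <= t <= t1_a:  # A wave
        phase = 2.0 * math.pi * (t - tp_a) / (t1_a - t0_a)
        return 0.5 * a_a * (1.0 + math.cos(phase))
    return 0.0  # diastasis between the waves (and any time outside both waves)


# Hypothetical amplitudes (m/s) and timings (s), chosen only for illustration.
PARAMS = dict(a_e=0.6, t0_e=0.0, t1_e=0.3, a_a=0.4, t0_a=0.5, t1_a=0.8)
samples = [ea_wave(t, **PARAMS) for t in (0.0, 0.15, 0.4, 0.65)]
```

With these placeholder parameters the profile is zero at the start of the E wave, reaches the amplitude A<sub>E</sub> at the E peak, is zero during diastasis and reaches A<sub>A</sub> at the A peak.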

$$
\mathbf{u} = -\frac{1}{\mu} k(\nabla p) \tag{7}
$$

#### 2.3. HPC Characteristics

HPC characteristics for both constant and transient inflow simulations in terms of cores, total simulation time and time step are reported in **Tables 1**, **2**. Every simulation, with both constant and transient inflow, was run up to 800 ms. Information on the scalability of the incompressible flow module within Alya multiphysics solver can be found in the works of Houzeaux et al. (2009) and Vazquez et al. (2016). The elements-per-core ratio that was used to run these hemodynamic simulations was about 25,000.

#### 2.4. Geometric Markers

The following geometrical markers were used:

Trabecular volume was calculated as a difference between the volume of the convex hull of the detailed-endocardium LV model and the volume of the model itself.

The angle between inlet and outlet was the angle between the vectors normal to the two valvular planes (mitral and aortic). An illustration can be seen in **Figure 5**.

The distance between inlet and outlet was the distance between the mitral and aortic valves centers. An illustration can be seen in **Figure 5**.
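The three geometric markers defined above reduce to simple vector operations once the valve-plane normals, the valve centers and the two volumes are available. The sketch below assumes those inputs have already been extracted from the meshes; the function names and the example coordinates are hypothetical.

```python
import math


def angle_between_deg(n1, n2):
    """Angle in degrees between two 3D vectors (e.g. the valve-plane normals)."""
    dot = sum(a * b for a, b in zip(n1, n2))
    norm1 = math.sqrt(sum(a * a for a in n1))
    norm2 = math.sqrt(sum(b * b for b in n2))
    # Clamp to [-1, 1] to guard against round-off before calling acos.
    cos_theta = max(-1.0, min(1.0, dot / (norm1 * norm2)))
    return math.degrees(math.acos(cos_theta))


def centre_distance(c1, c2):
    """Euclidean distance between the mitral and aortic valve centers."""
    return math.dist(c1, c2)


def trabecular_volume(hull_volume, model_volume):
    """Convex-hull volume of the detailed model minus the volume of the model itself."""
    return hull_volume - model_volume


# Hypothetical inputs, for illustration only.
angle = angle_between_deg((0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
distance = centre_distance((0.0, 0.0, 0.0), (3.0, 4.0, 0.0))
```

Computing the convex hull itself requires a geometry library and is left out of this sketch.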

#### 2.5. Hemodynamic Analysis

#### 2.5.1. Intra-ventricular Pressure Drop

The pressure distributions were analyzed within 15 mm long volumes of both inlet and outlet tubes. The sections were chosen right at the inlet of the mitral and at the outlet of the aortic valves. The histograms were normalized to unit area under the curve and the bin width was calculated using the Freedman-Diaconis rule. As the pressure distributions were non-Gaussian (**Figure 6**), the intra-ventricular pressure drop was calculated as the difference between the inlet and outlet pressure modes. The pressure difference was then averaged over the last ten time frames (approximately 50 ms of simulation), during steady flow. From these results, we calculated the intra-ventricular pressure drop difference (ΔP<sub>diff</sub>) as the difference between the detailed and smoothed pressure drops for every studied LV.
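The mode-based pressure drop described above can be sketched as follows. This is an illustrative reading of the procedure, not the authors' post-processing script: the bin width follows the Freedman-Diaconis rule h = 2 · IQR / n<sup>1/3</sup>, the mode is taken as the center of the most populated bin, and the quartile convention (Python's default "exclusive" method) is an assumption that may differ slightly from other tools.

```python
import statistics


def freedman_diaconis_width(values):
    """Bin width h = 2 * IQR / n^(1/3)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return 2.0 * (q3 - q1) / len(values) ** (1.0 / 3.0)


def histogram_mode(values):
    """Center of the most populated bin of a Freedman-Diaconis histogram."""
    width = freedman_diaconis_width(values)
    if width <= 0.0:  # degenerate case: zero interquartile range
        return statistics.median(values)
    lowest = min(values)
    counts = {}
    for v in values:
        idx = int((v - lowest) // width)
        counts[idx] = counts.get(idx, 0) + 1
    best = max(counts, key=counts.get)
    return lowest + (best + 0.5) * width


def pressure_drop(inlet_pressures, outlet_pressures):
    """Difference between the inlet and outlet pressure modes."""
    return histogram_mode(inlet_pressures) - histogram_mode(outlet_pressures)


# Hypothetical nodal pressures (kPa): the inlet sample is the outlet shifted by 1.5.
outlet = [0.0, 0.25, 0.5, 0.5, 0.5, 0.75, 1.0, 2.0]
inlet = [p + 1.5 for p in outlet]
drop = pressure_drop(inlet, outlet)
```

In the study this difference is additionally averaged over the last ten time frames; that step would be a plain mean of `pressure_drop` over the frames.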

#### 2.5.2. WSS on the Ventricular Walls

WSS histograms were computed for every LV cavity (**Figure 7**) and normalized using the Freedman-Diaconis rule to choose the width of the bins. Given that these distributions were upward skewed, the median was used to analyze the WSS of each model. The total magnitude range and the mode are also reported for each case. In the case of the porous layer simulations, the median and mode WSS were calculated on the interface between the porous layer and the blood flow.

#### 2.5.3. Vorticity

Coherent structures were analyzed in both steady and transient inflow simulations by applying the Q-criterion method (Hunt et al., 1988; Chakraborty et al., 2005). The applied thresholds for vortex visualization were 5,000 s<sup>−2</sup> and 1,000 s<sup>−2</sup> for steady and transient inflow simulations, respectively. Vortex quantification was done in ParaView, by integrating the contours of the vortices to estimate the total surface area.

TABLE 1 | Constant inflow simulations results and HPC information on both constant and transient inflow simulations.

ΔP<sub>diff</sub> = ΔP<sub>d</sub> − ΔP<sub>s</sub>, differences between detailed and smoothed pressure drops.

<sup>a</sup>In the case of the control LV we calculated the PMs volume, since this LV geometry was only characterized by PMs.

<sup>b</sup>C.I., Constant Inflow simulations. <sup>c</sup>T.I., Transient Inflow simulations.
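The Q-criterion used for vortex identification in Section 2.5.3 compares the rotation rate with the strain rate of the local velocity gradient: Q = ½(‖**Ω**‖<sup>2</sup> − ‖**S**‖<sup>2</sup>), where **S** and **Ω** are the symmetric and antisymmetric parts of ∇**u**. The sketch below evaluates it at a single point from a 3 × 3 velocity gradient given in s<sup>−1</sup>; the function names and the example tensors are illustrative and unrelated to the ParaView pipeline used in the study.

```python
def q_criterion(grad_u):
    """Q = 0.5 * (||Omega||_F^2 - ||S||_F^2) for a 3x3 velocity gradient in 1/s."""
    q = 0.0
    for i in range(3):
        for j in range(3):
            s_ij = 0.5 * (grad_u[i][j] + grad_u[j][i])  # rate-of-strain part
            w_ij = 0.5 * (grad_u[i][j] - grad_u[j][i])  # rotation part
            q += 0.5 * (w_ij * w_ij - s_ij * s_ij)
    return q


def is_vortex(grad_u, threshold):
    """True where Q exceeds the visualization threshold (in 1/s^2)."""
    return q_criterion(grad_u) > threshold


# Rigid rotation about z with angular velocity 100 rad/s: Q equals omega^2.
rotation = [[0.0, -100.0, 0.0], [100.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
# Simple shear: rotation and strain balance, so Q is zero.
shear = [[0.0, 50.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
q_rotation = q_criterion(rotation)
q_shear = q_criterion(shear)
```

With the steady-inflow threshold of 5,000 s<sup>−2</sup>, the rotating example is flagged as a vortex while the shear layer is not, which is the property that makes Q preferable to plain vorticity magnitude near walls.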

#### 3. RESULTS

#### 3.1. Constant Inflow

The results of constant inflow quantification for the five LVs are shown in **Table 1**. In all LVs, the intra-ventricular pressure drop increased in the detailed geometries by an average ΔP<sub>diff</sub> of 0.2 kPa, except for subject D, which exhibited the highest ΔP<sub>diff</sub> of 2.7 kPa. The detailed E LV also exhibited a pressure drop similar to models A, B, and C, despite the low percentage of trabecular volume within its geometry. The geometrical markers do not indicate a direct correlation with the pressure drop (see **Table 1**). The pressure drop in models A, B, C, and E correlated best with the distance between their inlet and outlet; however, when model D is included, any good correlation disappears. The pressure drop is not correlated with the Reynolds number at the inlet of each model either.

The magnitudes of the WSS for each case are shown in **Figure 7**. The WSS histograms are shown in **Figure 8**; they were cropped at 3 Pa for visualization purposes, but the maximum values are included in each plot. The median WSS decreased in the detailed geometries of all subjects except subject D, as shown in **Table 1**. Notice that in model D, in **Figure 7**, high WSS regions are markedly localized on the PMs and the outflow region. In model D the WSS median values remain relatively similar between the smoothed and detailed models, even though the mode values are significantly lower for the detailed geometry. It is important to point out that the peak WSS was higher in the detailed models than in the smoothed geometries, except for model E.

The vortical structures shown in **Figure 9** are thresholded at 5,000 s<sup>−2</sup> and color coded according to the velocity magnitude [m/s]. The smoothed geometries generated fewer and larger vortices, while in the detailed LVs the larger structures broke down into a multitude of small-scale vortices. To quantify them, the vortex contour surface area was calculated (shown in **Table 3**). The total surface area of the vortices was smaller in the smoothed LVs (with a mean of 31.6 ± 8.9 · 10<sup>−3</sup> m<sup>2</sup>) than in the detailed LVs (with a mean of 43.5 ± 10.8 · 10<sup>−3</sup> m<sup>2</sup>).

#### 3.2. E-A Wave Mitral Valve Inflow Results

From the E-A wave transmitral inflow function, six time instants were selected, as highlighted in **Figure 4**: early E wave (1), E wave peak (2), late E wave (3), early A wave (4), A wave peak (5) and late A wave (6). **Figure 10** shows the vortices in model D. The Q-criterion values were thresholded at 1,000 s<sup>−2</sup> and colored according to the velocity magnitude [m/s]. **Figure 10** and **Table 3** show that the presence of trabeculae created secondary vortices at the early E wave (time instant 1), increasing the total vortex surface from 9.4 · 10<sup>−3</sup> m<sup>2</sup> in the smoothed to 14.8 · 10<sup>−3</sup> m<sup>2</sup> in the detailed geometry. The secondary vortices in the detailed LV penetrated deeper between the trabeculations during the late E wave (time instant 3). During the early and peak A wave, a second, weaker vortex ring was formed and it mixed with the vortices generated during the early filling (time instants 4–5). Here, for both smoothed and detailed geometries, the total surface of the vortices increased due to the mixing vortices, but it was still higher in the detailed than in the smoothed LV (25.5–22.5 · 10<sup>−3</sup> m<sup>2</sup> in the smoothed vs. 29.2–25.9 · 10<sup>−3</sup> m<sup>2</sup> in the detailed LV).

TABLE 2 | Constant inflow simulations results for A-D LVs with the porous layer of thickness 1.2 · 10<sup>−2</sup> m and porosity 20 kg/m<sup>2</sup>, along with the corresponding HPC information.

P, mode of the pressure distribution at inlet and outlet.

ΔP, intra-ventricular pressure drop from inlet to outlet.

ΔP<sub>diff</sub> = ΔP<sub>d</sub> − ΔP<sub>s</sub>, differences between detailed and smoothed pressure drops.

#### 3.3. Constant Inflow With a Porous Layer

The proposed porous layer produces energy dissipation, which increases the intra-ventricular pressure drop and adds complexity to the blood flow. The results of the sensitivity analysis of the effect on the intra-ventricular pressure drops for each thickness-porosity combination are reported in **Table 4**. By adding a layer of 1.2 · 10<sup>−2</sup> m thickness and a porosity of 20 kg/m<sup>2</sup> to the smoothed subject A, we obtained an intra-ventricular pressure drop equal to that of its detailed case (1.5 kPa). Additionally, the vortex visualization using the Q-criterion thresholded at 5,000 s<sup>−2</sup> demonstrates that the amount of vortices in the smoothed LV with the porous layer is similar to that in the detailed case (see **Figure 3**), with a total vortex surface area of 36.8 · 10<sup>−3</sup> m<sup>2</sup>. This may indicate that the roughness interacts with the boundary layer as a sand-grain roughness does, without prioritizing any particular flow direction, which would make the proposed porous layer model very effective. The porous layer could also approximate the WSS calculated for the detailed geometry A, as can be observed in **Table 2**, with a relative error of only 0.076.

The same layer with a thickness of 1.2 · 10<sup>−2</sup> m and a porosity of 20 kg/m<sup>2</sup> was then applied to all the other cases (subjects B, C, D). In subjects B and C, the presence of the porous layer increased the intra-ventricular pressure drop to values similar to the ones obtained within the detailed cases (see **Table 2**), with relative errors of just 0.2 and 0.02, respectively. In subject D, the intra-ventricular pressure drop increased slightly with the presence of the porous layer, but in this case the thickness and porosity values of the porous layer were not able to reproduce the high pressure drop obtained inside the detailed model (relative error of 0.46). However, in all the models with the porous layer the WSS was reduced, as shown in **Table 2**, with the biggest relative error being 0.23 in model D. This table also shows that in all cases the total vortex surface increased with the presence of the porous layer to values slightly higher than those of the detailed geometries, with the maximum relative error being −0.12 for model D.
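The source does not state how the relative errors are defined. One reading that is consistent with the signs reported above (a negative error when the porous-layer value exceeds the detailed one) is (detailed − porous) / detailed. The sketch below encodes that assumed convention with made-up numbers, not values from **Table 2**.

```python
def relative_error(detailed, porous):
    """Signed relative error of the porous-layer result w.r.t. the detailed one.

    Assumed convention: positive when the porous-layer model underestimates
    the detailed value, negative when it overestimates it.
    """
    return (detailed - porous) / detailed


# Hypothetical values, not taken from the paper's tables.
err_under = relative_error(2.0, 1.5)   # porous result below the detailed one
err_over = relative_error(40.0, 44.0)  # porous result above the detailed one
```

Under this convention an underestimated pressure drop gives a positive error and an overestimated vortex surface gives a negative one, matching the pattern of signs in the text.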

#### 4. DISCUSSION

The human LV endocardium presents a highly trabeculated appearance, which is often ignored in ventricular hemodynamic studies. Even though a few studies have analyzed the effect of papillaries and trabeculae on blood flow, no study to our knowledge has included small trabeculae with a cross-sectional area of 1 mm<sup>2</sup>.

In this study we focused on characterizing solely the effects of the geometries on the hemodynamics using CFD simulations. The fact that the walls are rigid makes it impossible to extrapolate any of the findings to a clinical or physiologically relevant scenario. This study provides an engineering-like approach to a very complex biological system, by quantifying and characterizing the interaction between endocardial structures and blood flow and by providing a potential model to include the effect of the complex structures within the heart without the need to segment an extremely complex structure and run large simulations every time. Therefore, the absence of the mitral valve would constitute a limitation if we were drawing physiological conclusions; however, this is not the case for the kind of study presented in this manuscript.

Given the different metrics analyzed, there appears to be no direct correlation between the volume of trabeculae and the intra-ventricular pressure drop. Among the geometrical markers, subject D, for example, presented the highest ΔP<sub>diff</sub> and is characterized by the smallest angle between the valvular planes (43.3°). We hypothesize that the ΔP<sub>diff</sub> obtained in model D is a result of the location of the PMs, which were positioned right below the inlet and disturbed the incoming blood flow, leading to a higher energy dissipation that was not observed in the other cases. There is no direct correlation between the angle or distance of inlet and outlet and the ΔP<sub>diff</sub>. It is clear that the existence of rugosities along the endocardial walls alters the hemodynamics by creating flow recirculation regions and by disrupting vortices into secondary vortices, which increase the energy dissipation and hence the intra-ventricular pressure drop in complex ways. A key observation is that the location and orientation of the mitral and aortic valvular rings influence the direction of the flow and hence the high WSS visible either on trabeculated regions or on the PMs. We hypothesize that this effect is responsible for the high maximum WSS magnitude on the detailed geometries. Given the surface area of trabeculae or papillaries, WSS tends to be concentrated in small regions (**Figure 7**), increasing its maximum magnitude. This may have important implications for local tissue remodeling. However, regardless of the range of WSS, **Figure 8** shows that in the smoothed geometries the mode WSS was 0.5–1 Pa, while in the detailed meshes the mode dropped to approximately 0 Pa. PMs and trabeculae reduced the WSS in the LV by about 23.5–66.7%. WSS is an important parameter in biology in general, and mechano-transduction is an important biological mechanism. Even though it is practically impossible to measure WSS in vivo in the LV, and it has never been reported before, the fact that WSS is reduced in the presence of trabeculae may provide insight into why such endocardial structures exist. The presence of trabeculae and PMs generated a multitude of secondary vortices that were not present in the smoothed geometries, as shown in **Figure 9**. The overall vortex area was lower in the smoothed LVs (**Table 1**), with a mean of 31.6 ± 8.9 · 10<sup>−3</sup> m<sup>2</sup>. The presence of detailed endocardial structures increased the amount of vortices, with a mean area of 43.5 ± 10.8 · 10<sup>−3</sup> m<sup>2</sup>. Subject D was intriguing: the effect of trabeculae and PMs led to a higher WSS median. This is because this LV is characterized by large PMs (noticeable in **Figures 1**, **8**), which led to higher WSS concentrated on the PMs. A 4.8% higher WSS was indeed observed in the detailed D case. We hypothesize that the high pressure drop in case D was due to the prominent PMs, which disturbed the flow markedly, increasing the energy dissipation within the intra-ventricular volume and thus generating a high intra-ventricular pressure drop (see Figure S3). The analysis of more geometries is required to further understand and characterize the effect of endocardial structures on hemodynamics.

The results from the control subject E demonstrated that the presence of only the PMs led to a low ΔP<sub>diff</sub> (0.2 kPa); however, the main impact appears to be on the WSS distributions in comparison with the more detailed geometries. In other words, having only PMs and a few big trabeculae does not significantly modify the WSS.

Using a transient inflow (E-A wave) allowed the study of vortex formation following physiological inputs. In the smoothed LV vortices were nominal and the generation of the vortex rings was clearly visible during the E waves (see the example in **Figure 10**). On the other hand, in the trabeculated ventricles, the vortex rings were disrupted, generating a multitude of secondary vortices. The vortex surface areas are provided in **Table 3**. It can be observed that the total surface areas of the vortices were larger in the trabeculated than in the smoothed geometry. Furthermore, in the anatomically detailed LV the secondary vortices penetrated deeper between the trabeculations during the late E wave (time instant 3). Notice that the A wave seems to produce higher vorticity in this model. We hypothesize that the reason for this is that we start our simulation with an organized zero flow throughout the model. Recirculation within the cavity at zero flow (diastasis) would create higher vorticity in the second inflow wave. In the smoothed case vortices were more compact, while in the anatomically detailed LV there were multiple vortices that tended to be pushed toward the apex during the late A wave (time instant 6). This finding is in accordance with the results of Vedula et al. (2016), in which the authors suggest that the observed behavior may help increase LV washout. On the other hand, the main vortex ring disruption and secondary vortex formation due to LV trabeculations were not seen in the work of Lantz et al. (2016), who noticed that the presence of PMs and trabeculae generates a large vortex in the middle of the LV cavity.

Preliminary results from replacing the trabeculae with a porous layer show that it is possible to obtain an intra-ventricular pressure drop similar to those generated by the detailed endocardial models. Only for subject D, the intra-ventricular pressure drop was not as high as the one observed in the detailed model. This is due to the presence of big PMs right below the mitral valve inlet, which, as explained previously, highly disturbed the flow and increased the energy dissipation.

TABLE 3 | Total vortex surface [m<sup>2</sup>] for the six instants (1−6) of the synthetic E-A wave in subject D.

Moreover, the addition of a porous layer on the smoothed geometries helped to reduce the WSS median in all the models, with small relative errors (0.07, 0.05, 0.23, and 0.16, respectively). Again, for model D, the median WSS had a bigger relative error, mostly due to the strong impingement of the flow on the PMs, which increases the WSS median for that specific case. Finally, the presence of the porous layer disrupted the main vortices into smaller ones, increasing the total vortex surface to values slightly higher than those observed in the detailed cases; however, the relative errors range from −0.1 to −0.12, so the hemodynamic behavior in terms of vorticity is reproduced.

#### 4.1. Limitations

The main limitation of this study is the use of CFD, without fluid-structure interaction (FSI), valves or moving walls. The lack of motion prevents us from comparing any measured value to in-vivo heart function, however, as was mentioned before, this study attempts to characterize solely the geometry effects on hemodynamics. Future work involves the use of FSI to compare our findings to in-vivo measurements. Another limitation is the small sample size, which limits the generalization and statistical significance of our results.

The generation of smoothed ventricles from the trabeculated ones provides a degree of variability in the geometries created. We removed the trabeculae keeping the overall shape of the ventricle and the volume unchanged. This however is observer dependent and requires a fair amount of user interaction. In models A–D, the smoothing procedure led to slightly smaller estimated LV volumes due to the elimination of the trabeculae. In the case of model E, the ventricular surface was primarily characterized by PMs, presenting just a few trabeculae, which led to a larger volume in the smoothed heart.

An important limitation is that valves were not considered in our simulations. The presence of the mitral valve will direct the blood flow jet and cause impingement, which will have some influence on the interaction between the flow and the detailed endocardial structures. The valves have also been reported to create a vortex ring right below their leaflets, as observed in previous studies (Töger et al., 2012; Vedula et al., 2016), which is impossible to capture in this study.




### 5. CONCLUSIONS

The highly anatomically detailed LV models developed in this study present a level of geometric information that, to the best of our knowledge, had not been achieved before. The simulations performed highlight the differences between blood flow CFD simulations in detailed vs. smoothed human ventricular models. The presence of detailed structures increases the intra-ventricular pressure drop, creates multiple secondary vortices and decreases the WSS within the LV cavity. The amount of trabeculations has no direct correlation with the ΔP<sub>diff</sub>, which was highest in the female LV D case. To the best of our knowledge, our study analyzed for the first time intra-ventricular pressure drops to investigate the effects of trabeculae and PMs on LV hemodynamic modeling. LV hemodynamics in detailed geometries are more complex than we anticipated; hence, a detailed study with more subjects is necessary and ongoing. Furthermore, our results confirm that neglecting detailed endocardial structures prevents computational models from recreating the complex blood flow behavior within the ventricles. Given that HPC simulations and high-resolution MRI data are not always accessible, we propose that a simulated porous layer on the endocardial wall of smoothed LV models can potentially substitute for the highly detailed geometries. Finally, we demonstrated that by adding a layer of 1.2 · 10<sup>−2</sup> m thickness and 20 kg/m<sup>2</sup> porosity to the smoothed cases we obtained pressure drops and WSS similar to the ones in the detailed LVs. The porous layer also increased the amount of secondary vortices to a level close to that observed inside the trabeculated models.

#### 6. RESOURCE IDENTIFICATION INITIATIVE

Atlas of Human Cardiac Anatomy, RRID:SCR\_015734.

MareNostrum Supercomputer, BSC, Barcelona, RRID:SCR\_015737. ARCHER, UK National Supercomputing Service, RRID:SCR\_015854. Fiji, RRID:SCR\_002285. Seg3D, RRID:SCR\_002552. ReMESH, RRID:SCR\_015735. ITK-SNAP Medical Image Segmentation Tool, RRID:SCR\_002010. Autodesk Meshmixer, RRID:SCR\_015736. ParaView, RRID:SCR\_002516.

#### REFERENCES

### AUTHOR CONTRIBUTIONS

FS: Conception, drafting, data analysis and interpretation of data. BP: Data preprocessing, data analysis. TI, PI: Acquisition of data and critical revision. CB, JA-S: Conception, drafting, analysis, critical revision. MV, GH, OL: Implementation of the solvers, critical revision.

### FUNDING

This paper has been partially funded by the CompBioMed project, under H2020-EU.1.4.1.3 European Union's Horizon 2020 research and innovation programme, grant agreement n° 675451. FS is supported by a grant from Severo Ochoa (n° SEV-2015-0493-16-4), Spain. CB is supported by a grant from the Fundació La Marató de TV3 (n° 20154031), Spain. TI and PI are supported by the Institute of Engineering in Medicine, USA, and the Lillehei Heart Institute, USA.

### ACKNOWLEDGMENTS

The DICOM datasets were provided by the Visible Heart® Laboratory, obtained by MRI scanning of perfusion-fixed hearts that were graciously donated by the organ donors and their families through LifeSource. Part of the simulation hours was provided by the CompBioMed project on the ARCHER supercomputer, EPCC, UK.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fphys.2018.00458/full#supplementary-material


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Sacco, Paun, Lehmkuhl, Iles, Iaizzo, Houzeaux, Vázquez, Butakoff and Aguado-Sierra. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Scalable and Accurate ECG Simulation for Reaction-Diffusion Models of the Human Heart

#### Mark Potse<sup>1,2,3</sup>\*

<sup>1</sup> CARMEN Research Team, Inria Bordeaux Sud-Ouest, Talence, France, <sup>2</sup> Institut de Mathématiques de Bordeaux, UMR 5251, Université de Bordeaux, Talence, France, <sup>3</sup> IHU Liryc, Electrophysiology and Heart Modeling Institute, Foundation Bordeaux Université, Pessac-Bordeaux, France

Realistic electrocardiogram (ECG) simulation with numerical models is important for research linking cellular and molecular physiology to clinically observable signals, and crucial for patient tailoring of numerical heart models. However, ECG simulation with a realistic torso model is computationally much harder than simulation of cardiac activity itself, so that many studies with sophisticated heart models have resorted to crude approximations of the ECG. This paper shows how the classical concept of electrocardiographic lead fields can be used for an ECG simulation method that matches the realism of modern heart models. The accuracy and resource requirements were compared to those of a full-torso solution for the potential and scaling was tested up to 14,336 cores with a heart model consisting of 11 million nodes. Reference ECGs were computed on a 3.3 billion-node heart-torso mesh at 0.2 mm resolution. The results show that the lead-field method is more efficient than a full-torso solution when the number of simulated samples is larger than the number of computed ECG leads. While the initial computation of the lead fields remains a hard and poorly scalable problem, the ECG computation itself scales almost perfectly and, even for several hundreds of ECG leads, takes much less time than the underlying simulation of cardiac activity.

#### Edited by:

Mariano Vázquez, Barcelona Supercomputing Center, Spain

#### Reviewed by:

Arun V. Holden, University of Leeds, United Kingdom Mohammad Hasan Imam, American International University-Bangladesh, Bangladesh

#### \*Correspondence:

Mark Potse mark@potse.nl

#### Specialty section:

This article was submitted to Computational Physiology and Medicine, a section of the journal Frontiers in Physiology

Received: 14 January 2018 Accepted: 27 March 2018 Published: 20 April 2018

#### Citation:

Potse M (2018) Scalable and Accurate ECG Simulation for Reaction-Diffusion Models of the Human Heart. Front. Physiol. 9:370. doi: 10.3389/fphys.2018.00370

Keywords: numerical modeling, electrocardiogram, high-performance computing, reaction-diffusion model, bidomain model, lead fields

### 1. INTRODUCTION

The electrocardiogram (ECG) is one of the most common tools in present-day medicine, yet its relation with the molecular biology of the heart is still poorly understood. The ECG witnesses the collective activity of about a million current-generating transmembrane proteins in each of the heart's muscle cells (Hille, 2001). Many of these proteins have been identified and their actions have been captured in mathematical models that predict their collective behavior on the scale of a cell (Noble and Rudy, 2001). By coupling millions of these membrane models one can create a model of whole-heart electrophysiology. Such models generate crucial insights into the functional effects of molecular-level changes, making it possible, for example, to predict dangerous side effects of new drug designs (Passini et al., 2017) or to understand how cardiac ion-channel mutations influence cardiac rhythm disorders (Gima and Rudy, 2002). Moreover, from their results one can compute the corresponding ECG and predict how lab results on subcellular components would translate to everyday practice (Hoogendijk et al., 2010; Keller et al., 2012; Zemzemi et al., 2013).

Such realistic models are large and, when run on a single processor, would take days to simulate just one heartbeat. Fortunately the problem can be expressed in such a way that the work may be spread over many processors with little communication between them. Therefore, these computations are said to scale very well, meaning that they run almost twice as fast every time the number of processors is doubled (Vázquez et al., 2011). This makes them suitable for use on large-scale parallel computers, allowing models to run in nearly real time (Niederer et al., 2011b; Richards et al., 2013).

Simulation of a realistic ECG from the results of such a numerical heart model is much harder, because the electrical current generated by the heart meets a different conductivity at each point in the torso. As a result, each point influences the potential everywhere else, so to find the potential anywhere one must solve it everywhere at the same time.

Numerically this means that a large system of linear equations must be solved, one for each point in the torso model. These problems are harder when they are larger and require frequent communication between the processors in a parallel computer. This means that they cannot be solved much faster by using more processors. Therefore, ECG computation is becoming a bottleneck, limiting both the speed and the spatial resolution of our models.

To avoid this problem many researchers have used simplified torso models, resulting in a less accurate ECG. A solution that can avoid such a sacrifice is to simulate the ECG using an electrocardiographic concept named a lead field. This allows the problem to be split into a hard (poorly scaling) part and an easy (well scaling) part. The hard part is solved only once for each ECG lead, while the easy part is run repeatedly for each time step in a simulation and for multiple simulations on the same geometry. This approach has been used by several authors, but generally with simplified heart models (Pezzuto et al., 2017) or, again, with simplified torso models (Horacek, 1973; Miller and Geselowitz, 1978; Mailloux and Gulrajani, 1982; Aoki et al., 1987).

The purpose of this paper is to show that a lead-field approach can greatly improve scalability in a high-performance computing (HPC) context without sacrificing accuracy. This is not obvious, because the method requires a large set of transfer coefficients (the lead field) to be stored between the two phases of the computation. The efficiency of the method depends on the accuracy with which the lead field must be computed and the degree to which it can be downsampled without affecting the accuracy of the ECG too much. Finally, to provide answers to these questions an accurate reference solution is needed.

Using a reference solution computed on a full torso model at 0.2 mm resolution this study shows that the lead field can indeed be downsampled enough to achieve an efficient and scalable computation, providing roughly two orders of magnitude speedup with negligible loss in accuracy.

The results of this study make it possible to build more realistic heart models with higher spatial resolution, without spending much more time to compute the ECG.

### 2. METHODS

#### 2.1. Model Equations

The methods in this study are based on the bidomain model of cardiac electrophysiology (Miller and Geselowitz, 1978; Tung, 1978), on which most of the current modeling work in this area is based (Niederer et al., 2011a; Henriquez, 2014). The bidomain model is a continuum approximation of the heart muscle, which in reality consists of a network of interconnected muscle cells embedded in an extracellular matrix and other structures such as fibroblasts and capillaries. The bidomain model approximates this as two co-located spaces: the intracellular domain, consisting of the interior of the cells and the gap junctions that connect them, and the extracellular domain, consisting of everything else.

The two domains are characterized by conductivity tensors G<sub>i</sub> and G<sub>e</sub>, respectively. Their values at each point in the model depend on the fiber direction and account for the partial volume occupation of the two domains. In addition the parameters C<sub>m</sub> and β determine the capacitance of the cell membrane and the amount of membrane per unit volume, respectively. The state variables of the model are the potential fields φ<sub>i</sub> in the intracellular and φ<sub>e</sub> in the extracellular domain, and a set of variables $\vec{y}$ describing the state of the membrane model at each location. Using the auxiliary variable V<sub>m</sub> = φ<sub>i</sub> − φ<sub>e</sub> and agreeing that all variables are functions of time and position we can express the bidomain model compactly as

$$\beta^{-1} \nabla \cdot (G\_{\rm i} \nabla \phi\_{\rm i}) = C\_{\rm m} \partial\_t V\_{\rm m} + I\_{\rm ion}(V\_{\rm m}, \vec{y}) \tag{1}$$

$$\beta^{-1} \nabla \cdot (G\_{\rm e} \nabla \phi\_{\rm e}) = -C\_{\rm m} \partial\_t V\_{\rm m} - I\_{\rm ion}(V\_{\rm m}, \vec{y}) \tag{2}$$

$$\partial\_t \vec{y} = F(V\_{\rm m}, \vec{y}) \tag{3}$$

where the term C<sub>m</sub>∂<sub>t</sub>V<sub>m</sub> represents the capacitive transmembrane current, the function I<sub>ion</sub> the density of ionic current flowing between the two domains, and F is a nonlinear vector-valued function describing how the membrane state evolves. The pair of functions I<sub>ion</sub> and F constitutes the membrane model. Suitable boundary conditions are

$$G\_{\rm i} \nabla \phi\_{\rm i} \cdot \partial \Omega\_{\rm A} = 0 \tag{4}$$

on the boundary ∂Ω<sub>A</sub> of the cardiac muscle and

$$G\_{\rm e} \nabla \phi\_{\rm e} \cdot \partial \Omega\_{\rm T} = 0 \tag{5}$$

on the torso boundary ∂Ω<sub>T</sub> (Tung, 1978; Krassowska and Neu, 1994).

The electrical activity of the heart can then be simulated by integrating Equations (1), (2), and (3) under the boundary conditions (4) and (5) (Vigmond et al., 2002). This is known as a bidomain reaction-diffusion model. In this study a simplified version, a "monodomain" reaction-diffusion model, was used. This model can be derived by assuming that G<sub>i</sub> and G<sub>e</sub> are proportional (Leon and Horácek, 1991). Although this is a gross simplification, the effect of this assumption is negligible for most purposes if the model parameters are well chosen (Potse et al., 2006; Nielsen et al., 2007; Bishop and Plank, 2011; Coudière et al., 2014). The monodomain model reads

$$\begin{cases} C\_{\rm m} \partial\_{t} V\_{\rm m} = \beta^{-1} \nabla \cdot \left( G\_{\rm m} \nabla V\_{\rm m} \right) - I\_{\rm ion} (V\_{\rm m}, \vec{y}) \\ \partial\_{t} \vec{y} = F(V\_{\rm m}, \vec{y}) \end{cases} \tag{6}$$

The "monodomain conductivity tensor" G<sub>m</sub> was computed as the series conductivity of the two domains, G<sub>m</sub> = G<sub>i</sub>G<sub>e</sub>/(G<sub>i</sub> + G<sub>e</sub>). With this choice the resistance encountered by a current loop through the cell membrane is the same as in a bidomain model, so that the conduction velocity of a propagating activation wavefront is also almost the same.
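The series-conductivity rule can be illustrated with a short Python sketch. It assumes that both tensors share the same principal axes, so that the formula can be applied axis by axis; the numerical values are placeholders chosen for the example (they are not the parameters of Table 1), and the function name is hypothetical.

```python
import numpy as np

def series_conductivity(sigma_i, sigma_e):
    """Combine intra- and extracellular conductivities in series, axis by axis."""
    sigma_i = np.asarray(sigma_i, dtype=float)
    sigma_e = np.asarray(sigma_e, dtype=float)
    return sigma_i * sigma_e / (sigma_i + sigma_e)

# Placeholder values in mS/cm along the longitudinal, transverse,
# and across-sheet axes (illustrative only).
sigma_i = [3.0, 0.3, 0.3]
sigma_e = [3.0, 1.2, 1.2]

sigma_m = series_conductivity(sigma_i, sigma_e)
print(sigma_m)
```

Equivalently, the resistivities add: 1/G<sub>m</sub> = 1/G<sub>i</sub> + 1/G<sub>e</sub> along each axis, which is why the current loop through the membrane meets the same resistance as in the bidomain model.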

An ECG potential V(t) at time t is the difference in φ<sub>e</sub> between two locations on the body surface or, more generally, a linear combination

$$V(t) = \sum\_{i} c\_{i} \phi\_{\rm e}^{i} \tag{7}$$

where c<sub>i</sub> are the relative contributions of the two or more electrodes and φ<sub>e</sub><sup>i</sup> are the potentials at the corresponding positions. The coefficients c<sub>i</sub> must fulfill charge conservation, Σ<sub>i</sub> c<sub>i</sub> = 0.

To compute φ<sub>e</sub> we must return to the bidomain model. Equations (1) and (2) can be combined and reorganized to yield

$$\nabla \cdot \left( \left( G\_{\text{i}} + G\_{\text{e}} \right) \nabla \phi\_{\text{e}} \right) = -\nabla \cdot \left( G\_{\text{i}} \nabla V\_{\text{m}} \right). \tag{8}$$

This equation can be solved for φ<sub>e</sub> in the whole torso at once from a given distribution of V<sub>m</sub>. However, for the ECG we need to know φ<sub>e</sub> at a few locations only. Therefore, it can be more efficient to use a Green's function of the operator ∇ · ((G<sub>i</sub> + G<sub>e</sub>)∇·) for each of these locations. Since an ECG lead is a linear combination of φ<sub>e</sub> at two or more points it can also be represented directly by a linear combination of Green's functions. In electrocardiology such linear combinations of Green's functions are named lead fields (McFee and Johnston, 1953; Geselowitz, 1989; Colli-Franzone et al., 2000). A lead field is computed once for each ECG lead. It is then used to evaluate the ECG at each time step of the reaction-diffusion model and, as long as the conductivity parameters are not changed, can be re-used for multiple simulations. In terms of a lead field $Z(\vec{x})$ the ECG potential V(t) at time t is

$$V(t) = \int \nabla Z(\vec{x}) \cdot G\_{\rm i} \nabla V\_{\rm m} \, d\vec{x} \tag{9}$$

where the integration is over the myocardium. In contrast to the solution of the full system (8) this calculation is simple and a priori highly scalable. The lead field can be computed as the potential field resulting from a unit current applied at the electrode locations $\vec{x}$<sub>i</sub> (Geselowitz, 1989):

$$\nabla \cdot \left( \left( G\_{\rm i} + G\_{\rm e} \right) \nabla Z(\vec{x}) \right) = \sum\_{i} c\_{i} \delta(\vec{x} - \vec{x}\_{i}) \tag{10}$$

where the coefficients c<sub>i</sub> are as in Equation (7) and δ is Dirac's delta function. To avoid a scaling factor in (9) the total injected current must be unitary, Σ<sub>i</sub> |c<sub>i</sub>| = 2.
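The equivalence between the full solution (8) and the lead-field integral (9) can be checked numerically on a toy problem. The sketch below is a one-dimensional finite-difference analogue written for illustration only: the geometry, the conductivity values, the transmembrane potential profile, and all function and variable names are assumptions, and the singular no-flux problem is handled with a dense pseudo-inverse, which would not be practical for a real torso mesh.

```python
import numpy as np

def stiffness(sigma, h):
    """Return K, a discretization of -d/dx(sigma d/dx) with no-flux ends,
    and the element-wise gradient matrix D (sigma is given per element)."""
    n_el = sigma.size
    D = np.zeros((n_el, n_el + 1))
    e = np.arange(n_el)
    D[e, e] = -1.0 / h
    D[e, e + 1] = 1.0 / h
    K = D.T @ (sigma[:, None] * h * D)
    return K, D

n_el, h = 200, 0.05                      # 200 elements of 0.05 cm
x = np.arange(n_el + 1) * h              # node positions
mid = (x[:-1] + x[1:]) / 2               # element centres
heart = (mid > 4.0) & (mid < 6.0)        # "myocardium" between 4 and 6 cm

g_i = np.where(heart, 1.5, 0.0)          # intracellular conductivity (zero outside the heart)
g_e = np.where(heart, 3.0, 2.0)          # extracellular and torso conductivity
K_bulk, D = stiffness(g_i + g_e, h)
K_i, _ = stiffness(g_i, h)

# Transmembrane potential with a smooth wavefront at x = 5 cm (mV)
v_m = -85.0 + 100.0 / (1.0 + np.exp((x - 5.0) / 0.1))

# One lead between the two ends: sum(c) = 0 and sum(|c|) = 2
c = np.zeros(n_el + 1)
c[0], c[-1] = -1.0, 1.0

K_pinv = np.linalg.pinv(K_bulk)          # pseudo-inverse of the singular no-flux operator

# Full solution: discrete Equation (8), then sample phi_e at the electrodes
phi_e = -K_pinv @ (K_i @ v_m)
v_full = c @ phi_e

# Lead field: discrete Equation (10), then the integral of Equation (9)
z = -K_pinv @ c
v_lead = np.sum((D @ z) * g_i * (D @ v_m) * h)

print(v_full, v_lead)
```

The two values agree to rounding error because the discrete operator is symmetric, which is the discrete counterpart of the reciprocity argument behind lead fields. Once `z` is known, each new time sample costs only the final weighted sum.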

#### 2.2. Model Geometry

In order to run tests on a relevant geometry a model of the heart and torso was used that had been created for a previous study (Kania et al., 2017). The methods to build this geometry, only tersely described before, were as follows. High-resolution cardiac and thoracic computed tomography (CT) images were obtained from a female patient in her thirties. Images were segmented automatically using the MUSIC software (IHU Liryc, Université de Bordeaux and Inria Sophia Antipolis, France), under supervision of an expert operator. The boundaries of the segmented volumes were expressed as triangulated surfaces and meshing errors were manually corrected using Blender (The Blender Foundation, Amsterdam, The Netherlands). The resulting surface mesh defined the volumes of the ventricular myocardium, left and right cavities with parts of the great vessels, lungs, and the whole body. To define hexahedral meshes for the computations the surfaces were overlaid with a 3D cartesian mesh whose elements were assigned types according to the surfaces in which they were contained. The bones were also segmented and meshed but not included in the simulations. The atrial myocardium was not segmented.

The heart mesh was processed to define subendocardial and subepicardial layers and fiber directions using the rule proposed by Beyar and Sideman (1984), as previously described (Potse et al., 2006). The torso mesh was similarly processed to define a layer of 1 cm thickness directly under the skin as skeletal muscle and to define a sheet direction in this layer. Since the true fiber directions of the skeletal muscle layer are too complex to account for, the model muscle simply had a low conductivity in the radial direction and a high conductivity in all circumferential directions (**Table 1**).

During the thoracic scan the patient was wearing a vest with 252 embedded electrodes (Tilt et al., 2013; Cochet et al., 2014). The locations of these electrodes were extracted from the CT data using software provided by the manufacturer of the vest. In addition the locations of the 9 standard ECG electrodes were determined by referring to the bone mesh, and two electrode locations on the hips were chosen. The surface mesh with electrode positions is illustrated in **Figure 1**.

#### 2.3. Spatial Discretization

Spatial discretization was done using a finite-difference method. Differential operators of the form ∇ · (G∇·), where G is any of the conductivity tensor fields employed, were computed using an expression proposed by Saleheen and Ng (1997). This expression assumes that G is constant on elements and that potentials are defined on the nodes of the mesh. It produces a 19-point stencil that takes anisotropy and inhomogeneities into account. The simulation code read its geometry in terms of elements, and created a node mesh, assigning node types such that all corners of a myocardial element would have myocardial nodes. In order to treat myocardial boundaries correctly, the β value of each node was the average of those associated with the 8 elements around it, which was zero for non-myocardium (Potse et al., 2006).

TABLE 1 | Tissues used in the simulations together with the volumes they occupy in the torso model, the conductivity parameters σ (in mS/cm), and β (cm<sup>−1</sup>); the subscript "i" stands for intracellular, "e" for extracellular, "L" for longitudinal, "T" for transverse (within a tissue sheet), and "C" for across-sheet.
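The nodal averaging of β described in section 2.3 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Propag code: the function name, the array layout (one β value per hexahedral element), and the β value of 1000 cm<sup>−1</sup> are hypothetical.

```python
import numpy as np

def nodal_beta(beta_elem):
    """Average element-wise beta onto the nodes of a structured hexahedral mesh.

    beta_elem has shape (nx, ny, nz) and holds zero for non-myocardial elements.
    Elements that would lie outside the array also count as zero.
    """
    nx, ny, nz = beta_elem.shape
    padded = np.pad(beta_elem, 1)                 # zero layer around the element grid
    beta_node = np.zeros((nx + 1, ny + 1, nz + 1))
    # Node (i, j, k) touches elements (i-1..i, j-1..j, k-1..k)
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                beta_node += padded[dx:dx + nx + 1,
                                    dy:dy + ny + 1,
                                    dz:dz + nz + 1]
    return beta_node / 8.0

# A 2 x 2 x 2 block of myocardium inside a 4 x 4 x 4 element grid
beta_elem = np.zeros((4, 4, 4))
beta_elem[1:3, 1:3, 1:3] = 1000.0                 # hypothetical beta value
beta_node = nodal_beta(beta_elem)

print(beta_node[2, 2, 2])   # interior node: all 8 neighbours are myocardium
print(beta_node[1, 2, 2])   # face node: 4 of 8 neighbours are myocardium
print(beta_node[1, 1, 1])   # corner node: 1 of 8 neighbours is myocardium
```

Nodes on the myocardial surface thus receive a reduced β in proportion to the fraction of surrounding elements that are myocardium.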

#### 2.4. Simulation of Cardiac Activity

To prepare input data for ECG simulation propagating activation was simulated using the monodomain reaction-diffusion model (6) using the membrane model of Ten Tusscher and Panfilov (2006) for the functions F and Iion. A uniform time step of 10 µs was used. At each time step the code


After each 100 time steps results were written to file. Simulations were run on a heart mesh at 0.2 mm resolution. Tissue parameters determining G<sub>m</sub> and β are listed in **Table 1**. Gating variables were integrated with the method of Rush and Larsen (1978) and all other variables with a forward Euler method.
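The difference between the two integration schemes can be illustrated on a single gating variable obeying dy/dt = (y<sub>∞</sub> − y)/τ. The sketch below assumes constant y<sub>∞</sub> and τ, whereas in a membrane model both depend on V<sub>m</sub> and are frozen only during one time step; the time constant is a made-up value chosen to be shorter than the 10 µs time step.

```python
import math

def rush_larsen_step(y, y_inf, tau, dt):
    """Exponential (Rush-Larsen) update with y_inf and tau held fixed over dt."""
    return y_inf + (y - y_inf) * math.exp(-dt / tau)

def forward_euler_step(y, y_inf, tau, dt):
    """Explicit Euler update of dy/dt = (y_inf - y) / tau."""
    return y + dt * (y_inf - y) / tau

dt = 0.01      # ms, i.e., the 10 microsecond step used in the simulations
tau = 0.004    # ms, hypothetical fast gate with tau < dt / 2
y_inf = 1.0

y_rl = y_fe = 0.0
for _ in range(100):
    y_rl = rush_larsen_step(y_rl, y_inf, tau, dt)
    y_fe = forward_euler_step(y_fe, y_inf, tau, dt)

print(y_rl)    # converges to y_inf
print(y_fe)    # oscillates with growing amplitude
```

The exponential update remains bounded for any time step, which is why it is commonly preferred for stiff gating equations, while the remaining state variables can be advanced explicitly.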

Activation was started with a single stimulus at one location, at the beginning of the simulation. Seven simulations were run, each time with the stimulus at a different location. Simulations covered 500 ms to include the full depolarization and repolarization of the ventricles.

### 2.5. ECG Simulation

The ECG was computed with several methods:


$$\nabla \cdot \left( \left( G\_{\text{i}} + G\_{\text{e}} \right) \nabla \phi\_{\text{e}} \right) = -I\_{\text{w}} \tag{11}$$

where I<sub>w</sub> is a projection of the term ∇ · (G<sub>i</sub>∇V<sub>m</sub>) from a 0.2 mm resolution heart mesh onto a 1 mm resolution torso mesh. Each coarse-mesh node received contributions from a cube-shaped area including all fine-mesh nodes within the up to 8 coarse-mesh elements around it, with higher weights attributed to nearby nodes, as in a trilinear interpolation: Let Δx, Δy, Δz be the number of fine-mesh edges between a coarse-mesh node and a fine-mesh node along the x, y, and z axis, respectively. Then the contribution of the fine-mesh node to the coarse-mesh node was

$$w = \begin{cases} 0, & \text{if } \Delta x \ge 5 \lor \Delta y \ge 5 \lor \Delta z \ge 5\\ (5 - \Delta x)(5 - \Delta y)(5 - \Delta z)/5^3, & \text{otherwise} \end{cases}$$

The coarse mesh was constructed such that a myocardial fine-mesh node was always surrounded by 8 coarse-mesh nodes. Therefore, w added up to unity for each fine-mesh node and charge conservation was ensured.
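The claim that the weights of one fine-mesh node add up to unity can be verified directly. The sketch below assumes a ratio of 5 fine-mesh edges per coarse-mesh edge (1 mm versus 0.2 mm) and the trilinear normalization 5<sup>3</sup>; the function names and the offset convention are introduced here for illustration.

```python
import itertools

RATIO = 5  # fine-mesh edges per coarse-mesh edge (1 mm / 0.2 mm)

def weight(dx, dy, dz, ratio=RATIO):
    """Trilinear weight of a fine node at distance (dx, dy, dz), in fine-mesh edges."""
    if dx >= ratio or dy >= ratio or dz >= ratio:
        return 0.0
    return (ratio - dx) * (ratio - dy) * (ratio - dz) / ratio**3

def coarse_weights(a, b, c, ratio=RATIO):
    """Weights towards the 8 corners of the coarse element containing a fine node.

    (a, b, c) is the offset of the fine node from the lower corner of the
    coarse element, with 0 <= a, b, c < ratio.
    """
    weights = {}
    for corner in itertools.product((0, 1), repeat=3):
        deltas = [abs(k * ratio - o) for k, o in zip(corner, (a, b, c))]
        weights[corner] = weight(*deltas, ratio=ratio)
    return weights

# Check every possible position of a fine node inside a coarse element
for offset in itertools.product(range(RATIO), repeat=3):
    total = sum(coarse_weights(*offset).values())
    assert abs(total - 1.0) < 1e-12

print(coarse_weights(1, 2, 0))
```

A fine node that coincides with a coarse node (offset 0, 0, 0) contributes with weight 1 to that node and 0 to the others, so the projected source keeps the same total current as the fine-mesh source.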

For the FSC method the monodomain reaction-diffusion model (6) was integrated in a separate run which saved I<sub>w</sub> to file. This method has been used routinely in several studies (Nguyên et al., 2015; Meijborg et al., 2016; Duchateau et al., 2017; Kania et al., 2017). The torso mesh in this case consisted of 2.7 · 10<sup>7</sup> nodes.


The notations LF(C, S) and LFS(C, S) will be used for the LF and LFS methods, respectively, with lead fields computed at a resolution of C millimeters and downsampled to a resolution of S millimeters.

#### 2.6. Computation of Lead Fields

To prepare the lead fields Z for the ECG computation the system (10) was solved for each lead. This was done once with a torso model at 1 mm resolution and once with a torso model at 0.2 mm resolution. Like the FSF, the latter calculation was exceptionally large and was only intended to provide reference values, to test the hypothesis that 1 mm resolution suffices for such calculations.

In either case 266 lead fields were computed: the 12 standard ECG leads, and one lead for each of the 252 vest electrodes and 2 hip electrodes referenced against Wilson's central terminal (the average of the two arm electrodes and the left leg electrode).
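As an illustration of how such leads map onto the coefficients c<sub>i</sub> of Equations (7) and (10), the sketch below builds coefficient vectors for two bipolar limb leads and for one electrode referenced against Wilson's central terminal. The electrode list, the helper names, and the potentials are hypothetical; "V1" stands in for any exploring electrode.

```python
import numpy as np

ELECTRODES = ["RA", "LA", "LL", "V1"]   # right arm, left arm, left leg, one exploring electrode

def bipolar(pos, neg):
    """Coefficients of a lead measuring phi(pos) - phi(neg)."""
    c = np.zeros(len(ELECTRODES))
    c[ELECTRODES.index(pos)] = 1.0
    c[ELECTRODES.index(neg)] = -1.0
    return c

def wilson_referenced(pos):
    """Coefficients of a lead measuring phi(pos) minus the mean of RA, LA, and LL."""
    c = np.zeros(len(ELECTRODES))
    for name in ("RA", "LA", "LL"):
        c[ELECTRODES.index(name)] = -1.0 / 3.0
    c[ELECTRODES.index(pos)] += 1.0
    return c

leads = {
    "I": bipolar("LA", "RA"),
    "II": bipolar("LL", "RA"),
    "V1": wilson_referenced("V1"),
}

phi_e = np.array([-0.2, 0.3, 0.5, 0.1])  # made-up electrode potentials in mV

for name, c in leads.items():
    # Each lead satisfies sum(c) = 0 and sum(|c|) = 2
    print(name, c.sum(), np.abs(c).sum(), c @ phi_e)
```

The lead count quoted above follows from 12 + 252 + 2 = 266, and every one of these leads gives the right-hand side of one lead-field problem (10).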

The computed lead fields Z were stored in files. A dedicated program computed ∇Z and downsampled it using the two methods described in section 2.5, i.e., with and without consideration of the tissue types of the elements. The field computed at 0.2 mm resolution was downsampled by the factors 2, 5, 10, and 25 to obtain resolutions of 0.4, 1, 2, and 5 mm. The field computed at 1 mm resolution was downsampled by the factors 2 and 5 to obtain resolutions of 2 and 5 mm.

#### 2.7. Testing Protocol

ECGs were simulated using each of the 4 methods described in section 2.5 and, for the methods based on lead fields, at each of the resolutions mentioned in section 2.6.

The ECG potentials V were compared to a reference ECG V<sup>ref</sup> in terms of three measures: maximum, root-mean-square (RMS), and relative difference (RelDif) (van Oosterom, 2001; Tysler et al., 2007), defined as

$$\text{RelDif} = \sqrt{\frac{\sum\_{t} \sum\_{n} (V\_{tn} - V\_{tn}^{\text{ref}})^2}{\sum\_{t} \sum\_{n} (V\_{tn}^{\text{ref}})^2}} \tag{12}$$

where the index t ranges over all 500 samples and the index n ranges over all 266 leads. For the 252 vest leads the dependence of the error values on the position of the positive electrode was investigated.
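A minimal Python sketch of the three error measures is given below. RelDif follows Equation (12); the maximum and RMS measures are not written out in the text, so they are assumed here to be the largest absolute difference and the root-mean-square difference over all samples and leads. The function name and the test signal are hypothetical.

```python
import numpy as np

def ecg_errors(v, v_ref):
    """Return (maximum, RMS, RelDif) for ECG arrays of shape (samples, leads)."""
    v = np.asarray(v, dtype=float)
    v_ref = np.asarray(v_ref, dtype=float)
    diff = v - v_ref
    max_err = np.max(np.abs(diff))                              # assumed definition
    rms_err = np.sqrt(np.mean(diff**2))                         # assumed definition
    rel_dif = np.sqrt(np.sum(diff**2) / np.sum(v_ref**2))       # Equation (12)
    return max_err, rms_err, rel_dif

# Synthetic example: 500 samples, 266 leads, reference scaled by 1 %
rng = np.random.default_rng(0)
v_ref = rng.standard_normal((500, 266))
v = 1.01 * v_ref

max_err, rms_err, rel_dif = ecg_errors(v, v_ref)
print(max_err, rms_err, rel_dif)
```

For a pure amplitude scaling by a factor 1 + ε, RelDif equals ε regardless of the signal, which makes the measure easy to interpret as a relative error.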

The effect of the ECG computation on the run time of a reaction-diffusion model was investigated, and the scalability of the 4 methods was tested by running them on 16, 32, ..., 512 nodes of a Bull cluster. Each of these nodes was equipped with two 14-core Intel Xeon E5-2690 processors with 2.6 GHz clock frequency and 64 GB memory. Accuracy results are reported as averages over the 7 activation sequences. Performance tests were carried out 5 times to report average values and standard deviations of run time.
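To make the scaling terminology concrete, the sketch below converts node counts to core counts for this cluster layout and computes speedup and parallel efficiency relative to the smallest run. The run times are made-up numbers for illustration, not measurements from this study, and the function names are hypothetical.

```python
CORES_PER_NODE = 2 * 14   # two 14-core processors per node

def cores(nodes):
    """Number of cores available on a given number of compute nodes."""
    return nodes * CORES_PER_NODE

def strong_scaling(nodes, times):
    """Return (cores, speedup, efficiency) tuples relative to the first entry."""
    n0, t0 = nodes[0], times[0]
    rows = []
    for n, t in zip(nodes, times):
        speedup = t0 / t
        efficiency = speedup / (n / n0)
        rows.append((cores(n), speedup, efficiency))
    return rows

nodes = [16, 32, 64, 128, 256, 512]
times = [1000.0, 510.0, 262.0, 137.0, 74.0, 42.0]   # hypothetical run times in seconds

for n_cores, speedup, efficiency in strong_scaling(nodes, times):
    print(f"{n_cores:6d} cores  speedup {speedup:6.2f}  efficiency {efficiency:5.2f}")
```

An efficiency close to 1 over the whole range corresponds to the "almost twice as fast when the number of processors is doubled" behavior described in the introduction.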

#### 2.8. Numerical Methods

Simulations were performed using the Propag-5 software (Krause et al., 2012), to which new code was added to compute a lead field-based ECG on the fly during a simulation of the heart, and to facilitate the computation of the lead fields themselves. Like its predecessor Propag-4 (Potse et al., 2006), the software uses a structured mesh, but stores information only for elements and nodes that are relevant for the computation: only myocardium for a monodomain model, and only conducting material for a bidomain model. As discussed by Krause et al. (2012), Propag-5 uses a hybrid MPI/OpenMP parallelization scheme. Using a naive temporary partitioning of the domain the code reads the geometry in terms of elements and creates a node mesh using rules that ensure consistency with the scheme discussed in section 2.3. It then uses the ParMetis library to partition this mesh in parallel and creates a definitive domain partitioning for the computations. This fully parallel workflow allowed it to load and partition a mesh with over 3 billion nodes.

Because in some of the computations the model size exceeded the maximum value of a signed 32-bit integer, Propag was compiled with a 64-bit integer type for global indices. The PETSc (Balay et al., 2017) and ParMetis libraries which Propag uses were compiled entirely with 64-bit integers because they do not have a distinct type for global indices.
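The need for 64-bit global indices follows from simple arithmetic, sketched below in Python. The helper name is hypothetical and the node counts are the rounded figures quoted in this paper (about 3.3 billion nodes for the 0.2 mm heart-torso mesh and 2.7 · 10<sup>7</sup> nodes for the 1 mm torso mesh).

```python
import numpy as np

def index_dtype(n_items):
    """Smallest signed integer type that can hold the indices 0 .. n_items - 1."""
    if n_items - 1 <= np.iinfo(np.int32).max:
        return np.int32
    return np.int64

n_fine = 3_300_000_000     # approximate node count of the 0.2 mm heart-torso mesh
n_coarse = 27_000_000      # node count of the 1 mm torso mesh

print(np.iinfo(np.int32).max)          # 2147483647
print(index_dtype(n_coarse).__name__)  # int32
print(index_dtype(n_fine).__name__)    # int64
```

Only the global indices need the wider type; indices that are local to one process can stay within 32 bits as long as each partition is small enough.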

The linear systems (8), (10), and (11) were solved with a BiCGStab solver (van der Vorst, 1992) with a BoomerAMG preconditioner from the Hypre package (Henson and Meier Yang, 2002; Falgout et al., 2017). The solver terminated when the norm of the error term was 10<sup>−8</sup> times smaller than the norm of the right-hand side. Multigrid preconditioners such as BoomerAMG are very powerful and well-suited for large bidomain problems (Sundnes et al., 2002; Weber dos Santos et al., 2004; Austin et al., 2006) so that the solver typically needs only a handful of iterations, in contrast to the problematic convergence observed on large models with an incomplete-LU preconditioner (Potse et al., 2006).
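For readers unfamiliar with the solver, the following is a minimal, unpreconditioned BiCGStab written in NumPy, with the stopping criterion interpreted as ‖r‖ ≤ 10<sup>−8</sup>‖b‖ for the residual r. It is a textbook sketch on a small dense test matrix, not the PETSc/Hypre setup used in this study; in particular, the BoomerAMG preconditioner is omitted and the test matrix is an arbitrary well-conditioned example.

```python
import numpy as np

def bicgstab(A, b, rtol=1e-8, maxiter=1000):
    """Unpreconditioned BiCGStab; returns (x, iterations).

    Iteration stops when the recursively updated residual norm
    drops below rtol times the norm of the right-hand side.
    """
    x = np.zeros_like(b)
    r = b - A @ x
    r_hat = r.copy()                 # fixed shadow residual
    rho = alpha = omega = 1.0
    v = np.zeros_like(b)
    p = np.zeros_like(b)
    target = rtol * np.linalg.norm(b)
    for it in range(1, maxiter + 1):
        rho_new = r_hat @ r
        beta = (rho_new / rho) * (alpha / omega)
        p = r + beta * (p - omega * v)
        v = A @ p
        alpha = rho_new / (r_hat @ v)
        s = r - alpha * v
        if np.linalg.norm(s) <= target:      # converged after the half step
            return x + alpha * p, it
        t = A @ s
        omega = (t @ s) / (t @ t)
        x = x + alpha * p + omega * s
        r = s - omega * t
        rho = rho_new
        if np.linalg.norm(r) <= target:
            return x, it
    raise RuntimeError("BiCGStab did not converge")

# Small test problem: shifted 1D Laplacian (symmetric positive definite)
n = 100
A = 2.1 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)

x, iterations = bicgstab(A, b)
rel_res = np.linalg.norm(b - A @ x) / np.linalg.norm(b)
print(iterations, rel_res)
```

On realistic torso problems the iteration count of an unpreconditioned solver grows with mesh size, which is the motivation for the multigrid preconditioner mentioned above.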

#### 3. RESULTS

An example of a computed lead field is shown in **Figure 2**. This field was computed and stored at 1-mm resolution. The figure shows how the field suddenly changes direction and magnitude at lung boundaries. There is a slight left-right asymmetry because the highly conductive cardiac cavities concentrate the field on the left side of the thorax.

The computed depolarization sequences of the 7 simulated heart beats that were used for ECG computation are shown in **Figure 3**.

Potentials computed with a full-torso solution from beat 5 are shown in **Figure 4**. They are about 10 times larger in the myocardium than near the body surface.

### 3.1. Lead-Field ECG Compared to Full Solution

To establish that the lead-field and full solution methods produce the same results, simulated ECGs were compared between the LF(1, 1) and FSC methods. Averaged over the 7 simulations, RelDif was 0.0016, RMS error 0.3 µV, and maximum error 4.6 µV, while ECG amplitudes were in the order of 1 mV.

Analogously, a single ECG was compared between the LF(0.2, 0.2) and FSF methods. In this case the differences were slightly smaller: RelDif was 0.0014, RMS error 0.2 µV, and maximum error 2.6 µV.

#### 3.2. Effect of Resolution

To determine the effect of lead-field resolution on ECG accuracy, 7 different activation sequences were simulated with a monodomain reaction-diffusion model and ECGs were simulated on the fly using a lead field. This was done for the lead fields computed at 0.2 and at 1.0 mm and all downsamplings thereof, both with the LF and with the LFS method. The resulting ECGs were compared to a reference ECG.

The results are shown in **Figure 5**. In **Figure 5A** errors are shown using the ECG computed with LF(0.2, 0.2) as the reference. For the fields subsampled from those computed at 0.2 mm resolution, differences are seen to increase roughly linearly with the stepsize of the lead field. The LFS method resulted in smaller differences. Results obtained with the field computed at 1.0 mm resolution and downsamplings differed from the reference solution with little dependence on the sampling level. **Figure 5B** shows that this dependency is recovered when ECGs computed with LF(1, 1) are used as the reference.

The relatively large influence of the spatial stepsize in the lead-field computation suggests that differences in model geometry dominate the error. Indeed, the difference between full solutions at 0.2 and 1.0 mm, computed only for one simulation, had a RelDif of 0.10, RMS error 12 µV, and maximum error 0.15 mV, which are very similar to the differences between LF(1, 1) and LF(0.2, 1) in **Figure 5A**.

FIGURE 3 | Depolarization order in the 7 monodomain reaction-diffusion simulations from which ECGs were computed; anterior view. The scale is in milliseconds.

To find out at which locations in the model the lead fields computed with LFS(0.2, 1) and LF(1, 1) differed, the L2 norm of the difference between the two vector fields was computed for all elements. Large differences were found to occur at locations where the fiber direction was highly variable. One such location, at the inferior septal junction, is illustrated in **Figure 6**. It is compared with a measure of variability in fiber direction in the underlying anatomy files, computed as

$$1 - \frac{1}{N} \sum\_{i=1}^{N} |\vec{P} \cdot \vec{p}\_i|^2$$

where $\vec{P}$ is the fiber direction in the coarse-mesh element and $\vec{p}$<sub>i</sub> are the fiber directions in the corresponding fine-mesh elements. The absolute value, denoted as |·|, was taken because the orientation of the direction vector is irrelevant.
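The variability measure can be evaluated as in the following Python sketch, which assumes unit-length direction vectors; the function name and the example directions are made up for illustration.

```python
import numpy as np

def fiber_variability(P, p):
    """Return 1 - (1/N) * sum_i |P . p_i|^2.

    P is the unit fiber direction of the coarse element, shape (3,);
    p holds the unit fiber directions of the N fine elements, shape (N, 3).
    """
    P = np.asarray(P, dtype=float)
    p = np.asarray(p, dtype=float)
    return 1.0 - np.mean(np.abs(p @ P) ** 2)

P = np.array([1.0, 0.0, 0.0])

aligned = np.array([[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]])       # same axis, either orientation
perpendicular = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
mixed = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])

print(fiber_variability(P, aligned))         # 0.0
print(fiber_variability(P, perpendicular))   # 1.0
print(fiber_variability(P, mixed))           # 0.5
```

The measure is 0 when all fine-mesh fibers lie along the coarse-mesh direction, irrespective of sign, and approaches 1 when they are perpendicular to it.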

In **Figure 7** a few ECG leads are compared between different computation methods. In **Figure 7A** full solutions at 0.2 and 1.0 mm are compared. At the coarser resolution the ECG appears more fractionated; this is particularly visible in lead III. As discussed above, the RelDif between these ECGs was 0.10. In **Figure 7B** the same full solution at 0.2 mm is compared with an ECG computed with LFS(0.2, 2). Despite the 10-fold downsampling of the lead field the traces are visually identical; the RelDif was 0.02. Thus, an ECG computed with a lead field downsampled to 2 mm resolution is more faithful than a full solution at 1 mm resolution, when compared to a solution at 0.2 mm.

### 3.3. Performance

**Table 2** shows how ECG computation with lead fields at different resolutions affects the run time of a typical simulation. The data in each row were obtained from 5 simulations of 500 ms activity with a reaction-diffusion model at 0.2 mm resolution, run on 32 compute nodes (896 cores). The table separates initialization time, ECG computation time, and simulation time (including ECG computation but excluding initialization). For lead fields at 0.2 and 0.4 mm resolution the initialization time is of the same order of magnitude as the simulation time, due to the time it takes to read the lead fields from file (141 and 53 GB in these cases). The time for ECG computation itself ranges between 4 and 5% of the simulation time, decreasing slightly with the lead-field resolution. At 1 mm resolution the memory accesses related to ∇Z (for 266 leads) are similar to those for G<sub>i</sub>∇V<sub>m</sub>, so a further reduction would not be expected. At 0.2 mm resolution the ECG computation is faster than at 0.4 mm, likely because in this case the lead field has the same resolution as the reaction-diffusion model and the code then avoids an index conversion.

**Figure 8A** shows how the computation times scale with the number of cores used for a single lead-field resolution of 1.0 mm. The reaction-diffusion simulation and the ECG computation scale well. Initialization time increases with the number of cores, due to increasing communication for mesh distribution and data input. Tests with higher and lower lead-field resolutions, not presented in the figure, showed that the initialization time was highly variable and had no clear relation with the resolution (and thus the storage size) of the field. Rather, the number of collective read operations seemed to be the determining factor.

TABLE 2 | Time required for LF-based ECG computation during a reaction-diffusion simulation of 500 ms.

res, lead-field resolution in mm; sim, total simulation time; init, initialization time. Time is given as average ± standard deviation over 5 simulations, in seconds.

The black trace in **Figure 8A** shows the scaling of a full solution (FSC method). It is over 2 orders of magnitude slower than the lead-field ECG and stops scaling at 7,168 cores.

**Figure 8B** shows how the ECG computation time scales with the number of nodes for all tested values of lead-field resolution. Lead-field resolution is seen not to affect the scaling with the number of cores. Generally the time decreases slightly with decreasing resolution but, as in **Table 2**, the computation at 0.2 mm was faster than the one at 0.4 mm.

#### 4. DISCUSSION

This study shows that a lead-field approach is an attractive solution for ECG simulation on (large) parallel computers whenever the number of ECG leads is smaller than the number of samples. It is about 100 times faster than a full solution, scalable to more than 10<sup>4</sup> cores, and does not cause a significant loss in accuracy. Lead fields can be stored at a resolution as low as 2 mm, meaning that they do not use excessive disk space even for a few hundred leads.

FIGURE 8 | (A) Scaling of propagation, lead-field ECG, and full solution. The blue, green, and red traces show average simulation time, ECG computation time, and initialization time for reaction-diffusion simulations run on 16–512 nodes (448–14,336 cores) with 4 threads per process, with ECG computation based on a lead field at 1.0 mm resolution. The black trace shows the time for a full bidomain solution. Each data point represents an average over 5 simulations. (B) As (A), but showing only ECG computation time, for all lead-field resolutions.

#### 4.1. Previous Work on Lead Fields

The concept of lead fields was initially proposed by McFee and Johnston (1953) as a method to understand how ECG leads "view" the heart. Their purpose was in the first place to design leads that would be better in the sense that their fields would be more uniform inside the heart muscle (McFee and Johnston, 1954). Later the idea was adopted for the purpose of accurate numerical simulation of the ECG (Geselowitz, 1989) and even local electrograms inside the heart (Colli-Franzone et al., 2000; Western et al., 2015).

The idea to use lead-field methods for ECG simulation has been widely adopted. While the very earliest studies did not use them, for example because they computed only a small number of potential distributions (Gelernter and Swihart, 1964) or because a full solution required less memory (Barr et al., 1966; Barnard et al., 1967), numerous studies are based on some form of lead fields or transfer coefficients between V<sub>m</sub> in the heart and φ<sub>e</sub> on the body surface (Horacek, 1973; Miller and Geselowitz, 1978; Mailloux and Gulrajani, 1982; Aoki et al., 1987; Lorange and Gulrajani, 1993; Trudel et al., 2004).

Mailloux and Gulrajani (1982) and further work from the same group (Lorange and Gulrajani, 1993; Trudel et al., 2004) used transfer coefficients that are mathematically identical to lead fields. Their transfer coefficients were computed with a boundary element model (BEM) which accounted for heterogeneity of the torso, but not for anisotropy. They found that they needed fewer than 100 regions to define these coefficients, likely because their model was isotropic. In the anisotropic model used here the lead field changed considerably through the wall, requiring a much higher though not prohibitive resolution. Jacquemet (2015, 2017) evaluated the performance of the same (BEM-based) method on a reaction-diffusion model of the human atria and found that 1,000 regions sufficed for a 1% accuracy.

Boulakia et al. (2010) reported that an ECG simulation based on a transfer matrix was 60 times faster than solving a coupled heart-torso problem. They were using a finite-element model with about 1 million tetrahedra whose sizes gradually increased from the heart to the torso surface, and a serial code. Despite the obvious differences in methods the speedup was very similar to what was found in the current study.

Electrocardiographic inverse modeling studies that used volumetric transmembrane potentials or current dipoles as their source models have also used transfer coefficients that are similar to lead fields (Liu et al., 2006; Wang L. et al., 2013).

### 4.2. Other Methods to Compute the ECG

Many other studies have used full torso solutions to obtain the ECG from a reaction-diffusion model using finite-difference (Potse et al., 2009; Hoogendijk et al., 2010; Meijborg et al., 2016; Chamorro-Servent et al., 2017) or finite-element models (Lines et al., 2003; MacLachlan et al., 2005; Boulakia et al., 2010; Keller et al., 2010; Zemzemi et al., 2015; Janssen et al., 2017). In some cases this was done because intracardiac electrograms in a torso-coupled heart were also simulated (Hoogendijk et al., 2010; Meijborg et al., 2016). The ECG is then a free byproduct.

An interesting alternative is a mixed approach in which anisotropic regions such as the heart and skeletal muscle are handled with finite elements and isotropic regions with boundary elements (Pullan and Bradley, 1996), resulting in fewer degrees of freedom than a complete volume discretization.

There is a considerable body of literature dedicated to the problem of solving body-surface potentials from epicardial (extracellular) potentials (Barr et al., 1977; Pilkington et al., 1987; Stenroos and Haueisen, 2008), which has found an application in cardiac inverse modeling (Greensite and Huiskamp, 1998; Ramanathan et al., 2004; Shou et al., 2008). A formulation in terms of transmembrane potentials on the (endocardial and epicardial) surface of the cardiac muscle is possible if equal anisotropy of the intracellular and extracellular domain is assumed (Geselowitz, 1989; van Oosterom and Jacquemet, 2005) and is also used to solve cardiac inverse problems (Oosterhoff et al., 2016).

#### 4.3. Strengths and Limitations

ECG simulation based on lead fields is very fast and as scalable as a monodomain reaction-diffusion model. This makes it suitable for inclusion in the same model run on a large-scale parallel computer or a GPGPU, in contrast to full solutions, which would limit the scalability of the entire computation. This advantage is present whenever the number of ECG samples to be simulated exceeds the number of leads.
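The reason for this speed can be illustrated with a small sketch. It assumes that each ECG sample is obtained as a discrete sum of ∇Z · (G<sub>i</sub>∇V<sub>m</sub>) over the nodes of the heart mesh, with sign and normalization conventions left aside. The function name, the per-node data layout, and the uniform node volume are illustrative assumptions, not the code used in this study.

```python
def lead_field_ecg_sample(grad_z, current, node_volume):
    """Return one ECG sample per lead as a weighted sum of dot products.

    grad_z      -- grad_z[lead][node] is a 3-vector (lead-field gradient)
    current     -- current[node] is a 3-vector standing in for G_i grad(V_m)
    node_volume -- volume associated with each node (assumed uniform here)
    """
    samples = []
    for lead in grad_z:
        total = 0.0
        for gz, j in zip(lead, current):
            total += gz[0] * j[0] + gz[1] * j[1] + gz[2] * j[2]
        samples.append(total * node_volume)
    return samples


# Made-up data for illustration: 2 leads, 3 nodes.
grad_z = [
    [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)],
    [(0.5, 0.5, 0.0), (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
]
current = [(2.0, 0.0, 0.0), (0.0, 3.0, 0.0), (1.0, 1.0, 1.0)]
ecg = lead_field_ecg_sample(grad_z, current, node_volume=0.5)
```

In this form the work per sample grows with the number of leads times the number of nodes and involves no linear solve; each process can sum over its own nodes and only one number per lead has to be combined across processes.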

Lead-field methods can also be used to compute local electrograms in the heart but this may require a higher spatial resolution at least near the electrode (Colli-Franzone et al., 2000).

For detailed spatial mapping of potentials, either in the heart or on the torso surface, lead-field methods are less advantageous, as the number of locations might exceed the number of samples and may even be so large that the storage of the lead fields becomes a performance bottleneck. In such cases full solutions remain the method of choice and a relatively long solution time will have to be accepted. Although new developments in scalable preconditioners may improve the situation somewhat (Munteanu et al., 2009; Ottino and Scacchi, 2015), it is unlikely that full solvers will ever scale as well as an ECG computation based on lead fields.

It would also be challenging to use a lead-field approach in an electromechanical, deforming heart model. A lead field that would be deformed with the mesh might be a reasonable approximation but this has not been tested here.

The results of this study also suggest further improvements, in the first place the use of non-uniform mesh density for lead-field computation. Comparison of ECGs computed at 0.2 and 1.0 mm resolution showed that the latter had artefactual notches of about 0.05 mV amplitude in the QRS complex, due to misrepresentation of fiber orientation at locations where this orientation changed rapidly. This applied to both full solutions and lead-field ECGs. To avoid such artifacts one could try to ensure a smooth fiber orientation throughout the model (Bayer et al., 2012), but this can be challenging at the interventricular junctions, or whenever measured fiber orientations rather than rule-based orientations are used. The only alternative seems to be computation of the lead field with a mesh at the same resolution as the reaction-diffusion model inside the heart, and for improved efficiency a lower resolution elsewhere in the torso (Pullan and Bradley, 1996; Boulakia et al., 2010). While the computations could still be hard on a mesh with a wide variation in element size, the memory requirements would be much lower than the 12 TB reported here for the reference torso model.

Another possible improvement that would be relevant for very accurate computations with high-resolution lead fields is to develop suitable compression methods for lead-field data. Very likely the regularity of the field could be exploited by using fixed-point numbers in combination with spatial differentiation and a variable-length encoding.
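One possible reading of this idea is sketched below, purely as an illustration: values are quantized to fixed-point integers, first differences are taken along a grid line (standing in for spatial differentiation), and the differences are stored with a zigzag mapping followed by a base-128 variable-length code. The scale factor, the function names, and the choice of this particular variable-length code are assumptions; no compression scheme was implemented or tested in this study.

```python
def encode(values, scale=1000):
    """Fixed-point quantize, difference, then zigzag + varint encode."""
    out = bytearray()
    previous = 0
    for value in values:
        fixed = round(value * scale)      # fixed-point representation
        delta = fixed - previous          # difference to the previous sample
        previous = fixed
        # Zigzag: map signed deltas to non-negative integers.
        zigzag = 2 * delta if delta >= 0 else -2 * delta - 1
        while zigzag >= 0x80:             # 7 payload bits per byte
            out.append((zigzag & 0x7F) | 0x80)
            zigzag >>= 7
        out.append(zigzag)
    return bytes(out)


def decode(data, scale=1000):
    """Inverse of encode(); returns the quantized values as floats."""
    values = []
    previous = 0
    zigzag = 0
    shift = 0
    for byte in data:
        zigzag |= (byte & 0x7F) << shift
        if byte & 0x80:                   # continuation bit: more bytes follow
            shift += 7
            continue
        delta = zigzag >> 1 if zigzag % 2 == 0 else -((zigzag + 1) >> 1)
        previous += delta
        values.append(previous / scale)
        zigzag = 0
        shift = 0
    return values


# Hypothetical smooth lead-field samples along one grid line.
field = [0.120, 0.123, 0.127, 0.130, 0.131, 0.129]
packed = encode(field)
restored = decode(packed)
```

For a smooth field the differences are small, so most samples fit in a single byte. The scheme is lossy only through the quantization step, whose size is set by `scale`.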

In **Figure 8A**, a particularly unfavorable scaling of the initialization phase was shown for the propagation model with lead-field ECG. This was probably due to an issue with the collective reading operation in the MPI library that was used, but also to the fact that for this feasibility study little care had been taken to organize this efficiently; after all, the specifications for this code depended on the outcome of the study. With these results in hand it should be possible to avoid this problem by using a more efficient storage format and organizing the read operation in a different way. The figure also shows that the FSC method takes an order of magnitude more time than the reaction-diffusion model. This difference is partly due to the small solver tolerance that was chosen for this study.

### 4.4. Applications

The use of lead-field methods simplifies the workflow for large-scale cardiac simulations, as it allows the ECG to be computed on the fly with very little overhead during a reaction-diffusion simulation on a mesh of the heart alone. Moreover, its high scalability allows the resolution of the models to be increased without causing a disproportionate increase in the time needed for ECG computation.

The results of this study are not only relevant for work on large-scale computers but also for simulations on general-purpose graphics processing units (GPGPU). Reaction-diffusion simulations on GPGPUs have been reported by several groups (e.g., Bartocci et al., 2011; Neic et al., 2012; Mena et al., 2015; Kudryashova et al., 2017), recently even for a whole human heart model run on a desktop computer (Vandersickel et al., 2016). The strength of a GPGPU is that it provides thousands of parallel processors for the price of a single CPU. However, communication between these processors is a distinct weakness. With a method based on lead fields it is nevertheless possible to add rapid ECG computation to a model running on a GPGPU. Pezzuto et al. (2017) have recently reported such a method, though in combination with an eikonal model rather than a reaction-diffusion model.

In the context of ECG inverse models and model personalization a variety of methods has been reported ranging from infinite-medium potentials (Giffard-Roisin et al., 2017; Neic et al., 2017) to full-torso bidomain solutions (Wang D. et al., 2013). A lead-field approach could offer a solution that combines the speed of the former (if the computation of the lead field itself is excluded) with the accuracy of the latter. Only methods based on equivalent double layers (Geselowitz, 1992; van Oosterom and Jacquemet, 2005) offer more efficiency as they need to evaluate only the surface of the heart, but the price for this efficiency is that these methods neglect anisotropy. A lead-field approach combined with an eikonal-diffusion model for cardiac propagation (Konukoglu et al., 2011; Jacquemet, 2012; Neic et al., 2017) could soon be a practical solution for ECG inverse problems with an accuracy very close to the state of the art in forward modeling of the ECG.

### 5. CONCLUSION

Lead fields are a practical alternative to full-torso solutions when the number of ECG leads that need to be simulated is smaller than the total number of samples that will be calculated. The method is fast and highly scalable. Lead fields can be stored at a resolution as low as 2 mm without unacceptable loss of accuracy.

### AUTHOR CONTRIBUTIONS

The author confirms being the sole contributor of this work and approved it for publication.

### REFERENCES


#### ACKNOWLEDGMENTS

This work was granted access to HPC resources of CINES under GENCI allocation 2018-A0030307379. This work was supported by the National Research Agency, grant reference ANR-10-IAHU04-LIRYC.

The author thanks Dr. Michael Leguèbe and Dr. Emmanuelle Saillard for proofreading the manuscript, and Dr. Hubert Cochet for providing geometry data and proofreading.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fphys.2018.00370/full#supplementary-material


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Potse. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Coupled Immunological and Biomechanical Model of Emphysema Progression

#### Mario Ceresa<sup>1</sup>\*, Andy L. Olivares<sup>1</sup>, Jérôme Noailly<sup>1</sup> and Miguel A. González Ballester<sup>1,2</sup>

<sup>1</sup> BCN-Medtech, Department of Information and Communication Technologies, Universitat Pompeu Fabra, Barcelona, Spain, <sup>2</sup> ICREA, Barcelona, Spain

Chronic Obstructive Pulmonary Disease (COPD) is a disabling respiratory pathology, with a high prevalence and a significant economic and social cost. It is characterized by different clinical phenotypes with different risk profiles. Detecting the correct phenotype, especially for the emphysema subtype, and predicting the risk of major exacerbations are key elements in order to deliver more effective treatments. However, emphysema onset and progression are influenced by a complex interaction between the immune system and the mechanical properties of biological tissue. The former causes chronic inflammation and tissue remodeling. The latter influences the effective resistance or appropriate mechanical response of the lung tissue to repeated breathing cycles. In this work we present a multi-scale model of both aspects, coupling Finite Element (FE) and Agent Based (AB) techniques that we would like to use to predict the onset and progression of emphysema in patients. The AB part is based on existing biological models of inflammation and immunological response as a set of coupled non-linear differential equations. The FE part simulates the biomechanical effects of repeated strain on the biological tissue. We devise a strategy to couple the discrete biological model at the molecular /cellular level and the biomechanical finite element simulations at the tissue level. We tested our implementation on a public emphysema image database and found that it can indeed simulate the evolution of clinical image biomarkers during disease progression.

Keywords: COPD, emphysema, chronic bronchitis, finite element methods, agent-based models, biophysical modeling, multiscale modeling, supercomputing

### INTRODUCTION

Chronic obstructive pulmonary disease (COPD) is estimated to affect more than 500 million people worldwide, causing significant disability, loss of quality of life and social burden, with costs in excess of €56 billion per year in the European Union (Decramer et al., 2012). The disease has a lifetime prevalence of about 28% and cigarette smoking is commonly considered to be the principal risk factor (Gershon et al., 2011). Recent projections suggest that COPD will be the third cause of global mortality by the year 2030.

The pathogenesis of COPD is still not completely understood (Larsson, 2007; Yoshida and Tuder, 2007) and involves a number of multi-scale cellular processes, including airways inflammation, adaptation and innate immunity to cigarette smoking, sensitivity to self and not-self antigens, accelerated senescence, and deregulation of mechanisms of cell repair (Repapi, 2010; Pavord et al., 2012). Interactions between the environment and a selected group of candidate genes are also considered very important (Akinbami et al., 2012; Mizuno et al., 2017; Zhao et al., 2017).

#### Edited by:

Mariano Vázquez, Barcelona Supercomputing Center, Spain

#### Reviewed by:

Zhihui Wang, The University of Texas at Austin, United States Andreas Lintermann, RWTH Aachen Universität, Germany

> \*Correspondence: Mario Ceresa mario.ceresa@upf.edu

#### Specialty section:

This article was submitted to Computational Physiology and Medicine, a section of the journal Frontiers in Physiology

Received: 15 December 2017 Accepted: 28 March 2018 Published: 19 April 2018

#### Citation:

Ceresa M, Olivares AL, Noailly J and González Ballester MA (2018) Coupled Immunological and Biomechanical Model of Emphysema Progression. Front. Physiol. 9:388. doi: 10.3389/fphys.2018.00388

Clinical management of COPD involves consistent use of inhaled corticosteroids that help reduce COPD mortality. However, their efficacy is limited (Faner and Agustí, 2016) and many patients experience exacerbations and poor symptom control (Brightling et al., 2012).

As a matter of fact, the clinical presentation of COPD is not homogeneous, but comprises two main clinical phenotypes, emphysema and chronic bronchitis, each with many sub-types, different comorbidities and risk profiles (Martinez et al., 2012). Even if there is no therapeutic target that can reverse the decline of lung function over time (Vestbo et al., 2013), a **broader recognition** of markers associated with adverse risk (Partridge et al., 2006) and therapies that specifically target **different phenotypes** can reduce exacerbations and improve patients' lives (Castro et al., 2010; Holgate, 2012).

An additional problem is that early detection and staging of COPD are extremely challenging. This is because the gold standard for clinical diagnosis, Pulmonary Function Tests (PFTs), is not sensitive enough to detect any disease progression before a large part of the lung has been compromised (Cooper et al., 2017). It is also not sensitive enough to distinguish different subtypes and elucidate different mechanisms of action.

Specifically in **emphysema,** the continued inflammation of lung parenchyma eventually leads to a loss of collagen and elastin in the alveoli (Sharafkhaneh et al., 2008; Goldklang and Stockley, 2016). As a result of this sustained damage the septa become increasingly compliant and eventually fail mechanically during normal breathing. This reduces the area available for gas exchange causing dyspnea and shortness of breath. In addition, the mechanical damage due to emphysema is likely to stimulate tissue repair mechanisms at cellular level, that result in the production of type I collagen (Crosby and Waters, 2010). As a matter of fact, alveolar fibrosis is observed in emphysematous spaces, in the form of thickened and stiffened alveoli, which most likely contributes to shortness of breath (Yousem, 2006).

Faced with this complexity in the mechanisms and the lack of simple clinical tests, it is important to assess the patient by integrating information from heterogeneous sources such as molecular data and medical imaging, in order to adapt the treatment options to the phenotype and risk profile. A promising option is to include information from **computational models** of biological systems that can account for causative effects that are otherwise difficult to apprehend in the clinic. These models have the potential to predict complex behaviors, elucidate regulatory mechanisms, and inform experimental designs to eventually point out specific factors to control or therapeutic targets, in order to improve patient management (Di Ventura et al., 2006).

Cancer research has already exploited computational models over different spatial and temporal scales as **a promising way to describe complex diseases** (Deisboeck et al., 2009, 2011; Wang et al., 2015). There, multiscale models interact with clinical data to generate and test different hypotheses, facilitating drug development (Clancy et al., 2016) and optimizing delivery and therapeutic effect (Cristini et al., 2017). We refer the interested reader to the detailed review by Wang and Maini (2017).

Recent interdisciplinary advances have helped to unravel the complex pathophysiological mechanisms that occur in COPD on both the macroscopic and microscopic scale. In the case of **macroscopic models of the respiratory system**, for example, Bordas et al. (2015) describe how to obtain a patient-specific mesh for CFD simulations and Berger et al. (2016) discuss the application of a poroelastic deformation model for pulmonary ventilation. Chernyavsky et al. (2014) propose a theoretical model of the possible effect of inflammation on the restriction of small airways. The reader can also refer to the review of COPD multi-scale modeling by Burrowes et al. (2013).

Among others, the "Protective Artificial Respiration" initiative fundamentally contributed to the understanding of COPD. We would like to cite Wiechert et al. (2011) for their multiscale model of the respiratory system that coupled large bronchi and small alveoli, as well as Roth et al. (2017a,b), respectively, for a study of the essential interactions between flow and deformation in the lungs and a simplified model of lung microstructures. Verdugo et al. (2017) also reported on efficient solvers for respiratory mechanics. Among the works devoted to particle deposition we recall the work of Freitas and Schröder (2008) for a numerical study of 3D flows in a human lung model, Lintermann and Schröder (2017) for the simulation of aerosol particle deposition, and Calmet et al. (2016) for their model and simulations of particle deposition based on High-Performance Computing. Very recently, an experimental characterization of the nonlinear compressible behavior of the parenchyma was reported by Birzle et al. (2018).

For the **microscopic modeling,** the literature contains numerous works on the modeling of the immune system at the molecular level. For instance, Folcik et al. (2007) developed an agent-based model for the innate and adaptive immune system while An (2008) contributed an agent-based model of the epithelium. A model of inflammation with interactions between macrophages and fibroblasts capable of simulating scarring, tissue damage and fibrosis is presented in Brown et al. (2011). Most of the studies on AB modeling of COPD focus on emphysema, and mainly study the resulting destruction of the tissue. The most common method uses a 2D network of springs to represent alveolar tissue (Mishima et al., 1999). These modeling studies have the merit of highlighting the redistribution of forces within the tissue during the progression of emphysema. This simulated progression was found to produce experimentally observed emphysema patterns (Suki et al., 2003) and was extended to 3D by Parameswaran et al. (2011) through the use of cuboidal cells to represent the alveoli. The European AirProm project has initiated the study of multi-scale models for the study of COPD (Burrowes et al., 2013).

In the INSPIRE project<sup>1</sup>, we would like to give a multi-scale, multi-physics description of the phenomena that cause the onset of emphysema and the possibility to predict the risk profile of the patient. Accordingly, the **main purpose** of the **presented work** is to propose a multi-scale model, able to integrate known interactions among inflammation, remodeling and parenchyma destruction, with particular attention to the role played by the immune system. We extended our previous work (Ceresa et al., 2017) to couple the dynamics of the biological events captured through agent cooperation in an agent-based (AB) model with a biomechanical simulation of the tissue captured by a coupled Finite Element (FE) model that iteratively predicts the evolution of the mechanical cues transmitted to the cells inside the lungs. We hope that such a model could, once properly refined and validated, add to the interpretation of the specific disease phenotype toward the prediction of personalized risk profiles. We think our model builds on the previously cited literature for the microscopic models because we use a less simplified model of the molecular interactions. In addition, we explicitly take into account the mechanical forces the tissue is subjected to using a well-vetted FE model, while others have worked more on network models of elastic springs.

<sup>1</sup> INSPIRE - Personalized computational models of COPD progression for patient phenotyping. FIS2017-89535-C2-2-R

In the following sections we will discuss the coupled AB and FE model that we contribute (section Methods), and the experimental setup designed to validate the model on a public CT dataset of emphysema images (section Experimental Setup). We then present the results of the experiments, their discussion (section Results and Discussion) and the conclusions and future works (section Conclusions and Future Works).

#### METHODS

As we commented before, research and clinical practice suggest that emphysema development happens along two different timescales: a slow molecular one due to the inflammatory response to solid particles (Cosio et al., 2009), and a rapid one, caused by sudden rupture of the alveolar walls due to mechanical forces which act on lung tissue during respiration (Suki et al., 2003).

In the following sections, we first present a dynamic model of the inflammatory response using ordinary differential equations (ODE) taken from the literature, which does not account for spatial and mechanical effects (section Well-Mixed Molecular Model of Inflammation and Tissue Remodeling). This is followed by an AB molecular model for inflammation and remodeling coupled with an FE model of biomechanical tissue that overcomes those limitations (section Agent Based Model of Inflammation and Coupling to the Finite Element Model).

### Well-Mixed Molecular Model of Inflammation and Tissue Remodeling

In order to prepare the implementation of the AB model and define the rules thereof, we performed a large bibliographical study to obtain relevant information about:


• the role of MMPs in collagen cleavage and fibroblast deposition, which are important terms for elastin degradation and remodeling.

This literature (Ignotz and Massagué, 1986; Onozaki et al., 1988; Oliver et al., 1993; Bellingan et al., 1996; Tsutsumi et al., 1996; Darby et al., 1997; Meng and Lowell, 1997; Hehenberger et al., 1998; Horio et al., 1998; Cobbold and Sherratt, 2000; Steinmüller et al., 2000; Eberhardt et al., 2002; Huang et al., 2002; Maass et al., 2002; Zhang et al., 2003; Mantovani et al., 2004; Porcheray et al., 2005; Tanaka et al., 2005; Edwards et al., 2006; Lenga et al., 2008; Marino et al., 2008; Moro et al., 2008; Jin and Lindsey, 2010; Wang et al., 2012) is listed in the References section and is associated with the different biological parameters considered in **Table 1**. We focus mainly on the well-vetted interactions between different types of macrophages, pro- and anti-inflammatory cytokines, fibroblasts, collagen deposition and degradation, neutrophils and elastase production. Those interactions were already described by Brown et al. (2011), Jin et al. (2011), and Wang et al. (2012); our main contribution was to integrate all the available information on the different biological processes and adapt it to the specific case of emphysema modeling. The final model is composed of two algebraic equations and thirteen coupled non-linear ordinary differential equations (ODE). This model belongs to the category of well-mixed (WM) systems in the sense that no spatial effects are considered.

These equations are presented below (Equations 1–15) and the biology they reflect can be schematically represented in an integrated picture of the main molecular and cellular actors that regulate the chronic immune response and the consequent changes in tissue properties (**Figure 1**), after initial particle deposition on the lung tissue.

The aforementioned particle deposition causes sustained inflammation of the tissues with a fast secretion of Tumor Necrosis Factor alpha (TNFα, written T<sub>α</sub> in the equations for brevity) and a slow secretion of Transforming Growth Factor beta (TGFβ, written T<sub>β</sub> in the equations for brevity), respectively by monocytes (M) and epithelial cells. These cytokines attract monocytes, according to the model proposed by Wahl et al. (1987) (Equation 1):

$$M(T\_{\alpha}) = 0.335 T\_{\alpha}^3 - 6.309 T\_{\alpha}^2 + 32.281 T\_{\alpha} + 57.302 \tag{1}$$

and further govern the differentiation of inactivated macrophages (M<sub>un</sub>) into the specific sub-types M<sub>1</sub> and M<sub>2</sub>, according to Equations (2–4):

$$\dot{M}\_{\text{un}} = M(T\_{\alpha}) - k\_{2}M\_{\text{un}}\frac{IL\_{1}}{IL\_{1} + c\_{IL1}} - k\_{3}M\_{\text{un}}\frac{T\_{\alpha}}{T\_{\alpha} + c\_{T\alpha}} - k\_{4}M\_{\text{un}}\frac{IL\_{10}}{IL\_{10} + c\_{IL10}} - \mu M\_{\text{un}}\tag{2}$$

$$\dot{M}\_1 = k\_2 M\_{\text{un}} \frac{IL\_1}{IL\_1 + c\_{IL1}} + k\_3 M\_{\text{un}} \frac{T\_\alpha}{T\_\alpha + c\_{T\alpha}} + k\_{m21} M\_2 - k\_{m12} M\_1 - \mu M\_1 \tag{3}$$

$$\dot{M}\_2 = k\_4 M\_{\text{un}} \frac{IL\_{10}}{IL\_{10} + c\_{IL10}} - k\_{m21} M\_2 + k\_{m12} M\_1 - \mu M\_2 \tag{4}$$

#### TABLE 1 | Parameters of the AB model.


Apart from the indicated sources, also Jin et al. (2011) and Wang et al. (2012).

We see that all attracted monocytes will become inactivated macrophages first, and then switch to one of the two subtypes depending on the constants k<sub>2</sub>–k<sub>4</sub> and the concentrations of the pro-inflammatory cytokines IL<sub>1</sub> and TNFα and the anti-inflammatory cytokine IL<sub>10</sub>. The pro-inflammatory cytokines will promote differentiation to M<sub>1</sub> and the anti-inflammatory ones to M<sub>2</sub>. The promotion effect of the cytokines is mediated by the Hill equation for a cooperative binding type (Stefan and Le Novère, 2013) with coefficients c<sub>IL1</sub>, c<sub>Tα</sub>, and c<sub>IL10</sub>. In time, the transition from M<sub>1</sub> to M<sub>2</sub> can be reversed, with constants k<sub>m12</sub> and k<sub>m21</sub>. Eventually, the macrophages will be removed through the lymphatic system with rate µ.
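Equations (1–4) can be written as a right-hand-side function, as in the minimal sketch below. It reads the symbol in Equation (2) as M<sub>un</sub>, treats the cytokine concentrations as given inputs, and uses placeholder parameter values that are not the values of Table 1; the function and dictionary names are likewise illustrative.

```python
def hill(x, c):
    """Saturating Hill-type term x / (x + c)."""
    return x / (x + c)


def monocyte_influx(t_alpha):
    """Equation (1): cubic fit for monocyte recruitment versus TNF-alpha."""
    return 0.335 * t_alpha**3 - 6.309 * t_alpha**2 + 32.281 * t_alpha + 57.302


def macrophage_rhs(state, cytokines, p):
    """Equations (2)-(4): time derivatives of (M_un, M_1, M_2)."""
    m_un, m1, m2 = state
    il1, t_alpha, il10 = cytokines
    # Differentiation fluxes out of the inactivated pool.
    to_m1 = (p["k2"] * hill(il1, p["c_il1"])
             + p["k3"] * hill(t_alpha, p["c_ta"])) * m_un
    to_m2 = p["k4"] * hill(il10, p["c_il10"]) * m_un
    d_m_un = monocyte_influx(t_alpha) - to_m1 - to_m2 - p["mu"] * m_un
    d_m1 = to_m1 + p["km21"] * m2 - p["km12"] * m1 - p["mu"] * m1
    d_m2 = to_m2 - p["km21"] * m2 + p["km12"] * m1 - p["mu"] * m2
    return d_m_un, d_m1, d_m2


# Placeholder parameters for illustration (NOT the values of Table 1).
PARAMS = {"k2": 0.1, "k3": 0.1, "k4": 0.1, "c_il1": 1.0, "c_ta": 1.0,
          "c_il10": 1.0, "km12": 0.05, "km21": 0.05, "mu": 0.2}

derivs = macrophage_rhs((10.0, 5.0, 5.0), (1.0, 1.0, 1.0), PARAMS)
```

Summing the three equations makes the differentiation and switching terms cancel, so the total macrophage population changes only through monocyte influx and removal at rate µ. This gives a simple consistency check for any implementation.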

Continuing our discussion of **Figure 1**, we see that each macrophage type will now secrete cytokines with dynamics given by Equations (5–7):

$$
\dot{IL}\_{10} = k\_5 M\_2 \frac{c\_1}{IL\_{10} + c\_1} - d\_{IL10} IL\_{10} \tag{5}
$$

$$
\dot{T}\_{\alpha} = k\_6 M\_1 \frac{c\_1}{IL\_{10} + c\_1} - d\_{T\alpha} T\_{\alpha} \tag{6}
$$

$$
\dot{IL}\_1 = k\_7 M\_1 \frac{c\_1}{IL\_{10} + c\_1} - d\_{IL1} IL\_1 \tag{7}
$$

Here we see in Equation (5) that IL10 is secreted by M2 proportionally to k<sub>5</sub> and regulated by self-inhibition with effectiveness c<sub>1</sub>. Eventually, it is degraded with half-time decay rate d<sub>IL10</sub>.

In Equations (6, 7) we have an analogous process for the secretion of TNFα and IL1 by the M1 macrophage subtype.
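The secretion terms of Equations (5–7) share one inhibition factor, which can be sketched as follows. The sketch reads the factor as c1 / (IL10 + c1), in line with the description of c1 as the effectiveness of the inhibition; the function names and parameter keys are illustrative assumptions.

```python
def il10_inhibition(il10, c1):
    """Inhibition factor c1 / (IL10 + c1): equals 1 at IL10 = 0, decays toward 0."""
    return c1 / (il10 + c1)


def cytokine_rhs(m1, m2, il10, t_alpha, il1, p):
    """Time derivatives of IL10, TNF-alpha and IL1 (Equations 5-7)."""
    inhibition = il10_inhibition(il10, p["c1"])
    d_il10 = p["k5"] * m2 * inhibition - p["d_il10"] * il10
    d_t_alpha = p["k6"] * m1 * inhibition - p["d_ta"] * t_alpha
    d_il1 = p["k7"] * m1 * inhibition - p["d_il1"] * il1
    return d_il10, d_t_alpha, d_il1


# Placeholder parameters (assumptions for illustration, NOT Table 1 values).
params = {"c1": 2.0, "k5": 1.0, "k6": 1.0, "k7": 1.0,
          "d_il10": 0.1, "d_ta": 0.1, "d_il1": 0.1}
low_il10 = cytokine_rhs(m1=5.0, m2=5.0, il10=0.0, t_alpha=1.0, il1=1.0, p=params)
high_il10 = cytokine_rhs(m1=5.0, m2=5.0, il10=8.0, t_alpha=1.0, il1=1.0, p=params)
```

With these placeholder values, raising IL10 from 0 to 8 reduces the net TNFα and IL1 production rates, which is the qualitative anti-inflammatory behavior the equations encode.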

Additional TGFβ is secreted by fibroblasts (F) and M2 to increase the deposition of collagen in the tissue:

$$
\dot{T}\_{\beta} = k\_8 M\_2 + k\_9 F - d\_{T\beta} T\_{\beta} \tag{8}
$$

$$F\_{g}(T\_{\beta}) = 0.05T\_{\beta}^3 - 0.98T\_{\beta}^2 + 6.54T\_{\beta} + 7.11\tag{9}$$

$$\dot{F} = k\_{10} F\_{g} (T\_{\beta}) F - d\_{F} F \tag{10}$$

$$
\dot{C} = k\_{11}F - d\_{FC}\,MMP\,C \tag{11}
$$

$$\dot{MMP} = k\_{12}M\_1 - d\_{FC}\,MMP\,C - d\_{M}\,MMP\,I \tag{12}$$

where k<sub>8</sub> is the secretion rate by M2, k<sub>9</sub> the secretion rate by fibroblasts, and d<sub>Tβ</sub> the decay rate in Equation (8). Fibroblasts proliferate from the population of already existing cells proportionally to TGFβ in Equations (9–10) and emigrate with rate d<sub>F</sub>. Collagen deposition is governed by Equation (11), where we have to consider the deposition rate k<sub>11</sub> and the degradation effect of matrix metalloproteinases (MMP). These are enzymes produced by M1 that degrade the collagen, as described in Equation (12).

Finally, macrophages attract neutrophils to the wound site by secreting IL8, and those release the elastase enzyme that cleaves the elastin bonds in the fibers:

$$
\dot{IL}\_8 = k\_{13} M\_2 \frac{c\_2}{IL\_8 + c\_2} - d\_{IL8} IL\_8 \tag{13}
$$

$$\dot{N} = k\_{14}(1 - \frac{N}{N\_{\text{max}}})\frac{IL\_8}{IL\_8 + c\_{IL8}} - \mu\_N N \tag{14}$$

$$
\dot{E} = k\_{15}N - d\_{E} E \tag{15}
$$

IL8 secretion (Equation 13) is similar to Equation (5), with constant k<sub>13</sub>, a self-inhibition term with efficacy c<sub>2</sub> and a degradation constant d<sub>IL8</sub>. Equation (14) governs the recruitment of neutrophils up to their maximum value N<sub>max</sub>, with a cooperative effect of IL8 and an emigration rate µ<sub>N</sub>. Finally, the density of elastase (Equation 15) depends on the number of neutrophils and on the inactivation rate d<sub>E</sub>.
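To illustrate how the logistic factor in Equation (14) caps neutrophil recruitment, the following sketch integrates Equations (14, 15) with a forward Euler scheme at a fixed IL8 level. It assumes that Equation (15) reads dE/dt = k15 N − dE E; the integrator, the step size and all parameter values are placeholders for illustration only.

```python
def neutrophil_elastase_rhs(n, e, il8, p):
    """Time derivatives of neutrophils N and elastase E (Equations 14-15)."""
    recruitment = p["k14"] * (1.0 - n / p["n_max"]) * il8 / (il8 + p["c_il8"])
    d_n = recruitment - p["mu_n"] * n
    d_e = p["k15"] * n - p["d_e"] * e  # assumed first-order inactivation
    return d_n, d_e


def integrate(n0, e0, il8, p, dt=0.01, steps=20000):
    """Forward Euler integration at constant IL8 (illustrative scheme)."""
    n, e = n0, e0
    for _ in range(steps):
        d_n, d_e = neutrophil_elastase_rhs(n, e, il8, p)
        n += dt * d_n
        e += dt * d_e
    return n, e


# Placeholder parameters (assumptions for illustration, NOT Table 1 values).
params = {"k14": 5.0, "n_max": 100.0, "c_il8": 1.0,
          "mu_n": 0.1, "k15": 0.2, "d_e": 0.5}
n_final, e_final = integrate(0.0, 0.0, il8=1.0, p=params)

# Analytical steady state at constant IL8, used as a cross-check.
h = 1.0 / (1.0 + params["c_il8"])
n_steady = params["k14"] * h / (params["k14"] * h / params["n_max"] + params["mu_n"])
e_steady = params["k15"] * n_steady / params["d_e"]
```

Under these assumptions the neutrophil count settles below N<sub>max</sub>, at the value where recruitment balances emigration, and elastase follows it proportionally.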

The final proportion of elastase and collagen density is directly used in our biomechanical model to calculate the properties of the lung tissue for the FEM simulation as discussed at the end of the next section.

All the values of the discussed parameters are presented in **Table 1** and the related literature is listed in the References section.

## Agent Based Model of Inflammation and Coupling to the Finite Element Model

In order to add spatial effects to the molecular model of inflammation and tissue remodeling, an AB model is created, using Equations (1–15) as a basis for the behavior of the agents. The first important difference is that the simulation of the agents happens on a grid. This gives the model an inherent spatial aspect and allows us to consider additional details with respect to the well-mixed (WM) model. For instance, the composition of the alveolar unit (AU), which includes among others epithelial cells, collagen, elastin and basement membrane (Zemans et al., 2015), now becomes relevant. In our case, every cell of the grid represents a small portion of the AU, with different variables accounting for the content of elastin, collagen, the cytokines and the structural integrity of the cells (called "tissue-life" in the following).

During the simulation, a "smoking" signal determines whether we introduce particles into the simulated AU. This signal is a periodic square wave with frequency f<sub>s</sub> and intensity e<sub>s</sub>. The intensity quantifies the exposure, that is, the number of particles inhaled in each cycle. The signal starts from zero and lasts for a total smoking time T<sub>s</sub>. By varying frequency, intensity and total time, we can study the effect of particles on the model, as detailed in the experiment of section Experiment to Characterize Parameter Sensitivity. After the end of the total smoking time, the model is allowed to run for some additional time steps in order to reach equilibrium again.
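A square-wave exposure signal of this kind can be sketched in a few lines. The sketch makes several assumptions that the text does not specify: time is counted in integer simulation steps, the frequency is expressed through its period in steps, the wave starts at step zero in its "on" phase, and the duty cycle is 50%.

```python
def smoking_signal(step, period_steps, intensity, total_smoking_steps, duty=0.5):
    """Particles introduced at a given simulation step.

    period_steps is the inverse of the smoking frequency, expressed in steps.
    duty is the fraction of each period during which particles are inhaled
    (the 50% default is an assumption of this sketch).
    """
    if step < 0 or step >= total_smoking_steps:
        return 0  # before the start or after smoking cessation
    phase = step % period_steps
    return intensity if phase < duty * period_steps else 0


# Example: 5 particles per active step, period of 10 steps, 50 smoking steps,
# followed by 50 recovery steps with no exposure.
signal = [smoking_signal(s, 10, 5, 50) for s in range(100)]
total_particles = sum(signal)
```

Varying `period_steps`, `intensity` and `total_smoking_steps` then reproduces the three knobs (frequency, intensity, total time) described above.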

The initial, unperturbed dynamics of the system include a small number of inactivated macrophages that move randomly, "patrolling" the tissue and searching for solid particles, similarly to the behavior of mononuclear cells described by Auffray et al. (2007). When the smoking signal is active, inhaled particles deposit and cause an initial rapid rise of TNFα that attracts inactive macrophages to the deposition site according to Equation (2). From there, according to the dynamics described in Equations (3–4), macrophages differentiate into the M1 or M2 subtypes, which respectively govern the production of pro-inflammatory (Equations 6, 7, 12) and anti-inflammatory cytokines (Equations 5, 8, 13).

As previously indicated in Brown et al. (2011), at sites with high levels of pro-inflammatory cytokines, tissue is damaged by a complex network of interconnected factors called Damage-associated Molecular Patterns (DAMPS) (Matzinger, 2002; Lotze et al., 2007). This aspect was not included in the WM model because of its specific spatial nature, but it is implemented in the AB model, where the tissue-life of the AU is reduced proportionally to the inflammation level. Damaged tissue (i.e., with reduced tissue-life) in turn starts secreting TGFβ to recruit fibroblasts for wound healing, as in Equations (9, 10).

The model tracks separately the amount of collagen and elastin in the tissue and their equilibrium varies depending on the concentration of fibroblasts, neutrophils, elastase and MMPs as in Equations (11, 12, 14, 15).

The cellular death caused by DAMPS and the amounts of collagen and elastin all affect the mechanical properties of the tissue used in the FE simulations. In this first version we use an elastic, isotropic material implemented in the Elmer FEM software (Råback, 2013). On the one hand, when a cell dies, we reduce its Young's modulus (E<sub>Tissue</sub>) to 1 Pa, to account for the fact that it no longer contributes to the elastic properties, without changing the topology of the mesh. On the other hand, if the cell is not dead, its Young's modulus is calculated as a linear mixture of the corresponding concentrations of elastin and collagen, as in Equation (16). The initial values are E<sub>el</sub> = 0.1 kPa and E<sub>cl</sub> = 20 kPa, as described by Suki et al. (2011).

$$E\_{\text{Tissue}} = \delta\_{el} c\_{el} E\_{el} + \delta\_{cl} c\_{cl} E\_{cl} \tag{16}$$

$$\delta\_{el} = 0.7; \; \delta\_{cl} = 0.3, \; c\_{el} \in [0, 1]; \; c\_{cl} \in [0, 1];$$
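Equation (16), together with the 1 Pa rule for dead cells, can be sketched as follows. The constants are the ones quoted in the text (converted to Pa); the function name and the `alive` flag are illustrative choices rather than part of the original code.

```python
E_ELASTIN_PA = 100.0      # E_el = 0.1 kPa
E_COLLAGEN_PA = 20_000.0  # E_cl = 20 kPa
DELTA_EL = 0.7
DELTA_CL = 0.3
E_DEAD_PA = 1.0           # residual stiffness assigned to dead cells


def tissue_youngs_modulus(c_el, c_cl, alive=True):
    """Young's modulus of one grid cell in Pa (Equation 16)."""
    if not alive:
        return E_DEAD_PA
    if not (0.0 <= c_el <= 1.0 and 0.0 <= c_cl <= 1.0):
        raise ValueError("concentrations must lie in [0, 1]")
    return DELTA_EL * c_el * E_ELASTIN_PA + DELTA_CL * c_cl * E_COLLAGEN_PA


healthy = tissue_youngs_modulus(1.0, 1.0)          # full elastin and collagen
degraded = tissue_youngs_modulus(1.0, 0.5)         # half of the collagen lost
dead = tissue_youngs_modulus(1.0, 1.0, alive=False)
```

With these numbers a fully intact cell has a modulus of 6,070 Pa, of which only 70 Pa comes from elastin, so the stiffness is dominated by the collagen term.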

The material properties are calculated and loaded into the solver as a continuous static field using custom-made Fortran code.

Apart from the molecular damage caused by DAMPS, the tissue can also die because it was subjected to too much strain during the mechanical simulations. While elastin withstands deformations as high as 100%, the maximum tensile strain of pure collagen fibrils with low cross-link density is considered to be around 10% of the initial length (Depalle et al., 2015; Sherman et al., 2015). Accordingly, the maximum tensile strain of each cell is calculated by weighting these two values by the amounts of elastin and collagen it contains.
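One possible reading of this weighting is a concentration-weighted mean of the two strain limits. The exact weighting scheme is not given in the text, so the normalization by the total fiber content used below, and the value returned for a cell without fibers, are assumptions of this sketch.

```python
MAX_STRAIN_ELASTIN = 1.0    # 100% of the initial length
MAX_STRAIN_COLLAGEN = 0.10  # about 10% of the initial length


def max_tensile_strain(c_el, c_cl):
    """Concentration-weighted strain limit of one grid cell (assumed scheme)."""
    total = c_el + c_cl
    if total <= 0.0:
        return 0.0  # no load-bearing fibers left (assumption)
    return (c_el * MAX_STRAIN_ELASTIN + c_cl * MAX_STRAIN_COLLAGEN) / total


def is_overstrained(strain, c_el, c_cl):
    """True when the computed strain exceeds the cell's limit."""
    return strain > max_tensile_strain(c_el, c_cl)


limit_mixed = max_tensile_strain(0.5, 0.5)
```

Under this reading, a cell with equal amounts of elastin and collagen tolerates a strain of 55%, and the limit moves toward 10% as the collagen fraction grows.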

We present in **Figure 2** the indirect coupling strategy used for the AB and FE models. At each step the former simulates additional particle deposition that accounts for continued smoking, the release of inflammatory cytokines and the degradation of mechanical properties. Periodically the AB model is frozen and the calculated tissue properties are imported into the AB-FE coupler code, which reconstructs a topologically equivalent geometry, recovers the contours of the damaged zones and assigns new material properties, taking into account the final amount of collagen and elastin from the AB model. The resulting information is passed to the FE solver, which runs until convergence and then exports the strain results for further processing. After the FE solver has run, the second coupler code, FE-AB, is run to import the strain field and calculate which fibers, if any, have been destroyed in the simulation. It then updates the AB status and restarts the AB model with the updated state.

#### EXPERIMENTAL SETUP

The next sections deal with the more experimental part of our work. First, we explain the inner working of the coupling between AB and FEM solvers (section Procedure to Couple AB and FEM). Then, we present the meshing process (section Mesh Creation and Sensitivity). In the central part of this section we detail the two main experiments that validate our implementation: the first is an initial exploration of the sensitivity of the model to initial parameters (section Experiment to Characterize Parameter Sensitivity), while the second is the validation on a public CT image dataset of emphysematous lungs (section Experiment to Study the Emphysema Progression in Clinical Images). Finally, we briefly discuss the High Performance Computing infrastructure we used to run the studies (section High Performance Computing).

#### Procedure to Couple AB and FEM

In a typical execution cycle, the AB model is stopped at regular intervals and control is transferred to the FE model for analysis of the mechanical strains. After each interruption of the AB simulation, the latest iteration of this simulation is saved to disk and the AB-FE coupling code first calculates the percentage of damaged tissue area, as predicted by the AB model, and evaluates whether there is enough healthy tissue to proceed with the mechanical simulation. If this is the case, the saved status of the AB model is inspected to retrieve the last topology of the computational grid, and the amount of collagen and elastin is used to calculate the new Young's moduli of the tissue according to Equation (16). This information is used together with connected component analysis, morphological operators and k-Nearest Neighbors (kNN) classifiers to extract the contours of the broken tissue, define a 2D mesh and assign mechanical properties to each element. Materials, boundary conditions and solver parameters are adjusted if necessary and a case directory is created for the FE solver. The FE model runs asynchronously until convergence to the steady state, and the deformation and displacement fields are saved in a VTK-compatible format (vtu). After that, the FE-AB coupling code is executed. It reads back the strain fields from the solver status files and determines which nodes of the mesh, if any, have exceeded their maximum strain. Those are added to the damaged zone and the agent simulation is restarted. Cycle by cycle, the coupled simulations continue until tissue damage is above 80% of the area or until the desired simulated time is reached.
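The control flow of this cycle can be sketched with the two solvers abstracted as callables. Everything here is a simplified stand-in: the state dictionary, the function names and the toy solvers are hypothetical and only illustrate the stopping criteria (80% damaged area or end of simulated time) and the strain feedback.

```python
def run_coupled(ab_step, fe_solve, state, max_cycles, damage_limit=0.8):
    """Alternate AB and FE phases until a stopping criterion is met.

    state is a dict with 'n_nodes' (int) and 'damaged' (set of node ids).
    ab_step(state) returns the updated state; fe_solve(state) returns two
    dicts keyed by node id: computed strain and maximum admissible strain.
    """
    for cycle in range(max_cycles):
        state = ab_step(state)                      # AB phase, then freeze
        damaged_fraction = len(state["damaged"]) / state["n_nodes"]
        if damaged_fraction > damage_limit:         # not enough healthy tissue
            return state, cycle + 1, "damage limit reached"
        strain, max_strain = fe_solve(state)        # FE phase
        overstrained = {n for n, s in strain.items() if s > max_strain[n]}
        state["damaged"] |= overstrained            # FE-to-AB feedback
    return state, max_cycles, "simulated time reached"


def toy_ab_step(state):
    """Toy AB model: damages one additional healthy node per cycle."""
    healthy = sorted(set(range(state["n_nodes"])) - state["damaged"])
    if healthy:
        state["damaged"].add(healthy[0])
    return state


def toy_fe_solve(state):
    """Toy FE model: uniform 5% strain, 10% limit on every healthy node."""
    healthy = set(range(state["n_nodes"])) - state["damaged"]
    return {n: 0.05 for n in healthy}, {n: 0.10 for n in healthy}


initial = {"n_nodes": 10, "damaged": set()}
final_state, cycles, reason = run_coupled(toy_ab_step, toy_fe_solve, initial, 20)
```

In the real pipeline each phase exchanges data through files on disk and the FE phase is an external Elmer run; the sketch keeps only the decision logic.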

A detailed view of typical results for the inflammation, meshing and mechanical process is shown in **Figures 3**, **4**.

#### Mesh Creation and Sensitivity

We use 2D FE meshes with topologies equivalent to the AB simulation grids. The exact size and topology of each mesh is thus very dependent on the current state of the simulation. In addition, an optimization step is run after the first mesh creation using Gmsh<sup>2</sup> . The average mesh contains around 50,000 polygons with four nodes. We manually refined the parameters of the mesh creation to ensure a quick convergence of the FE simulations, while keeping a low computational cost, necessary to ensure reasonably fast and smooth interactions between the AB and the FE models.

<sup>2</sup>Open source: gmsh.info

FIGURE 2 | Full coupled model. Original patches from a public emphysema database are segmented to separate the parenchyma from the vessels and airways, and seed deposition is simulated. For each seeded pixel and for all its neighbors we run a simulation job that represents the evolution of 130 alveoli. In each job there is a cyclic sequence between the agent and finite element models. At each step the former simulates additional particle deposition that accounts for continued smoking, the release of inflammatory cytokines and the degradation of mechanical properties. Periodically the AB model is frozen and the calculated tissue properties are imported into the AB-FE coupler code, which reconstructs a topologically equivalent geometry, recovers the contours of the damaged zones and assigns new material properties, taking into account the final amount of collagen and elastin from the AB model. The resulting information is passed to the FE solver, which runs until convergence and then exports the strain results for further processing. After the FE solver has run, the second coupler code, FE-AB, is run to import the strain field and calculate which fibers, if any, have been destroyed in the simulation. It then updates the AB status and restarts the AB model with the updated state.

### Experiment to Characterize Parameter Sensitivity

We studied several possible parameters to characterize the model's behavior. First, we varied the quantity of particles and the frequency with which they are added. We varied the number of particles inhaled in each smoking step from 0 to 20 and the smoking time from 10 to 90 simulation steps, for a total of 25 experiments. This will be referred to as the "exposure" experiment in the results section.

In order to assess the sensitivity to the parameters, we selected and varied six main parameters of the model, as described in **Table 2**. This resulted in a total of 2<sup>6−2</sup> = 16 experiments following a fractional factorial analysis. We will refer to this as the "parameters" experiment in the results section.
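A two-level 2<sup>6−2</sup> fractional factorial design can be generated from a full factorial in four base factors plus two generated columns. The generators used below (E = ABC and F = BCD) are a common textbook choice and an assumption of this sketch; the text does not state which generators were used, and the factor labels are placeholders for the six parameters of Table 2.

```python
from itertools import product


def fractional_factorial_6_2():
    """Return the 16 runs of a 2^(6-2) design with levels coded as -1/+1.

    Base factors A-D form a full factorial; E and F are defined by the
    generators E = ABC and F = BCD (assumed, see lead-in).
    """
    runs = []
    for a, b, c, d in product((-1, 1), repeat=4):
        e = a * b * c
        f = b * c * d
        runs.append((a, b, c, d, e, f))
    return runs


design = fractional_factorial_6_2()
# Each column is balanced: every factor is at its low level in 8 runs
# and at its high level in the other 8.
column_sums = [sum(run[i] for run in design) for i in range(6)]
```

Each coded level would then be mapped to the low or high value of the corresponding model parameter before launching the run.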

### Experiment to Study the Emphysema Progression in Clinical Images

One of the main objectives of this model is to predict the development of emphysema over time. We devised an initial way to test our hypothesis using a public lung image dataset. We explain our approach in the following sections.

#### Dataset

We test our system against the public CT Emphysema database (Sorensen et al., 2010). We use 168 square patches manually annotated in a subset of the 115 high-resolution CT (HRCT) slices. As explained in the previous reference, CT scanning was performed using General Electric (GE) equipment (LightSpeed QX/i; GE Medical Systems, Milwaukee, WI, USA) with four detector rows. The acquisition protocol was: in-plane resolution 0.78 × 0.78 mm, slice thickness 1.25 mm, tube voltage 140 kV, and tube current 200 mAs. The slices were reconstructed by using a high-spatial-resolution (bone) algorithm. The data comes from a study group of 39 subjects, including 9 never-smokers, 10 smokers, and 20 smokers with COPD. **Figure 2** shows a sample of each of the three categories of images.

#### Pre-processing

All slices were automatically segmented and reviewed to create a mask of only parenchyma tissue. In order to prepare the computational model, we first segmented the pulmonary tissue in the lung patches, using a fixed threshold of −750 HU. Stereological analysis of the lung parenchyma revealed a mean of 500 million alveoli per double lung in the normal population, with a mean alveolar volume of around 4.2 × 10<sup>6</sup> µm<sup>3</sup> and, on average, 170 alveoli per cubic millimeter (Ochs et al., 2004). In our case, with an anisotropic spacing of 0.78 × 0.78 × 1.25 mm<sup>3</sup> , this corresponds to roughly 130 alveoli per voxel. For each voxel of this binary mask we generate a planar grid of the 130 alveoli that is used as a computational mesh.
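The estimate of alveoli per voxel follows directly from the voxel volume and the alveolar density quoted above, as this short check shows. Variable names are illustrative.

```python
ALVEOLI_PER_MM3 = 170              # mean alveolar density (Ochs et al., 2004)
VOXEL_SPACING_MM = (0.78, 0.78, 1.25)

voxel_volume_mm3 = 1.0
for spacing in VOXEL_SPACING_MM:
    voxel_volume_mm3 *= spacing    # 0.78 * 0.78 * 1.25 mm^3

alveoli_per_voxel = ALVEOLI_PER_MM3 * voxel_volume_mm3
print(f"voxel volume: {voxel_volume_mm3:.4f} mm^3")
print(f"alveoli per voxel: {alveoli_per_voxel:.1f}")
```

The result is slightly above 129, which is rounded to the 130 alveoli per voxel used for the computational grid.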

#### Particle Deposition and Simulation

As detailed in **Figure 2**, once the patch has been segmented, random pixels of the parenchyma and their neighbors are marked as "affected" and, for each one, a new simulation of the AB and FE models is run. Final results are mapped back into the main image patch and the updated mechanical properties calculated by the coupled AB-FE model are linearly translated back into HU values. In this way, we simulate the typical darkening of the CT scan caused by emphysema progression.

#### High Performance Computing

Among many different frameworks available for AB modeling (Abar et al., 2017), we chose to use Pandora (Rubio-Campillo, 2014), for its ease of programming and superb scalability. The model is implemented in an in-house version, specifically modified to allow biological model developments and available online<sup>3</sup> .

<sup>3</sup>https://bitbucket.org/mrceresa/pandora


#### TABLE 2 | Parameters experiments.

Values of the parameters used in the 2<sup>6−2</sup> = 16 experiments.

In order to satisfy the high demand for computational resources, we ran the simulations on our institution's supercomputing SNOW Linux cluster. The cluster is currently composed of 20 computing nodes and a total of 840 cores, with a theoretical calculation capacity of 8.49 Tflops. Highly relevant for agent simulations were six GeForce GTX TITAN X GPUs with 12 GB of memory each.

### RESULTS AND DISCUSSION

In this section we present the results of the two main experiments that we have used to validate our implementation. Those experiments were previously discussed in detail respectively in sections Experiment to Characterize Parameter Sensitivity and Experiment to Study the Emphysema Progression in Clinical Images.

#### Parameter Sensitivity and Model Analysis

The results of the Exposure experiment are shown in **Figures 5A,B**. **Figure 5A** illustrates the effect of changing the number of particles inhaled for each simulated smoking exposure and the total time spent smoking. When exposure is zero, the model is able to capture that the tissue should remain healthy no matter how long the simulation runs. However, as both the exposure and the total time spent smoking increase, the tissue starts getting damaged, independently of the values of the rate constants. For low to medium exposures, the implicit stochasticity of the AB model and the variability of the rate constants lead to some fluctuations of the results as a function of the smoking time, but tissue life is always reduced by at least 50%. For higher exposures, tissue damage is irreversible and continues even after smoking cessation is simulated, as shown in **Figure 5B**. These outcomes reflect the fact that smoking frequency and exposure are considered among the main risk factors for the development of COPD and emphysema (Yoshida and Tuder, 2007; Liu et al., 2008).

In the model, the mechanical damage largely depends on the regulation of the collagen content, because the stiffness of this macromolecule is two orders of magnitude greater than the stiffness of elastin. The degradation of collagen is heavily affected by TNFα through the recruitment of monocytes (Equation 1) and the activation of macrophages into the M1 type (Equation 3), with positive feedback loops generated through IL1 (Equations 3, 7) and TNFα (Equation 6). In contrast, the activation of macrophages into the M2 type is promoted by the anti-inflammatory cytokine IL10 (Equation 4), the production rate of which receives positive feedback from M2 macrophages (Equation 5). While IL10 inhibits TNFα and IL1 (Equations 6, 7), it is also self-inhibited (Equation 5). Hence, the anti-inflammatory effect of IL10 is limited compared to the strong inflammatory effects of TNFα and IL1, because there is less positive feedback in favor of the promotion of M2 activated macrophages. In the Parameters experiment, the exposure and total smoking time parameters were set to 10 particles and 50 time steps, respectively, to ensure that the system would be in a medium damage situation. Results were all very similar and the tissue life in each time step only varied with an average standard deviation of 0.166 units, revealing that the above interpretation of the model holds true regardless of the variation of the rate constants within the considered ranges of values.

According to the analysis of the model equations, the promotion of the anabolic TGFβ (Equation 8) should be limited compared to the promotion of the catabolic MMP (Equation 12), and the persistent inflammation induced by particles should promote the unequivocal destruction of collagen (Equation 11). Nevertheless, we sometimes saw an increase in the mean amount of collagen. This outcome may be due to the mechanical feedback and mechanical tissue damage, which promote the secretion of TGFβ and give additional weight to collagen anabolism (Equations 9, 10). In our model, emphysema progression was indeed related to sustained inflammation that continued after smoking cessation (Willemse et al., 2005), but required the additional effect of DAMPS to relate inflammation, altered tissue turnover and tissue mechanics to endothelial cell death. This phenomenon needs to be further explored in a more mechanistic way, but our approximation of DAMPS effects allows qualitative validation of the simulated mechanisms for emphysema progression against clinical data (see below).

#### Emphysema Progression in Clinical Images

To test whether our model is able to produce images similar to those seen by clinicians, we use the public emphysema database described in section Experiment to Study the Emphysema Progression in Clinical Images. Images of some of the patches representative of the data we used in the experiment are presented in **Figure 6**. Parenchyma destruction in emphysema is strongly associated with decreased HU absorption value in CT images, and many image descriptors are commonly used to (semi-)automatically detect emphysema progression in CT images (Stern and Frank, 1994; Gevenois and Yernault, 1995; Madani et al., 2006). In the present study, emphysema progression is quantified through the well-known Mean Lung Density (MLD) (Heremans et al., 1992).
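As an illustration of the descriptor, MLD can be computed as the mean HU value over the pixels that belong to the parenchyma mask. The nested-list data layout and the function name below are assumptions of this sketch, and the example values are made up for demonstration.

```python
from statistics import fmean


def mean_lung_density(hu_patch, parenchyma_mask):
    """Mean HU over masked pixels of a 2D patch (lists of rows)."""
    values = [
        hu
        for hu_row, mask_row in zip(hu_patch, parenchyma_mask)
        for hu, keep in zip(hu_row, mask_row)
        if keep
    ]
    if not values:
        raise ValueError("mask selects no pixels")
    return fmean(values)


# Made-up 2x2 patch: three parenchyma pixels and one vessel pixel (excluded).
patch = [[-900, -850],
         [-700, 40]]
mask = [[True, True],
        [True, False]]
mld = mean_lung_density(patch, mask)
```

A more negative MLD indicates more air and less tissue in the patch, which is why the score decreases as emphysema becomes more severe.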

We quantify all the patches from the database and group them by degree of emphysema severity. As can be seen from **Figure 7**, images with increasing emphysema severity also have a lower MLD score. Differences of more than 40–60 HU between group (1) and groups (2) and (3) are significant, with a p-value of less than 0.01.

Once the association between emphysema progression and MLD score is determined, we take all 69 patches annotated with low or no emphysema and use them as input for our model. Images are quantified with MLD before and after model execution, and the differences are tested for statistical significance with a t-test.

As we can see in **Figure 8**, there is a statistically significant difference of about 30 HU between the baseline and progression groups with a p-value of less than 0.001. We thus conclude that our implemented model is able to simulate changes that are in agreement with the progression of emphysema in clinical images quantified by MLD.

#### Scaling

The parallelization strategy for the AB part consists in assigning a job to each patch, as patches are completely independent from each other, and then recursively creating a new job per voxel, which is the smallest unit we can parallelize for now. With 69 patches to process and 200 seeds per patch plus their four closest neighbors, this resulted in 69,000 jobs. The AB code uses OpenMP to parallelize the execution of the agents. For the FE part, we use the MPI capability of the Elmer solver to partition the mesh and distribute the computation over 5 MPI processes per job. Finally, the coupling between AB and FE is executed sequentially with a Python script. Most of the coupling code relies on libraries for which C code bindings are available (numpy, scipy, and skimage) and, thus, runs at almost native speed. During a single job execution, the computational time is taken mostly by the AB model (54%), then by the FE solver (37%) and finally by the coupling part (9%). Each job took about 40 min on the cluster and 1 h on a workstation computer (Intel i7 with 32 GB of RAM). A significant amount of the time of the coupled simulations was spent on writing files to disk to share data between the solvers. This could be reduced in future work by using faster SSD disks or in-memory access. All jobs would have taken months to be processed sequentially on the abovementioned workstation, but required 8 days on our cluster, using a maximum of 256 simultaneous jobs. The use of HPC therefore resulted in several orders of magnitude of computational time reduction, making the present study actually feasible. All jobs in the cluster use Sun Grid Engine.
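The job count and the cluster time quoted above can be cross-checked with simple arithmetic. The wall-time figure below is an idealized estimate that assumes all 256 job slots are busy for the whole campaign and ignores queueing and I/O contention, so it should be read as a lower bound.

```python
N_PATCHES = 69
SEEDS_PER_PATCH = 200
NEIGHBORS_PER_SEED = 4           # each seed is simulated with its 4 neighbors
MINUTES_PER_JOB_CLUSTER = 40
MAX_SIMULTANEOUS_JOBS = 256

n_jobs = N_PATCHES * SEEDS_PER_PATCH * (1 + NEIGHBORS_PER_SEED)

# Idealized wall time with perfect packing of the job slots (assumption).
cluster_minutes = n_jobs * MINUTES_PER_JOB_CLUSTER / MAX_SIMULTANEOUS_JOBS
cluster_days = cluster_minutes / 60 / 24

print(f"jobs: {n_jobs}")
print(f"idealized cluster wall time: {cluster_days:.1f} days")
```

The idealized estimate of about 7.5 days is consistent with the 8 days reported for the full campaign.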


### CONCLUSIONS AND FUTURE WORKS

In this paper we conceived, developed and tested a high-performance multi-scale agent-based model of lung parenchyma evolution after repeated exposure to solid irritants, such as the particles that reach the lung while smoking. We modeled the simplified behavior of immune system cells such as alveolar macrophages and neutrophils, and also of cells in charge of wound healing mechanisms such as fibroblasts. Finally, the tissue behavior under the forces present in the lung during respiration was modeled using an elastic FE model. An initial analysis of the sensitivity of the model to parameter variations confirmed (i) the ability of the model to point out particle inhalation as a major risk factor in emphysema pathogenesis, and (ii) the strong inertia of the catabolic shift of cell activity due to sustained inflammation, which resulted in sustained damage to most of the tissue. A preliminary validation of the capacity of the model to cause a significant change was performed against clinical images, on 69 cases from a public database of CT images affected by emphysema progression.

To the best of our knowledge, this model advances the state of the art because it: (1) includes a more detailed molecular model of inflammation and tissue remodeling; (2) uses a FE solver to calculate the response to mechanical loads, thus allowing for future extensions where arbitrarily complex tissue constitutive equations could be used; (3) has a bi-directional coupling between the AB and FE models; (4) exploits HPC technologies so that enough tissue can be simulated to start validating against imaging data; and (5) uses clinical CT images to perform an initial validation of the capacity of the model. Implementing a system of coupled ODEs in an AB model has the great advantage over a well-mixed model of taking into account the spatial aspect and the formation of self-sustaining spatial patterns that substantially affect the equilibrium points of the system (Brown et al., 2011).

The present model has, of course, several limitations. Simplifications were still made in the immune response and the mechanical model. In particular, the relative importance of DAMPS in the validated model suggests that a more mechanistic development of this biological phenomenon is necessary. Additionally, while in this implementation of the model we used a 2D mapping between the alveolar exchange surface and the computational grid, in future work we will explore the effect of extending the connectivity of the tissue to 3D. On top of that, we do not account for heterogeneous tissue structures such as airways or blood vessels. However, we plan to do so in a future extension, as the relevant information is already present in the CT images used to initialize the model. The effect of the mesh size and topology should be further explored. In a follow-up study we plan to automatically find the best parameters and to better characterize the impact of the mesh on the stability of the solution with a convergence study.

Also, the validation is still somewhat limited, as no histological comparison with ex-vivo animal models could be performed and the validation on clinical CT images is limited to one clinical descriptor, namely the MLD score. As future work, we are planning a retrospective study with COPD patients with 1 year of follow-up. Of course, the real ground truth should be histology, which is unfortunately very difficult to obtain in human subjects. A promising alternative is to use mouse models of emphysema.

We suggest that such a model, once properly extended and calibrated with histological and clinical data, could be useful to improve patient classification and prediction of exacerbations and thus contribute to the selection of a personalized therapy.

## AUTHOR CONTRIBUTIONS

MC: conceived the research, designed the model, generated and analyzed the experimental data, validated the implementation and wrote the paper; AO: implemented part of the ABM rules and helped with the collection and analysis of the experimental data; JN: helped with the preparation of the FE model, the writing of the paper and the analysis of the results; MG: oversaw the project and revised the paper. All authors read and approved the final version of the manuscript.

### FUNDING

We gratefully acknowledge the NVIDIA Corporation for the donation of part of the hardware used for this research. This work was supported by the Spanish Ministry of Economy and Competitiveness (Project INSPIRE FIS2017-89535-C2-2-R, Ramon y Cajal contract RYC-2015-18888, Maria de Maeztu Units of Excellence Program MDM-2015-0502), and by the QUAES Foundation (Chair QUAES-UPF Computational Technologies for Healthcare).

#### REFERENCES

inflammation: inflammatory macrophages do not die locally, but emigrate to the draining lymph nodes. J. Immunol. 157, 2577–2585.


approaches and two recent extensions. Comput. Methods Appl. Mech. Eng. 314, 473–493. doi: 10.1016/j.cma.2016.08.010


and Engineering Investigations on Protective Artificial Respiration (Berlin; Heidelberg: Springer-Verlag), 1–32.


human osteoblasts on implant materials. Biomaterials 24, 2013–2020. doi: 10.1016/S0142-9612(02)00616-6

Zhao, D., Zhou, Y., Jiang, C., Zhao, Z., He, F., and Ran, P. (2017). Small airway disease: a different phenotype of early stage COPD associated with biomass smoke exposure. Respirology 23, 198–205. doi: 10.1111/resp.13176

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Ceresa, Olivares, Noailly and González Ballester. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Modeling Patient-Specific Magnetic Drug Targeting Within the Intracranial Vasculature

Alexander Patronis<sup>1†</sup>, Robin A. Richardson<sup>1†</sup>, Sebastian Schmieschek<sup>1†</sup>, Brian J. N. Wylie<sup>2</sup>, Rupert W. Nash<sup>3</sup> and Peter V. Coveney<sup>1</sup>\*

<sup>1</sup> Centre for Computational Science, University College London, London, United Kingdom, <sup>2</sup> Jülich Supercomputing Centre, Forschungszentrum Jülich, Jülich, Germany, <sup>3</sup> Edinburgh Parallel Computing Centre, University of Edinburgh, Edinburgh, United Kingdom

#### Edited by:

Timothy W. Secomb, University of Arizona, United States

#### Reviewed by:

Jacopo Biasetti, Johns Hopkins University, United States

Zhihui Wang, The University of Texas at Austin, United States

#### \*Correspondence:

Peter V. Coveney p.v.coveney@ucl.ac.uk

†These authors have contributed equally to this work.

#### Specialty section:

This article was submitted to Computational Physiology and Medicine, a section of the journal Frontiers in Physiology

Received: 12 December 2017 Accepted: 16 March 2018 Published: 19 April 2018

#### Citation:

Patronis A, Richardson RA, Schmieschek S, Wylie BJN, Nash RW and Coveney PV (2018) Modeling Patient-Specific Magnetic Drug Targeting Within the Intracranial Vasculature. Front. Physiol. 9:331. doi: 10.3389/fphys.2018.00331

Drug targeting promises to substantially enhance future therapies, for example through the focussing of chemotherapeutic drugs at the site of a tumor, thus reducing the exposure of healthy tissue to unwanted damage. Promising work on the steering of medication in the human body employs magnetic fields acting on nanoparticles made of paramagnetic materials. We develop a computational tool to aid in the optimization of the physical parameters of these particles and the magnetic configuration, estimating the fraction of particles reaching a given target site in a large patient-specific vascular system for different physiological states (heart rate, cardiac output, etc.). We demonstrate the excellent computational performance of our model by its application to the simulation of paramagnetic-nanoparticle-laden flows in a circle of Willis geometry obtained from an MRI scan. The results suggest a strong dependence of the particle density at the target site on the strength of the magnetic forcing and the velocity of the background fluid flow.

Keywords: magnetic drug targeting, particle suspension, blood flow, lattice-Boltzmann method, multiscale, HemeLB

### 1. INTRODUCTION

The accurate targeting of drugs toward specific regions of the human body promises to enhance future therapies and improve patient quality of life. The adverse effects of medications, such as those caused by chemotherapeutic drugs, may be minimized, while lower dosage requirements may decrease costs (Torchilin, 2000).

Drug targeting can be classified by the means as well as the level at which it is performed (Schleich et al., 2014). Viable mechanisms to enhance selective absorption include, but are not limited to, control of particle (drug carrier) size, addition of biochemical markers to drug carriers, and release of drug payloads within magnetized particles guided by external magnetic fields. Depending on the method employed, the term drug target may designate a certain type of tissue, specific cell type, or a location in space, such as the site of a tumor (Lockman et al., 2002).

Advances in technology have facilitated the production of micro- and nano-structures with great precision (Champion et al., 2007). In addition to the spherical particle carriers used in early experiments, state-of-the-art drug delivery systems incorporate bundles of nanotubes to encase biochemically active components. Such carrier structures can be designed to various specifications (Berry and Curtis, 2003; Tartaj et al., 2003), while a viable compromise between competing requirements may need to be found. For example, larger magnetic particles with micrometre radii are easier to manipulate via external fields, as the forces acting on them are proportional to their volume. On the other hand, the use of smaller particles (with dimensions of order tens of nanometres) has been found to enhance bioavailability and drug lifetime in vivo (Pankhurst et al., 2003; Nacev et al., 2012). Furthermore, the emergence of super-paramagnetic behavior, a finite-size effect that occurs for particle sizes below ∼40 nm (Ulbrich et al., 2016), can substantially increase magnetic susceptibility, and hence enhance the response of particles to an external magnetic field. The use of such nanoparticles has received much attention in recent years, and the purpose of this paper is to report on the simulation of these, so as to inform on their design and aid future efforts.

The optimization of carriers and functionalization for drug targeting typically involves in vivo experiments and the sacrifice of animals. In this context, computational models can help to reduce the experimentation required. Within personalized medicine, the simulation, ahead of treatment, of magnetized particle suspensions in patient-specific geometries of vasculature derived from medical imaging data would permit the selection of magnetic fields to control drug targeting.

There is significant interest in using magnetic drug targeting (MDT) for the treatment of diseases such as cancer (Tietze et al., 2012), due to the need to maximize damage to tumor cells (via the injection of highly toxic chemotherapeutic drugs) while keeping the exposure to healthy tissue in the remainder of a patient's body within tolerable levels. There have been several preclinical studies (Lübbe et al., 1996a; Goodwin et al., 1999; Alexiou et al., 2000), with a phase I clinical human trial carried out by Lübbe et al. using a single permanent magnet to concentrate epidoxorubicin-coated magnetic nanoparticles within shallow, inoperable tumors (Lübbe et al., 1996b, 2001), but with a number of issues identified (Shapiro et al., 2015). A major goal of MDT is to reach targets (e.g. tumors) deeper within the body, but different locations can require very different magnetic nanoparticle properties. In vitro experiments with flow phantoms can be used to determine the behavior of magnetic nanoparticles with different physiological and physical parameters (Radon et al., 2017). Simulation work by Nacev et al. suggests the use of a feedback control algorithm that modifies the applied magnetic field based on accurate real-time information on the distribution of particles (in principle obtainable from imaging) to focus the particles (on average) at a particular site (Nacev et al., 2012).

To be of most value in real-world systems, MDT simulations must include a range of physical phenomena. Furthermore, to resolve processes on relevant time and length scales, the simulation tools used must be computationally efficient. The ideal model would account for the mechanical properties of vessel walls, the complex rheological behavior of blood and its particulate nature, external magnetic fields, gravity, etc. However, careful evaluation and control of the errors arising from different modeling assumptions and simplifications should enable reduced (and computationally efficient) models to be used with accuracy and reliability in clinical decision support. Moreover, multiscale models can inform coarse-grained parametrization by quantifying effective parameter values.

There has been considerable development of models for MDT, focussing on the various scales and features of interest. Significant effort has been expended in modeling the MDT-relevant properties of the nanoparticle cores themselves (Winkler, 2017), e.g. through the use of the generalized finite element method (Plaks et al., 2003). The behavior of such nanoparticles in blood flow through simplified geometries has been explored using computational fluid dynamics (CFD) techniques such as the lattice-Boltzmann method (LBM) (in a simple channel) (Kandelousi and Ellahi, 2015), or the finite volume method (in a vessel bifurcation) (Larimi et al., 2014). Kenjereš and Righolt (2012) apply the conservation equations of mass and momentum (with an additional model describing a very dilute particle phase) for the simulation of blood flows carrying magnetic drug particles. Rukshin et al. modeled the motion of super-paramagnetic nanoparticles in a Poiseuille flow under the influence of an external magnet, taking into account the effects of Brownian motion and interactions with red blood cells, to determine particle arrival at the designated tumor site (found to depend dominantly on particle size, Rukshin et al., 2017).

In this work we aim to tackle comparatively much larger systems, taking as an exemplar a patient-specific, three-dimensional vascular system (the circle of Willis), and concern ourselves with determining the fraction of injected particles that reach a defined target site under varying physical parameters (of the nanoparticles) and physiological states (of the patient). We do not consider absorption into tissue at the target site, magnetically induced heating, biochemical reactions, or any other aspects specific to local treatment. Our strategy for the simulation of such a system relies on the LBM, which is highly efficient on massively parallel architectures, i.e. it utilizes many compute units in an efficient manner (in section 4.2.1 we demonstrate strong scaling to approximately 100,000 cores). It is this parallel performance that makes simulations of systems of this size feasible.

In this article we report on the integration of paramagnetic particles into HemeLB, an open-source lattice-Boltzmann code that is optimized for the large-scale simulation of sparse geometries on high performance computing resources (Mazzeo and Coveney, 2008). HemeLB is used for blood flow analysis (Bernabeu et al., 2013; Nash et al., 2014), and has been applied to gain insight into angiogenesis (Bernabeu et al., 2014) and vascular flow under different boundary conditions (Itani et al., 2015). Here, we assess the potential of HemeLB to evaluate magnetic drug targeting strategies in the context of personalized medicine. We develop, implement and validate a model for the simulation of magnetic particles in the circle of Willis, the central blood distribution system in the brain.

### 2. MATERIALS AND METHODS

### 2.1. Blood Flow by the Lattice-Boltzmann Method

We simulate the flow of blood by the lattice-Boltzmann method (LBM), and assume incompressible flow at low Mach numbers. Our current approach approximates blood as a Newtonian fluid at a characteristic viscosity; for the systems presented herein, this provides a good approximation, and minimizes computational effort. Note that HemeLB allows for the simulation of non-Newtonian behavior, which may be used in conjunction with the particle model (Bernabeu et al., 2013).

The lattice-Boltzmann method describes fluid dynamics via a mesoscale approach. This replaces the single-particle distribution function f(**x**, **c**, t) (at a position **x**, continuous velocity **c**, and time t) of the Boltzmann equation with a distribution function f<sub>i</sub>(**x**, t), where velocity space is reduced to a discrete set {**c**<sub>i</sub>}. After discretization in space and time, we have the lattice-Boltzmann equation (LBE),

$$f\_i(\mathbf{x} + \mathbf{c}\_i \delta\_t, \ t + \delta\_t) - f\_i(\mathbf{x}, t) = -\Omega\_i(f\_i(\mathbf{x}, t), f\_i^0(\mathbf{x}, t)) + \delta\_t \mathcal{F}\_i(\mathbf{x}, t) \tag{1}$$

which describes the evolution of f<sub>i</sub> by the streaming (left-hand terms) and collision terms. The last term in Equation (1) reproduces the effects of a hydrodynamic body force. Time is incremented by δ<sub>t</sub> during each propagation step, and the discrete equilibrium distribution function f<sub>i</sub><sup>0</sup> approximates the Maxwell-Boltzmann equilibrium distribution function to second order. The full derivation of the second-order accurate integration scheme for the forced LBE can be found in Nash et al. (2008).

Like the Bhatnagar-Gross-Krook (BGK) model of kinetic theory, the lattice Bhatnagar-Gross-Krook (LBGK) model describes particle collisions as a relaxation toward a local equilibrium, i.e.

$$
\Omega\_i = \frac{1}{\tau} \left[ f\_i - f\_i^0 \right] \tag{2}
$$

Herein, relaxation toward equilibrium on a single time scale τ is assumed. It can be shown that this approach approximates the Navier-Stokes equations (NSE) to second order (Qian and Orszag, 1993). For the purposes of this study, the LBGK collision model is used exclusively due to its simplicity.
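To make the LBGK relaxation concrete, the following is a minimal sketch of the collision step of Equations (1) and (2) at a single D3Q19 lattice site, in lattice units with δ<sub>t</sub> = 1. It is illustrative only and is not taken from HemeLB: the function names are hypothetical, the body-force term of Equation (1) is omitted, and the equilibrium uses the standard second-order expansion with the usual D3Q19 weights (1/3, 1/18, 1/36).

```python
"""LBGK collision at a single D3Q19 lattice site (lattice units, delta_t = 1)."""

# D3Q19 velocity set: 1 rest, 6 axis-aligned and 12 face-diagonal directions.
AXIS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
DIAG = [(x, y, z)
        for x in (-1, 0, 1) for y in (-1, 0, 1) for z in (-1, 0, 1)
        if abs(x) + abs(y) + abs(z) == 2]
C = [(0, 0, 0)] + AXIS + DIAG
W = [1 / 3] + [1 / 18] * 6 + [1 / 36] * 12


def moments(f):
    """Return density and velocity (zeroth and first moments of f)."""
    rho = sum(f)
    u = tuple(sum(fi * c[k] for fi, c in zip(f, C)) / rho for k in range(3))
    return rho, u


def equilibrium(rho, u):
    """Second-order expansion of the Maxwell-Boltzmann distribution."""
    usq = sum(uk * uk for uk in u)
    feq = []
    for w, c in zip(W, C):
        cu = sum(ck * uk for ck, uk in zip(c, u))
        feq.append(w * rho * (1 + 3 * cu + 4.5 * cu * cu - 1.5 * usq))
    return feq


def collide(f, tau):
    """Relax f toward local equilibrium on the single time scale tau."""
    rho, u = moments(f)
    feq = equilibrium(rho, u)
    return [fi - (fi - fe) / tau for fi, fe in zip(f, feq)]


# Example: perturb an equilibrium state and apply one collision.
f0 = equilibrium(1.0, (0.01, 0.0, -0.005))
f0[1] += 1e-3
f1 = collide(f0, tau=0.6)
```

Because the equilibrium is built from the moments of the pre-collision populations, the collision leaves the local density and momentum unchanged; only the non-equilibrium part of f<sub>i</sub> decays.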

#### 2.1.1. Parametrization and Scaling

The lattice-Boltzmann method, as presented here, is athermal. The equation of state for a single fluid component, analogous to that of an ideal gas, relates the pressure to the lattice density ρ: p = ρc<sub>s</sub><sup>2</sup>. The lattice speed of sound c<sub>s</sub> for D3Q19, the three-dimensional 19-velocity lattice, which is used throughout, is equal to 1/√3. The simulation parameters δ<sub>x</sub> (spatial discretization, i.e. the lattice spacing), δ<sub>t</sub> (temporal discretization, i.e. the time-step length), and δ<sub>m</sub> (the lattice mass) scale length, time and mass, respectively, such that the physical speed of sound is equal to c<sub>s</sub>δ<sub>x</sub>/δ<sub>t</sub> and energy is non-dimensionalized by

$$
\delta\_m \cdot \delta\_x^2 \cdot \delta\_t^{-2} \tag{3}
$$

Despite the athermal nature of the fluid model (by the LBM, which can be extended to give a thermal lattice-Boltzmann model), thermal energy k<sub>B</sub>T (where k<sub>B</sub> is the Boltzmann constant and T is temperature) is considered in the calculation of a noise term, to be discussed in section 2.2, emulating the Brownian motion of particles (specifically, k<sub>B</sub>T appears in our calculation of particle diffusion by the Stokes-Einstein equation). True to the parametrization of blood flow, we choose a temperature of 310.15 K or 37 °C.

To ensure consistent viscous behavior for a given set of scaling parameters, the dynamic viscosity

$$
\mu = 0.004 \text{ Pa s} \tag{4}
$$

and density of blood plasma

$$
\rho\_b = 1000 \,\mathrm{kg} \,\mathrm{m}^{-3} \tag{5}
$$

are used to calculate relaxation parameters for the collision process. Note that, strictly speaking, µ is a function of the hematocrit (Pries et al., 1992). The lattice (kinematic) viscosity ν is related to the relaxation time τ by

$$\nu = c\_s^2 \left( \tau - \frac{\delta\_t}{2} \right) \text{ or, in our case, } \nu = \frac{1}{3} \left( \tau - \frac{1}{2} \right) \tag{6}$$

For numerical stability, the viscosity must be sufficiently large, i.e. τ > 0.5 (the limit of inviscid flow). In addition to this, the flow velocity must remain low relative to the speed of sound. We impose the Mach number limit Ma = u/c<sub>s</sub> < 1/30, corresponding to a maximum velocity of u<sub>max</sub> ≈ 0.02 in lattice units.
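As a worked example of this parametrization, the sketch below converts the physical viscosity and density of Equations (4) and (5) into a relaxation time by inverting Equation (6), using the lattice spacing, time-step and peak inlet velocity quoted later in section 2.3. It assumes that the Mach number is the ratio of the lattice velocity to the lattice speed of sound; the variable names are illustrative.

```python
import math

# Physical parameters quoted in the text.
mu = 0.004        # dynamic viscosity, Pa s (Equation 4)
rho_b = 1000.0    # density, kg m^-3 (Equation 5)
dx = 25e-6        # lattice spacing, m (section 2.3)
dt = 7.8e-7       # time-step, s (section 2.3)
u_phys = 0.63     # peak inlet velocity, m s^-1 (section 2.3)

cs = 1 / math.sqrt(3)               # D3Q19 lattice speed of sound

nu_phys = mu / rho_b                # kinematic viscosity, m^2 s^-1
nu_lat = nu_phys * dt / dx**2       # the same viscosity in lattice units
tau = nu_lat / cs**2 + 0.5          # invert Equation (6) with delta_t = 1

u_lat = u_phys * dt / dx            # peak velocity in lattice units
mach = u_lat / cs                   # assumed definition: Ma = u / c_s

print(f"tau = {tau:.4f}, u_lat = {u_lat:.4f}, Ma = {mach:.4f}")
```

With these inputs the relaxation time comes out at roughly 0.515, just above the stability limit of 0.5, and the peak lattice velocity is close to 0.02, in line with the limit quoted above.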

### 2.2. Magnetized Particles

Our strategy for the computationally-efficient simulation of paramagnetic particles suspended in blood combines an approach for the simulation of point-like particles (accounting for particle-fluid interaction) with a dipolar model. This pairing enables users of HemeLB, including clinicians and medical scientists, to study the efficacy of magnetic nanoparticles as a drug delivery system under the influence of an external magnetic field. We are particularly interested in understanding how such particles can be directed to problem sites, e.g. to the location of an inaccessible (by invasive procedures) tumor.

#### 2.2.1. Model for Suspended Particles

Our approach for the simulation of dilute suspensions, with particle sizes that are orders of magnitude smaller than the lattice spacing δ<sub>x</sub>, was developed with computational efficiency in mind; we aim to inform clinical decision-making, a time-critical process. The model is parameterized by particle radius a, position **x**<sub>p</sub> and velocity **u**<sub>p</sub>. An efficient coupling mechanism is employed by neglecting particle inertia.

We list the forces that our implementation can apply to a paramagnetic particle (if, for a particular configuration, a forcing mechanism has a negligible impact on particle dynamics, it is deactivated to minimize computational effort): (1) a constant gravitational field; (2) hydrodynamic (Stokes') drag, due to the viscosity of the fluid (blood); (3) a (generally attractive) magnetic force due to paramagnetism; (4) a lubrication force, introduced to satisfy the wall-boundary condition on vessel walls and prevent the overlap of interacting particles; and (5) a stochastic force **F**<sub>R</sub> (Brownian noise). For a paramagnetic particle under the action of these forces, we obtain (by balance of forces) the following for its motion:

$$m\dot{\mathbf{u}}\_p = -6\pi\,\mu a\left[\mathbf{u}\_p - \mathbf{v}(\mathbf{x}\_p)\right] + \mathbf{F} + \mathbf{F}\_R \tag{7}$$

where **F** is the combined sum of forces 1, 3, and 4 (excepting drag and **F**<sub>R</sub>), and **v** is the (interpolated) fluid velocity at the location **x**<sub>p</sub>. By neglecting particle inertia, the left-hand side vanishes, and the hydrodynamic drag must balance the external forces on the particle. With the mobility β = 1/(6πµa), the motion of a non-inertial particle in a dilute suspension can be expressed as

$$
\mathbf{u}\_p = \mathbf{v}(\mathbf{x}\_p) + \beta(\mathbf{F} + \mathbf{F}\_R) \tag{8}
$$

which is dependent on the interpolated fluid velocity **v**(**x**<sub>p</sub>) and the associated force terms. Note that, in Equation (8), the effects of Brownian noise are only introduced through **F**<sub>R</sub>; the fluid velocity **v** is deterministic. Noise is computed (by applying the fluctuation-dissipation theorem) to model the effects of Brownian motion.

In general, where the particle size is large relative to the lattice spacing δ<sub>x</sub>, a correction to the radius of the particle is required (Ladd, 1994; Nguyen and Ladd, 2002). Because we restrict our attention to the simulation of particles that are much smaller than the lattice spacing δ<sub>x</sub> (the largest radius we consider is 0.5 µm with δ<sub>x</sub> = 25 µm), we do not concern ourselves with the calculation of this correction. We similarly neglect the Faxén contributions in the particle equation of motion (Boivin et al., 1998; Horwitz and Mani, 2016), Equation (8) (discussed in section 5).
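The non-inertial update of Equation (8) can be sketched in a few lines. The example below is a hedged illustration rather than the HemeLB implementation: it assumes an explicit Euler step, and it assumes one common discretization of the Brownian force in which each component has variance 2k<sub>B</sub>T/(βδ<sub>t</sub>), so that the resulting random displacement has variance 2Dδ<sub>t</sub> with the Stokes-Einstein diffusion coefficient D = k<sub>B</sub>Tβ. All function names are hypothetical.

```python
import math
import random

K_B = 1.380649e-23   # Boltzmann constant, J K^-1


def mobility(mu, a):
    """Stokes mobility beta = 1 / (6 pi mu a)."""
    return 1.0 / (6.0 * math.pi * mu * a)


def brownian_force(mu, a, temperature, dt, rng):
    """Random force with variance 2 k_B T / (beta dt) per component (assumed form)."""
    sigma = math.sqrt(2.0 * K_B * temperature / (mobility(mu, a) * dt))
    return [sigma * rng.gauss(0.0, 1.0) for _ in range(3)]


def step(x_p, v_fluid, force, mu, a, temperature, dt, rng):
    """Advance a non-inertial particle by one explicit Euler step of Equation (8)."""
    beta = mobility(mu, a)
    f_r = brownian_force(mu, a, temperature, dt, rng)
    u_p = [v + beta * (f + fr) for v, f, fr in zip(v_fluid, force, f_r)]
    x_new = [x + u * dt for x, u in zip(x_p, u_p)]
    return x_new, u_p


# Example: a 65 nm particle in blood at 310.15 K, with no external force.
rng = random.Random(0)
mu, a, T, dt = 0.004, 65e-9, 310.15, 7.8e-7
D = K_B * T * mobility(mu, a)   # Stokes-Einstein diffusion coefficient, m^2 s^-1
x, u = step([0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.0, 0.0, 0.0], mu, a, T, dt, rng)
```

For the parameters in the example, D is of order 10<sup>−12</sup> m<sup>2</sup> s<sup>−1</sup>, so the random displacement per time-step is far smaller than the advective one at typical arterial velocities.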

#### 2.2.2. Dipolar Model

Since the calculation of inter-particle interactions can be costly, we exploit the dilute approximation and employ a simple dipolar model (DM) (Yung et al., 1998; Du and Biswal, 2014) to determine the (attractive) magnetic force between particles (dipoles) i and j, which we assume to be identical. The force on particle i due to particle j is

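$$\mathbf{F}\_M = \frac{3\mu\_0}{4\pi r^5} \left[ (\mathbf{m}\_i \cdot \mathbf{r}\_{ij})\mathbf{m}\_j + (\mathbf{m}\_j \cdot \mathbf{r}\_{ij})\mathbf{m}\_i + (\mathbf{m}\_i \cdot \mathbf{m}\_j)\mathbf{r}\_{ij} - 5r^{-2}(\mathbf{m}\_i \cdot \mathbf{r}\_{ij})(\mathbf{m}\_j \cdot \mathbf{r}\_{ij})\mathbf{r}\_{ij} \right]$$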

where µ<sub>0</sub> is the permeability of free space, **r**<sub>ij</sub> is the connecting vector from j to i (with magnitude r), and **m**<sub>i</sub> = 4πa<sup>3</sup>χ<sub>v</sub>**H**/3 (and similarly for j). Note that we neglect variations in the magnetic field **H** over the size of a particle, and that χ<sub>v</sub> is the effective volumetric susceptibility. We calculate **H** at the position of the interaction following Yung et al. (1998):

$$\mathbf{H} = \frac{1}{4\pi} \left[ (m\_0 \cdot r\_0) \frac{3r\_0}{r\_0^5} - \frac{m\_0}{r\_0^3} \right] \tag{9}$$

where **m**<sub>0</sub> is the magnetic moment of a permanent magnet, and **r**<sub>0</sub> is the vector connecting the magnet and a particle. For the results presented in section 4, **m**<sub>0</sub> is imposed in the x-direction, i.e. perpendicular to the sagittal plane (see **Figure 2**). Equation (9) also gives the force exerted by the magnet on a particle.

We demonstrate the effects of this model by following the trajectories of 5 paramagnetic particles in a three-dimensional Poiseuille flow, as shown in **Figure 1**. A permanent magnet (on the yz-plane passing through the center of the vessel) is placed 0.0022 mm from the centerline. A magnetic moment of **m**<sub>0</sub> = {0.0, 3000.0, 0.0} A m<sup>2</sup> is imposed. The pressure at the inlet (at z = 0) is 0.01 mmHg or 1.33 Pa, resulting in a pressure gradient of 103.9 Pa m<sup>−1</sup>. Initially, the evenly-spaced particles follow the pressure-induced flow, with the particle on the centerline at maximum (flow) velocity. As they approach the magnet, the particles experience a significant force that disrupts their motion; the particle closest to the magnet (i.e. the outermost) is significantly affected, with its streamwise velocity reduced such that it remains near to the wall of the vessel for a considerable time (relative to the other particles). Because the force exerted by the magnet on the particles is larger than the force experienced between particles (owing to paramagnetism), we do not see the trajectories of the particles converge. Note that to avoid divergence of the attractive forces, a lubrication force between particles is applied, ensuring that particles do not overlap.
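The dipolar model of this section can be illustrated with a short sketch: the field of the permanent magnet from Equation (9), the induced moment of a paramagnetic sphere, and the pairwise dipole-dipole force. The code is a minimal sketch written for this text, not HemeLB source; the function names are hypothetical, and the susceptibility used in the usage example is a placeholder, since no value is quoted here.

```python
import math

MU_0 = 4.0e-7 * math.pi   # permeability of free space, H m^-1


def dot(p, q):
    """Scalar product of two 3-vectors."""
    return sum(pi * qi for pi, qi in zip(p, q))


def dipole_field(m0, r0):
    """Field H of a point dipole m0 at displacement r0 from it (Equation 9)."""
    r = math.sqrt(dot(r0, r0))
    mr = dot(m0, r0)
    return [(3.0 * mr * rk / r**5 - mk / r**3) / (4.0 * math.pi)
            for mk, rk in zip(m0, r0)]


def induced_moment(a, chi_v, h):
    """Moment of a paramagnetic sphere, m = (4/3) pi a^3 chi_v H."""
    volume = 4.0 * math.pi * a**3 / 3.0
    return [volume * chi_v * hk for hk in h]


def dipole_force(m_i, m_j, r_ij):
    """Force on dipole i due to dipole j; r_ij points from j to i."""
    r = math.sqrt(dot(r_ij, r_ij))
    mir, mjr, mimj = dot(m_i, r_ij), dot(m_j, r_ij), dot(m_i, m_j)
    pref = 3.0 * MU_0 / (4.0 * math.pi * r**5)
    return [pref * (mir * mj + mjr * mi + mimj * rk - 5.0 * mir * mjr * rk / r**2)
            for mi, mj, rk in zip(m_i, m_j, r_ij)]


# Usage: two 65 nm particles, 1 micron apart, 3 cm from a 3000 A m^2 magnet.
h = dipole_field((3000.0, 0.0, 0.0), (0.03, 0.0, 0.0))
m = induced_moment(65e-9, 1.0, h)          # chi_v = 1.0 is a placeholder value
f = dipole_force(m, m, (1e-6, 0.0, 0.0))   # moments aligned with r_ij
```

Two dipoles aligned head-to-tail along **r**<sub>ij</sub> attract, whereas two dipoles side by side (moments perpendicular to **r**<sub>ij</sub>) repel; in the usage example the moments lie along the separation vector, so the x component of the force on particle i points toward particle j.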

#### 2.2.3. Lubrication Forces

The wall-boundary interaction of particles is modeled by a lubrication force (ten Cate et al., 2002)

$$\mathbf{F}\_{\rm L} = 6\pi \,\mu a^2 (\mathbf{u}\_{\rm p} \cdot \hat{\mathbf{r}}\_{\rm w}) \left[ \frac{1}{h} - \frac{1}{h\_e} \right] \tag{10}$$

with the particle-wall separation h = ‖**r**<sub>w</sub>‖ − a (**r**<sub>w</sub> is the particle-to-wall vector), a cut-off distance h<sub>e</sub> (for numerical efficiency, and dependent on the strength of interactions), and the velocity of the particle **u**<sub>p</sub>. In ten Cate et al. (2002), the force from Equation (10) is compared to experimental data. In section 3.1, our implementation of the boundary condition is validated by comparison with the analytical predictions of Maude (1961).

The lubrication force between two identical particles is similarly given by Nguyen and Ladd (2002)

$$\mathbf{F}\_{\rm L} = \frac{6\pi}{4} \mu a^2 (\mathbf{u}\_{ij} \cdot \hat{\mathbf{r}}\_{ij}) \left[ \frac{1}{h} - \frac{1}{h\_e} \right] \tag{11}$$

with the relative velocity between particles **u**<sub>ij</sub> = **u**<sub>i</sub> − **u**<sub>j</sub>, the separation between particles h = ‖**r**<sub>ij</sub>‖ − 2a, and a cut-off distance h<sub>e</sub>, which is not necessarily equal in value to that used for particle-wall lubrication.
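The two lubrication expressions, Equations (10) and (11), differ only in the prefactor, the velocity used and the definition of the gap h. The sketch below evaluates the scalar on the right-hand side of each; it is illustrative only, with hypothetical function names. Two choices are assumptions not stated in the text: the force is taken to vanish for gaps at or beyond the cut-off h<sub>e</sub>, and the sign convention and direction in which the force is applied are left to the caller.

```python
import math


def _norm(r):
    """Euclidean length of a 3-vector."""
    return math.sqrt(sum(c * c for c in r))


def wall_lubrication(mu, a, u_p, r_w, h_e):
    """Scalar right-hand side of Equation (10) for a particle near a wall."""
    dist = _norm(r_w)
    h = dist - a                       # gap between particle surface and wall
    if h >= h_e:                       # beyond the cut-off: no force (assumption)
        return 0.0
    u_n = sum(u * r / dist for u, r in zip(u_p, r_w))   # u_p . r_w_hat
    return 6.0 * math.pi * mu * a**2 * u_n * (1.0 / h - 1.0 / h_e)


def pair_lubrication(mu, a, u_i, u_j, r_ij, h_e):
    """Scalar right-hand side of Equation (11) for two identical particles."""
    dist = _norm(r_ij)
    h = dist - 2.0 * a                 # gap between the two particle surfaces
    if h >= h_e:                       # beyond the cut-off: no force (assumption)
        return 0.0
    u_n = sum((ui - uj) * r / dist for ui, uj, r in zip(u_i, u_j, r_ij))
    return 1.5 * math.pi * mu * a**2 * u_n * (1.0 / h - 1.0 / h_e)


# Usage: a 500 nm particle moving at 1 mm/s toward a wall 0.5 micron away.
f_wall = wall_lubrication(0.004, 500e-9, (1e-3, 0.0, 0.0), (1e-6, 0.0, 0.0), 1e-6)
```

The expressions diverge as h approaches zero, so a practical implementation would also need to guard against vanishing or negative gaps; that guard is omitted here for brevity.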

### 2.3. Flow Geometry

Acting as the central blood distribution system in the brain, the circle of Willis (coW) connects the inflow from the basilar and internal carotid arteries to the cerebral arteries via a circular system closed by communicating arteries. Studies have found considerable variation in the structure of this system (Kayembe et al., 1984; Eftekhar et al., 2006). Its inherent redundancy allows it to function despite the presence of deformed or missing subsystems.

**Figure 2** depicts a volume rendering of the structure of a complete coW (with lateral dimensions of order cm), obtained from a magnetic resonance imaging (MRI) scan. For details on the generation of this particular geometry, see Coogan et al. (2013). The geometry is used exclusively throughout, and is prepared for use by HemeLB using Palabos' (http://www.palabos.org) fully-parallelized voxelizer (indispensable when voxelizing large geometries with billions of lattice sites); our "common vascular pipeline" allows HemeLB and Palabos to share the same pre-processing workflow.

FIGURE 1 | Trajectories of five paramagnetic nanoparticles (initially placed at the inlet of a three-dimensional Poiseuille flow) as they approach a permanent magnet that is external to the flow (represented by a circle). Deviation from the pressure-induced flow occurs once the magnetic attraction experienced by the particles is sufficiently large; the magnetic field is imposed in the y-direction (indicated by the arrow). The coloring of the trajectories represents the evolution of time. The force exerted by the magnet on each particle far exceeds that experienced between particles; hence, the particles do not converge.

**Table 1** lists the names of the modeled arteries with the boundary conditions employed. Boundary conditions at the inlet are approximated by a parabolic flow profile with a maximum flow speed informed by a 1D Navier-Stokes simulation (performed using PyNS, Manini et al., 2015) of the complete arterial network. The maximum velocity observed in the left internal carotid artery is u<sub>max</sub> ≈ 0.63 m s<sup>−1</sup> (see **Figure 3**). This value, in conjunction with the stability requirements introduced in section 2.1.1 and the spatial discretization δ<sub>x</sub> = 25 µm (resulting in a simulation domain of 1.66 × 10<sup>8</sup> lattice sites), leads to a time-step of 7.8 × 10<sup>−7</sup> s. We use this lattice spacing (δ<sub>x</sub> = 25 µm) throughout, with the exception of section 4.2.1, where we use δ<sub>x</sub> = 15 µm to produce approximately 7.77 × 10<sup>8</sup> lattice sites for our assessment of application scalability. Outlet boundary conditions assume a vanishing pressure gradient.

The inlet boundaries 1, 2, and 3 (see text for details) are parameterized by 1D Navier-Stokes solutions for the full arterial network (Itani et al., 2015; Manini et al., 2015). At the outlet boundaries 4–9, a vanishing pressure gradient is enforced approximating constant pressure. The communicating arteries (10, 11, and 12) close the circular structure; no boundary conditions are applied to the limits of these arteries. The numbering of the inlet/outlet is to be cross-referenced with Figure 2, which shows the full circle of Willis.

obtained by assigning weighting factors (of the peak velocity) to lattice sites that lie on the boundaries.

### 3. IMPLEMENTATION AND VALIDATION

HemeLB is a lattice-Boltzmann implementation optimized for the simulation of sparse geometries by means of indirect addressing of lattice sites. The code is written in C++ and makes use of static polymorphism to allow the efficient selection of different lattice discretizations, collision models and boundary conditions. Parallelization is implemented via MPI. The HemeLB application relies on several external libraries for standardized tasks, such as XML processing, domain decomposition and unit testing (Groen et al., 2013). External tools are available for the creation of input files (including the previously mentioned voxelizer) and the post-processing and evaluation of extracted data. The code is open-source, licensed under the GNU Lesser General Public License (LGPL), and is available at https://github.com/UCL/hemelb.

HemeLB supports D3Q15, D3Q19, and D3Q27 lattice discretizations, that is, three dimensions comprising Q discrete lattice velocities; in this work we limit ourselves to D3Q19. Collision processes can be modeled either by the lattice Bhatnagar-Gross-Krook (LBGK) scheme (as is the case in this work), relying on a single relaxation time, or by invoking a multi relaxation time (MRT) model. Furthermore, a non-Newtonian approximation of a shear-thinning fluid is available. The code supports various wall boundary conditions, including simple bounce-back, Guo-Zheng-Shi (Guo et al., 2002), Bouzidi-Firdaouss-Lallemand (BFL) (Bouzidi et al., 2001) (used exclusively, for its superior accuracy, in this work) and Junk and Yang (2005) (see Nash et al., 2014 for discussion of these).

**Figure 4** illustrates the algorithm which implements the paramagnetic particle model. After the LBM lattice velocity update, the particle update procedure begins. Firstly, particles are communicated between ranks; a particle is only communicated if (by the update of its position at the end of the previous step) it has moved to another rank, or its 3D Moore neighborhood spans multiple ranks (so that the interpolation of the fluid velocity can occur correctly; we refer to these as ghost particles). Once particles have been communicated, we zero the force on each and accumulate the new value as the sum of any external forces. As the fluid velocity is only calculated at lattice sites, interpolation is used to find **v** at **x**p, as required by Equation (8). When mass and volume loading are sufficient (Birzer et al., 2012), the influence of the particles on the flow cannot be neglected. In this case, we enable two-way coupling and the forces exerted on the fluid by locally owned particles are then interpolated onto local lattice sites. The memory of particle momentum is carried by the fluid model, allowing the computational cost to be dramatically reduced (Ahlrichs and Dünweg, 1999; Nash et al., 2008).
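The serial core of the particle update described above (zeroing and accumulating forces, interpolating the fluid velocity, then moving each particle by Equation 8) might be sketched as follows. This is a simplified illustration rather than HemeLB code: the inter-rank communication of particles and the optional two-way coupling are omitted, trilinear interpolation is an assumed choice of interpolation scheme, and the sparse lattice is represented by a hypothetical dictionary mapping integer site coordinates to velocities.

```python
import math


def interpolate_velocity(lattice, x_p):
    """Trilinear interpolation of site velocities to the point x_p (lattice units)."""
    base = [math.floor(c) for c in x_p]
    frac = [c - b for c, b in zip(x_p, base)]
    v = [0.0, 0.0, 0.0]
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                w = ((frac[0] if dx else 1.0 - frac[0])
                     * (frac[1] if dy else 1.0 - frac[1])
                     * (frac[2] if dz else 1.0 - frac[2]))
                site = (base[0] + dx, base[1] + dy, base[2] + dz)
                # Sites absent from the sparse lattice (solid) contribute zero.
                u = lattice.get(site, (0.0, 0.0, 0.0))
                for k in range(3):
                    v[k] += w * u[k]
    return v


def update_particles(particles, lattice, external_forces, beta, dt):
    """One particle update: zero forces, accumulate, interpolate, move."""
    for p in particles:
        p["force"] = [0.0, 0.0, 0.0]
        for force_fn in external_forces:
            f = force_fn(p)
            p["force"] = [a + b for a, b in zip(p["force"], f)]
        v = interpolate_velocity(lattice, p["x"])
        p["u"] = [vk + beta * fk for vk, fk in zip(v, p["force"])]
        p["x"] = [xk + uk * dt for xk, uk in zip(p["x"], p["u"])]


# Usage: a unit cell whose x-velocity equals the site's x index.
cell = {(i, j, k): (float(i), 0.0, 0.0)
        for i in (0, 1) for j in (0, 1) for k in (0, 1)}
particles = [{"x": [0.25, 0.5, 0.5]}]
update_particles(particles, cell, external_forces=[], beta=1.0, dt=1.0)
```

Each entry of `external_forces` is a callable returning a force vector for a particle, which mirrors the way individual forcing mechanisms can be switched off when they are negligible.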

### 3.1. Lubrication Boundary Condition

Wall-boundary conditions for the point-like particle model are implemented by introducing an additional force, Equation (10). We use a constant body force to drive monodisperse particles (of radii a = 25 nm and a = 500 nm) into a wall that is perpendicular to the instantaneous direction of motion. We record the resulting lubrication force experienced by each particle. **Figure 5** shows the lubrication force imposed by the boundary condition as a function of the separation h (the distance of the particle to the wall). The measured lubrication force **F**<sub>L</sub> and h are non-dimensionalized by the drag force **F**<sub>0</sub> = 6πµa**u**<sub>p</sub> and the particle radius a, respectively. A theoretical expression for the lubrication force,

$$\mathbf{F}\_{\rm L} = \mathbf{F}\_0 \left( \frac{9}{8} \frac{a}{h} + 1 \right) \tag{12}$$

has been formulated by Maude (1961). For verification, we compare this to the simulated **F**<sub>L</sub>. As can be seen in **Figure 5**, the lubrication boundary condition approximates the theory well. The observed deviations are a result of the finite size of the simulation time-step, and the particle's non-continuous motion.

### 3.2. Inter-Particle Interactions in an External Magnetic Field

The dipolar model (DM) is evaluated by comparison of the simulated interaction force (obtained from Equation 9 as implemented in HemeLB) between two identical paramagnetic particles (oriented parallel and perpendicular to a constant external field) with solutions of the Laplace equation. **Figure 6** clearly illustrates the isotropy of the approximation of the DM, which neglects contributions of the particle orientation. Note that the force **F**<sub>M</sub> is normalized by the force encountered for touching particles of separation h = 2a; we refer to this maximum force as **F**<sub>0</sub>. As expected, the error increases as h/2a → 1. For separations exceeding h = 3a the approximation becomes more accurate, to within a few percent of the analytical solution. As h is increased further, we observe excellent agreement between the simulated result and theory. As our model requires the suspension to be dilute, the latter case, where h > 3a, will be most likely.

### 4. RESULTS

In this section we present two simulations of paramagnetic particles suspended in blood while circulating in the circle of Willis: (1) a permanent magnet, assumed to be a pure dipole with **m**<sub>0</sub> = {3000, 0.0, 0.0} A m<sup>2</sup>, is held at a distance of 3 cm from the geometric center of the circle of Willis (shown in **Figure 2**), causing the particles to experience an attractive force that brings them together and toward the external magnet (source of the magnetization); (2) the magnet is removed and no attraction exists between any dipoles (paramagnetic particles). In both of these simulations, all other body forces listed in section 2.2.1 are active. The captured flow will first be presented, with illustrations revealing the behavior of particles through the coW, followed by an analysis of the computational performance of HemeLB when simulating such flows.

### 4.1. Simulations of Paramagnetic Particle Suspensions

**Figure 7** shows the transport of nanoparticles through the circle of Willis; initially, particle positions are randomly distributed (without overlap) within a sphere (colored orange in **Figure 7**, and shown only for illustrative purposes; it is not present in the simulation) at inlet 2 of **Figure 2**. Particles are colored by the x component of the magnetic force they experience as they travel. In **Figure 7**, the cyan sphere represents the permanent magnet that is responsible for the magnetic field (with magnetic moment **m**<sub>0</sub> = {3000.0, 0.0, 0.0} A m<sup>2</sup>). The region of interest (RoI), colored pink, is a three-dimensional volume that we are attempting to target (e.g. the site of a tumor) using the nanoparticles. We simulate three cases, varying particle radius a (= 65, 105, and 500 nm) to study the efficacy of the magnet to direct the paramagnetic particles toward a site. Note that although particles are monodispersed (i.e. all of the same size) in all reported simulations, our method fully supports polydispersity (to be exploited in future studies). The visualizations shown here are for a = 65 nm, but particles are not shown to scale.

**Figure 8** presents a comparison of the magnetic force experienced by particles of radius a = 65 nm (top) and a = 500 nm (bottom), the smallest and largest radii considered, at 0.3549 s. The maximum force in the case of a = 65 nm is **F**<sup>M</sup> = −1.144 × 10<sup>−6</sup> N, whereas the maximum force in the case of a = 500 nm is **F**<sup>M</sup> = −5.928 × 10<sup>−4</sup> N; two orders of magnitude separate the maximum forces observed in these cases.

Beyond the small region shown in **Figure 7**, the particles continue to travel through the circle of Willis before exiting through the left anterior cerebral artery (outlet 4), the left middle cerebral artery (outlet 6), and the posterior cerebral artery (outlet 8). **Figure 9** shows the progress of the nanoparticles as they approach the outlets; particles are colored by their velocities. These results demonstrate that we are able to simulate tens of thousands of particles in complex (and sparse) geometries.

#### 4.2. Computational Performance

The strengths of the LBM, with regard to clinical simulation, lie in three key areas: pre-processing, parallel efficiency of simulation (discussed in detail below), and predictability of time-to-solution.

FIGURE 6 | Non-dimensionalized forces acting on pairs of particles oriented parallel (according to theory and the simulation) or orthogonal (according to theory \* and the simulation) to a homogeneous magnetic field. Our simple dipolar model assumes the field is undisturbed by the inter-particle interaction. The validity of this simplification can be justified by considering the disparity in time scales of hydrodynamic and magnetic interactions (the latter can be assumed to occur instantaneously). As expected, the deviation caused by neglect of the rotational contribution is most pronounced as h/2a → 1, where a is the particle radius (of monodispersed particles), and h is the separation between interacting particles.

As a contributor to the time-to-solution, the time required to prepare a geometry for simulation must be factored into the cost of a simulation. Generally speaking, traditional CFD relies on an unstructured-mesh generation procedure to produce a discrete representation of a geometry; complex geometries tend to require high levels of user intervention and considerable CPU time to ensure mesh quality. In comparison, preparation of a geometry for simulation by the LBM requires it to be voxelized: a relatively rapid and simple process that requires little to no user interaction, and only a small fraction of the time-to-solution (since only structured grids are produced). As mentioned previously, we make use of Palabos' voxelization procedure. We use a lattice spacing δ<sup>x</sup> = 25 µm to showcase the capabilities of the drug targeting model, but in practice significantly higher resolution may be required to meet stringent clinical and regulatory standards (e.g. decreasing lattice spacing from 25 to 12 µm results in approximately a 9-fold increase in lattice sites); we benefit greatly from the relative simplicity of voxelization in such instances. Furthermore, because the computational intensity of LBM is predictable (i.e. the variance in the wall-clock time to complete a time-step is minimal), the time-to-solution can be estimated with a high degree of certainty.
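The quoted growth in lattice-site count follows from the cubic dependence of the number of sites on resolution when a fixed volume is voxelized. The following is a minimal sketch of that arithmetic; the function name is illustrative and is not part of HemeLB or Palabos.

```python
def site_count_ratio(dx_coarse_um, dx_fine_um):
    """Factor by which the number of lattice sites grows when a fixed
    3D volume is re-voxelized at a finer lattice spacing."""
    if dx_coarse_um <= 0 or dx_fine_um <= 0:
        raise ValueError("lattice spacings must be positive")
    # Each of the three spatial dimensions is refined by the same factor.
    return (dx_coarse_um / dx_fine_um) ** 3


ratio = site_count_ratio(25.0, 12.0)
print(f"25 um -> 12 um: about {ratio:.1f}x more lattice sites")
```

Memory and per-step work can be expected to grow by roughly the same factor, which is why the cost of geometry preparation matters at clinical resolutions.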

Since the LBM is highly parallelizable (and because HemeLB boasts good performance characteristics relative to other codes, as reported in Groen et al., 2013), we have been able to successfully simulate systems consisting of over 1.5 × 10<sup>9</sup> lattice sites on meaningful time-scales (sufficiently long for most of the particles to have evacuated the geometry), i.e. three cardiac cycles in the case of a resting patient with a heart rate of 68 bpm (using 5,600 ranks of Blue Waters, a petascale supercomputer). In the following section, we present a scalability study of HemeLB using a case consisting of 7.77 × 10<sup>8</sup> lattice sites.

#### 4.2.1. Scalability

We demonstrate that our memory-optimized version of HemeLB is capable of efficiently simulating large problems on hundreds of thousands of cores, highlighting its potential on petaflops (and beyond) computers; the large-scale simulation of the human arterial tree requires such performance (Grinberg et al., 2009). Our efforts to reduce the memory footprint of the Initialize phase (which involves the reading of input files, the decomposition of the domain over multiple ranks, and the creation of the large data structures on which the Simulate phase operates) have allowed for the simulation of flow problems consisting of O(10<sup>9</sup>) lattice sites on Blue Waters. Further work is needed to initialize problems with tens of billions of lattice sites.

Strong scalability of HemeLB (without any particles present, since scalability would be strongly affected by the potential load imbalance caused by the varying distribution of particles) was investigated with the coW15 (15 µm resolution) circle of Willis dataset with 7.77 × 10<sup>8</sup> lattice sites, executed on the ARCHER Cray XC30 system and built using system GCC 5.1.0 compilers.

ARCHER has dual 12-core Intel Xeon E5-2697v2 (Ivy Bridge) 2.7 GHz processors joined by two QPI links, connected via proprietary Cray Aries interconnect in a dragonfly topology. Some compute nodes have 128 GB of shared memory; however, most have only 64 GB. Executions were performed using fully populated compute nodes, i.e. each node is assigned 24 MPI ranks (one process per core).

The substantial memory requirements of HemeLB with the coW15 test case meant that the smallest configuration required 125 compute nodes (3,000 MPI processes), and progressively larger configurations were run with up to 4,000 compute nodes (96,000 MPI processes). Ten thousand simulation timesteps were executed with periodic writing of the simulation data disabled to reduce variability. The simulation wallclock execution time and speed-up relative to the smallest execution configuration are shown in **Figure 10**. Almost a 20-fold speed-up is obtained using 4,000 compute nodes, with 80% parallel efficiency up to 2,000 compute nodes. Note that, by exploiting Streaming SIMD Extensions (SSE), which HemeLB fully supports, we observe a significant ∼15% reduction in simulation time.
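For readers who wish to reproduce this kind of strong-scaling summary from their own timings, the sketch below computes speed-up and relative parallel efficiency with respect to a reference configuration. The node counts are those given in the text; the wall-clock times are placeholders chosen only so that the speed-up equals 20, and the function name is illustrative.

```python
def strong_scaling(t_ref, n_ref, t_n, n):
    """Return (speed-up, relative parallel efficiency) of a run on n nodes
    with respect to a reference run of the same problem on n_ref nodes."""
    speedup = t_ref / t_n
    ideal_speedup = n / n_ref
    return speedup, speedup / ideal_speedup


# Placeholder timings (seconds), NOT measured values from the study.
speedup, efficiency = strong_scaling(t_ref=2000.0, n_ref=125, t_n=100.0, n=4000)
print(f"speed-up {speedup:.1f}, relative efficiency {efficiency:.3f}")
```

This relative measure is normalized by the smallest configuration, so it need not coincide with efficiencies derived from profiling a single run.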

orange sphere, in which particle positions are randomly distributed. The pink volume (internal to the coW) represents some region of interest, e.g. a site requiring therapeutic attention, to which we force particles by virtue of the magnetic field; we record the instantaneous particle count in this region. (A) Particle positions at 0.468 s. (B) Particle positions at 0.546 s.

Performance auditing of HemeLB was done with the open-source Scalasca tool-set (Geimer et al., 2010) for scalable performance analysis of large-scale parallel application executions. Scalasca 2.3.1 with the community-developed Score-P 3.1 instrumentation and measurement infrastructure was used on ARCHER. An instrumented version of HemeLB was prepared with only the main application program and SimulationMaster class selectively instrumented by the GCC compiler, and combined with MPI library interposition. Profiles generated from measured executions were post-processed to derive additional metrics and interactively examined using the Scalasca analysis report explorer.

While the Initialize phase of a simulation (when simulation configuration and domain decomposition occur) requires a roughly constant time to load and distribute the dataset, our primary focus is on the Simulate phase (when time-stepping is performed) with its 10,000 time-steps. MPI rank 0, which monitors the execution and does not process any part of the simulation data, can also be excluded from the analysis.

A breakdown of the Simulate phase CPU time for each execution configuration is shown in **Figure 11**, along with associated efficiencies. There is a negligible amount of MPI collective communication, and the amount of non-blocking point-to-point communication for data exchange decreases in proportion to computation time. Therefore communication efficiency remains above 0.89. Load balance, however, starts at 0.86 and progressively deteriorates to 0.76, such that the overall parallel efficiency degrades to 0.72 using 96,000 cores. This computational load imbalance will be addressed in future optimization work.
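The three efficiencies quoted above are related multiplicatively if one assumes the definitions commonly used in the POP methodology: load balance is the ratio of mean to maximum per-rank computation time, communication efficiency is the ratio of maximum computation time to runtime, and parallel efficiency is their product. The sketch below illustrates this with synthetic per-rank times (not measured data); the function name is hypothetical.

```python
def pop_efficiencies(compute_times, runtime):
    """compute_times: per-rank time spent in useful computation (s).
    runtime: wall-clock time of the measured phase (s).
    Returns (load balance, communication efficiency, parallel efficiency)."""
    avg_c = sum(compute_times) / len(compute_times)
    max_c = max(compute_times)
    load_balance = avg_c / max_c
    communication = max_c / runtime
    parallel = load_balance * communication  # equivalently avg_c / runtime
    return load_balance, communication, parallel


# Synthetic example: four ranks with uneven computational load.
lb, ce, pe = pop_efficiencies([100.0, 80.0, 70.0, 54.0], runtime=105.0)
print(f"load balance {lb:.2f}, communication {ce:.2f}, parallel {pe:.2f}")
```

Under these assumed definitions, a load balance of 0.76 and a parallel efficiency of 0.72 would imply a communication efficiency of about 0.95, consistent with the statement that it remains above 0.89.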

#### 4.2.2. Load Balance

As stated in the previous section, the distribution of particles affects the load balance. Here, we analyse the imbalance during various stages of a full-scale simulation with δ<sup>x</sup> = 25 µm on 350 nodes (5,600 cores) of Blue Waters, a petascale supercomputer at the National Center for Supercomputing Applications (NCSA). **Figure 12** presents the performance of HemeLB under a simulation of 73,215 nanoparticles injected through the left and right internal carotid arteries, and the basilar artery (all three inlets to the circle of Willis, as shown in **Figure 2**). Load imbalance due to the accumulation of particles on a few ranks (as seen in frame a of the figure) results in an average of 33.4 time-steps per second. As the simulation progresses, and particles become more uniformly distributed across ranks (as seen in frames c and d), the code achieves approximately 37.5 time-steps per second. The same system containing no particles runs at an average of 39 time-steps per second. For comparison, from **Figure 10** the code is capable of 23 time-steps per second when δ<sup>x</sup> = 15 µm (and no particles are present) on 250 nodes (6,000 cores) of ARCHER; on 96,000 cores, we compute 232 time-steps per second. Note that because no particles are present in the system, there is no overhead associated with file output. Therefore, in the case presented, the performance degradation is, even in the worst case of load imbalance (33.4 steps per second), not particularly severe.
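The severity of the degradation can be quantified directly from the throughputs quoted above. The following sketch expresses each particle-laden rate as a percentage drop relative to the particle-free rate of 39 time-steps per second; the function name is illustrative.

```python
def slowdown_percent(rate_reference, rate_observed):
    """Relative drop in throughput (time-steps per second), in percent."""
    return 100.0 * (rate_reference - rate_observed) / rate_reference


no_particles = 39.0  # time-steps per second without particles
cases = [("particles clustered on few ranks", 33.4),
         ("particles dispersed across ranks", 37.5)]
for label, rate in cases:
    print(f"{label}: {slowdown_percent(no_particles, rate):.1f}% slower")
```

The worst case corresponds to a drop of roughly 14%, and the dispersed case to roughly 4%.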

## 5. DISCUSSION

The application of our magnetic drug targeting model to a patient-specific geometry has allowed us to explore the relevance of various physical properties and design parameters to the manipulation of paramagnetic iron oxide nanoparticles in cerebral blood flow. The physiological environment (e.g. flow and heart rate) determines which forces dominate, and hence the optimum choice of particle properties and magnetic field configuration will vary between patients and target site location. Our computational model intends to facilitate the optimization of these properties for a particular patient, or to predict the percentage of injected particles that will reach a given target site under a fixed configuration (thus potentially advising on the most appropriate dosage or carrier type for that patient).

FIGURE 11 | Breakdown of metrics and efficiencies for HemeLB Simulate phase (operating on a voxelized representation of the circle of Willis model previously described) on ARCHER Cray XC30 (24 ranks per node). Bars represent, in seconds, the collective communication time, point-to-point communication time, and computation time. Note that the time required for collective operations is negligible; for this reason, the data is not presented. Lines represent communication efficiency, load balance efficiency, and parallel efficiency. The proportion of computation vs. MPI communication time remains roughly the same (with primarily point-to-point communication and negligible collective communication), with communication efficiency remaining above 0.89. Load balance efficiency starts at 0.86 and progressively deteriorates to 0.76, such that the overall parallel efficiency degrades to 0.72 using 96,000 cores.

We demonstrate the use of our model with a test case: modeling magnetically steered nanoparticles in a human circle of Willis, with the target site (referred to as the region of interest, RoI) located on a bend in the left internal carotid artery (inlet 2 in **Figure 2**); an (invasive) magnet placed 0.9 cm from the geometric center of the RoI is used to steer the particles. We study the effects of particle radius on targeting efficiency at the RoI. **Figure 7** shows the trajectory of 17,077 particles in the LICA under the influence of a point dipolar magnet of moment **m**<sup>0</sup> = {3000.0, 0.0, 0.0} A m<sup>2</sup>. **Figure 13** shows the percentage of particles (of radius a = 65, 105, 250, and 500 nm) passing through the target region. In physical terms, we find the behavior of the particles to be largely governed by hydrodynamic and dipolar interactions with little contribution from diffusive effects, most likely due to the high flow rates in the given arterial section (∼0.8 m s<sup>−1</sup> peak velocity), which requires a strong magnetic field gradient to overcome drag.

To provide additional insight into the optimization of the particles, we investigate the effect of coating thickness. In the context of drug delivery, for example, the (organic or inorganic) coating surrounding the magnetic core is loaded with the drug. Our implementation of the model can accept a coating thickness a<sub>c</sub> (previously assumed to be zero). The application of a coating only affects the drag experienced by the particle, and is assumed to have a negligible effect on the magnetic forcing (i.e. it provides no magnetic shielding). With the core radius a = 65 nm, which is used in all calculations pertaining to the magnetic forcing, three coating thicknesses are considered: a<sub>c</sub> = 16.25, 32.5, and 65 nm. For the configurations considered, our simulations suggest that particle motion is unaffected by the additional drag due to the coating. On inspection of Equation (8), it is clear that if the local fluid velocity **v**(**x**<sub>p</sub>) at the particle's location **x**<sub>p</sub> is much greater than the velocity modification resulting from any external forcing, i.e.

$$\mathbf{v}(\mathbf{x}_p) \gg \beta (\mathbf{F} + \mathbf{F}_R) \tag{13}$$

then any realistic coating will have little influence (since only the mobility β = 1/[6πµ(a + a<sub>c</sub>)] is modified). Because the magnetic field can only (strongly) influence particles within the proximity of the magnet (it falls off as 1/r<sup>3</sup>), the current configuration is such that no discernible difference is seen.
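To make the size of the mobility change concrete, the sketch below evaluates the Stokes mobility for the three coating thicknesses considered, relative to the uncoated particle. The viscosity value is an assumed, illustrative figure for blood (it cancels in the ratio), and the function name is hypothetical rather than taken from the implementation.

```python
import math


def stokes_mobility(core_radius_m, coating_m=0.0, viscosity_pa_s=3.5e-3):
    """Mobility beta = 1 / [6 pi mu (a + a_c)] of a coated sphere.
    The default viscosity is an assumption, not a value from the study."""
    return 1.0 / (6.0 * math.pi * viscosity_pa_s * (core_radius_m + coating_m))


a = 65e-9  # core radius (m)
bare = stokes_mobility(a)
for a_c in (16.25e-9, 32.5e-9, 65e-9):
    ratio = stokes_mobility(a, a_c) / bare
    print(f"a_c = {a_c * 1e9:5.2f} nm: mobility is {ratio:.2f} of the uncoated value")
```

Even the thickest coating only halves the mobility, so whenever the forcing-induced drift is already small compared with the local fluid velocity, the coating changes a term that is itself negligible.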

By modifying the velocity profiles of the inlet boundary conditions, we are able to study the impact of three physiological parameters (mean blood pressure, volumetric flow rate, and heart rate at the opening) on particle behavior, demonstrating that our model can handle patient specificity (down to a patient's current physiological state). As a function of these parameters, the values for which we take from the experimental work of Sugawara et al. (2003), the peak inlet velocity is obtained from 1D Navier-Stokes simulations using our multiscale framework (Itani et al., 2015), and introduced to the 3D solver (HemeLB) as scaled parabolic profiles. All simulations presented to this point use the heart rate of a resting patient (80 mmHg, 4.8 l min<sup>−1</sup>, 68 bpm; see **Figure 3**) to derive inlet boundary conditions. Here, we consider three other cases with greater heart rates (see **Figure 14**): 112 mmHg, 10.7 l min<sup>−1</sup>, 113 bpm (red line); 116 mmHg, 11.9 l min<sup>−1</sup>, 120 bpm (green line); 122 mmHg, 13.2 l min<sup>−1</sup>, 134 bpm (blue line). For a fixed particle radius a = 65 nm, **Figure 13** shows how particle concentration in the RoI is affected. Relative to the case of resting heart rate, we see fewer particles in the RoI for higher-flow-rate cases. This is an unsurprising result; as discussed, the relative contribution of magnetic forcing to particle motion is reduced when the fluid velocity is increased. The reduced arrival time of the particles at the RoI is simply due to the greater fluid velocity.

The central parameter controlling hydrodynamic interactions, mediated by frictional coupling, is the particle radius. For particle radius a in the range of 65 to 500 nm thermal diffusivity was observed to be negligible. However, diffusive terms introduced by the interaction of the particles with blood cells may well play a significant role. Our current model does not include blood cells in the suspension, but can take into consideration the bulk shear thinning effect resulting from the presence of blood cells; a comparison of Newtonian and non-Newtonian blood models has shown little observable difference in mass flow (Bernabeu et al., 2013).
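A rough order-of-magnitude check on why thermal diffusion is negligible can be made with the Stokes-Einstein relation and a Péclet number. The sketch below is illustrative only: the temperature (310 K) and viscosity (3.5 × 10<sup>−3</sup> Pa s) are assumed values not taken from the study, the particle radius is used as the length scale by choice, and the velocity is the ∼0.8 m s<sup>−1</sup> peak quoted earlier.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant (J/K)


def stokes_einstein_diffusivity(radius_m, temperature_k=310.0, viscosity_pa_s=3.5e-3):
    """Thermal diffusivity D = k_B T / (6 pi mu a) of a sphere in a fluid.
    Default temperature and viscosity are assumptions for illustration."""
    return K_B * temperature_k / (6.0 * math.pi * viscosity_pa_s * radius_m)


def peclet_number(velocity_m_s, radius_m):
    """Ratio of advective to diffusive transport, using the radius as length scale."""
    return velocity_m_s * radius_m / stokes_einstein_diffusivity(radius_m)


for radius_nm in (65, 105, 500):
    a = radius_nm * 1e-9
    d = stokes_einstein_diffusivity(a)
    pe = peclet_number(0.8, a)
    print(f"a = {radius_nm:3d} nm: D = {d:.2e} m^2/s, Pe = {pe:.1e}")
```

Under these assumptions the Péclet number is far greater than unity for all three radii, in line with the observation that advection dominates thermal diffusion.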

The effect of gravity (and other homogeneous accelerations) was modeled via a body force term. Our evaluations have found contributions of gravity (buoyancy of the particle, caused by the blood, is also considered) to the dynamics of the particles to be negligible in the test cases presented here. However, with increasing particle size or when considering larger capillary numbers gravity may become significant.

The magnetic properties of the paramagnetic particles are largely determined by the size and crystallinity of their magnetite (or maghemite) core. For simplicity, in the above simulations we have chosen to model particles of pure magnetite. In reality, the magnetite content is expected to be lower, thus reducing the effective magnetic susceptibility χ<sup>v</sup> of a particle. Volumetric magnetic susceptibility, as reported in the literature, varies widely (i.e. 1.0 to 5.7 m<sup>−3</sup>) with the preparation, means of creation, and grain size of the nanoparticles (Hunt et al., 2013); for greatest effect we have chosen the maximum reported value. The size of the particle itself can also affect the susceptibility, as a finite size effect in small particles (e.g. for particle radius a ≲ 25 nm; Ulbrich et al., 2016) induces super-paramagnetic behavior, which manifests as a vastly increased magnetic susceptibility (relative to that of paramagnets). With a ≥ 65 nm in the simulations presented here, we do not consider the super-paramagnetic regime. Note that our model is able to capture super-paramagnetic behavior, but values for χ<sup>v</sup> would need to be determined experimentally. It is expected that the magnetic susceptibility will be known for any super-paramagnetic iron oxide nanoparticles (SPIONs) used in a clinical context. We have additionally approximated the magnetic permeability inside the brain as that of a classical vacuum, i.e. µ<sub>0</sub> = 4π × 10<sup>−7</sup> H m<sup>−1</sup>. In general, the presence of iron-rich tissues may cause the magnetic permeability of the surrounding brain matter to deviate from this value.

The initial distribution of the particles, and the invasive proximity of the magnet (both indicated in **Figure 7**), are clearly unrealistic, and were chosen for illustrative and performance testing purposes. Furthermore, the (single) permanent magnet here is modeled as a pure point dipole, effectively overestimating the field gradient. In future work, particles will be introduced via a timed release at inlets in a manner more closely modeling the concentration profile of an intravenous delivery. In addition, future implementations will model particle function in the target region (such as the absorption of particles into target tissue, or magnetically induced heating of nanoparticles and subsequent drug release). Furthermore, an external electromagnetic field solver will be used to recreate a complex and realistic field (such as may be induced in a clinical context). As stated previously, the input flow velocities for each inlet were obtained using a multiscale approach (to represent the rest of the human arterial tree, Itani et al., 2015), whereas we may wish to consider that in an unhealthy patient the blood pressure and flow rates may be much higher.
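The steepness of a point-dipole field can be illustrated with the standard expression **B**(**r**) = (µ<sub>0</sub>/4π)[3**r̂**(**m**·**r̂**) − **m**]/r<sup>3</sup>. The sketch below evaluates it on the dipole axis at two distances; the function name and the evaluation points are illustrative choices, not those of the simulations.

```python
import math

MU_0 = 4.0e-7 * math.pi  # vacuum permeability (H/m), the value used in the text


def dipole_field(m, r):
    """Flux density B (T) at displacement r (m) from a point dipole of
    moment m (A m^2): B = mu_0 / (4 pi r^3) * [3 r_hat (m . r_hat) - m]."""
    r_norm = math.sqrt(sum(c * c for c in r))
    if r_norm == 0.0:
        raise ValueError("the field is singular at the dipole position")
    r_hat = [c / r_norm for c in r]
    m_dot_r = sum(mc * rc for mc, rc in zip(m, r_hat))
    prefactor = MU_0 / (4.0 * math.pi * r_norm ** 3)
    return [prefactor * (3.0 * m_dot_r * rc - mc) for mc, rc in zip(m, r_hat)]


m0 = (3000.0, 0.0, 0.0)  # dipole moment from the text (A m^2)
b_near = dipole_field(m0, (0.01, 0.0, 0.0))  # illustrative distance: 1 cm
b_far = dipole_field(m0, (0.02, 0.0, 0.0))   # illustrative distance: 2 cm
print(f"on-axis field ratio for doubled distance: {b_near[0] / b_far[0]:.1f}")
```

Doubling the distance reduces the field by a factor of eight, which is why only particles close to the magnet are strongly affected and why an idealized point dipole exaggerates gradients near the source.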

FIGURE 14 | Peak inlet velocity for (1) the basilar artery (- - -), (2) the left internal carotid artery, and (3) the right internal carotid artery. For each of the three inlets, the complete inlet-velocity profile is obtained by assigning weighting factors (of the peak velocity) to lattice sites that lie on the boundaries. The top plot (red) is for 10.7 l min<sup>−1</sup> at 113 bpm (red line in Figure 13), the middle plot (green) is for 11.9 l min<sup>−1</sup> at 120 bpm (green line in Figure 13), and the bottom plot (blue) is for 13.2 l min<sup>−1</sup> at 134 bpm (blue line in Figure 13).

Segmentation of the clinical images necessary to construct the three-dimensional vascular geometry is in practice difficult to automate consistently, often needing human intervention to identify artifacts to be filtered out. As a result of this, and other uncertainties in the input data, a number of replica simulations may be required to capture the full statistics of the system, and allow uncertainty quantification of the results. Computational efficiency is therefore very important to the practicality of this model. Currently, the most significant influence on computational performance comes via the distribution of particles across computational subdomains, with large numbers of particles on any single computational subdomain causing load imbalance. While the dilute requirement of our model largely mitigates the problem in high performance computing environments (where core counts of highly scalable codes can be increased with relative ease), the transition to smaller workstations using accelerators may require the implementation of sophisticated load balancing techniques. Nevertheless, in the most extreme case of imbalance observed in our simulations, using 5,600 cores (350 nodes) on Blue Waters, the performance was degraded by ∼15% relative to the case where no particles are present, a manageable reduction in performance that can be alleviated through further development of the load balancing techniques employed. To simulate 20,377 particles over three cardiac cycles and with lattice spacing δ<sup>x</sup> = 25 µm using 5,280 cores (220 nodes) on ARCHER requires 20 wallclock hours. Therefore, based on the scalability study presented in section 4.2.1, and the encouraging results of the load-balance testing involving 73,215 particles, we postulate that our method can simulate tens of thousands of particles over multiple cardiac cycles in geometries consisting of O(10<sup>9</sup>) lattice sites in approximately a day. Such performance allows us to address flow problems that previously could not be approached, and will lead to a new level of understanding.

In order to achieve the necessary computational performance, a number of approximations were implemented. As our particle sizes are significantly smaller than the scale of the lattice discretization (1/25th in the case of the largest particle radius), and with sufficiently low particle density (1–5 particles per lattice volume), we permit ourselves the use of a one-way coupling strategy (no feedback from particles to fluid). Another consequence of the dilute approximation is the use of the much cheaper pairwise expression for the dipolar force (see Equation 9); in practice this would break down for non-dilute fluids. We also assume that particles align instantaneously with the local magnetic field, as the time scale for rotation is extremely rapid (Ulbrich et al., 2016) (relative to the characteristic timescale of hydrodynamic processes).

### 6. CONCLUSION

We present an efficient computational model for simulating magnetic drug targeting in patient-specific brain geometries, via the steering of paramagnetic nanoparticles with an external magnetic field. The model couples the dynamics of spherical particles to a lattice-Boltzmann hydrodynamics simulation, taking into account body forces (e.g. gravity), diffusivity, and dipolar interactions. A study of the model's computational performance found favorable results, with a performance drop of ∼15% (relative to a simulation of the hydrodynamics alone, i.e. in the absence of any particles) in the most extreme case of load imbalance (all particles clustered in one region). We demonstrated the use of the model to predict the particle density (as a function of time) near a target site for a specific patient circle of Willis vascular system and heart rate, using a single point dipolar magnet. Through a multiscale coupling with a 1D representation of the wider vascular system, we obtained inlet velocity profiles for a patient in a range of physiological states (varying heart rate, cardiac output and mean blood pressure). Initial results allow confidence in the viability of the model to answer a wide range of questions relating to the design and manipulation of iron oxide nanoparticles in a clinical context. Comparison to phantom flow results and medical imaging research will allow further tuning of system parameters to increase the accuracy of the model. A next step toward using the simulation technique in a more realistic manner will involve coupling of the flow solver to a comprehensive electromagnetic simulation. This will allow for the investigation of particle behavior when exposed to more complex magnetic fields created by a combination of multiple electromagnets.

### AUTHOR CONTRIBUTIONS

AP, RR, and SS: Programming of paramagnetic particle controller; RR, AP, and PC: Conception and design of simulations; AP: Performed simulations showcasing capabilities; RR and AP: Analyzed simulation results; BW: Performance audit; AP, RR, SS, BW, RN, and PC: Drafted, edited, and revised manuscript.

### FUNDING

We acknowledge funding support from the EU H2020 CompBioMed Centre of Excellence, grant agreement No. 675451 (http://www.compbiomed.eu); the EU H2020 Performance Optimization and Productivity Centre of Excellence, grant agreement No. 676553 (http://www.pop-coe.eu); the UK Consortium on Mesoscale Engineering Sciences (UKCOMES, http://www.ukcomes.org), EPSRC reference EP/L00030X/1; Large Scale Lattice Boltzmann for Biocolloidal Systems, EPSRC reference EP/I034602/1; the EU H2020 ComPat project, grant agreement No. 223979 (http://www.compat-project.eu); and the Qatar National Research Fund (NPRP), project No. 5-792-2-238. The simulations were performed on ARCHER, the UK National Supercomputing Service (http://www.archer.ac.uk), and Blue Waters at the National Center for Supercomputing Applications (NCSA) (https://bluewaters.ncsa.illinois.edu/), supported by the NSF Grant More Power to the Many: Scalable Ensemble-based Simulations and Data Analysis, award No. 1713749.

### ACKNOWLEDGMENTS

We thank Alberto Figueroa for providing the circle of Willis geometry, and Ulf Schiller, Derek Groen and Miguel Bernabeu for constructive discussions relating to the implementation of the particle controller.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Patronis, Richardson, Schmieschek, Wylie, Nash and Coveney. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# 3D Fluid-Structure Interaction Simulation of Aortic Valves Using a Unified Continuum ALE FEM Model

Jeannette H. Spühler\*, Johan Jansson, Niclas Jansson and Johan Hoffman

Department of Computational Science and Technology, School of Computer Science and Communication, KTH Royal Institute of Technology, Stockholm, Sweden

Due to advances in medical imaging, computational fluid dynamics algorithms and high performance computing, computer simulation is developing into an important tool for understanding the relationship between cardiovascular diseases and intraventricular blood flow. The field of cardiac flow simulation is challenging and highly interdisciplinary. We apply a computational framework for automated solutions of partial differential equations using Finite Element Methods where any mathematical description can be directly translated to code. This allows us to develop a cardiac model where specific properties of the heart such as fluid-structure interaction of the aortic valve can be added in a modular way without extensive efforts. In previous work, we simulated the blood flow in the left ventricle of the heart. In this paper, we extend this model by placing prototypes of both a native and a mechanical aortic valve in the outflow region of the left ventricle. Numerical simulation of the blood flow in the vicinity of the valve offers the possibility to improve the treatment of aortic valve diseases such as aortic stenosis (narrowing of the valve opening) or regurgitation (leaking) and to optimize the design of prosthetic heart valves in a controlled and specific way. The fluid-structure interaction and contact problem are formulated in a unified continuum model using the conservation laws for mass and momentum and a phase function. The discretization is based on an Arbitrary Lagrangian-Eulerian space-time finite element method with streamline diffusion stabilization, and it is implemented in the open source software Unicorn, which shows near-optimal scaling up to thousands of cores. Computational results are presented to demonstrate the capability of our framework.

Keywords: fluid-structure interaction, finite element method, Arbitrary Lagrangian-Eulerian method, parallel algorithm, blood flow, patient specific heart model

### 1. INTRODUCTION

The World Health Organization (WHO, 2014) has identified cardiovascular disease as the major cause of death in the world. Therefore, developing new ways to support early diagnosis of cardiac dysfunction is of vital importance. In vivo and in vitro studies offer valuable information on the relationship between the blood flow (hemodynamics) and cardiac disease, and advances in computational fluid dynamics (CFD) and high performance computing (HPC) enable the use of computer simulation as an important tool to further enhance our understanding of this relationship.

#### Edited by:

Mariano Vázquez, Barcelona Supercomputing Center, Spain

#### Reviewed by:

Jazmin Aguado-Sierra, Barcelona Supercomputing Center, Spain Tinen Lee Iles, University of Minnesota Twin Cities, United States

> \*Correspondence: Jeannette H. Spühler spuhler@kth.se

#### Specialty section:

This article was submitted to Computational Physiology and Medicine, a section of the journal Frontiers in Physiology

Received: 12 December 2017 Accepted: 23 March 2018 Published: 16 April 2018

#### Citation:

Spühler JH, Jansson J, Jansson N and Hoffman J (2018) 3D Fluid-Structure Interaction Simulation of Aortic Valves Using a Unified Continuum ALE FEM Model. Front. Physiol. 9:363. doi: 10.3389/fphys.2018.00363

The field of cardiac modeling is extensive and highly interdisciplinary. It is therefore important to be clear about what the research is aiming for. Our goal is to develop a framework for simulating the intraventricular blood flow, where specific properties such as fluid-structure interaction (FSI) of the aortic valve can be implemented in a modular way without extensive efforts. In Spühler et al. (2015) we focused on the fluid mechanics, and presented a computational model of the blood flow in the left ventricle (LV) of the heart. The movement of the wall is based on ultrasound measurements and an Arbitrary Lagrangian-Eulerian (ALE) space-time finite element method is used to simulate the blood flow by solving the incompressible Navier-Stokes equations. The opening and closing of the mitral and aortic valves are modeled by time-dependent velocity and pressure boundary conditions. In this paper, we present an extension of this work by embedding different geometrical models of aortic valves in the LV and the aorta. Prototypes of a biological valve and a bileaflet mechanical heart valve (BMHV) are modeled. While surgical treatments of valvular diseases are firmly established, many decisive factors for the performance of the implant are not yet fully understood. Numerical simulations provide important insight into the interaction between the blood flow and the leaflets, which can be applied to optimize the design of BMHVs or improve technologies such as transcatheter aortic valve replacement (Wu et al., 2016). The fluid-structure interaction problem is described by a unified continuum model, using the conservation laws for mass and momentum and a phase function, which is a novel approach for simulating valve motions. The Navier-Stokes equations are solved by an ALE space-time finite element method with streamline diffusion stabilization implemented in Unicorn (Hoffman et al., 2012), which is part of the open source software framework FEniCS-HPC (Jansson, 2013).

This paper is structured as follows. In section 2 we describe the different components and functions of an anatomical aortic valve. Section 3 explains the mathematical equations and the numerical method. In section 4, we specify the mechanical and biological aortic valve models we use in our simulations. The numerical results are presented in section 5, and we conclude the paper in section 6 by summarizing our findings and discussing possible steps for future work.

### 2. MODELING THE AORTIC VALVE

The left ventricle possesses a mitral and an aortic valve, consisting of two and three leaflets, respectively. The valves ensure unidirectional flow and prevent the blood from flowing back. The opening and closing of the valves are mainly controlled by the pressure gradient between the ventricle and the adjacent chamber. One edge of the biological leaflet is completely attached to the inner wall of the heart. The free edge of the mitral valve is connected to the papillary muscles by the chordae tendineae. The aortic leaflets do not have such fibrous tissue connections, and they open and close passively due to the flow.

The nomenclature of the different components of the aortic root can vary considerably, as revealed by Sievers et al. (2012). We apply the definitions proposed in Sievers et al. (2012), as indicated in **Figure 1**. The aortic root is situated between the left ventricle and the ascending aorta, and is bordered by the annulus and the sinotubular junction. The three bulges just above the annulus are referred to as the sinuses of Valsalva. The aortic valve contains three leaflets which are attached to the aortic wall. The point of contact where two leaflets meet at the root wall is called the commissure, and the surface of contact at the free edge is known as the coaptation.

### 3. MATHEMATICAL MODEL AND NUMERICAL METHOD

In order to put our approach in context, we review different models for simulating the FSI of the blood flow around the aortic valve. Usually, the structure model is formulated in the Lagrangian coordinate system whereas fluid flow is described in the Eulerian coordinate system. At the common interface of the two models, the following kinematic and dynamic constraints have to be satisfied by the velocity **u** and the stress τ :

$$
\mathbf{u}_{f} = \mathbf{u}_{s} \quad \text{(kinematic constraint, continuity of the velocity),} \tag{1}
$$

$$
\boldsymbol{\tau}_{s} \cdot \mathbf{n} = \boldsymbol{\tau}_{f} \cdot \mathbf{n} \quad \text{(dynamic constraint, continuity of the normal stresses).} \tag{2}
$$

The subscript indicates whether the variable is defined in the solid (s) or in the fluid (f) part, respectively, while **n** is a unit vector normal to the interface. We denote vectors and matrices with bold letters. FSI simulations can roughly be categorized as moving or fixed mesh methods, and as partitioned or monolithic approaches, as presented in Borazjani et al. (2008).

#### 3.1. Discretization of the Coupled Problem

For fixed mesh methods, the fluid and structure domains are discretized in a non-boundary-conforming manner. Since the structure is spatially disconnected from the fixed background mesh, it is crucial to efficiently trace and move the interface between the solid and the fluid domain. The interface can be discretized with a set of markers and tracked by a Lagrangian method (front tracking) or represented by contours or level sets of a scalar function (front capturing). Fixed mesh methods were pioneered by Peskin and McQueen (Peskin, 1972; McQueen and Peskin, 2000), who introduced the concept of immersed boundary methods, where body forces are imposed on the fluid domain to account for the interaction between the fluid and the structure. Large structural deformations are manageable, but the solution at the interface can be diffuse. This disadvantage can be lessened by e.g., increasing the mesh resolution in the vicinity of the immersed boundary as done by Griffith (2012), or by treating the boundary as a sharp interface as in e.g., Borazjani et al. (2008); Udaykumar et al. (1999); Mittal and Iaccarino (2005); Gilmanov and Sotiropoulos (2005), and Xia et al. (2009). Fictitious domain methods are another class of fixed mesh methods, where Lagrange multipliers account for the kinematic constraints between the fluid and solid domain, see e.g., Glowinski et al. (1999); van Loon et al. (2005); De Hart et al. (2003), and Astorino et al. (2009).

In moving mesh methods, the computational mesh conforms to the deformation of the solid domain, and is typically represented by an Arbitrary Lagrangian-Eulerian formulation. The strength of moving mesh methods lies in their accuracy and clearly defined coupling condition, as the mesh is aligned with the fluid-structure interface. A good smoothing algorithm or local remeshing is needed to maintain the quality of the computational mesh.

Reviewing the literature on aortic valve simulations, we came across the following works, which apply an ALE approach: Bolger et al. (2007) and Penrose and Staples (2002) simulate the flow past a geometrically reduced mechanical valve prosthesis, taking advantage of its symmetrical form; in Dumont et al. (2007) two commercially available bileaflet mechanical heart valves are compared regarding hemodynamics and thrombogenic performance; Guivier-Curien et al. (2009) employ particle image velocimetry measurements to quantitatively and qualitatively compare experiments and numerical simulations; the FSI model of Choi and Kim (2009) provides detailed flow information and leaflet behavior of a BMHV; Morsi et al. (2007) analyze the fluid dynamics of a trileaflet heart valve, but only for the initial opening phase.

#### 3.2. Coupling Strategies

Depending on whether the structure and fluid problems are solved simultaneously or separately, the FSI solver can be classified as monolithic or partitioned. The FSI approach is called monolithic if the fluid and solid problems are solved as one single system where no matching of the data is required at the interface.

In a partitioned approach, two different solvers simulate the fluid and the solid part, respectively. If the coupling between the solvers is explicit in time, the coupling is called loose. Loose coupling has low computational cost, but the simulation may become unstable. To overcome these instability issues, the partitioned problem can be formulated implicitly in time, introducing an iteration loop at each time step until a dynamic equilibrium between the fluid and the solid is achieved. Data exchange between the fluid and solid part in this implicit algorithm is called strong coupling.
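The difference between loose and strong coupling can be illustrated with a deliberately small toy problem. The sketch below is not taken from any of the cited solvers: the two "solvers" `fluid_load` and `solid_displacement` are hypothetical scalar stand-ins with made-up coefficients, chosen only so that the interface iteration is a contraction.

```python
def fluid_load(displacement):
    """Toy 'fluid solver' (hypothetical): interface load for a given displacement."""
    return 1.0 - 0.5 * displacement


def solid_displacement(load):
    """Toy 'solid solver' (hypothetical): interface displacement for a given load."""
    return 0.8 * load


def loose_step(d_old):
    """Explicit (loose) coupling: each solver is called exactly once per time step."""
    return solid_displacement(fluid_load(d_old))


def strong_step(d_old, tol=1e-12, max_iter=100):
    """Implicit (strong) coupling: sub-iterate until the interface state stops changing."""
    d = d_old
    for iteration in range(1, max_iter + 1):
        d_new = solid_displacement(fluid_load(d))
        if abs(d_new - d) < tol:
            return d_new, iteration
        d = d_new
    raise RuntimeError("coupling iteration did not converge")


if __name__ == "__main__":
    print("loose :", loose_step(0.0))
    print("strong:", strong_step(0.0))
```

In this toy setting the strongly coupled step returns a displacement that both solvers agree on, at the price of several solver calls per step, whereas the loosely coupled step stops after a single exchange.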

#### 3.3. Unified Continuum Model

We now specify our ansatz, which corresponds to a monolithic, moving mesh method. A detailed description can be found in Jansson et al. (2011). Here, we only describe the main features.

Where the size of the vessel is much larger than the size of a red blood cell, the blood flow can be modeled as an incompressible Newtonian fluid (Quarteroni et al., 2014). The governing equations are the Navier-Stokes equations. The dynamic viscosity is chosen as µ = 0.0027 Pa·s and the blood density as ρ = 1,060 kg/m<sup>3</sup> (Di Martino et al., 2001). In small domains, such as the region around the revolute joints of a mechanical heart valve, non-Newtonian effects might have to be incorporated in the model, but these flow features are not targeted in this work.

With the aim of establishing a framework that allows for general formulation and implementation of different models, while applying adaptive error control for realistic 3D applications, a so-called unified continuum model for FSI was developed. The model is described by the conservation laws of mass and momentum for an incompressible continuum, where a stress and phase variable define the properties of the continuum.

Let Ω<sup>t</sup> ⊂ ℝ<sup>3</sup> be a time-dependent domain with t ∈ I := [0, t̂]. Our goal is to determine **u**(**x**, t): Ω<sup>t</sup> → ℝ<sup>3</sup>, where Ω<sup>t</sup> encompasses both the solid and the fluid domain and **u** defines the fluid velocity in the fluid part and the deformation velocity in the structure part:

$$
\rho(\dot{\mathbf{u}} + ((\mathbf{u} - \mathbf{m}) \cdot \nabla)\mathbf{u}) = \nabla \cdot \boldsymbol{\tau}(\mathbf{u}, p), \quad (\mathbf{x}, t) \in \Omega^t \times I,\tag{3a}
$$

$$
\nabla \cdot \mathbf{u} = 0, \quad (\mathbf{x}, t) \in \Omega^t \times I.\tag{3b}
$$

Here τ is the stress tensor and **m** identifies the mesh velocity in the ALE formulation. In the solid, we choose **m** to be the material velocity of the structure. In the remaining part of the mesh, **m** is determined by the mesh smoothing algorithm applied to uphold the quality of the mesh.

The constitutive laws are defined via the stress term, where the phase function θ is set to zero in the solid domain and to one in the fluid domain:

$$
\boldsymbol{\tau} = \boldsymbol{\tau}_D - p\mathbf{I},\tag{4}
$$

$$
\boldsymbol{\tau}_D = \theta \,\boldsymbol{\tau}_f + (1 - \theta)\boldsymbol{\tau}_s,\tag{5}
$$

$$
\boldsymbol{\tau}_f = 2\mu_f \boldsymbol{\epsilon}(\mathbf{u}),\tag{6}
$$

$$
D_t \boldsymbol{\tau}_s = 2\mu_s \boldsymbol{\epsilon}(\mathbf{u}) + \nabla \mathbf{u}\, \boldsymbol{\tau}_s + \boldsymbol{\tau}_s \nabla \mathbf{u}^T,\tag{7}
$$

$$
\boldsymbol{\epsilon}(\mathbf{u}) = \frac{1}{2} (\nabla \mathbf{u} + \nabla \mathbf{u}^T). \tag{8}
$$

The kinematic constraint **u**<sub>f</sub> = **u**<sub>s</sub> is satisfied implicitly by the continuity of the velocity field **u** for the unified continuum. The dynamic constraint is weakly enforced by applying integration by parts on the stress term and setting it to zero in the weak formulation.

This approach allows us to use the same discretization method, stabilization technique and mesh deformation algorithm as for a pure fluid problem.
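To make the role of the phase function concrete, the following NumPy sketch evaluates Equations (4)-(8) pointwise for a given velocity gradient. It is only an illustration, not the Unicorn implementation: the function names are hypothetical, the tensors are plain 3×3 arrays at a single point, and the solid stress `tau_s` is passed in as data rather than integrated in time.

```python
import numpy as np


def strain_rate(grad_u):
    """Symmetric gradient eps(u) = (grad u + grad u^T) / 2, Equation (8)."""
    return 0.5 * (grad_u + grad_u.T)


def unified_stress(grad_u, p, theta, tau_s, mu_f=0.0027):
    """Total stress tau = tau_D - p I with tau_D blended by the phase function.

    theta = 1 selects the fluid law (6), theta = 0 the solid stress tau_s.
    mu_f defaults to the blood viscosity quoted in the text (Pa*s).
    """
    tau_f = 2.0 * mu_f * strain_rate(grad_u)          # Equation (6)
    tau_d = theta * tau_f + (1.0 - theta) * tau_s     # Equation (5)
    return tau_d - p * np.eye(3)                      # Equation (4)


def solid_stress_rate(grad_u, tau_s, mu_s):
    """Right-hand side of the solid stress evolution law, Equation (7)."""
    return 2.0 * mu_s * strain_rate(grad_u) + grad_u @ tau_s + tau_s @ grad_u.T


if __name__ == "__main__":
    grad_u = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
    print(unified_stress(grad_u, p=2.0, theta=1.0, tau_s=np.zeros((3, 3))))
```

Because a single velocity field and a single stress expression cover both phases, switching a cell from fluid to solid only requires changing θ, which is what the contact model in section 3.6 relies on.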

#### 3.4. Time and Space Discretization

Let 0 = t<sup>0</sup> < t<sup>1</sup> < · · · < t<sup>N</sup> = t̂ be a sequence of discrete time steps, with associated time intervals I<sub>n</sub> := (t<sup>n−1</sup>, t<sup>n</sup>] of length k<sub>n</sub> := t<sup>n</sup> − t<sup>n−1</sup>.

We introduce the space-time slab S<sub>n</sub> := Ω<sup>t<sup>n</sup></sup> × I<sub>n</sub>, and let T<sup>n</sup> = {K} denote the spatial discretization of Ω<sup>t<sup>n</sup></sup>. **U**<sup>n</sup> is the discrete velocity, P<sup>n</sup> is the discrete pressure, and h<sub>n</sub> specifies the maximal diameter of the cells K ∈ T<sup>n</sup>.

We choose the finite element function space of piecewise linear functions W<sup>n</sup> ⊂ H<sup>1</sup>(Ω<sup>t<sup>n</sup></sup>), where

$$
H^1(\Omega^{t^n}) := \left\{ v \in L^2(\Omega^{t^n}) \,\Big|\, \frac{\partial v}{\partial x_k} \in L^2(\Omega^{t^n}),\ k = 1, 2, 3 \right\}, \tag{9}
$$

$$
W^{n} := \{ v \in C(\Omega^{t^n}) \, | \, v \in P^{1}(K),\ \forall K \in T^{n} \}, \tag{10}
$$

$$
W_0^{n} := \{ v \in W^{n} \, | \, v = 0 \text{ on } \partial \Omega^{t^n} \}, \tag{11}
$$

$$
\mathbf{W}_0^n := [W_0^n]^3. \tag{12}
$$

We identify the discrete solution for velocity and pressure as **Û** = (**U**, P), the discrete stress for both the fluid and the solid as 𝒯, the discrete mesh velocity as **M**, and the test function as **v̂** = (**v**, q). In time, we choose **U** to be piecewise linear, and P, **v** and q to be piecewise constant.

Based on these definitions and assuming homogeneous Dirichlet boundary conditions for the velocity, the spatially and temporally discretized variational formulation of Equation (3) reads as follows: for each space-time slab S<sub>n</sub>, find (**U**<sup>n</sup>, P<sup>n</sup>) := (**U**(t<sup>n</sup>), P(t<sup>n</sup>)) with **U**<sup>n</sup> ∈ **W**<sub>0</sub><sup>n</sup> and P<sup>n</sup> ∈ W<sup>n</sup>, such that:

$$
\begin{aligned}
\left(\rho k_n^{-1} (\mathbf{U}^n - \mathbf{U}^{n-1}) + (\rho (\bar{\mathbf{U}}^n - \mathbf{M}^n) \cdot \nabla) \bar{\mathbf{U}}^n, \mathbf{v}\right) + (\mathcal{T}^n : \nabla \mathbf{v}) \\
+ \text{SD}_\delta (\bar{\mathbf{U}}^n, \mathbf{M}^n, P^n, \mathbf{v}, q, \rho) &= 0,
\end{aligned} \tag{13}
$$

for all (**v**, q) ∈ **W**<sub>0</sub><sup>n</sup> × W<sup>n</sup>, where **Ū**<sup>n</sup> = ½(**U**<sup>n</sup> + **U**<sup>n−1</sup>) and (·, ·) denotes the L<sup>2</sup>(S<sub>n</sub>)-inner product.

To stabilize the convection dominated problem (3), we use a simplified Galerkin/least-square method, where we drop the time derivative and the diffusion term, and we define SDδ as

$$
\begin{split}
\text{SD}_\delta (\bar{\mathbf{U}}^n, \mathbf{M}^n, P^n, \mathbf{v}, q, \rho) &= \\
\left(\delta_1 \left(\rho ( (\bar{\mathbf{U}}^n - \mathbf{M}^n) \cdot \nabla) \bar{\mathbf{U}}^n + \nabla P^n\right),\ \rho ( (\bar{\mathbf{U}}^n - \mathbf{M}^n) \cdot \nabla) \mathbf{v} + \nabla q\right) \\
&+ (\delta_2 \nabla \cdot \bar{\mathbf{U}}^n, \nabla \cdot \mathbf{v}).
\end{split} \tag{14}
$$

TABLE 1 | Model parameters used for generating the native aortic root geometry.

The stabilization parameters are chosen as δ<sub>2</sub> = κ<sub>2</sub>ρh<sub>n</sub>|**U**<sup>n−1</sup>| and δ<sub>1</sub> = κ<sub>1</sub>ρ<sup>−1</sup>(k<sub>n</sub><sup>−2</sup> + |**U**<sup>n−1</sup> − **M**<sup>n−1</sup>|<sup>2</sup>h<sub>n</sub><sup>−2</sup>)<sup>−1/2</sup>, where κ<sub>1</sub>, κ<sub>2</sub> are problem-independent positive constants of order O(1). By applying the midpoint quadrature rule in time, we obtain a Crank-Nicolson time-stepping scheme. We use Bi-CGStab with a block Jacobi preconditioner where each sub-block is solved with ILU(0).
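As a small worked example, the stabilization parameters δ<sub>1</sub> and δ<sub>2</sub> can be evaluated for a single cell as follows. This is a sketch only: the function name is hypothetical, κ<sub>1</sub> = κ<sub>2</sub> = 1 is an assumed default (the text only states that they are of order one), and the default density is the blood density quoted in section 3.3.

```python
import numpy as np


def stabilization_parameters(u_prev, m_prev, k_n, h_n,
                             rho=1060.0, kappa1=1.0, kappa2=1.0):
    """Return (delta1, delta2) for one cell.

    u_prev, m_prev : velocity and mesh velocity from the previous time step
    k_n            : time step length
    h_n            : cell diameter
    kappa1, kappa2 : O(1) constants (the value 1.0 is an assumption)
    """
    u_prev = np.asarray(u_prev, dtype=float)
    m_prev = np.asarray(m_prev, dtype=float)
    relative_speed = np.linalg.norm(u_prev - m_prev)
    # delta1 = kappa1 / rho * (k_n^-2 + |U - M|^2 h_n^-2)^(-1/2)
    delta1 = kappa1 / rho * (k_n ** -2 + relative_speed ** 2 * h_n ** -2) ** -0.5
    # delta2 = kappa2 * rho * h_n * |U|
    delta2 = kappa2 * rho * h_n * np.linalg.norm(u_prev)
    return delta1, delta2


if __name__ == "__main__":
    print(stabilization_parameters([1.0, 0.0, 0.0], [0.0, 0.0, 0.0],
                                   k_n=1e-4, h_n=1e-3))
```

Note how δ<sub>1</sub> is bounded by the smaller of the time step and the local convective time scale h<sub>n</sub>/|**U** − **M**|, so the stabilization weakens automatically when either the mesh or the time step is refined.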

#### 3.5. Smoothing Algorithms

Due to the fluid-structure interaction of the aortic valve and the pumping blood flow from the left ventricle, it is crucial for an ALE-method to have a suitable method to adjust an existing mesh. There are different ways to enhance and optimize the quality of the mesh, which may involve e.g., swapping faces and edges, or changing the number of vertices.

Meshing algorithms which involve a change of topology or of the number of mesh cells are not suitable for time-dependent, parallel computing. Therefore, it is preferable to use a mesh adaptivity method which avoids remeshing, or at least minimizes its frequency.

To keep a good mesh quality, while limiting the computational cost, our solver combines a linear and a nonlinear mesh smoothing algorithm. The linear smoother accounts for the rough overall re-distribution of the vertices, while the nonlinear smoother optimizes locally the mesh based on the quality of the cells.

#### 3.5.1. Linear Smoother

The linear smoother solves a linear elastic equation in the fluid domain for the mesh velocity, which corresponds to a Poisson equation with Dirichlet boundary conditions given by the structure velocity on the fluid-structure interface, where the vertices are diffusively relocated over the domain. Although it is a simple and fast method, there is no guarantee that improvement is achieved since the equation does not take into consideration the quality of the cells in the mesh.
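The diffusive redistribution performed by the linear smoother can be seen in a one-dimensional analogue. The sketch below is an assumption-laden stand-in, not the solver's implementation: it replaces the finite element discretization by a finite-difference Laplace problem on a uniform line of nodes, with the mesh velocity prescribed at both ends (for instance zero at a fixed wall and the structure velocity at the interface).

```python
import numpy as np


def smooth_mesh_velocity_1d(n_nodes, m_left, m_right):
    """Solve -m'' = 0 on a uniform 1D grid with Dirichlet values at both ends.

    Returns the mesh velocity at all n_nodes nodes (n_nodes >= 3).
    """
    n_inner = n_nodes - 2
    # Standard second-difference matrix for the interior nodes.
    A = (2.0 * np.eye(n_inner)
         - np.eye(n_inner, k=1)
         - np.eye(n_inner, k=-1))
    b = np.zeros(n_inner)
    b[0] += m_left     # boundary values enter the right-hand side
    b[-1] += m_right
    m = np.empty(n_nodes)
    m[0], m[-1] = m_left, m_right
    m[1:-1] = np.linalg.solve(A, b)
    return m


if __name__ == "__main__":
    # Wall at rest on the left, interface moving with unit velocity on the right.
    print(smooth_mesh_velocity_1d(5, 0.0, 1.0))
```

In one dimension the result is simply a linear ramp between the boundary values, which illustrates both the strength of the approach (cheap and smooth) and its weakness: nothing in the equation knows about cell quality.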

#### 3.5.2. Nonlinear Smoother

To locally enhance distorted cells, we describe the deformation of the mesh using a nonlinear elasticity problem, and weight the stiffness of the model by a quality measure Q(K) of each cell K in the mesh T<sup>n</sup>:

$$Q(K) := \frac{||F||\_F^2}{\det(F)^{2/d}d},\tag{15}$$

where d specifies the dimension of the spatial domain and ‖·‖<sub>F</sub> the Frobenius norm. F denotes the deformation gradient between K ∈ T<sup>n</sup> and a scaled equilateral reference cell.

By weighting the equation by Q(K) and advancing the partial differential equation toward its equilibrium, the mesh is improved toward its goal of optimal shape. A more detailed description is given in Jansson et al. (2011).

To limit the computational cost, the nonlinear smoother is stopped after a certain number of "pseudo" time steps k̃ before a stationary solution is obtained. Depending on the quality of the mesh T<sup>n</sup>, the total number of pseudo time steps can be adapted to achieve a desired quality.
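The quality measure of Equation (15) is straightforward to evaluate for a single simplex. The sketch below is a hypothetical helper written for illustration, assuming that F maps a unit equilateral triangle (d = 2) or regular tetrahedron (d = 3) onto the cell; because Q(K) is invariant under uniform scaling, the size of the reference cell does not matter.

```python
import numpy as np

# Edge matrices (columns are the edges leaving vertex 0) of unit equilateral
# reference cells: a triangle for d = 2 and a regular tetrahedron for d = 3.
_REFERENCE_EDGES = {
    2: np.array([[1.0, 0.5],
                 [0.0, np.sqrt(3.0) / 2.0]]),
    3: np.array([[1.0, 0.5, 0.5],
                 [0.0, np.sqrt(3.0) / 2.0, np.sqrt(3.0) / 6.0],
                 [0.0, 0.0, np.sqrt(2.0 / 3.0)]]),
}


def cell_quality(vertices):
    """Q(K) = ||F||_F^2 / (d * det(F)^(2/d)) for a simplex given by d+1 vertices."""
    vertices = np.asarray(vertices, dtype=float)
    d = vertices.shape[1]
    edges = (vertices[1:] - vertices[0]).T            # d x d edge matrix of K
    F = edges @ np.linalg.inv(_REFERENCE_EDGES[d])    # reference cell -> K
    det_F = np.linalg.det(F)
    if det_F <= 0.0:
        raise ValueError("inverted or degenerate cell")
    return np.linalg.norm(F, "fro") ** 2 / (d * det_F ** (2.0 / d))


if __name__ == "__main__":
    right_tet = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
    flat_tet = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 0.01]]
    print(cell_quality(right_tet), cell_quality(flat_tet))
```

By the inequality of arithmetic and geometric means applied to the squared singular values of F, Q(K) ≥ 1 with equality exactly for equilateral cells, so larger values indicate more distorted cells.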

#### 3.6. Modeling of Contact

In order to simulate the closing of a heart valve, an algorithm needs to be implemented to both detect collision and to simulate contact. Our approach is derived from the idea to describe the fluid-structure interaction as a unified continuum. We model contact implicitly by switching fluid cells to solid cells as soon as contact is detected. Collision is detected by solving an Eikonal equation for the distance between two solid surfaces.

In order to detect contact between two leaflets of a native valve, we calculate the minimal distance d<sub>min</sub> := min<sub>i,j, i≠j</sub>{d<sub>ij</sub>} between the leaflets L<sub>i</sub> and L<sub>j</sub> for i, j = 1, 2, 3, as illustrated in **Figure 2A**. To model a proper closure of the leaflets, we include a 2D surface in our volume mesh, which covers the entire valve opening, and as soon as d<sub>min</sub> is below a certain threshold, all cells directly attached under the 2D surface are marked as solid, as shown in **Figures 2B,C**. Since the closing moment of a healthy valve is very short, we argue that it is acceptable to cover the whole opening at once. The contact is released at the beginning of the subsequent contraction phase (systole) of the left ventricle.
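The closure logic can be sketched in a few lines. Note that this example swaps in a different technique for the distance computation: instead of solving an Eikonal equation, it uses a brute-force point-to-point distance between leaflet surface vertices, which is only practical for small point sets. All names (`min_leaflet_distance`, `close_valve`, `closure_cells`) are hypothetical.

```python
import numpy as np


def min_leaflet_distance(leaflets):
    """d_min = min over leaflet pairs (i != j) of the closest vertex distance.

    leaflets : list of (n_i, 3) arrays with surface vertex coordinates.
    Brute force stand-in for the Eikonal-based distance used in the text.
    """
    d_min = np.inf
    for i in range(len(leaflets)):
        for j in range(i + 1, len(leaflets)):
            diff = leaflets[i][:, None, :] - leaflets[j][None, :, :]
            d_ij = np.sqrt((diff ** 2).sum(axis=2)).min()
            d_min = min(d_min, d_ij)
    return d_min


def close_valve(theta, closure_cells, d_min, threshold):
    """Switch the cells under the closing surface from fluid to solid on contact.

    theta is the per-cell phase function (1 = fluid, 0 = solid).
    """
    theta = theta.copy()
    if d_min < threshold:
        theta[closure_cells] = 0.0
    return theta


if __name__ == "__main__":
    leaflets = [np.array([[0.0, 0.0, 0.0]]),
                np.array([[1.0, 0.0, 0.0]]),
                np.array([[0.0, 0.5, 0.0]])]
    d_min = min_leaflet_distance(leaflets)
    print(d_min, close_valve(np.ones(4), [1, 2], d_min, threshold=0.6))
```

The second function shows why contact fits naturally into the unified continuum: closing the valve amounts to overwriting the phase function on a predefined set of cells, with no change to the mesh or the equations.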

#### 3.7. Computational Tools

Nowadays high performance computing is an essential part of computational science. The Heart-FSI solver is implemented in the HPC branch (Jansson, 2013) of the open source FEM library DOLFIN (Logg and Wells, 2010) and the adaptive flow solver Unicorn (Hoffman et al., 2012). Both libraries have successfully been used to efficiently solve large scale industrial problems as described in e.g., Jansson et al. (2011) and Vilela de Abreu et al. (2016).

The simulations were performed on Beskow, a Cray XC40 system, where each node has two CPUs (Intel E5-2698v3) with 16 cores. All volume meshes are created in ANSA (2014), a computer-aided engineering tool for pre-processing.

### 4. VALVE MODELS

In the subsequent paragraphs, we describe how we model native and bileaflet mechanical heart valves (BMHV) embedded in the left ventricle and ascending aorta. For each case, we detail the geometry and specify the material as well as the initial and boundary conditions.

#### 4.1. Native Valve

#### 4.1.1. Geometry

The geometry of the aortic root has been studied, where geometrical parameters are optimized to resemble the function of a trileaflet valve (Swanson and Clark, 1974). Our model is based on such an optimized geometry proposed by Thubrikar (1990).

We generate a computer-aided design (CAD) model of an idealized native aortic root based on a small set of parameters which can be personalized to a particular patient. The aortic root generator is a set of Python scripts for Rhinoceros 5 (Rhinoceros, 2016) that outputs an aortic root in a fully open valve configuration, as presented in **Figure 3A**. The model parameters are illustrated in **Figures 3B–D**.

We assume that the aortic root has a threefold symmetry around the z-axis and label the rotational angle by β, as depicted in **Figure 3B**. The plane P<sub>A</sub> at the annulus and the plane P<sub>S</sub> at the sinotubular junction are assumed to be parallel. The inner radius at the annulus R<sub>A</sub>, the inner radius at the sinotubular junction R<sub>S</sub>, and the length of the leaflet in the open position h<sub>l</sub> are used to define a truncated cone, as shown in **Figure 3C**. To find the leaflet attachment and the leaflet surface, the cone is cut by the plane P<sub>c</sub>, which is defined by three points P<sub>c</sub><sup>1</sup>, P<sub>c</sub><sup>2</sup>, and P<sub>c</sub><sup>3</sup>, see **Figures 3C,D**. These points are determined by the height of the commissure h<sub>c</sub> and the opening angle β. We attach a cylinder with radius R<sub>S</sub> to the aortic root to model the beginning of the ascending aorta. Geometrical parameters for the sinus of Valsalva are not considered as modifiable yet. The thickness is acquired by copying, scaling and translating surfaces. The parameter values used in the simulations are listed in **Table 1**.

#### 4.1.2. Material

The leaflets are made of a very thin, flexible and inextensible material. The fibers in an aortic leaflet are aligned in the circumferential direction (Swanson and Clark, 1974), and the mechanical properties vary in different parts of the aortic valve (Kasyanov et al., 1985). In the framework of this work, it is sufficient to assume the solid material to be homogeneous. As material model we choose an incompressible, neo-Hookean material. At this point of development, the material parameters are set to µ<sub>s</sub> = 3.3 · 10<sup>3</sup> MPa and ρ = 1,000 kg/m<sup>3</sup>. Although these parameters do not conform with realistic values yet, typical characteristics of the flow and valve dynamics can be captured.

#### 4.1.3. Initial and Boundary Conditions

Even though in the initial geometry the valve is in a fully open position, the leaflets are pushed into a starting configuration to facilitate the movement of the leaflets. In order to remove excessive leaflet material resulting from the deformation, we prescribe a constant, initial stress in radial direction such that the material behaves like a contracting balloon which was stretched. The starting position for our simulations with initial radial stress 4 Pa is shown in **Figure 4A**. The stress is reset for the FSI simulation.

We only consider the two major phases, systole and diastole, and one heart cycle lasts for 1.124 s. The inflow profile is flat and the magnitude is adopted from the left ventricle flow simulations presented in Spühler et al. (2015). At the end of systole, the direction of the inflow is inverted to create a backflow which is physiologically consistent and helps the valve to close. The time-dependent inflow magnitude is plotted in **Figure 4B**. Diastole starts when the valve is closed and the inflow is set to zero. A homogeneous Dirichlet boundary condition for the pressure is set at the outlet.

#### 4.2. Bileaflet Mechanical Heart Valve

#### 4.2.1. Geometry

Pathological conditions caused by valvular dysfunction, in the form of a narrowing of the valve opening (stenosis) or insufficient closing of the leaflets, reduce the efficiency of the heart. To restore the hemodynamic function, the native heart valve may need to be repaired or even replaced by an artificial implant. Since the first clinical implantation of an artificial valve by Dr. Charles A. Hufnagel in 1952, many different mechanical and bioprosthetic valves have been developed. Due to their wear resistance, bileaflet mechanical heart valves (BMHV) are the most widely favored aortic valve replacements. As can be seen in **Figure 5A**, a typical BMHV is made of a circular housing and two semicircular discs, which are mounted in the housing through a hinge mechanism. Both leaflets rotate passively in response to the fluid dynamics resulting from the periodic contraction and expansion of the left ventricle.

Since feasibility, but not clinical validation is the focus of this paper, a detailed geometric model of a mechanical valve is secondary at this stage of investigation. Therefore, the BMHV models are reduced to the leaflets only, embedded in an idealized aorta as depicted in **Figure 5B**. The geometry is simplified in such a way that no contact between the leaflets occurs.

The use of numerical simulations to study the flow through a mechanical prosthetic heart valve began in the early 1970s. Since then, many simulations of the flow dynamics around a BMHV have been conducted with the aim of elucidating and eliminating complications such as thromboembolism. Simulating flow dynamics in the vicinity of a heart valve is a challenging task. The flow is pulsatile and undergoes transition to turbulence. Patient-specific frameworks and computational models should account for the multi-scale nature of the flow and the deformability of the wall.

#### 4.2.2. Material

We apply the same material model as for the native valve and set the material parameters to µ<sub>s</sub> = 6.5 · 10<sup>5</sup> MPa and ρ = 1,000 kg/m<sup>3</sup>.

#### 4.2.3. Initial and Boundary Conditions

FIGURE 8 | The instantaneous vector field of the velocity using arrows and line integral convolution (LIC), the pressure field and the aortic valve position during RVOT [t = 0.05 (A), 0.08 (B), 0.1 s (C)].

FIGURE 9 | The instantaneous vector field of the velocity using arrows and line integral convolution (LIC), the pressure field and the aortic valve position during the phase of gradual closure [t = 0.25 (A), 0.3 (B)] and RVCT [t = 0.4 s (C)].

Contrary to the native valve, we simulate the fluid-structure interaction of the leaflets and the hemodynamics of the left ventricle (LV) conjointly. A detailed description of the boundary conditions for the numerical simulation of the blood flow in the LV can be found in Spühler et al. (2015). We define a rotational axis by fixing two edge points of each leaflet. The hinges on which the leaflets are placed limit the rotational angle so that the BMHV is properly opened and closed. To mimic this mechanism, we set a threshold for the opening and closing angles, respectively. As soon as a leaflet exceeds this angular barrier during systole or diastole, its position is locked. The leaflets are released from the fully open position if the mean pressure above the valve exceeds the mean pressure under the valve, and are disengaged from the closed position as soon as a new heart cycle starts. The maximal angular opening is set to 45°.
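The locking and release rules for the BMHV leaflets amount to a small state machine, sketched below. This is an illustrative reading of the description above rather than the solver's code: the class and function names are hypothetical, and the closing angle of 0° is an assumed placeholder, since only the maximal opening of 45° is stated in the text.

```python
from dataclasses import dataclass


@dataclass
class Leaflet:
    angle: float          # current opening angle in degrees
    state: str = "free"   # "free", "open" (locked) or "closed" (locked)


def update_lock(leaflet, p_above, p_below, new_cycle,
                max_angle=45.0, min_angle=0.0):
    """Apply the hinge rules once per time step.

    p_above, p_below : mean pressure above and under the valve
    new_cycle        : True at the start of a new heart cycle
    min_angle        : closing threshold (assumed value, not from the text)
    """
    if leaflet.state == "open":
        if p_above > p_below:            # adverse pressure releases the leaflet
            leaflet.state = "free"
    elif leaflet.state == "closed":
        if new_cycle:                    # a new cycle releases the leaflet
            leaflet.state = "free"
    elif leaflet.angle >= max_angle and p_above <= p_below:
        leaflet.angle, leaflet.state = max_angle, "open"
    elif leaflet.angle <= min_angle and not new_cycle:
        leaflet.angle, leaflet.state = min_angle, "closed"
    return leaflet


if __name__ == "__main__":
    leaflet = Leaflet(angle=46.0)
    print(update_lock(leaflet, p_above=0.0, p_below=1.0, new_cycle=False))
```

Clamping the angle when a leaflet is locked keeps it exactly at the hinge limit, so small overshoots from the time integration do not accumulate.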

### 5. RESULTS

In this section we present the numerical results for the native and bileaflet mechanical valves. The 2-D cuts for the native valve and the BMHV are specified in **Figure 6**. Since we do not model the coronary arteries, which originate from the sinus of Valsalva, and the flow within the aortic root is almost quiescent during diastole, the results are based on the first heart cycle. The results for the native and mechanical valve are obtained on meshes with 248,980 and 783,823 vertices, respectively. At the beginning of the simulation and during diastole, the time step size k<sub>n</sub> is set such that the Courant-Friedrichs-Lewy (CFL) number is 0.5. During systole, we have to reduce k<sub>n</sub> such that the mesh smoothing algorithms can maintain the mesh quality. No remeshing is required, but in the worst case the CFL number had to be reduced to 0.01 to bypass a sensitive phase of large and fast deformation of the leaflets. To advance the solver one time step, the momentum and continuity equations, the Eikonal equation for contact detection, and the linear and non-linear elasticity equations for mesh smoothing have to be solved. When distributing ∼2,000 vertices per core, each sub-problem is solved in less than 0.5 s, but the total time per time step is about 5 s. The latter can vary slightly depending on the quality of the mesh, since the cost of the non-linear elastic smoother is higher when the mesh quality is low.
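The link between the CFL number and the time step can be made explicit with a short helper. The sketch assumes the common cell-wise definition CFL = |U| k<sub>n</sub> / h, which the text does not spell out; in an ALE setting the speed relative to the mesh might be the more appropriate choice. The function name and the sample values are hypothetical.

```python
import numpy as np


def cfl_time_step(cell_sizes, cell_speeds, cfl=0.5, min_speed=1e-12):
    """Largest k_n such that |U| k_n / h <= cfl holds in every cell.

    cell_sizes  : cell diameters h (m)
    cell_speeds : velocity magnitudes |U| per cell (m/s)
    min_speed   : floor that avoids division by zero in stagnant cells
    """
    h = np.asarray(cell_sizes, dtype=float)
    speed = np.maximum(np.asarray(cell_speeds, dtype=float), min_speed)
    return cfl * np.min(h / speed)


if __name__ == "__main__":
    h = [1e-3, 2e-3]
    speed = [1.0, 4.0]
    print(cfl_time_step(h, speed, cfl=0.5))    # diastolic setting
    print(cfl_time_step(h, speed, cfl=0.01))   # worst case reported in systole
```

Lowering the CFL number from 0.5 to 0.01 reduces the time step by a factor of 50, which gives a sense of the cost of the sensitive phase mentioned above.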

#### 5.1. Simulation Results of the Native Valve

First, we examine the opening and closing movement of the aortic valve, which can be divided into four phases (Bellhouse and Talbot, 1969; Labrosse et al., 2010). A rapid valve opening time (RVOT) is followed by a period when the valve stays widely opened (quasi-steady phase). The valve first closes steadily and then rapidly due to reversed flow (RVCT) at the very end of systole. All these stages can be observed in our simulations by measuring the geometric orifice area (GOA), which is calculated by determining the area of the surface used for closing the valve. The time-dependent GOA is depicted in **Figure 7** and matches well the dynamics captured in Labrosse et al. (2010). The rapid opening phase takes about 0.05 s and the valve stays open for about 0.15 s. Three-quarters of the valve closure takes place while the blood is still flowing forward (∼0.15 s), and total closure is obtained by a small amount of reversed flow (∼0.05 s).

To study the flow dynamics, in **Figures 8**, **9** the velocity and pressure fields together with the valve position are visualized at six time instances during the different phases: RVOT (t = 0.05, 0.08, 0.1 s), the phase of gradual closure (t = 0.25, 0.3 s) and RVCT (t = 0.4 s).

During RVOT, the fluid is accelerated over the whole domain flowing toward the outlet. As observed in De Hart et al. (2003), even the blood residing in the sinus cavity is washed out as shown in **Figures 8A–C**.

During the subsequent period, as the valve reaches and stays in the fully opened position, the flow is dominated by a strong, central jet. The flow starts to decelerate at about t = 0.2 s when the valve is still completely opened, and at about t = 0.25 s the flow in the sinus cavity no longer flows toward the outflow, see **Figure 9A**. A small vortex starts to form at the tip of the backside of the leaflet, as depicted in **Figure 10**. By computing Lagrangian coherent structures, Shadden et al. (2010) distinguish two flow domains in this phase of deceleration. They observe a boundary between the strong outflowing jet and the regions with recirculating flow. These features can also be observed in our simulations, as visualized in **Figure 11**.

During the closing phase, two different vortices can be observed, as shown in **Figures 9B,C**, **12**. One vortex is located just above the leaflet and the other one within the sinus cavity. Although they rotate in opposite directions, both of them drive the valve to close. The vortex within the sinus cavity then merges into a streamline flow and only the vortex at the tip of the leaflet is left. A fully reversed flow in the ascending aorta is modeled by altering the inflow condition, and a complete closure of the valve is achieved.

High stress has been connected to leaflet damage and failure. To analyze the stress distribution in the leaflets of our model, the von Mises stress τ<sub>v</sub> in logarithmic scale is computed for the same time instances as the velocity and pressure fields in **Figures 8**, **9**,

$$
\tau_v^2 := \sum_{i,j=1}^{3} \left|\tau_{ij} - \delta_{ij}\frac{1}{3}\operatorname{tr}(\boldsymbol{\tau})\right|^2. \tag{16}
$$

We also visualize the stress distribution at the moment when the valve has just been closed, at t = 0.442 s. As can be observed in **Figure 13**, regions with high stress can be localized to the attachment lines, commissure and leaflet belly. However, due to the low mesh resolution, this is only a qualitative analysis of the stress distribution.
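For reference, the stress measure of Equation (16) is the Frobenius norm of the deviatoric part of the stress tensor. The helper below is a hypothetical illustration that follows Equation (16) as written; it therefore omits the factor √(3/2) found in the common engineering definition of the von Mises stress, which matters only when comparing absolute values across publications.

```python
import numpy as np


def von_mises_eq16(tau):
    """tau_v = sqrt(sum_ij |tau_ij - delta_ij tr(tau)/3|^2), as in Equation (16)."""
    tau = np.asarray(tau, dtype=float)
    deviator = tau - np.trace(tau) / 3.0 * np.eye(3)
    return np.sqrt(np.sum(np.abs(deviator) ** 2))


if __name__ == "__main__":
    hydrostatic = -5.0 * np.eye(3)
    pure_shear = np.array([[0.0, 1.0, 0.0],
                           [1.0, 0.0, 0.0],
                           [0.0, 0.0, 0.0]])
    print(von_mises_eq16(hydrostatic), von_mises_eq16(pure_shear))
```

A purely hydrostatic state gives zero, which is why this quantity isolates the shape-changing part of the load on the leaflets from the pressure.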

No detailed mesh sensitivity studies have been conducted yet. So far, we have only investigated to what extent the point of contact is affected by mesh refinement. For this purpose, the mesh is uniformly refined in the vicinity of the aortic root, and we observe that the point of contact differs slightly, as listed in **Table 2**.

#### 5.2. Simulation Results of the BMHV

To examine the valvular kinematics, we calculate the opening and closing angle as well as the rotational velocity of the leaflets. The rotational angle of the right and left leaflet is defined as depicted in **Figure 14A**, and the results are presented in **Figure 14B**. We observe that both leaflets are slightly open at first and accelerate and decelerate linearly while opening. The right leaflet precedes the left leaflet in the opening phase. This kinematic variation is of course strongly influenced by the geometry of the aorta. They then stay in their fully opened position until they close very rapidly, mainly due to backflow.

The geometry of a BMHV generates three jets, namely one central jet flowing through the gap between the leaflets and two side jets. During the end of the rapid opening phase, vortex rings are shed from the tip of the leaflets due to the difference in the velocity magnitude of the central jet and the two side jets. The vortex rings travel downstream a short distance before they vanish. Snapshots of the velocity field using line integral convolution (LIC) are visualized in **Figure 15A**. **Figure 15B** provides a closer view of the recirculation areas. We use the open source code Saaz to calculate λ<sub>2</sub> for our simulations (King et al., 2011). The threshold for λ<sub>2</sub> is manually adjusted until we can differentiate coherent vortex structures, as shown in **Figure 15C**. The velocity vectors are added to indicate the rotational direction. The vortex observed at t = 0.1 at the right leaflet merges after a very short time into a recirculating flow with opposite direction (t = 0.11) and separates from the leaflet (t = 0.115). Meanwhile, a clockwise vortex develops at the outer part of the left leaflet (t = 0.12), which eventually entails a neighboring, counter-clockwise rotating flow (t = 0.124). The former is swept off downstream, while the latter stays attached to the leaflet. When the valve has reached the fully opened position, no further vortices develop.

#### 6. CONCLUSION

The aim of our research is to develop an open source modular framework for modeling and simulating the blood flow in the heart. In the present work we place prototypes of a native and mechanical aortic valve between the left ventricle and the aorta.

We model both the fluid-structure interaction of the valve and the contact problem in the framework of a unified continuum. This approach to simulating the valvular dynamics is unique and has the advantage that the whole problem can be described by a set of partial differential equations for which the same numerical methods are applicable. Furthermore, no instability issues due to the fluid-structure coupling are encountered. All algorithms are implemented in the FEniCS-HPC software framework optimized for parallel computing.

We generated a CAD model of an idealized native aortic root based on a small set of parameters. Adapting the geometry to patient-specific data and connecting the native aortic root to the left ventricle are left for future work. The bileaflet mechanical heart valve is reduced to the leaflets only, which are embedded in a simplified geometric model of the aorta. In contrast to the native valve, we simulate the fluid-structure interaction of the leaflets and the hemodynamics of the left ventricle conjointly. The next step is a more realistic geometric model of the BMHV.

The weak point of our approach is the degradation of the mesh quality under large mesh deformations. All the simulations were conducted without remeshing, but we usually had to reduce the time step such that the mesh smoothing algorithms could comply with the deformation. The small time step size increased the computational time. Remeshing the volume mesh is an alternative, but not ideal for parallel computing. Thus, this limitation has to be addressed.



Although the material properties of both valves do not conform with realistic values yet, typical characteristics of the flow can be identified. Based on the simulation results, we conclude that our approach for simulating the fluid dynamics around aortic valves is feasible. More anatomically accurate models are targeted as a next step in order to not only examine the hemodynamics but also to test and optimize the design of valve implants. Simulations on larger meshes with higher resolution are to be performed to examine and strengthen the accuracy and robustness of our approach. Extending the BMHV model, connecting the native aortic root to an LV geometry, and running simulations on much larger meshes are the aims of our future work.

### AUTHOR CONTRIBUTIONS

JS had the main responsibility to implement and perform the simulations as well as to prepare the manuscript. The modeling was done in collaboration between JJ and JH, while NJ's work focused on parallel computing in FEniCS-HPC.

### FUNDING

The authors would like to acknowledge the financial support from the Swedish Foundation for Strategic Research, the Swedish Research Council and the European Research Council - ERC Starting Grant UNICON, proposal number 202984. The research was conducted on resources provided by the Swedish National Infrastructure for Computing (SNIC) at the Center of High-Performance Computing (PDC).

### ACKNOWLEDGMENTS

We would like to express our sincere gratitude to Tobias Nilsson for his diligent work to create CAD models of an idealized native aortic root. We also would like to thank ANSA from Beta-CAE Systems S. A., who generously provided an academic license.

#### REFERENCES

deforming domains and complex geometry. Comput. Fluids 80, 310–319. doi: 10.1016/j.compfluid.2012.02.003

Swanson, W. M., and Clark, R. E. (1974). Dimensions and geometric relationships of the human aortic value as a function of pressure. Circul. Res. 35, 871–882. doi: 10.1161/01.RES.35.6.871

Thubrikar, M. (1990). The Aortic Valve. Boca Raton, FL: CRC Press.

with an in vitro test and feasibility study in a patient-specific case. Ann. Biomed. Eng. 44, 590–603. doi: 10.1007/s10439-015-1429-x

Xia, G., Zhao, Y., and Yeo, J. (2009). Parallel unstructured multigrid simulation of 3D unsteady flows and fluid-structure interaction in mechanical heart valve using immersed membrane method. Comput. Fluids 38, 71–79. doi: 10.1016/j.compfluid.2008.01.010

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer JA-S and handling Editor declared their shared affiliation.

Copyright © 2018 Spühler, Jansson, Jansson and Hoffman. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# High-Performance Agent-Based Modeling Applied to Vocal Fold Inflammation and Repair

Nuttiiya Seekhao<sup>1</sup> \*, Caroline Shung<sup>2</sup> , Joseph JaJa<sup>1</sup> , Luc Mongeau<sup>2</sup> and Nicole Y. K. Li-Jessen<sup>3</sup>

<sup>1</sup> Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, United States, <sup>2</sup> Department of Mechanical Engineering, McGill University, Montreal, QC, Canada, <sup>3</sup> School of Communication Sciences and Disorders, McGill University, Montreal, QC, Canada

#### Edited by:

Mariano Vázquez, Barcelona Supercomputing Center, Spain

#### Reviewed by:

Thomas Edward Gorochowski, University of Bristol, United Kingdom

Steve McKeever, Uppsala University, Sweden

> \*Correspondence: Nuttiiya Seekhao nseekhao@umiacs.umd.edu

#### Specialty section:

This article was submitted to Computational Physiology and Medicine, a section of the journal Frontiers in Physiology

Received: 15 December 2017 Accepted: 13 March 2018 Published: 12 April 2018

#### Citation:

Seekhao N, Shung C, JaJa J, Mongeau L and Li-Jessen NYK (2018) High-Performance Agent-Based Modeling Applied to Vocal Fold Inflammation and Repair. Front. Physiol. 9:304. doi: 10.3389/fphys.2018.00304

Fast and accurate computational biology models offer the prospect of accelerating the development of personalized medicine. A tool capable of estimating treatment success can help prevent unnecessary and costly treatments and potential harmful side effects. A novel high-performance Agent-Based Model (ABM) was adopted to simulate and visualize multi-scale complex biological processes arising in vocal fold inflammation and repair. The computational scheme was designed to organize the 3D ABM sub-tasks to fully utilize the resources available on current heterogeneous platforms consisting of multi-core CPUs and many-core GPUs. Subtasks are further parallelized and convolution-based diffusion is used to enhance the performance of the ABM simulation. The scheme was implemented using a client-server protocol allowing the results of each iteration to be analyzed and visualized on the server (i.e., in-situ) while the simulation is running on the same server. The resulting simulation and visualization software enables users to interact with and steer the course of the simulation in real-time as needed. This high-resolution 3D ABM framework was used for a case study of surgical vocal fold injury and repair. The new framework is capable of completing the simulation, visualization and remote result delivery in under 7 s per iteration, where each iteration of the simulation represents 30 min in the real world. The case study model was simulated at the physiological scale of a human vocal fold. This simulation tracks 17 million biological cells as well as a total of 1.7 billion signaling chemical and structural protein data points. The visualization component processes and renders all simulated biological cells and 154 million signaling chemical data points. The proposed high-performance 3D ABM was verified through comparisons with empirical vocal fold data. Representative trends of biomarker predictions in surgically injured vocal folds were observed.

Keywords: high-performance computing, agent-based modeling, biosimulation, inflammation, wound healing, vocal fold, in situ visualization

### 1. INTRODUCTION

### 1.1. Agent-Based Modeling (ABM)

Agent-based modeling is a widely used approach to quantitatively simulate dynamical systems (Macal, 2016). The popularity of ABMs can be observed in the variety of ABM frameworks developed in the past decade (for reviews, please see An et al., 2009; Gorochowski, 2016; Hellweger et al., 2016; Macal, 2016). Each ABM is defined by a set of autonomous agents whose interactions among themselves and with their environment are governed by a number of stochastic or deterministic rules (Hellweger et al., 2016; Macal, 2016). In contrast to equation-based approaches, ABMs are decentralized. That is, the system's behavior is determined by the collective behavior of each individual agent in the system. Although a universal definition of ABMs remains debatable (Macal, 2016), fundamental components of ABM typically include: agent set, agent relationship set, and agents' environment (Macal and North, 2010).

Firstly, a set of agents includes the agents themselves, their attributes and their behavioral rules. Agents' behavioral rules govern their decisions and actions. In ABM, agents can represent a wide spectrum of individual entities such as consumers, markets, and geographic regions in economic models (Tesfatsion, 2006; Caiani et al., 2016), animals in ecosystems (McLane et al., 2011, 2017), and biological cells and proteins in systems biology models (D'Souza et al., 2009; Krekhov et al., 2015; Shi et al., 2016). Secondly, the set of "agent relationships and methods of interactions" (Macal and North, 2010) defines the criteria of a group of entities each agent is bound to interact with, and how these interactions are carried out. For instance, some ABMs may allow agents to interact only directly with other agents, some may allow only indirect interactions while some may allow both (Ausloos et al., 2015). A direct interaction represents an immediate impact one agent leaves on another. Particle collision is an example of a direct interaction, where colliding particle agents affect the states of each other directly. On the other hand, indirect interactions have been used to mimic the lingering effects of transmitted signals (Godfrey et al., 2009; Crandall et al., 2010; Richardson and Gorochowski, 2015; Gorochowski and Richardson, 2017). An example of indirect agent interaction is chemical secretion as a form of intercellular communication. This chemical secretion example is classified as indirect because the agents alter the states of the environment to communicate, rather than altering the states of the recipient agents directly. Lastly, the agents' environment houses the autonomous agents. This space can be discrete lattice-based (Wilensky and Evanston, 1999), continuous lattice-free (Van Liedekerke et al., 2018), or hybrid (Chooramun et al., 2012). The environment may maintain local attributes depending on the application and underlying implementation (Drasdo et al., 2018).

Our first published ABM (Li et al., 2008) was programmed on the NetLogo platform, and thus most of the terminology used herein was adopted from the NetLogo dictionary (Wilensky, 2015). In our implementation, the 3D environment, also known as the ABM world, represents a human tissue. The 3D environment is spatially discretized into rectangular volumes called 3D patches. Each mobile agent represents an inflammatory cell that can move from one patch to an adjacent patch and make decisions to perform certain actions at discrete time steps. Agents make decisions based on the state of the patches, which allows them to alter their environment to interact indirectly with other agents. Chemokines and extracellular matrix (ECM) proteins are associated with the states of the patches.
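The patch-and-agent vocabulary above can be made concrete with a minimal sketch. It is not the VF-ABM implementation: the class names `Patch`, `Cell` and `World`, the functions `secrete` and `migrate`, the face-adjacent neighborhood, and the single `chemokine` value per patch are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class Patch:
    """One rectangular volume of the world; holds environment state."""
    chemokine: float = 0.0


@dataclass
class Cell:
    """A mobile agent identified by the patch it currently occupies."""
    position: tuple


class World:
    def __init__(self, nx, ny, nz):
        self.shape = (nx, ny, nz)
        self.patches = {p: Patch() for p in product(range(nx), range(ny), range(nz))}

    def neighbors(self, position):
        """Face-adjacent patches inside the world (an assumed neighborhood)."""
        result = []
        for axis in range(3):
            for step in (-1, 1):
                candidate = list(position)
                candidate[axis] += step
                if 0 <= candidate[axis] < self.shape[axis]:
                    result.append(tuple(candidate))
        return result


def secrete(world, cell, amount):
    """Indirect interaction: the agent changes its patch, not another agent."""
    world.patches[cell.position].chemokine += amount


def migrate(world, cell):
    """Move one patch toward the highest chemokine level, if any is higher."""
    here = world.patches[cell.position].chemokine
    best = max(world.neighbors(cell.position),
               key=lambda p: world.patches[p].chemokine)
    if world.patches[best].chemokine > here:
        cell.position = best


world = World(3, 3, 3)
sender = Cell(position=(2, 1, 1))
responder = Cell(position=(1, 1, 1))
secrete(world, sender, amount=5.0)   # sender modifies the environment
migrate(world, responder)            # responder reacts to the environment
```

In this toy run the responder never reads the sender's state. It only senses the patch the sender modified, which is the sense in which the interaction is indirect.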

### 1.2. Computational Challenges

The simulation of high-resolution ABMs in biology (Bio-ABM) often deals with large data sets. Processing a large amount of data demands significant computational resources. To address the significant computational demands of large-scale ABMs, multiple high-performance computing (HPC) ABM tools have been developed over the years. These tools have also been used to parallelize bio-ABMs. For example, FLAME (Kiran et al., 2010; Coakley et al., 2012) is an implementation of an ABM framework for parallel architectures based on stream X-machines. FLAME has been used to speed up the simulation of ecological systems in various fields including systems biology (Richmond et al., 2010). FLAME GPU (Richmond et al., 2009; Richmond and Chimeh, 2017) and SugarScape on steroids (D'Souza et al., 2007) represent efforts to support ABM acceleration on GPU platforms. These tools have demonstrated their applicability to biological system simulations such as tissue wound and disease modeling (D'Souza et al., 2009; Richmond et al., 2010; de Paiva Oliveira and Richmond, 2016). Repast HPC (Collier and North, 2013) was developed as an MPI extension to its predecessors, Repast and Repast Simphony (Collier, 2003; North et al., 2005). Repast HPC was adopted to accelerate the simulation of bone tissue growth (Murphy et al., 2016).

Multiple HPC ABM tools have also been developed specifically for systems biology applications. One example is AgentCell (Emonet et al., 2005), a Repast-based framework for single cells and bacterial populations. The AgentCell framework provides support for running multiple non-interacting single-cell instances concurrently on massively parallel computers. Further examples include HPC ABM frameworks for multi-core CPUs such as CompuCell3D (Swat et al., 2012a,b), CellSys (Hoehme and Drasdo, 2010), and Morpheus (Starruß et al., 2014). These frameworks target multi-core CPU acceleration on a single compute node using OpenMP. In addition, other techniques have been proposed to accelerate specific biological models on multi-core CPUs or GPUs (Christley et al., 2010; Falk et al., 2011; Zhang et al., 2011; Cytowski and Szymanska, 2014). However, none of the aforementioned HPC ABM techniques or tools exploit the computing power of both CPUs and GPUs simultaneously, resulting in sub-optimal resource utilization.

Another significant challenge in systems biology modeling lies in the multi-scale nature of the model (Dallon, 2010; Eissing et al., 2011; Cilfone et al., 2014; Schleicher et al., 2017). To ensure optimal performance, it is important for differences in spatiotemporal scales between cellular and chemical interactions to be handled in a cost-effective manner. Cellular movements occur at a rate of micrometers per hour (µm/h), while cytokine diffusion in tissue occurs at a rate of micrometers per second (µm/s). A naive approach would be to iteratively simulate the model at the smallest temporal scale required. However, this approach would result in a prohibitive increase in the computational cost. A possible solution is to use coarse-graining techniques to lower the computational intensity (Qu et al., 2011). The concept of coarse-graining in ABM refers to the simulation of super-agents whose rules represent aggregated behaviors of smaller units (Chang and Harrington, 2006; Maus et al., 2011; Sneddon et al., 2011). Our earlier 2D framework uses a mechanism that captures the behavior of multiple iterations of the finer-scale processes, i.e., chemical diffusion, over a coarse time window using convolution (Seekhao et al., 2016). This intensive computation is then offloaded to a single GPU while the CPU cores focus on coarse-grain cellular processes.
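The idea of folding several fine-scale diffusion steps into one coarse step by convolution can be illustrated with a small sketch. This is a 1D toy under stated assumptions, not the authors' 3D GPU kernel: the stencil `FINE_KERNEL`, the step count `FINE_STEPS`, the periodic boundaries, and all function names are hypothetical.

```python
def convolve_full(a, b):
    """Full discrete convolution of two 1D sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def compose_kernel(kernel, steps):
    """Kernel equivalent to applying `kernel` `steps` times in a row."""
    composed = [1.0]
    for _ in range(steps):
        composed = convolve_full(composed, kernel)
    return composed


def apply_periodic(field, kernel):
    """Convolve `field` with an odd-length kernel using periodic boundaries."""
    n, radius = len(field), len(kernel) // 2
    return [
        sum(kernel[j] * field[(i - (j - radius)) % n] for j in range(len(kernel)))
        for i in range(n)
    ]


FINE_KERNEL = [0.1, 0.8, 0.1]   # hypothetical one-step diffusion stencil
FINE_STEPS = 5                  # fine steps folded into one coarse step

field = [0.0] * 16
field[8] = 100.0                # a single chemical source

# Reference: iterate the fine-scale stencil step by step.
iterated = field
for _ in range(FINE_STEPS):
    iterated = apply_periodic(iterated, FINE_KERNEL)

# Coarse-grained: one convolution with the precomposed kernel.
coarse_kernel = compose_kernel(FINE_KERNEL, FINE_STEPS)
coarse = apply_periodic(field, coarse_kernel)

max_difference = max(abs(a - b) for a, b in zip(iterated, coarse))
```

The two results agree up to floating-point rounding because convolution is associative, so the composed kernel can be computed once and reused for every coarse step. With other boundary conditions the equivalence would only be approximate near the domain edges.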

An effective visualization component is essential for understanding the progress of the simulation and emerging trends. However, with billions of data points being produced after each iteration, implementing real-time visualization is not trivial. Usually, visualization is performed on pre-simulated/preprocessed data that are stored on disk. Such a method is known as post-hoc visualization. On the other hand, large simulation data sets have prompted work on running the simulation and visualization simultaneously, also known as in situ visualization (Rivi et al., 2012; Nvidia, 2014). In situ visualization allows the outputs to be analyzed on the same machine that produced them. The ability to perform on-site data analysis reduces the amount of data movement between the server and remote users. This property makes in situ visualization an ideal way to visualize simulations that produce large data sets, such as ours. Paraview Catalyst (Bauer et al., 2013; Ayachit et al., 2015) and the work reported in Kuhlen et al. (2011) are examples of libraries developed to enable in situ processing of simulation output on popular existing visualization frameworks such as Paraview (Henderson et al., 2004) and VisIt (Childs et al., 2005). A bitmap-based and a quadtree-based ABM approach (Krekhov et al., 2015; Su et al., 2015) were proposed respectively to analyze the numerical output in situ and reduce non-essential simulation data. Most of these strategies were able to reduce the disk loads, but still required disk storage for the remaining essential data. In the present work, similar to Seekhao et al. (2016, 2017), VirtualGL was employed as a tool for developing in situ visualization of an ABM that circumvents disk storage and directly visualizes simulated outputs held in RAM. This real-time visualization feature would assist researchers in tracking the progress and steering the course of the simulation.

### 1.3. Case Study—Vocal Fold Inflammation and Repair

#### 1.3.1. Problem Background

In the United States, voice problems were estimated to affect one in 13 adults annually (Bhattacharyya, 2014). In one study, nearly one third of the sampled population had experienced voice disorder symptoms at some point in their lifetime (Roy et al., 2004). In particular, voice disorders constitute a major occupational hazard in many professions such as salespeople, teachers, performing artists, attorneys, and sport coaches, due to the intensive vocal demand of the job (Vilkman, 2000; Verdolini and Ramig, 2001; Jones et al., 2002; Fellman and Simberg, 2017). The estimated lifetime prevalence of voice disorders is as much as 80% in occupational voice users (Cutiva et al., 2013; Martins et al., 2015). Human vocal folds are under continuous biomechanical stress during voice production. Excessive phonatory stress can induce a cell-mediated inflammatory response and structural tissue damage, leading to a pathological condition (Gunter, 2004; Li et al., 2013; Kojima et al., 2014). Patients with phonotraumatic lesions are usually prescribed behavioral voice therapy (Johns, 2003; Misono et al., 2016) or surgical excision of the lesion in combination with various adjunctive treatments (Hansen and Thibeault, 2006; Hirano et al., 2013; Ingle et al., 2014; Moore et al., 2016). Unfortunately, the healing outcome of voice treatments often depends on the lesion, the treatment dose, and the patient's vocal needs (Abbott et al., 2012; Roy, 2012; Li N.Y. et al., 2014). The success rate of voice treatment varies extensively between 30 and 100% (MacKenzie et al., 2001; Zeitels et al., 2002; Wang et al., 2014; Vasconcelos et al., 2015), making the treatment planning process difficult for voice therapists and surgeons. The unpredictable treatment outcome is axiomatic and takes a huge toll on a person's career, a clinician's decision-making process and society's healthcare costs. A computational tool that can estimate voice treatment success would spare patients from unnecessary and costly treatments and potentially harmful side effects.

Computer simulations have become central to personalized medicine (Deisboeck, 2009; Chen and Snyder, 2012; Li et al., 2016; Canadian Institutes of Health, 2017). This approach involves the creation of computational models to estimate treatment outcome and identify the best possible treatment for a given patient. Simulation modeling involves the integration of the best available knowledge into a computer platform to represent the real-world problem. The process involves an abstraction of causal relationships between patient variables and health outcomes followed by a rigorous and iterative protocol of model calibration and validation (Galea et al., 2009; Marshall and Galea, 2014; O'Donnell et al., 2016). The property that sets numerical simulation models apart from standard statistical models is the observability of the evolution of patient behaviors and health conditions in the computer model as time passes during simulation. Such an approach provides a computational tool for clinicians to evaluate the impact of intervention or other modifiable variables on health outcomes in advance or along any point during the intervention.

Computer models have been developed for complex health conditions, including sepsis (Clermont et al., 2004; Kumar et al., 2004; Vodovotz et al., 2006), traumatic brain injury (Vodovotz et al., 2010), acute liver failure (Wlodzimirow et al., 2012), diabetes (Boyle et al., 2010; Day et al., 2013), obesity (El-Sayed et al., 2013; Hammond and Ornstein, 2014), and cardiovascular disease (Hirsch et al., 2010; Li Y. et al., 2014; Li et al., 2015). In our case, a series of ABMs have been developed to numerically simulate the essential biology underlying vocal injury and repair with the goal of helping clinicians to better tailor treatments for patients with voice disorders (Li et al., 2008, 2010a,b, 2011; Miri et al., 2015; Seekhao et al., 2016).

In the current study, an existing high-performance 2D ABM (Seekhao et al., 2016) is substantially enhanced to a much larger 3D model in an attempt to faithfully capture the physiological dimension of human vocal folds. A diffusion kernel reduction technique is used to enhance the performance and ensure that all necessary 3D data required for diffusion fit within the GPU global memory. A scheduling scheme for a heterogeneous compute node, which consists of a multi-core CPU and many-core GPUs, is then used to completely mask the execution time of the computationally intensive diffusion and visualization tasks. This low-cost, high-resolution, and high-performance computing ABM platform with real-time visualization capability is an original concept in disease modeling, and can make complex disease models practical in clinical settings.

#### 1.3.2. Modeling Vocal Fold Repair With ABM (VF-ABM)

In the vocal fold ABM (VF-ABM) used in this work, the inflammatory cells were implemented as agents (Li et al., 2008, 2010b, 2011). The chemokines and ECM proteins were implemented as states of the patches. The aggregation of these components yields the state of the vocal fold (ABM world) at each given point in simulated time. **Table 1** summarizes the roles that each type of cell agent plays in the healing process. At the time of acute injury, the traumatized mucosal tissue within the damaged area triggers platelet degranulation. Various chemokines are then secreted, which stimulates vasodilation and attracts inflammatory cells, namely neutrophils and macrophages, to the wound site. Activated neutrophils and macrophages at the wound area further secrete chemokines to attract fibroblasts and remove cell debris. To repair the wound, activated fibroblasts proliferate and deposit ECM proteins such as collagen, elastin, and hyaluronan. These ECM proteins then form a scaffold for supporting fibroblasts in wound contraction, cell migration, and other wound repair activities (Bainbridge, 2013). The flow diagram of the interactions between all the components in the model is shown in **Figure 1** (modified from Li et al., 2008). In each iteration, the VF-ABM executes the following major steps:


#### 2. MATERIALS AND METHODS

The 3D ABM simulation suite includes both computation and visualization components. The computational tasks can be categorized as coarse- or fine-grain. Coarse-grain tasks include inflammatory cell and ECM functions, which involve more complex control structures and relatively small data movements. On the other hand, fine-grain tasks include the diffusion of the different chemicals, which involves relatively simple operations applied to large amounts of data. In this section, we start by describing our hardware and software environment. We will then discuss how task assignments and coordination are performed to ensure correct synchronization and maximize load balance. Finally, we will describe how each task category underwent optimization specific to its computational and data access characteristics. The model size and configuration details are summarized in **Table 3**. The source code of the VF-ABM prototype with optimizations described in this work can be found at https://github.com/VF-ABM/hpc-abm-vf-version\_0\_6.

TABLE 1 | Summary of agent rules.

TABLE 2 | Summary of NVIDIA Tesla M40 GPU specifications.

#### 2.1. Hardware and Software Environment

Our high-performance VF-ABM was tested and benchmarked on a compute node with a 44-core Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz host and two attached accelerators, NVIDIA Tesla M40. **Table 2** summarizes the GPU specifications.

Each Tesla M40 GPU provided 3,072 cores and 24 GB of global memory. C++, a lightweight programming language, was used to implement the program to ensure fast and efficient simulation. To utilize the multiple CPU cores available, Open Multi-Processing (OpenMP) was used to parallelize coarse-grain cellular processes. OpenMP is a highly portable Application Programming Interface (API) that supports multi-threading on shared-memory platforms via a set of platform-independent compiler directives (Dagum and Enon, 1998). OpenMP was further used to allocate separate threads to communicate with and launch tasks on the GPUs. Chemical diffusion tasks were offloaded to the GPUs due to their high computational needs. These tasks were programmed using the NVIDIA Compute Unified Device Architecture (CUDA) (Nvidia, 2007) model. CUDA is a parallel computing platform and programming model, which allows general-purpose multithreaded programming of GPUs via C-like language extension keywords. In the CUDA model, a GPU is presumed to be attached to the host (CPU), which controls data movement to and from the GPU. The CPU is responsible for launching kernels, which are functions to be executed by all threads launched on the GPU. Open Graphics Library (OpenGL) was used to implement the visualization component of the simulation. OpenGL is an open standard, cross-language API for 2D and 3D rendering. OpenGL is widely used over a broad range of graphics applications due to its portability and speed.

### 2.2. Scheduling and Coordination of CPU-GPU Computation and Visualization

The 3D VF-ABM consisted of an environment with 154 million patches (**Table 3**). Each patch stored information of ECM proteins and chemical data. In addition, around 17 million mobile agents, representing the inflammatory cells, resided in this ABM world. The model simulated the dynamic biological processes pertinent to vocal fold inflammation and repair at 30 min time intervals. At each model iteration, the operations corresponding to ECM functions, chemical diffusion, and cell (agent) functions were executed followed by the update of the ABM world. Given the computational complexity and the amount of data involved, each iteration required a careful mapping and scheduling of these operations on the available hardware resources. In addition, the visualization provides essential spatial information of ECM proteins, chemicals, and inflammatory cells during the simulation. The overall goal was to simulate and visualize the 3D VF-ABM as fast as possible for each iteration.
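To give a feel for these magnitudes, the short calculation below combines the figures quoted in this article (154 million patches, 17 million cells, 1.7 billion chemical and protein data points, 30 simulated minutes per iteration, under 7 s of wall-clock time per iteration). The variable names are illustrative, and the derived values are upper-bound estimates rather than measurements reported by the authors.

```python
PATCHES = 154_000_000            # patches in the 3D world
CELLS = 17_000_000               # mobile agents (inflammatory cells)
DATA_POINTS = 1_700_000_000      # chemical and structural protein values
SIM_MINUTES_PER_ITERATION = 30   # simulated time covered by one iteration
MAX_WALL_SECONDS_PER_ITERATION = 7  # reported upper bound per iteration

# Average number of chemical/protein values stored per patch.
values_per_patch = DATA_POINTS / PATCHES

# Average number of patches available per cell.
patches_per_cell = PATCHES / CELLS

# Iterations needed to cover one simulated day.
iterations_per_day = (24 * 60) // SIM_MINUTES_PER_ITERATION

# Upper bound on the wall-clock time for one simulated day, in seconds.
wall_seconds_per_day = iterations_per_day * MAX_WALL_SECONDS_PER_ITERATION

# Lower bound on how much faster than real time the simulation runs.
speedup_vs_real_time = (SIM_MINUTES_PER_ITERATION * 60) / MAX_WALL_SECONDS_PER_ITERATION

print(f"{values_per_patch:.1f} values per patch")
print(f"{patches_per_cell:.1f} patches per cell")
print(f"{iterations_per_day} iterations per simulated day")
print(f"at most {wall_seconds_per_day} s of wall-clock time per simulated day")
print(f"at least {speedup_vs_real_time:.0f}x faster than real time")
```

Under these figures, one simulated day takes 48 iterations and at most 336 s of wall-clock time.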

The typical approach to tackle such computational complexity has been to use multi-core CPUs and many-core GPUs. Accelerators such as GPUs need a CPU host, and each of the GPU and CPU has a number of cores that can be exploited using parallel programming techniques. However, GPUs have received much more attention in general whenever accelerated performance is the main goal due to their extremely high performance in data parallel computations. Often, this hardware preference results in idle CPUs, waiting for GPUs to perform all the work after the dispatch of the computing tasks to the GPUs. In this work, the aim was to exploit the resources available on both the CPU and GPU simultaneously so as to achieve the best possible performance. In fact, a host-device computation overlap technique was used in our earlier work, resulting in much improved performance for the 2D ABM framework (Seekhao et al., 2016). However, the 3D ABM framework was substantially more computationally demanding. The previous methods were thus further developed to achieve the desired high speed simulation and visualization necessary for the 3D ABM framework.

To achieve optimal resource utilization, it is important to address the challenges of load balancing, minimizing data movements between the CPU and GPU, and coordinating the tasks on various devices. As we moved from 2D (Seekhao et al., 2016) to 3D, the computational complexity of the simulation and the amount of data involved increased substantially. Furthermore, the execution time of the visualization component, which was negligible in the 2D simulation, became significant. Therefore, the issues of task assignment, load balancing, and device coordination needed to be revisited and addressed properly.

**Figure 2** illustrates the workflow of the 3D ABM simulation during each iteration. Specifically, it describes the task allocation on a platform consisting of a single multicore CPU with NGPU GPUs attached to it. For our specific setup consisting of 2 GPUs, the simulation started on the CPU host, and then split into three paths: coarse-grain, fine-grain/visualization, and fine-grain. Each of the paths was run on separate hardware resources. The first path spawned multiple CPU threads to execute coarse-grain tasks on CPU cores. The second path was responsible for visualization and some of the fine-grain tasks that execute on a single GPU resource. The remaining fine grain tasks executed on the rest of the GPUs. All paths met at the end to exchange and update the ABM world.

The overlap of visualization and computational components required careful device coordination as these components now shared computing resources. Algorithm 1 describes, at a high level, how to map tasks and perform host-device synchronization. Each GPU task, computational or visualization, has its own CPU thread for data management and communication with the GPUs. Nested CPU threads were launched at three levels. At the first level, the driver started the execution by initializing the simulation and launching two threads, one for visualization and the other for computation.

tasks), while the other is used for both visualization and diffusion. With p available CPU cores, p − NGPU − 1 or p − 3 threads are allocated for coarse-grain functions. The other NGPU threads are in charge of managing data transfers and dispatching fine-grain tasks to the GPUs, and the last thread is spared for visualization.
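The thread budget stated above (p − NGPU − 1 threads for coarse-grain work, NGPU threads for GPU management, one thread for visualization) can be checked with a small helper. The function name `allocate_threads` and the returned dictionary keys are hypothetical. The example values p = 44 and NGPU = 2 follow the hardware described in this article.

```python
def allocate_threads(p, n_gpu):
    """Split p CPU cores into coarse-grain, GPU-manager and visualization threads."""
    coarse_grain = p - n_gpu - 1
    if coarse_grain < 1:
        raise ValueError("not enough CPU cores for the requested number of GPUs")
    return {
        "coarse_grain": coarse_grain,  # threads running cell and ECM functions
        "gpu_managers": n_gpu,         # one thread per GPU for transfers and launches
        "visualization": 1,            # one thread driving the rendering
    }


allocation = allocate_threads(p=44, n_gpu=2)
print(allocation)
```

For the two-GPU node this gives 41 coarse-grain threads, which matches the p − 3 figure quoted above.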

**Algorithm 1:** Pseudocode describing CPU-GPU scheduling related functions in Driver, Computation and Visualization class

```
Function Driver::run():
  init()
  launchCPUthreads(2)
    if thread_id == 0 then
      Visualization.start()
    else
      Computation.start()
  return

Function Visualization::start():
  while !simulationDone do
    renderOnGPU()
    visualizationDone ← 1   // Notify Computation class of visualization completion
    while computationDone ≠ 1 do
      // wait for computation on both CPU and GPUs to complete
    computationDone ← 0     // reset computation completion flag
  return

Function Computation::start():
  while !simulationDone do
    launchCPUthreads(2)
      if thread_id == 0 then
        executeCPUtasks()
      else
        executeGPUtasks()
    syncAndUpdateWorld()    // Sync CPU and GPU chemical data
    computationDone ← 1     // Notify Visualization class of computation completion
  return
```
The visualization rendered the current state of the ABM world using an available GPU, and then broadcast the completion of the rendering task. Concurrently with the visualization execution, the computation started by launching two more threads at the second level. Both threads at this level further launched multiple threads at the third level, depending on the number of cores available. More specifically, the first thread at level 2 was responsible for executing CPU tasks, and launched parallel level-3 threads for coarse-grain task parallelization. The second thread at level 2 spawned NGPU level-3 threads to launch fine-grain computation tasks on available GPUs. Note that if the visualization was not yet completed, one of the GPUs would not be available and the fine-grain tasks would have to wait (Algorithm 2). If a fine-grain task had grabbed the same GPU used for visualization, it would have to broadcast its completion so that the visualization could proceed.

**Algorithm 2:** Pseudocode describing VF-ABM operations and workflow

```
Procedure executeCPUtasks()
  /* model computation */
  launchCPUthreads(p − NGPU − 1)  // p denotes the number of available CPU cores
    for each Patch pt ∈ 3Dworld do
      if pt.conditionMet() then
        pt.seedCell()
      pt.ECMFunction()
      pt.fragmentECMs()
    for each Cell c ∈ InflammatoryCells do
      c.cellFunction()
    /* model update (excluding chemical data update) */
    for each Patch pt ∈ 3Dworld do
      pt.updateECMs()
      pt.updatePatch()
    for each Cell c ∈ InflammatoryCells do
      c.updateCell()

Procedure executeGPUtasks()
  launchCPUthreads(NGPU)
    gpu_id ← thread_id
    if gpu_id == gpu_id_vis then
      while visualizationDone ≠ 1 do
        // wait for visualization on GPU to complete
      visualizationDone ← 0  // reset visualization completion flag
    for each ChemicalType ct ∈ ChemicalTypeSet[thread_id] do
      diffuseChemicalOnGPU(ct, gpu_id)  // using GPU FFT library (i.e., NVIDIA cuFFT)
                                        // for convolution computations
```
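The flag-based handshake of Algorithms 1 and 2 can be sketched with ordinary CPU threads. The following is a minimal Python illustration under stated assumptions, not the authors' implementation: GPU rendering and diffusion are replaced by log entries, the busy-wait flags are replaced by `threading.Event` objects, and the names `run_simulation`, `vis_gpu` and the logged tuples are hypothetical.

```python
import threading


def run_simulation(n_iters, n_gpus=2, vis_gpu=0):
    """Return the ordered event log for n_iters iterations (illustrative only)."""
    visualization_done = threading.Event()
    computation_done = threading.Event()
    log, log_lock = [], threading.Lock()

    def record(event):
        with log_lock:
            log.append(event)

    def visualization():
        for it in range(n_iters):
            record(("render", it))        # stand-in for renderOnGPU()
            visualization_done.set()      # notify the computation side
            computation_done.wait()       # wait for CPU and GPU tasks
            computation_done.clear()      # reset the completion flag

    def gpu_task(it, gpu_id):
        if gpu_id == vis_gpu:             # this GPU is shared with rendering
            visualization_done.wait()
            visualization_done.clear()
        record(("diffuse", it, gpu_id))   # stand-in for diffuseChemicalOnGPU()

    def computation():
        for it in range(n_iters):
            # One worker for the coarse-grain CPU tasks, one per GPU.
            workers = [threading.Thread(target=record, args=(("cpu", it),))]
            workers += [threading.Thread(target=gpu_task, args=(it, g))
                        for g in range(n_gpus)]
            for w in workers:
                w.start()
            for w in workers:
                w.join()
            record(("sync", it))          # stand-in for syncAndUpdateWorld()
            computation_done.set()        # notify the visualization side

    top = [threading.Thread(target=visualization),
           threading.Thread(target=computation)]
    for t in top:
        t.start()
    for t in top:
        t.join()
    return log
```

In this sketch the diffusion task on the shared GPU can only be logged after the rendering of the same iteration, and the next frame is only rendered after the world has been synchronized, which mirrors the ordering that the two completion flags enforce.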

### 2.3. Computational Optimization of Diffusion

Chemical diffusion was the most demanding computational component of the model. As previously mentioned, its computational demand was primarily a result of the extremely small spatiotemporal scale and high rate at which chemical diffusion occurs. To reduce the computational load, a convolution-based method was used to simulate the diffusion process (Seekhao et al., 2016). A fast Fourier transform (FFT) was then used to reduce the complexity of the convolution computations. Lastly, the kernel size was reduced by extracting the densest segment of the Gaussian kernel to optimize the diffusion performance. Note that, since we deal with regular grids for the ABM world, the finite difference method (FDM) is used as opposed to the more computationally intensive integral schemes.

#### 2.3.1. FFT-Convolution-Based Diffusion

In 3D, the diffusion equation with decay can be written as

$$\frac{\partial c}{\partial t} = D \left( \frac{\partial^2 c}{\partial x^2} + \frac{\partial^2 c}{\partial y^2} + \frac{\partial^2 c}{\partial z^2} \right) - \gamma c, \tag{1}$$

where c is the chemical concentration, D is the diffusion coefficient and γ is the decay constant. Assuming that Δx = Δy = Δz, and using a Taylor expansion to discretize the continuous 3D diffusion equation, we get

$$\begin{aligned} c(x, y, z, t+\Delta t) &= \left(1-\frac{6D\Delta t}{\Delta x^2}-\gamma\Delta t\right)c(x, y, z, t) \\ &\quad + \frac{D\Delta t}{\Delta x^2}\big[c(x+\Delta x, y, z, t)+c(x-\Delta x, y, z, t) \\ &\qquad + c(x, y+\Delta y, z, t)+c(x, y-\Delta y, z, t) \\ &\qquad + c(x, y, z+\Delta z, t)+c(x, y, z-\Delta z, t)\big] \end{aligned} \tag{2}$$

subject to the stability constraints

$$
\Delta t \le \frac{\Delta x^2}{6D}. \tag{3}
$$

As shown in **Table 4**, the largest value of D in the set of chemical types in VF-ABM is 900 µm²/min (Spiros, 2000), with patch width Δx = 15 µm. The condition Δt ≤ 2.5 s must therefore hold to meet the stability constraint. Clearly, the complexity of the simulation would be unnecessarily high if the model evolved at Δτ = 2.5 s rather than Δτ = 30 min (1,800 s).
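The arithmetic behind the 2.5 s bound can be checked directly from Equation (3). The short Python sketch below uses exact rational arithmetic; the function name `max_stable_dt_seconds` is hypothetical and only the values quoted in the text (D = 900 µm²/min, Δx = 15 µm, a 30-min tick) are used.

```python
from fractions import Fraction


def max_stable_dt_seconds(D_um2_per_min, dx_um):
    """Largest time step (s) allowed by Equation (3): dt <= dx^2 / (6 D)."""
    dt_min = Fraction(dx_um) ** 2 / (6 * Fraction(D_um2_per_min))
    return dt_min * 60  # convert minutes to seconds


dt = max_stable_dt_seconds(900, 15)   # Fraction(5, 2) seconds
m = Fraction(30 * 60) / dt            # diffusion sub-steps per 30-min tick
print(float(dt), int(m))              # prints "2.5 720"
```

The ratio of 720 sub-steps per biological tick is what motivates collapsing the repeated stencil updates into a single convolution.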

TNF-α, TGF-β1, IL-1β, and IL-6 values are taken from Spiros (2000).

By letting λ = DΔt/Δx², Equation (2) can be rewritten as

$$\begin{aligned} c(x, y, z, t + \Delta t) &= (1 - 6\lambda - \gamma\Delta t) \cdot c(x, y, z, t) \\ &\quad + \lambda \cdot c(x + \Delta x, y, z, t) + \lambda \cdot c(x - \Delta x, y, z, t) \\ &\quad + \lambda \cdot c(x, y + \Delta y, z, t) + \lambda \cdot c(x, y - \Delta y, z, t) \\ &\quad + \lambda \cdot c(x, y, z + \Delta z, t) + \lambda \cdot c(x, y, z - \Delta z, t) \end{aligned} \tag{4}$$

or,

$$c(x, y, z, t + \Delta t) = \sum_{i=x-1}^{x+1} \sum_{j=y-1}^{y+1} \sum_{k=z-1}^{z+1} c(i, j, k, t) \cdot f(x - i, y - j, z - k), \tag{5}$$

where

$$f(x, y, z) = \begin{cases} 1 - 6\lambda - \gamma\,\Delta t & x = 0 \wedge y = 0 \wedge z = 0 \\ \lambda & x = \pm 1 \wedge y = 0 \wedge z = 0, \text{ or} \\ & y = \pm 1 \wedge x = 0 \wedge z = 0, \text{ or} \\ & z = \pm 1 \wedge x = 0 \wedge y = 0 \\ 0 & \text{otherwise.} \end{cases}$$

Clearly, Equation (2) is equivalent to Equation (5), thus c(x, y, z, t + Δt) = c(x, y, z, t) ∗ f(x, y, z), where ∗ represents the convolution operation. To compute c(x, y, z, τ + Δτ), where Δτ = m · Δt, the chemical concentrations from the previous step, c(x, y, z, τ), are convolved with f(x, y, z) m times. The associative property of convolution implies that these repeated convolutions can be collapsed into one: convolving f(x, y, z) with itself m times results in f<sub>m</sub>(x, y, z), and the diffused concentrations at each iteration can be computed as

$$
c(x, y, z, \tau + \Delta \tau) = c(x, y, z, \tau) * f_m(x, y, z). \tag{6}
$$

The diffusion computation can thus be accelerated by computing Equation (6) at a large time step, Δτ, without violating stability constraints. The effective diffusivity of IL-1β in tissue, for example, is 900 µm²/min (Spiros, 2000). In a 15 µm patch world, a 30-min time step implies that the program has to calculate c(x, y, z, τ) ∗ f<sub>720</sub>(x, y, z) at each time step. In other words, a chemical on a given patch (x, y, z) has a spatial diffusion range of x ± 720, y ± 720 and z ± 720, within a window of dimension 1,441 × 1,441 × 1,441, which covers approximately 3 billion patches.
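The equivalence between m explicit finite-difference steps and one convolution with the m-fold kernel can be illustrated on a small grid. The sketch below is a CPU stand-in written with NumPy rather than the GPU FFT library used in the model, and it assumes periodic boundary conditions (which make the convolution circular); the boundary treatment of the actual model is not stated here. The function names and the parameter values in the usage note are hypothetical.

```python
import numpy as np


def stencil_kernel(shape, lam, gamma_dt):
    """7-point kernel f of Equation (5), stored with its center at index (0, 0, 0)."""
    f = np.zeros(shape)
    f[0, 0, 0] = 1.0 - 6.0 * lam - gamma_dt
    for axis in range(3):
        for off in (1, -1):           # index -1 wraps to the last element
            idx = [0, 0, 0]
            idx[axis] = off
            f[tuple(idx)] = lam
    return f


def explicit_step(c, lam, gamma_dt):
    """One update of Equation (4) on a periodic grid."""
    out = (1.0 - 6.0 * lam - gamma_dt) * c
    for axis in range(3):
        out += lam * (np.roll(c, 1, axis=axis) + np.roll(c, -1, axis=axis))
    return out


def diffuse_fft(c, lam, gamma_dt, m):
    """Equation (6): a single circular convolution with the m-fold kernel.

    In Fourier space, convolving f with itself m times is an element-wise
    m-th power, so the kernel f_m never has to be built explicitly.
    """
    f_hat = np.fft.fftn(stencil_kernel(c.shape, lam, gamma_dt))
    return np.real(np.fft.ifftn(np.fft.fftn(c) * f_hat ** m))
```

For example, with `c0 = np.random.default_rng(0).random((8, 8, 8))`, applying `explicit_step` twenty times with `lam=0.1` and `gamma_dt=1e-3` agrees with `diffuse_fft(c0, 0.1, 1e-3, 20)` to floating-point tolerance.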

#### 2.3.2. Kernel Reduction

The diffusion kernel was computed by convolving the initial coefficient function, f(x, y, z), in Equation (5), with itself m = Δτ/Δt times, where Δτ is the biological time step of 30 min and Δt = Δx²/6D is the diffusion time step subject to the stability constraint (Equation 3). As calculated earlier, the effective diffusivity of IL-1β of 900 µm²/min results in a 1,441 × 1,441 × 1,441 kernel.

Note that f(x, y, z) becomes smoother as it is convolved with itself, so a Gaussian-shaped diffusion kernel is obtained. The values of a Gaussian distribution are highest at the center and approach zero with increasing distance from it. This observation enabled the reduction of the kernel size by focusing on the center window, while keeping almost 100% of the kernel mass. The coverage levels of the kernel mass with respect to extracted window sizes are plotted in **Figure 3**.
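The idea of keeping only a central window of the kernel can be sketched as follows. This is an illustrative NumPy example, not the procedure used to produce Figure 3: it builds the m-fold kernel for a small m (the full m = 720 kernel is far too large for a quick demonstration), assumes λ = 1/6 and no decay so that the kernel mass is 1, and uses the hypothetical helper names `diffusion_kernel` and `mass_coverage`.

```python
import numpy as np


def diffusion_kernel(m, lam=1.0 / 6.0, gamma_dt=0.0):
    """m-fold self-convolution of the 7-point kernel on a (2m+1)^3 grid (m >= 1)."""
    n = 2 * m + 1
    f = np.zeros((n, n, n))
    f[m, m, m] = 1.0 - 6.0 * lam - gamma_dt
    for axis in range(3):
        for off in (-1, 1):
            idx = [m, m, m]
            idx[axis] += off
            f[tuple(idx)] = lam
    # The result has support within +/- m patches, so the circular FFT
    # convolution on this grid equals the linear one (no wrap-around).
    f_hat = np.fft.fftn(np.fft.ifftshift(f))
    return np.fft.fftshift(np.real(np.fft.ifftn(f_hat ** m)))


def mass_coverage(kernel, half_width):
    """Fraction of the kernel mass inside the centered (2*half_width+1)^3 window."""
    c = kernel.shape[0] // 2
    w = slice(c - half_width, c + half_width + 1)
    return kernel[w, w, w].sum() / kernel.sum()
```

Calling `mass_coverage(diffusion_kernel(12), r)` for increasing `r` shows the coverage approaching 1 well before the window reaches the full kernel width, which is the property that the kernel reduction exploits.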

### 2.4. Visualization Optimization

The 3D VF-ABM processes at least 17 million agents in each iteration while producing 1.23 and 0.46 billion chemical data and ECM protein data points, respectively. The model currently does not visualize the state of the ECM proteins on each individual patch, but rather outputs the aggregated ECM protein statistics at the end of the simulation. Because of limited screen space, the user can select only one of the eight chemical types to be visualized in each frame. The visualization component is thus responsible for visualizing 17 million biological cells and 154 million chemical data points. To optimize the visualization of such a large amount of data, sampling was used, and its effects on the simulation output and the corresponding performance enhancements were studied. The performance evaluation is reported in section 3.1.2.

A client-server in situ visualization protocol was employed to bypass disk storage and give users the ability to steer the computation in real time. For a seamless simulation and visualization experience, the latency of the server-client visualization pipeline had to be kept as low as possible even when a large amount of data is being simulated and visualized. One possible approach is to redirect OpenGL commands to the remote X server on the client side (Project, 2015). However, this approach puts a significant load on the network because both the OpenGL calls and the 3D data must be transferred from the server to the remote client. Moreover, this approach burdens the client with all of the rendering responsibilities, making it suitable only for applications with small and static data or for specifically tuned OpenGL applications (Project, 2016). Another possible approach would be to use remote display software. However, some remote display software either lacks the ability to run OpenGL applications or forces OpenGL applications to use a slow OpenGL software renderer (Project, 2016). Given the size of the data produced by the 3D VF-ABM, the most suitable candidate is VirtualGL. The open source package VirtualGL allows any Unix or Linux remote display software to display OpenGL applications on the client's machine while taking full advantage of the server's 3D graphics accelerators (Project, 2015). VirtualGL redirects the OpenGL commands and 3D data to a 3D graphics accelerator on the server. Thus, instead of sending a large number of data points over the network, only a single simulation image frame (shown in **Figure 4**), rendered on the server, is sent to the client in each iteration. Because this protocol shifts most of the rendering load to the server, the client can take full advantage of the server's hardware, which is usually much more powerful than the client's machine. The employment of VirtualGL thus enhances the speed of the visualization through the server's accelerators without imposing much hardware overhead on the client.

### 3. RESULTS

The simulation speed and accuracy are critical in making any biological model clinically useful. This section starts by examining the overall performance of the ABM simulation for our case study of the 3D VF-ABM, thereby illustrating the scalability of the model with respect to the number of cores available. The impact on the simulation accuracy with respect to the computational enhancement is then reported. Section 3.1.3 analyzes the performance of the 3D VF-ABM simulation suite and benchmarks its performance against existing ABM frameworks. Finally, the verification of model outputs is reported in section 3.2.

### 3.1. Performance Evaluation

To optimize the overall simulation suite, each simulation component underwent the optimization techniques described above. Each technique was tailored to the specific computation and data access patterns of the respective component. Thus, their effects on performance were studied with respect to computation, visualization, and coupled simulation-visualization.

#### 3.1.1. Computational Component

Due to the efficiency of the FFT-based diffusion method, diffusing 1.2 billion chemical data points on two GPUs took only 2.5 s per iteration. However, the set of coarse-grain tasks (excluding updates) took about 4 s to execute. As a result, the coarse-grain tasks became the performance bottleneck. That is, the time that the VF-ABM takes to complete the computational component of a single iteration depends on how long it takes to execute the cellular tasks plus the time to synchronize the results. **Figure 5A** shows the execution time for the compute component using different numbers of CPU threads overlapping with two GPUs. These results indicate that the best performance, obtained using 32 threads, is approximately 6.2 s per iteration on average. The average speedup of the computational component as well as the speedups of its two main sub-components across 240 iterations over different numbers of threads are plotted in **Figure 5B**. Tasks were grouped into model functions (cell/ECM/synchronization) and update routines, and their speedups within each respective group were averaged. Notice that the update tasks consisted mostly of memory access operations. These operations were memory bound and thus showed poor scalability. Memory bound refers to the situation in which memory speed cannot keep up with processor speed (McKee, 2004). The memory speed thus becomes the bottleneck of applications with a low ratio of computation operations to memory operations. In contrast, the other model function tasks involved more computation and thus showed good scalability, making the overall speedup of the simulation reasonable.

#### 3.1.2. Visualization Component

The coarse-grain tasks (excluding updates) took about 4.7 s to complete on the CPU. On the other hand, the fine-grain tasks on the GPUs only took 2.5 s. This difference in execution time resulted in an idle period on the GPUs. If the visualization component was fast enough, this window would allow us to integrate visualization with the GPU computation without increasing the total execution time.

The visualization component included the rendering of cell migration, chemical diffusion, and tissue damage tracking. The most time consuming component was the chemical diffusion, which required an access of 154 million points of data during each iteration. As discussed earlier, data sampling was used to improve the visualization performance. **Figure 6** shows the execution time and screenshots of chemical visualization using different sampling window widths. The visualization of the entire world looked almost identical for up to 6<sup>3</sup> sampling windows. Results showed that, looking at the entire simulation area, enough visual information was retained by using a fixed 6<sup>3</sup> sampling window. However, if the user needed to zoom in to highly active areas, a more sophisticated adaptive sampling technique could be used instead of the fixed sampling used here (Seekhao et al., 2017).
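Fixed-stride sampling of a regular 3D field can be sketched in a few lines. The NumPy example below is a hedged illustration, assuming the chemical concentrations are held in a dense 3D array and that a stride-s sample simply keeps every s-th patch along each axis; the function name `sample_for_rendering` and the toy array are hypothetical, and the actual renderer may sample differently.

```python
import numpy as np


def sample_for_rendering(field, stride):
    """Keep every `stride`-th patch along each axis (a view, not a copy)."""
    return field[::stride, ::stride, ::stride]


field = np.arange(12 * 18 * 24, dtype=np.float32).reshape(12, 18, 24)
coarse = sample_for_rendering(field, 6)
print(coarse.shape, field.size // coarse.size)   # prints "(2, 3, 4) 216"
```

A 6<sup>3</sup> window therefore reduces the number of points handed to the renderer by a factor of 216, which for 154 million chemical data points would leave roughly 0.7 million.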

#### 3.1.3. Coupled Simulation and Visualization

Since the visualization execution time was reduced from 23 s down to 0.4 s using data sampling for chemical diffusion, the visualization execution could then be placed in the idle period on one of the GPUs. By placing the visualization execution in a GPU idle gap, the total execution time remained unchanged at 6.2 s per iteration on average. This fast execution time enabled the simulation to execute remote computation, remote visualization, remote transmission of the result frame, and frame rendering on the client's machine in under 7 s/frame. This performance, as far as we know, is the fastest known complex ABM simulation and visualization at a similar scale.
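The scheduling argument can be checked with the timings quoted above. The following Python sketch assumes a simple model in which the CPU tasks run concurrently with the GPU tasks and rendering is serialized with diffusion on the shared GPU; the helper name `fits_in_gpu_idle_gap` is hypothetical.

```python
def fits_in_gpu_idle_gap(cpu_s, gpu_s, vis_s):
    """True if rendering fits in the time the GPU would otherwise sit idle."""
    return vis_s <= cpu_s - gpu_s


print(fits_in_gpu_idle_gap(4.7, 2.5, 0.4))    # prints "True"
print(fits_in_gpu_idle_gap(4.7, 2.5, 23.0))   # prints "False"
```

Under this model the sampled 0.4 s rendering hides entirely behind the 4.7 s of coarse-grain CPU work, whereas the unsampled 23 s rendering would have become the new bottleneck.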

For benchmarking purposes, the 3D VF-ABM was compared to our previous and other ABM works of similar nature (**Figure 7**). The M. tuberculosis (MTb) ABM (D'Souza et al., 2009) was benchmarked on a system with an NVIDIA GeForce 8800M GTX GPU, while a GeForce GTX Titan was used for the FLAME GPU immune system ABM (de Paiva Oliveira and Richmond, 2016). Despite the differences in underlying hardware, the MTb ABM simulation is arguably one of the most suitable works for performance comparison with the 3D VF-ABM. The 2D MTb ABM simulated a complex multi-scale biological system of agents that communicate via chemical signals, which aligned in most respects with the 3D VF-ABM. The human immune system ABM was built on a widely used HPC ABM platform, FLAME GPU (de Paiva Oliveira and Richmond, 2016). Although this ABM executed the immune system at a much smaller timescale, the cell communication method is similar to that of the other ABMs included in this performance comparison, i.e., communication via chemical signals. The FLAME GPU immune system ABM thus served as a good performance reference.

FIGURE 6 | Visualization-only performance. This chart shows visualization screenshots and corresponding execution time for different sampling resolutions. The stride denotes the gap between two consecutive sampled points; thus, the higher the stride, the coarser the sampling. The visual appearance of each sampling case looks almost identical for up to stride 6, or 6<sup>3</sup> sampling windows. The visualization was able to retain sufficient information by using 6<sup>3</sup> sampling.

FIGURE 7 | Processing power of 3D VF-ABM vs. existing work. This bar chart compares workload and execution time in terms of number of patches (i.e., lattice points, grid points, stationary cells) per ms between the 3D VF-ABM and other bio-simulation ABM work. Notice that the 3D VF-ABM is capable of processing 25K patches/ms, or about 900x, 63x, 2.3x, and 2.4x more patch processing power than NetLogo, MTb ABM (D'Souza et al., 2009), FLAME GPU immune system ABM (de Paiva Oliveira and Richmond, 2016), and the earlier 2D VF-ABM work (Seekhao et al., 2016), respectively.

The 3D VF-ABM was simulated at a scale physiologically representative of a human vocal fold. Such a scale was not feasible to implement in the freeware ABM platform NetLogo (Wilensky and Evanston, 1999). Furthermore, to the best of our knowledge, no similar scale had been reported in any other publication. For a common throughput unit, the simulation performance was measured in terms of environment space units per millisecond. The space units represent the smallest granularity of the ABM environment. Depending on the model, the space units can be patches (Wilensky, 2015), grid points (D'Souza et al., 2009), or immobile tissue cells (de Paiva Oliveira and Richmond, 2016). These quantities determine the ABM environment size. Therefore, the number of space units is proportional to the amount of work required to simulate the ABM environment in each iteration. For this reason, space units per millisecond serves as a reasonable throughput measure. The 3D VF-ABM is capable of processing 25K patches/ms, which is about 900x, 63x, 2.3x, and 2.4x the throughputs of NetLogo, MTb, FLAME GPU immune system ABM and the 2D VF-ABM, respectively. The model scale, complexity and performance are compared in **Table 5**. Of note, FLAME GPU can process roughly 1.9x more mobile agents than the 3D VF-ABM per time unit. The primary reason was that the time step used in the FLAME GPU immune system ABM is orders of magnitude smaller than that of our model. This time scale difference caused their agent rules to be much less complex. For example, the FLAME GPU immune system ABM would take roughly 18 h to complete a 5-day simulation, while the 3D VF-ABM takes less than half an hour. In addition, the 3D VF-ABM offered much more rigorous real-time data visualization at a scale of over 100 times more mobile agents than that of the FLAME GPU immune system ABM.
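As a rough consistency check of the throughput figure, the metric can be recomputed from numbers quoted elsewhere in the text. The Python sketch below assumes that the number of patches equals the 154 million chemical data points per chemical type (1.23 billion divided by eight types) and that one iteration takes 6.2 s; how the 25K figure was actually derived is not stated, and the function name `patches_per_ms` is hypothetical.

```python
def patches_per_ms(n_patches, seconds_per_iteration):
    """Environment space units processed per millisecond of wall-clock time."""
    return n_patches / (seconds_per_iteration * 1000.0)


rate = patches_per_ms(154e6, 6.2)
print(round(rate / 1000.0, 1))    # thousands of patches per ms; prints "24.8"
```

Under these assumptions the result is close to the reported 25K patches/ms.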

### 3.2. Verification

The trends of the 3D VF-ABM output were qualitatively verified using the pattern-oriented analytical approach (Railsback, 2001; Grimm et al., 2005; Li et al., 2010b). The purpose of qualitative verification was to ensure that the dynamics of the model reflect what is expected in the wound healing literature and the available experimental data (Railsback, 2001; Grimm et al., 2005; Lim et al., 2006; Welham et al., 2008).

Cell population and ECM protein trends were compared against known patterns reported in wound healing literature as summarized in **Table 6** (Martin, 1997; Witte and Barbul, 1997; Robson et al., 2001; Cockbill, 2002; Tateya et al., 2005; Dechert et al., 2006; Stern et al., 2006; Tateya I. et al., 2006; Tateya T. et al., 2006; Jiang et al., 2007). **Figure 8** shows cellular and molecular outputs of the VF-ABM from a 7-day simulation. The model predicted a peak neutrophil population at the end of day 1 and significant decreases in day 2. The model also reproduced a peak of macrophage population around day 2 and a downward trend from the beginning of day 3 onward. Furthermore, the fibroblast proliferation started around the end of day 1 in the simulation. Trends of these specific cell populations agreed well with the known patterns in wound healing literature (**Table 6**). For ECM outputs, the VF-ABM reproduced the trends of collagen but not of hyaluronan. In particular, both empirical and ABM results showed the accumulation of collagen starting from Day 3. The ABM predicted an earlier accumulation of hyaluronan (Day 1) compared to empirical data (Day 3). This early hyaluronan accumulation might be related to high levels of TNF-α, TGF-β, FGF, and IL-1β that stimulated the secretion of hyaluronan by fibroblasts in the model. More data and calibration are needed for further investigation.

Due to limited data availability, only a subset of chemicals was compared against the empirical data (Lim et al., 2006; Welham et al., 2008). This subset includes measured mRNA levels of three inflammatory mediators (TNF-α, TGF-β, and IL-1β) out of the 8 that are simulated by the model. The comparison of the model outputs and the empirical data is shown in **Figure 9**. The ABM generated a peak of TNF-α 13 h (26 ticks) after injury, whereas this peak occurred at hour 8 (tick 16) in the empirical data. For IL-1β, the model generated a peak at hour 12 (tick 24), whereas the peak was observed at hour 8 in the empirical data. Overall, the ABM-predicted peaks for TNF-α and IL-1β lagged behind the experimentally observed peaks by 4–5 h. The discrepancy between the model outputs and literature data may be explained as follows. First, since TNF-α and IL-1β were down-regulated by TGF-β and IL-10 via macrophages and fibroblasts, a possible reason for the peak delay could be an insufficient strength of TGF-β or IL-10. Second, since no empirical data were reported between hours 8 and 16, a peak within this interval might have been missed experimentally. More empirical data are needed for further investigation. For TGF-β, the model failed to predict the spike at hour 1. However, the sub-linear upward trend from hour 4 until the end of the simulation predicted by the model matched that of the empirical data. In sum, the VF-ABM trajectories of inflammatory mediators showed a few discrepancies when compared with the empirical vocal fold data in the literature. Despite these few discrepancies, the overall dynamics of the VF-ABM outputs are consistent with those seen in the empirical data. Note that for this VF-ABM to be clinically ready, more experimental data are needed to calibrate the model. Future directions of this line of work are discussed in section 4.

TABLE 6 | Summary of patterns used to qualitatively verify the 3D VF-ABM (Li et al., 2010b).

### 4. DISCUSSION

This work presents novel 3D ABM implementation techniques to tackle the heterogeneity of time scales in large-scale and multi-scale computational biology modeling. This 3D ABM for complex biological systems harnessed high-performance computing techniques to accommodate high-resolution models in simulating the model geometry and cellular components in the full physiological dimension without having to scale down the problem size. Kernel volume reduction was used to speed up convolution-based fine-grain chemical diffusion tasks on the GPUs. OpenMP was used to parallelize the coarse-grain cellular tasks on the CPU cores. A task scheduling scheme was then used to overlap and synchronize the coarse-grain, fine-grain diffusion and in situ visualization components. This approach achieved highly concurrent utilization of both multi-core CPUs and GPUs, resulting in minimal hardware resource idle time. The 3D VF-ABM prototype demonstrated the substantial performance improvements that the proposed scheme brings to high-resolution cellular-level models. The high-performance simulation suite is capable of large-scale computing and remote visualization in under 7 s per iteration on average. The computational component tracks 17 million cells and processes 1.7 billion signaling chemical and structural protein data points. The remote visualization component renders 17 million cells and 154 million signaling chemical data points on the server and then sends the resulting frame to the user. Compared to related work of similar nature, the 3D VF-ABM showed roughly 900x, 63x, and 2.3x data processing power over the NetLogo version of vocal fold ABM, MTb ABM (D'Souza et al., 2009), and FLAME GPU immune system ABM (de Paiva Oliveira and Richmond, 2016), respectively.

Model verification of the VF prototype was performed qualitatively against known patterns (Martin, 1997; Witte and Barbul, 1997; Robson et al., 2001; Cockbill, 2002; Tateya et al., 2005; Dechert et al., 2006; Stern et al., 2006; Tateya I. et al., 2006; Tateya T. et al., 2006; Jiang et al., 2007), and against rat vocal fold surgical data (Lim et al., 2006; Welham et al., 2008). The model reproduced the overall dynamics of cellular and molecular trajectories seen in surgical vocal fold injuries. However, in a few cases, such as the trends of hyaluronan and collagen, the model missed predicting their peaks. This mismatch between the model and empirical trends was possibly caused by imbalances in the levels of regulating substances. More data and further calibration are required to investigate this matter.

As discussed earlier, our ABM world currently only supports regular grids and thus FDM applies well to the diffusion computation. An arbitrary shape world is a possible direction of future work that is yet to be explored. A technique such as indirect addressing (Randles et al., 2015) and advanced data structures such as octrees or meshes are examples of possible approaches to an ABM world geometry solution. These techniques clearly offer a more realistic representation of the real-world geometries but will also increase the model complexity. Simple FDM for diffusion may not apply well to these complex geometries. Variations of FDM (Hunt, 1978; Liszka and Orkisz, 1980) and other PDE approximation schemes such as finite element method (FEM) should thus be considered in future ABM developments.

Parallelizable calibration automation is being developed to refine the parameter values of the VF-ABM with additional vocal fold data collected in our laboratory (Li et al., 2012; Heris et al., 2015; Latifi et al., 2016; Li-Jessen et al., 2017) and by others (King et al., 2015; Kishimoto et al., 2016). These efforts are necessary to improve the biological representation of the VF-ABM for the ultimate clinical application. High-performance techniques are being expanded to facilitate more complex data explorations such as active area resolution enhancement (Seekhao et al., 2017) and 3D volume rendering of ECM protein content, tissue fiber orientation and structure, while still maintaining real-time performance. This work focuses on the application of surgical vocal fold injury and repair because the empirical data (Lim et al., 2006; Welham et al., 2008) are available for model verification. However, the host-accelerator (CPU-GPU) coordination, diffusion kernel reduction, and other techniques proposed here can be generalized and applied to other complex multi-scale biological system applications to enhance their performance on heterogeneous HPC platforms.

#### AUTHOR CONTRIBUTIONS

NS and JJ: conceived of the presented ABM optimization ideas; NL-J: designed the VF-ABM rules and the overall concept of the project; CS: designed and implemented the sequential version of the 3D VF-ABM; NS: designed, implemented, and benchmarked the parallel version of the 3D VF-ABM; NS: designed and developed the visualization component of the 3D VF-ABM; NS: analyzed model-generated outputs and performed the qualitative verification; JJ, NL-J, and LM: supervised the project; NS: wrote the manuscript; JJ and NL-J: revised the manuscript critically; JJ, NL-J, and LM: provided funding and computing resources to support the project. All authors provided critical feedback on the project and the manuscript.

### FUNDING

Research reported in this publication was supported by the National Institute on Deafness and Other Communication Disorders of the National Institutes of Health [R03DC012112 (NL-J) and R01DC005788 (LM)]. The authors gratefully acknowledge the support provided by NSF, Contract Number CNS-1429404 MRI project (JJ).

#### ACKNOWLEDGMENTS

The authors would like to thank Yun (Yvonna) Li and Alireza Najafi Yazdi for their contributions to the development of the initial and base sequential model. We would also like to thank Sujal Bista for guidance in developing the visualization component and UMIACS staff for assistance in VirtualGL configuration. Lastly, we would like to thank Samson Yuen for code migration and project deposition on GitHub for public access.

#### REFERENCES

surface damage with increasing time and magnitude doses of vibration exposure. PLoS ONE 9:e91615. doi: 10.1371/journal.pone.0091615


Wilensky, U. (2015). Netlogo Dictionary. NetLogo User Manual, 3.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Seekhao, Shung, JaJa, Mongeau and Li-Jessen. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Coupling Langevin Dynamics With Continuum Mechanics: Exposing the Role of Sarcomere Stretch Activation Mechanisms to Cardiac Function

Takumi Washio<sup>1,2</sup>\*, Seiryo Sugiura<sup>1,2</sup>, Ryo Kanada<sup>3</sup>, Jun-Ichi Okada<sup>1,2</sup> and Toshiaki Hisada<sup>1,2</sup>

*<sup>1</sup> UT-Heart Inc., Kashiwa, Japan, <sup>2</sup> Graduate School of Frontier Sciences, University of Tokyo, Kashiwa, Japan, <sup>3</sup> Predictive Health Team, Integrated Research Group, Compass to Healthy Life Research Complex Program, RIKEN, Kobe, Japan*

#### Edited by:

*Peter V. Coveney, University College London, United Kingdom*

#### Reviewed by:

*Joakim Sundnes, Simula Research Laboratory, Norway; Arun V. Holden, University of Leeds, United Kingdom*

> \*Correspondence: *Takumi Washio washio@sml.u.tokyo.ac.jp*

#### Specialty section:

*This article was submitted to Computational Physiology and Medicine, a section of the journal Frontiers in Physiology*

Received: *08 December 2017* Accepted: *16 March 2018* Published: *06 April 2018*

#### Citation:

*Washio T, Sugiura S, Kanada R, Okada J-I and Hisada T (2018) Coupling Langevin Dynamics With Continuum Mechanics: Exposing the Role of Sarcomere Stretch Activation Mechanisms to Cardiac Function. Front. Physiol. 9:333. doi: 10.3389/fphys.2018.00333*

High-performance computing approaches that combine molecular-scale and macroscale continuum mechanics have long been anticipated in various fields. Such approaches may enrich our understanding of the links between microscale molecular mechanisms and macroscopic properties in the continuum. However, there have been few successful examples to date owing to various difficulties associated with overcoming the large spatial (from 1 nm to 10 cm) and temporal (from 1 ns to 1 ms) gaps between the two scales. In this paper, we propose an efficient parallel scheme to couple a microscopic model using Langevin dynamics for a protein motor with a finite element continuum model of a beating heart. The proposed scheme allows us to use a macroscale time step that is an order of magnitude longer than the microscale time step of the Langevin model, without loss of stability or accuracy. This reduces the overhead required by the imbalanced loads of the microscale computations and the communication required when switching between scales. An example of the Langevin dynamics model that demonstrates the usefulness of the coupling approach is the molecular mechanism of the actomyosin system, in which the stretch-activation phenomenon can be successfully reproduced. This microscopic Langevin model is coupled with a macroscopic finite element ventricle model. In the numerical simulations, the Langevin dynamics model reveals that a single sarcomere can undergo spontaneous oscillation (15 Hz) accompanied by quick lengthening due to cooperative movements of the myosin molecules pulling on the common Z-line.
Also, the coupled simulations using the ventricle model show that the stretch-activation mechanism contributes to the synchronization of the quick lengthening of the sarcomeres at the end of the systolic phase. By comparing the simulation results given by the molecular model with and without the stretch-activation mechanism, we see that this synchronization contributes to maintaining the systolic blood pressure by providing sufficient blood volume without slowing the diastolic process.

Keywords: multiscale method, Langevin equation, continuum mechanics, actomyosin, heartbeat, stretch activation

## INTRODUCTION

With the advances in computational science made possible by improvements in hardware technology, it is now possible to create multi-scale simulation models of the heart in which the macroscopic behaviors of the beating heart can be reproduced and analyzed based on molecular mechanisms of the excitation-contraction coupling process (Kerckhoffs et al., 2007; Gurev et al., 2011; Sugiura et al., 2012). These models are based on many studies of cell models of cardiac electrophysiology (Luo and Rudy, 1994; ten Tusscher et al., 2004; Grandi et al., 2010). We also note that tissue modeling has provided deep insights into the nature of coupling and other interactions among cells in the heart wall (Clayton et al., 2011). Central to these in silico heart studies is an accurate model of crossbridge kinetics, which not only forms the basis of cardiac mechanics, but also has clinical relevance in the light of the many reports showing the involvement of sarcomeric proteins in the pathogenesis of cardiomyopathies (Cahill et al., 2013).

Ideally, a molecular dynamics simulation of actomyosin should be coupled with a macroscopic finite element model of the heart because with such a model the impact of mutations in the myosin molecules on cardiac function can be directly assessed. However, it is not possible to perform such simulations even with the best available high-performance computers, and current multi-scale heart simulators usually adopt state-transition models of crossbridge cycling. In these models, the rate constants for transitions between states are governed by the energy of each state (Huxley and Simmons, 1971), but the width of the energy minimum corresponding to each state is ignored under the infinitely sharp minimum approximation, in which the angle of each lever arm is fixed in the most stable configuration. Obviously, this is a simplification of the behavior of real myosin molecules experiencing thermal fluctuations, and we have recently reported that a model with an energy landscape possessing wide minima can reproduce experimental findings with higher accuracy (Marcucci et al., 2016). However, in that paper, we only examined simple Langevin dynamics with a single variable representing the free energy potential during the power stroke, and solved it using a Monte Carlo (MC) simulation. In that case, the Kramers-Smoluchowski approximation (Gardiner, 2004) was used to obtain the rate constants of the transitions between the multiple states, which were given by discretizing the one-dimensional range of the angles of the lever arms. If we try to formulate a more realistic free energy potential as a function of multiple variables, the number of MC states increases explosively, and it is no longer possible to find the rate constants between the MC states theoretically. Therefore, it is desirable to establish a numerical scheme that directly couples the Langevin dynamics of the molecules with the macroscopic continuum dynamics.

FIGURE 1 | Coupling strategy for three scales. In the actomyosin system, *x* and ξ are variables representing the deformation of the bound myosin molecule. In particular, ξ is the strain of the myosin rods, and *W*<sub>rod</sub>(ξ) is its strain energy. These variables were updated by the time step Δ*t* ∼ 0.25 ns, while the variables in the half-sarcomere and the ventricle models were updated by the time step Δ*T* = *n*Δ*t* ∼ 1 µs. The shortening of the half-sarcomere model is represented by the variable *z*, which affected the Langevin dynamics of the bound myosin molecules through the constraint condition Δξ = Δ*x* − Δ*z*, while the sarcomeric contractile force <sup>*T*,Δ*T*</sup>*F* on the time interval [*T*, *T* + Δ*T*] was given by the sum of tensile forces <sup>*T*+*k*Δ*t*</sup>*F*<sub>*i*,*j*</sub> of the bound myosin rods averaged over the time interval (*k* = 1, · · · , *n*). The half-sarcomere model of actomyosin complexes was imbedded into each tetrahedral element of the finite element ventricle model in the reference configuration along the fiber direction f. The deformation at the time *T* of the ventricle is represented by the current position x = <sup>*T*</sup>x(X) of the material point X, thus λ = ∂<sup>*T*</sup>x/∂X · f is the stretch along the fiber orientation direction. This stretching was transferred to the shortening of the imbedded half-sarcomere model with the factor −*SL*<sub>0</sub>/2, while the contractile tension <sup>*T*,Δ*T*</sup>*T*<sub>*f*</sub> was given along the fiber direction by scaling the sarcomeric contractile force <sup>*T*,Δ*T*</sup>*F*, taking the cross-sectional area per thin filament (*SA*<sub>0</sub>) and the volume ratio of the sarcomere within the ventricle wall (*R*<sub>S</sub>) into account.

TABLE 1 | Parameters for the actomyosin dynamics.

Here, we report a novel numerical method to couple the microscale simulation of crossbridge kinetics described by the Langevin equation with the macroscopic mechanics simulations using the finite element method, even though the time scales differ considerably. In this method, the time step of the macroscopic model is set to a multiple of that of the microscopic model to reduce computational overhead. The validity of the method was confirmed with a comparison of the simulation results with the recently reported experimental findings on the spontaneous oscillation of cardiac sarcomeres (Ishiwata et al., 2011), which can be reproduced only by correctly handling the coupling of the motion of the sarcomeres with the actomyosin dynamics. By applying this method, we also show that a trapped crossbridge mechanism greatly facilitates ventricular function through the stretch-activation of the cardiac muscle (Stelzer et al., 2006). A notable feature of the stretch-activation is a long-lasting increase in the contractile tension after a small, rapid stretch is applied during activation. In the usual stretch-activation experiments, the stretch is 1% of the muscle length, which closely corresponds to the microscale size of lever arm swing (10 nm). It is likely that the rapid stretching induces an unusual persistent conformational change of the bound myosin molecules. In this work, we introduce a free energy potential for the power-stroke model in which some of the bound myosin molecules become trapped in a deformed conformation when a rapid stretch is applied. These trapped myosin molecules cannot recover under normal thermal fluctuation unless their rods become relaxed or extremely stretched by subsequent sarcomeric movements. Through the beating-ventricle simulations, we show how this mechanism contributes to improved blood circulation.

FIGURE 2 | The half-sarcomere model (A) and the attached myosin molecular model composed of the myosin head (MH), the lever arm (LA), and the rod (B). In the half-sarcomere model, the Z-line was fixed and the shortening distance of the left edge of the thick filaments is denoted by *z*. The configuration of the attached myosin molecules is represented by two variables *x* and *y*. The LA is decomposed into LA1 and LA2 to represent its deflection around the point "P." The degree of the deflection is given by θ = *y* − *x*. The power stroke is given by the counterclockwise rotation of LA1 around the point "O." The rod is a non-linear spring connecting the thick filament and the point "Q" of LA2. The strain of the rod is denoted by ξ.

#### MATERIALS AND METHODS

Our strategy of coupling the different scales is summarized in **Figure 1**. The stretch rates were transferred from the macro- to micro-scale while the contractile forces were transferred back from the micro- to macro-scale. Finite element continuum mechanics were applied to the ventricle model. The half-sarcomere model of actomyosin complexes was imbedded into each tetrahedral element of the finite element ventricle model along the fiber direction. The molecular variables that represent the deformation of bound myosin molecules were computed by the Langevin dynamics. The shortening rate −λ̇ along the fiber direction in the ventricle model was transferred to the sarcomeric shortening velocity ż by scaling with the unloaded half-sarcomere length SL<sub>0</sub>/2. The sarcomeric shortening velocity ż was applied in the actomyosin model to slide the myosin thick filament. The contractile force of the half-sarcomere model was given by the sum of the tensile forces of the bound myosin rods. The contractile force in the half-sarcomere model was transferred to the macroscopic contractile tension along the fiber direction. In our coupling approach, the computational time step size ΔT of the sarcomeric dynamics and the ventricle continuum dynamics is given by an integer multiple of the time step size Δt of the actomyosin Langevin dynamics (ΔT = nΔt) to reduce the computational and communication overheads. As will be discussed in section Multiple Time Step (MTS) Method, such a multiple time-step strategy can be applied without suffering numerical instabilities by also transferring the stiffness given by the bound myosin rods. Readers who are not interested in the numerical schemes may skip sections Multiple Time Step (MTS) Method and Coupling With the Finite Element Ventricle Model.

#### Langevin Dynamics of a Single Sarcomere

The parameters adopted for the molecular dynamics are summarized in **Table 1**. Here, the dynamic equations for a half-sarcomere model composed of n<sub>F</sub> pairs of thick and thin filaments (**Figure 2A**) are introduced. In this half-sarcomere model, we assumed that the right ends of the thin filaments were connected to the Z-line, which was fixed in microscopic space. The shortening displacement of the left end of the thick filament from the unloaded position was denoted by z. On each thick filament, there were n<sub>M</sub> myosin molecules, which underwent repeated attachment and detachment with the thin filament. The value of n<sub>M</sub> = 38 was adopted from our previous work (Washio et al., 2016). During the attached phase, the lever arm (LA) of the myosin molecule rotated around the joint point "O" of the myosin head (MH) under a given free energy potential ϕ with additional random forces (**Figure 2B**). These rotations were either the power stroke or the reversal stroke, depending on the rotational direction. To represent the deflection of the LA, it was decomposed into two rigid components, LA1 and LA2, joined at the point "P" (**Figure 2B**). As with the real structure of a myosin molecule, LA1 may contain a series of subdomains from the lower 50 kDa to the converter in the motor domain because some conformational changes of these parts were supposed to be accompanied by lever arm rotation. The displacement of the point "P" in the filament direction given by the rotation of LA1 from its pre-power stroke position around the joint point "O" was represented by y. Here, the conformation of the myosin molecule just after attachment was assumed to be the same as the pre-power stroke conformation. Similarly, the displacement about the joint point "Q" with the myosin rod was represented by x. Thus, θ = y − x was the deflection of the LA from the pre-power stroke conformation. The strain energy of the myosin rod was given by a function W<sub>rod</sub>(ξ), where ξ was the strain (length change) in the filament direction from its unloaded natural length ξ<sub>0</sub>. The rod strain energy was non-linear with the generated force, as with our previous work (Washio et al., 2016) for a rod with ξ < 0. For positive strain (ξ > 0), a constant stiffness with a spring constant 2.8 pN/nm was used (**Figure 3A**). Under these assumptions, the dynamics of the sarcomere was described by the following Langevin equations, where the suffixes i and j represent the indices of the MHs and the thick filaments, respectively. Also, t is the time, and <sup>t</sup>δ<sub>A,i,j</sub> is set to one if the MH was attached at time t to the thin filament, and zero otherwise.

$$\begin{cases} \gamma\_X \, {}^t\dot{x}\_{i,j} + \frac{\partial \varphi}{\partial x} \left( {}^t x\_{i,j}, {}^t y\_{i,j} \right) + \frac{dW\_{rod}}{d\xi} \left( {}^t \xi\_{i,j} \right) - {}^t R\_{X,i,j} = 0 \\ \gamma\_Y \, {}^t\dot{y}\_{i,j} + \frac{\partial \varphi}{\partial y} \left( {}^t x\_{i,j}, {}^t y\_{i,j} \right) - {}^t R\_{Y,i,j} = 0 \\ {}^t \xi\_{i,j} - {}^{t\_{A,i,j}} \xi\_{i,j} - \left( {}^t x\_{i,j} - {}^{t\_{A,i,j}} x\_{i,j} \right) + {}^t z - {}^{t\_{A,i,j}} z = 0 \end{cases}, \quad {}^t \delta\_{A,i,j} = 1 \ \left( 1 \le i \le n\_M, 1 \le j \le n\_F \right) \tag{1}$$

$$\gamma\_D \, {}^t \dot{\xi}\_{i,j} + \frac{dW\_{rod}}{d\xi} \left( {}^t \xi\_{i,j} \right) - {}^t R\_{D,i,j} = 0, \quad {}^t \delta\_{A,i,j} = 0 \ \left( 1 \le i \le n\_M, 1 \le j \le n\_F \right) \tag{2}$$

$$\gamma\_Z \, {}^t \dot{z} + K\_Z \, {}^t z - \frac{1}{n\_F} \sum\_{j=1}^{n\_F} \sum\_{i=1}^{n\_M} {}^t \delta\_{A,i,j} \frac{dW\_{rod}}{d\xi} \left( {}^t \xi\_{i,j} \right) = 0 \tag{3}$$

Here, the probabilistic rules for transitions between the attached and detached states will be given below. At the time of attachment, the myosin molecule was assumed to be in the pre-power stroke state.

$$\begin{cases} {}^{t\_{A,i,j}} x\_{i,j} = x\_{\text{Pre}} \equiv 0 \\ {}^{t\_{A,i,j}} y\_{i,j} = y\_{\text{Pre}} \equiv 0 \end{cases} \tag{4}$$

Here, t<sub>A,i,j</sub> is the time at which the attachment occurred. The spring strain <sup>t</sup>ξ<sub>i,j</sub> was continuously updated at the transitions.

In Equations (1–3), γ<sub>X</sub>, γ<sub>Y</sub>, and γ<sub>D</sub> were the damping coefficients, and <sup>t</sup>R<sub>X,i,j</sub>, <sup>t</sup>R<sub>Y,i,j</sub>, and <sup>t</sup>R<sub>D,i,j</sub> were the random forces, which fulfilled the conditions:

$$\begin{cases} \left< {}^t R\_{\alpha,i,j} \right> = 0 \\ \left< {}^t R\_{\alpha,i,j}, {}^{t'} R\_{\beta,k,l} \right> = \delta\_{\alpha\beta} \delta\_{ik} \delta\_{jl} \sqrt{2 \gamma\_{\alpha} k\_B T} \; {}^{t-t'}\delta \end{cases}, \quad \alpha, \beta = X, Y, D, \ 1 \le i, k \le n\_M, \ 1 \le j, l \le n\_F \tag{5}$$

where Boltzmann's constant is k<sub>B</sub> and the temperature is T. In this paper, the damping coefficient γ<sub>D</sub> was set to 70 pN · ns/nm, following Howard (2001), while γ<sub>X</sub> and γ<sub>Y</sub> were set to 20 and 50 pN · ns/nm, respectively. Since the rotation of LA1 may involve structural changes in other parts in the MH, the drag coefficient for LA1 was larger than that for LA2.

In Equation (3), γ<sub>Z</sub> was the drag coefficient per length change of a single thin filament of the sarcomere and K<sub>Z</sub> was the spring constant for each thin filament of the sarcomere. Equation (3) follows from the fact that the sarcomeric contractile tension is just the sum of the tensile forces of the rods for all of the attached myosin molecules. The third line in Equation (1) indicates the constraint condition in the association state. This condition gives the rod strain <sup>t</sup>ξ<sub>i,j</sub> in relation to the conformational change of the myosin (<sup>t</sup>x<sub>i,j</sub>) and the sarcomeric movement (<sup>t</sup>z).
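As a rough illustration of how an overdamped Langevin equation of the form of Equation (2) can be advanced in time, the following Python sketch applies an explicit Euler-Maruyama update to the strain of a single detached rod. It is not the authors' implementation. Several choices are assumptions made only for this example: the rod is treated as a linear spring for all strains, using the 2.8 pN/nm stiffness quoted for positive strain; the random displacement per step is given the variance 2k<sub>B</sub>TΔt/γ<sub>D</sub>, consistent with Equation (19); and the function name `simulate_detached_rod` is hypothetical.

```python
import math
import random

K_B_T = 4.278     # thermal energy at 310 K [pN nm], value quoted in the text
GAMMA_D = 70.0    # damping coefficient of a detached rod [pN ns/nm]
K_ROD = 2.8       # ASSUMPTION: linear spring used for all strains [pN/nm]
DT = 0.25         # microscale time step [ns]


def simulate_detached_rod(n_steps, seed=0):
    """Euler-Maruyama integration of gamma_D * dxi/dt + K_ROD * xi = R(t).

    Returns the list of strains xi [nm] after each step, starting from xi = 0.
    """
    rng = random.Random(seed)
    # Standard deviation of the random displacement accumulated over one step.
    sigma = math.sqrt(2.0 * K_B_T * DT / GAMMA_D)
    xi = 0.0
    trajectory = []
    for _ in range(n_steps):
        drift = -K_ROD * xi * DT / GAMMA_D  # deterministic relaxation toward xi = 0
        xi += drift + sigma * rng.gauss(0.0, 1.0)
        trajectory.append(xi)
    return trajectory


if __name__ == "__main__":
    traj = simulate_detached_rod(400_000)
    variance = sum(x * x for x in traj) / len(traj)
    # For a linear spring, equipartition predicts <xi^2> = k_B T / K.
    print(f"sampled variance: {variance:.3f} nm^2")
    print(f"k_B T / K_ROD   : {K_B_T / K_ROD:.3f} nm^2")
```

With a linear spring the sampled variance of ξ should approach k<sub>B</sub>T/K, which gives a quick sanity check on the noise amplitude before a non-linear W<sub>rod</sub> is substituted.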

#### Free Energy of a Myosin Molecule

We assume that the free energy of the myosin molecule ϕ in the attached state can be decomposed into the power stroke free energy ϕ<sub>PS</sub> of LA1 and the deflection energy of the LA:

$$\varphi\left(\mathbf{x},\mathbf{y}\right) = \varphi\_{\rm PS}\left(\theta,\mathbf{y}\right) + \mathcal{W}\_{\rm LA}\left(\theta\right), \ \theta = \mathbf{y} - \mathbf{x} \tag{6}$$

For the deflection energy of the LA, a simple quadratic potential was assumed:

$$W\_{\rm LA} \left( \theta \right) = \frac{1}{2} K\_{\theta} \theta^2 \tag{7}$$

Since there was no appropriate reference for setting the stiffness, a comparable stiffness (K<sub>θ</sub> = 4 pN/nm) to that of the rod strain was adopted in our model. For the power-stroke free energy ϕ<sub>PS</sub>, the three local minima at y = 0, s<sub>1</sub>, and s<sub>1</sub> + s<sub>2</sub> for a fixed deflection θ = y − x are given as shown in **Figure 2B**, which is described by the following equation:

$$\varphi\_{PS}(\theta, y) = \begin{cases} E\_{\text{Pre}} + \frac{1}{2}\left(E\_{b1}(\theta) - E\_{\text{Pre}}\right)\left(1 - 2\pi\frac{y + s\_1/4}{s\_1}\right) + \frac{1}{2}k\_Y\left(y + s\_1/4\right)^2, & y \le -\frac{s\_1}{4} \\ E\_{\text{Pre}} + \frac{1}{2}\left(E\_{b1}(\theta) - E\_{\text{Pre}}\right)\left(1 - \cos 2\pi\frac{y}{s\_1}\right), & -\frac{s\_1}{4} < y \le \frac{s\_1}{2} \\ E\_{\text{PS1}} + \frac{1}{2}\left(E\_{b1}(\theta) - E\_{\text{PS1}}\right)\left(1 - \cos 2\pi\frac{y - s\_1}{s\_1}\right), & \frac{s\_1}{2} < y \le s\_1 \\ E\_{\text{PS1}} + \frac{1}{2}\left(E\_{b2}(\theta) - E\_{\text{PS1}}\right)\left(1 - \cos 2\pi\frac{y - s\_1}{s\_2}\right), & s\_1 < y \le s\_1 + \frac{s\_2}{2} \\ E\_{\text{PS2}} + \frac{1}{2}\left(E\_{b2}(\theta) - E\_{\text{PS2}}\right)\left(1 - \cos 2\pi\frac{y - s\_1 - s\_2}{s\_2}\right), & s\_1 + \frac{s\_2}{2} < y \le s\_1 + \frac{5s\_2}{4} \\ E\_{\text{PS2}} + \frac{1}{2}\left(E\_{b2}(\theta) - E\_{\text{PS2}}\right)\left(1 + 2\pi\frac{y - s\_1 - 5s\_2/4}{s\_2}\right) + \frac{1}{2}k\_Y\left(y - s\_1 - 5s\_2/4\right)^2, & y > s\_1 + \frac{5s\_2}{4} \end{cases} \tag{8}$$

Here, E<sub>Pre</sub>, E<sub>PS1</sub>, and E<sub>PS2</sub> were the three local minimum energy values at y = 0, s<sub>1</sub>, and s<sub>1</sub> + s<sub>2</sub>, respectively. These local minima correspond to the configurations of the MH and LA1 in the pre-power stroke state, and the states after the first two power strokes. The power stroke step sizes, s<sub>1</sub> and s<sub>2</sub>, and the energies E<sub>Pre</sub> − E<sub>PS1</sub> and E<sub>PS1</sub> − E<sub>PS2</sub> consumed in the two strokes, are given values (**Table 1**) similar to those used in the Monte Carlo (MC) model in our previous work (Washio et al., 2016), in which the ATP hydrolysis energy was set to E<sub>ATP</sub> = 22k<sub>B</sub>T following Saupe et al. (1999) at a body temperature of T = 310 K.

In Equation (8), E<sub>b1</sub>(θ) and E<sub>b2</sub>(θ) are the energy barriers between the minima. The heights of the energy barriers were adjusted so that enhanced beating performance was realized in the coupled simulation for the ventricle model, which is introduced below. In our model, the first barrier was assumed to be a function of the LA deflection θ as:

$$E\_{b1}(\theta) = \begin{cases} E\_{b01}, & \theta \le \theta\_{\text{trap}} \\ E\_{b01} + C\_{\text{trap}} \frac{\theta - \theta\_{\text{trap}}}{\Delta \theta\_{\text{trap}}}, & \theta\_{\text{trap}} < \theta \le \theta\_{\text{trap}} + \Delta \theta\_{\text{trap}} \\ E\_{b01} + C\_{\text{trap}}, & \theta > \theta\_{\text{trap}} + \Delta \theta\_{\text{trap}} \end{cases} \tag{9}$$

This first energy barrier was introduced to reproduce the stretch-activation of the cardiac muscle (Stelzer et al., 2006). In their experiment, a small, rapid stretch of ∼1% of the sample length was imposed on activated skinned myocardium. Then, a nearly 10% increase in the contractile tension persisted for a time on the order of seconds compared with that of the steady state before the stretch. This suggests the existence of a trapped conformation for the MH and LA in an attached state that can be generated by the rapid stretch. By experiencing a high barrier, as in Equation (9), a myosin molecule that exhibits a large deflection θ and a large strain ξ after the first power stroke can become trapped in that state if the MH is strongly attached, since these myosin molecules cannot make progress toward a larger forward stroke, which would require a large increment in either the deflection energy of the LA [W<sub>LA</sub>(θ)], or the strain energy of the rod [W<sub>rod</sub>(ξ)]. Such large LA deflections and rod strains can be generated when the thick filament is pulled rapidly to the outside. In this work, the values C<sub>trap</sub> = 200 pN/nm, θ<sub>trap</sub> = −0.25 nm, and Δθ<sub>trap</sub> = 5 nm were adopted (**Figure 3B**) so that the appropriate response to the stretch-activation is reproduced, as shown in the numerical simulation. The second energy barrier was assumed to be a constant:

$$E\_{b2}\left(\theta\right) \equiv E\_{b02}.\tag{10}$$
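The deflection-dependent first barrier of Equation (9) is a constant, a linear ramp, and a second constant. A minimal Python sketch follows, using the values of C<sub>trap</sub>, θ<sub>trap</sub>, and Δθ<sub>trap</sub> quoted in the text. The base height `E_B01` is a hypothetical placeholder, since the Table 1 value is not reproduced here, and the function name `first_barrier` is likewise only illustrative.

```python
E_B01 = 60.0            # HYPOTHETICAL base barrier height (placeholder for the Table 1 value)
C_TRAP = 200.0          # additional barrier height in the trapped regime, as quoted in the text
THETA_TRAP = -0.25      # deflection at which the barrier starts to rise [nm]
DELTA_THETA_TRAP = 5.0  # width of the linear ramp [nm]


def first_barrier(theta):
    """Return E_b1(theta) following the three branches of Equation (9)."""
    if theta <= THETA_TRAP:
        return E_B01
    if theta <= THETA_TRAP + DELTA_THETA_TRAP:
        # Linear increase from E_B01 to E_B01 + C_TRAP across the ramp.
        return E_B01 + C_TRAP * (theta - THETA_TRAP) / DELTA_THETA_TRAP
    return E_B01 + C_TRAP


if __name__ == "__main__":
    for theta in (-1.0, 0.0, 2.25, 4.75, 10.0):
        print(f"theta = {theta:6.2f} nm -> E_b1 = {first_barrier(theta):7.2f}")
```

Because the ramp begins at a slightly negative deflection, the barrier in this sketch is already raised at θ = 0 and saturates once θ exceeds 4.75 nm.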

#### Control Model of Attachment and Detachment

For the transition between the attached and detached states (**Figure 4**), an MC model similar to the one in our previous work (Washio et al., 2016) was used. In the half-sarcomere model, the MHs were arranged on the thick filament at regular intervals, and the thin filament was divided into segments called troponin/tropomyosin (T/T) units. The transitions between the states of a T/T unit were affected by the Ca<sup>2+</sup> concentration, [Ca], and by the states of the MHs below the T/T unit. In this model, only the Ca-bound state increased the affinity of the MHs for the thin filament. There are two detached states of MHs: a non-binding state NXB, and a weakly binding state PXB. The affinity was adjusted by modifying the factor K<sub>np</sub> for the rate constant of the transition from NXB to PXB. The relationship between the MH<sub>i,j</sub> location and the T/T unit was determined from the offset position of the MH<sub>i,j</sub> (<sup>t</sup>z + <sup>t</sup>ξ<sub>i,j</sub> − <sup>t</sup>x<sub>i,j</sub>) from its unloaded position (**Figure 2A**). A cooperative mechanism with the nearest-neighbor MHs was added by introducing the factors γ<sup>ng</sup> and γ<sup>−ng</sup> (γ = 40), as in our previous work (Washio et al., 2016), in which the integer ng (= 0, 1 or 2) was the number of neighboring MHs in the weakly binding state PXB or the attached state XB. The details of the transitions of the T/T unit states and between NXB and PXB are described in Supplementary Material S1.1.

FIGURE 3 | Strain energy *W*<sub>rod</sub>(ξ) of the myosin rod and the force given by its derivative (A). The details are described in Supplementary Material S1.2. The free energy landscape of the myosin molecule ϕ in the attached state with respect to the molecular variables *y* and θ (B). This free energy consists of ϕ<sub>PS</sub>(θ, *y*) and *W*<sub>LA</sub>(θ). ϕ<sub>PS</sub>(θ, *y*) is the energy source of the power stroke (rotation of LA1). *W*<sub>LA</sub>(θ) is the deformation energy of LA for its deflection. The pre-power stroke configuration corresponds to the local minimum at *y* = 0. The other two local minima at *y* = *s*<sub>1</sub> = 5.5 nm and at *y* = *s*<sub>1</sub> + *s*<sub>2</sub> = 11 nm correspond to the states after the first and second power strokes, respectively. Between the pre-power stroke state and the state after the first power stroke, a high energy barrier is assumed for positive deflection (θ > 0) of the LA.

Attachment was possible only from state PXB with the rate constant A<sub>Pre</sub>. Detachment from the attached state XB to the weakly bound state PXB was allowed only from the pre-power stroke state, as follows:

$$D\_{\text{PXB}}(y) = \begin{cases} D\_{\text{PXB,Pre}}, & y \le s\_1/4 \\ (1 - \omega)\, D\_{\text{PXB,Pre}}, & y = (1/4 + \omega/2)\, s\_1 : 0 < \omega \le 1 \\ 0, & y > 3s\_1/4 \end{cases} \tag{11}$$

Here, the variable ω could take values between 0 and 1, and was introduced to interpolate the rate constant between the pre-power stroke state and the state after the first power stroke.
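The interpolation in Equation (11) can be written compactly by solving y = (1/4 + ω/2)s<sub>1</sub> for ω and clamping the result to [0, 1], which reproduces all three branches. The Python sketch below takes s<sub>1</sub> = 5.5 nm from the Figure 3 caption. The rate `D_PXB_PRE` is a hypothetical placeholder value, and `detach_rate_pxb` is an illustrative name rather than anything defined in the paper.

```python
S1 = 5.5          # first power-stroke step size [nm], from the Figure 3 caption
D_PXB_PRE = 1.0   # HYPOTHETICAL detachment rate constant in the pre-power stroke state


def detach_rate_pxb(y):
    """Detachment rate from XB to PXB as a function of the stroke displacement y [nm]."""
    # Invert y = (1/4 + omega/2) * s1, then clamp so that omega = 0 below s1/4
    # and omega = 1 above 3*s1/4.
    omega = 2.0 * (y / S1 - 0.25)
    omega = min(1.0, max(0.0, omega))
    return (1.0 - omega) * D_PXB_PRE


if __name__ == "__main__":
    for y in (0.0, S1 / 4, S1 / 2, 3 * S1 / 4, S1):
        print(f"y = {y:5.3f} nm -> D_PXB = {detach_rate_pxb(y):.3f}")
```

The rate stays at its full value up to y = s<sub>1</sub>/4, falls linearly, and vanishes beyond y = 3s<sub>1</sub>/4, so detachment to PXB is effectively confined to the pre-power stroke well.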

In this transition, no ATP molecules were consumed, whereas detachment to NXB required one ATP molecule. This rate constant is given as a function of both the rod strain ξ and the power stroke displacement y:

$$D\_{\text{NXB}}(\xi, y) = \begin{cases} \max\left(0, D\_{\text{strain}}(\xi, y)\right), & y \le s\_1 + s\_2/4 \\ \max\left(\omega D\_{\text{NXB0}}, D\_{\text{strain}}(\xi, y)\right), & y = s\_1 + (1/4 + \omega/2)\, s\_2 : 0 < \omega \le 1 \\ \max\left(D\_{\text{NXB0}}, D\_{\text{strain}}(\xi, y)\right), & y > s\_1 + 3s\_2/4 \end{cases} \tag{12}$$

Similar to before, the variable ω could take values between 0 and 1, and interpolated the rate constant between the states after the first and second power strokes, while D<sub>strain</sub> indicates the forced detachment due to the extreme strain of the myosin rod:

$$D\_{\text{strain}}(\xi, y) = \begin{cases} c\_{\text{min}}\left(\exp\left(a\_{\text{min}}\left(\xi - d\_{\text{min}}\right)^2\right) - 1\right), & \xi \le d\_{\text{min}} \\ 0, & d\_{\text{min}} < \xi \le d\_{\text{max}}(y) \\ c\_{\text{max}}\left(\exp\left(a\_{\text{max}}\left(\xi - d\_{\text{max}}(y)\right)^2\right) - 1\right), & \xi > d\_{\text{max}}(y) \end{cases} \tag{13}$$

Here, the negative strain threshold d<sub>min</sub> was a constant, and the positive strain threshold d<sub>max</sub> depended on the stroke displacement y as

$$d\_{\max}(y) = \begin{cases} d\_{\max,\text{Pre}}, & y \le s\_1/4 \\ (1 - \omega)\, d\_{\max,\text{Pre}} + \omega d\_{\max,\text{PS1}}, & y = (1/4 + \omega/2)\, s\_1 : 0 < \omega \le 1 \\ d\_{\max,\text{PS1}}, & 3s\_1/4 < y \le s\_1 + s\_2/4 \\ (1 - \omega)\, d\_{\max,\text{PS1}} + \omega d\_{\max,\text{PS2}}, & y = s\_1 + (1/4 + \omega/2)\, s\_2 : 0 < \omega \le 1 \\ d\_{\max,\text{PS2}}, & y > s\_1 + 3s\_2/4 \end{cases} \tag{14}$$

In this study, the parameters d<sub>max,Pre</sub> = 5 nm, d<sub>max,PS1</sub> = 9 nm, and d<sub>max,PS2</sub> = 9 nm were used. These values were adjusted so that the appropriate responses to stretch-activation were reproduced. These choices did not conflict with the fact that the binding affinity to the thin filament increased as the power stroke proceeded (Llinas et al., 2015).

FIGURE 4 | The state transition Monte Carlo model of the T/T unit and the myosin molecule. The MHs in either the NXB or PXB states are assumed to be detached. The rate constant factors *K*<sub>np</sub> and *K*<sub>pn</sub> between NXB and PXB are affected by the state of the T/T unit above it. The detachment rate constant *D*<sub>PXB</sub> between PXB and XB is given as a function of *y*, so that the transition to PXB is allowed only for the MHs in the pre-power stroke position. The detachment rate constant *D*<sub>NXB</sub> to NXB is given as a function of *y* and ξ, so that the detachment is allowed for the second post-power stroke state or the MHs connected to the extremely strained myosin rods. The time-step strategy for reducing the computational loads is shown by the arrows. The molecular variables *x*, *y*, and ξ of the bound myosin molecules are updated by the finest basic time step Δ*t* (black arrow), while the rod strain ξ in the detached states is updated by its multiple *n*<sub>D</sub>Δ*t*. The state transitions of the MHs and the T/T units are calculated by the MC method with the time step *n*<sub>DA</sub>Δ*t* (red arrows).
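The strain-dependent forced detachment of Equations (13) and (14) can be sketched in Python as follows. The thresholds d<sub>max,Pre</sub>, d<sub>max,PS1</sub>, and d<sub>max,PS2</sub> are those quoted in the text, and s<sub>1</sub> = s<sub>2</sub> = 5.5 nm follows the Figure 3 caption. The remaining constants (`D_MIN`, `C_MIN`, `A_MIN`, `C_MAX`, `A_MAX`) are hypothetical placeholders chosen only so that the example runs, and the function names are illustrative.

```python
import math

S1, S2 = 5.5, 5.5                                # power-stroke step sizes [nm]
D_MAX_PRE, D_MAX_PS1, D_MAX_PS2 = 5.0, 9.0, 9.0  # positive strain thresholds [nm], from the text
D_MIN = -8.0                                     # HYPOTHETICAL negative strain threshold [nm]
C_MIN, A_MIN = 1.0, 1.0                          # HYPOTHETICAL rate scale and exponent factor
C_MAX, A_MAX = 1.0, 1.0                          # HYPOTHETICAL rate scale and exponent factor


def _ramp(y, start, length):
    """Interpolation weight omega, clamped to [0, 1], rising over [start, start + length]."""
    return min(1.0, max(0.0, (y - start) / length))


def d_max(y):
    """Positive strain threshold of Equation (14) as a function of the stroke displacement y."""
    w1 = _ramp(y, S1 / 4, S1 / 2)       # pre-power stroke -> after first stroke
    w2 = _ramp(y, S1 + S2 / 4, S2 / 2)  # after first stroke -> after second stroke
    d = (1.0 - w1) * D_MAX_PRE + w1 * D_MAX_PS1
    return (1.0 - w2) * d + w2 * D_MAX_PS2


def strain_detach_rate(xi, y):
    """Forced detachment rate D_strain of Equation (13) for rod strain xi [nm]."""
    if xi <= D_MIN:
        return C_MIN * (math.exp(A_MIN * (xi - D_MIN) ** 2) - 1.0)
    upper = d_max(y)
    if xi <= upper:
        return 0.0
    return C_MAX * (math.exp(A_MAX * (xi - upper) ** 2) - 1.0)


if __name__ == "__main__":
    print(d_max(0.0), d_max(S1 / 2), d_max(S1))
    print(strain_detach_rate(6.0, 0.0), strain_detach_rate(6.0, S1))
```

In this sketch a rod strained to 6 nm is forced to detach while the head is still in the pre-power stroke position (threshold 5 nm) but not after the first power stroke (threshold 9 nm).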

#### Multiple Time Step (MTS) Method

First, we consider a multiple time step (MTS) approach for a single half-sarcomere model (**Figure 2**) in which different time step intervals Δt and ΔT were adopted, respectively, for updating the molecular variables x<sub>i,j</sub>, y<sub>i,j</sub>, ξ<sub>i,j</sub> and the sarcomeric shortening displacement z, when solving Equations (1, 2) coupled with Equation (3). Below, this approach will be extended to coupling with a macroscopic finite element continuum model, in which a single sarcomere model was imbedded into each finite element.

The time step ΔT was assumed to be an integer multiple of the time step interval for the molecular variables Δt:

$$
\Delta T = n \cdot \Delta t \tag{15}
$$

Such approaches reduce the computational overhead of the shared-memory synchronization, as well as the data communication needed in distributed parallel systems, if a sufficiently large integer n can be applied. For our Langevin dynamics model, the microscale time step Δt was set at 0.25 ns. This choice was constrained by the relationship between the magnitudes of the drag coefficients γ<sub>X</sub>, γ<sub>Y</sub> and the curvature of the potential ϕ. For example, in the case of a simple Langevin equation:

$$
\gamma \, {}^t\dot{u} + \frac{d\phi}{du} \left( {}^t u \right) - {}^t R = 0 \tag{16}
$$

with a given free energy potential ϕ, a variable u, and a random force <sup>t</sup>R that satisfies

$$\begin{cases} \left< {}^t R \right> = 0 \\ \left< {}^t R \, {}^{t'} R \right> = \sqrt{2 \gamma k\_B T} \; {}^{t-t'}\delta \end{cases} \tag{17}$$

the stability of the explicit numerical integration scheme required that

$$
\Delta t \le \frac{\gamma}{K\_{\text{max}}} \tag{18}
$$

where K<sub>max</sub> was the maximum magnitude of the curvature of ϕ, |d<sup>2</sup>ϕ/du<sup>2</sup>|, over the range of u. Even if an implicit time integration scheme was applied, Equation (18) must be satisfied for the maximum magnitude of the negative curvature (d<sup>2</sup>ϕ/du<sup>2</sup> < 0). In our case, as shown in **Figure 3B**, negative curvatures were unavoidable on the ridge lines of the potential landscape. For example, if Δt = 0.25 ns was used when γ = 50 pN · ns/nm, the allowable maximal curvature from Equation (18) was K<sub>max</sub> = γ/Δt = 200 pN/nm. This curvature value implies an energy change of K<sub>max</sub>Δu<sup>2</sup>/2 = 100 pN · nm for a displacement Δu = 1 nm. Curvatures of this magnitude were in fact observed near the high energy barrier between the pre-power stroke state and the state after the first power stroke in our model (**Figure 3B**).
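The bound in Equation (18) can be checked numerically. The following is a minimal sketch, assuming a locally harmonic potential with constant curvature and no random force (a simplification introduced here for illustration, not the model's full potential landscape); the function names are hypothetical.

```python
def max_stable_curvature(gamma, dt):
    """Largest curvature K_max (pN/nm) allowed by dt <= gamma / K_max."""
    return gamma / dt


def explicit_euler_factor(gamma, dt, curvature):
    """Per-step multiplier of u for gamma * du/dt = -curvature * u.

    Assumption: harmonic potential, random force omitted.
    """
    return 1.0 - curvature * dt / gamma


gamma = 50.0  # pN*ns/nm, drag coefficient quoted in the text
dt = 0.25     # ns, microscale time step quoted in the text

k_max = max_stable_curvature(gamma, dt)   # pN/nm
energy_change = 0.5 * k_max * 1.0 ** 2    # pN*nm for a 1 nm displacement

# At the bound the multiplier is zero; beyond it the multiplier is negative,
# so the explicit update overshoots the minimum and changes sign each step.
factor_at_bound = explicit_euler_factor(gamma, dt, k_max)
factor_beyond = explicit_euler_factor(gamma, dt, 1.5 * k_max)
```

Under these assumptions, the values γ = 50 pN · ns/nm and Δt = 0.25 ns give K<sub>max</sub> = 200 pN/nm and an energy change of 100 pN · nm for a 1 nm displacement.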

Another limitation on the practicable time step size comes from considerations of the fluctuations Δu during each time interval Δt. If we ignore the potential ϕ in Equation (16), the standard deviation of Δu given by a series of random forces in Equation (17) during time Δt is

$$
{}^{\Delta t}\sigma = \sqrt{\langle \Delta u^2 \rangle} = \sqrt{2k\_B T \Delta t / \gamma} \tag{19}
$$

At body temperature, we have k<sub>B</sub>T = 4.278 pN · nm. Thus, for the case of γ = 50 pN · ns/nm and Δt = 0.25 ns, we have <sup>Δt</sup>σ ∼ 0.2 nm. These displacements are large enough to make a noticeable difference in the landscape of the potential ϕ.
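As a rough check of Equation (19), the sketch below evaluates the closed-form standard deviation and compares it with a sampled estimate. The sampling routine assumes that the delta-correlated force of Equation (17) is discretized as a Gaussian with variance 2γk<sub>B</sub>T/Δt over one step; that discretization and the function names are assumptions made here for illustration.

```python
import math
import random

K_B_T = 4.278  # pN*nm at body temperature (value from the text)


def step_std(gamma, dt, kbt=K_B_T):
    """Closed-form standard deviation of the displacement over one step."""
    return math.sqrt(2.0 * kbt * dt / gamma)


def sampled_step_std(gamma, dt, n_samples=100_000, seed=0, kbt=K_B_T):
    """Empirical std of du = R * dt / gamma with the potential ignored.

    Assumption: R is Gaussian with variance 2 * gamma * kbt / dt per step.
    """
    rng = random.Random(seed)
    force_std = math.sqrt(2.0 * gamma * kbt / dt)
    total = 0.0
    for _ in range(n_samples):
        du = rng.gauss(0.0, force_std) * dt / gamma
        total += du * du
    return math.sqrt(total / n_samples)


sigma = step_std(50.0, 0.25)                 # nm, about 0.2
sigma_sampled = sampled_step_std(50.0, 0.25) # nm, close to sigma
```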

Compared with the dynamics of the molecules, the sarcomeric movement in cardiac muscle is generally much slower, as shown by the following argument. The shortening velocity of the sarcomere model is related to the stretch rate λ˙ of the cardiac muscle along the fiber direction by

$$
\dot{z} = -\frac{1}{2} \text{SL}\_0 \dot{\lambda} \tag{20}
$$

Here, SL<sub>0</sub> = 2.1 µm is the unloaded sarcomere length. If we assume the maximal shortening velocity of the cardiac muscle to be −λ̇<sub>max</sub> = 5 ML/s, where ML is the muscle length (Edman et al., 1974), the maximal shortening velocity of a half-sarcomere is ż<sub>max</sub> = 5.25 µm/s = 5.25 × 10<sup>−6</sup> nm/ns. In contrast, the previous consideration regarding the fluctuations during the time interval Δt = 0.25 ns gives the average magnitude of the molecular velocity as <sup>Δt</sup>σ/Δt ≈ 0.8 nm/ns. This comparison between the sarcomeric and molecular velocities suggests the possibility of applying a multiple time step approach, in which tens of thousands of fine time steps of size Δt are calculated when integrating the molecular variables x<sub>i,j</sub>, y<sub>i,j</sub>, ξ<sub>i,j</sub> for each single large step interval ΔT used for integrating the sarcomeric variable z.
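The separation of scales in this argument can be reproduced with a few lines of arithmetic. This is only an illustrative sketch using the numbers quoted above; the variable and function names are hypothetical.

```python
SL0_NM = 2100.0         # unloaded sarcomere length: 2.1 um expressed in nm
MAX_STRETCH_RATE = 5.0  # magnitude of the stretch rate, muscle lengths per s


def half_sarcomere_speed_nm_per_ns(sl0_nm, stretch_rate_per_s):
    """Magnitude of Equation (20), converted from nm/s to nm/ns."""
    return 0.5 * sl0_nm * stretch_rate_per_s * 1e-9


z_dot_max = half_sarcomere_speed_nm_per_ns(SL0_NM, MAX_STRETCH_RATE)

# Molecular fluctuation velocity: about 0.2 nm per 0.25 ns step (from the text)
molecular_speed = 0.2 / 0.25  # nm/ns

# Ratio of molecular to sarcomeric velocity, on the order of 1e5
speed_ratio = molecular_speed / z_dot_max
```

The ratio of roughly 10<sup>5</sup> is what motivates lumping tens of thousands of microscale steps into one macroscale step.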

During the time integration process using the small time step Δt over the time interval [T : T + ΔT], the LA conformation variables x<sub>i,j</sub>, y<sub>i,j</sub> are updated explicitly, and then the rod strains ξ<sub>i,j</sub> are temporarily updated so that the constraint in Equation (1) is fulfilled, using the most recently calculated shortening velocity ż from time T. The temporarily updated variables are denoted with bars over them, such as ξ̄<sub>i,j</sub> and z̄. When the process switches to an implicit computation of the sarcomeric shortening displacement z and its time derivative ż at time T + ΔT for use in Equation (3), the tensile forces exerted by the attached MHs during the time interval [T : T + ΔT] are computed by using the corrected rod strain ξ<sub>i,j</sub>, for which the shortening velocity ż over the time interval [T : T + ΔT] is replaced with <sup>T+ΔT</sup>ż. By doing so, the stiffness due to the strained rods of the attached MHs is involved in the implicit time integration of Equation (3). This implicit strategy allows us to apply a time interval ΔT which is four orders of magnitude larger than Δt.

The molecular variable time integrations can be performed using the temporary sarcomeric shortening displacement z̄ on the time interval [T : T + ΔT] given by

$${}^{T+k\Delta t}\overline{z} = {}^{T}z + k\Delta t \, {}^{T}\dot{z},\ k = 1, \cdots, n \tag{21}$$

The LA conformation variables for the attached MHs at time t + Δt are explicitly updated from those at time t, so that the following equations are satisfied:

$$\begin{cases} \gamma\_X \dfrac{{}^{t+\Delta t}x\_{i,j} - {}^{t}x\_{i,j}}{\Delta t} + \dfrac{\partial \phi}{\partial x} \left( {}^{t}x\_{i,j}, {}^{t}y\_{i,j} \right) + \dfrac{dW\_{\text{rod}}}{d\overline{\xi}} \left( {}^{t}\overline{\xi}\_{i,j} \right) - {}^{t}R\_{X,i,j} = 0 \\ \gamma\_Y \dfrac{{}^{t+\Delta t}y\_{i,j} - {}^{t}y\_{i,j}}{\Delta t} + \dfrac{\partial \phi}{\partial y} \left( {}^{t}x\_{i,j}, {}^{t}y\_{i,j} \right) - {}^{t}R\_{Y,i,j} = 0 \end{cases} \tag{22}$$

Then, the temporary rod strains {ξ̄<sub>i,j</sub>} at time t + Δt are updated according to

$$\begin{cases} \gamma\_X \dfrac{{}^{t+\Delta t}\overline{\xi}\_{i,j} - {}^{t}\overline{\xi}\_{i,j}}{\Delta t} + \dfrac{dW\_{\text{rod}}}{d\overline{\xi}} \left( {}^{t}\overline{\xi}\_{i,j} \right) - {}^{t}R\_{D,i,j} = 0, & {}^{t}\delta\_{A,i,j} = 0 \\ {}^{t+\Delta t}\overline{\xi}\_{i,j} - {}^{t\_{A,i,j}}\overline{\xi}\_{i,j} - \left( {}^{t+\Delta t}x\_{i,j} - {}^{t\_{A,i,j}}x\_{i,j} \right) + {}^{t+\Delta t}\overline{z} - {}^{t\_{A,i,j}}\overline{z} = 0, & {}^{t}\delta\_{A,i,j} = 1 \end{cases} \tag{23}$$

After performing the above time integrations for k = 1, ···, n over the interval [T : T + ΔT], the true sarcomeric shortening displacement z is implicitly computed by solving the following equations:

$$\begin{cases} \gamma\_Z \, {}^{T+\Delta T}\dot{z} + K\_Z \, {}^{T+\Delta T}z - {}^{T,\Delta T}F = 0 \\ {}^{T+\Delta T}z = {}^{T}z + \Delta T \, {}^{T+\Delta T}\dot{z} \end{cases} \tag{24}$$

In Equation (24), the mean total tensile force <sup>T,ΔT</sup>F over the time interval [T : T + ΔT] is found by applying the true rod strains ξ<sub>i,j</sub> over the time interval [T : T + ΔT] according to

$${}^{T,\Delta T}F = {}^{T,\Delta T}\overline{F} + \frac{1}{n \cdot n\_F} \sum\_{k=1}^{n} \sum\_{j=1}^{n\_F} \sum\_{i=1}^{n\_M} {}^{T+k\Delta t}\delta\_{A,i,j} \frac{d^2 W\_{rod}}{d\xi^2} \left( {}^{T+k\Delta t}\overline{\xi}\_{i,j} \right) \left( {}^{T+k\Delta t}\xi\_{i,j} - {}^{T+k\Delta t}\overline{\xi}\_{i,j} \right) \tag{25}$$

where the temporary total tensile force is evaluated using

$${}^{T,\Delta T}\overline{F} = \frac{1}{n \cdot n\_F} \sum\_{k=1}^{n} \sum\_{j=1}^{n\_F} \sum\_{i=1}^{n\_M} {}^{T+k\Delta t}\delta\_{A,i,j} \frac{dW\_{rod}}{d\xi} \left( {}^{T+k\Delta t}\overline{\xi}\_{i,j} \right) \tag{26}$$

from the temporary rod strain values {<sup>T+kΔt</sup>ξ̄<sub>i,j</sub>}. Note that Equation (25) is a linear approximation of the tensile force for the true rod strains about the temporary rod strains, for which the differences are given by

$${}^{T+k\Delta t}\xi\_{i,j} - {}^{T+k\Delta t}\overline{\xi}\_{i,j} = - \left( k - k\_{A,i,j} \right) \Delta t \left( {}^{T+\Delta T}\dot{z} - {}^{T}\dot{z} \right), \ k = 1, \cdots, n \tag{27}$$

where k<sub>A,i,j</sub> is the most recent microscale step index k at which MH<sub>i,j</sub> became attached. This number is initialized to zero before starting the small time steps with k = 1. By substituting Equation (27) into Equation (25), the mean total tensile force can be rewritten as

$${}^{T,\Delta T}F = {}^{T,\Delta T}\tilde{F} - \Delta T \, {}^{T,\Delta T}K\_F \, {}^{T+\Delta T}\dot{z} \tag{28}$$

with total mean stiffness

$${}^{T,\Delta T}K\_F = \frac{1}{n^2 \cdot n\_F} \sum\_{k=1}^n \sum\_{j=1}^{n\_F} \sum\_{i=1}^{n\_M} {}^{T+k\Delta t}\delta\_{A,i,j} \left( k - k\_{A,i,j} \right) \frac{d^2 W\_{rod}}{d\xi^2} \left( {}^{T+k\Delta t}\overline{\xi}\_{i,j} \right) \tag{29}$$

and the extrapolated mean total tensile force using <sup>T</sup>ż:

$${}^{T, \Delta T}\tilde{F} = {}^{T, \Delta T}\overline{F} + \Delta T \, {}^{T, \Delta T}K\_F {}^{T}\dot{z} \tag{30}$$

By substituting Equations (28–30) into Equation (24), the implicit scheme is established as follows:

$$\left( \gamma\_{Z} + K\_{Z} \Delta T + \Delta T \, {}^{T,\Delta T}K\_{F} \right) {}^{T+\Delta T}\dot{z} = - \left( K\_{Z} \, {}^{T}z - {}^{T,\Delta T}\tilde{F} \right) \tag{31}$$
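The implicit update can be written compactly in code. The following is a minimal sketch, assuming that the microscale loop has already produced the temporary mean force and the mean stiffness of Equations (26) and (29) as scalars; the function and argument names are hypothetical and the numerical values in the usage lines are arbitrary illustrations.

```python
def extrapolated_force(f_bar, k_f, dT, z_dot_old):
    """Equation (30): extrapolated mean force from the temporary mean force."""
    return f_bar + dT * k_f * z_dot_old


def implicit_z_update(z_old, z_dot_old, f_bar, k_f, dT, gamma_z, k_z):
    """Solve Equation (31) for the new velocity, then update z (Equation 24)."""
    f_tilde = extrapolated_force(f_bar, k_f, dT, z_dot_old)
    z_dot_new = (f_tilde - k_z * z_old) / (gamma_z + k_z * dT + dT * k_f)
    z_new = z_old + dT * z_dot_new
    return z_new, z_dot_new


# Arbitrary illustrative inputs (units: pN, nm, ns)
gamma_z, k_z, k_f, dT = 1.0e4, 1.0, 28.0, 5000.0
z_old, z_dot_old, f_bar = 2.0, 1.0e-4, 30.0

z_new, z_dot_new = implicit_z_update(
    z_old, z_dot_old, f_bar, k_f, dT, gamma_z, k_z
)

# Residual of Equation (24), with the force taken from Equation (28)
f_tilde = extrapolated_force(f_bar, k_f, dT, z_dot_old)
force = f_tilde - dT * k_f * z_dot_new
residual = gamma_z * z_dot_new + k_z * z_new - force
```

Because the stiffness term ΔT·K<sub>F</sub> appears in the denominator, the computed velocity stays bounded as ΔT grows, which is the property the text relies on.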

To see the necessity of the above implicit coupling scheme, consider the instability of the usual explicit scheme here. If an explicit scheme for the total mean tensile force is used

$$\begin{cases} \gamma\_Z \, {}^{T+\Delta T}\dot{z} + K\_Z \, {}^{T+\Delta T}z - {}^{T,\Delta T}\overline{F} = 0 \\ {}^{T+\Delta T}z = {}^{T}z + \Delta T \, {}^{T+\Delta T}\dot{z} \end{cases} \tag{32}$$

instead of Equation (24), the time step size ΔT is limited by the total mean stiffness according to

$$
\Delta T < \frac{\gamma\_Z + \Delta T K\_Z}{{}^{T,\Delta T}K\_F} \tag{33}
$$

As an illustration, consider the case of γ<sub>Z</sub> = 10<sup>4</sup> pN·ns/nm, as assumed in our previous work (Washio et al., 2017), <sup>T,ΔT</sup>K<sub>F</sub> = 28 pN/nm, 20 attached MHs, the stiffness of each rod set to 2.8 pN/nm, and K<sub>Z</sub> ≈ 0. The constraint in Equation (33) would then be ΔT < 360 ns. In contrast, the proposed algorithm is stable for any time step size, insofar as the linear approximation in Equation (28) holds.
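The quoted limit can be reproduced directly from Equation (33). The sketch below also evaluates a velocity amplification factor; that factor rests on an assumption made here, namely that the temporary force in Equation (32) depends on the previous velocity as F̄ = F̃ − ΔT·K<sub>F</sub>·<sup>T</sup>ż (Equation 30 rearranged) with F̃ held fixed. The function names are hypothetical.

```python
def explicit_step_limit(gamma_z, k_f, k_z=0.0):
    """Solve Equation (33) for dT; unbounded when k_f <= k_z."""
    if k_f <= k_z:
        return float("inf")
    return gamma_z / (k_f - k_z)


def explicit_amplification(gamma_z, k_f, dT, k_z=0.0):
    """Magnitude of the factor linking successive velocities in the
    explicit scheme, under the assumption stated in the lead-in."""
    return dT * k_f / (gamma_z + dT * k_z)


gamma_z = 1.0e4  # pN*ns/nm, value quoted in the text
k_f = 28.0       # pN/nm, value quoted in the text

limit_ns = explicit_step_limit(gamma_z, k_f)           # about 357 ns
amp_500 = explicit_amplification(gamma_z, k_f, 500.0)  # above 1: grows
amp_250 = explicit_amplification(gamma_z, k_f, 250.0)  # below 1: decays
```

With these numbers the limit is about 357 ns, consistent with the rounded value of 360 ns given above.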

In coupling with the macroscopic finite element model, a half-sarcomere model is assigned to each element, for which Equation (20) is applied based on the relationship between the stretching along the fiber orientation **f** and the deformation gradient tensor:

$${}^{T}\lambda = \left\| \frac{\partial \, {}^{T}\mathbf{x}}{\partial \mathbf{X}} \mathbf{f} \right\| \tag{34}$$

Here, <sup>T</sup>**x** = <sup>T</sup>**x**(**X**) is the current position at time T of the material point located at **X** in the unloaded condition. Specifically, the following equation, obtained from Equation (34), is substituted into Equation (20):

$${}^{T}\dot{\lambda} = \frac{1}{{}^{T}\lambda} \left( \frac{\partial \, {}^{T}\dot{\mathbf{x}}}{\partial \mathbf{X}} \mathbf{f} \right) \cdot \left( \frac{\partial \, {}^{T}\mathbf{x}}{\partial \mathbf{X}} \mathbf{f} \right) \tag{35}$$
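Equations (34, 35) together with Equation (20) map a deformation gradient to a half-sarcomere shortening velocity. The following is a minimal pure-Python sketch of that mapping for a single element; the 3 × 3 nested-list representation, the function names, and the example deformation are assumptions made for illustration.

```python
import math


def mat_vec(m, v):
    """Product of a 3x3 matrix (nested lists) and a 3-vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fiber_stretch(F, f):
    """Equation (34): norm of the deformed unit fiber vector."""
    Ff = mat_vec(F, f)
    return math.sqrt(dot(Ff, Ff))


def fiber_stretch_rate(F, F_dot, f):
    """Equation (35): time derivative of the fiber stretch."""
    return dot(mat_vec(F_dot, f), mat_vec(F, f)) / fiber_stretch(F, f)


def half_sarcomere_velocity(F, F_dot, f, sl0=2.1):
    """Equation (20), with SL0 in um so the result is in um per time unit."""
    return -0.5 * sl0 * fiber_stretch_rate(F, F_dot, f)


# Hypothetical example: 10% shortening along the fiber, shortening further
F = [[0.9, 0.0, 0.0], [0.0, 1.05, 0.0], [0.0, 0.0, 1.05]]
F_dot = [[-2.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # per second
f = [1.0, 0.0, 0.0]

stretch = fiber_stretch(F, f)
stretch_rate = fiber_stretch_rate(F, F_dot, f)
z_dot = half_sarcomere_velocity(F, F_dot, f)  # positive means shortening
```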

From Equations (20, 28), the mean total tensile force of each thin filament is given by

$${}^{T,\Delta T}F = {}^{T,\Delta T}\tilde{F} + \Delta T \frac{{}^{T,\Delta T}K\_F}{2} \text{SL}\_0 \, {}^{T+\Delta T}\dot{\lambda} \tag{36}$$

Here, <sup>T</sup>ż in Equation (30) is also replaced with −SL<sub>0</sub> <sup>T</sup>λ̇/2 to determine <sup>T,ΔT</sup>F̃. Thus, the total active tension per unit area in the unloaded configuration, the nominal stress, is given by

$${}^{T,\Delta T}T\_f = 2\frac{R\_\text{S}}{\text{SA}\_0} \, {}^{T,\Delta T}F = 2\frac{R\_\text{S}}{\text{SA}\_0} \left( {}^{T,\Delta T}\tilde{F} + \Delta T \frac{{}^{T,\Delta T}K\_F}{2} \text{SL}\_0 \, {}^{T+\Delta T}\dot{\lambda} \right) \tag{37}$$

Here, SA<sub>0</sub> is the cross-sectional area of a single thin filament and R<sub>S</sub> denotes the volume ratio of the sarcomere. The factor of two in Equation (37) comes from the fact that <sup>T,ΔT</sup>F is the total tensile force given by the MHs surrounding one of the double spirals along the thin filament.

Although a small time step on the order of 1 ns must be used for the time integration of the molecular variables, a larger time step can be applied to the MC state-transition phase. Thus, it is reasonable to apply a much larger time step size, as long as it is an integer multiple of Δt, to the computation of the MC state-transitions. Furthermore, even for the time integration of the molecular variables, a coarser time step than the one used for the attached MHs can be applied to the detached MHs, since the magnitudes of the curvatures are different for the potentials W<sub>rod</sub> and ϕ (**Figure 3**).
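The resulting schedule of updates within one macroscale step can be sketched as a simple counting loop. The multipliers n = 20,000, n<sub>D</sub> = 5, and n<sub>DA</sub> = 10 used below are the values reported for the beating-ventricle simulations in this article; the function itself is a hypothetical illustration of the scheduling only, not of the physics.

```python
def count_updates(n, n_d, n_da):
    """Count how often each kind of update runs in one macroscale step.

    n    : number of fine steps dt per macroscale step dT
    n_d  : detached MHs are integrated every n_d fine steps
    n_da : MC state transitions are evaluated every n_da fine steps
    """
    counts = {"attached": 0, "detached": 0, "state_transition": 0}
    for k in range(1, n + 1):
        counts["attached"] += 1  # bound MHs: every fine step
        if k % n_d == 0:
            counts["detached"] += 1
        if k % n_da == 0:
            counts["state_transition"] += 1
    return counts


schedule = count_updates(n=20_000, n_d=5, n_da=10)
```

With these values the detached-head integration runs on one fifth of the fine steps and the MC state transitions on one tenth of them.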

### Coupling With the Finite Element Ventricle Model

In the beating-ventricle simulation, the Ca<sup>2+</sup> transient is given for each element of the ventricle model (**Figure 5**). By referencing the Ca<sup>2+</sup> transients, together with the stretching λ and the stretching rate λ̇ along the fiber direction, the molecular variables were integrated using the small time step Δt, while the macroscopic displacements of the continuum were computed using the large time step ΔT. As derived in the Supplementary Material S3, the active stress on the continuum at time T + ΔT is represented by the first Piola–Kirchhoff stress tensor:

$$\Pi\_{\rm act} = \frac{{}^{T,\Delta T}T\_f}{{}^{T+\Delta T}\lambda} \mathbf{f} \otimes \mathbf{f} \cdot \left( \frac{\partial \, {}^{T+\Delta T}\mathbf{x}}{\partial \mathbf{X}} \right)^T \tag{38}$$

In the definition of the tension <sup>T,ΔT</sup>T<sub>f</sub> in Equation (37), the stiffness due to the attached MHs is implicitly included by the use of <sup>T+ΔT</sup>λ̇ for <sup>T+ΔT</sup>ż in Equation (28). See also the explanation of the stiffness in the Supplementary Material S3. Thus, the proposed scheme is stable for any size of time step.

The governing equation in the macroscale to be solved can be represented by

$$\int\_{\Omega} \delta \dot{\mathbf{u}} \cdot \rho \ddot{\mathbf{u}} \, d\Omega + \int\_{\Omega} \delta \dot{\mathbf{Z}} : \left( \Pi + 2 p J \mathbf{F}^{-1} \right)^{T} d\Omega = P\_{L} \int\_{\Gamma\_{L}} \delta \dot{\mathbf{u}} \cdot \mathbf{n} \, d\Gamma\_{L} + P\_{R} \int\_{\Gamma\_{R}} \delta \dot{\mathbf{u}} \cdot \mathbf{n} \, d\Gamma\_{R} \tag{39}$$

$$\int\_{\Omega} \delta p \left( 2 \left( J - 1 \right) - \frac{p}{\kappa} \right) d\Omega = 0 \tag{40}$$

Here, **u** = <sup>T</sup>**u**(**X**) = <sup>T</sup>**x**(**X**) − **X** is the displacement of the material point **X** ∈ Ω at time T, ρ is the density of the heart muscle, **F** = ∂**x**/∂**X** is the deformation gradient tensor, **Z** = ∂**u**/∂**X** is the displacement gradient tensor, J = det **F** is the Jacobian, p is the hydrostatic pressure, κ is the bulk modulus, and P<sub>L</sub> and P<sub>R</sub> are the blood pressures in the left and right ventricles, respectively. Ω is the muscle domain in the reference configuration, while Γ<sub>L</sub> and Γ<sub>R</sub> are the blood–muscle interfaces of the left and right ventricles, respectively, in the current configuration at time T, and **n** is the normal unit vector directed from the cavity to the muscle at these surfaces (**Figure 5**). The Dirichlet boundary condition <sup>T</sup>**u**(**X**) = 0 is imposed on the boundary nodes around the valve rings. The first Piola–Kirchhoff stress tensor Π consists of the active, passive, and viscous stresses:

$$
\Pi = \Pi\_{\rm act} + \Pi\_{\rm pas} + \Pi\_{\rm vis} \tag{41}
$$

where Π<sub>act</sub> is given by Equation (38), and the others are, respectively, the passive and viscous stresses, as described in our previous work (Washio et al., 2016). The details of these two stress tensors are given in the Supplementary Material S4.

The ventricle blood pressures P<sub>L</sub> and P<sub>R</sub> were determined through their interactions with the circulatory system of the body. These were modeled as electrical analog circuits, using the same parameters described in our previous work (Washio et al., 2016). The details of the circuit model that includes the atrial model are given in the Supplementary Material S5. In particular, the flow rates at the inlets and the outlets were associated with the rates of volume change in the cavity according to:

$$\begin{cases} \int\_{\Gamma\_L} \dot{\mathbf{u}} \cdot \mathbf{n} \, d\Gamma\_L = F\_{MI} - F\_{AO} \\ \int\_{\Gamma\_R} \dot{\mathbf{u}} \cdot \mathbf{n} \, d\Gamma\_R = F\_{TR} - F\_{PA} \end{cases} \tag{42}$$

Here, F<sub>MI</sub>, F<sub>AO</sub>, F<sub>TR</sub>, and F<sub>PA</sub> were the flow rates, respectively, through the mitral, aortic, tricuspid, and pulmonary valves (**Figure 5**). These flow rates were determined by Ohm's law while taking the rectification of the valve into account:

$$F = H\left(\overline{F}\right)\overline{F} \tag{43}$$

Here, F̄ was the flow rate in the case of no rectification, and H was the relaxed Heaviside function:

$$H\left(\overline{F}\right) = \begin{cases} 0, & \overline{F} < 0\\ \left(\overline{F}/\overline{F}\_0\right)^2 \left(3 - 2\overline{F}/\overline{F}\_0\right), & 0 \le \overline{F} \le \overline{F}\_0\\ 1, & \overline{F} > \overline{F}\_0 \end{cases} \tag{44}$$

In our simulation, the value F̄<sub>0</sub> = 5 mL/s was used.
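The valve rectification of Equations (43, 44) is straightforward to express in code. The following is a minimal sketch; the function names are hypothetical, and the default threshold of 5 mL/s is the value quoted above.

```python
def relaxed_heaviside(f_bar, f0=5.0):
    """Equation (44): smooth step from 0 to 1 over 0 <= f_bar <= f0."""
    if f_bar < 0.0:
        return 0.0
    if f_bar > f0:
        return 1.0
    s = f_bar / f0
    return s * s * (3.0 - 2.0 * s)


def rectified_flow(f_bar, f0=5.0):
    """Equation (43): valve flow after rectification (mL/s)."""
    return relaxed_heaviside(f_bar, f0) * f_bar


backward = rectified_flow(-3.0)      # reverse flow is blocked
halfway = relaxed_heaviside(2.5)     # midpoint of the smooth ramp
forward = rectified_flow(10.0)       # well above f0: passes unchanged
```

The cubic ramp has zero slope at both ends of the transition interval, which avoids a discontinuity in the flow derivative when the valve opens or closes.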

The macroscopic variables, including the acceleration **ü**, velocity **u̇**, and displacement **u** at time T + ΔT, were found using Newton–Raphson iteration until the equilibrium condition was satisfied with the Newmark-beta time integration scheme (Supplementary Material S6). During the iterations, the active stress in Equation (38) was redefined with Equation (37), in which the microscopic computational results <sup>T,ΔT</sup>F̃ and <sup>T,ΔT</sup>K<sub>F</sub> were reused. Thus, switching between computations at the two scales only happened once for each macroscopic time step.

### RESULTS

#### Computer System

To perform the simulations, a distributed parallel system was used. Each node consisted of two Intel® Xeon® E5-2670 (20 MB Cache, 2.6 GHz) processors, and each processor was composed of 8 cores. In the single sarcomere simulations, only one node was used for shared memory OpenMP parallelization. In the beating-ventricle simulations, the elements of the ventricle wall were equally distributed to the nodes, while the macroscopic computations were performed only at the master node. In the microscopic computations, the time integrations of the molecular variables were parallelized using OpenMP by dividing the filaments equally among the 16 cores.

### Validation of the MTS Scheme via Single Sarcomere Oscillation

The accuracy and computational efficiency of the MTS scheme were validated by numerical experiments with a single half-sarcomere model, in which 48 thin filaments were connected to a common Z-line (**Figure 2A**). In our previous work (Washio et al., 2017), we showed that the spontaneous oscillatory behavior of the sarcomere (Ishiwata et al., 2011) can be explained by the power stroke principle after applying a simple ordinary differential equation model. In this case, the collective reversal power strokes induced quick sarcomeric lengthening. Here, we show that this could also be reproduced by the Langevin dynamics model, regardless of the choice of macroscale time step size in the MTS scheme. In this numerical experiment, the spring constant K<sub>Z</sub> was set to 1 pN/nm per thin filament, and the viscosity coefficient γ<sub>Z</sub> was set to 10<sup>4</sup> pN · ns/nm per thin filament. During the simulations, the Ca<sup>2+</sup> concentration was kept at the constant value of 1 µM.

In **Figure 6**, the shortening displacements obtained by using a conventional single-scale integration scheme (Δt = ΔT = 0.25 ns) and the MTS scheme (Δt = 0.25 ns, ΔT = 5,000 ns) are compared for both the no-trap and trap models. In the no-trap model, the dependence of the first energy barrier height Eb<sub>1</sub>(θ) on the LA deflection θ in Equation (9) was eliminated, and the baseline of the energy barrier Eb<sub>01</sub> was higher than the one in the trap model (**Table 1**), so that a similar maximal tensile force was obtained in both models. The state-transitions were computed with Δt = 0.25 ns. In these numerical experiments, the simulations started from an initial state in which all of the MHs were in NXB, and an identical series of random forces and pseudorandom numbers for the MC state-transitions was applied to all the simulations. In the case of the no-trap model (**Figure 6A**), similar amplitudes and periods were obtained for the shortening displacements, although there were deviations in the timing of the sharp declines. In the case of the trap model (**Figure 6B**), the large dips in the displacements disappeared. Instead, rapid small vibrations appeared. In this case, similar initial rises, periods, and amplitudes of vibrations were obtained for both time step sizes of ΔT.

As depicted in **Figure 4**, the attached MHs in the XB state were classified according to their power stroke displacement y, as follows:

$$\begin{cases} \text{Pre} = \left\{ \text{MH}\_{i,j} \in \text{XB} : y\_{i,j} < s\_1/2 \right\} \\ \text{PS}\_1 = \left\{ \text{MH}\_{i,j} \in \text{XB} : s\_1/2 \le y\_{i,j} < s\_1 + s\_2/2 \right\} \\ \text{PS}\_2 = \left\{ \text{MH}\_{i,j} \in \text{XB} : y\_{i,j} \ge s\_1 + s\_2/2 \right\} \end{cases} \tag{45}$$

These states can be regarded as the pre-power stroke, the state after the first power stroke, and the state after the second power stroke, respectively. As suggested by our previous work (Washio et al., 2017), a large pulsed flux of the reversal power strokes from PS<sub>2</sub> to Pre over PS<sub>1</sub> generated the sharp decline in z for the no-trap model (**Figure 7A**). In the trap model, this reversal flux was trapped at PS<sub>1</sub>, so that the decline in z was stopped at small changes, leading to Δz > −10 nm (**Figure 7B**), which corresponds to the stroke size of the LA.
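The classification in Equation (45) amounts to two thresholds on the power stroke displacement y. A minimal sketch follows; the stroke sizes `s1` and `s2` passed in the usage lines are hypothetical placeholders, since their values are not given in this section, and the function names are likewise illustrative.

```python
from collections import Counter


def classify_power_stroke(y, s1, s2):
    """Equation (45): assign an attached MH to Pre, PS1, or PS2 from y."""
    if y < s1 / 2.0:
        return "Pre"
    if y < s1 + s2 / 2.0:
        return "PS1"
    return "PS2"


def state_ratios(y_values, s1, s2):
    """Fraction of attached MHs in each power stroke stage."""
    counts = Counter(classify_power_stroke(y, s1, s2) for y in y_values)
    total = len(y_values)
    return {state: counts[state] / total for state in ("Pre", "PS1", "PS2")}


# Hypothetical stroke sizes (nm) and displacements, for illustration only
S1, S2 = 5.0, 5.0
ratios = state_ratios([1.0, 2.5, 7.4, 7.5], S1, S2)
```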

To test the stability of the MTS scheme, simulations using the explicit scheme given by Equation (32) were performed with a much smaller time step of ΔT = 500 ns (**Figure 8**). Although the explicit scheme also yielded good results at first, the computational results became totally invalid once the active stiffness <sup>T,ΔT</sup>K<sub>F</sub> exceeded the threshold indicated by Equation (33), as estimated previously. Furthermore, oscillatory behavior could not be reproduced with the explicit scheme. This result illustrates the drawback of using the active tensions explicitly, which arises when a system of ordinary differential equations is solved with a finer time step in coupled simulations. As shown in **Figure 8A**, the calculated force using the explicit scheme did not diverge, although the oscillatory behavior was completely lost. Thus, it is difficult to judge the accuracy of numerical results by examining only one case. As shown here, we must compare the results of different macroscale time step sizes ΔT to confirm the accuracy of the coupling scheme.

FIGURE 6 | The shortening displacements obtained by the standard scheme (blue: Δ*t* = Δ*T* = 0.25 ns) and the MTS scheme (red: Δ*t* = 0.25 ns, Δ*T* = 5,000 ns) for the spontaneous oscillations of the single half-sarcomere model with *n<sub>M</sub>* = 38 and *n<sub>F</sub>* = 48. (A) The comparison for the no-trap model. (B) The comparison for the trap model.

FIGURE 7 | The temporal change of the state ratios classified into the three power stroke stages (*Pre*, *PS*<sub>1</sub>, and *PS*<sub>2</sub>) obtained by the standard scheme (Δ*T* = 0.25 ns) for the spontaneous oscillations of the single half-sarcomere with the no-trap model (A) and the trap model (B). In the case of the no-trap model (A), a large pulsed flux of the reversal power strokes from *PS*<sub>2</sub> to *Pre* through *PS*<sub>1</sub> generates the sharp decline of *z* around *T* = 370 ms. In the case of the trap model (B), the flux of the reversal power strokes is trapped at *PS*<sub>1</sub>, so that the decline of *z* is stopped within small changes Δ*z* > −10 nm.

The above simulations were executed on one node consisting of 16 cores using shared memory in OpenMP parallelization. Thus, in the parallelization, three filaments were assigned to each core. The averaged elapsed times for the 1-ms time integration were 125 and 97 s with the standard integration scheme (Δt = ΔT = 0.25 ns) and the MTS scheme (Δt = 0.25 ns, ΔT = 5,000 ns), respectively. The difference in the elapsed times came from the machine synchronization overhead and the differences in the computational loads for the various filaments. With the MTS scheme that lumps 20,000 steps, the differences in computational loads between the filaments during each small time step were greatly reduced. For a single-sarcomere simulation, using a much smaller time step size for ΔT was sufficient to attain good parallel efficiency because the overhead associated with updating z was negligible. However, a large step size was necessary when the sarcomere model was coupled with the macroscopic ventricle model because the communication overhead between the large number of nodes increased greatly, along with the computation time for updating the macroscopic variables.

#### Validation of Basic Sarcomere Properties

The basic properties of the actomyosin trap model, which include the SL and [Ca] dependences of the contractile force, the isometric twitch, the responses for the isotonic contraction, and the quick shortening of the half-sarcomere, along with the details of these numerical experiments, are presented in the Supplementary Material S2. The results of these numerical experiments confirm the validity of our half-sarcomere model. Here, the force-velocity curve obtained at a constant Ca<sup>2+</sup> concentration ([Ca] = 1 µM) is examined in the context of the behavior of the bound myosin molecules during the isotonic contractions at the various shortening velocities (**Figure 9**). As the shortening velocity increased, the state ratio of PS<sub>2</sub> increased (**Figure 9B**), because the joint point P was pushed forward (y increased) more strongly by the deflection potential W<sub>LA</sub>(θ) in Equation (7) with the larger negative deflection θ = y − x (**Figure 9D**). Note that the negative averaged rod strain ξ at PS<sub>2</sub> for a shortening velocity larger than 1 µm/s (**Figure 9C**) does not imply a negative contractile force, because dW<sub>rod</sub>/dξ(ξ) ≫ −dW<sub>rod</sub>/dξ(−ξ) for any positive strain ξ > 0, except for ξ ∼ 0, as shown in **Figure 3A**.

#### Stretch-Activation by Trapped Myosins

To see the effectiveness of the trapping mechanism in the state after the first power stroke PS<sub>1</sub> created by the energy barrier in Equation (9), together with the zero detachment rates for PS<sub>1</sub> in Equations (11, 12), a stretch-activation test was performed for the single half-sarcomere model consisting of 48 filament pairs (**Figure 10**). Here, a 1% stretch was applied over the 1-ms time interval starting at T = 150 ms, at which time the contractile force had sufficiently matured. In the simulation, the time step sizes were set at Δt = 0.25 ns and ΔT = 25 ns. The state-transitions were also computed using Δt = 0.25 ns. During the simulations, the Ca<sup>2+</sup> concentration ([Ca]) was kept at the constant value of 10 µM.

A roughly 15% increase in the contractile force lasted at least 2 s after the quick stretch (**Figure 10A**). This long-lasting increase in the force compared with the pre-stretch steady state was apparently due to the lasting increase in the population of PS<sub>1</sub> (**Figure 10B**: orange line). The persistent increase of the averaged LA deflection, θ = y − x, for MHs in PS<sub>1</sub> (**Figure 10C**) indicates that it was generated by the MHs trapped by the higher free energy barrier Eb<sub>1</sub>(θ) defined by Equation (9). Compared with the experimental results given in Stelzer et al. (2006), our numerical result misses "Phase 2," in which the force drops once to the steady state level before the stretch. However, the magnitude of the subsequent force increase agrees with the experimental observations.

#### Beating-Ventricle Simulations

Beating-ventricle simulations were performed using a finite element ventricle model consisting of 7,600 tetrahedral elements. In each element, a sarcomere model consisting of 8 filament pairs was imbedded along the appropriate fiber orientation **f**. The distribution of the fiber orientations (**Figure 1**) was found by an optimization algorithm (Washio et al., 2016) based on the impulses given by the active tension, which was computed using the MC crossbridge model instead of the Langevin model to reduce the heavy computational loads. Portions of the helical fiber structure are depicted in **Figure 1**. As confirmed in our previous work (Washio et al., 2016), this algorithm constructed a fiber distribution that was quite similar to the one obtained by diffusion tensor magnetic resonance imaging (DTMRI) measurements. The heart rate was set to 60 beats per minute, and the Ca<sup>2+</sup> transient (**Figure 5**) generated by the mid-myocardial cell model proposed by ten Tusscher and Panfilov (2006) was applied. The transmural delays of the Ca<sup>2+</sup> transient, determined by the distances from the endocardial surfaces of the left and right ventricles under a transmural conduction velocity of 52 cm/s, as measured by Taggart et al. (2000), were adopted. The deformation of each element was linked to the sarcomeric shortening displacement using Equations (34, 35). In the simulations, the optimized time step algorithm represented in **Figure 4** was applied. Essentially, the values Δt = 0.25 ns and ΔT = 5,000 ns were used, so that n = 20,000. However, the state-transitions were computed every 2.5 ns (n<sub>DA</sub> = 10), and the time integration for the detached MHs was performed every 1.25 ns (n<sub>D</sub> = 5).

In the crossbridge model, the trap and the no-trap models using the various power-stroke free energy potential functions ϕ<sub>PS</sub> were used, as with the simulations of the single sarcomere oscillation (**Table 1**). A comparison with the no-trap model in **Figure 11A** shows that the trap mechanism contributes to maintaining the high pressure in the latter half of the systolic phase. As a result, the blood volume ejected from the left ventricle in the trap model increased from 68 to 77 mL, while the ATP energy consumption of the left ventricular wall decreased from 6.4 to 5.9 J (**Figure 11E**). This implies that the trap mechanism serves to increase the blood ejection, while also decreasing the energy consumption. Note that the ATP consumption rates were computed by counting the detachments of MHs in PS<sub>2</sub> to those in NXB, which was controlled by the rate constant D<sub>NXB</sub> defined in Equation (12).

As shown in **Figures 11B–D**, two increases in the population of MHs in state PS<sub>1</sub> can be seen: one at the beginning of the systolic phase, and one in the latter half. These increases correspond to reversals in the left ventricular pressures of the trap and the no-trap models, as shown in **Figure 11A**. In the systolic phase, the cardiac myocytes supported their contractile tension along the shared fiber bundle, in which the active stress in Equation (38) provided the great majority of the total stress in Equation (41). Therefore, from the mechanical equilibrium condition along a fiber bundle, the active tensions must be almost equal. If there was a delay in the provision of the active tension, or a relaxation during the intermediate systolic phase at one point of the fiber bundle, this portion quickly became lengthened, and the sarcomeres in the remaining parts shortened until reaching a mechanical equilibrium. Since this transition accompanied decreases in the active tension of the sarcomeres, stopping the process as early as possible was desirable. The trap mechanism could achieve this goal, as shown in **Figure 12**, in which the distributions of the population of MHs in states PS<sub>1</sub> and PS<sub>2</sub> at the end of the systolic phase (T = 0.25 s) were compared. As shown in **Figure 12B** for the trap model, the higher populations in the PS<sub>1</sub> state were seen in the regions where the populations in state PS<sub>2</sub> were lower than in the other regions. This indicates that the decrease in the population of MHs at PS<sub>2</sub> was sufficiently compensated for by the trapped MHs in state PS<sub>1</sub>. However, although the population in PS<sub>2</sub> for the no-trap model was similar to that of the trap model, the active tension was nearly half that of the trap model for the entire region (**Figure 12A**). In particular, the active tensions with the no-trap model were much smaller than those with the trap model, even in the regions with large PS<sub>2</sub> populations. This indicates the importance of maintaining the active tension along the fiber bundle. The distributions of the active tension values and the state populations over the entire cycle are shown in Supplementary Video 1.

The importance of the trap for synchronizing contraction and relaxation over the entire ventricle is further confirmed by **Figure 13**, in which the behaviors of the sarcomere model

filament pairs was imbedded. (A) The time courses of the left ventricular pressure (solid lines) and volume (broken lines) with the no-trap MH model (red) and the trap model (black). (B–D) The time courses of the population ratio of attached MHs in the left ventricular wall classified to the pre-power stroke state (B: *Pre*), the first post-power stroke state (C: *PS*1), and the second post-power stroke state (D: *PS*2). (E) The time courses of the cumulative ATP energy consumption in the left ventricular wall.

with the no-trap and the trap model imbedded in an identical element at the apical septal segment are compared. With the no-trap model (**Figure 13A**), there was a prominent decline in the sarcomere shortening displacement z that accompanied the large drops in the active tension around T = 0.18 s. This drop in the active tension was caused by shifts in the population of MHs from PS<sub>2</sub> to the pre-power stroke state Pre, as indicated in **Figure 13C**. As shown previously in the simulations of sarcomere oscillation, each sarcomere had the ability to undergo quick lengthening after a certain duration of contraction. However, the slow decline of LVP in the no-trap model (**Figure 11A**) at the end of the systolic phase indicates that this characteristic was not necessarily exploited for the quick relaxation of the whole ventricle before the next diastolic phase, because the timing of the relaxation changed depending on the Ca<sup>2+</sup> transients and the sarcomeric movements. Furthermore, a relaxation prior to a sufficient drop in the Ca<sup>2+</sup> concentration was followed by the next

contraction, as shown in **Figure 13C**, around T = 0.2 s. This contraction of the sarcomere did not efficiently contribute to increasing the ejected blood volume, as indicated by LVV in **Figure 11A**. However, the blood ejection lasted until T = 0.3 s in the trap model. Thus, maintaining the active tension with the trapped MHs in PS<sub>1</sub>, which corresponded to a rise in the population of PS<sub>1</sub> during the time interval [0.23, 0.3] (**Figure 13D**), substantially contributed to the ejected blood volume.

**Figure 14** compares the distributions of the attached MHs, which are imbedded in 33 elements at the apical septal segment, in the (y, θ) coordinates at T = 0.1, 0.2, and 0.3 s for the no-trap and trap models. Although the distributions at the beginning of the systolic phase (T = 0.1 s) were nearly the same for both models, differences were found in regions of higher deflection θ in the Pre and PS<sub>1</sub> states at the peak of the systolic phase (T = 0.2 s), and in PS<sub>1</sub> and PS<sub>2</sub> at the end of the systolic phase (T = 0.3 s). Note that the large deflection (θ > 0) of the LA created high strain (ξ > 0) in the rod due to the equilibrium condition for the variable x in Equation (1). However, these MHs in the Pre state of the no-trap model disappeared quickly due to their large rate of detachment into state PXB in Equation (11) and **Table 1** (D<sub>PXB,Pre</sub> = 3,000 s<sup>−1</sup>), so that they did not contribute to maintaining the active tension. In contrast, the MHs in state PS<sub>1</sub> were trapped there so long as these myocytes were strongly pulled by the surrounding activated myocytes.

Finally, the computational load and the parallel efficiency were examined. For the microscale computations, the elements of the finite element model were equally distributed to the available cores. For the macroscale finite element computations, however, only one node consisting of 16 cores was used, and the remaining nodes were in the waiting state, since the finite element model was relatively small (7,700 elements). Thus, the parallel efficiency was determined by the proportion of the macroscale computational time in the total computation time. With the original setup (n = 20,000, n<sub>DA</sub> = 10, n<sub>D</sub> = 5, Δt = 0.25 ns), the parallel computation with 1,920 cores required 105 h per heartbeat. Within this total elapsed time, 16% was occupied by the macroscale computations. Thus, good parallel efficiency was achieved. Further evaluations of the parallel efficiency are given in the Supplementary Material S7.

### DISCUSSION

### Accuracy, Stability, and Efficiency of the MTS Scheme

The MTS scheme coupled the integration of the molecular variables, which used the small time step Δt, with the integration of the sarcomere shortening variable z, which used the coarse time step ΔT, a large integer multiple of Δt. Since sarcomere shortening is linked to the shortening of the continuum along the fiber orientation by Equation (35), the same coupling scheme can be applied to the coupling with the finite element model. The key point of the proposed MTS scheme is that the active tension at time T + ΔT is implicitly determined by combining the stretch rate of the continuum along the fiber orientation at T + ΔT, as given in Equation (37), in which the stiffness of the attached myosin rods during the time interval [T : T + ΔT] given by Equation (29) is used. By applying this implicit scheme, an appropriate time step interval ΔT can be chosen for the macroscale computation to diminish the synchronization and communication overhead in the distributed memory parallel system. The accuracy of

FIGURE 13 | The behavior of the sarcomere in the systolic phase for the no-trap and the trap models imbedded in an identical element at the apical septal segment. (A,B) The active tension (red) and the sarcomeric shortening Z (blue). (C,D) The population ratio of attached MHs.

the MTS scheme, in which the time step ratio was set to 0.25 ns: 5 µs, was validated using a simulation of the spontaneous oscillation of a single sarcomere, and by comparing the numerical results with those computed using equal time intervals.

### Required Computational Power for the Coupled Simulation

For the beating-ventricle simulation of the ventricle model consisting of 7,600 elements, 105 h were required for each beat using 1,920 cores, with a 0.25-ns time step in the molecular computations and a 5-µs time step in the macroscopic finite element computation. Within this computation, 84% of the total time was consumed by the microscopic molecular computation. In this simulation, 4 elements were assigned to each core, and the sarcomere model imbedded in each element consisted of 8 filament pairs. Therefore, the CPU time per filament pair was ∼2.8 h. This is the elapsed time of the fastest possible case, in which one core is assigned to each filament pair, not counting the macroscale computation. Even for the rather coarse mesh model consisting of 7,600 elements, this fastest computation would require 60,800 (= 7,600 × 8) cores. This shows that our application still requires huge computational power.
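The figures quoted in this subsection can be cross-checked with a few lines of arithmetic. The sketch below assumes, for illustration only, that the microscale work on a core is shared evenly among its filament pairs; the variable names are ours and do not come from the simulation code.

```python
# Back-of-the-envelope check of the cost figures quoted in the text.
hours_per_beat = 105.0            # elapsed time per heartbeat on 1,920 cores
micro_fraction = 0.84             # share of time spent in molecular computations
elements_per_core = 4
filament_pairs_per_element = 8
n_elements = 7600

dt_micro_ns = 0.25                # molecular time step (ns)
dt_macro_ns = 5.0e3               # macroscopic time step: 5 microseconds, in ns

# Assumption: microscale time is divided evenly among the pairs on one core.
pairs_per_core = elements_per_core * filament_pairs_per_element
hours_per_pair = hours_per_beat * micro_fraction / pairs_per_core

# One core per filament pair is the finest decomposition considered in the text.
cores_one_pair_each = n_elements * filament_pairs_per_element

# Number of molecular steps taken inside one macroscopic step.
substeps_per_macro_step = dt_macro_ns / dt_micro_ns

print(f"CPU time per filament pair: {hours_per_pair:.2f} h")
print(f"cores for one pair per core: {cores_one_pair_each}")
print(f"molecular steps per macro step: {substeps_per_macro_step:.0f}")
```

Under that even-split assumption, the per-pair time comes out close to the ∼2.8 h stated above, and the step ratio matches the 0.25 ns : 5 µs ratio of the MTS scheme.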

### Potential of the Coupled Approach

In this paper, an effective utilization of the coupled approach to explore the macroscopic effects of a molecular mechanism was shown. Regarding the molecular mechanism, the power-stroke free energy potential was constructed so as to reproduce the stretch-activation for the single-sarcomere model. In this model, the energy barrier between the pre-power stroke state and the state after the first power stroke was made higher for large positive lever arm deflections, which meant that large loads were imposed on the myosin rods and heads. If the pre-power stroke state and the state after the first power stroke correspond to, respectively, the so-called "Pi-release state" and "ADP state," the forward and reversal power stroke transitions accompany the release and the rebinding of inorganic phosphate (P<sub>i</sub>), respectively (Llinas et al., 2015). Thus, if the larger load on the MH closes the channel in which P<sub>i</sub> travels during the transitions, the height of the free energy barrier could increase. In the proposed numerical model, this hypothesis was reflected by the landscape of the free energy ϕ<sub>PS</sub>(θ, y), as mentioned above. The coupled approach revealed that the proposed mechanism for the myosin molecule contributed to maintaining the high systolic blood pressure for the appropriate period by synchronizing relaxations along the fiber bundles. Stelzer et al. (2006) discussed the possibility of stretch-activation reinforcing regions where stronger contractile tensions were required during the entire systolic phase, while our numerical results suggest that its function is to reinforce the regions that start relaxation earlier than other regions. Of course, this is still just a hypothesis linking the stretch-activation to the performance of the beating heart. However, this function of stretch-activation at the end of the systolic phase has gone unnoticed until now.

### Limitations

In the coupling approach, a single half-sarcomere model was directly imbedded into each element of the macroscopic ventricular mesh. This means that the periodically repeated pattern of single sarcomere movement was imposed along the filament direction within each element. Thus, the synchronization of the sarcomeres within each element can be assumed. In reality, relaxations of sarcomeres within the same myofibril are not necessarily synchronized. Thus, even though each of the sarcomeres was stretched quickly during relaxation, as shown in the spontaneous oscillation simulation, the stretch speed of the entire cardiac cell may be slowed due to time lags. One way to account for such an effect in the simulation model is to imbed a myofibril model, in which an adequate number of sarcomeres are connected in series, into each element. Obviously, such an approach requires even greater computational resources.

### New Insights of Cardiac Muscle Relaxation in a Beating Heart

Using the numerical experiments on the single-sarcomere model, spontaneous oscillatory behavior was recovered via the Langevin dynamics model with a simple power-stroke free energy, as in Equation (8) with a constant energy barrier [Eb<sup>1</sup> (θ) ≡ Eb01]. The prominent characteristic of this oscillation is the quick lengthening induced by collective reversal strokes (**Figure 7A**). At first glance, it appears that this mechanism operated by quickly relaxing the muscle against the slow decline of the Ca<sup>2+</sup> concentration (Inset in **Figure 5**). However, the timing of the lengthening events differs from that in the ventricle wall due to the various feedback signals from the local muscle movements, resulting in the slow decline of the LVP (**Figure 11A**). Using the numerical experiments on the ventricular model, we see that the trap mechanism contributes to the synchronization of muscle relaxation by halting sarcomeric lengthening if it occurs earlier than in the neighboring muscle. We also see that the same trap mechanism causes the stretch-activation phenomenon at the tissue level.

## AUTHOR CONTRIBUTIONS

TW and TH: designed the project; TW and RK: designed and conceived the numerical model; TW and J-IO: constructed the simulation code and the input data; TW and SS: wrote the paper with input from TH.

### FUNDING

This work was supported in part by the Ministry of Education, Culture, Sports, Science and Technology of Japan (MEXT) as Priority Issue on Post-K computer (Integrated Computational Life Science to Support Personalized and Preventive Medicine) (Project ID: hp170233). RK's work was supported in part also by the Research Complex Promotion Program.

### ACKNOWLEDGMENTS

The authors thank Louis R. Nemzer, Ph.D., from Edanz Group (www.edanzediting.com/ac) for editing a draft of this manuscript.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fphys.2018.00333/full#supplementary-material

### REFERENCES


macroscopic efficiency in muscle modelling. PLoS Comput. Biol. 12:e1005083. doi: 10.1371/journal.pcbi.1005083


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Washio, Sugiura, Kanada, Okada and Hisada. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# A Numerical Study of Scalable Cardiac Electro-Mechanical Solvers on HPC Architectures

Piero Colli Franzone<sup>1</sup> , Luca F. Pavarino<sup>1</sup> \* and Simone Scacchi <sup>2</sup>

<sup>1</sup> Department of Mathematics, University of Pavia, Pavia, Italy, <sup>2</sup> Department of Mathematics, University of Milano, Milan, Italy

We introduce and study some scalable domain decomposition preconditioners for cardiac electro-mechanical 3D simulations on parallel HPC (High Performance Computing) architectures. The electro-mechanical model of the cardiac tissue is composed of four coupled sub-models: (1) the static finite elasticity equations for the transversely isotropic deformation of the cardiac tissue; (2) the active tension model describing the dynamics of the intracellular calcium, cross-bridge binding and myofilament tension; (3) the anisotropic Bidomain model describing the evolution of the intra- and extra-cellular potentials in the deforming cardiac tissue; and (4) the ionic membrane model describing the dynamics of ionic currents, gating variables, ionic concentrations and stretch-activated channels. This strongly coupled electro-mechanical model is discretized in time with a splitting semi-implicit technique and in space with isoparametric finite elements. The resulting scalable parallel solver is based on Multilevel Additive Schwarz preconditioners for the solution of the Bidomain system and on BDDC preconditioned Newton-Krylov solvers for the non-linear finite elasticity system. The results of several 3D parallel simulations show the scalability of both linear and non-linear solvers and their application to the study of both physiological excitation-contraction cardiac dynamics and re-entrant waves in the presence of different mechano-electrical feedbacks.

Keywords: domain decomposition preconditioners, cardiac electro-mechanics, bidomain model, scalable parallel solvers, re-entrant waves, mechano-electric feedback

### 1. INTRODUCTION

In recent years, several areas of medicine, and in particular cardiology, have undergone a cultural revolution generated by new findings that have emerged from molecular biology. This new knowledge has helped to identify, for each disease and for each patient, the specific mechanisms of the disease and the resulting medical treatments, leading to the so-called personalized medicine. For example, the use of mathematical models with parameters for the individual patient-specific characteristics could allow cardiologists to predict the effectiveness of anti-arrhythmic drug treatments or the proper installation of implantable defibrillators (see e.g., Nordsletten et al., 2011; Constantino et al., 2012; Lamata et al., 2015; Trayanova and Chang, 2016).

The spatio-temporal evolution of the electrical impulse in the cardiac tissue and the subsequent process of cardiac contraction-relaxation are quantitatively described by the cardiac electromechanical coupling model, which consists of the following four sub-models:

#### Edited by:

Mariano Vázquez, Barcelona Supercomputing Center, Spain

#### Reviewed by:

Alfio Maria Quarteroni, Politecnico di Milano, Italy Daniel Hurtado, Pontificia Universidad Católica de Chile, Chile

\*Correspondence:

Luca F. Pavarino luca.pavarino@unipv.it

#### Specialty section:

This article was submitted to Computational Physiology and Medicine, a section of the journal Frontiers in Physiology

Received: 15 December 2017 Accepted: 08 March 2018 Published: 05 April 2018

#### Citation:

Colli Franzone P, Pavarino LF and Scacchi S (2018) A Numerical Study of Scalable Cardiac Electro-Mechanical Solvers on HPC Architectures. Front. Physiol. 9:268. doi: 10.3389/fphys.2018.00268


The theoretical and numerical challenges posed by this complex non-linear electro-mechanical model are very interesting. Indeed, the theoretical analysis of the well-posedness of the cardiac electro-mechanical coupling model is still an open problem, as well as the convergence analysis of its finite element approximation. On the numerical level, the very different space and time scales associated with the electrical and mechanical submodels, as well as their non-linear and multiphysics interactions, make the approximation and simulation of the cardiac electromechanical coupling model a very demanding and expensive computational task.

In the last decade, several groups have performed cardiac computational studies based on three-dimensional electrical and electro-mechanical simulations (see Pathmanathan and Whiteley, 2009; Göktepe and Kuhl, 2010; Keldermann et al., 2010; Gurev et al., 2011; Trayanova et al., 2011; Land et al., 2012b; Nobile et al., 2012; Rossi et al., 2012; Dal et al., 2013; Sundnes et al., 2014; Favino et al., 2016). However, the computational costs required by the solution of the mathematical models describing the cardiac bioelectrical and mechanical activity are still too high to allow their use in a clinical setting. Therefore, there is a strong effort in the research community to develop effective computational tools and to speed up the simulation of the cardiac electro-mechanical activity (see e.g., Vázquez et al., 2011; Lafortune et al., 2012; Washio et al., 2013; Aguado-Sierra et al., 2015; Gurev et al., 2015; Land et al., 2015; Augustin et al., 2016).

Among the most efficient high-performance solvers for these complex cardiac models are parallel iterative methods, such as the Preconditioned Conjugate Gradient method (PCG) and the Generalized Minimal Residual Method (GMRES), accelerated by proper scalable preconditioners. For the bioelectrical component modeled by the Bidomain system, several types of preconditioners have been proposed, such as Block Jacobi (BJ) preconditioners employing an incomplete LU factorization (ILU) for each block (Colli Franzone and Pavarino, 2004), other kinds of block preconditioners (Gerardo-Giorda et al., 2009; Chen et al., 2017), geometric multigrid (Sundnes et al., 2002; Weber dos Santos et al., 2004), algebraic multigrid (Plank et al., 2007; Pennacchio and Simoncini, 2009, 2011), and domain decomposition preconditioners such as Multilevel Schwarz (Pavarino and Scacchi, 2008; Scacchi, 2008, 2011; Munteanu et al., 2009; Pavarino and Scacchi, 2011; Charawi, 2017), Neumann-Neumann and BDDC (Zampini, 2013, 2014). For a general introduction to Domain Decomposition methods we refer the interested reader to the monograph (Toselli and Widlund, 2005). More recently, the study of efficient parallel solvers and preconditioners has been extended also to cardiac electro-mechanical models (see e.g., Colli Franzone et al., 2015; Gurev et al., 2015; Pavarino et al., 2015; Augustin et al., 2016; Colli Franzone et al., 2016a,b, 2017) and to cardiac and cardiovascular flow (see e.g., Quarteroni et al., 2017a,b).

The goal of this work is to study the performance of our parallel electro-mechanical solver in three-dimensional left-ventricular simulations on two different HPC (High Performance Computing) architectures. The finite element parallel solver we have developed is based on Multilevel Additive Schwarz preconditioners accelerated by PCG for solving the discretized Bidomain system and on Newton-Krylov methods with Balancing Domain Decomposition by Constraints (BDDC) preconditioners for solving the discretized non-linear finite elasticity system. Extensive numerical simulations have shown the scalability of both linear and non-linear solvers and their effectiveness in the study of the physiological excitation-contraction cardiac dynamics and of re-entrant waves in the presence of different mechano-electrical feedbacks.

The paper is organized as follows. The main four electromechanical cardiac sub-models are briefly introduced in section 2 and discretized in time and space in section 3, where the main computational kernels, parallel solvers and preconditioners are also described. Section 4 contains the main results of the paper obtained in large-scale 3D simulations using high-performance parallel architectures.

### 2. ELECTRO-MECHANICAL CARDIAC MODELS

We consider a cardiac electro-mechanical coupling model consisting of the following four coupled sub-models; see also **Figure 1**.

#### 2.1. Cardiac Tissue Mechanical Model

We assume a quasi-steady state regime and model the cardiac tissue as a non-linear hyperelastic material satisfying the equilibrium equation

$$\text{Div}(\mathbf{F}\mathbf{S}) = \mathbf{0}, \qquad \mathbf{X} \in \widehat{\Omega}, \tag{1}$$

with appropriate boundary conditions, where we denote by $\mathbf{x} = \mathbf{x}(\mathbf{X}, t)$ the spatial coordinates of the deformed cardiac domain $\Omega(t)$ at time $t$, by $\mathbf{X} = (X\_1, X\_2, X\_3)^T$ the material coordinates of the undeformed cardiac domain $\widehat{\Omega}$, by $\mathbf{F}(\mathbf{X}, t) = \frac{\partial \mathbf{x}}{\partial \mathbf{X}}$ the deformation gradient and by $\mathbf{u}(\mathbf{X}, t) = \mathbf{x} - \mathbf{X}$ the displacement field. Following the active stress approach, the second Piola-Kirchhoff stress tensor $\mathbf{S}$ is written as the sum of passive (pas), volumetric (vol) and active (act) components, i.e.,

$$\mathbf{S} = \mathbf{S}^{pas} + \mathbf{S}^{vol} + \mathbf{S}^{act}.\tag{2}$$

The passive and volumetric terms of **S** are defined as

$$S\_{ij}^{pas,vol} = \frac{1}{2} \left( \frac{\partial W^{pas,vol}}{\partial E\_{ij}} + \frac{\partial W^{pas,vol}}{\partial E\_{ji}} \right), \quad i, j = 1, 2, 3,$$

where $\mathbf{E} = \frac{1}{2}(\mathbf{C} - \mathbf{I})$ is the Green-Lagrange strain tensor, $\mathbf{C} = \mathbf{F}^T\mathbf{F}$ is the right Cauchy-Green tensor, and $W^{pas}$ is an exponential strain energy function describing the myocardium as a transversely isotropic hyperelastic material (derived from the orthotropic law proposed in Holzapfel and Ogden, 2009; Eriksson et al., 2013)

$$\begin{aligned} W^{pas} = {} & \frac{a}{2b} \left( e^{b(I\_1 - 3)} - 1 \right) + \sum\_{i=l,n} \frac{a\_i}{2b\_i} \left( e^{b\_i(I\_{4i} - 1)^2} - 1 \right) \\ & + \frac{a\_{ln}}{2b\_{ln}} \left( e^{b\_{ln}I\_{8ln}^2} - 1 \right), \end{aligned} \tag{3}$$

where $a$, $b$, $a\_{(l,n,ln)}$, $b\_{(l,n,ln)}$ are positive material parameters and

$$I\_{4l} = \hat{\mathbf{a}}\_l^T \mathbf{C} \hat{\mathbf{a}}\_l,\ I\_{4n} = \hat{\mathbf{a}}\_n^T \mathbf{C} \hat{\mathbf{a}}\_n,\ I\_{8ln} = \hat{\mathbf{a}}\_l^T \mathbf{C} \hat{\mathbf{a}}\_n.$$

We did not employ an isochoric-deviatoric decomposition of the deformation gradient tensor. The volumetric term $W^{vol} = K(J - 1)^2$ is a penalization term added to enforce the near incompressibility of the myocardium, where $K$ is a positive bulk modulus and $J = \det\mathbf{F}$. The model is closed by imposing boundary conditions of mixed Dirichlet and traction type.
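As a minimal sketch of how the energy in Equation (3) and the volumetric penalty could be evaluated at a single point, the NumPy snippet below computes the invariants from a given deformation gradient. The material parameters, the bulk modulus, and the function names are placeholders chosen for illustration; they are not the values or the code used in the paper, and `a_n` is simply assumed to be a second reference direction orthogonal to the fiber.

```python
import numpy as np

# Placeholder material parameters (hypothetical values, for illustration only).
PARAMS = {"a": 0.3, "b": 8.0, "a_l": 1.7, "b_l": 15.0,
          "a_n": 0.4, "b_n": 10.0, "a_ln": 0.2, "b_ln": 11.0}
K_BULK = 200.0  # placeholder bulk modulus K


def passive_energy(F, a_l, a_n, p=PARAMS):
    """Exponential passive strain energy W^pas of Equation (3)."""
    C = F.T @ F                      # right Cauchy-Green tensor
    I1 = np.trace(C)                 # first invariant of C
    I4l = a_l @ C @ a_l              # squared stretch along a_l
    I4n = a_n @ C @ a_n              # squared stretch along a_n
    I8ln = a_l @ C @ a_n             # coupling invariant
    W = p["a"] / (2 * p["b"]) * (np.exp(p["b"] * (I1 - 3)) - 1)
    W += p["a_l"] / (2 * p["b_l"]) * (np.exp(p["b_l"] * (I4l - 1) ** 2) - 1)
    W += p["a_n"] / (2 * p["b_n"]) * (np.exp(p["b_n"] * (I4n - 1) ** 2) - 1)
    W += p["a_ln"] / (2 * p["b_ln"]) * (np.exp(p["b_ln"] * I8ln ** 2) - 1)
    return W


def volumetric_energy(F, K=K_BULK):
    """Penalty term W^vol = K (J - 1)^2 with J = det F."""
    J = np.linalg.det(F)
    return K * (J - 1.0) ** 2


a_l = np.array([1.0, 0.0, 0.0])      # reference fiber direction
a_n = np.array([0.0, 0.0, 1.0])      # assumed orthogonal reference direction

F_identity = np.eye(3)
s = 1.1                              # 10% stretch along the fiber, volume preserved
F_stretch = np.diag([s, 1 / np.sqrt(s), 1 / np.sqrt(s)])

W_rest = passive_energy(F_identity, a_l, a_n)
W_stretch = passive_energy(F_stretch, a_l, a_n)
Wvol_stretch = volumetric_energy(F_stretch)
```

Both energies vanish in the undeformed state, and an isochoric stretch leaves the volumetric penalty at zero (up to round-off) while the passive energy becomes positive.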

#### 2.2. Mechanical Active Tension Model

The active tension generation model is based on calcium kinetics and myofilament dynamics. Here we consider the model proposed in Land et al. (2012a), where the active tension $T\_a$ depends on the intracellular calcium concentration $Ca\_i$, the fiber stretch $\lambda = \sqrt{\widehat{\mathbf{a}}\_l^T \mathbf{C} \widehat{\mathbf{a}}\_l}$, the fiber stretch-rate $\frac{d\lambda}{dt}$ and auxiliary variables included in the vector $\mathbf{z}$, i.e.,

$$\begin{cases} \frac{d\mathbf{z}}{dt} = R\_z \left( \mathbf{z}, Ca\_i, \lambda, \frac{d\lambda}{dt} \right), \\ T\_a = f\_{Ta} \left( \mathbf{z}, \lambda, \frac{d\lambda}{dt} \right). \end{cases}$$

The generated active force is assumed to act only along the fiber direction, so the active Cauchy stress is

$$
\sigma^{act}(\mathbf{x}, t) = \ T\_a \ \mathbf{a}\_l(\mathbf{x}) \otimes \mathbf{a}\_l(\mathbf{x}),
$$

where $\mathbf{a}\_l$ is a unit vector parallel to the local fiber direction and $T\_a$ is the active fiber stress associated with the deformed cardiac tissue. In the deformed configuration, the unit vector parallel to the local fiber direction can be written as

$$\mathbf{a}\_{l} = \frac{\mathbf{F}\widehat{\mathbf{a}}\_{l}}{||\mathbf{F}\widehat{\mathbf{a}}\_{l}||} = \frac{\mathbf{F}\widehat{\mathbf{a}}\_{l}}{\sqrt{\widehat{\mathbf{a}}\_{l}^{T}\mathbf{C}\widehat{\mathbf{a}}\_{l}}},\tag{4}$$

where $\widehat{\mathbf{a}}\_l$ is the fiber direction in the reference configuration. Then the active stress component $\mathbf{S}^{act}$ of the second Piola-Kirchhoff tensor is given by

$$\mathbf{S}^{act} = J \, \mathbf{F}^{-1} \sigma^{act} \mathbf{F}^{-T} = J \, T\_a \frac{\widehat{\mathbf{a}}\_l \otimes \widehat{\mathbf{a}}\_l}{\widehat{\mathbf{a}}\_l^T \mathbf{C} \widehat{\mathbf{a}}\_l}$$
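The pull-back above can be checked numerically. The sketch below builds the active Cauchy stress from the deformed fiber of Equation (4), pulls it back with $J\mathbf{F}^{-1}(\cdot)\mathbf{F}^{-T}$, and compares the result with the closed form. The deformation gradient, the fiber direction, and the value of the active tension are arbitrary illustrative choices, not data from the simulations.

```python
import numpy as np

# Arbitrary illustrative deformation gradient with positive determinant.
F = np.array([[1.10, 0.05, 0.00],
              [0.02, 0.95, 0.03],
              [0.00, 0.04, 1.02]])
a_hat = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)   # unit reference fiber
T_a = 50.0                                          # hypothetical active tension

J = np.linalg.det(F)
C = F.T @ F
stretch_sq = a_hat @ C @ a_hat                      # lambda^2

# Equation (4): unit fiber vector in the deformed configuration.
a_l = F @ a_hat / np.sqrt(stretch_sq)

# Active Cauchy stress acting along the deformed fiber.
sigma_act = T_a * np.outer(a_l, a_l)

# Pull-back to the second Piola-Kirchhoff stress.
F_inv = np.linalg.inv(F)
S_pullback = J * F_inv @ sigma_act @ F_inv.T

# Closed form stated in the text.
S_closed = J * T_a * np.outer(a_hat, a_hat) / stretch_sq

print(np.allclose(S_pullback, S_closed))
```

The two expressions agree because $\mathbf{F}^{-1}\mathbf{F}\widehat{\mathbf{a}}\_l = \widehat{\mathbf{a}}\_l$, so the deformation gradient cancels from the dyadic product and only the normalization by $\lambda^2$ remains.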

#### 2.3. The Bioelectrical Bidomain Model

We denote by $v$, $u\_e$, $\mathbf{w}$, $\mathbf{c}$ the transmembrane potential, the extracellular potential, the gating and ionic concentration variables on the deformed configuration and by $\widehat{v}$, $\widehat{u}\_e$, $\widehat{\mathbf{w}}$, $\widehat{\mathbf{c}}$ the same quantities on the reference configuration. The Bidomain model, written on the deformed configuration $\Omega(t)$, is given in its parabolic-elliptic formulation by

$$\begin{cases} c\_m \frac{\partial v}{\partial t} - \operatorname{div}(D\_i \nabla(v + u\_e)) + i\_{ion}(v, \mathbf{w}, \mathbf{c}, \lambda) = i\_{app}^i, \\ - \operatorname{div}(D\_i \nabla v) - \operatorname{div}((D\_i + D\_e) \nabla u\_e) = i\_{app}^i + i\_{app}^e, \end{cases} \tag{5}$$

where $c\_m$ and $i\_{ion}$ are the membrane capacitance and ionic current per unit volume, respectively. We apply insulating boundary conditions on $\partial\Omega(t)$, i.e.,

$$\mathbf{n}^T D\_i \nabla (v + u\_e) = 0 \qquad \text{and} \qquad \mathbf{n}^T D\_e \nabla u\_e = 0,$$

with $\mathbf{n}$ being the normal to $\partial\Omega(t)$. In order to satisfy the compatibility condition $\int\_{\Omega(t)} (i\_{app}^i + i\_{app}^e)\,d\mathbf{x} = 0$, we choose $i\_{app}^i = -i\_{app}^e = i\_{app}$; see e.g., Colli Franzone et al. (2014). In the Lagrangian framework, after the pull-back on the reference configuration $\widehat{\Omega} \times (0, T)$, this system becomes

$$\begin{cases} c\_m J \left(\frac{\partial \widehat{v}}{\partial t} - \mathbf{F}^{-T} \operatorname{Grad}\widehat{v} \cdot \mathbf{V}\right) - \operatorname{Div}(J \mathbf{F}^{-1} \widehat{D}\_i \mathbf{F}^{-T} \operatorname{Grad}(\widehat{v} + \widehat{u}\_e)) \\ \qquad + i\_{ion}(\widehat{v}, \widehat{\mathbf{w}}, \widehat{\mathbf{c}}, \lambda) = \widehat{i}\_{app}, \\ -\operatorname{Div}(J \mathbf{F}^{-1} \widehat{D}\_i \mathbf{F}^{-T} \operatorname{Grad}\widehat{v}) - \operatorname{Div}(J \mathbf{F}^{-1} (\widehat{D}\_i + \widehat{D}\_e) \mathbf{F}^{-T} \operatorname{Grad}\widehat{u}\_e) = 0, \end{cases} \tag{6}$$

where $\mathbf{V} = \frac{\partial \mathbf{u}}{\partial t}$ is the rate of deformation; see Colli Franzone et al. (2016a) for the detailed derivation. These two partial differential equations (PDEs) are coupled through the reaction term $i\_{ion}$ with the ODE system of the membrane model, given in $\Omega(t) \times (0, T)$ by

$$\frac{\partial \mathbf{w}}{\partial t} - \mathbf{R}\_{\mathbf{w}}(v, \mathbf{w}) = 0, \quad \frac{\partial \mathbf{c}}{\partial t} - \mathbf{R}\_{\mathbf{c}}(v, \mathbf{w}, \mathbf{c}) = 0. \tag{7}$$

The bioelectrical system (Equations 6, 7) is completed by prescribing initial conditions on $\widehat{v}$, $\mathbf{w}$, $\mathbf{c}$, insulating boundary conditions on $\widehat{u}\_e$, $\widehat{u}\_i = \widehat{v} + \widehat{u}\_e$, and the intra- and extracellular applied current $\widehat{i}\_{app} = \widehat{i}\_{app}^i = -\widehat{i}\_{app}^e$. We recall that the extracellular potential $\widehat{u}\_e$ is defined only up to a time-dependent constant in space $R(t)$, which can be determined by choosing a reference potential. Here we select as a reference potential the average of the extracellular potential over the cardiac volume, i.e., we require $\int\_{\widehat{\Omega}} \widehat{u}\_e(\mathbf{X}, t) J(\mathbf{X}, t)\,d\mathbf{X} = 0$. Assuming transversely isotropic properties of the intra- and extracellular media, the conductivity tensors on the deformed configuration are given by

$$D\_{i,e} = \sigma\_t^{i,e} I + (\sigma\_l^{i,e} - \sigma\_t^{i,e})\, \mathbf{a}\_l \otimes \mathbf{a}\_l,$$

where $\sigma\_l^{i,e}$, $\sigma\_t^{i,e}$ are the intra- and extracellular conductivity coefficients measured along the fiber direction $\mathbf{a}\_l$ and any cross-fiber direction, respectively. From Equation (4), it follows that the tensors $D\_{i,e}(\mathbf{x}, t)$ written on the reference configuration are

$$\widehat{D}\_{i,e}(\mathbf{X},t) = D\_{i,e}(\mathbf{x}(\mathbf{X},t),t) = \sigma\_t^{i,e} \, I + (\sigma\_l^{i,e} - \sigma\_t^{i,e}) \frac{\mathbf{F} \widehat{\mathbf{a}}\_l \widehat{\mathbf{a}}\_l^T \mathbf{F}^T}{\widehat{\mathbf{a}}\_l^T \mathbf{C} \widehat{\mathbf{a}}\_l}. \tag{8}$$

Therefore, the equivalent conductivity tensors appearing in the Bidomain model written in the reference configuration are given by

$$J\mathbf{F}^{-1}\widehat{D}\_{i,e}(\mathbf{X},t)\mathbf{F}^{-T} = J \left( \sigma\_{t}^{i,e}\mathbf{C}^{-1} + (\sigma\_{l}^{i,e} - \sigma\_{t}^{i,e})\frac{\widehat{\mathbf{a}}\_{l}\widehat{\mathbf{a}}\_{l}^{T}}{\widehat{\mathbf{a}}\_{l}^{T}\mathbf{C}\widehat{\mathbf{a}}\_{l}} \right).\tag{9}$$

For the values of the conductivity coefficients of the Bidomain model, see Colli Franzone et al. (2016a).
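The step from the deformed-configuration tensor to the equivalent reference tensor can be verified numerically. The sketch below forms $D = \sigma\_t I + (\sigma\_l - \sigma\_t)\,\mathbf{a}\_l \otimes \mathbf{a}\_l$ with the deformed fiber of Equation (4), applies $J\mathbf{F}^{-1}(\cdot)\mathbf{F}^{-T}$, and compares with $J\,[\sigma\_t \mathbf{C}^{-1} + (\sigma\_l - \sigma\_t)\,\widehat{\mathbf{a}}\_l\widehat{\mathbf{a}}\_l^T / (\widehat{\mathbf{a}}\_l^T\mathbf{C}\widehat{\mathbf{a}}\_l)]$, which is what the algebra yields. The conductivities and the deformation gradient are placeholders, not the values of the paper.

```python
import numpy as np

# Arbitrary illustrative deformation gradient and reference fiber.
F = np.array([[1.10, 0.05, 0.00],
              [0.02, 0.95, 0.03],
              [0.00, 0.04, 1.02]])
a_hat = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)

sigma_l, sigma_t = 3.0, 0.3    # placeholder conductivities (arbitrary units)

J = np.linalg.det(F)
C = F.T @ F
stretch_sq = a_hat @ C @ a_hat

# Conductivity tensor on the deformed configuration (fiber from Equation 4).
a_l = F @ a_hat / np.sqrt(stretch_sq)
D = sigma_t * np.eye(3) + (sigma_l - sigma_t) * np.outer(a_l, a_l)

# Equivalent tensor seen by the Bidomain model on the reference configuration.
F_inv = np.linalg.inv(F)
D_equiv = J * F_inv @ D @ F_inv.T

# Closed form obtained by carrying out the products analytically.
D_closed = J * (sigma_t * np.linalg.inv(C)
                + (sigma_l - sigma_t) * np.outer(a_hat, a_hat) / stretch_sq)

print(np.allclose(D_equiv, D_closed))
```

The isotropic part maps to $\mathbf{C}^{-1}$ because $\mathbf{F}^{-1}\mathbf{F}^{-T} = (\mathbf{F}^T\mathbf{F})^{-1}$, and the resulting tensor stays symmetric, as required of a conductivity tensor.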

#### 2.4. The Ionic Membrane Model and Stretch-Activated Channel Currents

The ionic current in the Bidomain model (Equation 6) is given by $i\_{ion} = \chi I\_{ion}$, where $\chi$ is the membrane surface to volume ratio and the ionic current per unit area of the membrane surface $I\_{ion}$ is given by the sum $I\_{ion}(v, \mathbf{w}, \mathbf{c}, \lambda) = I\_{ion}^m(v, \mathbf{w}, \mathbf{c}) + I\_{sac}$ of two terms: the ionic term $I\_{ion}^m(v, \mathbf{w}, \mathbf{c})$ given by the ten Tusscher model (TP06) (ten Tusscher et al., 2004; ten Tusscher and Panfilov, 2006), available from the CellML repository (models.cellml.org/cellml), and a stretch-activated current term $I\_{sac}$. The TP06 ionic model also specifies the functions $\mathbf{R}\_{\mathbf{w}}(v, \mathbf{w})$ and $\mathbf{R}\_{\mathbf{c}}(v, \mathbf{w}, \mathbf{c})$ in the ODE system (Equation 7), consisting of 17 ordinary differential equations modeling the main ionic currents dynamics.

The stretch-activated current (SAC) is modeled as the sum of a non-selective and a potassium-selective current

$$I\_{sac} = I\_{ns} + I\_{Ko},$$

as in Niederer and Smith (2007). The non-selective SAC current is defined by

$$I\_{ns} = I\_{ns,Na} + I\_{ns,K} = g\_{ns} \, \gamma\_{sl}(\lambda) \, [\, r \, (v - v\_{Na}) + (v - v\_{K}) \,],$$

with γsl(λ) = 10max(λ−1, 0), gns = 4.13 · 10−<sup>3</sup> mS/cm<sup>2</sup> and the value of r measures the relative conductance of the ions Na<sup>+</sup> and K <sup>+</sup> and determines the reversal potential vns of Ins, varying the degree of expression of the ions Na<sup>+</sup> and K <sup>+</sup>. We have chosen r = 0.2.

The K <sup>+</sup> selective SAC current is defined by

$$I\_{Ko} = g\_{Ko} \frac{\gamma\_{\text{SL,Ko}}}{1 + \exp(-(10 + v)/45)} (v - v\_K),$$

where g<sub>Ko</sub> = 1.2 · 10<sup>−2</sup> mS/cm<sup>2</sup> and γ<sub>SL,Ko</sub> = 3 max(λ − 1, 0) + 0.7.
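As an illustration only, the two stretch-activated currents can be evaluated with a few lines of Python. This is a minimal sketch, not the authors' implementation: the function names are hypothetical, the reversal potentials `v_na` and `v_k` are treated as inputs (in the full model they follow from the TP06 concentrations), and the values used in the usage note are purely illustrative.

```python
import math

G_NS = 4.13e-3  # mS/cm^2, non-selective SAC conductance (value from the text)
G_KO = 1.2e-2   # mS/cm^2, K+-selective SAC conductance (value from the text)
R_REL = 0.2     # relative Na+/K+ conductance r (value from the text)


def gamma_sl(lam):
    """Stretch factor of the non-selective current: 10 * max(lambda - 1, 0)."""
    return 10.0 * max(lam - 1.0, 0.0)


def gamma_sl_ko(lam):
    """Stretch factor of the K+-selective current: 3 * max(lambda - 1, 0) + 0.7."""
    return 3.0 * max(lam - 1.0, 0.0) + 0.7


def i_ns(v, lam, v_na, v_k, r=R_REL):
    """Non-selective SAC current (v in mV gives a current density in uA/cm^2)."""
    return G_NS * gamma_sl(lam) * (r * (v - v_na) + (v - v_k))


def i_ko(v, lam, v_k):
    """K+-selective SAC current."""
    gate = 1.0 + math.exp(-(10.0 + v) / 45.0)
    return G_KO * gamma_sl_ko(lam) / gate * (v - v_k)


def i_sac(v, lam, v_na, v_k):
    """Total stretch-activated current I_sac = I_ns + I_Ko."""
    return i_ns(v, lam, v_na, v_k) + i_ko(v, lam, v_k)


def v_ns(v_na, v_k, r=R_REL):
    """Reversal potential of I_ns, obtained from r (v - v_na) + (v - v_k) = 0."""
    return (r * v_na + v_k) / (1.0 + r)
```

For example, `i_sac(-80.0, 1.1, 70.0, -85.0)` evaluates the total current for a 10% fiber stretch with hypothetical reversal potentials of 70 mV and −85 mV. Without stretch (λ ≤ 1) the non-selective term vanishes, while the K<sup>+</sup>-selective term keeps its baseline factor 0.7.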

#### 3. NUMERICAL METHODS

#### 3.1. Space and Time Discretization

#### 3.1.1. Domain Geometry

We consider an idealized left ventricular geometry Ω̂ = Ω(0), modeled as a truncated ellipsoid described in ellipsoidal coordinates by the parametric equations

$$\begin{cases} x = a(r)\cos\theta\cos\phi & \phi\_{\text{min}} \le \phi \le \phi\_{\text{max}},\\ y = b(r)\cos\theta\sin\phi & \theta\_{\text{min}} \le \theta \le \theta\_{\text{max}},\\ z = c(r)\sin\theta & 0 \le r \le 1. \end{cases}$$

Here a(r) = a<sub>1</sub> + r(a<sub>2</sub> − a<sub>1</sub>), b(r) = b<sub>1</sub> + r(b<sub>2</sub> − b<sub>1</sub>), c(r) = c<sub>1</sub> + r(c<sub>2</sub> − c<sub>1</sub>), with a<sub>1</sub> = b<sub>1</sub> = 1.5, a<sub>2</sub> = b<sub>2</sub> = 2.7, c<sub>1</sub> = 4.4, c<sub>2</sub> = 5 (all in cm), and φ<sub>min</sub> = −π/2, φ<sub>max</sub> = 3π/2, θ<sub>min</sub> = −3π/8, θ<sub>max</sub> = π/8. We will refer to the inner surface of the truncated ellipsoid (r = 0) as the endocardium and to the outer surface (r = 1) as the epicardium. Proceeding counterclockwise from epicardium to endocardium, the cardiac fibers rotate intramurally, linearly with depth, by a total of 120°. Considering a local ellipsoidal reference system (**e**<sub>φ</sub>, **e**<sub>θ</sub>, **e**<sub>r</sub>), the fiber direction **a**<sub>l</sub>(**x**) at a point **x** is given by **a**<sub>l</sub>(**x**) = **b**<sub>l</sub>(**x**) cos(β) + **n**(**x**) sin(β), where

$$\mathbf{b}\_{l}(\mathbf{x}) = \mathbf{e}\_{\phi} \cos \alpha(r) + \mathbf{e}\_{\theta} \sin \alpha(r), \text{ with}$$

$$\alpha(r) = \frac{2}{3}\pi(1 - r) - \frac{\pi}{4}, \qquad 0 \le r \le 1,$$

**n**(**x**) is the unit outward normal to the ellipsoidal surface at **x** and β is the imbrication angle given by β = arctan(cos α tan γ), with γ = θ(1 − r)60/π.
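The geometric definitions above can be checked numerically. The following is a minimal sketch, with hypothetical function names, that maps ellipsoidal coordinates (r, θ, φ) to Cartesian coordinates and evaluates the in-surface fiber angle α(r); it does not build the full fiber field, which also requires the local basis vectors and the imbrication angle.

```python
import math

# Semi-axes in cm at the endocardium (index 1) and epicardium (index 2), from the text.
A1, A2 = 1.5, 2.7
B1, B2 = 1.5, 2.7
C1, C2 = 4.4, 5.0


def semi_axes(r):
    """Linear interpolation of the semi-axes across the wall, 0 <= r <= 1."""
    return (A1 + r * (A2 - A1), B1 + r * (B2 - B1), C1 + r * (C2 - C1))


def point(r, theta, phi):
    """Cartesian coordinates (x, y, z) of the point with ellipsoidal coordinates (r, theta, phi)."""
    a, b, c = semi_axes(r)
    return (a * math.cos(theta) * math.cos(phi),
            b * math.cos(theta) * math.sin(phi),
            c * math.sin(theta))


def alpha(r):
    """In-surface fiber angle alpha(r) = (2/3) pi (1 - r) - pi/4, in radians."""
    return 2.0 * math.pi / 3.0 * (1.0 - r) - math.pi / 4.0
```

With these definitions, `math.degrees(alpha(0.0) - alpha(1.0))` recovers the total transmural rotation of 120° stated in the text, and `point(0.0, 0.0, 0.0)` lies on the endocardial equator at x = 1.5 cm.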

#### 3.1.2. Time Discretization

The time discretization of the electromechanical model is performed by the following semi-implicit splitting method, in which different electrical and mechanical time steps can be used.

(a) given v<sup>n</sup>, **w**<sup>n</sup>, **c**<sup>n</sup> at time step t<sub>n</sub>, we compute the new variables **w**<sup>n+1</sup>, **c**<sup>n+1</sup> by solving the ODE system of the ionic membrane model (Equation 7) with a first-order implicit-explicit (IMEX) method, i.e.,

$$\begin{cases} \frac{\mathbf{w}^{n+1} - \mathbf{w}^n}{\Delta t} - \mathbf{R}\_{\mathbf{w}}(v^n, \mathbf{w}^{n+1}) = 0, \\ \frac{\mathbf{c}^{n+1} - \mathbf{c}^n}{\Delta t} - \mathbf{R}\_{\mathbf{c}}(v^n, \mathbf{w}^{n+1}, \mathbf{c}^n) = 0; \end{cases}$$

(b) given the calcium concentration Ca<sub>i</sub><sup>n+1</sup>, which is part of the vector of concentration variables **c**<sup>n+1</sup>, we compute the new deformed coordinates **x**<sup>n+1</sup>, providing the new deformation gradient tensor **F**<sub>n+1</sub>, by solving the variational formulation of the mechanical problem (Equation 1) and the active tension system, i.e.,

$$\begin{cases} \mathbf{z}^{n+1} = \mathbf{z}^n + \Delta t \, \mathbf{R}\_z \left( \mathbf{z}^{n+1}, Ca\_i^{n+1}, \lambda^{n+1}, \frac{\lambda^{n+1} - \lambda^n}{\Delta t\_n} \right), \\ T\_a^{n+1} = f\_{T\_a} \left( \mathbf{z}^{n+1}, \lambda^{n+1}, \frac{\lambda^{n+1} - \lambda^n}{\Delta t\_n} \right), \\ \text{Div}(\mathbf{F}\_{n+1} \mathbf{S}\_{n+1}) = 0; \end{cases}$$

(c) given **w**<sup>n+1</sup>, **c**<sup>n+1</sup>, **F**<sub>n+1</sub> and J<sub>n+1</sub> = det(**F**<sub>n+1</sub>), we compute the new electric potentials v<sup>n+1</sup>, u<sub>e</sub><sup>n+1</sup> by solving the variational formulation of the Bidomain system (Equation 6) with a first-order IMEX and operator splitting method, which decouples the parabolic from the elliptic equation, i.e.,

$$\begin{cases} -\text{Div}(J\_{n+1}\mathbf{F}\_{n+1}^{-1}\widehat{D}\_{i}\mathbf{F}\_{n+1}^{-T}\operatorname{Grad}\widehat{v}^{n}) - \text{Div}(J\_{n+1}\mathbf{F}\_{n+1}^{-1}(\widehat{D}\_{i}+\widehat{D}\_{e})\mathbf{F}\_{n+1}^{-T}\operatorname{Grad}\widehat{u}\_{e}^{n}) = 0, \\ c\_{m}J\_{n+1}\left(\frac{\widehat{v}^{n+1}-\widehat{v}^{n}}{\Delta t} - \mathbf{F}\_{n+1}^{-T}\operatorname{Grad}\widehat{v}^{n}\cdot\mathbf{V}^{n+1}\right) - \text{Div}(J\_{n+1}\mathbf{F}\_{n+1}^{-1}\widehat{D}\_{i}\mathbf{F}\_{n+1}^{-T}\operatorname{Grad}(\widehat{v}^{n+1}+\widehat{u}\_{e}^{n+1})) + \\ J\_{n+1}i\_{\operatorname{ion}}(\widehat{v}^{n},\widehat{\mathbf{w}}^{n+1},\widehat{\mathbf{c}}^{n+1},\lambda^{n+1}) = J\_{n+1}\widehat{i}\_{\operatorname{app}}^{n+1}. \end{cases}$$

In our simulations, we use the electrical time step size Δ<sub>e</sub>t = 0.05 ms and a mechanical time step five times larger, Δ<sub>m</sub>t = 0.25 ms. To approximate the convective term in the variational formulation of Equation (6), an upwind discretization strategy is employed. We refer to Colli Franzone et al. (2015) and Colli Franzone et al. (2016a) for more details about the numerical scheme.
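To make the IMEX idea in step (a) and the two time-step sizes concrete, here is a minimal sketch. It assumes a single gating variable of Hodgkin-Huxley type, dw/dt = (w<sub>∞</sub>(v) − w)/τ(v), which is an illustrative stand-in for the TP06 right-hand side R<sub>w</sub>, and a hypothetical scheduling helper in which the mechanical solve is triggered once every five electrical steps. None of the names below come from the authors' code.

```python
DT_E = 0.05  # ms, electrical time step (value from the text)
DT_M = 0.25  # ms, mechanical time step (value from the text)
STEPS_PER_MECH = round(DT_M / DT_E)  # electrical steps per mechanical step


def imex_gate_step(w, w_inf, tau, dt=DT_E):
    """One first-order IMEX step for dw/dt = (w_inf - w) / tau.

    w is treated implicitly, while w_inf and tau are frozen at the old
    potential v^n (explicit in v), so the update has a closed form.
    """
    return (w + dt * w_inf / tau) / (1.0 + dt / tau)


def kernels_for_step(n):
    """Kernels executed when advancing from electrical step n to n + 1.

    Hypothetical schedule: ionic ODEs and Bidomain solve at every electrical
    step, mechanics only when a whole mechanical step has elapsed.
    """
    kernels = ["ionic ODEs (a)"]
    if (n + 1) % STEPS_PER_MECH == 0:
        kernels.append("mechanics (b)")
    kernels.append("Bidomain (c)")
    return kernels
```

Because w is implicit, the update stays between the old value and w<sub>∞</sub> for any positive step size, which is the practical reason for treating the stiff gating dynamics implicitly.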

#### 3.1.3. Space Discretization

The cardiac domain is discretized with a structured hexahedral grid T<sub>hm</sub> for the mechanical model (Equation 1) and T<sub>he</sub> for the Bidomain model (Equation 6), where T<sub>he</sub> is a refinement of T<sub>hm</sub>, i.e., the mechanical mesh size h<sub>m</sub> is an integer multiple of the electrical mesh size h<sub>e</sub>. We consider the variational formulations of both mechanical and bioelectrical models and then approximate all scalar and vector fields by isoparametric Q<sub>1</sub> finite elements in space. In all our simulations, we employ an electrical mesh size h<sub>e</sub> = 0.01 cm in order to properly resolve the sharp excitation front, while the smoother mechanical deformation allows us to use a coarser mechanical mesh of size h<sub>m</sub> = 0.08 cm. The resulting electrical mesh consists of N<sub>φ</sub> × N<sub>θ</sub> × N<sub>k</sub> elements, whose values will be specified in each numerical test reported in the Results section.

#### 3.2. Computational Kernels and Parallel Solvers

At each time step of the space-time discretization described above, the two main computational kernels are:

(a) the solution of a non-linear system arising from the discretization of the mechanical problem (1); to this end, we use a parallel Newton-Krylov-BDDC (NK-BDDC) solver, where the Krylov method chosen is GMRES and the BDDC preconditioner will be described in the next sections;

(b) the solution of two linear systems deriving from the discretization of the elliptic and parabolic equations in the Bidomain model (Equation 6); to this end, we use a parallel Preconditioned Conjugate Gradient (PCG) method, with a Multilevel Additive Schwarz preconditioner for the very ill-conditioned elliptic system and a Block-Jacobi preconditioner for the easier parabolic system.

The parallelization of these two main computational kernels of our electro-mechanical solver is based on the parallel library PETSc (Balay et al., 2012) from the Argonne National Laboratory. All the parallel simulations have been performed on the high-performance supercomputers and Linux clusters described in the Results section. For the parallel implementation of the BDDC preconditioner, see Zampini (2016).
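The structure of kernel (b) for the parabolic system can be illustrated serially. The sketch below is not the PETSc-based solver: it is a plain-Python PCG with a Block-Jacobi preconditioner, applied to a hypothetical one-dimensional model matrix tridiag(−1, 2.1, −1) that is symmetric positive definite and well conditioned, loosely mimicking a mass-plus-stiffness parabolic matrix. All names and sizes are assumptions chosen for illustration.

```python
import math


def tridiag_matvec(diag, off, x):
    """y = A x for the symmetric tridiagonal matrix A = tridiag(off, diag, off)."""
    y = [diag * xi for xi in x]
    for i in range(len(x) - 1):
        y[i] += off * x[i + 1]
        y[i + 1] += off * x[i]
    return y


def tridiag_solve(diag, off, rhs):
    """Thomas algorithm for tridiag(off, diag, off) y = rhs."""
    n = len(rhs)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = off / diag, rhs[0] / diag
    for i in range(1, n):
        denom = diag - off * c[i - 1]
        c[i] = off / denom
        d[i] = (rhs[i] - off * d[i - 1]) / denom
    y = [0.0] * n
    y[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        y[i] = d[i] - c[i] * y[i + 1]
    return y


def block_jacobi(diag, off, r, block_size):
    """Apply M^{-1} r, where M keeps only the diagonal blocks of A (one block per 'processor')."""
    z = []
    for start in range(0, len(r), block_size):
        z.extend(tridiag_solve(diag, off, r[start:start + block_size]))
    return z


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pcg(apply_a, apply_m_inv, b, tol=1e-10, max_it=500):
    """Preconditioned conjugate gradient with zero initial guess; returns (x, iterations)."""
    x = [0.0] * len(b)
    r = list(b)
    z = apply_m_inv(r)
    p = list(z)
    rz = dot(r, z)
    b_norm = math.sqrt(dot(b, b))
    for it in range(1, max_it + 1):
        ap = apply_a(p)
        step = rz / dot(p, ap)
        x = [xi + step * pi for xi, pi in zip(x, p)]
        r = [ri - step * api for ri, api in zip(r, ap)]
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, it
        z = apply_m_inv(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_it


DIAG, OFF, N = 2.1, -1.0, 64  # hypothetical model problem
b = [1.0] * N
x, iters = pcg(lambda v: tridiag_matvec(DIAG, OFF, v),
               lambda r: block_jacobi(DIAG, OFF, r, block_size=8), b)
```

Each diagonal block plays the role of the local problem owned by one processor, so the preconditioner needs no communication. That makes it cheap, but it has no coarse level, which is why it is reserved for the better-conditioned parabolic system.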

### 3.3. Multilevel Additive Schwarz Preconditioners

We now describe the Multilevel Additive Schwarz preconditioner employed in the PCG solution of the elliptic kernel (b) associated with the Bidomain system. Let Ω<sup>k</sup>, k = 0, ..., ℓ − 1, be a family of ℓ nested triangulations of Ω, with finer mesh sizes from level 0 to ℓ − 1, and let A<sup>k</sup> be the matrix obtained by discretizing the second equation of Equation (6) on Ω<sup>k</sup>; we have A<sup>ℓ−1</sup> = A<sub>bid</sub>, where A<sub>bid</sub> is the stiffness matrix related to the elliptic equation of Equation (6) discretized on the fine mesh. Denote by R<sup>k</sup> the restriction operators from Ω<sup>ℓ−1</sup> to Ω<sup>k</sup>. We decompose each grid Ω<sup>k</sup>, for k = 1, ..., ℓ − 1, into N<sub>k</sub> overlapping subgrids Ω<sup>k</sup><sub>i</sub> for i = 1, ..., N<sub>k</sub>, such that the overlap size δ<sup>k</sup> at level k = 1, ..., ℓ − 1 equals the mesh size h<sup>k</sup> of the grid Ω<sup>k</sup>. We denote by R<sup>k</sup><sub>i</sub> the restriction operator from Ω<sup>ℓ−1</sup> to Ω<sup>k</sup><sub>i</sub> and define A<sup>k</sup><sub>i</sub> := R<sup>k</sup><sub>i</sub> A<sup>k</sup> R<sup>k</sup><sub>i</sub><sup>T</sup>. The Multilevel Additive Schwarz (MAS(ℓ)) preconditioner is given by

$$B\_{MAS}^{-1} := {R^{0}}^{T} {A^{0}}^{-1} R^{0} + \sum\_{k=1}^{\ell-1} \sum\_{i=1}^{N\_k} {R\_i^{k}}^{T} {A\_i^{k}}^{-1} R\_i^k.$$

The resulting PCG algorithm has a convergence rate independent of the number of subdomains N<sub>k</sub> (scalability) and of the number of levels ℓ (multilevel optimality), while it depends linearly on the ratio H<sup>k</sup>/h<sup>k</sup> of subdomain to element size on level k (optimality); see Pavarino and Scacchi (2008), Scacchi (2008), and Pavarino and Scacchi (2011) for the theoretical details.
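The additive structure of the preconditioner can be seen in a two-level (ℓ = 2) toy version. The sketch below uses a one-dimensional Laplacian, four overlapping index blocks as subgrids, and a coarse space of three piecewise linear hat functions. Two simplifications are assumptions of this sketch and not statements about the authors' solver: the coarse matrix is built by the Galerkin product P<sup>T</sup> A P (with R<sup>0</sup> = P<sup>T</sup>) rather than by rediscretization, and all local and coarse problems are solved by dense Gaussian elimination.

```python
def solve_dense(A, b):
    """Gaussian elimination with partial pivoting; returns x such that A x = b."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        piv = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[piv] = M[piv], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def laplacian_1d(n):
    """Dense tridiag(-1, 2, -1) matrix of size n."""
    return [[2.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0)
             for j in range(n)] for i in range(n)]


def make_two_level_as(A, subdomains, P):
    """Return a function r -> B^{-1} r = P A0^{-1} P^T r + sum_i R_i^T A_i^{-1} R_i r."""
    n, nc = len(A), len(P[0])
    AP = [[sum(A[i][k] * P[k][c] for k in range(n)) for c in range(nc)] for i in range(n)]
    A0 = [[sum(P[i][a] * AP[i][c] for i in range(n)) for c in range(nc)] for a in range(nc)]
    local = [(idx, [[A[i][j] for j in idx] for i in idx]) for idx in subdomains]

    def apply(r):
        # Coarse correction: restrict, solve on the coarse space, interpolate back.
        r0 = [sum(P[i][c] * r[i] for i in range(n)) for c in range(nc)]
        y0 = solve_dense(A0, r0)
        z = [sum(P[i][c] * y0[c] for c in range(nc)) for i in range(n)]
        # Local corrections: independent solves on the overlapping subgrids, summed up.
        for idx, Ai in local:
            yi = solve_dense(Ai, [r[i] for i in idx])
            for loc, i in enumerate(idx):
                z[i] += yi[loc]
        return z

    return apply


n, ratio = 15, 4  # 15 fine nodes, coarse nodes at fine indices 3, 7, 11
A = laplacian_1d(n)
P = [[max(0.0, 1.0 - abs(i - (ratio * c + ratio - 1)) / ratio) for c in range(3)]
     for i in range(n)]
subdomains = [list(range(0, 5)), list(range(3, 9)), list(range(7, 13)), list(range(11, 15))]
apply_mas = make_two_level_as(A, subdomains, P)
```

Every term in the sum is an independent solve, so in a parallel setting all local problems and the coarse problem can be processed concurrently. The resulting operator is symmetric positive definite, which is what allows its use inside PCG.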

### 3.4. Iterative Substructuring, Schur Complement System and BDDC Preconditioners

We then turn to the BDDC preconditioner used in the mechanical computational kernel (a) above, i.e., the Jacobian system arising at each iteration of the Newton method applied to the non-linear elasticity system (Equation 1). For the sake of simplicity, in the following sections we will denote the reference domain by Ω instead of Ω̂. We consider a decomposition of Ω into N non-overlapping subdomains Ω<sub>i</sub> of diameter H<sub>i</sub>,

$$\Omega = \bigcup\_{i=1}^{N} \Omega\_{i},$$

and set H = max<sub>i</sub> H<sub>i</sub>. We first reduce the Jacobian system

$$Kx = f,\tag{10}$$

arising at each Newton step of the mechanical solver, to the interface

$$\Gamma := \left(\bigcup\_{i=1}^N \partial \Omega\_i\right) \setminus \partial \Omega, $$

by eliminating the interior degrees of freedom (dofs) associated with the basis functions having support in each subdomain's interior and obtaining the Schur complement system

$$\mathbf{S}\_{\Gamma}\mathbf{x}\_{\Gamma} = \mathbf{g}\_{\Gamma}.\tag{11}$$

Here S<sub>Γ</sub> = K<sub>ΓΓ</sub> − K<sub>ΓI</sub> K<sub>II</sub><sup>−1</sup> K<sub>IΓ</sub> and g<sub>Γ</sub> = f<sub>Γ</sub> − K<sub>ΓI</sub> K<sub>II</sub><sup>−1</sup> f<sub>I</sub> are obtained from the global system (Equation 10) by reordering the finite element basis functions into interior (denoted by the subscript I) and interface (denoted by the subscript Γ) basis functions

$$
\begin{pmatrix} K\_{II} & K\_{I\Gamma} \\ K\_{\Gamma I} & K\_{\Gamma\Gamma} \end{pmatrix} \begin{pmatrix} x\_I \\ x\_\Gamma \end{pmatrix} = \begin{pmatrix} f\_I \\ f\_\Gamma \end{pmatrix}.\tag{12}
$$

The Schur complement system (Equation 11) is solved iteratively by the GMRES method, where only the action of S<sub>Γ</sub> on a given vector is required and S<sub>Γ</sub> is never explicitly formed; instead, a block diagonal problem on the interior dofs is solved while computing the matrix vector product. Once the interface solution x<sub>Γ</sub> has been determined, the interior dofs x<sub>I</sub> can be found by solving local problems on each subdomain Ω<sub>i</sub>. We then solve by the GMRES method the preconditioned Schur complement system

$$M\_{\rm BDDC}^{-1} S\_{\Gamma} x\_{\Gamma} = M\_{\rm BDDC}^{-1} g\_{\Gamma},\tag{13}$$

where M<sup>−1</sup><sub>BDDC</sub> is the BDDC preconditioner, defined in Equation (17) below.
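The reduction from Equation (10) to Equation (11) and the recovery of the interior dofs can be traced on a tiny example. In this sketch, a 5 × 5 one-dimensional Laplacian stands in for the Jacobian K, the middle dof is the interface between two subdomains, and S<sub>Γ</sub> is formed explicitly and solved directly. This is only for illustration: as stated above, the actual solver never forms S<sub>Γ</sub> and uses GMRES instead. All function names are hypothetical.

```python
def solve_dense(A, b):
    """Gaussian elimination with partial pivoting; returns x such that A x = b."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        piv = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[piv] = M[piv], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def sub(K, rows, cols):
    """Extract the block of K with the given row and column indices."""
    return [[K[i][j] for j in cols] for i in rows]


def schur_solve(K, f, interior, interface):
    """Solve K x = f by static condensation onto the interface dofs."""
    KII, KIG = sub(K, interior, interior), sub(K, interior, interface)
    KGI, KGG = sub(K, interface, interior), sub(K, interface, interface)
    fI, fG = [f[i] for i in interior], [f[i] for i in interface]
    nI, nG = len(interior), len(interface)
    # W[c] = K_II^{-1} K_IGamma[:, c] and yI = K_II^{-1} f_I (interior solves).
    W = [solve_dense(KII, [KIG[k][c] for k in range(nI)]) for c in range(nG)]
    yI = solve_dense(KII, fI)
    # Schur complement S = K_GG - K_GI K_II^{-1} K_IG and right-hand side g.
    S = [[KGG[a][c] - sum(KGI[a][k] * W[c][k] for k in range(nI)) for c in range(nG)]
         for a in range(nG)]
    g = [fG[a] - sum(KGI[a][k] * yI[k] for k in range(nI)) for a in range(nG)]
    xG = solve_dense(S, g)
    # Back-substitution: interior dofs from K_II x_I = f_I - K_IGamma x_Gamma.
    xI = solve_dense(KII, [fI[k] - sum(KIG[k][c] * xG[c] for c in range(nG))
                           for k in range(nI)])
    x = [0.0] * len(f)
    for k, i in enumerate(interior):
        x[i] = xI[k]
    for c, i in enumerate(interface):
        x[i] = xG[c]
    return x


K = [[2.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0) for j in range(5)]
     for i in range(5)]
x = schur_solve(K, [1.0] * 5, interior=[0, 1, 3, 4], interface=[2])
```

Since the interior dofs of different subdomains are not coupled to each other, K<sub>II</sub> is block diagonal (here two 2 × 2 blocks), which is why the interior solves can be carried out independently on each subdomain.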

Balanced Domain Decomposition by Constraints (BDDC) preconditioners were introduced by Dohrmann (2003) and first analyzed by Mandel and Dohrmann (2003) and Mandel et al. (2005). In these methods all local and coarse problems are treated additively and the user selects the so-called primal continuity constraints across the subdomains' interface. Usual choices of primal constraints are, e.g., point constraints at subdomain vertices and/or averages or moments over subdomain edges or faces. Closely related to BDDC methods are FETI and FETI-DP algorithms, as well as the earlier balancing Neumann-Neumann methods; for more details, we refer the interested reader to the domain decomposition monograph (Toselli and Widlund, 2005, Ch. 6). See also Brands et al. (2008) and Klawonn and Rheinbach (2010) for FETI-DP algorithms applied in other fields of computational biomechanics.

#### 3.4.1. Subspace Decompositions

Let V be the Q<sub>1</sub> finite element space for displacements and V<sup>(i)</sup> be the local finite element space of functions defined on subdomain Ω<sub>i</sub> that vanish on ∂Ω<sub>i</sub> ∩ ∂Ω<sub>D</sub>. This local space can be split into a direct sum of its interior (I) and interface (Γ) subspaces, V<sup>(i)</sup> = V<sub>I</sub><sup>(i)</sup> ⊕ V<sub>Γ</sub><sup>(i)</sup>, and we can define the associated product spaces as

$$V\_I := \prod\_{i=1}^N V\_I^{(i)}, \quad V\_\Gamma := \prod\_{i=1}^N V\_\Gamma^{(i)}.$$

While our finite element approximations are continuous across the interface Γ, the functions of V<sub>Γ</sub> are generally discontinuous across Γ. We then define the subspace

$$\widehat{V}\_{\Gamma} := \{\text{functions of } V\_{\Gamma} \text{ that are continuous across } \Gamma\},$$

and the intermediate subspace

$$
\widetilde{V}\_{\Gamma} := V\_{\Delta} \oplus \widehat{V}\_{\Pi},
$$

defined by further splitting the interface dofs (denoted by the subscript Γ) into primal (subscript Π) and dual (subscript Δ) dofs. Here:

(a) the subspace V̂<sub>Π</sub> consists of functions which are continuous at selected primal variables. These can be, e.g., the subdomain basis functions associated with subdomains' vertices and/or edge/face basis functions with constant values at the nodes of the associated edge/face. A change of basis can be performed so that each primal variable corresponds to an explicit dof.

(b) the subspace V<sub>Δ</sub> = ∏<sub>i=1</sub><sup>N</sup> V<sub>Δ</sub><sup>(i)</sup> is the product space of the local subspaces V<sub>Δ</sub><sup>(i)</sup> of dual interface functions that vanish at the primal dofs.

#### 3.4.2. Restriction and Scaling Operators

The definition of our dual-primal preconditioners also requires the following restriction and interpolation operators, associated with Boolean matrices (with {0, 1} elements):

$$\begin{array}{c} R\_{\Gamma\Delta}: \widetilde{V}\_{\Gamma} \longrightarrow V\_{\Delta}, \ R\_{\Gamma\Pi}: \widetilde{V}\_{\Gamma} \longrightarrow \widehat{V}\_{\Pi},\\ R\_{\Delta}^{(i)}: V\_{\Delta} \longrightarrow V\_{\Delta}^{(i)}, \ R\_{\Pi}^{(i)}: \widehat{V}\_{\Pi} \longrightarrow \widehat{V}\_{\Pi}^{(i)}, \end{array} \tag{14}$$

where V̂<sub>Π</sub><sup>(i)</sup> is the local primal subspace. Moreover, we define the pseudo-inverse counting functions δ<sub>i</sub><sup>†</sup>(x), which are defined at each dof x on the interface of subdomain Ω<sub>i</sub> by

$$\delta\_i^\dagger(\mathbf{x}) := \frac{1}{\mathcal{N}\_\mathbf{x}},\tag{15}$$

with N<sub>x</sub> the number of subdomains sharing x. We finally define scaled local restriction operators R<sub>D,Δ</sub><sup>(i)</sup> by scaling by δ<sub>i</sub><sup>†</sup> the only nonzero element of each row of R<sub>Δ</sub><sup>(i)</sup>. We then define the scaling matrix

$$R\_{D, \Gamma} := \text{ the direct sum } R\_{\Gamma \Pi} \oplus R\_{D, \Delta}^{(i)} R\_{\Gamma \Delta}. \tag{16}$$
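The pseudo-inverse counting functions of Equation (15) amount to a simple multiplicity count. A minimal sketch follows, with a hypothetical data layout in which each subdomain lists the global identifiers of its interface dofs.

```python
from collections import Counter


def inverse_counting_weights(subdomain_dofs):
    """For each subdomain, map each of its interface dofs x to 1 / N_x,
    where N_x is the number of subdomains sharing x."""
    counts = Counter(d for dofs in subdomain_dofs for d in set(dofs))
    return [{d: 1.0 / counts[d] for d in set(dofs)} for dofs in subdomain_dofs]


# Hypothetical 2 x 2 arrangement of subdomains: the cross point "c" is shared by
# all four subdomains, each edge dof by exactly two of them.
subdomain_dofs = [
    ["c", "e_south", "e_west"],
    ["c", "e_south", "e_east"],
    ["c", "e_north", "e_west"],
    ["c", "e_north", "e_east"],
]
weights = inverse_counting_weights(subdomain_dofs)
```

By construction the weights of each dof sum to one over the subdomains that share it, so the scaled operators average the discontinuous local interface values into a single continuous value.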

#### 3.4.3. Choice of Primal Constraints

The efficiency of BDDC (and, more generally, dual-primal) preconditioners depends strongly on the choice of primal constraints. The simplest choice of selecting the subdomain vertices as primal dofs is not always sufficient to obtain scalable and fast preconditioners. Therefore, richer (and computationally more expensive) primal sets have been developed in order to obtain faster preconditioners. These stronger preconditioners are based on larger coarse problems that also employ edge and/or face based primal dofs; see, e.g., Toselli and Widlund (2005).

#### 3.4.4. Matrix Form of the BDDC Preconditioner

Analogously to the dual-primal splitting introduced before, we partition the local dofs into interior (I), dual (Δ), and primal (Π) dofs, so that the local stiffness matrix K<sup>(i)</sup> associated with subdomain Ω<sub>i</sub> can be written as

$$K^{(i)} = \begin{bmatrix} K\_{II}^{(i)} & K\_{\Gamma I}^{(i)^T} \\ K\_{\Gamma I}^{(i)} & K\_{\Gamma \Gamma}^{(i)} \end{bmatrix} = \begin{bmatrix} K\_{II}^{(i)} & K\_{\Delta I}^{(i)^T} & K\_{\Pi I}^{(i)^T} \\ K\_{\Delta I}^{(i)} & K\_{\Delta \Delta}^{(i)} & K\_{\Pi \Delta}^{(i)^T} \\ K\_{\Pi I}^{(i)} & K\_{\Pi \Delta}^{(i)} & K\_{\Pi \Pi}^{(i)} \end{bmatrix}.$$

The BDDC preconditioner is then defined as

$$M\_{\rm BDDC}^{-1} = R\_{D,\Gamma}^T \widetilde{S}\_{\Gamma}^{-1} R\_{D,\Gamma}, \tag{17}$$

where the scaled restriction matrix R<sub>D,Γ</sub> has been defined in Equations (14, 16), and

$$\widetilde{S}\_{\Gamma}^{-1} = R\_{\Gamma\Delta}^{T} \left( \sum\_{i=1}^{N} \begin{bmatrix} 0 & R\_{\Delta}^{(i)^{T}} \end{bmatrix} \begin{bmatrix} K\_{II}^{(i)} & K\_{\Delta I}^{(i)^{T}} \\ K\_{\Delta I}^{(i)} & K\_{\Delta\Delta}^{(i)} \end{bmatrix}^{-1} \begin{bmatrix} 0 \\ R\_{\Delta}^{(i)} \end{bmatrix} \right) R\_{\Gamma\Delta} + \Phi S\_{\Pi\Pi}^{-1} \Phi^{T}. \tag{18}$$

The first term in Equation (18) represents the sum of local problems on each subdomain Ω<sub>i</sub>, with Neumann data on the local dual dofs and with zero Dirichlet data on the local primal dofs. The second term in Equation (18) represents a coarse problem for the primal variables involving the coarse matrix

$$S\_{\Pi\Pi} = \sum\_{i=1}^{N} R\_{\Pi}^{(i)^T} \left( K\_{\Pi\Pi}^{(i)} - \begin{bmatrix} K\_{\Pi I}^{(i)} & K\_{\Pi\Delta}^{(i)} \end{bmatrix} \begin{bmatrix} K\_{II}^{(i)} & K\_{\Delta I}^{(i)^T} \\ K\_{\Delta I}^{(i)} & K\_{\Delta\Delta}^{(i)} \end{bmatrix}^{-1} \begin{bmatrix} K\_{\Pi I}^{(i)^T} \\ K\_{\Pi\Delta}^{(i)^T} \end{bmatrix} \right) R\_{\Pi}^{(i)}$$

and a matrix Φ mapping primal to interface dofs

$$\Phi = R\_{\Gamma\Pi}^T - R\_{\Gamma\Delta}^T \sum\_{i=1}^N \begin{bmatrix} 0 & R\_{\Delta}^{(i)^T} \end{bmatrix} \begin{bmatrix} K\_{II}^{(i)} & K\_{\Delta I}^{(i)^T} \\ K\_{\Delta I}^{(i)} & K\_{\Delta\Delta}^{(i)} \end{bmatrix}^{-1} \begin{bmatrix} K\_{\Pi I}^{(i)^T} \\ K\_{\Pi\Delta}^{(i)^T} \end{bmatrix} R\_{\Pi}^{(i)}.$$

The columns of Φ are associated with coarse basis functions defined as the minimum energy extension into the subdomains with respect to the original bilinear form and subject to the chosen set of primal constraints.

For compressible linear elasticity problems it can be shown that the BDDC algorithm is scalable and quasi-optimal, satisfying a condition number bound (see e.g., Toselli and Widlund, 2005, Ch. 6.4) as

$$\operatorname{cond}(M\_{\mathrm{BDDC}}^{-1}S\_{\Gamma}) \le C \left(\frac{H}{h}\right) \left(1 + \log\frac{H}{h}\right)^2,$$

with C(H/h) = α constant if the primal space is sufficiently rich, while C(H/h) = α H/h if the primal space is the minimal one spanned by the dofs associated with the subdomain vertices. We recall that H is the characteristic subdomain size and h = h<sub>m</sub> is the characteristic mechanical mesh size defined in section 3.1. We could not prove a similar bound for the convergence rate of our non-symmetric NK-BDDC preconditioned operator, since our complex non-linear elasticity problem (Equation 1) involves an exponential strain energy function. Nevertheless, the numerical results presented in the next section suggest that such a bound also holds for our operator and demonstrate the effectiveness and scalability of the NK-BDDC method.
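To get a feeling for how the two variants of the bound grow with the local problem size, the right-hand side can be evaluated directly. This is a sketch under two assumptions that are not in the text: the natural logarithm is used (a different base only rescales the result), and the unknown constant α is set to 1, so only ratios between values are meaningful.

```python
import math


def bddc_bound(h_ratio, rich_primal=True, alpha=1.0):
    """Right-hand side C(H/h) * (1 + log(H/h))^2 of the condition number bound.

    rich_primal=True  -> C(H/h) = alpha            (sufficiently rich primal space)
    rich_primal=False -> C(H/h) = alpha * (H/h)    (vertex-only primal space)
    """
    c = alpha if rich_primal else alpha * h_ratio
    return c * (1.0 + math.log(h_ratio)) ** 2


growth_rich = bddc_bound(32) / bddc_bound(16)
growth_vertex = bddc_bound(32, rich_primal=False) / bddc_bound(16, rich_primal=False)
```

Doubling H/h from 16 to 32 increases the bound only through the slowly growing polylogarithmic factor when the primal space is rich, whereas with vertex constraints alone the additional linear factor doubles on top of that.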

#### 4. RESULTS

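In this section, we report the results of several 3D parallel simulations with our electro-mechanical Bidomain solver, using two HPC architectures: the Mira BG/Q supercomputer of the Argonne National Laboratory and the Marconi-A2 cluster with KNL processors.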


### 4.1. Test 1: Double Reentry Simulation With the Electro-Mechanical Bidomain Model (Figures 2, 3)

We start by studying the performance of our electro-mechanical Bidomain solver on a closed ellipsoidal ventricular geometry during a double reentry dynamics initiated by an S1–S2 protocol. **Figure 2** shows snapshots of the time evolution of the transmembrane potential and mechanical deformation every 50 ms, computed on 256 KNL processors of Marconi-A2. At each time instant, we report the epicardial lateral view (top panel) and selected horizontal and vertical transmural sections (bottom panel). After three S1 stimulations applied at the apex every 500 ms (not shown), an S2 cross-gradient stimulation (visible as a vertical strip in the t = 0 panel) is applied 280 ms after the last S1 stimulus, and this instant is taken as the reference time t = 0 ms for this simulation. Two counter-rotating scroll waves are generated by the S2 stimulus, with transmural filaments located near the apex and a rotation period of about 250 ms (see the panels t = 0, 250, 500 ms). The lateral epicardial view of the upper panels shows mostly one of the two scroll waves, but the second, almost symmetric one is visible in the transmural sections of the lower panels.

This reentry dynamics is also visible in **Figure 3**, which reports the waveforms at the epicardial sites P1, P2, P3 (shown in **Figure 3A**) of the transmembrane potential V (**Figure 3B**), extracellular potential u<sub>e</sub> (**Figure 3C**), fiber stretch λ (**Figure 3D**), active tension T<sub>a</sub> (**Figure 3E**), and intracellular calcium concentration Ca<sub>i</sub> (**Figure 3F**).

#### 4.2. Test 2: Weak Scalability of the Elliptic Bidomain - TP06 Solver (Figures 4, 5)

**Figures 4**, **5** (left columns) report the results of weak scalability tests on Mira BG/Q for the elliptic solver (PCG-MAS(4)) required by the bioelectrical Bidomain-TP06 model on a half ellipsoidal domain representing an idealized half left ventricle. The number of processors is increased from 1K to 163K cores of the Mira BG/Q supercomputer of the Argonne National Laboratory. **Figure 4A1** reports the condition number (blue), iteration counts (red), and solution times (yellow) of the PCG-MAS(4) solver. Both a fixed half ellipsoidal domain (**Figure 4A1**, top plot) and an increasing ellipsoidal domain (**Figure 4A1**, bottom plot) are considered; in both cases the local mesh size (hence the local problem size on each processor) is kept fixed at H/h = 16. The results clearly show the very good scalability of the PCG-MAS(4) solver, since all quantities are bounded from above as the processor count is increased from 1K to 163K cores (a factor of 163) and the global problem size therefore increases from about O(10<sup>6</sup>) to O(10<sup>8</sup>) degrees of freedom. In particular, we remark that in spite of this 163-fold increase in problem size, the CPU times are almost constant in the case of an increasing half ellipsoid (**Figure 4A1**, bottom plot) and increase by only a factor of 2–3 in the case of a fixed half ellipsoid (**Figure 4A1**, top plot), while being almost constant between 16K and 128K cores.
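The orders of magnitude quoted for the global problem size can be reproduced with a short calculation. The sketch below assumes cubic subdomains with H/h elements per side and one subdomain per core; the helper names and the timings in the usage note are hypothetical and are not measurements from the paper.

```python
def elements_per_subdomain(h_ratio=16, dim=3):
    """Number of elements in one subdomain with H/h elements per side."""
    return h_ratio ** dim


def global_elements(n_procs, h_ratio=16, dim=3):
    """Global element count in a weak scaling test with one subdomain per core."""
    return n_procs * elements_per_subdomain(h_ratio, dim)


def weak_efficiency(t_ref, t_p):
    """Weak scaling efficiency: reference time over time at the larger core count.

    A value of 1.0 means the solution time did not grow with the problem size.
    """
    return t_ref / t_p
```

With these assumptions, `global_elements(1024)` gives 4,194,304 elements, consistent with the O(10<sup>6</sup>) size quoted at 1K cores. As a purely hypothetical example, timings of 10 s and 12.5 s at the smallest and largest core counts would correspond to `weak_efficiency(10.0, 12.5)`, i.e., 80%.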

Analogously, **Figures 4**, **5** (right columns) report the results of weak scalability tests on Marconi - A2 for the elliptic solver (PCG-MAS(4)) and also the non-linear mechanical solver (NK-BDDC), described in section 4.3 below. As before, the results clearly show the very good scalability of the PCG-MAS(4) solver, since all quantities associated with the elliptic solver are bounded from above.

In order to study in more detail the weak scalability test on a fixed half ellipsoid (**Figure 4A1**, top plot), we report in **Figures 5A1,B1,C1** the percent summary (given by the LogView PETSc subroutine) of the main PETSc functions called by the PCG-MAS(4) elliptic solver. These PETSc functions, shown in the legend of each plot, range from inner products (VecTDot) and vector norms (VecNorm) to the whole PCG solver (KSPSolve) and the application of the MAS(4) preconditioner (PCApply). In particular, we report the percent of: CPU time as a fraction of the KSPSolve time (**Figure 5A1**), flops (**Figure 5B1**), and messages (**Figure 5C1**). When one of these PETSc functions has a negligible percentage, the corresponding legend shows it equal to 0. After an initial increase in some cases, all reported quantities are very scalable up to 64K cores, and most up to 163K cores, except the VecTDot percent of flops (in **Figure 5B1**). As expected, the percentages of time (**Figure 5A1**) and flops (**Figure 5B1**) are dominated by the PCG solver (KSPSolve), followed by matrix multiplications (MatMult) and inner products (VecTDot). The percentage of messages (**Figure 5C1**) is dominated by vector scattering (VecScatterBegin), matrix multiplications (MatMult) and PCG (KSPSolve).

### 4.3. Test 3: Weak Scalability of the Electro-Mechanical Solver (Figures 4, 5)

We then study the weak scalability of our electro-mechanical solver from 128 to 2048 KNL processors of Marconi-A2, in particular of the two main computational kernels: the non-linear mechanical solver (NK-BDDC) and the linear elliptic Bidomain solver (PCG-MAS(4)). **Figure 4A2** reports the CPU times and iteration counts for both solvers, while **Figures 5A2,B2,C2** report the percent summary of the main PETSc functions called by the electro-mechanical solver.

In this weak scaling test, the local mesh size (hence the local problem size on each processor) is kept fixed at H/h = 16, while the global problem size grows proportionally to the processor count by assigning one subdomain to each processor. Hence, the computational domain consists of increasing portions of an ellipsoidal domain. The results in **Figure 4A2** clearly show the very good scalability of the PCG-MAS(4) elliptic linear solver, since both its CPU times and iteration counts are bounded from above as the processor count is increased to 2,048 cores. On the other hand, the timings of the non-linear SNES solver are not scalable beyond 512 processors, even if the iteration counts are. This is due to the nonscalability of the coarse solver (MUMPS) employed in the BDDC preconditioner.

In order to study this scalability test in more detail, we report in **Figures 5A2,B2,C2** the percent summary (given by the LogView PETSc subroutine) of the main PETSc functions called by the electro-mechanical solver. These PETSc functions, shown in the legend of each plot, range from inner products (VecTDot) and vector norms (VecNorm) to the linear solvers (KSPSolve) and preconditioner applications (PCApply) required by both the linear (PCG-MAS(4)) and non-linear (NK-BDDC) solvers. In particular, we report the percent of: CPU time (**Figure 5A2**), flops (**Figure 5B2**) and messages (**Figure 5C2**). When one of these PETSc functions has a negligible percentage, the corresponding legend shows it equal to 0. All reported percentages are very scalable, showing quite flat plots, except the time percentages (**Figure 5A2**), where the KSPSolve and PCApply percentages grow considerably beyond 512 cores, due mostly to the growth of MatSolve and PCSetUp, which, as we know already from **Figures 5A1,A2**, is due to the nonscalable direct coarse solve (MUMPS) of the BDDC preconditioner called by the non-linear SNES solver. As expected, the percentages of time (**Figure 5A2**) and flops (**Figure 5B2**) are dominated by the PCG solver (KSPSolve), followed by PCApply and MatSolve. The percentage of messages (**Figure 5C2**) is dominated by vector scattering (VecScatterBegin), matrix multiplications (MatMult) and linear solves (KSPSolve).

### 4.4. Test 4: Strong Scalability of the Non-linear Electro-Mechanical Bidomain Solver (Figures 6, 7)

**Figure 6** reports the results of strong scalability tests on Marconi-A2 for the non-linear electro-mechanical Bidomain model on an ellipsoidal domain during the time interval [0 100] ms. We study the time evolution of CPU times and iterations of the two main computational kernels of our electro-mechanical model: the nonlinear mechanical solver (NK-BDDC) and linear Bidomain solver (PCG - MAS(3) for the elliptic solve and PCG-BJ for the parabolic solve).

The global mesh size is fixed to 384 × 192 × 48 finite elements while the number of processors is increased from 32 = 8 × 4 × 1 (with local mesh 48 × 48 × 48) to 256 = 16 × 8 × 2 (with local mesh 24 × 24 × 24). **Figure 6A** shows the timings of the NK-BDDC solver: after an initial superlinear speedup from 32 to 64 cores, the timings still decrease when going to 128 and 256 cores but with worse speedups (see also **Figure 7A**), and start to increase at 512 cores or more (not shown). **Figure 6B** shows the number of Newton iterations for each NK-BDDC solve, which remains constant at 4 iterations independently of the number of processors. **Figure 6C** reports the cumulative GMRES iterations for each NK-BDDC mechanical solve, which increase in time since the Jacobian mechanical system becomes increasingly ill-conditioned due to the spreading of the electrical activation front and the subsequent mechanical contraction. The number of iterations is reduced when going from 64 to 128 and to 256 cores, but unexpectedly the 32-core test gave the lowest iteration counts after 20 ms. **Figure 6D** shows that the number of PCG iterations for each Bidomain elliptic solve is almost constant, independently of the number of processors used. The timings of each Bidomain elliptic (**Figure 6E**) and parabolic (**Figure 6F**) solve show a reduction when the number of processors is increased, but with reduced speedup when using 256 cores or more.
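The local mesh sizes quoted above follow directly from dividing the global mesh by the processor grid, and the speedup and parallel efficiency of a strong scaling test are simple ratios. A minimal sketch, with hypothetical helper names and, in the usage note, hypothetical timings that are not taken from the figures:

```python
GLOBAL_MESH = (384, 192, 48)  # finite elements per direction (from the text)


def local_mesh(proc_grid, global_mesh=GLOBAL_MESH):
    """Elements per direction owned by each processor for a given processor grid."""
    for g, p in zip(global_mesh, proc_grid):
        if g % p != 0:
            raise ValueError("processor grid does not divide the global mesh evenly")
    return tuple(g // p for g, p in zip(global_mesh, proc_grid))


def speedup(t_ref, t_p):
    """Speedup with respect to the reference run."""
    return t_ref / t_p


def strong_efficiency(t_ref, p_ref, t_p, p):
    """Parallel efficiency relative to a reference run on p_ref processors."""
    return (t_ref * p_ref) / (t_p * p)
```

Here `local_mesh((8, 4, 1))` returns `(48, 48, 48)` and `local_mesh((16, 8, 2))` returns `(24, 24, 24)`, matching the two configurations in the text. A run that halves its time when going from 32 to 64 cores has efficiency 1.0; values above 1.0 correspond to the superlinear speedup mentioned for the NK-BDDC solver.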

As before, we now study in **Figure 7** the percent summary (given by the LogView PETSc subroutine) of the main PETSc functions in this strong scaling test for the electro-mechanical solver. We report the percent of: flops (**Figure 7C**), CPU time (**Figure 7D**), messages (**Figure 7E**), reductions (**Figure 7F**). Again we find quite flat plots, except the time percentages (**Figure 7D**), where the KSPSolve percentage grows considerably due mostly to the growth of PCApply and MatSolve, which again we attribute mostly to the nonscalable direct coarse solve (MUMPS) of the BDDC preconditioner called by the non-linear SNES solver. The percentages of time (**Figure 7D**), flops (**Figure 7C**) and reductions (**Figure 7F**) are dominated by the PCG solver (KSPSolve), but in **Figure 7C** the percent of flops of KSPSolve and PCSetUp decreases when the processor count increases, while the percentages of MatSolve, PCApply and MatMult increase. The percentage of messages (**Figure 7E**) is dominated by vector scattering (VecScatterBegin), linear solves (KSPSolve) and matrix multiplications (MatMult).

scalability on Marconi-A2 from 128 to 2048 processors of the electro-mechanical solver (NK-BDDC). Percent summary of time (A2), flops (B2), messages (C2) of the nine main PETSc functions (from VecTDot to PCApply) called by the elliptic solver.

### 5. DISCUSSION

We have developed a high-performance parallel solver for cardiac electro-mechanical 3D simulations. After numerical discretization in space with Q<sub>1</sub> finite elements and IMEX operator splitting finite differences in time, the main computational kernels at each time step require: (a) the solution of a non-linear system deriving from the discretization of the cardiac mechanical problem (Equation 1) by a parallel Newton-Krylov-BDDC (NK-BDDC) solver, where the Krylov method chosen is GMRES; (b) the solution of the two linear systems deriving from the discretization of the elliptic and parabolic equations in the Bidomain model (Equation 6) by a parallel PCG method with Multilevel Additive Schwarz and Block-Jacobi preconditioners, respectively. The parallelization of our solver is based on the parallel library PETSc (Balay et al., 2012) from the Argonne National Laboratory, and large-scale 3D simulations have been run on high-performance supercomputers. We have investigated the performance of the parallel electromechanical solver in both physiological excitation-contraction cardiac dynamics and pathological situations characterized by re-entrant waves.

### 5.1. Bidomain Solver

The results have shown that the electrical Bidomain solver is scalable, in terms of both weak and strong scaling, and is robust with respect to the deformation induced by the mechanical contraction. Bidomain weak scaling tests have been performed on both the Mira BG/Q and Marconi-A2 clusters. The two architectures and the number of cores used are different, although the load per core is the same. Thus, we cannot fairly compare the performances obtained on the two architectures. However, the CPU times reported in **Figure 4A**, bottom and **Figure 5A** have the same order of magnitude, showing that the solution of the Bidomain linear systems on the two architectures exhibits comparable costs.

### 5.2. Mechanical Solver

The results have shown that the mechanical NK-BDDC solver is also scalable in terms of non-linear and linear iteration counts, but the CPU timings, especially in the weak scaling test, do not present a scalable behavior. Our results seem to indicate that this increase of CPU timings can be attributed to the increase of computational costs required by the BDDC coarse solver. A possible remedy would be to employ a multilevel BDDC solver, where the coarse problem is solved recursively by a BDDC method with additional local and coarse problems, or to employ an adaptive selection of BDDC primal constraints. The nonscalability and ill-conditioning of the nonlinear mechanical system could also be associated with: (a) the penalty formulation employed to enforce the near incompressibility of the cardiac tissue; (b) the presence of the stress induced by the active tension contraction model; (c) the particular mechanical boundary condition enforcing zero displacements on a fixed endocardial basal ring and fixed intracavitary endocardial pressure.

NK-BDDC solve. (D) PCG iterations for each Bidomain elliptic solve. (E) timings of each Bidomain elliptic solve. (F) timings of each Bidomain parabolic solve.

speedup over the [0 100] ms interval of the nonlinear SNES solver. (C–F) Percent summary of flops (C), time (D), messages (E), reductions (F), of the nine main PETSc functions (from VecTDot to PCApply) called by the elliptic solver.

### 5.3. Comparison With Previous Studies

So far, only a few studies have developed and investigated parallel numerical solvers for cardiac electro-mechanics. Lafortune et al. (2012) have proposed a fully explicit Monodomain-mechanical solver, obtaining good strong scalability results up to 500 cores. The advantage of our approach with respect to that presented in Lafortune et al. (2012) is that our solver, resulting from a semi-implicit time discretization of the electro-mechanical model, allows larger time step sizes and time adaptivity. Augustin et al. (2016) have developed a very effective electro-mechanical solver, tested on highly accurate patient-specific geometric models and based on Algebraic Multigrid (AMG) preconditioners for both the Bidomain and mechanical systems. The strong scalability results they have reported show a very good performance of AMG applied to the non-linear mechanical system, whereas the AMG preconditioner is less effective for the Bidomain linear system. The advantage of our solver compared to that introduced in Augustin et al. (2016) is that both Multilevel Additive Schwarz and BDDC preconditioners should be more robust than AMG when high order finite elements or isogeometric analysis (see e.g., Charawi, 2017) discretizations are employed. On the other hand, while BDDC preconditioners can be easily constructed for unstructured meshes, Multilevel Additive Schwarz methods are more difficult to implement for such grids.

### 5.4. Future Work

In order to improve our mechanical solver, further studies could consider the following issues: (a) mixed formulations of the mechanical system based on inf-sup stable displacement-pressure discrete spaces; (b) alternative active tension contraction models; (c) alternative mechanical boundary conditions and pressure-volume relationships involving multielement Windkessel models.

### AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

### ACKNOWLEDGMENTS

This work was partially supported by grants of Istituto di Matematica Applicata e Tecnologie Informatiche IMATI - C.N.R., Pavia, Italy, and of Istituto Nazionale di Alta Matematica (INdAM), Italy.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Colli Franzone, Pavarino and Scacchi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Mechanical Characterization of the Vessel Wall by Data Assimilation of Intravascular Ultrasound Studies

Gonzalo D. Maso Talou<sup>1,2</sup>\*, Pablo J. Blanco<sup>1,2</sup>, Gonzalo D. Ares<sup>2,3,4</sup>, Cristiano Guedes Bezerra<sup>2,5</sup>, Pedro A. Lemos<sup>2,5</sup> and Raúl A. Feijóo<sup>1,2</sup>

<sup>1</sup> National Laboratory for Scientific Computing, Department of Mathematical and Computational Methods, Petrópolis, Brazil, <sup>2</sup> National Institute of Science and Technology in Medicine Assisted by Scientific Computing, São Paulo, Brazil, <sup>3</sup> National Scientific and Technical Research Council, Buenos Aires, Argentina, <sup>4</sup> CAE Group, National University of Mar del Plata, Mar del Plata, Argentina, <sup>5</sup> Department of Interventional Cardiology, Heart Institute (Incor), São Paulo, Brazil

#### Edited by:

Mariano Vázquez, Barcelona Supercomputing Center, Spain

#### Reviewed by:

Pras Pathmanathan, United States Food and Drug Administration, United States

Fernando Mut, George Mason University, United States

> \*Correspondence: Gonzalo D. Maso Talou gonzalot@lncc.br

#### Specialty section:

This article was submitted to Computational Physiology and Medicine, a section of the journal Frontiers in Physiology

Received: 04 December 2017 Accepted: 12 March 2018 Published: 28 March 2018

#### Citation:

Maso Talou GD, Blanco PJ, Ares GD, Guedes Bezerra C, Lemos PA and Feijóo RA (2018) Mechanical Characterization of the Vessel Wall by Data Assimilation of Intravascular Ultrasound Studies. Front. Physiol. 9:292. doi: 10.3389/fphys.2018.00292

Atherosclerotic plaque rupture and erosion are the most important mechanisms underlying the sudden plaque growth, responsible for acute coronary syndromes and even fatal cardiac events. Advances in the understanding of the culprit plaque structure and composition are already reported in the literature; however, there is still much work to be done toward in-vivo plaque visualization and mechanical characterization to assess plaque stability, patient risk, diagnosis and treatment prognosis. In this work, a methodology for the mechanical characterization of the vessel wall plaque and tissues is proposed based on the combination of intravascular ultrasound (IVUS) image processing, data assimilation and continuum mechanics models within a high performance computing (HPC) environment. Initially, the IVUS study is gated to obtain volumes of image sequences corresponding to the vessel of interest at different cardiac phases. These sequences are registered against the sequence of the end-diastolic phase to remove transversal and longitudinal rigid motions prescribed by the moving environment due to the heartbeat. Then, optical flow between the image sequences is computed to obtain the displacement fields of the vessel (each associated with a certain pressure level). The obtained displacement fields are regarded as observations within a data assimilation paradigm, which aims to estimate the material parameters of the tissues within the vessel wall. Specifically, a reduced order unscented Kalman filter is employed, endowed with a forward operator which amounts to solving a hyperelastic solid mechanics model in the finite strain regime, taking into account the axially stretched state of the vessel as well as the effect of internal and external forces acting on the arterial wall. Due to the computational burden, an HPC approach is mandatory. Hence, the data assimilation and computational solid mechanics computations are parallelized at three levels: (i) a Kalman filter level; (ii) a cardiac phase level; and (iii) a mesh partitioning level. To illustrate the capabilities of this novel methodology toward the in-vivo analysis of patient-specific vessel constituents, mechanical material parameters are estimated using in-silico and in-vivo data retrieved from IVUS studies. Limitations and potentials of this approach are exposed and discussed.

Keywords: parameter identification, reduced order unscented Kalman filter, IVUS, coronary arteries, arterial wall model, computational models, high performance computing

## 1. INTRODUCTION

Cardiovascular diseases are the principal cause of death and morbidity worldwide (Mathers et al., 2016). The two principal causes of death, cardiac ischemia and stroke, are intrinsically related to the onset, progress and destabilization of atherosclerotic plaque, processes which are still largely unknown (Crea and Liuzzo, 2013; Bentzon et al., 2014). At the final stage of the destabilization process, the plaque ruptures, releasing thrombotic components into the blood stream, which in turn generate thrombi that block the vessel lumen, causing ischemia. Thus, the prediction of rupture events and the identification of the so-called culprit plaques are of the utmost importance for diagnostics and therapeutics. Through computational simulations it is possible to study the arterial wall stress state, which may compromise plaque integrity and induce rupture. Moreover, computational models also allow the recreation of different physiological and pathophysiological conditions (hypertension, hyperemia, exercise, stenosis) (Taylor et al., 1999, 2013; Torii et al., 2007; Blanco et al., 2015), as well as interventions (angioplasty balloon inflation, stent deployment, stent-plaque interaction, among others) (Conway et al., 2012, 2014) that are valuable resources for diagnosis, treatment and surgical risk assessment.

In order to accurately simulate patient-specific conditions, three kinds of input data are required: (i) patient-specific anatomical models of the vasculature, (ii) the loads to which the anatomical structures are subjected, and (iii) the patient-specific distribution of the arterial-wall constituents and their corresponding material parameters. Anatomical data of the arteries can be straightforwardly extracted from different medical imaging modalities (Wahle et al., 1995; Milner et al., 1998; Bulant et al., 2017). The force exerted by the blood pressure can be accurately estimated from cuff-pressure measurements (O'Brien et al., 2001; Miyashita, 2012). Thus, we are left with the problem of setting patient-specific material parameters for the models of the arterial wall. This has long been the Achilles heel of numerical simulations, most of which rely on material parameters acquired from ex-vivo experiments on cadaveric specimens (Walsh et al., 2014; Karimi et al., 2015). In this sense, the in-vivo identification of material parameters for the arterial wall is still an open research topic.

Toward covering the aforementioned gap, specifically in the coronary artery disease domain, intravascular ultrasound (IVUS) emerges as a suitable imaging modality for retrieving the material parameters and distribution of the vessel materials under in-vivo conditions, due to its high temporal and spatial resolution. The acquired images, when coherently ordered, are capable of delivering the motion of the vascular structures. Some works (Kawasaki et al., 2002; Nair et al., 2002; Sathyanarayana et al., 2009) have successfully classified the materials into a few discrete categories (e.g., necrotic core, fibrotic, fibro-fatty or lipid-pool, calcified) based on the acoustic impedance response of the tissues in a given frame of the IVUS study. However, it has been demonstrated that there is considerable variability in the stress-strain response of tissues within the same category of such a classification (Loree et al., 1994; Holzapfel et al., 2005; Walsh et al., 2014). Therefore, this information is not specific enough for simulation purposes. As anticipated above, the temporal resolution of the IVUS study can be exploited to retrieve the motion (displacement field) of the vessel wall along the cardiac cycle (for example by using optical flow techniques or large deformation diffeomorphic metric mapping). Using the displacement field as input, data assimilation techniques can be applied to estimate the material parameters.

Data assimilation techniques make use of measurable quantities to adjust a physical model whose goal is to represent the reality posed by the in-vivo scenario. In that manner, these techniques make it possible not only to estimate specific quantities of interest, but also to explore the underlying physical phenomena. Also, measurement errors can be filtered by the physical model, which yields a mutual benefit: the measurements instantiate the model and the model filters the measurements. Such techniques can be classified in two categories: (i) variational approaches and (ii) sequential filtering approaches.

In the variational approach, a cost functional that measures the difference between the observed measures and the model prediction is constructed. The cost functional depends on the parameters of interest (among other parameters required by the model) to render a model prediction of the measured variable. Then, the estimated parameters are those that minimize the cost functional. The most popular approach is to solve the Karush-Kuhn-Tucker (KKT) necessary conditions, which is employed in several works for mechanical parameter estimation (Lagrée, 2000; Martin et al., 2005; Sermesant et al., 2006; Perego et al., 2011; D'Elia et al., 2012; Bertagna and Veneziani, 2014; Ares, 2016). In Lagrée (2000), the viscoelastic parameters of large arteries were estimated using displacement fields of the vessel wall generated by computational models. Similarly, Martin et al. (2005) explored the estimation of the vessel compliance in a 1D model using a 3D fluid-structure interaction (FSI) model to generate the measured displacement of the vessel wall. Using medical data of blood pressure and inner radius of the arteries, Stålhand (2009) also used 1D models to estimate the material parameters according to the model proposed in Holzapfel et al. (2000). The works of Perego et al. (2011) and D'Elia et al. (2012) formulate the inverse problem from 3D FSI models and analyze the sensitivity of the identification of Young's modulus to noise in the measurements of arterial wall displacements. In the latter, data assimilation is performed from flow velocity data as well. The main drawback of these variational approaches is the large number of evaluations of the cost functional (or its derivative) which are required in the minimization problem (Lassila et al., 2013). Furthermore, the use of more realistic models such as 3D FSI models or complex heterogeneous anisotropic solid models is often mandatory to render accurate results, increasing the computational effort. In some cases, reduced order strategies combined with statistical approaches can be applied to reduce the burden behind cost functional evaluations, as shown in Lassila et al. (2013). Another approach is proposed in Bertagna and Veneziani (2014), based on the application of model reduction techniques coupled with a proper orthogonal decomposition to accomplish the solution of 3D FSI in a computationally efficient way. Efficient implementations for solid mechanics problems have also been proposed in Avril et al. (2010) and Pérez Zerpa and Canelas (2016) using a virtual fields method and a constitutive equation gap functional, respectively.
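To make the variational idea concrete, the following is a minimal sketch and not the formulation of any of the cited works. It assumes a hypothetical scalar forward model in which the displacement equals the applied pressure divided by a stiffness `young_modulus`; the function names, the synthetic data and the noise level are all illustrative assumptions. Because the cost functional is quadratic in the compliance 1/E, its minimizer is available in closed form here, whereas realistic 3D models require many iterative evaluations of the cost functional.

```python
import numpy as np

def forward_model(pressure, young_modulus):
    """Toy forward model (assumption): displacement = load / stiffness."""
    return pressure / young_modulus

def cost(young_modulus, pressure, u_obs):
    """Least-squares cost functional J(E) = sum_i (u_obs_i - u_model_i(E))^2."""
    residual = u_obs - forward_model(pressure, young_modulus)
    return float(np.sum(residual ** 2))

def estimate_young_modulus(pressure, u_obs):
    """Minimize J(E). With c = 1/E, J is quadratic in c, so dJ/dc = 0 gives
    c = sum(p * u_obs) / sum(p * p)."""
    compliance = np.dot(pressure, u_obs) / np.dot(pressure, pressure)
    return 1.0 / compliance

# Synthetic observations (hypothetical values, for illustration only).
rng = np.random.default_rng(0)
pressure = np.linspace(1.0, 5.0, 20)
true_modulus = 4.0
u_obs = forward_model(pressure, true_modulus) + 1e-3 * rng.standard_normal(pressure.size)

estimated_modulus = estimate_young_modulus(pressure, u_obs)
print(estimated_modulus, cost(estimated_modulus, pressure, u_obs))
```

In this toy setting the estimate is obtained with a single formula; the computational cost discussed above arises when each evaluation of the forward model is itself a large simulation.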

In turn, and for problems involving a small-to-moderate number of unknown parameters, the sequential filtering approach (also known as filtering methods) is less computationally demanding and, at the same time, embarrassingly parallel. These features make the filtering approach an appealing strategy for the kind of problems addressed in the present work. Conceptually, given a set of observations, the method makes a prediction for each observation and then introduces corrections in the model parameters based on the discrepancies between the model estimation and the observed data. For each prediction-correction step, several variations of the parameters are tested in the model and, through statistical analysis of the model predictions, a suitable correction is performed over the parameters. Several methods based on the Kalman filter have been developed to deal with linear and non-linear dynamic problems. As examples, a non-linear extended Kalman filter (EKF) with collocation feedback is applied to identify the Young's modulus of different regions of a heart model in Moireau et al. (2008), Moireau et al. (2009), and Chapelle et al. (2009). The observations used varied between the myocardium velocity (Moireau et al., 2008), displacement (Moireau et al., 2009) and velocity of the heart boundaries (Chapelle et al., 2009). The stability of such methods was studied (Moireau et al., 2008) and, in terms of accuracy, it is reported that Kalman filtering is optimal for linear systems only, while extended algorithms based on linearized operators may lead to efficient, albeit non-optimal, filtering procedures. In Lipponen et al. (2010), the EKF is also applied to estimate parameters of a reduced order Navier-Stokes model (through an orthogonal decomposition of the velocity field) using observations acquired from electrical impedance tomography. In more recent works, Moireau and Chapelle (2011) presented a reduced order Kalman filter based on the unscented transform (abbreviated as ROUKF) that offers an interesting alternative to the EKF method. Such an approach requires neither linearization nor calculation of the tangent operator of the non-linear model, which substantially eases its implementation. Notably, the ROUKF features a higher order approximation of the system state statistics, delivering more accurate outcomes than EKF. In Bertoglio et al. (2012) and Bertoglio et al. (2014), ROUKF was successfully applied for estimation of Young's modulus in arteries with in-vivo and in-vitro tests, showing a simpler and more efficient implementation than EKF. Recently, in Caiazzo et al. (2017), terminal resistances and vessel wall properties of a 1D vascular network were estimated via ROUKF using blood flow and/or pressure measurements as observations.
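The prediction-correction idea can be illustrated with a deliberately simplified sketch. The code below is a plain unscented Kalman update for a single static parameter, not the reduced-order ROUKF of Moireau and Chapelle (2011); the function name, the choice `kappa=2.0`, and the linear toy forward model are assumptions made only for illustration. Each step evaluates the forward model at three sigma points (the "variations of the parameters"), derives the statistics of the predictions, and corrects the parameter with the resulting gain. No linearization of the forward model is needed.

```python
import numpy as np

def unscented_parameter_update(theta, var, z_obs, forward, obs_var, kappa=2.0):
    """One prediction-correction step for a scalar parameter theta with variance var."""
    n = 1  # number of estimated parameters in this sketch
    spread = np.sqrt((n + kappa) * var)
    sigma_points = np.array([theta, theta + spread, theta - spread])
    weights = np.array([kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)])

    # Prediction: run the forward model once per sigma point (independent runs).
    predictions = np.array([forward(s) for s in sigma_points])
    z_mean = np.dot(weights, predictions)

    # Statistics of the predictions and their covariance with the parameter.
    var_zz = np.dot(weights, (predictions - z_mean) ** 2) + obs_var
    cov_tz = np.dot(weights, (sigma_points - theta) * (predictions - z_mean))

    # Correction: Kalman gain applied to the discrepancy with the observation.
    gain = cov_tz / var_zz
    theta_new = theta + gain * (z_obs - z_mean)
    var_new = var - gain * var_zz * gain
    return theta_new, var_new

# Toy usage (hypothetical numbers): estimate a compliance c from z = load * c + noise.
rng = np.random.default_rng(1)
load, true_compliance, noise_std = 2.0, 0.25, 0.01
theta, var = 1.0, 1.0  # initial guess and its variance
for _ in range(50):
    z = load * true_compliance + noise_std * rng.standard_normal()
    theta, var = unscented_parameter_update(
        theta, var, z, lambda c: load * c, noise_std ** 2
    )
print(theta, var)
```

Since the sigma-point evaluations are independent of each other, they can be distributed across processes, which is the property referred to above as embarrassingly parallel.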

In this work, we present a novel approach to construct patient-specific mechanical models of the arterial wall using in-vivo data from IVUS studies. In a nutshell, this approach integrates the realms of image processing, optical flow, continuum mechanics, and filtering data assimilation to effectively merge patient-specific data with mechanical models, toward the in-vivo estimation of material properties. From the IVUS study, a frame of interest is selected and the corresponding arterial wall is demarcated. For the mechanical model a finite strain framework is considered, and the constituent tissues are assumed to behave as isotropic Neo-Hookean materials. Importantly, it is considered that the arterial vessel corresponding to the diastolic phase is at equilibrium with a certain diastolic pressure level, and it is further subjected to a given axial stretch at that phase. By using gating, registration and optical flow methods developed in previous works (Maso Talou et al., 2015, 2017; Maso Talou, 2017), the displacement field of the vessel wall is estimated along the cardiac cycle. Then, the ROUKF is exploited as a data assimilation procedure in which the previously obtained displacement field is supplied as observational data, while the material parameters of the Neo-Hookean models are the target parameters to be estimated.

The manuscript is structured as follows. In section 2, the proposed methodology is detailed, presenting image processing techniques (section 2.1), the mechanical model for the arterial wall (section 2.2), and, finally, the data assimilation process for the estimation of the material parameters (section 2.3). In section 3, the sensitivity to the data assimilation parameters (section 3.1), the boundary conditions (section 3.2) and the baseline stress state (section 3.3) of the mechanical problems is studied to assess their impact on the data assimilation outcomes. Then, the mechanical characterization is performed for four in-vivo atherosclerotic lesions to analyze the performance of the method in real case scenarios (section 3.4). Insights, strengths and weaknesses of the methodology are then discussed in section 4 and final remarks are outlined in section 5.

## 2. METHODS

This section is divided into four parts. First, the IVUS image processing methods are described, where we present the procedures to obtain the displacement field of a specific vessel cross-section along the cardiac cycle (see **Figure 1**). Second, the mathematical model for the arterial mechanics is formulated, defining the mechanical equilibrium and the material constitutive behavior. Third, the data assimilation algorithm is presented as a tool to estimate unknown material properties in the mechanical models using the displacement field retrieved from the IVUS images. Finally, an efficient three-level parallelization scheme is described for high performance computing environments.

### 2.1. Image Processing

The goal of the image processing stage is to deliver the displacement field of the vessel wall along the cardiac cycle at a particular site of interest within the artery. As this new methodology is a proof of concept, the data from the in-vivo cases will be extracted from a standard IVUS pullback as a retrospective study. As the transducer is axially displaced from frame to frame, only images corresponding to a single cardiac cycle can be extracted for each cross-section to obtain small topological variations between the images (spatial consistency). Furthermore, the extraction of the frames at a particular location is hindered by the motion of the IVUS transducer induced by the myocardium contraction. To overcome this issue, gating and registration procedures are performed using methods previously presented in Maso Talou et al. (2015, 2017). To retrieve the displacement field, a modified optical flow method is applied to the extracted frames at the site of interest. In what follows, the processing applied to the IVUS images is briefly described.

#### 2.1.1. Gating

The gating method aims to recover the cardiac phase at each cross-sectional image of the study. To achieve this, a signal that measures the total motion of each frame is generated as

$$s(n) = \alpha_g \left[ 1 - \frac{\sum_{i=1}^{H} \sum_{j=1}^{W} \left( I_n(i,j) - \mu_n \right) \left( I_{n+1}(i,j) - \mu_{n+1} \right)}{\sigma_n \sigma_{n+1}} \right] + \left(1-\alpha_g\right)\sum_{i=1}^{H}\sum_{j=1}^{W} -\left|\nabla I_n(i,j)\right|, \tag{1}$$

where I<sub>n</sub> is the n-th image of the study with a resolution of H × W pixels, µ<sub>n</sub> and σ<sub>n</sub> are the mean and standard deviation of the intensity of I<sub>n</sub>, and α<sub>g</sub> is a mixture parameter. The principal frequency mode of the signal s(n) in the physiological heart-frequency range (i.e., between 0.75 and 1.66 Hz) is extracted to obtain the mean cardiac frequency of the study, f<sub>m</sub>. Then, a low-frequency signal s<sub>l</sub>(n) is generated by low-pass filtering s(n) with cut-off frequency f<sub>c</sub> = 1.4f<sub>m</sub>. If there is no severe arrhythmia during the IVUS acquisition, s<sub>l</sub> presents one minimum per cardiac cycle, related to the end-diastolic phase; thus, all frames for this phase are easily and directly extracted. Due to heartbeat period variability along the study, some of these minima can be displaced between s and s<sub>l</sub> because of the lack of high-frequency contributions. To avoid such inconsistencies, we iteratively modify f<sub>c</sub><sup>k</sup> = (k + 0.4)f<sub>m</sub> (k is the current iteration number), recompute s<sub>l</sub><sup>k</sup> with the new cut-off frequency f<sub>c</sub><sup>k</sup> and adjust each minimum of iteration k − 1 to its nearest local minimum in s<sub>l</sub><sup>k</sup>. Interestingly, the iterative scheme helps in cases with mild arrhythmia, i.e., where only a few (non-contiguous) heartbeats of the study present delay or omission of the P-wave. In those cases, the adjustment of the minima either identified the P-waves correctly or collapsed two minima to the same time position (this is the case when a P-wave did not occur and the heartbeat lasted twice its period). In both cases, the minima are correct. In cases with severe arrhythmia, the use of the ECG signal and manual segmentation of the minima is recommended for proper gating. In the phase so obtained, the cardiac contraction is at its minimum, and so it corresponds to the beginning of the cardiac cycle, more precisely the beginning of the cardiac P-wave.

Since the heart frequency changes along the study, the heartbeats are sampled with a variable number of frames. This variability in the heartbeat frequency affects mainly the relaxation process of the heart and, consequently, the length of the T-P interval. Despite this, the P-T interval remains almost invariant. Taking this fact into consideration, the end-diastolic instant for each cardiac cycle will be regarded as a reference for the definition of S cardiac phases. Each available frame of the P-T interval is then associated with a specific cardiac phase, obtaining phase-coherent volume datasets. Further details of the gating method, setup of the mixture parameter α<sub>g</sub> and validation with in-vivo studies are described in Maso Talou et al. (2015).
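As an illustration of the first two gating steps, the sketch below evaluates the motion signal of Equation 1 literally as written and then picks the dominant spectral peak inside the 0.75 to 1.66 Hz band. It is a sketch under stated assumptions rather than the implementation of Maso Talou et al. (2015): the function names, the use of `np.gradient` for the image gradient, the FFT-based peak search and the `frame_rate` argument are all assumptions, and the iterative low-pass refinement of the minima is not reproduced.

```python
import numpy as np

def motion_signal(frames, alpha_g):
    """Motion signal s(n) of Equation 1 for frames of shape (N, H, W).

    Returns N - 1 values because s(n) needs frame n + 1. Frames with zero
    intensity variance would cause a division by zero and are not handled.
    """
    frames = np.asarray(frames, dtype=float)
    n_frames = frames.shape[0]
    s = np.empty(n_frames - 1)
    for n in range(n_frames - 1):
        a = frames[n] - frames[n].mean()
        b = frames[n + 1] - frames[n + 1].mean()
        # Correlation term, written exactly as in Equation 1 (no 1/(H*W) factor).
        correlation = np.sum(a * b) / (frames[n].std() * frames[n + 1].std())
        # Gradient term: negative sum of gradient magnitudes of frame n.
        grad_rows, grad_cols = np.gradient(frames[n])
        gradient_term = -np.sum(np.hypot(grad_rows, grad_cols))
        s[n] = alpha_g * (1.0 - correlation) + (1.0 - alpha_g) * gradient_term
    return s

def mean_cardiac_frequency(s, frame_rate, f_min=0.75, f_max=1.66):
    """Dominant frequency (Hz) of s inside the physiological band [f_min, f_max]."""
    s = np.asarray(s, dtype=float)
    spectrum = np.abs(np.fft.rfft(s - s.mean()))
    freqs = np.fft.rfftfreq(s.size, d=1.0 / frame_rate)
    band = (freqs >= f_min) & (freqs <= f_max)
    return float(freqs[band][np.argmax(spectrum[band])])

# Toy usage: a synthetic 1.2 Hz signal sampled at an assumed 30 frames per second.
t = np.arange(300) / 30.0
print(mean_cardiac_frequency(np.sin(2.0 * np.pi * 1.2 * t), frame_rate=30.0))
```

The returned frequency plays the role of f<sub>m</sub>; the low-pass filter with cut-off 1.4f<sub>m</sub> and the subsequent search for one minimum per cycle would follow from it.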

#### 2.1.2. Registration

All phase-coherent volumes are registered (axially and transversally) against the volume dataset corresponding to the end-diastolic phase. This procedure is performed for each phase-coherent volume. The transversal registration is achieved by finding the in-plane rigid motion for each image in the current phase that best matches the frame image in the end-diastolic phase. To quantify the matching between two images, we use a maximum likelihood estimator presented in Cohen and Dinstein (2002) and Wachinger et al. (2008),

$$c(I_n, I_m) = \sum_{i=1}^{H} \sum_{j=1}^{W} \left[ I_n(i, j) - I_m(i, j) - \log \left( e^{2 \left( I_n(i, j) - I_m(i, j) \right)} + 1 \right) \right]. \tag{2}$$

The rigid motion Ξ<sub>n</sub> for each cross-section is then estimated by solving the following optimization problem

$$\Xi_n = \arg\max_{\Xi^*} c\left(I_n^{\text{D}}, I_n^{s}(x(\Xi^*), y(\Xi^*))\right),\tag{3}$$

where I<sub>n</sub><sup>s</sup> is the n-th cross-section of the phase-coherent volume corresponding to the s-th phase, D denotes the end-diastolic phase, and I(x(Ξ<sup>∗</sup>), y(Ξ<sup>∗</sup>)) is the image I after applying the rigid transformation defined by Ξ<sup>∗</sup>, which is composed of an in-plane translation plus a rotation with respect to the image center.
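A minimal sketch of the transversal registration step is given below. It evaluates the likelihood estimator of Equation 2 and searches exhaustively for the best match, but with strong simplifications that are assumptions of this example and not part of the cited methods: only integer in-plane translations are explored, they are applied with periodic wrap-around through `np.roll`, and the rotation component of the rigid motion is omitted. The per-pixel term d − log(e<sup>2d</sup> + 1) is algebraically equal to −log(e<sup>d</sup> + e<sup>−d</sup>), which is the form used in the code for numerical stability.

```python
import numpy as np

def likelihood(image_a, image_b):
    """Likelihood estimator of Equation 2, summed over all pixels.

    Uses d - log(exp(2d) + 1) = -log(exp(d) + exp(-d)), which avoids overflow.
    """
    d = np.asarray(image_a, dtype=float) - np.asarray(image_b, dtype=float)
    return float(-np.sum(np.logaddexp(d, -d)))

def best_translation(reference, moving, max_shift=3):
    """Exhaustive search for the integer shift (rows, cols) of `moving` that
    maximizes the likelihood against `reference` (translation only)."""
    best_shift, best_score = (0, 0), -np.inf
    for d_row in range(-max_shift, max_shift + 1):
        for d_col in range(-max_shift, max_shift + 1):
            candidate = np.roll(moving, shift=(d_row, d_col), axis=(0, 1))
            score = likelihood(reference, candidate)
            if score > best_score:
                best_shift, best_score = (d_row, d_col), score
    return best_shift, best_score

# Toy usage: a synthetic frame displaced by a known shift (hypothetical data).
rng = np.random.default_rng(0)
end_diastolic = rng.standard_normal((16, 16))
current_phase = np.roll(end_diastolic, shift=(2, -1), axis=(0, 1))
print(best_translation(end_diastolic, current_phase))
```

The per-pixel term attains its maximum when the two intensities coincide, so the search recovers the shift that undoes the synthetic displacement.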

By virtue of the myocardium contraction, the same cross-sectional site at the different phases may be longitudinally displaced. Therefore, it is necessary to perform an axial registration to find the corresponding frames at different phases for the same transversal site. Thus, after transversal registration of all phase-coherent volumes, an axial registration against the end-diastolic phase is applied. For each frame of each phase-coherent volume (now transversally registered), the best matching frame in the end-diastolic volume is sought. To diminish the computational burden, the search is limited to the 14 adjacent frames in the end-diastolic volume, which is within the range of axial displacements of a transducer during the IVUS study (Arbab-Zadeh et al., 1999). To quantify the matching between two images, we use a neighborhood likelihood estimator defined as

$$c_w(I_n^s, I_m^{\text{D}}) = \frac{\sum_{d=-w}^{w} \phi_{\sigma_G}(d) \, c(I_{n+d}^s, I_{m+d}^{\text{D}})}{\sum_{d=-w}^{w} \phi_{\sigma_G}(d)},\tag{4}$$

where φ<sub>σ<sub>G</sub></sub> is a Gaussian weight function with standard deviation σ<sub>G</sub> and w is the number of adjacent frames used to establish the matching between the two sites centered at I<sub>n</sub><sup>s</sup> and I<sub>m</sub><sup>D</sup>, respectively. It is important to note that w is not the search range fixed at 14 frames, but the size of the neighborhood used for each comparison between two frames. Then, the position for axial registration, i.e., the frame of the end-diastolic phase that best matches the current frame I<sub>n</sub><sup>s</sup>, is given by

$$m = \underset{k=n-7,\ldots,n+7}{\arg\max} \ c_w(I_n^s, I_k^{\text{D}}). \tag{5}$$

Finally, given the site of interest at the n-th frame of the end-diastolic phase volume, the set of frames that constitutes a sequence along the cardiac cycle at this site is I = {Ĩ<sub>n</sub><sup>s</sup>, s = 1, ..., S}, where Ĩ<sub>n</sub><sup>s</sup> is the n-th frame of the phase-coherent volume corresponding to phase s after transversal and axial registration. The reader is directed to Maso Talou et al. (2017) and Maso Talou (2017) for further details of the registration methods.
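The axial search of Equations 4 and 5 can be sketched as follows. This is an illustrative reading of the two formulas, not the authors' code: the default values `w=2` and `sigma_g=1.0`, the function names, and the treatment of volume boundaries (out-of-range neighbors are skipped and the weights renormalized) are assumptions of this example. The Gaussian normalization constant is omitted because it cancels in the ratio of Equation 4.

```python
import numpy as np

def likelihood(image_a, image_b):
    """Likelihood estimator of Equation 2 in a numerically stable form."""
    d = np.asarray(image_a, dtype=float) - np.asarray(image_b, dtype=float)
    return float(-np.sum(np.logaddexp(d, -d)))

def neighborhood_likelihood(phase_volume, diastolic_volume, n, m, w, sigma_g):
    """Gaussian-weighted average of frame likelihoods around sites n and m (Equation 4)."""
    numerator, denominator = 0.0, 0.0
    for d in range(-w, w + 1):
        inside_phase = 0 <= n + d < len(phase_volume)
        inside_diastolic = 0 <= m + d < len(diastolic_volume)
        if inside_phase and inside_diastolic:  # boundary handling is an assumption
            weight = np.exp(-0.5 * (d / sigma_g) ** 2)
            numerator += weight * likelihood(phase_volume[n + d], diastolic_volume[m + d])
            denominator += weight
    return numerator / denominator

def axial_match(phase_volume, diastolic_volume, n, w=2, sigma_g=1.0, search=7):
    """Index m of the end-diastolic frame that best matches frame n (Equation 5)."""
    candidates = [k for k in range(n - search, n + search + 1)
                  if 0 <= k < len(diastolic_volume)]
    scores = [neighborhood_likelihood(phase_volume, diastolic_volume, n, k, w, sigma_g)
              for k in candidates]
    return candidates[int(np.argmax(scores))]

# Toy usage: a synthetic volume displaced axially by three frames (hypothetical data).
rng = np.random.default_rng(0)
diastolic = rng.standard_normal((30, 8, 8))
phase = np.roll(diastolic, shift=-3, axis=0)  # phase[n] equals diastolic[n + 3]
print(axial_match(phase, diastolic, n=10))
```

Comparing neighborhoods rather than single frames makes the match less sensitive to noise in any individual image, which is the motivation for the weighted estimator.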

#### 2.1.3. Optical Flow

For a pair (or sequence) of images, optical flow techniques aim at determining the displacement vector field that relates the points of both images (Horn and Schunck, 1981). Because optical flow strategies rely on the gray constancy assumption, a denoising procedure is performed over the sequence. The applied denoising method is a variation of the TV-L1 method (Rudin et al., 1992; Chan et al., 1999) which modifies the data term (absolute difference measurement of the images) by the negative maximum likelihood estimator assuming one image with gamma distributed noise and another noiseless image. Thus, the denoised image I corresponding to the noisy image J is obtained as

$$I = \underset{\bar{I}}{\arg\min} \int_{\Omega} \left[ -\gamma_d \nu_d (J - \bar{I}) + \delta_d^{-\gamma_d} e^{\gamma_d (J-\bar{I})} + \alpha_d |\nabla \bar{I}| \right] d\Omega, \tag{6}$$

where Ω is the image domain, γ<sub>d</sub>, ν<sub>d</sub>, δ<sub>d</sub> are parameters of the generalized gamma distribution that models the noise and α<sub>d</sub> is the regularization parameter for denoising.

Then, the optical flow is estimated for the denoised sequence of images using the method proposed in Brox et al. (2004). Particularly, the flow (i.e., the displacement field) is computed between the end-diastolic frame of the sequence to the other frames, corresponding to the different cardiac phases. Thus, the displacement field **u** OF = (u OF , υ OF) between the end-diastolic frame I <sup>D</sup> and the s-phase frame I s is given by

$$\mathbf{u}^{\text{OF}} = \sum\_{r=1}^{R} \delta \mathbf{u}^{r},\tag{7}$$

where δ**u**<sup>r</sup> is the flow component corresponding to the image resolution r, which is obtained as

$$\delta \mathbf{u}^r = \underset{\delta \mathbf{u}}{\text{arg min}} \int\_{\Omega} \left[ \psi \left( \left\| \frac{\partial I^r}{\partial t} + \nabla I^r \cdot \delta \mathbf{u} \right\|\_{G\_\rho}^2 \right) + \alpha\_o \, \psi \left( \left\| \nabla (\mathbf{u}^{r-1} + \delta \mathbf{u}) \right\|\_F^2 \right) \right] d\Omega,\tag{8}$$

where **u**<sup>r−1</sup> = Σ<sub>t=1</sub><sup>r−1</sup> δ**u**<sup>t</sup>, ‖·‖<sub>F</sub> is the Frobenius norm, and α<sub>o</sub> is the regularization parameter for optical flow. The function ψ and the weighted norm ‖·‖<sub>G<sub>ρ</sub></sub> are defined by

$$\begin{split} \psi(x) &= 2\kappa^2 \sqrt{1 + \frac{x}{\kappa^2}}, \\ \left\| \frac{\partial I^{r}}{\partial t} + \nabla I^{r} \cdot \delta \mathbf{u} \right\|\_{G\_{\rho}}^2 &= G\_{\rho} \ast \left( \frac{\partial I^{r}}{\partial x} \right)^2 \delta u^2 + G\_{\rho} \ast \left( \frac{\partial I^{r}}{\partial y} \right)^2 \delta v^2 \\ &+ G\_{\rho} \ast \left( \frac{\partial I^{r}}{\partial t} \right)^2 + 2 \, G\_{\rho} \ast \left( \frac{\partial I^{r}}{\partial x} \frac{\partial I^{r}}{\partial y} \right) \delta u \, \delta v \\ &+ 2 \, G\_{\rho} \ast \left( \frac{\partial I^{r}}{\partial x} \frac{\partial I^{r}}{\partial t} \right) \delta u + 2 \, G\_{\rho} \ast \left( \frac{\partial I^{r}}{\partial y} \frac{\partial I^{r}}{\partial t} \right) \delta v, \end{split}$$

where G<sub>ρ</sub> is the Gaussian kernel with standard deviation ρ, δu and δv are the components of δ**u**, and ∗ is the convolution operator. Note that the flow **u**<sup>OF</sup> is the displacement field between I<sup>D</sup> and I<sup>s</sup>, so the temporal derivative is estimated as the variation of the intensity between these two frames.
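To make the role of ψ concrete, the following minimal sketch evaluates the penalty and its derivative for a few arguments. The value of κ is an assumption chosen only for illustration (the text does not report it), and the function names are hypothetical. Since the argument of ψ is already a squared quantity, ψ behaves like a quadratic penalty for small residuals and grows only linearly with the residual magnitude for large ones, which is what makes the functional robust to outliers.

```python
import numpy as np

def psi(x, kappa):
    """Penalty psi(x) = 2*kappa^2*sqrt(1 + x/kappa^2); x is a squared residual."""
    x = np.asarray(x, dtype=float)
    return 2.0 * kappa**2 * np.sqrt(1.0 + x / kappa**2)

def dpsi(x, kappa):
    """Derivative psi'(x) = 1/sqrt(1 + x/kappa^2); it down-weights large residuals."""
    x = np.asarray(x, dtype=float)
    return 1.0 / np.sqrt(1.0 + x / kappa**2)

kappa = 0.5  # assumed value, for illustration only
for x in (0.0, 0.01, 1.0, 100.0):
    print(f"x={x:7.2f}  psi={float(psi(x, kappa)):8.4f}  dpsi={float(dpsi(x, kappa)):6.4f}")
```

The derivative is close to 1 for small arguments (plain least squares) and decays toward 0 for large ones, so isolated brightness changes contribute little to the estimated flow.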

This strategy defines all displacement fields along the cardiac sequence with respect to the same reference phase (the end-diastolic phase), which eases the integration of the data into the assimilation process introduced in section 2.3.

#### 2.1.4. Patient-Specific Geometric Model

Using an IVUS study gated at the end-diastolic phase, a geometrical model for a frame of interest is constructed (see **Figure 2**). First, the intima-media area is manually segmented by a specialist from the image using cubic splines to obtain a 2D patient-specific geometry. Then, the 2D geometry is extruded 0.05 mm in the axial direction to render a 3D slice of the arterial vessel. The mesh generation from this geometry is described later in section 2.2.5 when the numerical scheme for the mechanical problem is introduced.

#### 2.2. Mechanical Setup for the Arterial Wall

In this section, the main ingredients from continuum mechanics required to describe the mathematical models are briefly summarized. For further details the reader may refer to Ares (2016) and Blanco et al. (2016).

Let us consider the domain of a cross-sectional slice of the vessel wall. Its spatial configuration in the Euclidean space is denoted by Ω<sub>s</sub>, with boundary ∂Ω<sub>s</sub> = ∂Ω<sub>s</sub><sup>W</sup> ∪ ∂Ω<sub>s</sub><sup>E</sup> ∪ ∂Ω<sub>s</sub><sup>A</sup>, where ∂Ω<sub>s</sub><sup>W</sup> represents the interface between the vessel and the blood, ∂Ω<sub>s</sub><sup>E</sup> the external surface, and ∂Ω<sub>s</sub><sup>A</sup> = ⋃<sub>i=1</sub><sup>2</sup> ∂Ω<sub>s</sub><sup>A,i</sup> stands for the set of 2 cross-sectional (non-physical) axial boundaries of the vessel slice (see **Figure 3**). The unit outward normal vector is denoted by **n**<sub>s</sub>. The coordinates at this configuration are denoted by **x**<sub>s</sub>. A material configuration, used as a reference configuration, is denoted by Ω<sub>m</sub>, with coordinates **x**<sub>m</sub>. In the present context, Ω<sub>s</sub> stands for the configuration at which mechanical equilibrium is achieved for a given load condition (diastolic, systolic or any other loaded state of the arterial wall). Residual stresses are neglected; therefore, the material configuration Ω<sub>m</sub> is both load-free and stress-free.

The displacement field mapping points from the material into the spatial configuration is denoted by **u**. Then, we characterize the deformation mapping from Ω<sub>m</sub> onto Ω<sub>s</sub> and its inverse by the following expressions,

$$\mathbf{x}\_s = \chi\_m(\mathbf{x}\_m) = \mathbf{x}\_m + \mathbf{u}\_m,\tag{9}$$

$$\mathbf{x}\_m = \chi\_s(\mathbf{x}\_s) = \chi\_m^{-1}(\mathbf{x}\_s) = \mathbf{x}\_s - \mathbf{u}\_s,\tag{10}$$

where subscripts m and s denote the descriptions of the fields in the material and spatial configurations, respectively. Thus, the displacement vector field is given by

$$\mathbf{u}\_s(\mathbf{x}\_s) = \left(\mathbf{u}\_m(\mathbf{x}\_m)\right)\_s = \mathbf{u}\_m\left(\chi\_m^{-1}\left(\mathbf{x}\_s\right)\right),\tag{11}$$

and its gradients with respect to material and spatial coordinates are, respectively, obtained as

$$\mathbf{F}\_m = \nabla\_m \chi\_m = \mathbf{I} + \nabla\_m \mathbf{u}\_m,\tag{12}$$

$$\mathbf{f}\_s = \nabla\_s \chi\_s = \nabla\_s \chi\_m^{-1} = \mathbf{I} - \nabla\_s \mathbf{u}\_s. \tag{13}$$

Observe that [**F**<sub>m</sub><sup>−1</sup>]<sub>s</sub> = **f**<sub>s</sub> and [**f**<sub>s</sub><sup>−1</sup>]<sub>m</sub> = **F**<sub>m</sub>. Arterial wall tissues are assumed to behave as incompressible materials, which is mathematically represented by the following kinematic condition

$$\det \mathbf{F}\_m = 1.\tag{14}$$
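The kinematic relations in Equations (12) to (14) can be checked numerically on a homogeneous deformation. The sketch below uses a simple shear, which is an assumption made purely for illustration (it is not a deformation taken from the study), and the function name is hypothetical. For a homogeneous field, the chain rule gives ∇<sub>s</sub>**u**<sub>s</sub> = (∇<sub>m</sub>**u**<sub>m</sub>)**F**<sub>m</sub><sup>−1</sup>.

```python
import numpy as np

def kinematics(grad_m_u):
    """Return (F_m, f_s) for a homogeneous displacement gradient grad_m_u."""
    identity = np.eye(3)
    F_m = identity + grad_m_u                    # Equation (12)
    grad_s_u = grad_m_u @ np.linalg.inv(F_m)     # chain rule for a homogeneous field
    f_s = identity - grad_s_u                    # Equation (13)
    return F_m, f_s

gamma = 0.2  # assumed amount of shear, illustration only
grad_m_u = np.array([[0.0, gamma, 0.0],
                     [0.0, 0.0,   0.0],
                     [0.0, 0.0,   0.0]])
F_m, f_s = kinematics(grad_m_u)

print("f_s equals inverse of F_m:", np.allclose(f_s, np.linalg.inv(F_m)))
print("det F_m:", np.linalg.det(F_m))  # simple shear is volume preserving
```

Simple shear satisfies the incompressibility condition exactly, so it is a convenient test case for any implementation of the constraint in Equation (14).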

In a general case, the load state of the model of an arterial cross-section is characterized as follows. Neumann boundary conditions are considered to be given by the forces exerted by the blood flow over ∂Ω<sub>s</sub><sup>W</sup>, i.e., through a traction field **t**<sub>s</sub><sup>W</sup> which is characterized as **t**<sub>s</sub><sup>W</sup> = p<sub>s</sub>**n**<sub>s</sub> (here we only consider the pressure load, and neglect the shear forces exerted by the blood flow on the vessel wall), and by the tethering tractions **t**<sub>s</sub><sup>A,i</sup> acting over ∂Ω<sub>s</sub><sup>A,i</sup>, i = 1, 2. For ease of notation, the tethering tractions are grouped into **t**<sub>s</sub><sup>A</sup>, which is defined over the whole ∂Ω<sub>s</sub><sup>A</sup>. The action of the surrounding tissues is introduced as an elastic traction over the external boundary, which is characterized by the elastic parameter τ and depends on the displacement field at this boundary (see further details in section 2.2.1). These tractions, representing the influence of the external tissues, only act over the external surface ∂Ω<sub>s</sub><sup>E</sup> in the physiological pressure range, i.e., at end-diastolic pressure or higher. That is, during the preload problem (see section 2.2.2), the boundary ∂Ω<sup>E</sup> is a homogeneous Neumann boundary (except for a small region of arc length 0.1 mm, which is fixed to remove rigid motions).

the intima-media area; (Right) final 3D mesh.

The mechanical problem in variational form is framed as a saddle-point problem to accommodate the incompressibility constraint through the corresponding Lagrange multiplier, i.e., the pressure field in the solid domain.

Next, two variational formulations are presented which formalize the concept of mechanical equilibrium for the so-called preload and forward problems. In the preload problem, the known configuration is that one at which the body is at equilibrium (the spatial domain), and the unknown configuration is the material configuration used to define the constitutive equations. In the forward problem, the known configuration is the material one, while the unknown configuration is the one where equilibrium actually occurs.

#### 2.2.1. Forward Problem

When the material (load- and stress-free) configuration Ω<sub>m</sub> is known, the variational Equation (16) can be cast in the material domain, yielding what we define as the forward problem. The variational formulation then reads: given the material description of the loads, p<sub>m</sub> and **t**<sub>m</sub><sup>A,i</sup>, find (**u**<sub>m</sub>, λ<sub>m</sub>) in U<sub>m</sub> × L<sub>m</sub> such that

$$\begin{split} &\int\_{\Omega\_{m}} (1-\det\mathbf{F}\_{m})\hat{\lambda}\_{m} \, d\Omega\_{m} - \int\_{\Omega\_{m}} \lambda\_{m} \left(\mathbf{F}\_{m}^{-T} \cdot \nabla\_{m}\hat{\mathbf{u}}\_{m}\right) \det\mathbf{F}\_{m} \, d\Omega\_{m} \\ &\quad + \int\_{\Omega\_{m}} (\mathbf{S}\_{m}(\mathbf{E}\_{m})) \cdot \dot{\mathbf{E}}\left(\hat{\mathbf{u}}\_{m}\right) \, d\Omega\_{m} \\ &= \int\_{\partial\Omega\_{m}^{E}} \tau\left(\mathbf{u}\_{m} - (\mathbf{u}\_{d} + \mathbf{u}^{\mathrm{OF}})\_{m}\right) \cdot \hat{\mathbf{u}}\_{m} \, |\mathbf{F}\_{m}^{-T}\mathbf{n}\_{0}^{E}| \, \det\mathbf{F}\_{m} \, d\partial\Omega\_{m}^{E} \\ &\quad + \int\_{\partial\Omega\_{m}^{W}} \left(p\_{m}\mathbf{F}\_{m}^{-T}\mathbf{n}\_{0}^{W} \cdot \hat{\mathbf{u}}\_{m}\right) \det\mathbf{F}\_{m} \, d\partial\Omega\_{m}^{W} \\ &\quad + \sum\_{i=1}^{2} \int\_{\partial\Omega\_{m}^{A,i}} \left(\mathbf{t}\_{m}^{A,i} \cdot \hat{\mathbf{u}}\_{m}\right) |\mathbf{F}\_{m}^{-T}\mathbf{n}\_{0}^{A,i}| \det \mathbf{F}\_{m} \,d\partial\Omega\_{m}^{A,i} \\ &\quad \forall (\hat{\mathbf{u}}\_{m}, \hat{\lambda}\_{m}) \in \mathcal{V}\_{m} \times \mathcal{L}\_{m}, \end{split} \tag{15}$$

where **Ė**(**û**<sub>m</sub>) = ½[**F**<sub>m</sub><sup>T</sup>(∇<sub>m</sub>**û**<sub>m</sub>) + (∇<sub>m</sub>**û**<sub>m</sub>)<sup>T</sup>**F**<sub>m</sub>] and **n**<sub>0</sub> is the unit outward normal vector in the material configuration. Recall that τ is the elastic parameter characterizing the response of the surrounding media, **u**<sup>OF</sup> is the displacement field which maps the end-diastolic configuration to the spatial configuration where equilibrium is achieved (see Equation 7), and **u**<sub>d</sub> is the displacement field which maps points from the material to the end-diastolic configuration. Also, U<sub>m</sub>, V<sub>m</sub>, and L<sub>m</sub> are the counterparts of U<sub>s</sub>, V<sub>s</sub>, and L<sub>s</sub>, respectively, with functions defined in Ω<sub>m</sub>.

Acceleration terms have also been neglected, since the stresses associated with such inertial forces are much smaller than those of constitutive origin (Ares, 2016; Blanco et al., 2016).

#### 2.2.2. Preload Problem

Given the equilibrium configuration Ω<sub>s</sub>, the variational formulation reads: given the loads t<sub>s</sub><sup>W,n</sup> and **t**<sub>s</sub><sup>A</sup>, find (**u**<sub>s</sub>, λ<sub>s</sub>) ∈ U<sub>s</sub> × L<sub>s</sub> such that

$$\begin{split} &\int\_{\Omega\_{s}} \left[ -\lambda\_{s} \operatorname{div} \hat{\mathbf{u}}\_{s} + \boldsymbol{\sigma}\_{s} \cdot \boldsymbol{\varepsilon}\_{s} \left( \hat{\mathbf{u}}\_{s} \right) \right] d\Omega\_{s} - \int\_{\Omega\_{s}} [1 - \det \mathbf{F}\_{s}^{-1}] \hat{\lambda}\_{s} \, d\Omega\_{s} \\ &\quad = \int\_{\partial\Omega\_{s}^{W}} t\_{s}^{W, n} \mathbf{n}\_{s} \cdot \hat{\mathbf{u}}\_{s} \, d\partial\Omega\_{s}^{W} + \sum\_{i=1}^{2} \int\_{\partial\Omega\_{s}^{A, i}} \mathbf{t}\_{s}^{A, i} \cdot \hat{\mathbf{u}}\_{s} \, d\partial\Omega\_{s}^{A, i} \\ &\quad \forall (\hat{\mathbf{u}}\_{s}, \hat{\lambda}\_{s}) \in \mathcal{V}\_{s} \times \mathcal{L}\_{s}, \end{split} \tag{16}$$

where **ε**<sub>s</sub>(**û**) = ½(∇<sub>s</sub>**û** + ∇<sub>s</sub>**û**<sup>T</sup>) is the strain rate tensor, L<sub>s</sub> = L<sup>2</sup>(Ω<sub>s</sub>) and U<sub>s</sub> = {**u**<sub>s</sub> ∈ **H**<sup>1</sup>(Ω<sub>s</sub>), **u**<sub>s</sub> satisfies essential b.c.} are, respectively, the linear space for pressures and the linear manifold for kinematically admissible displacements, and V<sub>s</sub> = {**û**<sub>s</sub> ∈ **H**<sup>1</sup>(Ω<sub>s</sub>), **û**<sub>s</sub> satisfies homogeneous essential b.c.} is the space of kinematically admissible variations. Also, **σ**<sub>s</sub> is related to the second Piola-Kirchhoff stress tensor **S**<sub>m</sub> through

$$\boldsymbol{\sigma}\_{s} = \frac{1}{\det \mathbf{F}\_{s}} \mathbf{F}\_{s} (\mathbf{S}\_{m}(\mathbf{E}\_{m}))\_{s} \mathbf{F}\_{s}^{T}. \tag{17}$$

where **S**<sub>m</sub> is a function of the Green-Lagrange deformation tensor **E**<sub>m</sub> = ½(**F**<sub>m</sub><sup>T</sup>**F**<sub>m</sub> − **I**) via a constitutive equation (see section 2.2.4).

In this work, the preload problem is used to obtain the material configuration that enables an appropriate calculation of the stress field that realizes the equilibrium in the end-diastolic configuration. Note that in this case the action of the surrounding media is omitted, because our hypothesis takes the end-diastolic configuration as the reference configuration for the elastic response of the external tissues.

#### 2.2.3. Equilibrium Problems for a Given Set of Material Parameters

The preload problem is a mandatory step toward characterizing the mechanical state (the stress state) of the arterial wall in a geometry obtained from medical images (e.g., the end-diastolic geometry) with given baseline hemodynamic loads. In this context, these loads are given by the end-diastolic pressure, also called preload pressure, and by the axial stretch caused by tethering forces. The material configuration is required because it is used to define the constitutive equations, without which the forward problem cannot be properly formulated. In our case, the baseline geometry is obtained from an IVUS study, while the baseline hemodynamic loads (the blood pressure) are estimated from patient-specific data. Once the preload problem is solved, the baseline mechanical state, that is, the stress state due to the preload pressure (i.e., the pressure at diastole), is adequately determined, and the displacement field **u**<sub>s</sub> (which maps the material, load-free configuration to the diastolic configuration) is denoted by **u**<sub>d</sub>. Then, the forward problem is solved to determine the equilibrium configuration for the other hemodynamic loads occurring during the cardiac cycle. In this manner, both problems are coupled so that the forward problem is solved from an adequately preloaded configuration.

In practice, a set of physiological loads for the vessel will be given. Denoting the diastolic pressure level by p<sub>s</sub>, a set of pressure loads between diastole and systole can be listed as {p<sub>s<sub>1</sub></sub>, ..., p<sub>s<sub>S</sub></sub>}. Through the forward problem, each load p<sub>s<sub>i</sub></sub> will be in correspondence with an unknown spatial configuration Ω<sub>s<sub>i</sub></sub>, i = 1, ..., S. Notice then that, for a given set of material parameters, the preload problem is solved only once, whereas the forward problem is solved once for each load p<sub>s<sub>i</sub></sub> in the set of physiological loads.

#### 2.2.4. Constitutive Models

The main components of the atherosclerotic plaque, i.e., fibrotic, lipidic and calcified tissues, are modeled as isotropic Neo-Hookean materials. In Walsh et al. (2014), it is shown that fibrotic tissue in iliac plaque presents a quasi-isotropic behavior. Unlike the fibrotic tissue, the lipidic and calcified tissues do not display any contribution of smooth muscle cells or oriented fibers that may endow their structures with anisotropic behavior, which suggests that an isotropic hypothesis for these materials is reasonable.

The isotropic Neo-Hookean model is suitable for materials under large deformations where the stress-strain relationship behaves as non-linear, elastic, isotropic and independent of strain rate. Also, the model assumes an ideal elastic material at every strain level which, for physiological ranges, is satisfied by many biological tissues. The stress-strain relationship for a Neo-Hookean material derives from the strain energy function

$$\Psi = \frac{c}{2}(\tilde{I}\_1 - 3),\tag{18}$$

where c is the material parameter that characterizes the stiffness of the material and Ĩ<sub>1</sub> is the first isochoric invariant of the Cauchy-Green tensor

$$\tilde{I}\_1 = \text{Tr}\left(\mathbf{C}\_m(\det \mathbf{F}\_m)^{-2/3}\right),\tag{19}$$

with **C**<sub>m</sub> = **F**<sub>m</sub><sup>T</sup>**F**<sub>m</sub>. Then, the second Piola-Kirchhoff stress tensor (and the **σ**<sub>m</sub> through Equation 17) is obtained as

$$\mathbf{S}\_m(\mathbf{E}\_m) = \frac{\partial \Psi}{\partial \mathbf{E}\_m}.\tag{20}$$
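As an illustration of Equations (18) to (20), the sketch below evaluates the strain energy and the corresponding second Piola-Kirchhoff stress for a given deformation gradient. The closed form used for the stress, c J<sup>−2/3</sup>(**I** − (tr **C**/3) **C**<sup>−1</sup>) with J = det **F**, follows from differentiating Equation (18) with respect to **E** and is stated here as a derivation of ours, not as the implementation of the in-house solver. It does not include the pressure contribution, which enters through the Lagrange multiplier. Function names and the test deformation are assumptions.

```python
import numpy as np

def strain_energy(F, c):
    """Neo-Hookean energy Psi = c/2 * (I1_iso - 3), Equations (18) and (19)."""
    C = F.T @ F
    J = np.linalg.det(F)
    I1_iso = np.trace(C) * J ** (-2.0 / 3.0)
    return 0.5 * c * (I1_iso - 3.0)

def second_piola_kirchhoff(F, c):
    """S = dPsi/dE = c * J^(-2/3) * (I - tr(C)/3 * C^-1), Equation (20)."""
    C = F.T @ F
    J = np.linalg.det(F)
    return c * J ** (-2.0 / 3.0) * (np.eye(3) - np.trace(C) / 3.0 * np.linalg.inv(C))

c = 5.0e5    # Pa, stiffness of the order used for fibrotic tissue in section 3.1
lam = 1.1    # assumed axial stretch (10 %), isochoric uniaxial deformation
F = np.diag([lam, lam ** -0.5, lam ** -0.5])

print("Psi [Pa]:", strain_energy(F, c))
print("S [Pa]:\n", second_piola_kirchhoff(F, c))
```

Two quick sanity checks are that both quantities vanish for **F** = **I**, and that **S** : **C** = 0, since the isochoric part of the response carries no volumetric stress.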

#### 2.2.5. Numerical Methods

The preload and forward problems are linearized using the Newton-Raphson method. Linear tetrahedral finite elements for both displacement and pressure fields are used for the spatial discretization of the corresponding linearized problems. To stabilize the problem in the sense of the inf-sup condition, the linearized (forward and preload) problems are modified by adding a diffusive term to the pressure equation. For the analysis of the proposed approach, four patient-specific 3D geometries were obtained using the technique described in section 2.1.4. These geometries were discretized with Netgen 3D using a characteristic element size ranging from 10 µm to 40 µm, resulting in meshes with 6,521, 7,516, 4,835, and 3,808 nodes for cases 1–4, respectively. All these steps are performed using an in-house solver. The resulting systems of linear equations are solved using a direct solver based on LU factorization from the SuperLU library (Li and Demmel, 2003). Further details regarding the linearization and numerical schemes can be found in Ares (2016) and Blanco et al. (2016).

The Newton iterative scheme in both equilibrium problems finishes when ‖**u**<sub>s</sub><sup>n+1</sup> − **u**<sub>s</sub><sup>n</sup>‖<sub>L<sup>∞</sup></sub> < 10<sup>−4</sup> mm and ‖λ<sub>s</sub><sup>n+1</sup> − λ<sub>s</sub><sup>n</sup>‖<sub>L<sup>∞</sup></sub> < 1 Pa. This convergence criterion was chosen to yield a higher precision than the optical flow processing applied to IVUS images (16 · 10<sup>−3</sup> mm assuming pixel precision).
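The stopping rule above can be sketched on a toy nonlinear system. The residual and Jacobian below are hypothetical stand-ins for the assembled finite element operators, and a dense solve replaces the sparse LU factorization. Only the structure of the loop and the maximum-norm test on the increment are meant to mirror the text.

```python
import numpy as np

def newton(residual, jacobian, x0, tol=1e-4, max_iter=50):
    """Newton-Raphson iteration stopped when the increment is below tol in the max norm."""
    x = np.asarray(x0, dtype=float)
    for iteration in range(1, max_iter + 1):
        dx = np.linalg.solve(jacobian(x), -residual(x))  # linearized problem
        x = x + dx
        if np.max(np.abs(dx)) < tol:                     # L-infinity criterion
            return x, iteration
    raise RuntimeError("Newton iteration did not converge")

# Toy problem (assumption): two uncoupled nonlinear equations.
toy_residual = lambda x: np.array([x[0] ** 2 - 2.0, x[1] ** 3 - 8.0])
toy_jacobian = lambda x: np.diag([2.0 * x[0], 3.0 * x[1] ** 2])

solution, n_iterations = newton(toy_residual, toy_jacobian, [1.0, 1.0])
print(solution, n_iterations)
```

In the actual solver, separate tolerances are applied to the displacement and pressure increments because the two fields have different units.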

#### 2.3. Data Assimilation

In the data assimilation process, the displacement field **u**<sup>OF</sup> obtained using the optical flow technique explained in section 2.1 and the mechanical models presented in the previous section (section 2.2) are integrated by an unscented Kalman filter. Let us define a partition of the domain of analysis Ω<sub>s</sub> = ⋃<sub>j=1</sub><sup>M</sup> Ω<sub>s</sub><sup>j</sup> composed of M disjoint regions. Each region Ω<sub>s</sub><sup>i</sup> is characterized by its own material parameter, say c<sub>i</sub> (see Equation 18). The axial loads **t**<sub>s<sub>i</sub></sub><sup>A,n</sup>, the pressure level p<sub>s<sub>i</sub></sub> and the displacement fields **u**<sub>s<sub>i</sub></sub><sup>OF</sup> (obtained by optical flow techniques) are known at S cardiac phases (i = 1, ..., S). Since our mechanical problem is time-independent, the time instants in the context of the Kalman filter simply correspond to filter iterations, while at each iteration all forward problems must be solved. By using the mechanical constitutive models, the material parameters, grouped as θ = (c<sub>1</sub>, ..., c<sub>M</sub>), are estimated such that

$$\theta = \underset{\hat{\theta}}{\text{arg min}} \sum\_{i=1}^{S} \|\mathbf{u}\_{s\_i}^{\text{MO}}(\hat{\theta}) - \mathbf{u}\_{s\_i}^{\text{OF}}\|\_{L\_2}^2,\tag{21}$$

where **u**<sub>s<sub>i</sub></sub><sup>MO</sup>(θ̂) is the displacement field at the configuration Ω<sub>s<sub>i</sub></sub> obtained by solving the preload and forward problems (described in section 2.2) with pressure level p<sub>s<sub>i</sub></sub> and material parameters θ̂.

The solution of the parameter identification problem, Equation (21), satisfies the discrete dynamic nonlinear system presented as follows

$$\begin{aligned} X\_k^a &= f(X\_{k-1}^a, t\_{k-1}) + W\_k, \\ Z\_k &= h(X\_k^a, t\_k) + V\_k, \end{aligned} \tag{22}$$

where X<sub>k</sub><sup>a</sup> is the augmented state vector

$$X\_k^a = \begin{bmatrix} \mathbf{u}\_{s\_1}^k(\mathbf{x}), \dots, \mathbf{u}\_{s\_S}^k(\mathbf{x}), \lambda\_{s\_1}^k(\mathbf{x}), \dots, \lambda\_{s\_S}^k(\mathbf{x}), c\_1, \dots, c\_M \end{bmatrix}^T,\tag{23}$$

which contains the displacement **u**<sub>s<sub>i</sub></sub> and pressure λ<sub>s<sub>i</sub></sub> fields for all forward problems i = 1, ..., S, and the material parameters of all regions of the domain θ = (c<sub>1</sub>, ..., c<sub>M</sub>); f(X<sub>k</sub><sup>a</sup>, t<sub>k</sub>) is the operator that sequentially solves the preload and all forward problems for the parameters and initial state conditions in X<sub>k</sub><sup>a</sup> at filter iteration t<sub>k</sub> (recall that these problems are time-independent, and so the dependence on time is ruled out in practice); W<sub>k</sub> are the model errors at the k-th step; h(X<sub>k</sub><sup>a</sup>, t<sub>k</sub>) = **H**X<sub>k</sub><sup>a</sup> is a linear observation operator represented by the block matrix

$$\mathbf{H} = \begin{bmatrix} \mathbf{I}\_{\mathbf{u}\mathbf{u}} & \mathbf{0}\_{\mathbf{u}\lambda} & \mathbf{0}\_{\mathbf{u}\theta} \end{bmatrix},\tag{24}$$

where block matrix indexes indicate the corresponding dimensions; Z is the set of optical flow observations at each cardiac phase, described by the column vector

$$Z = \begin{bmatrix} \mathbf{u}\_{s\_1}^{\text{OF}}(\mathbf{x}), \dots, \mathbf{u}\_{s\_S}^{\text{OF}}(\mathbf{x}) \end{bmatrix}^T,\tag{25}$$

where **u**<sub>s<sub>i</sub></sub><sup>OF</sup>(**x**) is the displacement field obtained by the optical flow technique for the cardiac phase i, i = 1, ..., S (observe that for the present case of static problems, the observations are fixed with respect to the dynamics of the data assimilation process); V is the vector of optical flow and interpolation errors for the observation vector Z.

To obtain an estimate of the parameters θ, a reduced-order unscented Kalman filter (ROUKF) (Julier and Uhlmann, 2002, 2004) is applied to the system described in Equation (22). The filter comprises the following steps:

1. Spherical sigma-points generation σ<sub>i</sub><sup>(n)</sup>, i = 1, ..., M + 1, with their corresponding weights w<sup>(i)</sup> (see Julier, 2003), and initialization of the variables

$$\mathbf{R}\_0 = \sigma\_{\rm OF} \mathbf{I}\_{\rm uu}; \quad \mathbf{L}\_0 = \begin{bmatrix} \mathbf{L}\_0^X \\ \mathbf{L}\_0^\theta \end{bmatrix} = \begin{bmatrix} \mathbf{L}\_0^u \\ \mathbf{L}\_0^\lambda \\ \mathbf{L}\_0^\theta \end{bmatrix} = \begin{bmatrix} \mathbf{0}\_{\rm u\theta} \\ \mathbf{0}\_{\lambda\theta} \\ \mathbf{I}\_{\theta\theta} \end{bmatrix};$$

$$\mathbf{U}\_0^{-1} = \begin{bmatrix} \sigma\_{\hat{c}\_1} & \dots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \dots & \sigma\_{\hat{c}\_M} \end{bmatrix},\tag{26}$$

$$X\_0^a = [\hat{X}\_0^+, \hat{\theta}\_0^+]^T = [\mathbf{0}\_\mathbf{u}, \mathbf{0}\_\lambda, \hat{\theta}\_0]^T,\tag{27}$$

$$\mathbf{P}\_0^+ = \mathbf{L}\_0 \mathbf{U}\_0^{-1} \mathbf{L}\_0^T,\tag{28}$$

where σ<sub>OF</sub> is the uncertainty of the computed optical flow and σ<sub>ĉ<sub>i</sub></sub> is the uncertainty of the parameter c<sub>i</sub>, i = 1, ..., M. The sensitivity of the estimates to these uncertainty values is studied in section 3.1.

2. The prediction step

$$\begin{aligned} \hat{X}\_{k-1}^{(i)} &= \hat{X}\_{k-1}^{+} + \mathbf{L}\_{k-1}^{X} \sqrt{\mathbf{U}\_{k-1}^{-1}} \sigma\_{i}^{(n)}, \quad i = 1, \ldots, M+1, \\ \hat{\theta}\_{k-1}^{(i)} &= \hat{\theta}\_{k-1}^{+} + \mathbf{L}\_{k-1}^{\theta} \sqrt{\mathbf{U}\_{k-1}^{-1}} \sigma\_{i}^{(n)}, \quad i = 1, \ldots, M+1, \\ \begin{bmatrix} \hat{X}\_{k}^{(i)} \\ \hat{\theta}\_{k}^{(i)} \end{bmatrix} &= f\left(\begin{bmatrix} \hat{X}\_{k-1}^{(i)} \\ \hat{\theta}\_{k-1}^{(i)} \end{bmatrix}, t\_{k-1}\right), \\ \hat{X}\_{k}^{-} &= \sum\_{i=1}^{M+1} w^{(i)} \hat{X}\_{k}^{(i)}, \quad \hat{\theta}\_{k}^{-} = \sum\_{i=1}^{M+1} w^{(i)} \hat{\theta}\_{k}^{(i)}, \quad \hat{Z}\_{k} = \sum\_{i=1}^{M+1} w^{(i)} \hat{Z}\_{k}^{(i)}. \end{aligned} \tag{29}$$

3. The correction step

$$\begin{split} \mathbf{L}\_{k}^{X} &= \hat{\mathbf{X}}\_{k}^{(\*)} \mathbf{D}\_{\mathbf{w}} (\sigma^{(\*)})^{T}, \quad \mathbf{L}\_{k}^{\theta} = \hat{\theta}\_{k}^{(\*)} \mathbf{D}\_{\mathbf{w}} (\sigma^{(\*)})^{T}, \\ \{\mathbf{HL}\}\_{k} &= \hat{\mathbf{Z}}\_{k}^{(\*)} \mathbf{D}\_{\mathbf{w}} (\sigma^{(\*)})^{T}, \\ \mathbf{P}\_{\mathbf{w}} &= \sigma^{(\*)} \mathbf{D}\_{\mathbf{w}} (\sigma^{(\*)})^{T}, \\ \mathbf{U}\_{k} &= \mathbf{P}\_{\mathbf{w}} + \{\mathbf{HL}\}\_{k}^{T} \mathbf{R}\_{k}^{-1} \{\mathbf{HL}\}\_{k}, \\ \hat{\mathbf{X}}\_{k}^{+} &= \hat{\mathbf{X}}\_{k}^{-} + \mathbf{L}\_{k}^{X} \mathbf{U}\_{k}^{-1} \{\mathbf{HL}\}\_{k}^{T} \mathbf{R}\_{k}^{-1} \left(\mathbf{Z} - \hat{\mathbf{Z}}\_{k}\right), \\ \hat{\theta}\_{k}^{+} &= \hat{\theta}\_{k}^{-} + \mathbf{L}\_{k}^{\theta} \mathbf{U}\_{k}^{-1} \{\mathbf{HL}\}\_{k}^{T} \mathbf{R}\_{k}^{-1} \left(\mathbf{Z} - \hat{\mathbf{Z}}\_{k}\right). \end{split} \tag{30}$$

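The matrices σ<sup>(∗)</sup>, **X̂**<sub>k</sub><sup>(∗)</sup>, **Ẑ**<sub>k</sub><sup>(∗)</sup>, θ̂<sub>k</sub><sup>(∗)</sup> are the M × (M + 1) matrices whose columns are the vectors σ<sup>(i)</sup>, X̂<sub>k</sub><sup>(i)</sup>, Ẑ<sub>k</sub><sup>(i)</sup>, θ̂<sub>k</sub><sup>(i)</sup> with i = 1, ..., M + 1, respectively. **D**<sub>w</sub> is the diagonal (M + 1) × (M + 1) matrix with values D<sub>ii</sub> = w<sup>(i)</sup>, i = 1, ..., M + 1, i.e., the sigma-point weights.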

4. If the stopping criterion is not met, set k = k + 1 and go to step 2.

In this iterative scheme, the model errors W<sub>k</sub> (inaccuracies in the solution of the preload and forward problems) have been neglected. The stopping criterion used in this work is a fixed number of iterations, which is reported for each study case in section 3.

In this work, c was reparametrized as c = 2<sup>θ̂</sup> (this approach was introduced in Bertoglio et al., 2012), allowing θ̂ to vary over the whole of ℝ (as occurs in the presented formulation, Equations 29 and 30) without delivering invalid values for c.<sup>1</sup>
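The filter steps and the reparametrization c = 2<sup>θ̂</sup> can be sketched for the simplest possible setting. The code below is a toy under strong assumptions and not the authors' implementation: a single parameter (M = 1), no state variables (only the θ block of Equations 29 and 30 is kept), two sigma points ±1 with equal weights (which have zero mean and unit covariance), and a hypothetical scalar "forward model" in which the displacement is proportional to load divided by stiffness. The uncertainties are passed as variances, which is also an assumption.

```python
import numpy as np

def estimate_theta(observe, z, theta0, var_theta, var_z, n_iter=30):
    """Parameter-only reduced-order UKF for one parameter (toy sketch)."""
    sigma = np.array([[-1.0, 1.0]])          # M x (M+1) sigma points, M = 1
    w = np.array([0.5, 0.5])                 # sigma-point weights
    D = np.diag(w)
    P_w = sigma @ D @ sigma.T                # equals the 1x1 identity here
    R_inv = np.eye(z.size) / var_z
    theta = np.array([theta0], dtype=float)
    L = np.eye(1)                            # L_0^theta
    U_inv = np.array([[var_theta]])          # U_0^{-1}
    for _ in range(n_iter):
        # Prediction step: sample sigma points and propagate them through the model.
        thetas = theta[:, None] + L @ np.linalg.cholesky(U_inv) @ sigma
        Z = np.column_stack([observe(t) for t in thetas[0]])
        theta_minus = thetas @ w
        z_hat = Z @ w
        # Correction step.
        L = thetas @ D @ sigma.T
        HL = Z @ D @ sigma.T
        U_inv = np.linalg.inv(P_w + HL.T @ R_inv @ HL)
        theta = theta_minus + L @ U_inv @ HL.T @ R_inv @ (z - z_hat)
    return float(theta[0])

# Hypothetical forward model: displacement = load / c, with c = 2**theta.
loads = np.array([1.0, 1.25, 1.5])
observe = lambda theta: loads * 2.0 ** (-theta)

z_true = observe(0.0)                        # synthetic data for c = 1
theta_hat = estimate_theta(observe, z_true, theta0=1.0, var_theta=0.25, var_z=1e-6)
print("estimated c:", 2.0 ** theta_hat)
```

Because θ̂ is unconstrained while c = 2<sup>θ̂</sup> is always positive, the correction step can never produce a negative stiffness, which is the motivation for the reparametrization.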

#### 2.4. Parallelization Scheme

<sup>1</sup>The reparametrization c = 2<sup>θ̂</sup> modifies the assumed distribution of the parameter from a normal distribution to a log-normal distribution. The mean and covariance of θ are propagated in a different fashion than the mean and covariance of c. Nevertheless, there is a similar statistical interpretation for c using these descriptors in an exponential space of coordinates. For example, a covariance of σ = 1 must be understood as giving the same probability of c being half or twice its initial value.

The data assimilation scheme is a computationally demanding task. However, it comprises many independent or weakly dependent tasks. Firstly, notice that all sigma-point predictions can be computed in parallel. As the forward problem is static, all forward problems (one per load, for S different loads) are computed in parallel and an extended observation vector Ẑ<sup>(i)</sup> = [Ẑ<sup>(i),1</sup>, ..., Ẑ<sup>(i),S</sup>]<sup>T</sup> for the i-th sigma-point is created by appending the predicted displacements Ẑ<sup>(i),j</sup> of the j-th forward problem corresponding to the pressure load p<sub>s<sub>j</sub></sub>. In that manner, at each Kalman iteration, the observations of all frames are processed at once. In turn, the forward problem itself is parallelized by partitioning the mesh and communicating among subdomains the results of the local operations in both the assembling and solving stages. Partitioning is accomplished using ParMETIS (Karypis and Kumar, 1998), and the solution is achieved using the SuperLU library (Li and Demmel, 2003). Following this parallelization scheme, and assuming there are enough computational resources, the cost per iteration of the data assimilation process equals the cost of computing one preload problem plus one forward problem, regardless of the number of cardiac phases or sigma-points employed (i.e., regardless of the number of parameters to be estimated). Note that the cost of the Kalman filter itself increases as more parameters are estimated, although this increment is insignificant when compared to the computations required for solving the mechanical equilibrium problems (only a few dozen parameters will be required in the worst case). In **Figure 4**, the activity diagram for the proposed parallel scheme is presented. Thus, the data assimilation process is HPC-ready and capable of handling large-scale FEM problems.
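The two outer levels of parallelism described above (over sigma-points and over loads) can be sketched as a flat list of independent jobs. The snippet is only a structural illustration: `toy_forward` is a hypothetical stand-in for a forward solve, and a thread pool is used for brevity, whereas a production code would distribute the jobs over processes or MPI ranks and would additionally partition each finite element problem.

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def toy_forward(theta, load):
    """Hypothetical forward solve: two 'nodal displacements' proportional to load / c."""
    c = 2.0 ** theta
    return np.array([load / c, 0.5 * load / c])

def predict_observations(thetas, loads, forward=toy_forward, max_workers=4):
    """Return one extended observation vector per sigma point.

    Every (sigma point, load) pair is an independent job; the results of the S
    loads are concatenated in load order for each sigma point.
    """
    jobs = [(i, j) for i in range(len(thetas)) for j in range(len(loads))]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in the order of the submitted jobs.
        results = list(pool.map(lambda ij: forward(thetas[ij[0]], loads[ij[1]]), jobs))
    n_loads = len(loads)
    return [np.concatenate(results[i * n_loads:(i + 1) * n_loads])
            for i in range(len(thetas))]

Z_hat = predict_observations(thetas=[0.0, 1.0], loads=[80.0, 120.0])
for i, z in enumerate(Z_hat):
    print(f"sigma point {i}: {z}")
```

With (M + 1) · S workers available, the wall-clock time of this stage is that of a single forward solve, which is the property the cost estimate in the text relies on.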

### 3. RESULTS

In what follows, sensitivity analyses are carried out to study the variation of the parameter estimation with respect to the parameter uncertainties, the boundary conditions and the baseline stress state of the mechanical model (sections 3.1, 3.2, and 3.3, respectively). From these analyses, a reasonable setup of the data assimilation parameters and mechanical conditions is obtained for the present context of material identification in patient-specific models. Finally, in section 3.4, four patient-specific mechanical models are derived from in-vivo IVUS studies, and the displacement errors between the model predictions (with parameters adjusted by data assimilation) and the optical flow observations are assessed.

### 3.1. Uncertainty Parameters Sensitivity

Let us define a homogeneous ring-shaped domain Ω<sub>s</sub> with Neo-Hookean constitutive behavior (see Equation 18). The inner and outer radii of the ring are 2 and 2.71 mm, respectively. The size and proportions are chosen to approximate an idealized coronary artery. Loads of t<sup>W,n</sup> = 80 mmHg and t<sup>W,n</sup> = 120 mmHg are applied over the inner surface for the preload and forward problems, respectively, and tethering tractions **t**<sub>s</sub><sup>A,i</sup> are considered such that an axial stretch of 10% is prescribed. At the outer surface, homogeneous Neumann boundary conditions are assumed (τ = 0 in Equation 16). To avoid rigid movements in this idealized geometry, only radial displacement is allowed for 4 equidistant nodes at the luminal perimeter. The forward operator f, which comprises the preload and forward problems (see Equations 15, 16), is solved at each filter iteration with an iterative scheme where a Newton-Raphson linearization procedure is applied as described in section 2.2.5 (further details in Blanco et al., 2016).

Using this setting, we create an in-silico experiment to analyze: (i) the sensitivity of the parameter estimates θ̂ with respect to σ<sub>Z</sub> (the observation uncertainty, previously referred to as σ<sub>OF</sub>); and (ii) the sensitivity of the parameter estimates θ̂ with respect to σ<sub>θ</sub> (the estimate uncertainty). Thus, the observations are generated by computing Z = h(f(X<sup>t</sup>)), where X<sup>t</sup> = [**0**<sub>u</sub>, **0**<sub>λ</sub>, θ<sup>t</sup>] is the true augmented state vector with the solution parameters c<sup>t</sup> = 2<sup>θ<sup>t</sup></sup> for the experiment. In this particular case, the domain is homogeneous and the constitutive model has only one parameter (c), so only one parameter is estimated.

To analyze the sensitivity of θ̂ with respect to the observation uncertainty σ<sub>Z</sub>, the estimation of the parameter is performed assuming different values of σ<sub>Z</sub>, ranging from 10<sup>−1</sup> to 10<sup>−5</sup> mm. Also, three different materials are used for the ring, mimicking cellular fibrotic tissue (c<sup>t</sup> = 5 · 10<sup>5</sup> Pa), lipidic tissue (c<sup>t</sup> = 1 · 10<sup>5</sup> Pa) and calcified tissue (c<sup>t</sup> = 4 · 10<sup>6</sup> Pa). The estimation of the Kalman filter for all 15 cases is presented in **Figure 5**. The results showed that in all cases the parameter uncertainty interval [2<sup>θ̂ − √diag(**U**<sup>−1</sup>)</sup>; 2<sup>θ̂ + √diag(**U**<sup>−1</sup>)</sup>] encloses the true parameter value c<sup>t</sup>. Nevertheless, the closest estimate across the three materials is obtained for σ<sub>Z</sub> = 10<sup>−3</sup> mm, which seems reasonable as it is the precision of the displacements delivered by the convergence process in solving the nonlinear operator f.
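The uncertainty interval used above is symmetric in θ̂ but multiplicative in c. The short sketch below makes this explicit. The function name is hypothetical and the input variance is an assumed value rather than a result from the study.

```python
import math

def stiffness_interval(theta_hat, var_theta):
    """Map theta_hat +/- one standard deviation to stiffness values c = 2**theta."""
    std = math.sqrt(var_theta)
    return 2.0 ** (theta_hat - std), 2.0 ** theta_hat, 2.0 ** (theta_hat + std)

# Example: an estimate centered on the fibrotic stiffness with unit variance (assumed).
theta_hat = math.log2(5.0e5)
lower, center, upper = stiffness_interval(theta_hat, var_theta=1.0)
print(f"c in [{lower:.3g}, {upper:.3g}] Pa around {center:.3g} Pa")
```

With unit variance in θ̂, the bounds are half and twice the central stiffness, in agreement with the interpretation of the reparametrization given in the footnote of section 2.3.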

Regarding the filter convergence, it is observed that the method converges faster as the uncertainty in the observations decreases. Conversely, **Figure 5** shows that the convergence becomes slower as σ<sub>Z</sub> increases. Note that the estimator gain matrix is computed as **K**<sub>k</sub> = **L**<sub>k</sub><sup>θ</sup> **U**<sub>k</sub><sup>−1</sup> {**HL**}<sub>k</sub><sup>T</sup> **R**<sub>k</sub><sup>−1</sup> and the only operator that varies in the first iteration of the presented cases is **R**<sub>0</sub><sup>−1</sup>. Since the spectral radius of **R**<sub>0</sub><sup>−1</sup> diminishes as σ<sub>Z</sub> increases, the spectral radius of **K**<sub>0</sub> diminishes as well, yielding a smaller correction of θ̂<sub>k</sub><sup>+</sup>, as presented in the plot. At the same time, since **P**<sub>w</sub> is constant, the update of **U**<sub>k</sub> = **P**<sub>w</sub> + {**HL**}<sub>k</sub><sup>T</sup> **R**<sub>k</sub><sup>−1</sup> {**HL**}<sub>k</sub> is damped by **R**<sub>k</sub>. This damping effect is evidenced in the evolution of the parameter uncertainty intervals plotted in **Figure 5**. In statistical terms, the lack of confidence in the new observations leads to a reduction of their weight in the correction step.

An analogous analysis was performed to study the sensitivity of θ̂ with respect to the parameter uncertainty σ<sub>θ</sub>. The uncertainty levels for σ<sub>θ</sub> ranged from 0.25 to 4 and the experiment was repeated for the three different ring materials (fibrotic, lipidic and calcified tissues). The results showed that the bigger σ<sub>θ</sub>, the wider the search space for the parameter, and the faster the method converges when the initial value is far from the true parameter value (see **Figure 6**). On the other hand, high values of σ<sub>θ</sub> may cause an overshooting in the estimation and a slower convergence. In this scenario, the reparametrization deteriorates the convergence even more. The reparametrization biases the estimation toward stiffer values because displacements are less sensitive to small variations in stiffer materials than in softer ones. Then, the mean observation error (used as the correction term in Equation 30) is biased toward the sigma points associated with stiffer materials. This is clearly evidenced in **Figure 7**, where the initial overshooting delays the estimation of the parameter. Hence, note that the initial uncertainty interval does not necessarily have to contain c<sup>t</sup> for its correct value to be estimated. In fact, the uncertainty parameter values are also iteratively updated and similar values are obtained for all three σ<sub>θ</sub> illustrated in **Figure 6**. The role of the initial value of σ<sub>θ</sub> is to set the dispersion of the sigma points around the mean initial guess, and large values may accelerate convergence when the initial guess c<sub>i</sub> is far from c<sup>t</sup>.

phase) that is fully parallelized without communication among the threads; and (iii) Parallelization of the FEM problem by mesh partitioning.

Overall, good agreement is found in terms of accuracy and convergence for the parameters σ<sup>Z</sup> = 10<sup>−3</sup> mm and σ<sup>θ</sup> = 4. These parameters clearly identify the three different kinds of tissue in this idealized problem. Also, the observations generated in-silico present an accuracy of similar order to that obtained (assuming no error carried by the optical flow) through the IVUS image processing. For this reason, σ<sup>Z</sup> = 10<sup>−3</sup> mm is used in the cases analyzed in the forthcoming sections. The value of σ<sup>θ</sup> cannot be assigned as straightforwardly, because parameter overshooting with complex in-vivo geometries may lead to excessively soft materials, which could cause contact at the inner surface when solving the preload equilibrium, yielding material configurations that are not stress-free. Since stress-free configurations have been assumed, a more conservative value of σ<sup>θ</sup> = 0.5 is used to avoid this problem.

### 3.2. Boundary Conditions Sensitivity

As described in section 2.2.1, the observational datum $\mathbf{u}^{OF}$ is considered as additional information over ∂<sup>E</sup> through a penalization factor τ (i.e., a Robin boundary condition). This strategy is an attempt to incorporate the contribution of the surrounding tissues through a surrogate surface model. Moreover, since $\mathbf{u}^{OF}$ can be exposed to errors caused by brightness variations, image artifacts or non-physical optical flow regularization issues, the use of a Robin boundary condition allows the model to naturally filter the field $\mathbf{u}^{OF}$, similarly to a surface spring model. The characterization of the surrounding tissues provided by τ, and its effect on the parameter estimation, is addressed in this section.
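The spring analogy can be made concrete with a one-degree-of-freedom toy problem. This is a sketch only, not the finite element formulation of the paper: a single elastic element of stiffness `k` is loaded by a force `f`, and the Robin term pulls the displacement toward an observed value `u_obs` with penalization `tau`. The names and numerical values are hypothetical. Equilibrium reads k·u = f − τ(u − u_obs), so u = (f + τ·u_obs)/(k + τ).

```python
def robin_displacement(k, f, tau, u_obs):
    """Equilibrium displacement of a 1-DOF element with a Robin (spring-like) term.

    Solves k*u = f - tau*(u - u_obs) for u.
    """
    return (f + tau * u_obs) / (k + tau)

# Hypothetical values chosen only to show the two limits.
k, f, u_obs = 2.0, 1.0, 0.1
u_neumann = f / k   # limit tau -> 0: pure force balance
u_dirichlet = u_obs  # limit tau -> infinity: observation is imposed

for tau in (1e6, 1e4, 1e2, 1e0, 1e-6):
    u = robin_displacement(k, f, tau, u_obs)
    print(f"tau = {tau:g}: u = {u:.6f}")
```

Large values of τ reproduce the observed displacement (Dirichlet-like behavior), small values recover the force-driven solution (Neumann-like behavior), and intermediate values blend the two, which is how the observation is filtered rather than imposed.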

The in-silico study case used for this sensitivity analysis was generated from the cross-section IVUS image depicted in **Figure 8** by considering the configurations corresponding to two cardiac phases: end-diastole and systole. The geometrical model was constructed for the end-diastolic configuration following the pipeline described in sections 2.1 and 2.2.5. The configurations at the two cardiac phases are related to an end-diastolic load (i.e., the preload) of t<sup>W,n</sup> = 80 mmHg and to a systolic load of t<sup>W,n</sup> = 120 mmHg, respectively. The loads are applied over the inner surface of the vessel in the preload and forward problems, respectively. Finally, tethering tractions **t** A,i s are considered such that an axial stretch of 10% is prescribed in the end-diastolic configuration. The remaining setup of boundary conditions is defined for each of the following analyses: (i) parameter estimation sensitivity as τ decreases from a large value (almost a Dirichlet condition) to a small value (almost a Neumann condition); (ii) parameter estimation robustness when the observation $\mathbf{u}^{OF}$ features errors at the boundaries.

#### 3.2.1. Test 1: Sensitivity of τ for Error-Free Observations

For this analysis, the observations for the ROUKF were generated by solving the mechanical equilibrium with our model, avoiding observational and modeling errors. Thus, a Robin boundary condition was imposed at the outer surface in the forward problem with τ = 10<sup>6</sup> (practically yielding a Dirichlet boundary condition). This setting rendered a ground truth displacement field $\mathbf{u}^{GT}_1$ for this test. Using the observations $\mathbf{u}^{GT}_1$, the data assimilation algorithm was executed for τ ∈ {10<sup>6</sup>, 10<sup>4</sup>, 10<sup>2</sup>} (higher values of τ were not analyzed since τ = 10<sup>6</sup> is already almost a Dirichlet boundary condition). The geometric model was partitioned in sextants with two concentric layers, yielding 12 regions, each with its own material parameter c<sub>i</sub>.

The results are presented in **Figure 9**, which depicts how the parameter estimation and the predicted observations vary as the Robin boundary condition moves toward a Neumann boundary condition. The decrease of forces at the boundaries caused by the decreasing value of τ is compensated by the estimation of softer materials (which experience higher strains) to match the $\mathbf{u}^{GT}_1$ observations. In particular, the method recovers the correct material parameters when the penalization value is the true value used to generate the observations, i.e., τ = 10<sup>6</sup>. For the parameter estimation with τ = 10<sup>4</sup>, a qualitatively similar distribution of materials is observed, with a uniform reduction in the magnitude of the material parameter. The lowest penalization value, τ = 10<sup>2</sup>, delivers a totally different arrangement of materials. This result emphasizes the important contribution of the surrounding tissues to a correct estimation of the material parameters, which is clearly retrieved when sufficiently large values of the penalization parameter τ are employed.

The observation error |ε<sup>Z</sup>|, which is defined as the Euclidean distance between the observations Z and the mean filter observation $\hat{Z}_k$ at the last iteration, increases as τ is decreased. Specifically, the mean values of |ε<sup>Z</sup>| are 7.16 · 10<sup>−5</sup>, 1.03 · 10<sup>−3</sup> and 9.79 · 10<sup>−3</sup> for τ = 10<sup>6</sup>, 10<sup>4</sup> and 10<sup>2</sup>, respectively. Clearly, should the observations $\mathbf{u}^{GT}_1$ be error-free at the boundaries, a Dirichlet boundary condition (a higher value for τ) would be the correct choice. Notwithstanding this, the observations from in-vivo scenarios are degraded by diverse sources of error and, as will be shown next, an excessively stringent boundary condition (of Dirichlet type) may not be the best option.
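A minimal sketch of how such an error measure could be evaluated is given below. It assumes, as an interpretation of the definition above, that the Euclidean distance is taken point by point between observed and predicted displacement vectors and then averaged over the observation points. The function name and the sample arrays are hypothetical.

```python
import numpy as np

def mean_observation_error(Z, Z_hat):
    """Mean and standard deviation of the pointwise Euclidean distance.

    Z, Z_hat: arrays of shape (n_points, n_components) holding the observed
    and the predicted displacement vectors, in the same units.
    """
    Z = np.asarray(Z, dtype=float)
    Z_hat = np.asarray(Z_hat, dtype=float)
    dist = np.linalg.norm(Z - Z_hat, axis=1)  # one distance per point
    return dist.mean(), dist.std()

# Hypothetical two-point example: distances are 0 and 5.
Z_obs = [[0.0, 0.0], [3.0, 4.0]]
Z_pred = [[0.0, 0.0], [0.0, 0.0]]
err_mean, err_std = mean_observation_error(Z_obs, Z_pred)
print(err_mean, err_std)
```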

#### 3.2.2. Test 2: Sensitivity of τ for Realistic Observations

The current analysis aims at assessing the robustness of the parameter estimation process when $\mathbf{u}^{OF}$ at the outer boundary differs from the real in-vivo displacements. For this purpose, a presumed ground truth $\mathbf{u}^{GT}_2$ is generated by altering the observation $\mathbf{u}^{OF}$ in a certain region (the untrusted region) using the mechanical model with Neumann conditions. Finally, the assimilation process is performed with the observation $\mathbf{u}^{OF}$ and different values of τ, to assess whether it is capable of approximating $\mathbf{u}^{GT}_2$ despite the observation errors.

Thus, an IVUS sequence with a swinging artifact (induced by the guidewire) was chosen to perform our analysis. The IVUS cross-section depicted in **Figure 8** presents an image artifact from the IVUS guidewire at the bottom-right quadrant of the frame.

τ = 10<sup>6</sup> , 10<sup>4</sup> , 10<sup>2</sup> (from left to right) in the forward operator f.

The guidewire projects a shadow that hides the arterial wall and, as a consequence, the optical flow is polluted with a swinging movement unrelated to the true arterial-wall motion. Thus, a displacement field, denoted by $\mathbf{u}^{GT}_2$, is generated from the in-vivo data by removing the guidewire influence, with the purpose of comparing this ground truth against the Kalman predictions $\hat{Z}_k$ when the polluted optical flow $\mathbf{u}^{OF}$ is used as observation. In that manner, the difference ε<sup>GT</sup> = $\hat{Z}_k - \mathbf{u}^{GT}_2$ can be regarded as an estimate of the error in the Kalman prediction due to the artifact in the image processing data. Finally, ε<sup>GT</sup> is computed for different values of τ to assess the discrepancies in the predictions as the external Robin boundary condition is characterized differently.

The displacement $\mathbf{u}^{GT}_2$ is generated by solving the equilibrium problems with a model constituted by a single material. To define a reasonable value for this constitutive property, a data assimilation process was performed using $\mathbf{u}^{OF}$ as observation and τ = 10<sup>4</sup>, yielding c = 33.52 kPa. Note that c is biased by the image artifact, among other errors in the displacement field, and cannot be regarded as an estimate of the real material; thus, the range over which the estimate $\hat{c}$ varies is analyzed. At the boundary ∂<sup>E</sup> <sup>m</sup>, a homogeneous Neumann condition (τ = 0) was applied in the area affected by the guidewire (see red line in **Figure 8**) and a Robin boundary condition with τ = 10<sup>4</sup> was applied to the remaining part of the boundary. The obtained displacement field $\mathbf{u}^{GT}_2$ is displayed in **Figure 8**.

The sensitivity of ε<sup>GT</sup> with respect to τ is then studied. For each value of τ ∈ {10<sup>i</sup>, i = 5, 4, ... , 0}, the data assimilation process is executed using $\mathbf{u}^{OF}$ as observation. The relative difference between the generated ground truth $\mathbf{u}^{GT}_2$ and the Kalman predicted observation $\hat{Z}_k$, for each τ, is reported in **Figure 10**. For τ greater than 10<sup>3</sup>, the Robin condition guarantees that the artifact-related displacements are preserved regardless of the impact on the induced internal stresses. When τ varies from 10<sup>3</sup> to 10<sup>2</sup>, the relative error drops significantly at the guidewire locus, from 1.33 to 0.58. As τ decreases even more, the force induced by the Robin boundary condition diminishes in magnitude, yielding lower internal stresses and spreading the error outside the region of the guidewire shadow. For values lower than 10<sup>2</sup>, the error in the displacement field is concentrated at the bottom area of the artery. This concentration of the error is explained by the fact that the continuum model is constrained to behave as incompressible, while the optical flow is not divergence-free. In terms of the parameter estimation, the estimated value of c was 123.96, 33.52, 27.93, 68.51, 144.88, and 125.04 kPa for τ = 10<sup>5</sup>, 10<sup>4</sup>, 10<sup>3</sup>, 10<sup>2</sup>, 10<sup>1</sup>, and 10<sup>0</sup>, respectively, with a mean and standard deviation of 87.31 ± 50.70 kPa, all close to a cellular fibrotic tissue. Moreover, the estimated parameter is highly sensitive to the chosen value of τ. In comparison with the ground truth, the closest matching prediction in terms of the displacement field (i.e., the prediction for τ = 10<sup>2</sup>, see **Figure 10**) presents an estimate of c two times higher. This clearly demonstrates the large sensitivity of the estimated parameter to the model chosen for the external tissues. Furthermore, it indicates that minimizing the displacement discrepancy does not necessarily yield the best parameter estimate.
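The summary statistics quoted above can be reproduced directly from the six reported estimates. The short check below uses only values stated in the text; the variable names are illustrative, and the use of the sample standard deviation (n − 1 in the denominator) is an assumption that happens to match the reported figure.

```python
import statistics

# Estimated c (kPa) for each tau, as reported in the text.
c_by_tau = {
    1e5: 123.96,
    1e4: 33.52,
    1e3: 27.93,
    1e2: 68.51,
    1e1: 144.88,
    1e0: 125.04,
}

values = list(c_by_tau.values())
mean_c = statistics.mean(values)
std_c = statistics.stdev(values)  # sample standard deviation (n - 1)

# The ground truth was generated with c = 33.52 kPa (tau = 1e4); the best
# displacement match was obtained for tau = 1e2.
ratio_best_match = c_by_tau[1e2] / c_by_tau[1e4]

print(f"mean = {mean_c:.2f} kPa, std = {std_c:.2f} kPa")
print(f"c(tau=1e2) / c(ground truth) = {ratio_best_match:.2f}")
```

The ratio of roughly two between the best displacement match and the value used to generate the ground truth is the basis for the remark that a good displacement fit does not guarantee a good parameter estimate.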

### 3.3. Effects of Preload and Axial Stretch

An appropriate baseline stress state of the vessel is key to an accurate characterization of the stress state in arterial tissues. In fact, as reported in Ares (2016), a preloaded and axially stretched artery features notably different stress patterns compared to the case when such loads are neglected. Therefore, it is important to quantify the change in the parameter estimation when the initial stress state is either considered or not in the analysis. To quantify such disagreement, the parameters of an in-vivo study were estimated assuming four different conditions, namely: (i) the diastolic configuration is neither preloaded nor axially stretched; (ii) the diastolic configuration is preloaded but not axially stretched; (iii) the diastolic configuration is preloaded and 5% axially stretched; and (iv) the diastolic configuration is preloaded and 10% axially stretched. The choice for the last two cases is based on the experimental observations of Holzapfel et al. (2005), who report a physiological range for axial stretch in coronary arteries between 5 and 10%.

The geometrical model and the optical flow $\mathbf{u}^{OF}$ used for this study are the ones previously presented in **Figure 8**. The geometric model was partitioned in sextants with a unique concentric layer, leading to the estimation of 6 material parameters (the same partition used in **Figure 12**). The remaining parameters for the mechanical problems and the data assimilation process are described in **Table 1**, along with the estimated values c<sub>i</sub>. The results showed different trends for soft (c < 200 kPa, i.e., c<sub>1</sub>, c<sub>2</sub>, c<sub>3</sub>, and c<sub>5</sub>) and stiff materials (c ≥ 200 kPa, i.e., c<sub>4</sub> and c<sub>6</sub>). The obtained parameter c increases in the soft tissues and decreases in the stiff tissues as the baseline stress increases from a preload-free to a preloaded state. Further increments in the baseline stress due to the axial stretch result in material stiffening for these two categories of tissues. Interestingly, the increment of the parameter uncertainty σ<sup>θ</sup> or the decrement of the observation uncertainty σ<sup>Z</sup> in the stiff tissues increases the estimate of parameter c more in the preload-free state than in the preloaded cases. In fact, in cases 2 and 3, the preloaded and 5% axially stretched model (cases 2.C and 3.C) featured lower c values in the stiff tissues than the preload-free model (cases 2.A and 3.A), contrary to case 1. Some of these findings may appear counter-intuitive at first glance because, as the baseline stress state increases, it would be expected that all tissues soften to maintain the same deformation for the given load. Thus, the following paragraphs address the role of assimilation uncertainties, image artifacts, and the mechanical model itself in the assimilation.

Firstly, as the baseline stress at the diastolic configuration rises, the parameter estimation becomes less sensitive to variations between the predicted and the observed displacements, i.e., $Z - \hat{Z}_k$. For the different baseline stress states, the same observation uncertainty was assumed, which is analogous to establishing an uncertainty interval for the observed strains. As the Neo-Hookean model consists of a quadratic stress-strain relation, the increment of the baseline stress yields an increase in the uncertainty interval of the stresses. And because the stress is linear in the material parameter c, the estimated parameters undergo the same increase in their uncertainties, diminishing the accuracy of their estimation. Moreover, the estimated value of c increased as the baseline state was subjected to a more significant preload condition, making the data assimilation process even less sensitive. In short, this implies that dealing with the real problem, for which preload is definitely a condition of the vessels, is even more challenging than the case where initial stress conditions are neglected.

Secondly, the gap between the observations and the predicted displacements in the data assimilation process, hereafter simply the discrepancy, is in part given by observed displacement components generated by errors in the image processing stage and by physical phenomena that are not recoverable by the proposed mechanical and material models (e.g., external tissues, off-plane displacements, compressible materials or, even, misrepresentation of the constitutive law). These discrepancies could be referred to as out-of-model components, and they introduce a bias in the predicted displacement field and in the parameter estimation as well. Comparing the estimations with different baseline assumptions, it is observed that the discrepancies of the identified parameter value remain below 37% and 10% for soft and stiff tissues, respectively. We choose to use the more complex model (preloaded and axially stretched) in the following in-vivo studies because it endows the mechanical setting with more relevant physical features than the other models.

In all cases, the boundary condition was fixed with τ = 10<sup>2</sup> and the initial guess for the parameters was $\hat{\theta}^{+}_0 = [c_0, \ldots, c_6]$ with c<sub>i</sub> = 500 kPa ∀i. The estimated parameters are reported for each case, as well as the observation error |ε<sup>Z</sup>| = |Z − $\hat{Z}_k$| after the data assimilation process.

### 3.4. In-Vivo Cases

The proposed methodology is now applied to 4 in-vivo cases featuring atherosclerotic lesions to derive their specific mechanical models. The goal is to analyze the accuracy of the mechanical models in predicting the optical flow observations, as well as to assess the usage of multiple (more than two) cardiac phases (and hence more than one optical flow displacement field as observational data) in the parameter estimation. For each lesion, the IVUS frames involved in the data assimilation correspond to end-diastole, 50%-systole and full-systole, as dictated by the ECG signal of the IVUS study. Optical flow was estimated between the end-diastole and 50%-systole frames and between the end-diastole and full-systole frames, denoted by $\mathbf{u}^{OF}_1$ and $\mathbf{u}^{OF}_2$, respectively (see **Figure 12**). Then, we compare the resulting estimated parameters for two cases: when the assimilation is performed using a single optical flow displacement field as observation ($Z = [\mathbf{u}^{OF}_2]^T$); and when two optical flow displacement fields are utilized as observations ($Z = [\mathbf{u}^{OF}_1, \mathbf{u}^{OF}_2]^T$). Note that the observed displacement field for maximum load, i.e., $\mathbf{u}^{OF}_2$, is employed in both cases because the displacement between end-diastole and systole is expected to yield higher strains.

The geometric model was partitioned in sextants with a unique concentric layer (see **Figure 11**). Each partition contains a single type of material, leading to a data assimilation process with 6 material parameters. The diastolic configuration is preloaded and 10% axially stretched for all cases, and the blood pressure was assumed to be 80, 100, and 120 mmHg for end-diastole, 50%-systole and full-systole, respectively. The parameter τ was set to 100 for lesions 1, 3, and 4 and to 50 for lesion 2; the latter value avoided contact at the luminal surfaces during the preload problem. The ROUKF uncertainties were fixed to σ<sup>θ</sup> = 1 and σ<sup>Z</sup> = 10<sup>−2</sup> mm.

The proposed data assimilation process rendered the results depicted in **Figure 12**. The material parameters estimated in all cases remained within the physiological range (between 1 kPa and 10 MPa, see Walsh et al., 2014). Also, the addition of an extra displacement field as observation showed no considerable effect for cases 2 and 3. The reliability of the results can be assessed in terms of the model prediction error presented in **Table 2**. Due to intrinsic sources of error in the observations (motion artifacts, spatial incoherence between cross-sections in the cardiac cycle and optical flow model artifacts), an observation error of a few pixel spacing units is expected (recall that the image discretization spacing is 16 µm). Thus, the model prediction errors for cases 3 and 4, and even case 1 for a single optical flow field per cardiac phase, seem highly reliable in terms of our observation precision, since the error is 26 ± 14 µm (1.625 ± 0.875 pixel spacing units), while case 2 seems to be the least reliable estimation, with an average error of 43 ± 24 µm. Overall, the average model prediction error was below 43 µm and 61 µm when one or two observational data were used, respectively.

In case 1, it is observed that the material parameters estimated with 1 and 2 optical flow displacement data are significantly different. The flow $\mathbf{u}^{OF}_1$ presents larger displacements than $\mathbf{u}^{OF}_2$, which seems counter-intuitive since the blood pressure variation is smaller for the former condition. However, the motion exerted by the cardiac contraction is higher; in fact, the largest component of the displacement is rigid (a rotation of the structures). Thus, as $\mathbf{u}^{OF}_1$ presents the observation components with higher norm, it features a larger contribution than $\mathbf{u}^{OF}_2$ during the data assimilation process (see Equation 30). In that manner, the parameters estimated with 1 flow datum minimize discrepancies against $\mathbf{u}^{OF}_2$, while the ones estimated with 2 flow data mainly minimize discrepancies against $\mathbf{u}^{OF}_1$.

Conversely, in cases 2 and 3, the observation $\mathbf{u}^{OF}_1$ is the one with smaller displacements (≈4 and 2 times smaller for cases 2 and 3, respectively), yielding a small contribution to the data assimilation. This implies that the minimization of the discrepancies between the model predictions and the observations (i.e., $\hat{Z}_k - Z$) related to $\mathbf{u}^{OF}_2$ dominates over the discrepancies associated with $\mathbf{u}^{OF}_1$. In fact, **Figure 12** shows that the discrepancies represented by ε<sup>r</sup> for $\mathbf{u}^{OF}_2$ remained almost invariant using 1 or 2 flows in the observation.

In case 4, the discrepancies between the model predictions and the observations related to $\mathbf{u}^{OF}_2$ also remained invariant using 1 and 2 flow data in the observation, although the parameters estimated in the lower part of the geometry varied significantly (see **Figure 12**). As previously studied in section 3.2.2, the guidewire artifact in the lower part of these images features a swinging movement unrelated to the arterial-wall motion. To approximate the rigid motion of the artifact, the local tissue is stiffened during the assimilation process when a single flow is employed as observation. Conversely, when $\mathbf{u}^{OF}_1$ is added to the observations, the spurious motion of the guidewire is negligible, and the data assimilation is not affected by this artifact.

To determine the applicability of the current approach in clinical practice, we executed the in-vivo cases on a single server with two Intel Xeon E5-2620 processors at 2.00 GHz (each with 12 threads) and Kingston 99U5471-031.A00LF RAM at 1333 MHz (latency of 27 ns). For the data assimilation of these in-vivo cases, mesh parallelism (3 threads per mechanical problem) and sigma parallelism (1 thread per sigma point) were applied because this combination delivered the best speed-up for our 24 threads (of which only 21 were actually employed). The wall clock time reported in **Table 2** for each execution showed that the current methodology is appropriate for offline medical applications, since the processing times ranged from 0.5 to 3 days. The use of clusters would allow a further processing speed-up by exploiting the load parallelism as well as a more massive parallelization at the mesh level.
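The figure of 21 employed threads is consistent with a simple thread budget. The sketch below assumes simplex sigma points, i.e., p + 1 sigma points for p parameters; this choice is common in reduced-order unscented filters but is an assumption here, since the sigma-point scheme is not restated in this section. The function name is hypothetical.

```python
def thread_budget(n_params, threads_per_problem, available_threads):
    """Threads used when every sigma point runs its own mesh-parallel solve.

    Assumes simplex sigma points (n_params + 1 of them); this is an
    illustrative assumption, not a statement about the authors' setup.
    """
    n_sigma = n_params + 1
    used = n_sigma * threads_per_problem
    if used > available_threads:
        raise ValueError("not enough threads for this configuration")
    return n_sigma, used, available_threads - used

# 6 material parameters, 3 threads per mechanical problem, 24 hardware threads.
n_sigma, used, idle = thread_budget(6, 3, 24)
print(f"{n_sigma} sigma points x 3 threads = {used} threads, {idle} idle")
```

With 6 parameters this gives 7 concurrent mechanical problems and 21 busy threads; raising the mesh parallelism to 4 threads per problem would require 28 threads and would not fit on the 24-thread server.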

## 4. DISCUSSIONS

The presented methodology offers a workflow to estimate material parameters for mechanical models of coronary arteries. The strategy is composed of three key components: the image processing, the mechanical model and the data assimilation algorithm. The most appealing aspect of this proposal is that the three components are loosely coupled as black boxes, which allowed us to modify each component as required without altering the remaining ones. In fact, the image processing renders observations for the data assimilation regardless of the imaging technique employed and the nature of the displacement field. In turn, the mechanical model can also be modified without affecting the other components: it simply must receive a set of parameters and return the internal state variables to the data assimilation strategy. Due to this architectural design, this initial biomechanical characterization approach can be further refined by improving aspects of these individual components. Some identified hotspots for improvement are discussed in what follows.

The data assimilation showed high sensitivity to variations in the model boundary conditions, which aimed at mimicking the external tissues. As the displacement over the boundary was increasingly constrained (large τ), the model became less sensitive to variations in the material parameters, hindering the parameter estimation and even causing divergence of the Kalman iterative process in some situations. Also, the disagreement between the spatial arrangement of the model forces and that of the (unknown) in-vivo forces at the boundary notably affects the outcome of the estimation. This was exposed in section 3.2.2, when an image artifact (the IVUS guidewire) induced a spurious tangential displacement in the observation and the boundary condition. It was also shown that if a homogeneous Neumann condition is assumed at the site of such an artifact, the parameter estimation varies significantly (from a 4- to a 15-fold reduction of parameter c). Improving the capabilities of the model in this sense requires incorporating the estimation of the forces exerted by external tissues in the data assimilation process. In short, the parameter τ could be a further variable to be estimated.

It is also important to highlight that this approach can be directly extended to account for more geometrically and physically complex models. The set of results reported here constitutes a solid proof of concept toward the extension of this methodology. Here, we derived a patient-specific mechanical model for an orthogonal slice of the vessel assuming a plane strain state with a homogeneous axial traction force. However, some of these assumptions imply neglecting certain physical components that may be necessary to increase the accuracy of the estimated stress/strain state of the vessel. To list some of them: (i) shear forces exerted by the blood flow, which are expected to be key in the study of plaque development (Stone et al., 2003; Chatzizisis et al., 2008); (ii) out-of-plane forces produced by the blood pressure due to the heterogeneous constitution of the vessel wall and the tilting of the transducer tip with respect to the cross-section; and (iii) variable axial tractions along the cross-section due to the heterogeneous composition of the vessel wall. These issues can be tackled at once by making use of 3D models. In fact, the image processing strategy allows the gating and registration of the whole arterial 3D volume of the study. Also, the extension of the optical flow techniques to 3D domains is straightforward through a proper adaptation of the differential operators and the Gaussian kernel within the formulation. A further issue to address is the spatial reconstruction needed to obtain the proper 3D geometrical description of the vessel instead of its rectified representation in intrinsic coordinates delivered by the IVUS study. The integration of IVUS with angiographic images enabled us to perform such a 3D reconstruction, as reported in Maso Talou (2013). These extensions imply a heavier computational cost and complementary implementation aspects, yet they present no further conceptual differences regarding the methodology presented in this work.

(Figure caption, partial) The discrepancy between the model predictions and the observations is denoted ε<sup>r</sup>, where $\hat{Z}_k$ are the model predictions at the last Kalman iteration, Z are the optical flow observations and ⟨·⟩ denotes the mean value in s.

TABLE 2 | Model prediction error after the data assimilation process for the 4 in-vivo cases using 1 or 2 loading conditions.

The reported errors correspond to the disagreement between the mechanical model displacements (after estimating their material parameters with the proposed method) and the optical flow observations. The execution time corresponds to the wall clock time elapsed for each case using only mesh parallelism (see Figure 4) with 24 CPUs.

The extension to 3D problems discussed above is, as noted, computationally more demanding. Regarding the image processing, the cost scales with the number of cross-sections extracted from the IVUS dataset. However, the registration stage, which is the most computationally intensive task, is fully parallel (see Maso Talou et al., 2017) and the gating cost is negligible. Thus, the performance of the optical flow and of the spatial reconstruction process through the integration with angiographic images turns out to be key for the efficiency of the methodology in 3D cases. Regarding the data assimilation procedure, the computational cost continues to be dominated by the approximate solution of the mechanical problem. As a significant increase in the number of degrees of freedom is expected, the computational cost would rise as well.

A first limitation of the present scheme is that the displacement field retrieved from medical images is naively used as observation for our model without further processing. This implies that the performance of the method can be improved by removing the observation components that are spurious (such as artifacts or unreliable regions of the optical flow displacement field) or even incompatible with our model (e.g., by using only the divergence-free component of the field, because the mechanical model is incompressible).

Regarding the baseline stress state in our model, the residual stresses produced during the arterial tissue genesis and growth have been neglected. In Wang et al. (2017), an experimental test showed that the omission of these residual stresses may produce a significant overestimation of internal stresses (from 2- to 4-fold the actual stress). Furthermore, it has been observed (Guo et al., 2017) that accounting for residual stresses is also relevant for proper material parameter estimation. This seems natural, as residual stresses can be considered a by-product of existing residual deformations of the elastin matrix (which, in general, may not be kinematically compatible, i.e., they cannot be derived from continuous displacement fields). Consequently, not only are the stresses not properly assessed, but the actual deformations observed at the equilibrium states are also misinterpreted. These facts highlight the need for further research to tackle simultaneously the estimation of both material parameters and residual deformations in arterial walls. In recent works (Ares, 2016; Ares et al., 2017), models and methods for the estimation of such residual stresses were proposed, in a similar spirit to the one developed in this work.

Finally, it is worthwhile to remark that no validation techniques are currently available for the assessment of the stress-strain state in in-vivo conditions. Even so, approaches for an indirect in-vivo or ex-vivo validation can be discussed. Techniques such as elastography and palpography (Ophir et al., 1991; Shapo et al., 1996; Céspedes et al., 1997; de Korte et al., 1998; Céspedes et al., 2000) deliver, with some degree of reliability, the stresses in the innermost part of the vessel. In these cases, a Bland-Altman analysis can be applied to assess the similarity between the predictions of our approach and the elastographic solutions. A more controlled experimental setup can be planned for ex-vivo conditions using coronary specimens. For each specimen, an IVUS study can be acquired and a specimen-specific model can be constructed employing the proposed methodology. Then, several mechanical tests can be carried out with the specimens, comparing their mechanical response with the predictions given by our specimen-specific models. Another in-vivo alternative is to associate ranges of the estimated material parameters with the underlying tissue composition, delivering a histological description of the vessel (usually referred to as virtual histology). As there are already methods that estimate the vessel histology from IVUS images (e.g., Kawasaki et al., 2002; Nair et al., 2002; Sathyanarayana et al., 2009), a comparative analysis can be performed to evaluate the degree of agreement between the proposed method and these virtual histologies. An appealing aspect of this last validation is that the techniques presented in those works are already validated with cadaveric specimens of coronary arteries. The experimental settings suggested above should serve to bridge the world of computational models and methods with the experimental realm, toward gaining insight into the complex mechanisms underlying the development of cardiovascular diseases.
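The Bland-Altman comparison mentioned above reduces to the mean difference between two paired sets of measurements (the bias) and its 95% limits of agreement, taken as the bias ± 1.96 standard deviations of the differences. The sketch below uses made-up numbers; the function name and the sample values are hypothetical and do not come from any experiment.

```python
import numpy as np

def bland_altman(method_a, method_b):
    """Bias and 95% limits of agreement between two paired measurement sets."""
    a = np.asarray(method_a, dtype=float)
    b = np.asarray(method_b, dtype=float)
    diff = a - b
    bias = diff.mean()
    sd = diff.std(ddof=1)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical paired values, e.g., model-predicted vs. elastographic stresses.
model_pred = [1.0, 2.0, 3.0, 4.0]
elastography = [1.1, 1.9, 3.2, 3.8]
bias, lower, upper = bland_altman(model_pred, elastography)
print(f"bias = {bias:.3f}, limits of agreement = [{lower:.3f}, {upper:.3f}]")
```

In practice the differences would also be plotted against the pairwise means to reveal any trend of the disagreement with the measurement magnitude.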

#### 5. FINAL REMARKS

A data assimilation environment for the analysis of arterial models and material characterization was described. The proposed methodology delivers the necessary tools to construct patient-specific mechanical models of an arterial site using data from standard IVUS studies. A complete sensitivity analysis of the biomechanical characterization with respect to numerical and physical parameters was reported to aid the methodology setup, as well as the interpretation of data assimilation outcomes.


Validation in controlled scenarios was provided to demonstrate the capabilities of the present approach.

The potential and limitations of this approach were exposed and discussed in the previous section, delineating future research to enhance the image processing stage and the mechanical model of the arterial wall for this problem.

The applicability of this methodology to in-vivo scenarios was shown in the characterization of the arterial tissue for four in-vivo atherosclerotic lesions. After data assimilation, the obtained mechanical models predicted the displacement field between diastole and systole with errors below 43 µm using frames of only two cardiac phases. Although no validation was performed with the in-vivo cases, the estimated material parameters remained within the expected range for this kind of tissue.

The development of this tool for biomechanical analysis allows the indirect estimation of the internal stress state of the arterial wall. Such information, combined with the vessel histology (which can be inferred from the material parameters estimated here), enables the assessment of the structural integrity of the atherosclerotic plaque to aid medical decisions and research. In summary, the proposed strategy provides an integrated imaging-assimilation-mechanics environment to characterize, within a truly in-vivo and patient-specific setting, the behavior of the materials that compose the arterial vessels, specifically coronary vessels, which is of the utmost importance in assessing the risk of plaque progression and rupture.

### AUTHOR CONTRIBUTIONS

GM, GA, PB, and RF designed the model and the computational framework. GM and GA carried out the computational implementation. GM, PB, and RF planned the experiments. GM and PB performed the calculations and wrote the manuscript with input from all authors. CG and PL performed measurements and contributed to sample preparation. All authors contributed to the discussion and interpretation of the results, and helped shape the final version of the manuscript.

#### ACKNOWLEDGMENTS

This work was partially supported by the Brazilian agencies CNPq, FAPERJ, and CAPES. The support of these agencies is gratefully acknowledged.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Maso Talou, Blanco, Ares, Guedes Bezerra, Lemos and Feijóo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Use of Biophysical Flow Models in the Surgical Management of Patients Affected by Chronic Thromboembolic Pulmonary Hypertension

Martina Spazzapan<sup>1</sup>, Priya Sastry<sup>2</sup>, John Dunning<sup>2</sup>, David Nordsletten<sup>3</sup> and Adelaide de Vecchi<sup>3</sup>\*

<sup>1</sup> King's College London, GKT School of Medical Education, London, United Kingdom, <sup>2</sup> Cardiothoracic Surgery Unit, Papworth Hospital NHS Foundation Trust, Cambridge, United Kingdom, <sup>3</sup> King's College London, School of Biomedical Engineering and Imaging Sciences, St. Thomas' Hospital, London, United Kingdom

#### Edited by:

Mariano Vázquez, Barcelona Supercomputing Center, Spain

#### Reviewed by:

Jeannette Spühler, Royal Institute of Technology, Sweden

Jacopo Biasetti, Johns Hopkins University, United States

\*Correspondence:

Adelaide de Vecchi adelaide.de\_vecchi@kcl.ac.uk

#### Specialty section:

This article was submitted to Computational Physiology and Medicine, a section of the journal Frontiers in Physiology

Received: 15 December 2017 Accepted: 28 February 2018 Published: 13 March 2018

#### Citation:

Spazzapan M, Sastry P, Dunning J, Nordsletten D and de Vecchi A (2018) The Use of Biophysical Flow Models in the Surgical Management of Patients Affected by Chronic Thromboembolic Pulmonary Hypertension. Front. Physiol. 9:223. doi: 10.3389/fphys.2018.00223

Introduction: Chronic Thromboembolic Pulmonary Hypertension (CTEPH) results from progressive thrombotic occlusion of the pulmonary arteries. It is treated by surgical removal of the occlusion, with success rates depending on the degree of microvascular remodeling. Surgical eligibility is influenced by the contributions of both the thrombus occlusion and microvasculature remodeling to the overall vascular resistance. Assessing this is challenging due to the high inter-individual variability in arterial morphology and physiology. We investigated the potential of patient-specific computational flow modeling to quantify pressure gradients in the pulmonary arteries of CTEPH patients to assist the decision-making process for surgical eligibility.

Methods: Detailed segmentations of the pulmonary arteries were created from postoperative chest Computed Tomography scans of three CTEPH patients. A focal stenosis was included in the original geometry to compare the pre- and post-surgical hemodynamics. Three-dimensional flow simulations were performed on each morphology to quantify velocity-dependent pressure changes using a finite element solver coupled to terminal 2-element Windkessel models. In addition to transient flow simulations, a parametric modeling approach based on constant flow simulations is also proposed as a faster technique to estimate relative pressure drops through the proximal pulmonary vasculature.

Results: An asymmetrical flow split between left and right pulmonary arteries was observed in the stenosed models. Removing the proximal obstruction resulted in a reduction of the right-left pressure imbalance of up to 18%. Changes were also observed in the wall shear stresses and flow topology, where vortices developed in the stenosed model while the non-stenosed retained a helical flow. The predicted pressure gradients from constant flow simulations were consistent with the ones measured in the transient flow simulations.

Conclusion: This study provides a proof of concept that patient-specific computational modeling can be used as a noninvasive tool for assisting surgical decisions in CTEPH based on hemodynamics metrics. Our technique enables determination of the proximal relative pressure, which could subsequently be compared to the total pressure drop to determine the degree of distal and proximal vascular resistance. In the longer term this approach has the potential to form the basis for a more quantitative classification system of CTEPH types.

Keywords: CTEPH, HPC-based computational modeling, biophysical flow modeling, patient specific computational modeling, computational physiology

### INTRODUCTION

Chronic Thromboembolic Pulmonary Hypertension (CTEPH) is a form of pulmonary hypertension that arises as a complication in patients who suffered an acute embolic event (Pengo et al., 2004). For most patients this progressively fatal disease manifests several months or years following the event. Over this asymptomatic period, thromboembolic material in the pulmonary trunk is incorporated into arterial walls, gradually narrowing the vessel lumen and consequently increasing the peripheral pulmonary vascular resistance (PVR) (McNeil and Dunning, 2007). This raised PVR increases the right heart workload, leading to right ventricular failure. Although a single unresolved event, such as the presence of a thrombotic occlusion in one of the pulmonary arteries, is usually responsible for the development of CTEPH, small vessel arteriopathy with microvasculature remodeling is often observed in patients with the most severe forms of the disease (McNeil and Dunning, 2007). It is understood that these changes in the peripheral vasculature have an important yet still unclear functional role in further raising PVR (Ruiz-Cano et al., 2015).

While a number of risk factors for CTEPH have been identified, none of these are sufficiently significant to be used in creating scoring criteria, leaving CTEPH diagnosis mostly down to clinical experience and expertise, aided by anatomical knowledge derived from imaging data (Thistlethwaite et al., 2008). Similar limitations are found when planning surgical treatment, i.e., pulmonary thromboendarterectomy (PTE), a complex procedure requiring median sternotomy, cardiopulmonary bypass, and circulatory arrest to remove both the thrombus and the inner layers of the affected artery (Jamieson et al., 2003; Thistlethwaite et al., 2008). This procedure is usually deemed appropriate if the obstruction to the flow is proximal to the segmental branches, i.e., in the main or lobar pulmonary arteries. When the peripheral microvasculature is compromised by the disease progression, increased flow resistance results from both proximal occlusion, and adverse remodeling of inaccessible parts of the pulmonary vasculature and therefore PTE would not necessarily lead to an improvement in PVR (Kim, 2006). As a result, this type of surgery has higher rates of failure in CTEPH patients where peripheral remodeling—and not a proximal stenosis—is the major contributor to pulmonary hypertension (van de Veerdonk et al., 2011). Given the challenges and risks posed by this major procedure, PTE should only be performed if strictly necessary. It is therefore of key importance to determine the relative contribution of peripheral remodeling and proximal occlusion to the increase in PVR in each CTEPH patient, and use this information to derive robust selection criteria for PTE.

Computed Tomography (CT) pulmonary angiography is the gold standard investigation tool for determining both the presence and the extent of CTEPH in common clinical practice (McNeil and Dunning, 2007). Ventilation-perfusion scans are also performed to differentiate between various causes of pulmonary hypertension, including CTEPH. In addition to anatomical evaluation via imaging data, invasive assessment of pulmonary arterial pressure via cardiac catheterization also provides a diagnostic threshold for intervention. Specifically, CTEPH is associated with a mean pulmonary artery pressure above 25 mmHg and a pulmonary capillary wedge pressure of no more than 15 mmHg, in conjunction with the presence of chronic occlusive thrombi (Pepke-Zaba et al., 2011; Lau and Humbert, 2015). More recently the diastolic pressure gradient (DPG), a hemodynamic marker based on the difference between the mean diastolic pulmonary artery pressure and the mean pulmonary capillary wedge pressure, has been proposed as an effective diagnostic index in pulmonary hypertension, with DPG values >7 mmHg indicating adverse remodeling of the pulmonary vasculature (Gerges et al., 2013; Mazimba et al., 2016). The high inter-individual variability of patient morphologies in CTEPH, in conjunction with the progressive nature of the disease and its lack of specific symptoms, means that diagnosis and prognosis must rely on both anatomical and functional evaluations to be effective. However, due to risks associated with catheterization, invasive pressure measurements cannot be performed frequently in CTEPH patients and cardiac imaging only allows for anatomical evaluations, without providing insight into the patient hemodynamics. Further, there are currently no well-defined criteria to differentiate between proximal and distal forms of CTEPH (Galiè et al., 2009).

Such a clinical context provides an ideal environment to test the potential of more sophisticated biophysical computational modeling to address the present difficulties in patient selection for PTE (McLaughlin et al., 1998). Anatomically realistic computational models of the arterial system can be tailored to the individual pathophysiology of the pulmonary arteries via patient-specific boundary conditions, providing a personalized description of the disease that is particularly useful in this type of pathology, where the exclusive use of population-based biomarkers for patient selection results in sub-optimal treatment strategies (Morris et al., 2016). Image-based three-dimensional Computational Fluid Dynamics (CFD) or Fluid-Structure Interaction (FSI) simulations of arterial flow can thus provide a flexible and powerful tool to elucidate the driving mechanism of disease progression (Taylor and Figueroa, 2009). Personalized CFD modeling has been applied to assist in treatment planning by noninvasively quantifying hemodynamic parameters such as pressure (Kheyfets et al., 2013) and wall shear stress (WSS) (Tang et al., 2011, 2012). However, despite its clinical potential, image-based modeling in the context of CTEPH is still largely unexplored, with a limited number of studies based on simplified models. The implications of PTE in patients with different relative contributions of distal remodeling and proximal stenosis to the total PVR have been investigated using a simplified mathematical model based on the electrical analogy with two resistors in parallel, representing the proximal and peripheral resistances, respectively (Poullis, 2015). Similarly, 1D models that rely on the wave equation and 0D Windkessel models have also been used to characterize pressure noninvasively in pulmonary hypertension patients using Phase Contrast MRI data as input (Lungu et al., 2014). However, these studies do not leverage the recent progress in 3D personalized CFD modeling, which has been widely applied to a broad spectrum of cardiovascular diseases (de Zélicourt et al., 2010; Ladisa et al., 2010; Les et al., 2010; Coogan et al., 2011; Cebral et al., 2015; Numata et al., 2016; Arthurs et al., 2017; Youssefi et al., 2017), leaving this clinical question largely unexplored.

In this study, we investigated the potential of personalized CFD simulations of the pulmonary arteries to provide a clinical tool to better understand the role of patient-specific morphology and hemodynamics in determining the major contributor to raised PVR in CTEPH. All the simulations were carried out on realistic, high-resolution anatomical models of the pulmonary arteries by combining High-Performance Computing (HPC) and high-resolution Finite Element Method (FEM) modeling using the software package CHeart, which has been extensively validated and applied to simulate cardiovascular hemodynamics in a wide range of pathologies (de Vecchi et al., 2012, 2014a,b; McCormick et al., 2014; Lee et al., 2016; Hessenthaler et al., 2017). Three CTEPH patients who underwent PTE were modeled as a proof of concept that such an approach can contribute to improve patient selection criteria for PTE, reducing the need for more invasive pressure measurements to inform clinicians on the likely prognosis post-intervention. Special emphasis was placed on how to most accurately and efficiently model pulmonary vasculature in order to obtain the best compromise between anatomical and physiological accuracy, and computational effort. This investigation shows the potential of image-based personalized CFD modeling to support and improve the clinical decision-making process in diseases where "conventional" treatments derived from population-based guidelines are less effective or, in some cases, inadequate. Moreover, demonstrating that patient-specific models can be generated and applied to a specific clinical question without excessive computational demand, both in terms of time and resources, further supports the potential for a targeted clinical applicability of this technology.

### MATERIALS AND METHODS

### Finite-Element Model Generation From Patient Data

The clinical data for this study were obtained from Royal Papworth Hospital (Cambridge, UK) from three CTEPH patients who had undergone PTE. Patients 1 and 3 presented a stenosis on the right pulmonary artery (RPA), while in Patient 2 the partial occlusion was located on the left pulmonary artery (LPA). Further, in Patient 3 a large portion of the right lung had been surgically removed in a previous intervention. The study was approved by the local ethics review committees and all patients had given written consent.

The modeling workflow for the generation and personalization of the image-based models is illustrated in **Figure 1**. All models in this study were made patient-specific using anatomical information extracted from CT pulmonary angiography (CTPA) data (**Figure 1A**). All CTPA scans were performed using a combined chest dual energy (140–80 kV) acquisition with 1 mm slice thickness on a high-resolution Siemens Somatom Force scanner. The morphological models of the pulmonary arteries of each patient were initially based on the post-operative scans. Each morphology was segmented using the semi-automatic segmentation tool from the software package CRIMSON (Cardiovascular Integrated Modeling and SimulatiON), which was previously validated and used to simulate hemodynamics in a variety of cardiovascular problems (Lau and Figueroa, 2015; Arthurs et al., 2016, 2017; Khlebnikov and Figueroa, 2016; CRIMSON, 2017). The segmentation technique relies on the definition of paths along the centerline of each vessel, followed by the manual segmentation of the vessel cross sections at multiple locations along the centerline. The contours are then interpolated to produce smooth NURBS surfaces that approximate the vessel wall (**Figure 1B**). Interlobar arteries were segmented from the main pulmonary artery (MPA), as well as sub-segmental trunks until the third generation. **Table 1** reports the number of outlets segmented from each morphology, as well as information about the inlet and outlet surface areas and mesh size of each model.

From these segmentations, tetrahedral volume meshes with 2.8–3.4 million elements were generated, with maximum and minimum edge lengths over the whole meshes of 5.9 and 0.0085 mm, respectively. Regional curvature refinement was applied to critical areas such as bifurcations and sudden changes in the vessel diameter to ensure numerical accuracy in the flow simulations (**Figure 1D**). A boundary layer made of three concentric layers of refined tetrahedral elements, with a total thickness of 1 mm, was also added near the wall to resolve near-wall boundary layers. The effect of stenosis in the pulmonary arteries was investigated by manipulating the post-operative model to introduce a local narrowing in the proximal pulmonary vessels (**Figure 1E**). Each original segmentation was therefore modified to include a focal stenosis upstream of lobar divisions, while all other contours were left untouched, and the models were subsequently meshed. The stenosis severity was then calculated as the percentage reduction in diameter, i.e., the ratio of the difference between the original and the stenotic diameter over the original diameter. Patient 1 had a percentage diameter reduction of 43.8% on the RPA, Patient 2 of 40.3% on the LPA. The mesh from Patient 3 was first modified to add a 37.1% diameter reduction (Patient 3a, moderate stenosis), and then further manipulated to increase this value to 71.5% (Patient 3b, severe stenosis). This latter "virtual" scenario was motivated by the clinical context of this patient, where part of the right lung vasculature had been surgically removed in a previous intervention, increasing the likelihood of extensive microvasculature remodeling on the right side. In all cases the pulmonary obstruction values are within the ranges measured by previous clinical studies (Azarian et al., 1997; Miniati et al., 2006). Flow rates in the main pulmonary trunk were obtained from relevant literature and the same inlet flow profile was used in all models (Prakash et al., 2006; Forouzan et al., 2015). From these measures, a physiological velocity profile with ventricular systole from 0 to 380 ms and diastole from 380 to 925 ms was generated and prescribed as the inlet boundary condition to the model MPA, as shown in **Figure 1E**. Two-element Windkessel models were imposed on each of the outlet boundaries of the left and right pulmonary branches (**Figure 1E**). In each model the vessel walls were considered rigid.

FIGURE 1 | Pipeline for the generation of patient-specific models and simulations in this study (Patient 2, posterior view). (A) Post-surgical CT pulmonary angiography. (B) Segmentation of the stenosed model with surface interpolation and arrow indicating stenosis on the LPA. (C,D) Tetrahedral mesh (C) with curvature refinement (D). (E) Prescription of boundary conditions at the outlets (two-element Windkessel model) and at the inlet (Dirichlet condition on flow velocity). (F) CFD simulation of blood flow visualized using streamlines colored by velocity magnitude.

CFD simulations were subsequently carried out using CHeart, a finite-element software platform for personalized cardiovascular simulations, by solving the Navier-Stokes equations for a three-dimensional incompressible flow with a blood density (ρ) of 1,056 kg/m<sup>3</sup> and a dynamic viscosity (µ) of 3.5 cP (Lee et al., 2016; Hessenthaler et al., 2017). Transient flow simulations were performed on all pre- and post-operative morphologies for comparison of hemodynamic behaviors (**Figure 1F**). Three cardiac cycles were simulated in each case to achieve a periodic steady-state. The changes in the peak systolic, diastolic, and mean pressure gradients between inlet and outlets, and in the percentages of flow going to the right and left pulmonary arteries were then compared in the pre- and post-operative models to determine the impact of the removal of the proximal obstruction on each patient's hemodynamics. Variations in the WSS magnitude are also presented, whereby the WSS magnitude was calculated based on the tangential component of the traction vector **t** = σ**n**, as:

$$\mathrm{WSS} = \left\| \mathbf{t} - (\mathbf{t} \cdot \mathbf{n})\,\mathbf{n} \right\|$$

where σ is the Cauchy stress tensor and **n** is the normal to the wall. This study was carried out using the High Performance Computing (HPC) facility at King's College London, which comprises a 640 core SGI Altix-UV HPC with Nehalem-EX architecture.
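The WSS definition above (the magnitude of the tangential part of the traction **t** = σ**n**) can be illustrated with a few lines of NumPy. This is only a minimal sketch: the function name `wss_magnitude` and the example stress tensor are assumptions made for illustration and are not taken from CHeart or from the study's post-processing.

```python
import numpy as np

def wss_magnitude(sigma, n):
    """Magnitude of the tangential part of the traction t = sigma @ n."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)                  # enforce a unit wall normal
    t = np.asarray(sigma, dtype=float) @ n     # traction vector on the wall
    t_tangential = t - np.dot(t, n) * n        # subtract the normal component
    return float(np.linalg.norm(t_tangential))

# Illustrative stress state (made-up values): isotropic pressure plus one shear term.
p, tau = 1000.0, 2.5  # Pa
sigma_example = np.array([[-p, tau, 0.0],
                          [tau, -p, 0.0],
                          [0.0, 0.0, -p]])
wss_example = wss_magnitude(sigma_example, n=[0.0, 1.0, 0.0])
```

In this example the pressure contributes only to the normal component of the traction, so the result reduces to the shear term `tau`, which is the behavior expected from the definition.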

### Windkessel Model for the Pulmonary Microvasculature

The following 2-element Windkessel model was used to model the behavior of the peripheral vasculature on the left and right side (Muthurangu et al., 2005):

$$Q(t) = \frac{p(t)}{R} + C\frac{dp(t)}{dt} \tag{1}$$


where Q is the flow rate of blood from the main pulmonary artery, p is the blood pressure, R the vascular resistance, and C the vessel compliance (Muthurangu et al., 2005).

TABLE 1 | Anatomical parameters and mesh size of the segmented morphologies.

LPA, left pulmonary artery; RPA, right pulmonary artery.
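As a rough illustration of how Equation (1) behaves, the sketch below integrates the 2-element Windkessel model in time with a backward Euler scheme. The time-stepping scheme, the function name `windkessel_pressure`, the resistance value, and the constant inflow are all assumptions chosen for illustration; only the compliance value is the one quoted in this article, and the coupling used inside CHeart may differ.

```python
import numpy as np

def windkessel_pressure(q, dt, R, C, p0=0.0):
    """Integrate Q = p/R + C dp/dt with backward Euler; q is sampled every dt seconds."""
    p = np.empty(len(q))
    p_prev = p0
    for k, q_k in enumerate(q):
        # (p_k - p_prev)/dt = (q_k - p_k/R)/C, solved for p_k
        p_prev = (p_prev + dt * q_k / C) / (1.0 + dt / (R * C))
        p[k] = p_prev
    return p

C = 1.035e-8                      # m^3/Pa, compliance quoted in the article
R = 1.0e7                         # Pa*s/m^3, hypothetical resistance
dt = 1.0e-3                       # s, hypothetical time step
q_const = np.full(5000, 1.0e-4)   # m^3/s, hypothetical constant outflow
p_hist = windkessel_pressure(q_const, dt, R, C)
```

For a constant flow rate the pressure relaxes toward Q·R with time constant R·C, which gives a simple sanity check on any implementation of Equation (1).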

The outlet flow on the LPA and RPA was integrated directly into the Windkessel equation by relating the flow rate Q<sub>i</sub> at the relevant boundary ϒ<sub>i</sub> to the fluid velocity **υ**, i.e.,

$$\int_{\Upsilon_i} \boldsymbol{\upsilon} \cdot \mathbf{n}_i \, d\Upsilon_i = |Q_i| \tag{2}$$

where **n**<sub>i</sub> is the normal vector to the boundary plane ϒ<sub>i</sub>. An estimate based on the ratio between stroke volume and pulse pressure was chosen for the vessel compliance, while the resistance values on the RPA and the LPA were calculated iteratively for each morphology starting with values derived from in-vivo measurements (Muthurangu et al., 2005). These initial values were then iteratively tuned using a multi-step procedure to achieve the expected percentage ratio of the flow in the stenosed to that in the non-stenosed pulmonary artery (i.e., the flow ratio), based on the ratio of the areas of the stenosed to the non-stenosed pulmonary arteries (i.e., the size ratio) in each patient. First, physiological measurements of the flow splits for varying size ratios were collected from previous studies on patients with branch pulmonary stenosis, and a mathematical relationship was subsequently derived by fitting these data with an exponential curve (Sridharan et al., 2006; Ordovás et al., 2007). Second, this function was used to derive the expected flow ratio given the size ratio in each patient, where the LPA and RPA areas were calculated in the proximal pulmonary branches from the anatomy segmentations. Finally, a sweep study was performed in each case by progressively varying the resistance of the stenosed branch to identify the value that corresponded to the target flow ratio, keeping the vessel compliance fixed. The final values of the resistance in the LPA and RPA of each anatomy are reported in **Table 2**, together with the degree of stenosis severity and its anatomical location.
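The final sweep step described above can be sketched as follows. In the study each evaluation of the flow ratio comes from a 3D flow simulation; here that step is replaced by a deliberately crude stand-in (steady flow through two parallel resistances, so the flow ratio is simply the inverse resistance ratio). The function names, the target ratio, and the resistance values are hypothetical and serve only to show the structure of the search.

```python
import numpy as np

def lumped_flow_ratio(r_stenosed, r_non_stenosed):
    """Stand-in for a CFD run: steady flow split between two parallel resistances."""
    return r_non_stenosed / r_stenosed

def sweep_resistance(target_ratio, r_non_stenosed, candidates, model=lumped_flow_ratio):
    """Return the candidate resistance whose flow ratio is closest to the target."""
    ratios = np.array([model(r, r_non_stenosed) for r in candidates])
    best = int(np.argmin(np.abs(ratios - target_ratio)))
    return candidates[best], ratios[best]

r_ns = 1.0e7                                  # Pa*s/m^3, hypothetical non-stenosed branch
candidates = np.linspace(1.0e7, 5.0e7, 401)   # hypothetical sweep range for the stenosed branch
r_best, ratio_best = sweep_resistance(0.4, r_ns, candidates)  # 0.4 is a made-up target
```

Passing a different `model` callable (for example, a wrapper that launches a simulation and returns the measured flow ratio) keeps the sweep logic unchanged.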

#### Parametric Modeling

Transient flow simulations over multiple cardiac cycles on high-resolution meshes require significant computational resources on HPC facilities, thus limiting the number of patients that can be simultaneously modeled without compromising time efficiency. To improve the models' potential for clinical translation, reducing this simulation time is crucial. To achieve the necessary level of time efficiency, an alternative modeling approach based on constant flow simulations was proposed and tested in Patient 2, in both the stenosed and non-stenosed morphologies.



The vessel compliance was fixed at 1.035E-08 m<sup>3</sup>/Pa.

CFD simulations were performed on both stenosed and non-stenosed models with constant values of inlet velocity υ equal to 0.25υ<sub>max</sub>, 0.5υ<sub>max</sub>, and υ<sub>max</sub> (corresponding to 0.1433, 0.2866, and 0.5732 m/s, respectively), where υ<sub>max</sub> is the maximum inlet velocity of the transient inflow profile. All constant flow simulations were launched in parallel on the HPC facility at King's College London and run until the solution reached an asymptotic state. The pressure gradient between inlet and outlet, ∆p, was then computed in each simulation and a curve was obtained by fitting a second-degree polynomial to the data in the parametric space (υ<sub>n</sub>, ∆p), where υ<sub>n</sub> represents the constant inflow velocity normalized by υ<sub>max</sub>. The transient pressure gradient ∆p(t) was finally predicted by substituting the time-varying inflow velocity profile υ(t) into the polynomial fitting equation, without performing more computationally intensive transient flow simulations. To assess the accuracy of this method, the relative pressure drop obtained from the transient flow simulation was compared to that derived from the polynomial fitting equation using the same inflow velocity profile as the transient flow simulation. Maximum and mean errors between the two pressure curves were calculated in both the stenosed and non-stenosed models for Patient 2.
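The parametric procedure just described (fit a second-degree polynomial to three constant-flow results, then evaluate it on the transient inflow profile) can be sketched with NumPy as below. The pressure gradients in `dp_const` are placeholder numbers, not results from this study, and the half-sine inflow shape is an assumption; only the three normalized velocities and the systole/diastole timings come from the text.

```python
import numpy as np

# Normalized constant inflow velocities used in the study (0.25, 0.5 and 1.0 of v_max).
v_n_const = np.array([0.25, 0.5, 1.0])
# Placeholder pressure gradients in mmHg (hypothetical values, not simulation results).
dp_const = np.array([2.0, 6.0, 20.0])

# Second-degree polynomial through the three points in the (v_n, dp) space.
coeffs = np.polyfit(v_n_const, dp_const, deg=2)

# Hypothetical normalized inflow profile: half-sine systole (0-380 ms), zero in diastole.
t = np.linspace(0.0, 0.925, 926)  # s, one cardiac cycle sampled every 1 ms
v_n_t = np.where(t < 0.380, np.sin(np.pi * t / 0.380), 0.0)

# Predicted transient pressure gradient obtained by substituting v_n(t) into the fit.
dp_pred = np.polyval(coeffs, v_n_t)
```

With only three constant-flow points the quadratic interpolates the data exactly, so the quality of the prediction depends entirely on how well a quasi-steady relation between velocity and pressure drop holds during the cardiac cycle.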

#### RESULTS

Changes in blood flow dynamics, including flow ratio and WSS magnitude, and in the pressure gradients between the stenosed and the non-stenosed model were analyzed in each set of transient flow simulations. The same biomarkers were studied in the constant flow simulations, and in this instance data was extracted from the final time step simulated, representative of an asymptotic state.

## Peak Systolic, Diastolic, and Mean Pressure Gradients

Peak systolic, diastolic, and mean pressure gradients for each morphology are reported in **Table 3**. As expected, in all patients the stenosed models exhibited higher values of all pressure gradients than the non-stenosed ones. Even though the values reported in **Table 3** are calculated from the transient pressure curves averaged across all outlets, rather than on the LPA and RPA separately, this increase in pressure gradient was driven by the large difference between the inlet and the outlet pressure on the stenosed branch, as shown by the pressure magnitude in **Figure 2**.

Overall, upon removal of the thrombotic occlusion, the peak systolic pressure gradient decreased from 25.16 to 21.05 mmHg in Patient 1 (16.3% reduction), from 32.08 to 18.77 mmHg in Patient 2 (41% reduction), and from 61.02 and 70.45 mmHg (in the moderate and severe stenosis scenarios, respectively) to 57.36 mmHg in Patient 3, corresponding to reductions of 6.0 and 18.6%. The mean pressure gradients decreased from 4.91 to 4.09 mmHg in Patient 1 and from 5.71 to 3.55 mmHg in Patient 2, indicating reductions of 16.7 and 37.8%, respectively. In Patient 3 the mean pressure gradient was reduced from 11.07 and 12.44 mmHg, in the moderate and severe stenosis models, to 10.31 mmHg once the stenosis was removed, corresponding to percentage reductions of 6.9 and 17.1%, respectively. Finally, the DPG was also reduced upon removal of the stenosis. Patients 1 and 2 exhibited reductions of 9.3% (from 4.73 to 4.29 mmHg) and 16.8% (from 4.10 to 3.41 mmHg), respectively. In Patient 3 the decrease in DPG was 17.0 and 24.5% in the moderate and severe stenosis models (from 8.63 and 9.48 to 7.16 mmHg, respectively).
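The percentage reductions quoted above follow from the simple relation (stenosed − non-stenosed) / stenosed. As a quick check, the short sketch below recomputes them for the peak systolic gradients reported in the text; the helper name `percent_reduction` and the dictionary layout are illustrative choices.

```python
def percent_reduction(before, after):
    """Relative reduction, in percent, from the stenosed to the non-stenosed value."""
    return 100.0 * (before - after) / before

# Peak systolic pressure gradients in mmHg, (stenosed, non-stenosed), as quoted in the text.
peak_systolic = {
    "Patient 1": (25.16, 21.05),
    "Patient 2": (32.08, 18.77),
    "Patient 3a": (61.02, 57.36),
    "Patient 3b": (70.45, 57.36),
}

reductions = {name: round(percent_reduction(*pair), 1)
              for name, pair in peak_systolic.items()}
```

The recomputed values agree with the reported percentages to within rounding (the value for Patient 2 is quoted in the text as a whole number).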

### Flow Split

To appreciate what proportion of the inflow was directed to the stenosed and the non-stenosed branch, the percentage change in the flow ratio was calculated in both morphologies for each patient and is summarized in **Table 4**. The proportion of flow directed to the formerly stenosed branch increased in all patients upon removal of the stenosis, which resulted in a change in flow ratio ranging from approximately 10 to 60% (**Table 4**). In Patient 1 the percentage of the total inflow from the main pulmonary artery directed to the stenosed branch (RPA) increased from 22 to 28% upon removal of the occlusion, corresponding to an increase in the flow ratio of 39.3%. In Patient 2 the hemodynamics was more balanced, with ∼46% of the total inflow directed to the stenosed branch (LPA); this proportion increased to 48% when the stenosed segment was removed, which corresponded to an increase in the flow ratio of 8.2%. Patient 3 presented a moderate hemodynamic benefit from the removal of the occlusion, with the percentage of inflow to the stenosed branch (RPA) increasing from 28 to 31% (15.4% increase in flow ratio). When the effect of removing the stenosis was investigated in the model with a severe stenosis (Patient 3b), the percentage of inflow to the stenosed branch improved from 22 to 31%, corresponding to a change in flow ratio of 60.7%.

TABLE 3 | Pressure gradients between inlet (main pulmonary artery) and outlets (averaged across LPA and RPA) in mmHg for each patient.

PG, pressure gradient at peak systole (time step 275); DPG, diastolic pressure gradient.

### Blood Flow Dynamics and Wall Shear Stress

The blood flow velocity field and the distribution of WSS magnitude were also analyzed and compared between the models. The stenosed models of Patient 2 and Patient 3a were chosen for comparison in **Figure 3**.

Patient 2 exhibited blood flow velocities of up to 3.025 m/s at peak systole in the stenosed segment, where the WSS peaked at 31 Pa before dropping to 2 Pa downstream of the stenosis (**Figures 3A,B**). Small areas of slow recirculating blood flow are visible downstream of the stenosis, where the WSS magnitude decreased abruptly (**Figure 3A**, insert).

In Patient 3 the peak systolic blood flow velocity in the stenosed segment reached 4.636 m/s, which corresponded to an increase in WSS magnitude to 33 Pa (**Figures 3C,D**). Unlike Patient 2, in this case dilated regions are observed downstream of the occlusion and in the secondary branch originating from the stenotic segment, where the magnitude of the WSS dropped to <1 Pa (**Figure 3D**, insert). A large region of recirculating blood flow and a low velocity helical flow developed in these regions and was associated with low WSS magnitude (**Figures 3C,D**, insert).

### Constant Flow Simulations and Parametric Modeling

Parametric modeling was then employed to investigate whether one of the biomarkers of interest, pressure gradients, could be calculated from discrete constant flow simulations. The pressure gradient for each of the flow simulations with constant inflow velocity was recorded in the stenosed and the non-stenosed models of Patient 2. The polynomial fitting equations for the stenosed and non-stenosed cases are reported in **Figures 4A,B**. The corresponding predicted curves of transient pressure gradients in the stenosed and non-stenosed model, ∆p<sup>S</sup>(t) and ∆p<sup>NS</sup>(t), respectively, were compared to the data from the transient flow simulations with the same prescribed inflow velocity profile in **Figures 4C,D**.

In the non-stenosed model, the maximum pressure gradient at peak systole predicted using the constant flow approach was 22.4 mmHg, which was 19% higher than the peak systolic pressure gradient derived from the transient flow simulations. In the stenosed mesh, the parametric modeling predicted a peak systolic gradient of 41.8 mmHg, which, compared to the transient flow simulation peak, resulted in an error of ∼30%. The constant flow approach overestimated the mean pressure gradient by 24 and 47% in the stenosed and non-stenosed models, respectively. Overall, the mean absolute error over the whole cycle, derived by subtracting the two curves in time, was 1.51 mmHg for the stenosed mesh and 1.64 mmHg for the non-stenosed mesh. For both models, the absolute value of the difference was below 5.5 mmHg for 75% of the cycle. These results were obtained without modifying the specific memory requirements and in one third of the computational time of the transient flow simulations on the same HPC cluster, making the parametric modeling approach less computationally intensive than the transient flow simulations.
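The constant flow approach can be sketched as follows: fit a polynomial to the pressure gradients obtained from a few constant inflow simulations, evaluate it along the transient inflow waveform, and compare the result with the transient curve. The three sample points, the waveform, the placeholder reference curve and all function names below are hypothetical and are not data from this study; a quadratic passing through three points is assumed for the fit.

```python
import math


def quadratic_through(points):
    """Return f(v), the Lagrange quadratic through three (velocity, gradient) points."""
    (x0, y0), (x1, y1), (x2, y2) = points

    def f(v):
        l0 = (v - x1) * (v - x2) / ((x0 - x1) * (x0 - x2))
        l1 = (v - x0) * (v - x2) / ((x1 - x0) * (x1 - x2))
        l2 = (v - x0) * (v - x1) / ((x2 - x0) * (x2 - x1))
        return y0 * l0 + y1 * l1 + y2 * l2

    return f


def mean_absolute_error(a, b):
    """Mean absolute difference between two equally sampled curves."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Hypothetical constant-flow results: (inflow velocity in m/s, gradient in mmHg).
constant_flow_points = [(0.1, 1.0), (0.5, 9.0), (1.0, 30.0)]
gradient_of = quadratic_through(constant_flow_points)

# Hypothetical inflow waveform over one cycle: half-sine systole, low diastolic flow.
n_steps = 100
inflow = [0.1 + 0.9 * max(0.0, math.sin(2 * math.pi * k / n_steps)) for k in range(n_steps)]

# Predicted transient gradient: evaluate the fitted curve at each time step.
predicted = [gradient_of(v) for v in inflow]

# A transient simulation would supply the reference curve; a scaled copy stands in here.
reference = [0.8 * p for p in predicted]
print(f"peak predicted gradient: {max(predicted):.1f} mmHg")
print(f"mean absolute error: {mean_absolute_error(predicted, reference):.2f} mmHg")
```

Since the constant flow simulations are independent of each other, they can be launched concurrently, which is where the reduction in wall-clock time comes from.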

### DISCUSSION

This study provides a proof of concept on how high performance computing, imaging data and numerical modeling can be successfully integrated to address the specific clinical questions posed by CTEPH. We performed patient-specific CFD simulations on realistic models of the pulmonary arteries in patients affected by CTEPH to help inform patient selection criteria for surgical intervention by pulmonary thromboendarterectomy. The additional information provided by the models is particularly relevant for patient management in CTEPH, which is an under-diagnosed progressive disease whose stage at the point of intervention can vary significantly from patient to patient. The success of surgery depends on the degree of peripheral vascular remodeling that has occurred in the lungs since the formation of the thrombotic occlusion. Our approach makes it possible to model the contribution of the occlusion (proximal resistance) and of the pulmonary vasculature remodeling (peripheral resistance) to the overall pulmonary vascular resistance. By quantifying changes in clinical biomarkers such as pressure gradients, WSS and flow balance between LPA and RPA, this technique can help assess the hemodynamic effects introduced by the removal of a proximal occlusion in each patient. Such information is very challenging to obtain in vivo, making pre-operative patient selection one of the most problematic issues of CTEPH management. By establishing whether the main cause of the disease in each individual is proximal or peripheral, our personalized models can provide potentially decisive data to inform treatment.

FIGURE 3 | Simulation results at peak systole. (A) Streamlines colored by velocity magnitude in Patient 2. (B) WSS magnitude in Patient 2. (C) Streamlines colored by velocity magnitude in Patient 3a; (D) WSS magnitude in Patient 3a. The mild stenosis case was employed for Patient 3a. The inserts show the flow behavior and WSS magnitude in the stenosed segment of the pulmonary arteries, highlighting the flow recirculation regions, and the helical flow (A,C) and the drop in the WSS magnitude (B,D).

### Metrics for Patient Selection: Pressure Gradients and Flow Ratio

line represents an average of the gradients obtained in LPA and RPA from the transient flow simulation.

In CTEPH, pressure gradients and flow ratio between RPA and LPA provide extremely valuable information for characterizing the disease progression. However, pressure and flow cannot be directly quantified from standard imaging data, and even when this assessment can be performed, e.g., using advanced flow imaging techniques such as PC-MRI, its accuracy is often hampered by low spatio-temporal resolution. While the gold standard for pressure and flow measurements is cardiac catheterization, this technique is highly invasive and only allows for measurement at discrete proximal locations in the pulmonary trunk. Numerical flow simulations present the advantage of providing noninvasive estimates at all points in the morphology of interest, including peripheral vessels.

Results showed that the removal of a proximal occlusion in the left or right pulmonary artery led to a successful reduction of the pressure gradients between the main pulmonary artery and the peripheral vasculature in Patients 1 and 2, while in Patient 3 the reduction was less significant. In this case, a pressure gradient reduction similar to that of Patient 1 (16.9%) could be achieved only when a more severe stenosis was introduced. This suggests that in Patient 3 the degree of remodeling in the peripheral vasculature represents a relatively larger contribution to the overall pressure gradients than the localized proximal resistance due to the thrombotic occlusion, thus implying that the surgical removal of the stenosis might have a less beneficial effect in this case than for Patients 1 and 2.

The removal of the stenosis also changed the flow ratio between right and left pulmonary arteries, increasing the flow rate to the repaired branch. While Patients 1 and 3a had a similar reduction in diameter, removal of the segment resulted in different levels of improvement in flow balance. The percentage change in flow ratio was higher in Patient 1 (39.3%) than in Patients 2 and 3a, where it reached 8.2 and 15.4%, respectively. The small variation observed in Patient 2 can be related to the fact that in this case the flow ratio is close to the values found in normal subjects (Cheng et al., 2005). This suggests that for Patient 1 the removal of the proximal occlusion was more beneficial to the balance of pulmonary flow than in Patients 2 and 3. For Patient 3, the flow ratio increased significantly, by more than 60%, only when a severe stenosis was introduced in the model and subsequently removed. This result is in agreement with the hypothesis that removal of the stenosis is less beneficial in this case than in the other two. It is worth noting that this patient is the only one where the DPG value was above the critical threshold for peripheral remodeling of 7 mmHg.

#### Constant Flow Simulations

The estimation of transient pressure gradients from constant flow simulations provided a computationally efficient method for the assessment of the peak systolic and mean pressure gradient.

During diastole, when changes in the inlet velocity magnitude are very small, the pressure gradient curve from the constant flow simulation was in agreement with the simulation results using the time-varying inflow velocity profile. However, parametric modeling resulted in significant differences in the peak systolic pressure gradients, with a 30% overestimation compared to the transient inflow simulation result in the stenosed model. Such overestimation is present in the non-stenosed case as well, albeit of smaller magnitude (20%). When the percentage of improvement in the peak systolic pressure gradient following the removal of the stenosis is considered, however, both models provide a similar result: the peak systolic pressure gradient was reduced by ∼41% according to the transient inflow simulation and by 46% in the results from parametric modeling.
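The relative improvements quoted above follow from the peak values reported in the Results. The sketch below recomputes the parametric reduction from 41.8 and 22.4 mmHg. The transient peaks are not restated in this section, so they are back-calculated here from the reported overestimations (∼30% and 19%) and should be read as approximations rather than reported values. Function and variable names are illustrative.

```python
def percent_reduction(before, after):
    """Relative reduction (%) from `before` to `after`."""
    return 100.0 * (before - after) / before


# Peak systolic gradients predicted by parametric modeling (mmHg), from the Results.
parametric_stenosed = 41.8
parametric_non_stenosed = 22.4

# Approximate transient peaks, back-calculated from the reported overestimations.
transient_stenosed = parametric_stenosed / 1.30
transient_non_stenosed = parametric_non_stenosed / 1.19

parametric_drop = percent_reduction(parametric_stenosed, parametric_non_stenosed)
transient_drop = percent_reduction(transient_stenosed, transient_non_stenosed)
print(f"parametric reduction: {parametric_drop:.1f}%")
print(f"transient reduction (approx.): {transient_drop:.1f}%")
```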

Overall, parametric modeling provides an effective strategy to reduce computation time and to estimate the expected change in peak systolic pressure gradients post-operatively, although the peak magnitude of the pressure gradient in each model is overestimated by this simplified approach. While each transient flow simulation took just under 5 h to complete, all three constant flow simulations were launched at the same time and required only 1 h of computations, effectively reducing the simulation time. This is particularly relevant in clinical applications, where the prompt availability of results is essential for an effective clinical translation of the modeling.

#### Wall Shear Stress

WSS is a biomarker that cannot be derived from standard anatomical imaging data such as the CTPA scans used in this study. Non-routine imaging techniques, such as PC-MRI, are usually needed in order to reconstruct flow dynamics in time and space, from which WSS can be calculated; however, this type of imaging data requires longer acquisition times to achieve the necessary spatio-temporal resolution for an accurate estimation of the WSS. WSS is nevertheless a significant metric in the context of vascular pathology. It is well-known that endothelial cells, when exposed to normal shear forces, produce agents with antithrombotic properties, but when the wall shear stress is outside of the normal ranges this mechanism is disrupted and pathologies such as arteriosclerosis and thrombogenesis can develop (Tang et al., 2012). Low wall shear stress is linked to vasodilation, aneurysm formation, and the development of atherosclerotic plaques (Jiang et al., 1999; Boussel et al., 2008). In patients affected by pulmonary hypertension, WSS is decreased in the pulmonary arteries, a phenomenon which is associated with vasodilation, increased cardiac output, and a subsequent decrease in pulmonary vascular resistance. The compensatory effect resulting from vessel dilation can initially counter the disease progression; however, drug therapy based on epoprostenol is necessary to maintain a long-term reduction in pulmonary vascular resistance that can stabilize the disease, albeit without reversing its progression (McLaughlin et al., 1998). Such a mechanism can explain the results observed in Patient 3, where the magnitude of WSS downstream of the stenosis was lower than in Patient 2 despite a similar diameter reduction in the stenotic segment, and a much more dilated wall was observed in the same region. The lower average magnitude of WSS in Patient 3 could also be associated with hypertension, which was suggested by a DPG value greater than 7 mmHg.

Understanding the relationship between low WSS magnitude, vessel dilation and PVR reduction can reveal key information on the progression of pulmonary hypertension and the degree of remodeling in the vasculature. Therefore, WSS may prove an insightful biomarker to risk-stratify patients based on hemodynamic features. Despite its importance, WSS calculations are often not available to the clinician due to the difficulty in reliably deriving this biomarker from imaging data alone. In this context, CFD simulations can provide valuable insight into patient assessment by quantifying the WSS magnitude and its changes over time.

#### Limitations

Simulations for this study were based on a number of assumptions. We prescribed that arterial walls were rigid, and expressed the behavior of the peripheral vasculature below subsegmental level using a lumped parameter model, i.e., a two-element Windkessel model. The anatomical segmentation, even if performed using a semiautomatic technique, could only be carried out up to a limited degree of detail. The high spatial and contrast resolution of dual energy CT acquisitions allows the visualization of small diameter arterioles, making it possible to trace pulmonary vasculature well beyond subsegmental level. A cut-off generation or caliber beyond which no vessel is segmented was defined, since tracing every individual terminal branch would become excessively time consuming and therefore unfeasible, particularly if considering potential clinical applications. Besides being an obvious trade-off between anatomical accuracy and operational efficiency, defining such an endpoint is a complex task in itself, with no supporting literature available to guide the decision. Overall, therefore, segmentation of the pulmonary arteries is highly affected by operator variability, thus limiting the reproducibility of the experiment. Future work to address this limitation could include development of fully automatic segmentation techniques, like statistical shape models or atlas-based approaches (Shikata et al., 2004; Buelow et al., 2005).
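For reference, a two-element Windkessel outlet relates pressure P and flow Q through a peripheral resistance R and a compliance C via C dP/dt = Q − P/R. The sketch below integrates this relation with forward Euler. The parameter values, the non-dimensional units and the function name are arbitrary assumptions, and this is not the coupling scheme used by the CFD solver in this study.

```python
def windkessel_pressure(flow, resistance, compliance, dt, p0=0.0):
    """Integrate C dP/dt = Q(t) - P/R with forward Euler.

    flow: sequence of outlet flow values Q sampled every `dt`.
    Returns one pressure per flow sample, starting from p0.
    """
    pressures = [p0]
    p = p0
    for q in flow[:-1]:
        p = p + dt * (q - p / resistance) / compliance
        pressures.append(p)
    return pressures


# Arbitrary illustrative values in consistent (non-dimensional) units.
R, C, dt = 2.0, 0.5, 0.001
constant_flow = [1.5] * 20001  # 20 time units of constant outflow

p = windkessel_pressure(constant_flow, R, C, dt)
# Under constant flow the pressure relaxes toward Q * R with time constant R * C.
print(f"final pressure: {p[-1]:.4f}, steady-state Q*R: {1.5 * R:.4f}")
```

The resistance sets the steady pressure drop across the truncated peripheral bed, while the compliance sets how quickly the outlet pressure responds to changes in flow.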

The models in this pilot study were also based on postoperative scans due to unavailability of pre-operative datasets, and the occlusion level and position were idealized based on the clinical history of the patients. However, the aim of this preliminary work was to define the methodology and technical feasibility of the study, and this limitation will be overcome in future studies using pre-operative imaging acquisition protocols. Similarly, due to limited availability of pre-surgical clinical records for the patients examined, indexes and comparative data for validation, e.g., comparable pressure and velocity fields, have been chosen from relevant literature. Specifically, in all patients the same idealized inflow velocity profile was employed in both the stenosed and the non-stenosed models, while evidence suggests that in CTEPH patients the waveform diverges from the standard and changes markedly between patients (Kim, 2006). Rather than a methodological shortcoming, this limitation is due to incomplete clinical datasets and thus can be addressed in future studies by prospectively acquiring Color Doppler ultrasound data in addition to morphological CTPA scans.

### CONCLUSION

This study shows that patient-specific biophysical modeling of pulmonary vasculature has a potential role in optimizing CTEPH patient selection for PTE and could become an effective tool for a quantitative classification of CTEPH types and treatment in the longer term. Specifically, providing a quantitative prediction of the changes in pressure gradients in the pulmonary tree and flow ratio between the RPA and LPA can help identify patients in whom chronic hypertension is mostly due to peripheral remodeling, and therefore is not significantly ameliorated by removal of a thrombotic occlusion. Our results show that the improvement in both pressure gradients and flow balance is different in patients with similar diameter reduction in the stenotic segment, implying that such assessment goes beyond the simple evaluation of the percentage of stenosis present in the vasculature, which provides a purely anatomical criterion for intervention. Linking the blood flow dynamics to the patient morphology is thus a key step to determine whether the hemodynamic benefits of the stenosis removal are sufficiently significant to justify surgical intervention.

In addition to this application to CTEPH, this approach is highly flexible and can be generalized to perform individualized assessment of any disease characterized by a high degree of morphological variability. Thanks to increasingly powerful HPC resources, this additional information on the pathophysiological mechanisms linking altered hemodynamics and disease progression can now be computed in a timeframe compatible with clinical needs, which represents a major step forward in the clinical translation of mathematical modeling. To further address this issue, we also presented a time-efficient approach based on constant flow simulations and parametric curve fitting that has the ability to reproduce transient pressure gradients for a given inflow velocity profile, thus reducing computational demand and optimizing the usage of HPC resources. By addressing a specific clinical question, this study provides a proof of concept that mathematical modeling combined with high performance parallel computing holds significant potential for assisting the clinical decision-making process for CTEPH patients who are potential candidates to PTE.

### AUTHOR CONTRIBUTIONS

AdV together with DN supervised the project from beginning to end and coordinated the authors' efforts, performing the simulations, and working on the data analysis. MS worked on the project producing the meshes on which simulations were run, as well as contributing to the analysis of results, and together with AdV authored the manuscript. PS and JD were responsible for patient recruitment and acquisition of the data employed to perform this study.

#### ACKNOWLEDGMENTS

DN acknowledges funding from Engineering and Physical Sciences Research Council Research Grant (EP/N011554/1) and Engineering and Physical Sciences Research Council Healthcare Technology Challenge Award (EP/R003866/1). This work was supported by the Wellcome/EPSRC Centre for Medical Engineering at King's College London [WT 203148/Z/16/Z].

We would like to thank Prof. Alberto Figueroa (Department of Bioengineering, University of Michigan) and Dr. Rotislav Khlebnikov, Dr. Chris Arthurs, and Dr. Desmond Dillon-Murphy (King's College London) for their technical support with the software package CRIMSON.

### REFERENCES

an upright posture using MRI: the effects of exercise and age. J. Magn. Reson. Imaging 21, 752–758. doi: 10.1002/jmri.20333


cardiac ventricular assist devices. Comput. Biol. Med. 49, 83–94. doi: 10.1016/j.compbiomed.2014.03.013


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Spazzapan, Sastry, Dunning, Nordsletten and de Vecchi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Transport and Mixing Induced by Beating Cilia in Human Airways

#### Sylvain Chateau<sup>1,2</sup>, Umberto D'Ortona<sup>1</sup>, Sébastien Poncet<sup>1,2</sup> and Julien Favier<sup>1</sup>\*

<sup>1</sup> Aix Marseille Univ, Centre National de la Recherche Scientifique, Centrale Marseille, M2P2, Marseille, France, <sup>2</sup> Département de Génie Mécanique, Université de Sherbrooke, Sherbrooke, QC, Canada

The fluid transport and mixing induced by beating cilia, present in the bronchial airways, are studied using a coupled lattice Boltzmann and Immersed Boundary solver. This solver allows the simulation of both single and multi-component fluid flows around moving solid boundaries. The cilia are modeled by a set of Lagrangian points, and Immersed Boundary forces are computed at these points in order to enforce the no-slip velocity condition between the cilia and the fluids. The cilia are immersed in a two-layer environment: the periciliary layer (PCL) and the mucus above it. The motion of the cilia is prescribed, as well as the phase lag between two cilia, in order to obtain a typical collective motion of cilia known as metachronal waves. The results obtained from a parametric study show that antiplectic metachronal waves are the most efficient regarding the fluid transport. A specific value of phase lag, which generates the largest mucus transport, is identified. The mixing is studied using several populations of tracers initially seeded into the periciliary liquid, in the mucus just above the PCL-mucus interface, and in the mucus far away from the interface. We observe that each zone exhibits different chaotic mixing properties. The strongest mixing is obtained in the PCL, where only a few beating cycles of the cilia are required to obtain full mixing, while above the interface the mixing is weaker and takes more time. Almost no mixing is observed within the mucus, and almost none of the tracers penetrate the PCL. Lyapunov exponents are also computed for specific locations to assess how the mixing is performed locally. Two time scales are introduced to allow a comparison between mixing induced by fluid advection and by molecular diffusion. These results are relevant in the context of respiratory flows to investigate the transport of drugs for patients suffering from chronic respiratory diseases.

#### Edited by:

Peter V. Coveney, University College London, United Kingdom

#### Reviewed by:

Tim David, University of Canterbury, New Zealand

Oliver E. Jensen, University of Manchester, United Kingdom

> \*Correspondence: Julien Favier julien.favier@univ-amu.fr

#### Specialty section:

This article was submitted to Computational Physiology and Medicine, a section of the journal Frontiers in Physiology

Received: 05 December 2017 Accepted: 19 February 2018 Published: 06 March 2018

#### Citation:

Chateau S, D'Ortona U, Poncet S and Favier J (2018) Transport and Mixing Induced by Beating Cilia in Human Airways. Front. Physiol. 9:161. doi: 10.3389/fphys.2018.00161

Keywords: mucus, cilia, transport, mixing, pulmonary flow, lattice Boltzmann method, immersed boundary

### 1. INTRODUCTION

Computational Fluid Dynamics (CFD) is becoming a powerful tool in the medical context. It provides good insight into physical phenomena occurring inside the human body without the need for intrusive surgical methods, which often fail to observe the desired phenomenon as they introduce perturbations. Many organs, such as the human heart, have already received a lot of attention from scientists using numerical methods (Khalafvand et al., 2011). However, only a few studies have focused on modeling the lungs entirely, as it is probably one of the most challenging organs to simulate due to the different length scales involved, from microns for the mucociliary transport to centimeters for the airflow in the upper airways. The transport of mucus depends on its interaction with cilia, whose scale is of the order 10<sup>−6</sup> m, but is also strongly affected by the numerous bifurcations (length and diameter of order 10<sup>−1</sup> m in the upper airways) that form the bronchial tree. Some authors have tried to study the entire lung, at the price of severe simplifications: Inagaki et al. (2009) looked at the pressure losses inside the full bronchial system, but neglected the multi-component nature of the flow and the mucociliary transport. Stylianou et al. (2016) looked at the impact of bifurcation on particle-laden flow using Direct Numerical Simulations (DNS), but considered only one bifurcation and did not take into account all the phenomena occurring at the microscale. Given the current capacities of supercomputers, it is prohibitive to model the entire system while accounting for the multi-component and multi-scale nature of the flow, the deformation of the bronchial tree during a breathing cycle, the heat and mass transfer at the epithelium surface, etc. Hence, many authors restrict their study to a given scale or phenomenon, as is the case in the present work. Before going any further, it is also worth noting that, in recent years, the need for efficient methods able to simulate deformable moving solids in multi-component flows has also been felt in other areas. In this context, the aim of this paper is to present a numerical tool which can be used to study many biofluidic configurations, such as the transport of nutrients in the brain (Siyahhan et al., 2014), the displacement of ovules in the Fallopian tubes (Anand and Guha, 1978), or even the simulation of industrial micromixers (Chen et al., 2013).

In this paper, one considers the mucociliary clearance process (MCC), which is the main defense mechanism developed by the human body to protect itself against foreign particles (like pollutants, allergens, bacteria, etc.) which are inhaled during the breathing process. Its principle is simple: a layer of fluid called Airway Surface Liquid (ASL) covers the surface of the airways. The inhaled particles are deposited onto it, and then transported to the stomach thanks to the combined motion of the cilia tufts that cover the epithelial surface. In the two-phase model adopted here, it is generally assumed that the ASL is in fact the superposition of two different fluid layers: the periciliary liquid (PCL), and the mucus phase above it (Knowles and Boucher, 2002). In this model, the PCL can be viewed as a Newtonian fluid similar to water. However, the modeling of PCL remains an open question in the literature, as its experimental characterization is not yet fully understood. Hence, other models exist such as, for example, the one of Button et al. (2012), where the mucus is depicted as a gel made of reticulated mucins. The interesting idea proposed there is that, if the PCL is not thick enough and/or is poorly hydrated, the mucus gel may squeeze the cilia and prevent them from beating efficiently. The purpose of the PCL is to act as a kind of lubricant which allows the mucus to slip onto it (Puchelle et al., 1995). Its thickness is around 6 µm. The mucus is composed of 95% water, but also contains macromolecules called mucins (Lai et al., 2009). It is a highly non-Newtonian fluid which exhibits a plethora of complex properties such as viscoelasticity and thixotropy. Its role is to act as a barrier against the external environment and to trap the particles. Its depth varies between 5 and 100 µm depending on the position in the bronchial tree (Widdicombe and Widdicombe, 1995). One of the main difficulties in its characterization is the huge variability of its rheological properties (Lafforgue et al., 2017). They can indeed vary by several orders of magnitude during the same day for a given person (Kirkham et al., 2002).

In order to propel these two fluid layers, the epithelium is covered by tufts of cilia (around 200–300 cilia per tuft), which are cytoplasmic extensions put into motion by biochemical motors. Their motion can be decomposed into two steps: the stroke phase, which lasts around one third of the total beating period, where cilia are almost orthogonal to the flow in order to maximize their pushing effect; and the recovery phase, where cilia bend and get closer to the epithelial surface in order to minimize their impact on the flow. This spatial asymmetry is essential in the context of creeping flows, as it is the only mechanism that generates transport (Purcell, 1977; Khaderi et al., 2010). Note that the recovery phase does not occur in the same plane as the stroke phase, but instead occurs in a plane somewhat more inclined with respect to the vertical axis (Sleigh et al., 1988). The cilia length is around 7 µm, thus allowing them to enter the mucus during the stroke phase. Cilia diameter is estimated to be around 0.2–0.3 µm according to Sleigh et al. (1988), and their beating frequency is around 15 Hz.

MCC can only work if both the mucus production and ciliary beating are fully functional. Indeed, diseases such as cystic fibrosis (CF), asthma, or Chronic Obstructive Pulmonary Disease (COPD) can all be related to abnormalities in the MCC process. In the case of CF, the mucus is very viscous and secreted in large quantities, which hinders the work of the cilia. Mucus flow thus becomes almost null and mucus accumulates. This leads to severe infections, which damage or destroy the cilia tufts. On the other hand, people with asthma have fewer cilia, and the ones remaining may be dysfunctional. The transport of mucus is then less efficient than in healthy persons, which is compensated for by coughing, for instance.

Experimentally, it has been observed that cilia synchronize their beating with that of their neighbors with a small phase lag (Sleigh, 1962). This results in metachronal waves (MCW) which can be seen at the surface formed by the cilia tips. When the phase lag ΔΦ between two cilia is negative, the MCW are called symplectic and move in the same direction as the flow. On the contrary, when 0 < ΔΦ < π, the MCW are called antiplectic and move in the direction opposite to the flow. These waves have been shown to greatly enhance the fluid transport (Gueron and Levit-Gurevich, 1999; Gauger et al., 2009), but there are still open questions on which kind of wave is the most efficient for mucus transport and mixing. Most existing studies are either experimental studies performed on living animals (Machemer, 1972), or numerical ones performed in a single fluid environment (Khaderi et al., 2011; Ding et al., 2014). Only a few addressed the problem using a two-layer fluid (Chatelin and Poncet, 2016; Chateau et al., 2017). The main result of these works is that antiplectic MCW are found to be the most efficient, and that particular phase lags between two cilia maximize the mucus transport. Other authors (Sedaghat et al., 2016) have investigated the role of mucus rheology using a methodology similar to the one presented here, and found that the ratio of the elastic contribution of mucus viscosity to the total mucus viscosity has a significant effect on the mucociliary transport. In particular, the mucus velocity was observed to increase when decreasing the elastic part of the mucus viscosity. The study of the mixing induced by beating cilia is also very important, as it provides information about the deposition rate of particles (such as inhaled drugs) onto the epithelial cells. However, to the best of the authors' knowledge, only Ding et al. (2014) studied the mixing properties of both symplectic and antiplectic MCW, but in a single fluid layer. The objective of the present paper is to fill this gap by gaining deeper insight into the transport and mixing properties of MCW in a more realistic two-phase environment.
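The sign convention for the phase lag can be summarized in a small helper. This is an illustrative sketch: the function names are hypothetical, the symplectic range is assumed to be bounded below by −π, a zero phase lag is labeled as synchronized beating, and the wavelength estimate assumes a constant phase lag between equally spaced neighboring cilia, so that one wave spans 2π divided by the magnitude of the phase lag.

```python
import math


def classify_metachronal_wave(phase_lag):
    """Classify a wave from the phase lag (radians) between neighboring cilia."""
    if phase_lag == 0.0:
        return "synchronized"
    if -math.pi < phase_lag < 0.0:
        return "symplectic"   # wave travels in the same direction as the flow
    if 0.0 < phase_lag < math.pi:
        return "antiplectic"  # wave travels against the flow
    return "unclassified"     # outside the open interval (-pi, pi)


def cilia_per_wavelength(phase_lag):
    """Number of cilia spanned by one wavelength for a constant, nonzero phase lag."""
    return 2.0 * math.pi / abs(phase_lag)


for lag in (-math.pi / 4, 0.0, math.pi / 4):
    print(f"{lag:+.3f} rad -> {classify_metachronal_wave(lag)}")
print(f"cilia per wavelength at pi/4: {cilia_per_wavelength(math.pi / 4):.1f}")
```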

The article is organized as follows: the algorithm used to model the MCC in a two-layer context is described in section 2. Results regarding the transport of passive tracers are presented in section 3, and a displacement ratio is introduced in order to quantify the efficiency of the wave organization. In section 4, the mixing capacities of the system are studied using tracer advection and by computing a global mixing index. Lyapunov exponents are also used in order to gain insight into how the mixing is locally achieved. Two time scales are also defined in order to compare the mixing induced by fluid advection to the mixing induced by molecular diffusion. Finally, section 5 summarizes the main results of this work and outlines some perspectives.

#### 2. NUMERICAL METHOD

The Boltzmann equation describes the behavior of a gas from a microscopic point of view. The Lattice Boltzmann Method (LBM) solves the discrete Boltzmann equation for an ensemble of distribution functions f<sub>i</sub>(**x**, t) on a discrete lattice. These distribution functions describe the probability that ensembles of particles, with velocity **e**<sub>i</sub>, collide and then stream along the discrete velocity vectors **e**<sub>i</sub>. By doing a Chapman-Enskog analysis, one can recover the Navier-Stokes equations, as presented in Kruger et al. (2016) for instance. This kind of fluid solver is now considered as an efficient alternative to traditional Navier-Stokes solvers.

#### 2.1. Mathematical Description

#### 2.1.1. Single-Component LB Model

In LBM, the fluid status is updated in time by resolving the discrete Boltzmann equation (Chen and Doolen, 1998, and references therein):

$$f\_i(\mathbf{x} + \mathbf{e}\_i \Delta t, t + \Delta t) = f\_i(\mathbf{x}, t) - \frac{\Delta t}{\tau} \left[ f\_i(\mathbf{x}, t) - f\_i^{(eq)}(\mathbf{x}, t) \right] \tag{1}$$

where f<sub>i</sub>(**x**, t) represents the distribution function at time t and position **x** in the i-th direction of the lattice (D2Q9 in 2D, and D3Q19 in 3D). Equation (1) uses the Single Relaxation Time (SRT) Bhatnagar-Gross-Krook (BGK) collision operator (Bhatnagar et al., 1954). In this model, τ is the relaxation time, which is linked to the lattice viscosity ν by τ = 3ν + 0.5 under the classical normalization Δx = Δt = 1 (Kruger et al., 2016). In this work, each phase is Newtonian, but has a different viscosity. The distribution functions move along a set of discrete velocity vectors **e**<sub>i</sub>, which depend on the lattice considered, as shown in **Figure 1**. The local density and momentum at each lattice node are obtained by summing the functions f<sub>i</sub>(**x**, t) over all directions:

$$\rho(\mathbf{x},t) = \sum\_{i=0}^{N} f\_i(\mathbf{x},t) \qquad \qquad \rho \mathbf{u}(\mathbf{x},t) = \sum\_{i=0}^{N} f\_i(\mathbf{x},t) \mathbf{e}\_i \tag{2}$$

where N is the number of discrete velocities on the lattice. The discrete equilibrium function f<sub>i</sub><sup>(eq)</sup>(**x**, t) that appears in Equation (1) can be obtained by a Hermite series expansion of the Maxwell-Boltzmann equilibrium distribution (Chen and Doolen, 1998, and references therein):

$$f\_i^{(eq)} = \rho \omega\_i \left[ 1 + \frac{\mathbf{e}\_i \cdot \mathbf{u}}{c\_s^2} + \frac{(\mathbf{e}\_i \cdot \mathbf{u})^2}{2c\_s^4} - \frac{\mathbf{u} \cdot \mathbf{u}}{2c\_s^2} \right] \tag{3}$$

where c<sub>s</sub> = 1/√3 is the speed of sound in lattice units. The weight coefficients ω<sub>i</sub> are ω<sub>0</sub> = 4/9, ω<sub>1−4</sub> = 1/9 and ω<sub>5−8</sub> = 1/36 for D2Q9 lattices, and ω<sub>0</sub> = 1/3, ω<sub>1−6</sub> = 1/18 and ω<sub>7−18</sub> = 1/36 for D3Q19 lattices (Qian et al., 1992).
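As an illustration, the second-order equilibrium distribution can be evaluated on a D2Q9 lattice and checked against the density and momentum sums. This is a minimal sketch, not the solver used in this work: the ordering of the velocity vectors and the names `E`, `W`, and `equilibrium` are assumptions made for the example, and the factors of 1/2 on the quadratic terms follow the standard second-order expansion.

```python
import numpy as np

# D2Q9 lattice (assumed ordering): rest, 4 axis-aligned, 4 diagonal velocities
E = np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1],
              [1, 1], [-1, 1], [-1, -1], [1, -1]], dtype=float)
W = np.array([4 / 9] + [1 / 9] * 4 + [1 / 36] * 4)  # weights omega_i
CS2 = 1.0 / 3.0                                     # squared lattice speed of sound

def equilibrium(rho, u):
    """Second-order equilibrium f_i^(eq) for a density rho and a 2D velocity u."""
    u = np.asarray(u, dtype=float)
    eu = E @ u  # e_i . u for every lattice direction
    return rho * W * (1.0 + eu / CS2 + eu**2 / (2.0 * CS2**2) - (u @ u) / (2.0 * CS2))

rho, u = 1.2, np.array([0.05, -0.02])
feq = equilibrium(rho, u)
density = feq.sum()            # zeroth moment
momentum = feq @ E             # first moment
```

Summing `feq` returns the prescribed density, and its first moment returns ρ**u**, which is the property that the collision step relies on to conserve mass and momentum.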

Body force effects are introduced by adding an extra term to Equation (1):

$$f\_i(\mathbf{x} + \mathbf{e}\_i \Delta t, t + \Delta t) = f\_i(\mathbf{x}, t) - \frac{\Delta t}{\tau} \left[ f\_i(\mathbf{x}, t) - f\_i^{(eq)}(\mathbf{x}, t) \right] + \Delta t F\_i(\mathbf{x}, t) \tag{4}$$

where F<sub>i</sub> is given by the following equation:

$$F\_i = \left(1 - \frac{\Delta t}{2\tau}\right) \omega\_i \left[\frac{\mathbf{e}\_i - \mathbf{u}}{c\_s^2} + \frac{\mathbf{e}\_i \cdot \mathbf{u}}{c\_s^4} \mathbf{e}\_i\right] \cdot \mathbf{F} \tag{5}$$

Here, **F** represents the body force per unit volume. The macroscopic velocity **u** must then be updated in order for the system to recover the Navier-Stokes equation:

$$
\rho \mathbf{u} = \sum\_{i} \mathbf{e}\_{i} f\_{i} + \frac{\Delta t}{2} \mathbf{F} \tag{6}
$$

More details on the LBM can be found in Kruger et al. (2016, and references therein).
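The role of the forcing term can be illustrated numerically. The sketch below assumes the standard Guo-type form of the discrete force, with prefactor (1 − Δt/2τ) and weights ω<sub>i</sub>, on a D2Q9 lattice; the function names and the sample values are hypothetical. With this form the discrete force adds no mass, and its first moment is (1 − Δt/2τ)**F**, the remainder being supplied by the half-force shift in the velocity definition.

```python
import numpy as np

# D2Q9 lattice (assumed ordering of the velocity vectors)
E = np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1],
              [1, 1], [-1, 1], [-1, -1], [1, -1]], dtype=float)
W = np.array([4 / 9] + [1 / 9] * 4 + [1 / 36] * 4)
CS2 = 1.0 / 3.0

def guo_forcing(u, force, tau, dt=1.0):
    """Discrete forcing term F_i for a body force per unit volume `force`."""
    u = np.asarray(u, dtype=float)
    force = np.asarray(force, dtype=float)
    eu = E @ u
    # bracket_i = (e_i - u) / cs^2 + (e_i . u) e_i / cs^4, shape (9, 2)
    bracket = (E - u) / CS2 + (eu[:, None] * E) / CS2**2
    return (1.0 - dt / (2.0 * tau)) * W * (bracket @ force)

def macroscopic_velocity(f, force, dt=1.0):
    """Velocity including the half-force correction, rho u = sum_i e_i f_i + dt/2 F."""
    rho = f.sum()
    return (f @ E + 0.5 * dt * np.asarray(force, dtype=float)) / rho

tau = 0.8
force = np.array([1e-4, 2e-4])
Fi = guo_forcing(np.array([0.03, 0.01]), force, tau)
u_rest = macroscopic_velocity(W.copy(), force)  # fluid at rest with unit density
```

For a fluid initially at rest with unit density, `u_rest` equals half of the applied force, which is the shift required for the scheme to recover the Navier-Stokes equation with a body force.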

#### 2.1.2. Multi-Component LB Model

When considering two or more fluid components, the LB discrete equation is written as follows:

$$\begin{aligned} f\_i^{\sigma}(\mathbf{x} + \mathbf{e}\_i \Delta t, t + \Delta t) &= f\_i^{\sigma}(\mathbf{x}, t) - \frac{\Delta t}{\tau\_{\sigma}} \left[ f\_i^{\sigma}(\mathbf{x}, t) - f\_i^{\sigma(eq)}(\mathbf{x}, t) \right] \\ &+ \Delta t F\_i^{\sigma}(\mathbf{x}, t) \end{aligned} \tag{7}$$

where f<sub>i</sub><sup>σ</sup>(**x**, t) and τ<sub>σ</sub> are the distribution functions and the single relaxation time of the σ-th component, respectively. The expression of the equilibrium distribution function now reads:

$$f\_i^{\sigma(eq)} = \rho\_\sigma \omega\_i \left[ 1 + \frac{\mathbf{e}\_i \cdot \mathbf{u}\_\sigma^{(eq)}}{c\_s^2} + \frac{(\mathbf{e}\_i \cdot \mathbf{u}\_\sigma^{(eq)})^2}{2c\_s^4} - \frac{\mathbf{u}\_\sigma^{(eq)} \cdot \mathbf{u}\_\sigma^{(eq)}}{2c\_s^2} \right] \tag{8}$$

where ρ<sub>σ</sub> = Σ<sub>i</sub> f<sub>i</sub><sup>σ</sup> is the density of the σ-th component and **u**<sub>σ</sub><sup>(eq)</sup> is the equilibrium velocity, which is identical for the two fluid components:

$$\mathbf{u}^{(eq)}\_{\sigma} = \mathbf{u}^\* = \frac{\sum\_{\sigma} \sum\_{i} \mathbf{e}\_i f\_i^{\sigma} / \tau\_{\sigma}}{\sum\_{\sigma} \sum\_{i} f\_i^{\sigma} / \tau\_{\sigma}} \tag{9}$$
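The common equilibrium velocity is a relaxation-time-weighted average of the component momenta. A minimal sketch of Equation (9) at a single lattice node is given below; the D2Q9 lattice, the function name `common_velocity`, and the sample populations are assumptions made for illustration.

```python
import numpy as np

# D2Q9 velocity set (assumed ordering)
E = np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1],
              [1, 1], [-1, 1], [-1, -1], [1, -1]], dtype=float)

def common_velocity(populations, taus, e=E):
    """u* = (sum_sigma sum_i e_i f_i^sigma / tau_sigma) / (sum_sigma sum_i f_i^sigma / tau_sigma)."""
    numerator = sum((f @ e) / tau for f, tau in zip(populations, taus))
    denominator = sum(f.sum() / tau for f, tau in zip(populations, taus))
    return numerator / denominator

# Two components moving with the same velocity but with different densities
f1 = np.array([0.5, 0.2, 0.1, 0.05, 0.05, 0.03, 0.03, 0.02, 0.02])
f2 = 3.0 * f1
u1 = (f1 @ E) / f1.sum()
u_star = common_velocity([f1, f2], taus=[1.0, 0.7])
```

When both components move with the same velocity, `u_star` reduces to that velocity regardless of the relaxation times; when they differ, the component with the smaller τ<sub>σ</sub> weighs more in the average.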

In Equation (7), the explicit forcing term F<sub>i</sub><sup>σ</sup> is linked to the total body force **F**<sub>σ</sub> per unit volume exerted on the σ-th component:

$$F\_i^{\sigma} = \left(1 - \frac{\Delta t}{\tau\_{\sigma}}\right) \frac{\mathbf{F}\_{\sigma} \cdot (\mathbf{e}\_i - \mathbf{u}\_{\sigma}^{(eq)})}{\rho\_{\sigma} c\_s^2} f\_i^{\sigma(eq)} \tag{10}$$

Now, based on the methodology developed by Martys and Chen (2013), one adds a Shan-Chen-type fluid-fluid cohesion force **F**<sub>σ</sub><sup>SC</sup> to the total body force vector **F**<sub>σ</sub> of Equation (10) in order to model the two-component behavior. The expression of this cohesion force is (Shan and Chen, 1994):

$$\mathbf{F}\_{\sigma}^{\rm SC}(\mathbf{x},t) = -G\_{\rm coh} \rho\_{\sigma}(\mathbf{x},t) \sum\_{i} \omega\_{i} \rho\_{\sigma'}(\mathbf{x} + \mathbf{e}\_{i} \Delta t, t) \mathbf{e}\_{i} \tag{11}$$

where G<sub>coh</sub> is a parameter that controls the strength of the cohesion force, and where σ′ denotes the fluid component other than σ. Note that with a Shan-Chen-type fluid-fluid cohesion force, the interface is diffuse and the fluid velocity is continuous across it.
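Equation (11) only involves the density of the other component at the neighboring nodes, so it maps naturally onto shifted copies of the density array. The following sketch evaluates it on a 2D periodic D2Q9 grid; the grid size, the value of G<sub>coh</sub>, and the function name are hypothetical, and the periodic wrap in both directions is a simplification that does not match the wall boundaries of the actual domain.

```python
import numpy as np

# D2Q9 velocity set (assumed ordering) and weights
E = np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1],
              [1, 1], [-1, 1], [-1, -1], [1, -1]])
W = np.array([4 / 9] + [1 / 9] * 4 + [1 / 36] * 4)

def shan_chen_force(rho_self, rho_other, g_coh):
    """F_sigma(x) = -G rho_sigma(x) sum_i w_i rho_sigma'(x + e_i) e_i on a periodic grid.

    rho_self, rho_other: arrays of shape (nx, ny); returns an array of shape (nx, ny, 2).
    """
    acc = np.zeros(rho_self.shape + (2,))
    for (ex, ey), w in zip(E, W):
        # Rolling by -e_i brings the value stored at x + e_i onto node x
        neighbour = np.roll(rho_other, shift=(-ex, -ey), axis=(0, 1))
        acc[..., 0] += w * neighbour * ex
        acc[..., 1] += w * neighbour * ey
    return -g_coh * rho_self[..., None] * acc

nx, ny = 8, 8
rho_a = np.ones((nx, ny))
ramp = np.tile(np.arange(nx, dtype=float)[:, None], (1, ny))  # rho_other grows along x
force = shan_chen_force(rho_a, ramp, g_coh=0.9)
```

With a positive G<sub>coh</sub>, the force on component σ points away from regions where σ′ is denser, which is what keeps the two fluids separated; a uniform ρ<sub>σ′</sub> produces no force.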

#### 2.1.3. The Immersed Boundary Method

The aim of the immersed boundary (IB) method is to impose velocity boundary conditions on the Eulerian fluid nodes that surround a solid by adding an extra body force **F**<sub>σ</sub><sup>IB</sup> to the fluid equations, so that the macroscopic fluid velocity equals the velocity at the Lagrangian points modeling the solid boundary. Hence, an IB force **F**<sub>σ</sub><sup>IB</sup> is also included in the total body force vector, so that **F**<sub>σ</sub> = **F**<sub>σ</sub><sup>IB</sup> + **F**<sub>σ</sub><sup>SC</sup>. The macroscopic velocity **u**<sub>σ</sub> given by Porter et al. (2012) reads:

$$\rho\_{\sigma} \mathbf{u}\_{\sigma} = \sum\_{i} \mathbf{e}\_{i} f\_{i}^{\sigma} + \frac{\Delta t}{2} \mathbf{F}\_{\sigma} \tag{12}$$

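The forcing term is derived with the classical immersed boundary procedure, which relies on two operators: an interpolation operator, which transfers the Eulerian fluid velocity to the Lagrangian points, and a spreading operator, which distributes the resulting IB force from the Lagrangian points back onto the neighboring Eulerian nodes.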


More details can be found in Li et al. (2016).
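In the classical immersed boundary procedure, the two operators are usually an interpolation (Eulerian velocity to Lagrangian points) and a spreading (Lagrangian force to Eulerian nodes). The sketch below shows both in one dimension, on integer node positions and away from the domain ends. The two-point linear kernel is an assumption chosen for brevity, and the function names are hypothetical; the kernel actually used in this work is not specified here.

```python
import numpy as np

def hat(r):
    """Two-point linear kernel (assumed); its weights sum to one over the nodes."""
    return np.maximum(0.0, 1.0 - np.abs(r))

def interpolate(u_grid, xs):
    """Eulerian velocity on integer nodes -> velocity at the Lagrangian positions xs."""
    nodes = np.arange(u_grid.size)
    weights = hat(xs[:, None] - nodes[None, :])  # shape (n_lagrangian, n_nodes)
    return weights @ u_grid

def spread(f_lag, xs, n_nodes):
    """Lagrangian forces -> Eulerian nodes, using the same kernel as the interpolation."""
    nodes = np.arange(n_nodes)
    weights = hat(xs[:, None] - nodes[None, :])
    return weights.T @ f_lag

u_grid = 0.1 * np.arange(8)          # linear velocity profile on 8 nodes
xs = np.array([2.5])                 # one Lagrangian point between nodes 2 and 3
u_lag = interpolate(u_grid, xs)
f_grid = spread(np.array([1.0]), xs, u_grid.size)
```

Using the same kernel for both operators guarantees that the total force received by the fluid equals the total force carried by the Lagrangian points.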

#### 2.2. Modeling the MCC

The computational domain is a fixed rectangular box of size (N<sub>x</sub> = 385, N<sub>y</sub> = 11, N<sub>z</sub> = 34), as shown in **Figure 2**. This domain has been chosen because it allows the desired phase lags ΔΦ (with |ΔΦ| ranging from π/6 up to π) to be studied without modifying the size of the domain, and with a sufficiently fine cilia resolution to ensure grid-independent results. The fluid part is solved on a Cartesian grid with a simple BGK collision operator and a D3Q19 scheme. Periodic boundary conditions are used in the x- and y-directions, while no-slip and free-slip boundary conditions are used at the bottom and top walls, respectively. The length L = 7 µm of the cilia is set to 11 lattice units (lu). Cilia are modeled by a set of 200 Lagrangian points, whose motion is governed by a 1D transport equation along a parametric curve (Chatelin, 2013; Chatelin and Poncet, 2016). In the following, P(ζ, t) denotes the position of the curve at time t and at a normalized distance ζ from the base point of a cilium. With appropriate boundary conditions, a realistic beating pattern is obtained:

$$\frac{\partial P'}{\partial t} + E(t) \frac{\partial P'}{\partial \zeta} = 0 \quad \text{BC:} \begin{cases} P(0, t) = (0, 0, 0) \\ P'(0, t) = (2 \cos(2\pi t/T), 0, \cos(2\pi t/T)) \end{cases} \tag{13}$$

with E<sup>2</sup>(t) = ([1 + 8 cos<sup>2</sup>(π(t + 0.25T)/T)]/T)<sup>2</sup> a term that mimics elastic effects, T the beating period, and P′ = ∂<sub>ζ</sub>P. To ensure the stability of the IB method, there must be approximately one Lagrangian point per lattice cell where the IB forces are computed. Thus, only 10 Lagrangian points regularly spaced along each cilium are used for the computation of the IB forces. The spacing between two cilia is set to a = 1.44L in the x-direction, and b = 0.4L in the y-direction. Their base points are located at z = 0, which corresponds to the position of the epithelial surface. The beating period is T<sub>osc</sub> = N<sub>it</sub>Δt, where N<sub>it</sub> is the number of iterations needed to perform a full beating cycle. The PCL fills the domain from z = 0 up to an altitude z = h = 0.9L. In all simulations, N<sub>z</sub> is fixed to 34 lu, leading to a ratio h/H = 0.26. The wavelength of the imposed metachronal waves varies from λ = 32 lu for a phase lag ΔΦ = π, to λ = 192 lu for ΔΦ = π/6.
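The quoted wavelengths are consistent with one wavelength spanning 2π/|ΔΦ| cilia. The short check below assumes that relation, λ = 2πa/|ΔΦ|, and assumes that the spacing a = 1.44L ≈ 15.84 lu is rounded to 16 lattice units; neither assumption is stated explicitly in the text, and the function name is hypothetical.

```python
import math

def wavelength_lu(phase_lag, spacing_lu=16.0):
    """Metachronal wavelength in lattice units, assuming lambda = 2*pi*a / |phase lag|."""
    return 2.0 * math.pi * spacing_lu / abs(phase_lag)

lam_pi = wavelength_lu(math.pi)          # 2 cilia per wavelength
lam_pi_6 = wavelength_lu(math.pi / 6.0)  # 12 cilia per wavelength
```

Under these assumptions the two limiting cases return 32 lu and 192 lu, and N<sub>x</sub> − 1 = 384 lu is a multiple of every wavelength in between, which is why the domain size does not need to change with the phase lag.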

The motion of the cilia is imposed to be in the x-direction only. Note that, due to the inter-cilia spacing, no collision between cilia occurs during their beating. Since the only mechanism able to generate net motion in creeping flow is spatial asymmetry (Purcell, 1977; Khaderi et al., 2010), no temporal asymmetry is considered in the beating pattern. The viscosity of the PCL is chosen to be ν<sub>PCL</sub> = 10<sup>−3</sup> m<sup>2</sup>/s, and the viscosity ratio r<sub>ν</sub> between the mucus and the PCL is set to 10. Since the model of Porter et al. (2012) introduces a Shan-Chen fluid-fluid repulsive force (Shan and Chen, 1994), surface tension effects emerge intrinsically at the mucus-PCL interface. More importantly, this also prevents the mixing of the mucus and PCL. The equations of the cilia motion are taken from Chatelin (2013) and reproduce a 2D beating pattern similar to the one observed for real cilia. In particular, the angular amplitude of this beating pattern is θ = 2π/3, as observed experimentally (Sleigh et al., 1988). Thus, the velocity U<sub>cil</sub> at the tips of the cilia can be computed as U<sub>cil</sub> = 2θL/T<sub>osc</sub>, and an oscillatory Reynolds number can be defined as:

$$Re^{osc} = \frac{U\_{\text{cil}} L}{\nu\_{\text{mucus}}} = \frac{\omega L^2}{\nu\_{\text{mucus}}} \tag{14}$$

where ω is the angular beating frequency of the cilia. Using physical quantities (L<sub>phy</sub> ≈ 10<sup>−5</sup> m, ν<sub>mucus</sub> ≈ 10<sup>−3</sup> m<sup>2</sup>.s<sup>−1</sup>, and U<sub>cil</sub> ≈ 10<sup>−3</sup> m.s<sup>−1</sup>), the obtained Reynolds number is of the order of 10<sup>−5</sup>. Thus, inertial effects do not play any role in the phenomenon of MCC. Running simulations at such a low Reynolds number would require a huge number of iterations with a lattice Boltzmann scheme, due to the coupling between Δx and Δt imposed by the normalization. Hence, we chose higher Reynolds numbers, Re<sup>osc</sup> = 2 × 10<sup>−2</sup>, 5 × 10<sup>−2</sup>, and 10<sup>−1</sup>, as it has been demonstrated in Chateau et al. (2017) that inertial effects remain weak in this configuration up to Reynolds numbers around 10. For Re = 10<sup>−2</sup>, inertia effects vanish. In creeping flow, there should be no noticeable difference in the wave structure even for a Reynolds number 1,000 times smaller. The code is parallelized using MPI (Message Passing Interface) by splitting the computational domain into 9 subdomains of size (N<sub>x</sub>/3, N<sub>y</sub>/3, N<sub>z</sub>).
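The cost argument can be made concrete with a short calculation. The sketch below first evaluates the physical Reynolds number from the orders of magnitude quoted above, then estimates the number of iterations per beating cycle from U<sub>cil</sub> = 2θL/T<sub>osc</sub> with Δt = 1 and L = 11 lu. The lattice viscosity of 1/6 (that is, τ = 1) is an assumed value used only for illustration, and the function names are hypothetical.

```python
import math

THETA = 2.0 * math.pi / 3.0  # angular beating amplitude given in the text
L_LU = 11.0                  # cilium length in lattice units

def oscillatory_reynolds(u_cil, length, nu):
    """Re = U_cil * L / nu, in any consistent unit system."""
    return u_cil * length / nu

def iterations_per_cycle(re_osc, nu_lattice):
    """N_it such that Re = U_cil L / nu, with U_cil = 2 * theta * L / N_it and dt = 1."""
    return 2.0 * THETA * L_LU**2 / (re_osc * nu_lattice)

re_physical = oscillatory_reynolds(u_cil=1e-3, length=1e-5, nu=1e-3)
NU_LATTICE = 1.0 / 6.0       # assumed lattice viscosity (tau = 1), not from the paper
n_it_simulated = iterations_per_cycle(5e-2, NU_LATTICE)
n_it_physical = iterations_per_cycle(re_physical, NU_LATTICE)
```

With this assumed viscosity, one beating cycle takes roughly 6 × 10<sup>4</sup> iterations at Re = 5 × 10<sup>−2</sup>, against about 3 × 10<sup>8</sup> at the physical Reynolds number, a factor of 5,000 per cycle.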

#### 3. MUCUS TRANSPORT

A common way to treat respiratory diseases is the inhalation of drugs, which flow into the airways until they are captured by the mucus layer. To gain insight into how drugs are dispersed and advected into the mucus and PCL, the displacement field **d**(**x**) = ∫<sub>0</sub><sup>T<sub>osc</sub></sup> **u**(**x**(t), t) dt is computed, where **x** is the position vector and **u** is the fluid velocity. The x-component of the displacement field is then averaged over 20 beating periods and denoted ⟨d<sub>x</sub>⟩. It is plotted in **Figure 3**. One can clearly see the importance of the phase lag, some values being associated with a larger displacement of fluid. One can also observe that the particular case where all the cilia beat synchronously (i.e., ΔΦ = 0) results in a transport which is similar to the action of fully desynchronized cilia (i.e., ΔΦ between two neighboring cilia is random). Note that to test the repeatability of the random motion, three simulations with different initial random patterns were performed. Each of them gave almost identical results, with less than 3% difference in the fluid velocity. In order to understand why the presence of the PCL is beneficial for mucus transport, a simulation of a single fluid layer, representing mucus, has been run for a phase lag ΔΦ = π/4 (see the red curve in **Figure 3**). It results in a weaker transport compared to the corresponding two-layer simulation with ΔΦ = π/4, thus highlighting the importance of having a layer of lower-viscosity fluid under the mucus, as it allows the mucus to slip over it (Puchelle et al., 1995). In **Figure 3**, different areas, corresponding to different mixing regimes, are also presented and will be introduced later in section 4. These regions are similar to the "transport" and "mixing" areas defined in Ding et al. (2014) and Chateau et al. (2017). The displacement over the y- and z-directions has also been quantified. The displacement in the y-direction is small everywhere, and thus can be neglected. In contrast, the displacement in the z-direction is small above the cilia tips, but not below them. It has been shown in Chateau et al. (2017) that a peak in the stretching rate is present in this region. It will be shown in section 4 that it is also the area where the mixing is the strongest.

The total volume of fluid effectively displaced is computed in order to determine which phase lag is the most able to transport the mucus. To do so, the global volumetric flow rate Q<sub>V</sub> over a unit volume of size (1 × 1 × N<sub>z</sub>) is defined by:

$$Q\_V = N\_z \frac{U^\* \Delta x^2}{L^2} \tag{15}$$

with U<sup>∗</sup> = U<sup>av</sup>/U<sup>ref</sup>, where U<sup>ref</sup> = λ/(N<sub>cil</sub>T) is the reference velocity of the system, and U<sup>av</sup> = (N<sub>x</sub>N<sub>y</sub>N<sub>z</sub>)<sup>−1</sup> Σ<sub>i,j,k</sub> U<sub>ijk</sub> is the average fluid velocity inside the domain. The total displaced volume of fluid is plotted in **Figure 4**. Metachronal motion, except for the cases ΔΦ = −π/6 and ΔΦ = −π/4, induces a stronger displacement of fluid than synchronized motion (ΔΦ = 0). Note that the results for ΔΦ = −π/6 and ΔΦ = −π/4 slightly differ from those of Chateau et al. (2017) where, for Reynolds numbers of the order of 10<sup>−2</sup>, symplectic MCW were found to be more efficient than synchronized motion. This is a direct consequence of the modified geometry: in Chateau et al. (2017), the cilia spacing b in the y-direction was set to values larger than 1.67L. Thus, during the stroke phase of symplectic MCW (which corresponds to a moment when the cilia are clustered), the fluid was simply expelled around the cilia. In the present case, b is much smaller (b = 0.4L) in order to have a higher density of cilia, as observed in real epithelia, and the fluid is mainly pushed above the cilia. This displaces the mucus-PCL interface above the cilia tips, which never get the chance to enter the mucus layer. In contrast, the cilia are far away from each other during the recovery phase. A suction effect occurs, which moves the mucus-PCL interface downwards toward the cilia. Thus, the counter flow created by the cilia during the recovery phase is almost as strong as the flow created during the stroke phase. As a consequence, both the PCL and mucus flows are much smaller. The opposite happens for antiplectic MCW with large wavelengths (i.e., small ΔΦ): the cilia are far from each other during the stroke phase, which maximizes their pushing effect.
The suction effect also takes place, which results in the mucus-PCL interface moving downwards. Hence, the cilia tips penetrate more deeply into the mucus phase. During the recovery phase, the cilia are now clustered, and the mucus-PCL interface is pushed far away from the cilia tips. Hence, the induced counter flow is almost null, while the cilia create a strong positive flow during the stroke phase. This result is interesting as it might be linked to the fact that antiplectic MCW with very large wavelengths (ΔΦ < π/6) are usually observed in nature for living organisms evolving in single-layer fluid environments (Sleigh, 1962). This blowing and suction mechanism is similar to the one observed in Dauptain et al. (2008) on a similar configuration involving the swimming of a jellyfish by ciliary propulsion. A maximum in the total displaced fluid volume can be seen in **Figure 4**. It corresponds to an antiplectic MCW with ΔΦ ≈ π/6, which corroborates the results found in Chateau et al. (2017) for antiplectic MCW, where a peak in the total displaced fluid volume was found for ΔΦ ≈ π/4.

In order to characterize the system from an energy perspective, the average power P<sub>cil</sub> spent by the cilia during a beating cycle is introduced:

$$P\_{cil} = \frac{\sum\_{s,i} \mathbf{V}\_i^s \cdot (\mathbf{F}\_m^i + \mathbf{F}\_{PCL}^i)}{N\_{cil}} \tag{16}$$

where **V**<sub>i</sub><sup>s</sup> is the velocity of the s-th Lagrangian point of the i-th cilium, and **F**<sub>m</sub><sup>i</sup> and **F**<sub>PCL</sub><sup>i</sup> are the interpolated IB forces applied by the i-th cilium onto the mucus and the PCL, respectively. In order to obtain a dimensionless power P<sup>∗</sup>, P<sub>cil</sub> is normalized by P<sup>∞</sup>, the power spent by an isolated cilium during a beating cycle (a/L = b/L = 5), such that P<sup>∗</sup> = P<sub>cil</sub>/P<sup>∞</sup>. The displacement ratio η can now be defined as the mean displacement over the x-direction divided by the mean power a cilium has to spend during a beating cycle:

$$\eta = \frac{\langle d\_x^\* \rangle \frac{N\_{\rm cil}}{\lambda}}{P^\*} \tag{17}$$

where ⟨d<sub>x</sub><sup>∗</sup>⟩ is the mean displacement over the x-direction during one period, taken on an arbitrary plane (z/L = 3.2) near the top of the domain. The left axis of **Figure 5** shows the dimensionless power P<sup>∗</sup> spent by the system. The synchronized case requires less energy than the other types of coordinated motion. Note that MCW with a phase lag such that π/3 < |ΔΦ| < π result in the highest power spent, while smaller phase lags (|ΔΦ| < π/4) require less energy. The right axis of **Figure 5** shows the variations of the displacement ratio η. For a given power input, the synchronized motion of the cilia is almost always more efficient than MCW for displacing fluids, except for antiplectic MCW with ΔΦ = π/6. This result can explain why antiplectic MCW with large wavelengths are usually observed in nature.

#### 4. MIXING

#### 4.1. Global Mixing

The mixing is quantified using the method developed in Stone and Stone (2005): two populations of tracers of different colors (black and white) are initially organized in a regular pattern, each population occupying the same volume (see **Figure 6** for a view of the domain filled with tracers). They are released at t = t<sub>0</sub>, when the flow is fully established, and a second-order Runge-Kutta (RK2) scheme is used to compute their advection, using the interpolated fluid velocity given by the IB method. The mixing is quantified by measuring the decay of the shortest distance between tracers that belong to different populations. Hence, the mixing number m is defined as follows:

$$m = \left(\prod\_{i=1}^{N} \min\_{j} |\mathbf{x}\_i - \mathbf{x}\_j|^2\right)^{\frac{1}{N}}\tag{18}$$

where **x**<sub>i</sub> and **x**<sub>j</sub> are the positions of tracers of different colors, N is the total number of particles of the same color, and j = 1, 2, ..., N is the index over which the minimization is performed. We chose to study the mixing in three different areas: area 1 is located inside the PCL, and the tracers are set such that they occupy the region between z = 0.2L and z = 0.8L; area 2 is located above the PCL-mucus interface, and the tracers occupy the region between z = 1.2L and z = 1.8L; and area 3 is located far above the PCL-mucus interface, and the tracers occupy the region between z = 2.5L and z = 3.1L (see **Figure 6** for a view of the different areas). The chosen pattern consists of rectangular boxes of size (1.44L, 0.4L, 0.6L) regularly distributed along the x-direction, each of them being centered around the base of a cilium. This geometrical distribution has been chosen in order to allow comparison with Ding et al. (2014). The density of tracers is not a critical factor here, as pointed out by Stone and Stone (2005). Hence, in each area one tracer is placed every 2 nodes along the three directions of space. In **Figure 7**, the different mixing areas are displayed after 60 beating cycles for a Reynolds number Re = 5 × 10<sup>−2</sup>: the tracers initially seeded into the PCL are significantly mixed, in contrast to the tracers initially seeded far above the mucus-PCL interface. Between these two populations, the tracers initially seeded just above the mucus-PCL interface undergo a constant shearing. It is worth noting that tracers initially seeded into the PCL (resp. mucus) stay in the PCL (resp. mucus). This behavior is attributed to the surface tension effects present at the interface, which prevent a mixing of the two fluid layers. This shows that, by advection alone, particles captured by the mucus layer do not reach the PCL.
However, note that the present model does not take into account molecular diffusion, which may allow drugs to penetrate the PCL. The effects of diffusion are nevertheless considered in section 4.3 using two different time scales.
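The mixing number of Equation (18) is the geometric mean, over the tracers of one color, of the squared distance to the nearest tracer of the other color. A minimal NumPy sketch is given below; the function name and the toy tracer positions are hypothetical. The geometric mean is evaluated through logarithms to avoid underflow when N is large, and the brute-force pairwise distance matrix is only suitable for moderate tracer counts.

```python
import numpy as np

def mixing_number(white, black):
    """Geometric mean of the squared nearest-neighbour distance between two tracer sets.

    white, black: arrays of shape (N, 3) holding the positions of each population.
    """
    diff = white[:, None, :] - black[None, :, :]     # all pairwise separations
    dist2 = np.einsum("ijk,ijk->ij", diff, diff)     # squared distances, shape (N, N)
    nearest = dist2.min(axis=1)                      # minimisation over the index j
    return float(np.exp(np.mean(np.log(nearest))))   # (prod_i nearest_i)^(1/N)

white = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
black = np.array([[1.0, 0.0, 0.0], [10.0, 0.0, 2.0]])
m = mixing_number(white, black)
```

In this toy configuration the nearest squared distances are 1 and 4, so the mixing number is their geometric mean, 2. In practice m would be evaluated at each beating cycle and normalized by its initial value m<sub>0</sub>.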

**Figure 8A** shows the time evolution of the mixing number m/m<sub>0</sub> in the PCL (area 1) for different types of metachrony, m<sub>0</sub> denoting the initial value of m before the tracers are released. If the mixing is chaotic, the mixing number m/m<sub>0</sub> should decay exponentially. This is indeed the case during the first beating cycles (5–6 cycles). However, if only chaotic mixing were present, the measurements would simply converge toward a "plateau." This is not the case here, as the cilia also impose a stretching on the generated flow. Thus, the ratio m/m<sub>0</sub> keeps decaying and converges only toward a "pseudo-plateau." Since we are mainly interested in characterizing the chaotic mixing induced by cilia, we focus our attention on the first beating cycles. **Figure 8B** confirms that the mixing in the mucus is very low: the mixing number m/m<sub>0</sub> is almost constant during all 60 beating cycles. The tracers are transported as a solid block and keep their initial pattern, as illustrated in **Figure 7**. In **Figure 9A**, the logarithm of the dimensionless mixing number m/m<sub>0</sub> in area 1 is plotted. The fact that m decays rapidly means that the mixing in this area is strong:

FIGURE 6 | 2D view of the domain filled with 3 populations of tracers for Re = 5 × 10<sup>−2</sup>. The PCL is blue, and the mucus phase is red. Population 1 occupies the PCL between z = 0.2L and z = 0.8L; Population 2 is located above the PCL-mucus interface and occupies the region between z = 1.2L and z = 1.8L; Population 3 is located far above the PCL-mucus interface, and occupies the region between z = 2.5L and z = 3.1L. The size of the computational domain is (N<sub>x</sub> = 385, N<sub>y</sub> = 11, N<sub>z</sub> = 34).

FIGURE 7 | 3D view of the domain filled with 3 populations of tracers for Re = 5 × 10<sup>−2</sup>, 60 beating cycles after their release at t = t<sub>0</sub>, when the flow is fully established. The tracers in the PCL are significantly mixed, while the tracers in areas 2 and 3 still present coherent patterns.

FIGURE 8 | Time evolution of the mixing number m/m<sub>0</sub>: (A) in the PCL (area 1); (B) in area 3 (far above the mucus-PCL interface).

indeed, only 4 beating cycles are required to obtain a converged state of mixing. During these first beating cycles, the decay of m strongly depends on the value of the phase lag ΔΦ. The results for symplectic MCW (ΔΦ < 0) are similar to those obtained for antiplectic MCW (see **Figure 9A,C**). In **Figure 9B,D**, the same quantities are plotted for area 2. One can observe the importance of ΔΦ, some phase lags being clearly more able to mix the tracers. It is interesting to note that each curve presented in **Figure 9A,B** exhibits the behavior of chaotic mixing. In other words, they can be approximated by a function of the form ln(m/m<sub>0</sub>) = −βN<sub>cycles</sub>, where the fitted parameter β represents a mixing rate, which depends on the local stretching rate (Weiss and Provenzale, 2007). Hence, it is possible to compare the mixing capabilities of symplectic and antiplectic MCW, as shown in **Figure 10A,B**. The mixing rates β obtained for three different Reynolds numbers (Re = 10<sup>−1</sup>, Re = 5 × 10<sup>−2</sup>, and Re = 2 × 10<sup>−2</sup>) are plotted as a function of ΔΦ for areas 1 and 2, respectively (see **Figure 10A**). The curves follow the same trend for each tested Reynolds number. There are always values of ΔΦ ≠ 0 such that the obtained mixing rate β is higher than in the synchronized case, except for Re = 0.1, where ΔΦ = 0 induces a mixing rate β almost as strong as for ΔΦ = π/2. As seen in section 3, the values ΔΦ = −π/6 and ΔΦ = −π/4 induce a weak mixing. This is a direct consequence of the fact that the PCL-mucus interface is pushed above the cilia tips during their stroke phase, which prevents them from penetrating the mucus layer. As a result, the fluid flow is weaker in both the PCL region and the mucus region. In **Figure 10B**, two distinct peaks can be identified, one for antiplectic MCW with ΔΦ ≈ π/4, and the second for symplectic MCW with ΔΦ ≈ −π/4, indicating that these particular values of the phase lag are more efficient at mixing the mucus. While ΔΦ ≈ −π/4 induces a small transport of the PCL and mucus, and a small mixing of the PCL, it is interesting to note that it can generate a mixing as strong as the case ΔΦ ≈ π/4 above the PCL-mucus interface. In both cases (ΔΦ = ±π/4), this can be attributed to the motion of the interface. Note that too large values of ΔΦ induce a mixing which is similar to that of synchronously beating cilia (ΔΦ = 0). Also, the y-scale of **Figure 10B** is much smaller (100 times smaller) than that of **Figure 10A**. In both **Figure 10A,B**, the dashed lines represent the mixing rates β obtained for cilia beating randomly. The mixing rate obtained for such a configuration may vary depending on the initial conditions of the cilia, but not significantly, as it is "averaged" over the random motion of 48 cilia. Interestingly, the motion of randomly beating cilia produces an "averaged" mixing rate: although never among the highest values of β, it always induces a reasonably high mixing.
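The mixing rate β can be extracted from a time series of m/m<sub>0</sub> by a linear least-squares fit of ln(m/m<sub>0</sub>) against the cycle number over the chosen window. The sketch below assumes one sample per beating cycle, starting at cycle 0; the function name and the synthetic data are hypothetical, and the straight-line fit via `np.polyfit` is one possible choice rather than the fitting procedure used by the authors.

```python
import numpy as np

def mixing_rate(m_over_m0, n_fit):
    """Fit ln(m/m0) = -beta * N_cycles over the first n_fit samples and return beta."""
    cycles = np.arange(n_fit)
    slope, _intercept = np.polyfit(cycles, np.log(m_over_m0[:n_fit]), 1)
    return -slope

# Synthetic decay: exponential during the first cycles, then a plateau
n = np.arange(10)
series = np.maximum(np.exp(-0.7 * n), np.exp(-0.7 * 5))
beta_early = mixing_rate(series, n_fit=4)    # window restricted to the exponential part
beta_all = mixing_rate(series, n_fit=10)     # window that also covers the plateau
```

The example illustrates why the fitting window matters: restricting it to the first cycles recovers the imposed rate, whereas including the plateau underestimates it.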

The conclusion here is that the mixing induced by MCW in area 1 is very similar to the mixing induced by synchronized motion, except for the particular case of symplectic MCW with very large wavelengths. In area 2, specific values of the phase lag are found to be more efficient at mixing the mucus-PCL area than synchronized or random beating. Finally, in area 3, the mixing is weak and independent of the phase lag ΔΦ.

FIGURE 10 | Mixing rate β as a function of the phase lag ΔΦ. (A) Mixing rate obtained in area 1 with a fit over the first 4 beating cycles. (B) Mixing rate obtained in area 2 with a fit over all 60 beating cycles. The dashed lines represent the values of the mixing rate obtained for cilia beating randomly. Note that the y-scales used in (A,B) are different: the values of β corresponding to area 1 are of order 10<sup>2</sup> times greater than those obtained in area 2. Also note that the repeatability of the random motion has been tested, and similar values for β, with less than 2% difference, were found.

#### 4.2. Local Mixing

Specific drugs, such as propranolol (PPL) or β-adrenergic agents, act on the cilia by modifying their beating frequency (Inoue et al., 2013). Others, such as anticholinergics or corticosteroids, act directly on the mucus secretion (Barnes, 2002). Each of these drugs has specific targets, and must arrive precisely where it will have the most effect. Hence, it is important to fully understand how they will be mixed. However, many questions remain open: Where are the drugs mainly mixed? Where exactly is the location of strongest mixing in the PCL? To answer these questions, a different method is now introduced in order to measure the mixing locally, and to gain detailed insight into how the particles are mixed depending on their location. To do so, the methodology used in Cieplak et al. (1992) is adopted. The principle is simple: one follows the evolution of the distance r between tracers initially separated by an infinitesimal distance r<sub>0</sub>. In the particular case of chaotic mixing, a Lyapunov exponent γ can be extracted using the following equation:

$$\ln\left(\frac{r}{r\_0}\right) = \gamma N\_{\text{cycles}} \tag{19}$$

This exponent gives an indication of the strength of the mixing. However, a sufficiently high number of measurements must be performed to get rid of the noise inherent to this method. To do so, a cubic set of (3 × 3 × 3) tracers, referred to hereafter as "fathers," is used. These fathers are initially set at a distance r<sub>0</sub> = 0.01 lu apart from each other. For each father, 6 tracers, referred to from now on as "children," are regularly initialized around the father along the 3 directions of space at a distance of 0.001 lu. Thus, 162 pairs of tracers are considered and their average distance r<sub>mean</sub> is regularly computed during several beating cycles.
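The seeding of the tracer pairs can be sketched as follows. The code assumes that the 6 children of each father sit at ±0.001 lu along each of the x, y, and z axes, which is one reading of "along the 3 directions of space"; the function names and the seeding position are hypothetical.

```python
from itertools import product

import numpy as np

def seed_pairs(center, father_spacing=0.01, child_offset=0.001):
    """Return 27 fathers on a 3x3x3 cube and, for each, 6 children along +/- x, y, z."""
    offsets = np.array(list(product((-1, 0, 1), repeat=3)), dtype=float)
    fathers = np.asarray(center, dtype=float) + father_spacing * offsets   # (27, 3)
    axes = np.vstack([np.eye(3), -np.eye(3)])                              # (6, 3)
    children = fathers[:, None, :] + child_offset * axes[None, :, :]       # (27, 6, 3)
    return fathers, children

def mean_pair_distance(fathers, children):
    """Average father-child distance over all pairs."""
    return float(np.linalg.norm(children - fathers[:, None, :], axis=2).mean())

fathers, children = seed_pairs(center=(100.0, 5.0, 4.95))
n_pairs = children.shape[0] * children.shape[1]
r_initial = mean_pair_distance(fathers, children)
```

This construction gives 27 × 6 = 162 father-child pairs. After advecting fathers and children with the same scheme, `mean_pair_distance` would be re-evaluated at each cycle to build the curve from which the Lyapunov exponent is fitted.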

Five typical positions are studied:


The mean distances r<sub>mean</sub> for positions A, B, and C are given in **Figures 11**–**13**. Note that the results for positions D and E are not displayed in the following, as they are very similar to those obtained for position A. **Figure 11A** shows the evolution of the average distance r<sub>mean</sub> as a function of the number of cycles N<sub>cycles</sub> for several phase lags ΔΦ. It takes around 10 cycles for the distance between fathers and children to increase significantly. One can see in **Figure 11B** that the evolution of ln(r<sub>mean</sub>/r<sub>0</sub>) is linear during the first cycles, indicating chaotic mixing. Similar results are obtained for positions C, D, and E (see **Figure 13** for position C). Thus, we can extract Lyapunov exponents for each curve by considering only their linear parts. It is important to note that, while the measurements indicate chaotic mixing only during the first beating cycles after the release of the tracers, the mixing is always chaotic: indeed, the flow is well established and its properties do not change over time. For position B (see **Figure 12A,B**), the tracers are initialized at 1L (thus 0.1L above the interface), and no Lyapunov exponent can be extracted for this position. This is due to the presence of the interface beneath them, which captures the tracers through its undulating motion. Different positions above position B have also been tested (results not shown): we observe that when the tracers are set further above position B (i.e., further above the interface), Lyapunov exponents can be extracted again, and lead to results similar to those of position C. Our hypothesis is that the mixing is attenuated near the interface since the direction of the flow follows the motion of the interface. Thus, there is mainly a vertical shear in this area and the distance between particles at the same altitude remains similar,

FIGURE 11 | Results obtained for position A and Re = 5 × 10<sup>−2</sup>. (A) Average distance r<sub>mean</sub> between the fathers and the children as a function of the number of cycles N<sub>cycles</sub>. (B) Logarithm of the dimensionless average distance r<sub>mean</sub>/r<sub>0</sub> as a function of the number of cycles N<sub>cycles</sub>.

only the evolution of the vertical distance measured between particles matters. **Figure 14** shows the Lyapunov exponents γ obtained for positions A, C, D, and E. The highest values of γ are obtained for the tracers located in position A, which are on the trajectory of a cilium and at an altitude of 0.45L. The values of γ corresponding to position E are smaller, which makes sense as the tracers are on the trajectory of the same cilium, but much closer to the epithelial surface. Thus, since the velocity of the cilium is smaller near its base, the mixing is weaker. Interestingly, the tracers of position D, which are in the middle of the PCL but between two cilia along the y-direction, give values of γ smaller than those of position E. This indicates that the mixing in areas which are not on the trajectory of a cilium is much weaker. Moreover, it also takes more time for the separation distance between fathers and children to increase: around 25 cycles for tracers in position D against only 10 cycles for tracers in position A. Finally, far above the mucus-PCL interface, the values obtained for γ are very small: the mixing is almost null. The trend of the curve for position E is the same as for positions A, C, and D. It is worth noting that the same trend is observed for the Lyapunov exponents in **Figure 14** and the total displaced volume of fluid in **Figure 4**. Indeed, the mixing in the present configuration is due to the combined action of mixing by chaotic advection and by stretching. While the major contribution to the values of the extracted Lyapunov exponents comes from the initial positions of the tracers (A, B, C, D, or E), the shape of the curves in **Figure 14** is due to the combined action of these two phenomena. It is however reasonable to think that the regions of stronger stretching are also the regions where the chaotic mixing is the strongest. Hence, the extracted Lyapunov exponents are

FIGURE 13 | Results obtained for position C and Re = 5.10−<sup>2</sup> . (A) Average distance rmean between the fathers and the children as a function of the number of cycles Ncycles. (B) Logarithm of the dimensionless average distance rmean/r0 as a function of the number of cycles Ncycles.

suitable for a qualitative measure of the mixing as 18 varies. More details on flow patterns associated to peculiar phase lags can be obtained in Chateau et al. (2017).
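As an illustration of how an exponent such as γ can be read off curves like those of panel (B) in the figures above, the following minimal sketch fits the slope of ln(r<sub>mean</sub>/r<sub>0</sub>) against the number of cycles by least squares. It assumes that γ is taken as this slope over a window of exponential growth; the function name, the window bounds, and the synthetic data are hypothetical and are not taken from the original study.

```python
import math

def lyapunov_exponent(r_mean, r0, first_cycle, last_cycle):
    """Least-squares slope of ln(r_mean / r0) against the cycle index.

    r_mean[k] is the average father-child distance after k cycles.
    Only cycles in [first_cycle, last_cycle] are used; this window is an
    assumption and should cover the exponential-growth regime only.
    """
    cycles = list(range(first_cycle, last_cycle + 1))
    logs = [math.log(r_mean[k] / r0) for k in cycles]
    n = len(cycles)
    mean_x = sum(cycles) / n
    mean_y = sum(logs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cycles, logs))
    sxx = sum((x - mean_x) ** 2 for x in cycles)
    return sxy / sxx

# Synthetic data: exact exponential separation with a rate of 0.3 per cycle
r0 = 1.0e-3
synthetic = [r0 * math.exp(0.3 * k) for k in range(21)]
gamma = lyapunov_exponent(synthetic, r0, first_cycle=5, last_cycle=15)
```

On real tracer data the fitted value depends on the chosen window, since the separation saturates once the particles have spread over the mixing region.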

#### 4.3. Advective and Diffusive Time Scales

The aim of this part is to compare the mixing time scales associated with chaotic advection to those associated with molecular diffusion in the PCL and in the mucus. To do so, we follow the procedure described in Ding et al. (2014), which is recalled hereafter. Note that the main difference here, compared to the work of Ding et al. (2014), is the use of two fluid layers instead of just one, which allows us to investigate different mixing behaviors between the PCL and mucus layers. First, as in section 4.1, we consider particles of different colors initially seeded at a distance s<sub>0</sub> apart at t = t<sub>0</sub>. At t > t<sub>0</sub>, the distance between these two populations of particles has decreased by a ratio α, where 0 < α < 1. Assuming there is only fluid advection, it takes N cycles for the separation distance between the particles to become s<sub>N</sub> = (1 − α)s<sub>0</sub>. The definition of s<sub>N</sub> is thus equivalent to the one of the mixing number m introduced in section 4.1. If the mixing is chaotic, i.e., if the decay in particle separation distance is exponential, one gets s<sub>N</sub><sup>2</sup> = s<sub>0</sub><sup>2</sup> exp(−βN), so that N = −2 log(1 − α)/β. Hence, the time scale associated with mixing by fluid advection is:

$$t\_{\rm mixing}^{\alpha} = T\_{\rm osc} \ast N = \frac{2\pi N}{\omega} = -\frac{4\pi \log(1 - \alpha)}{\beta \omega} \tag{20}$$

where ω is the cilia beating frequency. From a molecular diffusion standpoint, particles moving over a distance αs<sub>0</sub> by molecular diffusion with a diffusivity coefficient D would have the following characteristic time:

$$t\_{\text{diffusion}} = \frac{(\alpha s\_0)^2}{D} \tag{21}$$

By equating the two time scales, one gets:

$$
\omega = -\frac{4\pi \log(1 - \alpha)}{(\alpha s\_0)^2 \beta} D \tag{22}
$$

Thus, for given α, s<sub>0</sub> and β, Equation (22) gives a linear relationship between ω and D, which makes it possible to compare, in the parameter space (D, ω), the regions where the mixing is dominated by advection or by molecular diffusion. In order to compare our results to the ones of Ding et al. (2014), the same values of α = 0.9 and s<sub>0</sub> = L = 10 µm are used. **Figure 15A,B** show the results obtained in the PCL (area 1) and in the mucus (area 2), respectively. One can see in **Figure 15A** that there is a region compatible with typical cilia beating frequency where mixing by fluid advection is dominant. This is in accordance with the results found by Ding et al. (2014), who obtained similar mixing rates in a single layer of fluid. Note that in Ding et al. (2014), as only one phase was modeled, only two populations of tracers were considered, which filled the whole computational domain. No distinction in Ding et al. (2014) was made between regions of strong mixing (around the cilia) and regions of weaker mixing (far above the cilia). On the contrary, **Figure 15B** shows that above the PCL-mucus interface, the mixing is dominated by molecular diffusion. Hence, it shows that drugs deposited onto the mucus layer can only reach the PCL via molecular diffusion.

This can be confirmed by a simple calculation: according to Morgan et al. (2004), the mucus velocity is around V<sub>mucus</sub> = 1.72 × 10<sup>−4</sup> m·s<sup>−1</sup>. Assuming that there are no bifurcations in the airways, so that the mucus is transported in the same direction on a total length of around 20 cm, and assuming that its velocity V<sub>mucus</sub> remains constant, one gets that it takes around 20 min for the mucus to be expelled. This time has to be compared with the time taken by particles for reaching the PCL layer by molecular diffusion: assuming that the layer of mucus has a thickness L<sub>mucus</sub> = 70 µm, the time for a particle to diffuse over this distance can be approximated using Equation (21): t ≈ (αs<sub>0</sub>)<sup>2</sup>/D = L<sub>mucus</sub><sup>2</sup>/D. Using a diffusion coefficient D = 2.9 × 10<sup>−11</sup> m<sup>2</sup>·s<sup>−1</sup>, corresponding to human immunoglobulin G (IgG) in mucus (Saltzman et al., 1994), one gets a value of 169 s for the IgG to reach the PCL-mucus interface. These results show that drugs injected by nasal sprays and deposited onto the mucus layer may always reach the PCL area through molecular diffusion. There, the chaotic advection will further increase the mixing to bring drugs near the epithelium. However, drugs composed of large molecules will have smaller diffusion coefficients, and might not reach the PCL in time (for instance, for a value of diffusion coefficient of the order of 10<sup>−14</sup> m<sup>2</sup>·s<sup>−1</sup>, it will take around 136 h to reach the PCL). Note, however, that the conclusions drawn here result from several hypotheses, which may limit the generality of our simplified model of MCC. Other phenomena, such as chemical reactions, osmosis, or unusual mucus properties associated with particular pulmonary diseases might occur and should be taken into account for a deeper understanding of the balance between advective and diffusive mixing.
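The time scales of Equations (20) to (22) and the order-of-magnitude estimates above can be checked with a short script. This is a minimal sketch: the function names are illustrative, and the decay rate `beta_example` is a hypothetical value chosen only to exercise the formulas, since no value of β is quoted in this section. The crossover frequency is written with the sign that makes it positive for 0 < α < 1.

```python
import math

def advective_mixing_time(alpha, beta, omega):
    """Equation (20): time for advection to reduce the separation by a ratio alpha."""
    n_cycles = -2.0 * math.log(1.0 - alpha) / beta   # from s_N^2 = s_0^2 exp(-beta N)
    return 2.0 * math.pi * n_cycles / omega

def diffusion_time(length, diffusivity):
    """Equation (21): characteristic time to diffuse over a given length."""
    return length ** 2 / diffusivity

def crossover_frequency(alpha, beta, s0, diffusivity):
    """Frequency at which the advective and diffusive time scales are equal."""
    return -4.0 * math.pi * math.log(1.0 - alpha) * diffusivity / ((alpha * s0) ** 2 * beta)

# Values quoted in the text
v_mucus = 1.72e-4      # mucus velocity, m/s
airway_length = 0.20   # transport length, m
l_mucus = 70e-6        # mucus layer thickness, m
d_igg = 2.9e-11        # IgG diffusivity in mucus, m^2/s
d_large = 1e-14        # diffusivity of a large molecule, m^2/s

transit_time_s = airway_length / v_mucus
t_igg_s = diffusion_time(l_mucus, d_igg)
t_large_h = diffusion_time(l_mucus, d_large) / 3600.0

# Hypothetical decay rate, for illustration only
beta_example = 0.5
omega_c = crossover_frequency(0.9, beta_example, 10e-6, d_igg)
```

With the quoted inputs, the diffusion times evaluate to about 169 s and 136 h, and the transit time to about 1.16 × 10<sup>3</sup> s.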

#### 5. CONCLUSION AND PERSPECTIVES

By using a coupled lattice-Boltzmann/Immersed Boundary solver, the transport and mixing induced by beating cilia were studied in the context of MCC. Thanks to this numerical approach, a stable two-phase system (mucus-PCL), allowing the introduction of a viscosity ratio, can be studied. The mucus-PCL interface is also naturally captured. Due to the local nature of the LBM, the parallelization is straightforward, allowing the simulations of large domains.

A detailed study of the transport induced by antiplectic and symplectic MCW has been performed, and the results showed that antiplectic MCW with large wavelengths (i.e., ΔΦ < π/4) are better able to transport the mucus. A displacement ratio has also been introduced to quantify the capacity of a system to transport particles for a given power input. The configuration corresponding to an antiplectic MCW with ΔΦ = π/6 has been found to be the most energetically efficient. On the contrary, symplectic MCW with large wavelengths result in a very poor transport, due to the displacement of the mucus-PCL interface above the cilia tips during their stroke phase.

The mixing capabilities of the system have also been studied in three distinct areas. The results showed that the mixing is chaotic in both the PCL region and above the PCL-mucus interface. The strongest mixing is obtained in the PCL region, where only a few beating cycles are required to obtain a converged state of mixing. On the contrary, far above the interface, the mixing is almost null. The calculation of Lyapunov exponents in specific locations of the domain has also shown that the mixing is stronger when a cilium passes through the area of measurements, and especially around the cilia tips because of their "whip-like" motion. On the contrary, between two cilia along the y-direction, the mixing takes more time and is weaker. At the interface, particles are trapped and consequently follow the undulating motion of the mucus-PCL interface. Two time scales can be defined, one associated with advective mixing and the other one with diffusive mixing. The results showed that in the mucus, the mixing is always dominated by diffusion. Regions in the ω-D phase diagram where mixing in the PCL is dominated by advection also exist. These results show that drugs deposited onto the mucus layer can only reach the PCL layer via molecular diffusion. The two-layer character of the MCC allows a strong chaotic mixing in the PCL while trapping the particles inside thanks to the presence of a viscous layer of mucus. Above, the mixing is also chaotic, but at a much lower rate, which allows the mucus to be transported straightforwardly.

Future efforts toward more realistic simulations of the MCC include:


#### AUTHOR CONTRIBUTIONS

SC worked on the numerical framework, designed the analysis, and drafted the manuscript. UD developed the algorithm, helped in data analysis, provided significant feedback on the computation of the Lyapunov exponents, and reviewed the manuscript. SP developed the algorithm, helped in data analysis, and reviewed the manuscript. JF developed the algorithm, helped in data analysis, and reviewed the manuscript.

#### FUNDING

SC and SP acknowledge the Natural Sciences and Engineering Research Council of Canada for its financial support through a discovery grant (RGPIN-2015-06512).

#### ACKNOWLEDGMENTS

This work was granted access to the HPC resources of Aix-Marseille Université financed by the project Equip@Meso (ANR-10-EPQX-29-01) of the program Investissements d'Avenir supervised by the Agence Nationale de la Recherche, and to the HPC resources of Compute Canada.

### REFERENCES


Int. J. Heat Fluid Flow 61, 677–710. doi: 10.1016/j.ijheatfluidflow.2016.07.013


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Chateau, D'Ortona, Poncet and Favier. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Support Vector Machine Based Monitoring of Cardio-Cerebrovascular Reserve during Simulated Hemorrhage

Björn J. P. van der Ster 1, 2, 3, Frank C. Bennis 4, 5, Tammo Delhaas 4, 6 , Berend E. Westerhof 2, 3, 7, Wim J. Stok 2, 3 and Johannes J. van Lieshout 1, 2, 3, 8 \*

<sup>1</sup> Department of Internal Medicine, Academic Medical Center, University of Amsterdam, Amsterdam, Netherlands, <sup>2</sup> Department of Medical Biology, Academic Medical Center, University of Amsterdam, Amsterdam, Netherlands, <sup>3</sup> Laboratory for Clinical Cardiovascular Physiology, Center for Heart Failure Research, Academic Medical Center, Amsterdam, Netherlands, <sup>4</sup> Department of Biomedical Engineering, Maastricht University, Maastricht, Netherlands, <sup>5</sup> MHeNS School for Mental Health and Neuroscience, Maastricht University, Maastricht, Netherlands, <sup>6</sup> CARIM School for Cardiovascular Diseases, Maastricht University, Maastricht, Netherlands, <sup>7</sup> Department of Pulmonary Diseases, Institute for Cardiovascular Research, ICaR-VU, VU University Medical Center, Amsterdam, Netherlands, <sup>8</sup> MRC/Arthritis Research UK Centre for Musculoskeletal Ageing Research, School of Life Sciences, The Medical School, University of Nottingham Medical School, Queen's Medical Centre, Nottingham, United Kingdom

#### Edited by:

Mariano Vázquez, Barcelona Supercomputing Center, Spain

#### Reviewed by:

Radu Iliescu, University of Medicine and Pharmacy "Gr. T. Popa" Iasi, Romania Faezeh Marzbanrad, Monash University, Australia

#### \*Correspondence:

Johannes J. van Lieshout j.j.vanlieshout@amc.uva.nl orcid.org/0000-0002-3646-2122

#### Specialty section:

This article was submitted to Computational Physiology and Medicine, a section of the journal Frontiers in Physiology

Received: 18 September 2017 Accepted: 04 December 2017 Published: 05 January 2018

#### Citation:

van der Ster BJP, Bennis FC, Delhaas T, Westerhof BE, Stok WJ and van Lieshout JJ (2018) Support Vector Machine Based Monitoring of Cardio-Cerebrovascular Reserve during Simulated Hemorrhage. Front. Physiol. 8:1057. doi: 10.3389/fphys.2017.01057

Introduction: In the initial phase of hypovolemic shock, mean blood pressure (BP) is maintained by sympathetically mediated vasoconstriction, rendering BP monitoring insensitive to detect blood loss early. Late detection can result in reduced tissue oxygenation and eventually cellular death. We hypothesized that a machine learning algorithm that interprets currently used and new hemodynamic parameters could facilitate the detection of impending hypovolemic shock.

Method: In 42 (27 female) young [mean (sd): 24 (4) years], healthy subjects, central blood volume (CBV) was progressively reduced by application of −50 mmHg lower body negative pressure until the onset of pre-syncope. A support vector machine was trained to classify samples into normovolemia (class 0), initial phase of CBV reduction (class 1) or advanced CBV reduction (class 2). Nine models making use of different features were computed to compare sensitivity and specificity of different non-invasive hemodynamic derived signals. Model features included: volumetric hemodynamic parameters (stroke volume and cardiac output), BP curve dynamics, near-infrared spectroscopy determined cortical brain oxygenation, end-tidal carbon dioxide pressure, thoracic bio-impedance, and middle cerebral artery transcranial Doppler (TCD) blood flow velocity. Model performance was tested by quantifying the predictions with three methods: sensitivity and specificity, absolute error, and quantification of the log odds ratio of class 2 vs. class 0 probability estimates.

Results: The combination with maximal sensitivity and specificity for classes 1 and 2 was found for the model comprising volumetric features (class 1: 0.73–0.98 and class 2: 0.56–0.96). Overall lowest model error was found for the models comprising TCD curve hemodynamics. Using probability estimates the best combination of sensitivity for class 1 (0.67) and specificity (0.87) was found for the model that contained the TCD cerebral blood flow velocity derived pulse height. The highest combination for class 2 was found for the model with the volumetric features (0.72 and 0.91).

Conclusion: The most sensitive models for the detection of advanced CBV reduction comprised data that describe features from volumetric parameters and from cerebral blood flow velocity hemodynamics. In a validated model of hemorrhage in humans these parameters provide the best indication of the progression of central hypovolemia.

Keywords: cardiovascular modeling, cerebrovascular, hypovolemia, lower body negative pressure, machine learning, support vector machine

### INTRODUCTION

Hypovolemic shock is the hemodynamic response to a critically reduced central blood volume (CBV) and its diagnosis has challenged clinicians since the Second World War (Grant and Reeve, 1941; Secher and Van Lieshout, 2016). The main treatment consists of intravenous volume administration (Secher and Van Lieshout, 2005) to raise cardiac output (CO) and improve microvascular blood flow (Vincent and De Backer, 2013; Perner and De Backer, 2014; Secher and Van Lieshout, 2016) and tissue oxygen delivery (Zollei et al., 2013; Simon et al., 2015). However, detection of a clinically relevant blood volume deficit remains difficult (Marik et al., 2011; Vincent and De Backer, 2013; Bronzwaer et al., 2015; Secher and Van Lieshout, 2016) because the blood volume is not only characterized by its magnitude but also by its function as preload to the heart (Marik et al., 2011; Bronzwaer et al., 2015; Secher and Van Lieshout, 2016). From that perspective, a functional definition of "normovolemia" is by its ability to provide the heart with an adequate CBV i.e., cardiac preload that maintains stroke volume, cardiac output, and oxygen delivery (Harms et al., 2007; Truijen et al., 2010). Direct measures of CBV are not routinely available in the clinical environments of intensive care and operating room. As a result, volume treatment during anesthesia is generally planned according to a somewhat arbitrary fixed volume regime (Bundgaard-Nielsen et al., 2009) or guided by blood pressure (BP) and heart rate (HR). However, interpretation of BP and HR changes in response to a reduction in CBV is not straightforward since loss of 1 l of blood or fluid is not reflected in changes in BP (Harms et al., 2003). Therefore, optimization of tissue oxygen delivery cannot be conducted by monitoring arterial pressure alone (Michard and Teboul, 2002; Convertino, 2012; Secher, 2015; Cannesson, 2016). 
It is problematic that present hemodynamic monitoring techniques do not allow detection and therefore early treatment of a volume deficit before worsening of the cardio-cerebrovascular condition compromising oxygenation of the brain (Secher and Van Lieshout, 2005).

We hypothesized that the arterial pressure and transcranial cerebral blood flow velocity waveforms contain subtle information on the actual cardio-cerebrovascular condition that is hard to interpret by human visual inspection. We set out to investigate whether a machine learning model (Deo, 2015) could be trained to detect hypovolemia using hemodynamic signals during progressive reduction of CBV. This would allow determination of the extent to which the cardiovascular system can compensate for hypovolemia, i.e., its compensatory reserve prior to (impending) circulatory collapse (Convertino et al., 2016), by classifying patients according to their actual need of fluid therapy (Convertino and Sawka, 2017), and allow timely clinical intervention. Given that the brain is highly susceptible to hypoperfusion and hypoxia, we hypothesized that the cerebral flow velocity wave shape may disclose early alterations that can be attributed to the hypovolemia-induced onset of cerebral hypoperfusion resulting in pre-syncope. Earlier machine learning approaches based on BP waveforms (Moulton et al., 2013) and beat-to-beat parameters (Bennis et al., 2017) showed that they can detect a reduction in CBV. To that end, we parametrized both the BP and TCD waveforms to make information about curve dynamics available for statistical modeling during progressive hemorrhagic shock and compared the BP features to features from other non-invasive hemodynamic technologies. We trained a model to recognize progressive hypovolemia by means of supervised machine learning and tested it on a human model of progressive hemorrhagic shock (lower body negative pressure, LBNP). The goal was to create a model that picks up on changing physiology during the transitional phase from compensated to uncompensated circulatory shock by classifying each heartbeat based on its accompanying feature information, and to check which non-invasive hemodynamic monitor contributes the most sensitive information to solve this problem.

### METHODS

#### Subjects

Forty-two young, healthy volunteers [27 female; age: mean (SD): 24 (4) years] with no history of fainting and/or cardiac arrhythmia and not taking cardiovascular medication participated in the study. They abstained from heavy exercise and caffeinated beverages for at least 12 h prior to the experiment. Before inclusion, subjects underwent a medical screening consisting of a medical interview, a physical examination and a 12-lead ECG. The experiments were conducted in a quiet, temperature-controlled laboratory (20–22°C). This study was carried out in accordance with the recommendations of the Academic Medical Centre Amsterdam medical ethical committee (#2014\_310) with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the medical ethical committee of the Academic Medical Centre, Amsterdam.

### Experimental Protocol

Measurements were performed with subjects in the supine position. Following instrumentation, the lower body was positioned inside a lower body negative pressure (LBNP) box (Dr. Kaiser Medizintechnik, Bad Hersfeld, Germany) and sealed at the level of the iliac crest (Goswami et al., 2009). To prevent a downward shift of the body into the LBNP box disrupting the airtight sealing with loss of sub-atmospheric pressure, the LBNP box was equipped with a saddle (Bronzwaer et al., 2017a). Subjects rested for 30 min of which the final 10 were designated as baseline segment, followed by application of a single step continuous negative pressure (50 mmHg below atmospheric pressure) to the lower part of the body. The pressure inside the box was manually controlled and established within less than 20 s.

During the experiment, subjects were instructed to breathe normally and to avoid movement and muscle flexing. In compliance with our laboratory safety guidelines LBNP was terminated in case of (pre-)syncopal symptoms including sweating, light-headedness, nausea, blurred vision, and/or signs meeting one or more of the following criteria: systolic arterial pressure (SAP) below 80 mmHg, or rapid drop in BP [SAP by ≥25 mmHg·min<sup>−1</sup>, diastolic arterial pressure (DAP) by ≥15 mmHg·min<sup>−1</sup>], and drop in HR by ≥15 bpm·min<sup>−1</sup>. If none of these criteria occurred within 30 min, the protocol was ended. The subjects were continuously monitored by an investigator experienced in human studies and unoccupied by experimental obligations.

### Measurements

Continuous arterial BP was measured non-invasively by volume-clamp finger plethysmography with the cuff placed around the middle phalanx of the left hand, held at heart level (Nexfin, Edwards Lifesciences BMEYE, the Netherlands), and sampled at 200 Hz. Left ventricular stroke volume (SV) was determined by a pulse contour method (Nexfin CO-trek, Edwards Lifesciences BMEYE, Amsterdam, the Netherlands). Cardiac output (CO) was calculated as the SV times HR and total peripheral resistance (TPR) was the ratio of mean arterial pressure (MAP) to CO. Changes in CBV were monitored using thoracic impedance (TI) (Nihon Kohden, AI-601G, Japan) (Krantz et al., 2000; van Lieshout et al., 2005). Cerebral blood flow velocity was measured in the proximal segment of the middle cerebral artery (MCA) by means of TCD (DWL Multidop X4, Sipplingen, Germany). The left MCA was insonated through the temporal window just above the zygomatic arch at a depth of 40–60 mm with a pulsed 2 MHz probe. After signal optimization, the probe was fixed on a specially designed head-band (Marc 600, Spencer Technologies, Redmond, Washington, USA). Changes in oxygenated and deoxygenated hemoglobin (Hb) as well as their summation were measured using continuous wave near-infrared spectroscopy (NIRS) (Oxymon, Artinis, Zetten, The Netherlands). NIRS tracks cortical cerebral oxygenation during manipulation of CBF in parallel with the brain capillary oxygen saturation (Rasmussen et al., 2007). A differential path length factor was computed according to Gersten et al. (Gersten, 2015) to account for the scattering of light in the brain tissue. NIRS signals were recorded at 10 Hz. Optodes were fixed just above the supraorbital ridge and below the hairline. Changes in cutaneous perfusion may interfere with the accuracy of cerebral oximetry; therefore, the distance between the transmitter and the receivers was 5 cm to assure deep enough penetration of the near-infrared light into the brain to exclude substantial contamination from the extra-cerebral circulation (Claassen et al., 2006).

End-tidal CO<sub>2</sub> partial pressure (ETCO2) was measured through a nasal cannula connected to a capnograph (Hewlett Packard 7834A, Wokingham, UK). All recorded signals were analyzed offline (Matlab R2007b, Mathworks Inc. MA, USA) and visually inspected for artifacts and noise. Invalid beats were manually removed and interpolated.

### MODELING APPROACH

Models were created by means of a support vector machine algorithm [libsvm software package for Matlab (Chang and Lin, 2011)]. We used a supervised learning approach to detect worsening of the cardio-cerebrovascular condition from cardiovascular stability at rest toward instability when approaching pre-syncope. To this end, we defined three distinct classes of the hemodynamic condition (see "class definition"). The algorithm then used one of 9 designated feature sets (listed next) to detect patterns in an attempt to classify each heartbeat in one of the three classes. For each feature set a model was computed using a non-linear radial basis function (Gaussian) kernel (Bishop, 2006). To find the optimal model configuration for each respective feature set, we used 64 combinations of values for kernel width (gamma) and C-value (8 values for each). Using 30 randomly selected training subjects vs. 1 test subject, a configuration was deemed optimal when the sum of sensitivity and specificity, averaged over all tested subjects, was maximal.
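The search over 64 configurations can be sketched as an exhaustive grid search. The candidate values below (powers of two) are an assumption, as the values actually used are not listed here, and `score_fn` is a hypothetical callable standing in for the training and testing step that returns the mean sum of sensitivity and specificity for one configuration.

```python
from itertools import product

# Assumed candidate values: 8 powers of two per parameter (not the values of the study)
C_VALUES = [2.0 ** k for k in range(-3, 5)]
GAMMA_VALUES = [2.0 ** k for k in range(-7, 1)]

def select_configuration(score_fn):
    """Return the (C, gamma) pair with the highest score.

    score_fn(C, gamma) is a hypothetical callable that trains and tests a
    model for one configuration and returns the mean of
    sensitivity + specificity over all tested subjects.
    """
    grid = list(product(C_VALUES, GAMMA_VALUES))   # 8 x 8 = 64 combinations
    return max(grid, key=lambda cfg: score_fn(*cfg))

# Toy score with a known optimum at C = 4, gamma = 0.125 (demonstration only)
def toy_score(C, gamma):
    return 2.0 - abs(C - 4.0) / 16.0 - abs(gamma - 0.125)

best_C, best_gamma = select_configuration(toy_score)
```

In practice `score_fn` would wrap the support vector machine training and the subject-wise testing, which dominate the run time.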

### Class Definition

Baseline rest as well as onset of LBNP and pre-syncope were marked. Time points originating from data during baseline were designated as class 0, samples from data during the first 75% of LBNP as class 1, and samples belonging to the last 25% of LBNP before onset of pre-syncope (i.e., end-stage LBNP) as class 2 (**Figure 2**). Multiclass in libsvm is handled by a one-vs.-one approach (Hsu, 2002). The corresponding feature values at these time points were labeled with one of these three classes. Static features were extracted on a beat-to-beat basis, whereas dynamic features (variation and trends over time) were extracted by a moving window function of fixed size (see model specifications), where each window position was classified as one of three classes. Due to how the classes were defined, the class distribution was not balanced: around 33% of the dataset was baseline data (class 0), 50% was class 1, and 18% class 2.
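The class definition amounts to a simple rule on the time of each sample relative to the two markers. The sketch below is illustrative only: the function name is hypothetical, and the treatment of samples falling exactly on a boundary is an assumption.

```python
def class_label(t, lbnp_onset, presyncope_onset):
    """Assign class 0, 1 or 2 to a sample taken at time t (same unit as the markers)."""
    if t < lbnp_onset:
        return 0                                   # baseline rest
    lbnp_duration = presyncope_onset - lbnp_onset
    if t < lbnp_onset + 0.75 * lbnp_duration:
        return 1                                   # first 75% of LBNP
    return 2                                       # last 25% before pre-syncope

# Example: LBNP from t = 600 s until pre-syncope at t = 1400 s
labels = [class_label(t, 600.0, 1400.0) for t in (100.0, 700.0, 1199.0, 1200.0, 1399.0)]
```

Because the LBNP duration differs between subjects, the boundary between classes 1 and 2 falls at a different absolute time for each subject.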


### Feature Extraction

To test the viability of different measured parameters from non-invasive measurement modalities, we designed 7 models (named model #1 through #7). All shared the BP curve dynamics features (model #1, **Figure 1**, Table A1 in Supplementary material).


Features were then appended for models #2 through #7 for each investigated measurement modality to evaluate predictive capability when adding features from ETCO2, TI, NIRS, or TCD in modeling impending pre-syncope. All extracted features were downsampled by a factor of 10 to reduce calculation time. Two models (named models #8 and #9) stand on their own and do not include the BP curve dynamics feature set.

#### DEFAULT MODEL: BP CURVE DYNAMICS (MODEL #1)

From the arterial BP wave, beat-to-beat systolic, diastolic, mean, and pulse pressure (SAP, DAP, MAP, and PP), interbeat interval (IBI), HR, stroke volume (SV), cardiac output (CO), left ventricular ejection time (LVET), and TPR were extracted (10 features). Four incrementally sized intervals during LBNP (30, 60, 90, and 120 s) were used for calculating trends and variances of SAP, DAP, HR, PP, and SV [4 intervals times 5 parameters for 2 techniques (trend and variation) yields 40 features]. Additional information from the BP wave shape was extracted by wave segmentation and parametrization (**Figure 1** and Table A1, Appendix Supplementary material, 15 features), making a total of 65 parameters available for the BP curve dynamics model.
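The feature count and the dynamic features can be illustrated as follows. This is a sketch under stated assumptions: the trend is taken as the least-squares slope and the variation as the population variance over a trailing window, which may differ from the definitions used in the study, and all names are hypothetical.

```python
WINDOWS_S = (30, 60, 90, 120)
PARAMETERS = ("SAP", "DAP", "HR", "PP", "SV")

def trend_and_variance(times, values, t_now, window_s):
    """Slope and variance of the samples in the trailing window (t_now - window_s, t_now]."""
    pts = [(t, v) for t, v in zip(times, values) if t_now - window_s < t <= t_now]
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_v = sum(v for _, v in pts) / n
    stt = sum((t - mean_t) ** 2 for t, _ in pts)
    stv = sum((t - mean_t) * (v - mean_v) for t, v in pts)
    variance = sum((v - mean_v) ** 2 for _, v in pts) / n
    slope = stv / stt if stt > 0 else 0.0   # a single sample has no trend
    return slope, variance

# Feature bookkeeping as described in the text
n_dynamic = len(WINDOWS_S) * len(PARAMETERS) * 2   # trend and variation
n_total = 10 + n_dynamic + 15                      # beat-to-beat + dynamic + wave shape

# Synthetic series: one beat per second, SAP falling by 0.1 mmHg per second
times = [float(t) for t in range(1, 121)]
sap = [120.0 - 0.1 * t for t in times]
slope_60, var_60 = trend_and_variance(times, sap, t_now=120.0, window_s=60)
```

The same function would be applied to each of the five parameters and each of the four window lengths to obtain the 40 dynamic features.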

#### INCREMENTAL MODELS

Either beat-to-beat interpolated ETCO2 partial pressure or TI was appended in models #2 and #3, respectively (each has 1 additional feature). Features extracted from the NIRS consisted of the three concentrations of Hb: oxygenated, deoxygenated, and their summation (total Hb). Ratios of oxygenated and deoxygenated to total Hb were added as well to this model (model #4, 5 additional features).

Similar to the BP wave parametrization, the same points, durations, tangents, and surface areas were derived from the cerebral blood flow velocity wave. Further features comprised systolic, diastolic, and mean flow velocity (MFV) as well as the difference between systolic and diastolic flow velocity (flow velocity pulse height) and their variation and trends over the same intervals as described for model #1. Also included were the cerebral autoregulatory gain and phase, computed from the transfer function between MAP and MFV over a 3-min moving window (Zhang et al., 1998). The low frequency band (0.06–0.15 Hz) where covariation in both signals was significant (coherence of at least 0.5) was averaged to get the respective gain and phase. Model #5 will further be referred to as the flow velocity curve dynamics model (FV curve dynamics). Models #6 and #7 each had a single FV-derived feature addition: either the MFV or the flow velocity pulse height was appended to models #6 and #7, respectively.

Models 2 through 7 contain the features from model #1 with device-specific features. Models 8 and 9 are smaller models that contain features that are currently clinically used and/or available.

#### FURTHER MODELS (MODELS #8 AND #9)

Two separate models were created to check model performance without newly introduced features. A model with the basic hemodynamic output from the Nexfin device (SAP, DAP, MAP, PP, IBI, HR, SV, CO, TPR, and LVET, model #8) was created to evaluate their additional value compared to BP and HR. A model comprising only BP (SAP, DAP, and MAP) and HR was introduced as a basic model (#9).

Optimal results following the 64-fold optimization steps for different incremental values for regularization parameter C (misclassification penalty) and gamma (deviation of the radial basis kernel) for each feature set.

The number of features in each model is summarized in **Table 1**.

Parameters were transposed into a feature matrix, normalized with respect to values during baseline and scaled so that all features ranged between 0 and 1. A corresponding label vector, containing the assigned class of each feature row, was appended alongside.
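A minimal sketch of the two preprocessing steps for a single feature column is given below. It assumes that normalization means division by the mean baseline value and that scaling is a min-max scaling over the column; both choices, as well as the function names and sample values, are assumptions made for illustration.

```python
def normalize_to_baseline(values, baseline_values):
    """Express each value relative to the mean of the baseline segment (assumed: ratio)."""
    baseline_mean = sum(baseline_values) / len(baseline_values)
    return [v / baseline_mean for v in values]

def scale_to_unit_range(values):
    """Min-max scale a feature column so that it spans [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]   # constant feature: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]

# One feature column (e.g., stroke volume in ml) with its baseline samples
baseline = [90.0, 92.0, 88.0]
column = [90.0, 81.0, 72.0, 63.0]
scaled = scale_to_unit_range(normalize_to_baseline(column, baseline))
```

Whether the scaling range is determined per subject or over the pooled training data is not specified here and would need to be fixed before applying such a scheme to unseen subjects.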

#### Training and Testing Process

The complete data set of each subject was assigned to either the training or the testing set, to prevent data from the same subject from contaminating both. Training data consisted of data from a subselection of 30 randomly chosen subjects, which changed each iteration. The resulting model was then tested on a single subject who was not part of the training set. This process was repeated for all 42 subjects. The subset of 30 subjects was chosen to reduce total training time.
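The subject-wise procedure can be sketched as a generator of training and testing splits. The function name and the fixed random seed are illustrative choices, not details of the original implementation.

```python
import random

def subject_splits(subject_ids, n_train=30, seed=0):
    """Yield (train_ids, test_id) pairs: each subject is tested once on a model
    trained on n_train randomly drawn other subjects."""
    rng = random.Random(seed)   # fixed seed only to make the sketch repeatable
    for test_id in subject_ids:
        candidates = [s for s in subject_ids if s != test_id]
        yield rng.sample(candidates, n_train), test_id

splits = list(subject_splits(list(range(1, 43))))
```

Splitting by subject rather than by sample keeps beats from the same recording out of both sets, which would otherwise inflate the apparent performance.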

#### Model Selection

Classification success was defined as the extent to which a model correctly classifies individual samples. Each successive feature addition returned a unique classification outcome that increased or decreased model performance. Each model estimated the probability of a new sample belonging to each of the three classes. Since the classes were defined arbitrarily, it is unlikely that the trained models describe a relevant physiological paradigm. To select the best model (and thus its corresponding feature set), three methods were used to quantify model performance.

Expressed as difference between moving averaged prediction and the predefined class line (Figure 3). Lowest error per class indicated in bold.

TABLE 3 | Median [25% 75%] sensitivity and specificity for different feature sets for the three designated classes.

Class 0: rest; class 1: during LBNP; class 2: final stage LBNP before pre-syncope per model structure. Highest cumulative sensitivity, specificity in that class is indicated in bold.

#### 1. Actual model sensitivity and specificity

Sensitivity and specificity per class were computed from the class assigned by the trained models, without taking into account the additional detail of the probability estimates of each class. Sensitivity and specificity were calculated in a one-vs.-all manner.
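For a three-class problem, the one-vs.-all calculation treats the class of interest as positive and the two remaining classes together as negative. A minimal sketch, with hypothetical names and toy labels:

```python
def sensitivity_specificity(true_labels, predicted_labels, target_class):
    """One-vs.-all sensitivity and specificity for target_class."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == target_class and pred == target_class:
            tp += 1
        elif truth == target_class:
            fn += 1
        elif pred == target_class:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity

# Toy example with the three classes 0, 1 and 2
truth = [0, 0, 1, 1, 1, 2, 2, 2]
predicted = [0, 1, 1, 1, 2, 2, 2, 0]
sens_2, spec_2 = sensitivity_specificity(truth, predicted, target_class=2)
```

Note that a confusion between the two non-target classes (here a class 0 sample predicted as class 1) counts as a true negative for class 2.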

#### 2. Individual model error

Model error was expressed as the difference between the predefined classes and the moving average of the prediction of each model.
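One possible reading of this error measure is sketched below. It assumes a trailing moving average of the predicted class and the mean absolute difference to the predefined class line; the window length, the use of the absolute value, and the averaging over samples are assumptions, and the names are hypothetical.

```python
def moving_average(values, window):
    """Trailing moving average; the first samples use the shorter available history."""
    out = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        segment = values[start:i + 1]
        out.append(sum(segment) / len(segment))
    return out

def model_error(true_labels, predicted_labels, window):
    """Mean absolute difference between the class line and the smoothed prediction."""
    smoothed = moving_average(predicted_labels, window)
    return sum(abs(t - s) for t, s in zip(true_labels, smoothed)) / len(true_labels)

# Toy example: predefined class line and per-sample predictions
truth = [0, 0, 1, 1, 2, 2]
predicted = [0, 1, 1, 1, 1, 2]
error = model_error(truth, predicted, window=2)
```

Unlike sensitivity and specificity, this measure penalizes a prediction of class 0 for a class 2 sample more heavily than a prediction of class 1.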

#### 3. Specificity and sensitivity by accounting for probability estimates

In addition to classifying every individual sample, each model returns a probability of the sample belonging to each respective class. In method 1 the class with the highest probability is selected as the prediction of the model for that sample. To account for probability estimates we took the ratio of the probability of a sample belonging to class 2 over its probability of belonging to class 0. The logarithm of this (odds) ratio was taken, and lower and upper cutoff values for this ratio were determined by using stepwise incremental thresholds to distinguish between classes 0, 1, and 2. The cutoffs were defined as optimal when the sum of sensitivity and specificity was maximal.
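The cutoff search can be sketched as below. This is a simplified illustration under stated assumptions: the step size, the small constant `eps`, and the choice to optimize the lower cutoff for class 0 and the upper cutoff for class 2 independently are not taken from the text, and every class is assumed to be present in the data.

```python
import math


def log_odds(p0, p2, eps=1e-12):
    """Log of P(class 2) / P(class 0); eps guards against zero probabilities."""
    return math.log((p2 + eps) / (p0 + eps))


def sens_plus_spec(predicted_pos, actual_pos):
    """Sum of sensitivity and specificity for boolean predictions."""
    tp = sum(1 for p, a in zip(predicted_pos, actual_pos) if p and a)
    tn = sum(1 for p, a in zip(predicted_pos, actual_pos) if not p and not a)
    n_pos = sum(actual_pos)
    n_neg = len(actual_pos) - n_pos
    return tp / n_pos + tn / n_neg


def best_cutoffs(scores, labels, step=0.1):
    """Sweep thresholds stepwise and return (lower, upper) cutoffs.

    lower separates class 0 (score < lower) from classes 1 and 2;
    upper separates class 2 (score >= upper) from classes 0 and 1.
    """
    lo, hi = min(scores), max(scores)
    n_steps = int((hi - lo) / step) + 1
    grid = [lo + k * step for k in range(n_steps + 1)]
    is0 = [lab == 0 for lab in labels]
    is2 = [lab == 2 for lab in labels]
    lower = max(grid, key=lambda c: sens_plus_spec([s < c for s in scores], is0))
    upper = max(grid, key=lambda c: sens_plus_spec([s >= c for s in scores], is2))
    return lower, upper


def classify(score, lower, upper):
    """Assign class 0, 1, or 2 from a log-odds score and two cutoffs."""
    if score < lower:
        return 0
    if score >= upper:
        return 2
    return 1


# Hypothetical, well-separated log-odds scores
scores = [-3.0, -2.5, -2.0, -0.5, 0.0, 0.5, 2.0, 2.5, 3.0]
labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]
lower, upper = best_cutoffs(scores, labels)
```

With overlapping classes, as reported for classes 1 and 2, the two cutoffs trade sensitivity against specificity rather than separating the classes perfectly.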

### RESULTS

The results of the search for the optimal C and gamma values per model are given in **Table 2**. These optimal models were chosen to compute both sensitivity and specificity (**Table 3**), the model errors (**Table 4**) and to detect optimal cutoffs for the probability estimate analysis (**Table 5**).

#### Actual Model Sensitivity and Specificity

Regarding classes 1 and 2, the combination with highest sensitivity and specificity was found for the model comprising volumetric features (#8) (class 1: sensitivity: 0.73; specificity: 0.98; class 2: sensitivity: 0.56; specificity: 0.96) (**Table 3**). Adding variation, trends and BP curve dynamics (model #1, **Figure 1**) did not improve the performance of the model for classes 1 (sensitivity 0.63; specificity 0.98) and 2 (sensitivity 0.56; specificity 0.95). Sequentially adding features of ETCO2, TI, or from NIRS or TCD devices also did not improve the actual model sensitivity, while specificity was maintained.

#### Individual Model Error

The FV curve dynamics model (#5) had the lowest error for all three classes combined (**Table 4**). The median error of the BP curve dynamics (#1) vs. FV curve dynamics model (#5) was greater for class 2. The largest fraction of subjects (12/42) benefited from the FV curve dynamics model (#5) since it had the lowest overall error. Models with either mean MCAv (model #6) or pulse height of MCAv (model #7) accounted for another 8/42 subjects. The BP curve dynamics model (#1) had the lowest error for 10/42 subjects. Models including ETCO2 (#2) or NIRS (#4) both performed best 5/42 times. The TI model (#3) came in last as the best model for 2/42 subjects.

#### Specificity and Sensitivity by Accounting for Probability Estimates

In general, all models had similar sensitivity for baseline (class 0) (range: [0.89; 0.95]) and specificity ([0.90; 0.96]) (**Table 5**). Regarding class 1, the best combination of sensitivity and specificity was found for the model that contained the FV derived pulse height (model #7). The highest combination for class 2 was found for the model with the volumetric features (model #8). This model also had the highest combination for both class 1 and class 2 together. An overview of all classification samples can be found in the confusion matrices (Stehman, 1997) in Appendix 2 of the Supplementary Material, both for the actual models and after accounting for probability estimates. In general, the models encounter most difficulty in distinguishing between class 1 and class 2, while the distinction between class 0 and either class 1 or 2 is clearer.

TABLE 5 | Sensitivities and specificities of all models using two cutoffs on probability estimates.

Model numbers indicate: 1, BP curve dynamics; 2, ETCO2; 3, TI; 4, NIRS; 5, TCD dynamics; 6, MCAv mean; 7, MCAv pulse height; 8, Volumetric; 9, HR and BP. Bold: highest cumulative sensitivity, specificity in that class.

### DISCUSSION

The new findings of the present study are, first, that distinguishing between normovolemia and considerable central hypovolemia in healthy young adults requires information from volumetric hemodynamic features beyond BP and HR, such as IBI, SV, CO, LVET, and TPR. Second, the cerebral blood flow velocity parameters reduced model error, possibly due to the creation of a more easily separable solution.

Features derived from the BP curve, ETCO2, TI, and from cerebral blood flow velocity and brain cortical oxygenation did not improve the classification in terms of sensitivity to detect advanced class 2 hypovolemia. In contrast, cerebral blood flow velocity models (#5–7) outperformed the other models in terms of absolute error from the predefined (artificially created) classes. Models 2–4 [ETCO2, central blood volume (TI), and cerebral cortical oxygenation (NIRS)] contributed to such an extent that they were the best discriminative model for fewer subjects and therefore in general seem less sensitive to the detection of CBV depletion.

In machine learning or data mining approaches, large datasets are investigated to determine whether the available features together result in a better solution to the problem at hand. A mechanistic approach may not find such a solution in a multidimensional space. The underlying physiological mechanisms can ideally be described by such a mechanistic approach so that it can explain the wide variety of pathophysiology as is seen in different patients. Unfortunately, this is not easily achieved and assumptions would have to be made for many parameters, as they cannot be measured in real time (or at all), resulting in a model that is not very useful for individual cases. Due to the large natural variation between subjects, some individuals increase peripheral resistance to maintain adequate blood pressure, whereas others increase heart rate at onset of LBNP, and yet another group responds in a mixed fashion. We do not think these subtleties can be grasped by a mechanistic approach, unless the responses of a patient were assessed beforehand, which is not feasible in clinical practice.

It is possible that a unique set of features exists from different devices that gives an even better solution. To assess this possibility would require a feature selection process, which is cumbersome for this number of models. We considered that these devices are either connected as monitors to patients or not. If so, they return a fixed array of features, which was included in the models here. This study therefore aimed to describe which monitors deliver the most sensitive features and should therefore be connected as a monitor for detecting changing CBV.

#### Limitations

By design the subjects were healthy individuals exposed to simulated bleeding which restrains us from extrapolating the data to elderly subjects, considering that with healthy aging brain perfusion becomes increasingly dependent on cardiac output (Bronzwaer et al., 2017b).

The current models require that their features are normalized to a reference baseline condition. This will be required as well for future use of the models. Future studies should therefore be directed at finding similar model accuracy without baseline normalization. We recognize that eliminating normalization will increase intersubject scatter, inevitably reducing classification performance.

We consider the possibility that adding a considerable number of features introduced the phenomenon known as overfitting. This would imply that the model is being too specifically trained on training data and may not function equally well on new data. Since the SVM method is a regularization model, the introduction of large amounts of features does not necessarily have to lead to worse performance due to overfitting. However, we selected optimal gamma and C on the held-out data, which could have led to a form of overfitting, but due to the newly random selection of 30 subjects in the testing step as well, this is expected to be marginal.

Classes were not distributed homogeneously. Especially during training this could have had a significant effect on the outcome, as the algorithm could have had relatively more examples of what is considered class 0 with respect to the other classes.

Since the training was performed on a subset of subject data, the reported numbers for sensitivity and specificity are not absolute and will be different if the analysis is repeated. In healthy subjects, variation in cardiovascular responses to sympathetic stimulation evoked by submaximal lower body negative pressure (LBNP) is considerable (Bronzwaer et al., 2016). Differences in resting HR between subjects suggest individually programmed reflex strategies of autonomic blood pressure control which may contribute to the hitherto unpredictable variance observed in cardiovascular reflex responses to central hypovolemia (Bronzwaer et al., 2016). Due to this large natural variation in subject responses we considered that by using a random subset the models are not focused on a fixed set but will vary with each iteration. Also since not everyone experiences symptoms of pre-syncope in the exact same way there may be a bias toward the point that was defined as pre-syncope here. By using a random subset of individuals the models were never trained on the full set of this bias but included different subjects each training iteration.

#### Classification and Tracking

The fact that feature sets from cerebral oxygenation, central blood volume, or cerebral blood flow velocity data do not classify beats better than the volumetric features seems to suggest that their capability to predict pre-syncope may be low or at least not better than HR and BP combined with LVET, CO, TPR, and SV. However, the probability estimation of class 2 shows a notable increase, indicating that in the large majority of subjects the developed models all recognized the process of moving from baseline, to CBV depleted, to pre-syncope.

One explanation for the limited difference in performance between models #1 and #8 may be that the Nexfin built-in algorithms themselves include a BP wave shape analysis (pulse contour).

Any attempt to produce a complete clinical classification of hemorrhagic shock for the individual patient can be only provisional due to the complex interrelations in physiological adaptive responses (McMichael, 1944; Michard and Teboul, 2002; Perner and De Backer, 2014). Similarly, between healthy subjects the variation in cardiovascular responses to sympathetic stimulation evoked by bleeding is considerable. Distinct cardiovascular response patterns of preferential autonomic blood pressure control appear consistent over time within one subject but with considerable inter-individual variance in tolerance to hypovolemia (Convertino et al., 2012; Ryan et al., 2012; Bronzwaer et al., 2016). This explains the difference in time until pre-syncope and thus differences in the number of samples between subjects available to the models (Jellema et al., 1996). The models are nevertheless requested to assign one of the three classes to each individual subject through the whole trajectory from normo- to hypovolemia. Also, the large number of samples available for class 0 compared to class 1 and 2 creates an unequal distribution of samples between the three classes. This also explains the overall high specificity, since classification of a sample not belonging to the investigated class could mean either of two remaining classes.

The translation from model output to underlying physiological events is by no means straightforward. Defining the classes from normo- to hypovolemia served merely to create an artificial distinction between the ongoing circulatory adaptive responses to progressive central hypovolemia. As a consequence, the underlying physiological adaptive responses may not fit into the predefined classes, and the reported sensitivity does not reflect a direct classification of physiology either. However, the actual sensitivity/specificity is amenable to improvement by using the certitude of the model, that is, by introducing a cut-off analysis on the probability estimates as proposed in order to quantify model performance. This better approaches a classification of a physiological response, as changing probabilities of the classes could hint at progression toward cardiovascular instability or, conversely, a return to normovolemia that can be tracked over time.

Ideally, model performance is described by the individual (moving averaged) prediction lines, as they tend to increase during progressive hypovolemia (**Figure 3**) and thereby form a visual manifestation of the increasing probability of impending circulatory collapse: they immediately show in what direction the patient's hemodynamic condition is headed. We attempted to overcome the fact that this measure is difficult to express as a numeric error by implementing three different ways of model performance quantification. The probability estimate analysis increased model sensitivity and specificity by taking into account the complexity of the output of the model in the relatively large variation of subject responses to hypovolemia.

Classification of heart beats belonging to either class 0 or class 1 and 2 is straightforward, and appeared linearly separable using only a few features. This may be due to the fact that this protocol was executed in a controlled setting and that the data were normalized to a baseline value. Detecting whether a particular beat should be classified to either the class 1 or class 2 state of being hypovolemic is more challenging, hence the use of a non-linear Gaussian kernel. Due to the large inter-individual variance and artificial nature of class creation, the data show a considerable overlap for the currently presented features, which hindered us from constructing models with a higher sensitivity. Rather, the moving average during the classification process in itself has the potential to function as a real-time visualization of progress toward hypovolemia-induced cardiovascular instability.

### AUTHOR CONTRIBUTIONS

Data acquisition: BvdS, FB, WS. Analysis: BvdS. Figure preparation: BvdS, JvL. Manuscript drafting: BvdS, JvL. Data interpretation: BvdS, FB, TD, BW, WS, JvL. Manuscript editing: BvdS, BW, FB, TD, WS, JvL.

### FUNDING

This work was supported by an educational grant from Edwards Lifesciences (2010B0797).

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fphys.2017.01057/full#supplementary-material


**Conflict of Interest Statement:** This work was supported by an educational grant from Edwards Lifesciences. They, however, had no say in any of the content provided or the direction of submitted research.

Copyright © 2018 van der Ster, Bennis, Delhaas, Westerhof, Stok and van Lieshout. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# **Capturing the Cranio-Caudal Signature of a Turn with Inertial Measurement Systems: Methods, Parameters Robustness and Reliability**

*Karina Lebel<sup>1,2</sup>, Hung Nguyen<sup>3,4</sup>, Christian Duval<sup>3,4</sup>, Réjean Plamondon<sup>5</sup> and Patrick Boissy<sup>1,2</sup>\**

*<sup>1</sup>Faculty of Medicine and Health Sciences, Orthopedic Service, Department of Surgery, Université de Sherbrooke, Sherbrooke, QC, Canada, <sup>2</sup>Research Centre on Aging, Sherbrooke, QC, Canada, <sup>3</sup>Département des Sciences de l'activité Physique, Université du Québec à Montréal, Montreal, QC, Canada, <sup>4</sup>Centre de Recherche Institut Universitaire de Gériatrie de Montréal, Montreal, QC, Canada, <sup>5</sup>Laboratoire Scribens, Département de génie Électrique, École Polytechnique de Montréal, Montréal, QC, Canada*

#### *Edited by:*

*Mariano Vázquez, Barcelona Supercomputing Center, Spain*

#### *Reviewed by:*

*Peter C. Fino, Oregon Health & Science University, United States; Guanghao Sun, University of Electro-Communications, Japan*

*\*Correspondence:*

*Patrick Boissy patrick.boissy@usherbrooke.ca*

#### *Specialty section:*

*This article was submitted to Computational Physiology and Medicine, a section of the journal Frontiers in Bioengineering and Biotechnology*

*Received: 08 May 2017 Accepted: 04 August 2017 Published: 23 August 2017*

#### *Citation:*

*Lebel K, Nguyen H, Duval C, Plamondon R and Boissy P (2017) Capturing the Cranio-Caudal Signature of a Turn with Inertial Measurement Systems: Methods, Parameters Robustness and Reliability. Front. Bioeng. Biotechnol. 5:51. doi: 10.3389/fbioe.2017.00051*

**Background:** Turning is a challenging mobility task requiring coordination and postural stability. Optimal turning involves a cranio-caudal sequence (i.e., the head initiates the motion, followed by the trunk and the pelvis), which has been shown to be altered in patients with neurodegenerative diseases, such as Parkinson's disease as well as in fallers and frails. Previous studies have suggested that the cranio-caudal sequence exhibits a specific signature corresponding to the adopted turn strategy. Currently, the assessment of cranio-caudal sequence is limited to biomechanical labs which use camera-based systems; however, there is a growing trend to assess human kinematics with wearable sensors, such as attitude and heading reference systems (AHRS), which enable recording of raw inertial signals (acceleration and angular velocity) from which the orientation of the platform is estimated. In order to enhance the comprehension of complex processes, such as turning, signal modeling can be performed.

**Aim:** The current study investigates the use of a kinematic-based model, the sigma-lognormal model, to characterize the turn cranio-caudal signature as assessed with AHRS.

**Methods:** Sixteen asymptomatic adults (mean age = 69.1 *±* 7.5 years old) performed repeated 10-m Timed-Up-and-Go (TUG) with 180° turns, at varying speeds. Head and trunk kinematics were assessed with AHRS positioned on each segment. Relative orientation of the head to the trunk was then computed for each trial and relative angular velocity profile was derived for the turn phase. Peak relative angle (variable) and relative velocity profiles modeled using a sigma-lognormal approach (variables: Neuromuscular command amplitudes and timing parameters) were used to extract and characterize the cranio-caudal signature of each individual during the turn phase.

**Results:** The methodology has shown good ability to reconstruct the cranio-caudal signature (signal-to-noise median of 17.7). All variables were robust to speed variations (*p >* 0.124). Peak relative angle and commanded amplitudes demonstrated moderate to strong reliability (ICC between 0.640 and 0.808).

**Conclusion:** The cranio-caudal signature assessed with the sigma-lognormal model appears to be a promising avenue to assess the efficiency of turns.

**Keywords: turn, deficit, signature, inertial motion capture, IMU, attitude and heading reference system**

## **INTRODUCTION**

Functional mobility is a key component of the quality of life in older adults. Basic daily activities involve the execution of mobility tasks, such as walking, turning, standing up and sitting down. Turning, defined as a change in walking direction, is a specifically challenging mobility task which requires inter-limb coordination and postural stability to adequately follow the central nervous system instructions (Mancini et al., 2015a; Mellone et al., 2016). Turning must also be planned in advance to efficiently and safely process and execute the information leading to the modified trajectory (Patla et al., 1999). Deficits in postural transitions, such as turning, have been identified in frails (Galán-Mercant and Cuesta-Vargas, 2014) and persons with neurological deficits (Salarian et al., 2009; Mancini et al., 2015a) and are associated with a higher risk of falling (Mancini et al., 2016). It has also been shown that objective turn metrics (e.g., number of steps while turning) are able to identify individuals with mobility impairments better than traditional gait speed and clinical measures of mobility (Carpinella et al., 2007; Salarian et al., 2009; Zampieri et al., 2010; King et al., 2012; Spain et al., 2012). Consequently, studies have suggested an increased vulnerability to impairments during the turn compared to straight-line walking due to the complexity of the task and the neural systems involved (Herman et al., 2011). Recently, Hulbert et al. (2015) have suggested categorizing turning deficits into axial and perpendicular deficits, where perpendicular deficits relate to suboptimal movement in the limbs while axial deficits refer to inadequate movement of the head to trunk rotational axis. Perpendicular deficits would, therefore, include: an increased number of steps, related to the use of a compensatory strategy; a reduced step length, to maintain postural stability; and a modified turn strategy. Alternatively, axial deficits would include segment rigidity and segment rotation which would require the adoption of compensatory strategies, and segment coordination and timing, leading to overall uncoordinated movements. On a global scheme, all of these deficits may be viewed as interrelated since full body control and coordination is required to safely execute a turn. Thus, Hulbert et al. suggest that axial deficits may lead to altered control in perpendicular segments. If so, axial deficits may appear first and early assessment of such deficits may lead to better prevention.

In healthy individuals, it has been shown that efficient turning involves a cranio-caudal sequence of movement where the head initiates the motion, followed by the trunk and then the pelvis to efficiently steer the body into the desired new direction (Fuller et al., 2007; Hong et al., 2009). This sequence was shown to be altered in people with neurodegenerative disease and those who are recurrent fallers, exhibiting increased coupling of the segments (Ferrarin et al., 2006; Crenna et al., 2007; Hong et al., 2009; Wright et al., 2012; Spildooren et al., 2013). However, all of these observations were made in motion capture laboratories using camera-based stereophotogrammetric systems. Although powerful, such systems are expensive, complex to use, require a large dedicated space and have a constrained volume of acquisition (Zhou and Hu, 2008). As such, these systems are not well-adapted to clinical settings. To efficiently be used in a clinical context, a system must preferably be portable, configurable, relatively low-cost, easy to use, and output information must be easily interpreted from a clinical perspective (Ginsburg, 2005; Anderson et al., 2012; Gaudreault et al., 2012). Advances in wearable technology offer new possibilities for researchers and clinicians to assess mobility. Inertial measurement systems are among promising wearable sensors which have gathered an increasing interest in the past decade because of their portability, autonomy, acquisition frequency, and general form factor (size and configuration) (Zhou and Hu, 2008; Horak et al., 2015). Inertial measurement systems include attitude and heading reference systems (AHRS), also referred to in the literature as magnetic and inertial measurement unit (MIMU), magnetic angular rate and gravity sensor, or Inertial and Magnetic Unit (MIMU). AHRS are comprised of 3-axes accelerometers, gyroscopes, and magnetometers from which information is fed into a fusion algorithm to estimate the orientation of the module in a global reference frame based on gravity and magnetic North. Therefore, using multiple AHRS affixed on contiguous segments makes it possible to assess a person's joint kinematics in different contexts. The diversity of sensors included within AHRS makes them good representatives of commonly named movement monitors. This measurement system allows not only the quantity of activity performed to be monitored but also the quality of that motion through spatiotemporal gait and turn characteristics analysis as well as joint kinematics (Horak et al., 2015; Lebel et al., 2016).

Although multiple studies have used AHRS to assess mobility, the focus has always been on the raw sensors' information (i.e., acceleration and/or segment angular velocity). Consequently, turn duration and turn speed were identified as useful measures to characterize age-related changes (Sheehan et al., 2014; Vervoort et al., 2016), distinguish recurrent fallers from non-fallers (Greene et al., 2010; Zakaria et al., 2015; Mancini et al., 2016), differentiate between healthy controls and early Parkinson's disease patients (Salarian et al., 2009, 2010; Zampieri et al., 2010; El-Gohary et al., 2013; Mancini et al., 2015a), and frails (Galán-Mercant and Cuesta-Vargas, 2014). Although segment and joint orientation information may provide information on a person's functional capabilities that is more easily interpreted, it is far less exploited. Validity studies have proven that the accuracy of the orientation data is sufficient for coarse clinical kinematic assessment (Ferrari et al., 2010; Zhang et al., 2013; Lebel et al., 2017). However, the literature also clearly highlights possible variations in accuracy with changing magnetic environment (Roetenberg et al., 2007; Palermo et al., 2014; Schiefer et al., 2014; Yadav and Bleakley, 2014) while accuracy has also been shown to vary across joints and tasks (Palermo et al., 2014; Lebel et al., 2016). Recently, Lebel et al. (2017) suggested that this variation may be partly linked to an optimal region of operation for segment angular velocity. These uncertainties regarding orientation data accuracy may explain the current underutilization of such data. However, these limitations are mainly present in extremity kinematics, where segment velocities are higher and magnetic perturbations are more common (Palermo et al., 2014; Lebel et al., 2017). During a turn, both the head and the trunk's angular velocity are within the optimal region of operation and magnetic perturbations can be assumed to be minimal. 
Hence, the kinematic variation of the head relative to the trunk during a turn appears to be a good candidate to investigate the added value of AHRS orientation data analysis to derive meaningful clinical outcomes.

Traditionally, cranio-caudal sequence is assessed in biomechanics laboratories using camera-based stereophotogrammetric systems and analyzed in the temporal domain. Differences in temporal sequences are interpreted to be linked to different turning strategies. Such interpretations suggest that the cranio-caudal sequence exhibits a specific signature according to the adopted turn strategy. The so-called *movement signature* concept corresponds to the specific way (timing, force, amplitude, velocity) the movement is performed. Through signal modeling, the complex system involved in human movement can be reduced to a simpler form in order to better understand it. In this specific case, signal modeling is believed to provide insights into the mobility deficits. Human movement can be modeled using different paradigms which include, but are not limited to: equilibrium point models, minimization-based models, kinematic-based models and neural networks (Plamondon et al., 2014). Based on the Kinematic Theory, human movement can be seen as the cumulative response of an important number of biological systems (Plamondon, 1995a,b, 1998; Plamondon et al., 2003). Each system will produce a velocity vector from which their cumulative sum will, in the end, result in the movement of a segment. The motion can, therefore, be seen as the spatiotemporal representation of the energy induced on a specific body segment. The different systems involved in the planning and the execution of a specific task are controlled by the central nervous system. Therefore, assessment of human motion produced during a specific task can provide insights into the fundamentals of the motor control system (Wolpert et al., 1995). Analysis of the human motion through linear system modeling and an impulse response approach, therefore, seems to be a promising avenue for better characterization and early identification of motor control system deficits. Among those kinematic-based models are the delta- and sigma-lognormal models (Plamondon, 1995a,b; Djioua, 2007). These models rely on mathematical grounds to demonstrate that the lognormal function properly models the impulse response of the neuromuscular network in the case of rapid movements and can be seen as the optimal representation of the movement's kinematics (Djioua and Plamondon, 2009). Their applications range from human motor control phenomena explanations and the factors affecting it (Plamondon and Alimi, 1997; Plamondon et al., 2013a) to scripted signature verification (Djioua and Plamondon, 2009; Woch and Plamondon, 2010; Woch et al., 2011; Plamondon et al., 2013b; Diaz et al., 2016) and detection of fine motor control problems (O'Reilly and Plamondon, 2011; O'Reilly et al., 2014) as well as applications to monitor the evolution of fine motor control in kindergarten children (Duval et al., 2015; Rémi et al., 2015). Indeed, directional rapid movements produce an asymmetrical bell-shaped velocity profile. This can be represented by lognormal functions with characteristic parameters and can be related to the system commands and its ability to respond (command impulse delay, command magnitude, execution delay, and response time). However, can such a model be used to analyze axial control specifically? Preliminary studies within the angular domain have shown that the wrist flexion and extension in monkeys could be fit very well with a delta-lognormal model (Plamondon, 1995a), but no extensive study has further explored the interest of using the Kinematic Theory for the analysis of angular movement control.
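To make the asymmetrical bell shape concrete, a lognormal velocity component in the form commonly used in the Kinematic Theory literature, v(t) = D / (σ√(2π)(t − t0)) · exp(−(ln(t − t0) − μ)² / (2σ²)), can be sketched as below. The scalar (one-dimensional) summation of components, the parameter names (`D` for the command amplitude, `t0` for the command onset time, `mu` and `sigma` for the log-time delay and log-response time) and all numerical values are assumptions chosen for the example, not values from this study.

```python
import math


def lognormal_velocity(t, D, t0, mu, sigma):
    """Velocity of one lognormal component at time t (zero for t <= t0)."""
    if t <= t0:
        return 0.0
    x = t - t0
    scale = D / (sigma * math.sqrt(2.0 * math.pi) * x)
    return scale * math.exp(-((math.log(x) - mu) ** 2) / (2.0 * sigma ** 2))


def sigma_lognormal(t, components):
    """Scalar sum of lognormal components, each given as (D, t0, mu, sigma)."""
    return sum(lognormal_velocity(t, *c) for c in components)


# Hypothetical single component: 60 deg amplitude, onset at 0.1 s
dt = 0.001
times = [k * dt for k in range(5000)]
profile = [lognormal_velocity(t, 60.0, 0.1, -0.5, 0.3) for t in times]
area = sum(profile) * dt  # numerical integral, approximates D
peak_time = times[profile.index(max(profile))]
```

The area under one component approximates its amplitude `D`, and its peak lies at t0 + exp(μ − σ²), earlier than the median t0 + exp(μ), which is what produces the asymmetry of the profile.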

This study investigates the possibility of characterizing the turn cranio-caudal signature *via* a sigma-lognormal model using the head relative to the trunk velocity profile derived from the orientation data assessed with AHRS. Specifically, this paper aims at (i) presenting and illustrating the methods required for head-trunk signature recognition based on AHRS recording of motion and (ii) evaluating the robustness and the reliability of the identified cranio-caudal signature parameters.

## **MATERIALS AND METHODS**

### **Protocol and Instrument**

The experimental protocol of the present study is based on the execution of a 10-m Timed-Up-and-Go (TUG). The TUG is a clinically recognized test to assess mobility and balance which combines basic mobility tasks (sit-to-stand, walk and turn) (Rehabilitation Institute of Chicago, 2010). Upon signal, the participant stands up, walks out to the 10-m mark, turns around, and walks back to the initial seated position (**Figure 1A**).

To enable assessment of kinematics, participants are instrumented with the IGS-180 suit (Synertial Ltd., UK) containing 17 AHRS (OS3D, Inertial Labs, USA) as shown in **Figure 1B**. Each AHRS measures raw inertial signals (segment linear acceleration, angular velocity and magnetic fields) and derives the orientation of the module, and hence the orientation of the segment it is attached to, in a global reference frame (**Figure 1C**). A validity study performed on this system revealed an acceptable accuracy and an excellent agreement for both the head and trunk sensors when compared with an optoelectronic gold standard during a turn (Lebel et al., 2017). The IGS-180 enables acquisition of data (raw inertial data and orientation data) over its 17 sensors at 60 Hz. Sensor to body alignment, required to express the sensor movement into anatomical planes of reference, is performed with the participant standing in a neutral position (standing up, looking straight ahead with palms facing their thighs) at the beginning of each trial.

## **Signal Processing**

**Figure 1D** gives an overview of the global workflow of the algorithm, including the signal processing. Trials are manually reviewed and segmented using the avatar in IGS-Bio, the application available with the IGS-180. Specifically, the procedure described below was followed to ensure systematic segmentation of the turns:


iv. localization of the beginning of the next gait cycle (i.e., heel strike following realignment)*→End of turn;*

All trials were segmented by the same evaluator in order to avoid bias. Further signal processing is performed in Matlab v2015a (MathWorks, USA). For each trial, the relative orientation of the head to the trunk is computed and expressed in anatomical planes of reference. The resulting relative angle signals are then filtered using a fourth order low-pass Butterworth filter with a cutoff frequency of 1.5 Hz. The cutoff frequency was determined from a residual-based analysis of the relative orientation signal, using an acceptable threshold of 2° and was performed over repeated trials (Carbonneau et al., 2013). The residual threshold was based on the reported accuracy of orientation data obtained with the present system (Lebel et al., 2013, 2017). For each trial, the cutoff frequency that yielded the acceptable residual threshold was calculated. The final cutoff frequency was calculated from the mean and SD values obtained over repeated trials analysis to cover 95% of the cases. The resulting filtered angle profile was then transferred back into its quaternion form and used to compute the relative angular velocity profile.

Let us define

$$\begin{aligned} \theta &\quad \text{as the rotation angle, and} \\ \hat{\boldsymbol{u}} &= u\_x \mathbf{i} + u\_y \mathbf{j} + u\_z \mathbf{k} \quad \text{as the unit vector, expressed with the Cartesian axes } \mathbf{i}, \mathbf{j}, \mathbf{k} \end{aligned} \tag{1}$$

Then, the quaternion may be expressed as:

$$q = \cos\left(\frac{\theta}{2}\right) + \left(u\_x \mathbf{i} + u\_y \mathbf{j} + u\_z \mathbf{k}\right) \sin\left(\frac{\theta}{2}\right) \tag{2}$$

$$\underline{q} = \begin{bmatrix} \cos\left(\frac{\theta}{2}\right) \\ u\_{\mathbf{x}} \sin\left(\frac{\theta}{2}\right) \\ u\_{\mathbf{y}} \sin\left(\frac{\theta}{2}\right) \\ u\_{\mathbf{z}} \sin\left(\frac{\theta}{2}\right) \end{bmatrix} \tag{3}$$
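As an illustration of Eqs. 1–3, the following minimal Python sketch builds the quaternion of Eq. 3 from an angle and an axis, and forms a head-relative-to-trunk quaternion. The function names and the choice of expressing the head in the trunk frame as `conj(q_trunk) * q_head` are assumptions made for this example; the convention actually used by the acquisition software is not stated in the text.

```python
import math

def quat_from_axis_angle(theta, axis):
    """Eq. 3: quaternion (w, x, y, z) for a rotation of theta radians about axis."""
    ux, uy, uz = axis
    norm = math.sqrt(ux * ux + uy * uy + uz * uz)  # normalise so the axis is a unit vector
    s = math.sin(0.5 * theta) / norm
    return (math.cos(0.5 * theta), ux * s, uy * s, uz * s)

def quat_conjugate(q):
    w, x, y, z = q
    return (w, -x, -y, -z)

def quat_multiply(a, b):
    """Hamilton product a * b."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)

def relative_quaternion(q_trunk, q_head):
    """Head orientation expressed in the trunk frame (assumed convention)."""
    return quat_multiply(quat_conjugate(q_trunk), q_head)

def axis_angle_from_quat(q):
    """Inverse of Eq. 3: recover (theta, unit axis) from a unit quaternion."""
    w, x, y, z = q
    s = math.sqrt(x * x + y * y + z * z)
    if s < 1e-12:
        return 0.0, (0.0, 0.0, 1.0)  # no rotation: the axis is arbitrary
    return 2.0 * math.atan2(s, w), (x / s, y / s, z / s)

# Trunk turned 30 deg and head turned 50 deg about the vertical axis.
q_trunk = quat_from_axis_angle(math.radians(30.0), (0.0, 0.0, 1.0))
q_head = quat_from_axis_angle(math.radians(50.0), (0.0, 0.0, 1.0))
theta_rel, axis_rel = axis_angle_from_quat(relative_quaternion(q_trunk, q_head))
print(math.degrees(theta_rel), axis_rel)
```

In this example both segments rotate about the same axis, so the relative rotation is simply the difference of the two angles.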

The angular velocity of the head relative to the trunk (ω) can then be determined by Eq. 4 (Rico-Martinez and Gallardo-Alvarado, 2000).

$$\boldsymbol{\omega} = \dot{\theta}(t)\hat{\boldsymbol{u}}(t) + \dot{\hat{\boldsymbol{u}}}(t)\sin(\theta(t)) + \hat{\boldsymbol{u}}(t) \times \dot{\hat{\boldsymbol{u}}}(t)\left(1 - \cos(\theta(t))\right) \tag{4}$$

The axial component of the angular velocity, corresponding to the axial velocity profile of the head relative to the trunk, is then available to be used for further signature analysis.
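The angular velocity expression of Eq. 4 can be sketched in a few lines of Python. This is only an illustration: the time derivatives are approximated here with central differences on uniformly sampled data (the 60 Hz rate is taken from the acquisition description), which is an assumption of this example and not necessarily the differentiation scheme used in the study. All function names are hypothetical.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def angular_velocity(theta, theta_dot, u, u_dot):
    """Eq. 4: theta_dot*u + u_dot*sin(theta) + (u x u_dot)*(1 - cos(theta))."""
    c = cross(u, u_dot)
    s = math.sin(theta)
    k = 1.0 - math.cos(theta)
    return tuple(theta_dot * u[i] + s * u_dot[i] + k * c[i] for i in range(3))

def central_difference(samples, dt):
    """Derivative of uniformly sampled scalars (one-sided at both ends)."""
    n = len(samples)
    out = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        out.append((samples[hi] - samples[lo]) / ((hi - lo) * dt))
    return out

# Constant-rate rotation about a fixed vertical axis, sampled at 60 Hz.
dt = 1.0 / 60.0
thetas = [0.5 * i * dt for i in range(10)]  # rotation angle in rad, 0.5 rad/s
theta_dots = central_difference(thetas, dt)
omega = angular_velocity(thetas[5], theta_dots[5], (0.0, 0.0, 1.0), (0.0, 0.0, 0.0))
print(omega)  # the third (axial) component carries the whole rotation rate
```

When the rotation axis is fixed, the last two terms of Eq. 4 vanish and the angular velocity reduces to the angle rate along the axis, which gives a convenient sanity check.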

### **Conceptual Framework and Parameters of Turn Signature**

The optimal turn cranio-caudal sequence generates a change in the relative angular orientation of the head to the trunk, the two segments being realigned upon completion of the transition. The turn cranio-caudal signature conceptual framework, therefore, has two main components: the analysis of the relative head to trunk maximum angle reached during the turn and the investigation of the relative angular velocity profile derived from it *via* the sigma-lognormal model approach.

#### Relative Angular Velocity Profile Analysis

According to the Kinematic Theory, the impulse response of the neuromuscular system (NMS) can be identified by analyzing the characteristics of the movement itself. If it is assumed that the NMS encompasses the motor cortex down to the muscles, all neuronal activities processed prior to the NMS consequently translate into a delay in the impulse command sent to the system. The NMS itself is made of multiple motor units which can be modeled as non-linear sub-systems organized in such a way that allows them to work efficiently (Plamondon, 1995a; Djioua, 2007). The impulse response of such a linearized system follows an asymmetric positive bell-shaped curve described by a lognormal function. If one considers the control strategy of a movement from an energy point of view, the velocity of the end effector becomes the basic unit of the motion and should, therefore, follow a lognormal profile. Thus, Plamondon and his team proposed and validated the use of the sigma-lognormal model on the velocity profile to analyze human motion during scripted signature (Plamondon, 1995a; Plamondon et al., 2003; Djioua, 2007; Djioua and Plamondon, 2009; O'Reilly and Plamondon, 2009; Javier et al., 2013).

Here, we use the sigma-lognormal model to characterize the turn cranio-caudal signature. The two segments involved (head and trunk) can be seen as two NMSs, each one having its own lognormal impulse response. The output of each of these systems will, therefore, follow a lognormal profile for simple movements. In our study, we are interested in analyzing a more complex NMS, the head-trunk system, whose output can be seen as the vectorial summation of the outputs of both basic systems. Specifically, the cranio-caudal velocity profile can be decomposed into two phases corresponding to the moment the head initiates the turn, moving away from the trunk (phase 1) and the moment the trunk engages into the turn, closing the gap with the head (phase 2). We can, therefore, mathematically describe this complex system as the subtraction of the two illustrated velocity profiles (**Figure 2A**; Eq. 5). The impulse response of the NMS is a lognormal (Plamondon et al., 2003), asymmetric bell-shaped curve (**Figure 2B**) whose exact representation follows the equation in the insert and depends upon the magnitude of the commanded signal (D), the time occurrence of this command (t0), the execution delay (µ) and the response time (σ). The latter two are defined on a log scale. Indeed,

$$\begin{aligned} \left| \vec{v}(t) \right| &= \left| \sum\_{i=1}^{2} \vec{v}\_{i} \left( t, t\_{0i}, \mu\_{i}, \sigma\_{i}^{2} \right) \right| \cong D\_{H} \Lambda\_{H} \left( t, t\_{0H}, \mu\_{H}, \sigma\_{H}^{2} \right) - D\_{T} \Lambda\_{T} \left( t, t\_{0T}, \mu\_{T}, \sigma\_{T}^{2} \right); \text{ and} \\ \Lambda\_{i} \left( t, t\_{0i}, \mu\_{i}, \sigma\_{i}^{2} \right) &= \frac{1}{\sigma\_{i} \left( t - t\_{0i} \right) \sqrt{2\pi}} e^{\left( \frac{\left[ \ln \left( t - t\_{0i} \right) - \mu\_{i} \right]^{2}}{-2\sigma\_{i}^{2}} \right)} \end{aligned} \tag{5}$$

where *t*<sub>0*i*</sub> is the time of occurrence of the *i*th input command; µ is the log time delay of the NMS, the time delay on a logarithmic scale; σ is the log response time of the NMS, the response time on a logarithmic scale; and D is the amplitude of the command sent to the NMS.
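The two-phase model of Eq. 5 can be written down directly. The sketch below is a minimal Python illustration, assuming the standard lognormal with σ² in the denominator of the exponent; the function names and the numerical parameter values are hypothetical and chosen only to produce a plausible-looking profile.

```python
import math

def lognormal(t, t0, mu, sigma):
    """Lambda(t, t0, mu, sigma^2) of Eq. 5; zero before the command occurs at t0."""
    if t <= t0:
        return 0.0
    x = t - t0
    gauss = math.exp(-((math.log(x) - mu) ** 2) / (2.0 * sigma ** 2))
    return gauss / (sigma * x * math.sqrt(2.0 * math.pi))

def head_trunk_velocity(t, head, trunk):
    """D_H*Lambda_H - D_T*Lambda_T; head and trunk are (D, t0, mu, sigma) tuples."""
    d_h, t0_h, mu_h, s_h = head
    d_t, t0_t, mu_t, s_t = trunk
    return d_h * lognormal(t, t0_h, mu_h, s_h) - d_t * lognormal(t, t0_t, mu_t, s_t)

# Illustrative (not measured) parameters: the trunk command is issued later.
head = (25.0, 0.0, -0.6, 0.3)
trunk = (25.0, 0.6, -0.6, 0.3)
profile = [head_trunk_velocity(k / 60.0, head, trunk) for k in range(120)]
print(max(profile), min(profile))
```

With equal amplitudes for both phases, the positive lobe (head moving away from the trunk) and the negative lobe (trunk closing the gap) enclose the same area, so the relative angle returns to zero at the end of the turn.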

The lognormal equation parameters may be calculated from specific points of the velocity profile (**Figure 2C**) using Eqs. 6–9 (Djioua, 2007; Djioua and Plamondon, 2009; O'Reilly and Plamondon, 2009).

$$\frac{t\_{P3} - t\_{P1}}{t\_{P5} - t\_{P1}} = \frac{e^{-\sigma^2} - e^{-3\sigma}}{e^{3\sigma} - e^{-3\sigma}} \to \sigma \tag{6}$$

$$\mu = \ln\left(\frac{t\_{P4} - t\_{P2}}{e^{-\left(1.5\sigma^2 - \sigma\sqrt{0.25\sigma^2 + 1}\right)} - e^{-\left(1.5\sigma^2 + \sigma\sqrt{0.25\sigma^2 + 1}\right)}}\right) \tag{7}$$

$$t\_0 = t\_{P3} - e^{\mu}e^{-\sigma^2} \tag{8}$$

$$D = \sqrt{2\pi} v\_{P3} e^{\mu} \sigma e^{\left(\sigma^{4}/2\sigma^{2} - \sigma^{2}\right)} \tag{9}$$
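Eqs. 6–9 can be chained into a small estimation routine. The Python sketch below is illustrative only: Eq. 6 has no closed-form solution for σ, so it is solved here by bisection, which is an assumption of this example (the text does not say which numerical method was used). The point labels follow the text: P1 and P5 are the start and end of the movement, P3 is the velocity maximum, and P2 and P4 are the inflection points. The function names are hypothetical.

```python
import math

def timing_ratio(sigma):
    """Right-hand side of Eq. 6 as a function of sigma."""
    return ((math.exp(-sigma ** 2) - math.exp(-3.0 * sigma))
            / (math.exp(3.0 * sigma) - math.exp(-3.0 * sigma)))

def solve_sigma(t_p1, t_p3, t_p5, lo=1e-6, hi=3.0, iterations=200):
    """Bisection on Eq. 6; the ratio decreases from 0.5 toward 0 as sigma grows to 3."""
    target = (t_p3 - t_p1) / (t_p5 - t_p1)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if timing_ratio(mid) > target:
            lo = mid  # ratio still too large, so sigma must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lognormal_parameters(t_p1, t_p2, t_p3, t_p4, t_p5, v_p3):
    """Return (D, t0, mu, sigma) from the five characteristic points (Eqs. 6-9)."""
    sigma = solve_sigma(t_p1, t_p3, t_p5)
    root = sigma * math.sqrt(0.25 * sigma ** 2 + 1.0)
    a2 = 1.5 * sigma ** 2 + root  # exponent of the first inflection point
    a4 = 1.5 * sigma ** 2 - root  # exponent of the second inflection point
    mu = math.log((t_p4 - t_p2) / (math.exp(-a4) - math.exp(-a2)))  # Eq. 7
    t0 = t_p3 - math.exp(mu) * math.exp(-sigma ** 2)                # Eq. 8
    d = (math.sqrt(2.0 * math.pi) * v_p3 * math.exp(mu) * sigma
         * math.exp(-0.5 * sigma ** 2))                             # Eq. 9
    return d, t0, mu, sigma
```

A simple way to check such a routine is a round trip: generate the five points from known parameters, then verify that the same parameters are recovered.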

Indeed, from the velocity signal it is possible to identify the time at which the motion is initiated and terminated, the time at which the maximum velocity is reached as well as both inflection points. These points are first identified for phase 1 of the motion. The lognormal model parameters are then derived from these points and the phase 1 response is estimated. A similar process is followed for phase 2, allowing a full reconstruction of the velocity signal (**Figure 2D**). From the estimated lognormal equation parameters, it is also possible to deduce further characteristics of the lognormal impulse response which could help interpret the NMS. The time delay (*t̄*), defined as the rapidity at which the system responds to the command, and the time response (*s*), corresponding to the time it takes the system to react and execute the movement, are defined by Eqs. 10 and 11, respectively (Plamondon et al., 2003).


$$\overline{t} = \int\_{t\_0}^{+\infty} t \Lambda \left( t, t\_0, \mu, \sigma^2 \right) dt = t\_0 + e^{\mu + 0.5\sigma^2} \tag{10}$$

$$\begin{aligned} s &= \sqrt{\int\_{t\_0}^{+\infty} \left( t - \overline{t} \right)^2 \Lambda \left( t, t\_0, \mu, \sigma^2 \right) dt} = \sqrt{e^{2\mu + \sigma^2} \left( e^{\sigma^2} - 1 \right)} \\ &= \left( \overline{t} - t\_0 \right) \sqrt{\left( e^{\sigma^2} - 1 \right)} \end{aligned} \tag{11}$$
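The closed forms of Eqs. 10 and 11 are the mean and the standard deviation of the lognormal impulse response. As a hedged illustration, the Python sketch below evaluates both closed forms and compares them with a brute-force midpoint integration; the parameter values, the integration bounds and the function names are assumptions made for this example.

```python
import math

def lognormal(t, t0, mu, sigma):
    """Lognormal impulse response; zero before the command occurs at t0."""
    if t <= t0:
        return 0.0
    x = t - t0
    gauss = math.exp(-((math.log(x) - mu) ** 2) / (2.0 * sigma ** 2))
    return gauss / (sigma * x * math.sqrt(2.0 * math.pi))

def time_delay(t0, mu, sigma):
    """Eq. 10: mean time of the impulse response."""
    return t0 + math.exp(mu + 0.5 * sigma ** 2)

def response_time(mu, sigma):
    """Eq. 11: standard deviation of the impulse response."""
    return math.exp(mu + 0.5 * sigma ** 2) * math.sqrt(math.exp(sigma ** 2) - 1.0)

def numerical_moments(t0, mu, sigma, span=10.0, n=100000):
    """Midpoint-rule estimates of the mean and standard deviation over [t0, t0 + span]."""
    dt = span / n
    times = [t0 + (k + 0.5) * dt for k in range(n)]
    weights = [lognormal(t, t0, mu, sigma) * dt for t in times]
    mean = sum(t * w for t, w in zip(times, weights))
    var = sum((t - mean) ** 2 * w for t, w in zip(times, weights))
    return mean, math.sqrt(var)

t0, mu, sigma = 0.1, -0.6, 0.3  # illustrative values only
print(time_delay(t0, mu, sigma), response_time(mu, sigma))
print(numerical_moments(t0, mu, sigma))
```

The two printed pairs should agree closely, which confirms that the response time is computed around the mean *t̄* and not around zero.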

Finally, the quality of the reconstructed signature is evaluated using a signal-to-noise ratio (SNR) approach described in Eq. 12, as proposed by O'Reilly and Plamondon (2009).

$$\text{SNR} = 20 \log \left( \frac{\int\_0^{t\_{\text{end}}} v^2(t) dt}{\int\_0^{t\_{\text{end}}} [v(t) - \hat{v}(t)]^2 dt} \right) \tag{12}$$

In Eq. 12, *v* corresponds to the measured velocity profile, while *v̂* is the reconstructed or estimated profile.
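For sampled data, the integrals of Eq. 12 become sums, and with a uniform sampling step the step size cancels in the ratio. The Python sketch below is a minimal illustration under that assumption. The prefactor is exposed as a parameter and defaults to the value printed in Eq. 12, and the logarithm is assumed to be base 10; the function name is hypothetical.

```python
import math

def snr_db(measured, reconstructed, scale=20.0):
    """Discrete form of Eq. 12 for uniformly sampled velocity profiles."""
    if len(measured) != len(reconstructed):
        raise ValueError("profiles must have the same number of samples")
    signal = sum(v * v for v in measured)
    noise = sum((v - v_hat) ** 2 for v, v_hat in zip(measured, reconstructed))
    if noise == 0.0:
        return math.inf  # perfect reconstruction
    return scale * math.log10(signal / noise)

# A reconstruction that is uniformly 10 % too small has 1 % of the signal energy as error.
measured = [0.0, 0.4, 1.2, 2.0, 1.1, 0.3, 0.0]
reconstructed = [0.9 * v for v in measured]
print(snr_db(measured, reconstructed))
```

A quality-control rule such as the one applied later in the analysis then amounts to comparing the returned value with the chosen threshold.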

#### Experimental Concept Overview

The complete set of metrics proposed for characterization of the turn cranio-caudal signature is summarized in **Table 1**. In order for these parameters to be of true interest, they must be reliable and robust to natural variations in task velocity.

### **Detailed Experimental Protocol and Participants**

The robustness and reliability of the proposed approach were tested on a sample of older adults. The project was approved by the Centre de Recherche de l'Institut Universitaire de Gériatrie de Montreal ethics board and participants provided written informed consent. Sixteen asymptomatic adults aged between 55 and 83 years old (mean age = 69.1 years, 50% female, height = 1.61 ± 0.08 m, weight = 63.2 ± 10.1 kg; BMI = 24.3 ± 3.2 kg/m<sup>2</sup>) participated in the study. Participants performed repeated 10-m TUGs equipped with the IGS-180, as explained in Section "Protocol and Instrument." TUGs were executed both at normal and fast paces, each condition being repeated twice.

### **Traditional Metrics**

For comparison purposes, data were also analyzed using traditional metrics. As such, the accelerometer signal from the trunk AHRS was analyzed to determine the number of steps the participants took during the turn (Salarian et al., 2010). Analysis of the number of steps is based on a threshold on the acceleration measured by the trunk sensors. Validity of the method was assessed by visual comparison over five trials. Mean and maximum angular velocities obtained during the turn were computed using the angular velocity data provided by the trunk AHRS' gyroscope (Salarian et al., 2009, 2010; Mancini et al., 2015a).

### **Data Analysis**

For each trial, the introduced cranio-caudal signature metrics were calculated along with the traditional turn parameters.

A quality control process ensured that only the trials with an SNR greater than 10 dB were kept. The selected threshold is slightly lower than the generally accepted rule for SNR in controlled experiments (usually 15 dB), but this threshold was shown to be satisfactory in this specific context. Indeed, this slightly more permissive SNR takes into account the complexity of the experiment and the possible sources of uncertainty, such as the manual segmentation of the turn from the TUG task. The effects of velocity on the different metrics as well as their reliability were then analyzed. The robustness of the cranio-caudal turn signature metrics to natural task-related velocity variations and their reliability over repeated trials are important properties that need to be established before their validity can be further explored. All statistical analyses were performed using SPSS (v23.0.0 from IBM) and considered a significance level of 0.05.

#### Velocity Effect and Reliability

Each participant performed four TUGs (two at a normal pace, two at a fast pace). The effect of velocity on the metrics was, therefore, evaluated by taking the mean of each metric per participant and velocity and comparing them using a Wilcoxon signed-rank test. Reliability was assessed using a two-way random, absolute, average-measures intra-class correlation coefficient (Weir, 2005) performed on the repeated measurement of each metric [i.e., ICC(2,4) for absolute agreement]. The following guidelines were used for interpretation (Koo and Li, 2016):


### **RESULTS**

The ability of the sigma-lognormal model to estimate the cranio-caudal signature is shown in **Figure 3**. The left panel of this figure illustrates the variation in relative head to trunk angle captured during the turn for a healthy individual. The right panel corresponds to the relative head to trunk angular velocity profile for the same turn (blue curve: measured; red dotted curve: reconstructed profile using the sigma-lognormal approach). Analysis of the SNR revealed a median of 17.7 [14.6, 26.6], confirming the ability of the model to fit the data.

The robustness of the proposed parameters to velocity variations as well as their reliability shall now be verified.

### **Velocity Effect**

Normal pace TUGs were significantly slower than fast pace TUGs (normal pace TUG duration: 20.3 ± 2.8 s; fast pace TUG duration: 17.0 ± 1.7 s; *p* = 0.001). **Figure 4** illustrates the turn's cranio-caudal signature captured for the same healthy individual performing a normal pace and a fast pace TUG.

The dispersion of the cranio-caudal signature metrics (H2Tmax and *D*<sub>1,2</sub>) across participants is shown in **Figure 5**. The averaged peak head to trunk angle reached during the turn varied from 25.6° ± 8.9° for normal pace TUG to 24.5° ± 8.4° for fast pace trials, a difference not statistically significant (*p* = 0.683). The commanded amplitudes computed for normal pace and fast pace trials were not statistically different (*D*<sub>1</sub> normal pace: 24.8 ± 12.3, *D*<sub>1</sub> fast pace: 28.5 ± 11.0, *p* = 0.470; *D*<sub>2</sub> normal pace: 29.2 ± 11.0, *D*<sub>2</sub> fast pace: 24.1 ± 10.1, *p* = 0.124). Similarly, the pace of the trials did not have any significant effect on the timing parameters (*t*<sub>01</sub> normal pace: −4.40 ± 6.30 s, *t*<sub>01</sub> fast pace: −4.37 ± 5.56 s, *p* = 0.836; *t*<sub>02</sub> normal pace: −7.0 ± 6.3 s, *t*<sub>02</sub> fast pace: −4.5 ± 3.4 s, *p* = 0.198; *t̄*<sub>1</sub> normal pace: 0.62 ± 0.15 s, *t̄*<sub>1</sub> fast pace: 0.57 ± 0.21 s, *p* = 0.363; *t̄*<sub>2</sub> normal pace: 1.34 ± 0.22 s, *t̄*<sub>2</sub> fast pace: 1.17 ± 0.32 s, *p* = 0.158; *s*<sub>1</sub> normal pace: 0.28 ± 0.09 s, *s*<sub>1</sub> fast pace: 0.26 ± 0.06 s, *p* = 0.638; *s*<sub>2</sub> normal pace: 0.23 ± 0.05 s, *s*<sub>2</sub> fast pace: 0.22 ± 0.06 s, *p* = 0.198). For comparison purposes, **Figure 6** illustrates the dispersion observed across participants for the traditional turn metrics. Both the number of steps (NbSteps normal pace: 3.9 ± 0.8, NbSteps fast pace: 3.9 ± 0.7, *p* = 0.685) and the mean turn velocity (turnvelmean normal pace: 1.54 ± 0.25 rad/s, turnvelmean fast pace: 1.53 ± 0.15 rad/s, *p* = 0.925) were not significantly affected by velocity.
However, the maximum velocity was significantly different (turnvelmax normal pace: 3.83 ± 0.40 rad/s, turnvelmax fast pace: 4.08 ± 0.42 rad/s, *p* = 0.009).

#### **Reliability**

Reliability was assessed for all repeated trials performed by the participants (i.e., normal and fast trials). **Table 2** reports the ICC for each metric together with their 95% confidence intervals. Cranio-caudal signature metrics were shown to have moderate to good reliability, with ICCs varying from 0.64 to 0.81. Furthermore, it was found that both traditional turn velocity metrics (mean and max turn velocity) had a moderate agreement while the number of steps revealed a poor reliability.

**FIGURE 3** | Cranio-Caudal Signature Determination. The proposed cranio-caudal signature approach is composed of both the analysis of the relative head to trunk angle achieved during the turn and the head to trunk relative angular velocity profile, modeled with the sigma-lognormal approach. **(A)** Change in head to trunk relative angle during a normal turn. The maximum angle reached is identified as a signature variable. **(B)** The blue curve illustrates the relative head to trunk angular velocity profile during the turn, as derived from the attitude and heading reference system measurement. The red dotted line illustrates the reconstructed profile, using the sigma-lognormal model. The parameters used to achieve the reconstruction are listed as inserts.

**FIGURE 4** | Turn cranio-caudal signature for a normal pace **(A,C)** and a fast pace Timed-Up-and-Go (TUG) **(B,D)**, executed by the same healthy participant. **(A,B)** Relative head to trunk angle variation captured during the turns. **(C,D)** Measured and estimated relative head to trunk angular velocity profile captured during the turns along with the computed signature parameters.

**TABLE 2** | Turn metrics reliability.

### **DISCUSSION**

The current study demonstrated for the first time that it is possible to successfully capture the cranio-caudal signature from the relative angular velocity profile deduced from the AHRS orientation data. In past studies, a cranio-caudal sequence was identified using camera-based stereophotogrammetric systems (Ferrarin et al., 2006; Crenna et al., 2007; Hong et al., 2009; Wright et al., 2012; Spildooren et al., 2013; Hulbert et al., 2015). These studies predominantly assessed the temporal sequence in which the segments (head, trunk and pelvis) are engaged in turning as well as the maximum angle reached by the head relative to the trunk and pelvis. In a study comparing recurrent fallers to non-fallers performing a 360° on-spot turning task, Wright et al. (2012) showed that all participants initiated the turn by rotating the head and that the extent of that head rotation is greater in non-fallers. Additionally, in a population with Parkinson's disease, it was also shown that both the temporal cranio-caudal sequence and the maximum rotation of the head to the trunk are altered compared to controls, reflecting the so-called "en-bloc" strategy (Ferrarin et al., 2006; Crenna et al., 2007; Hong et al., 2009; Spildooren et al., 2013). Hence, it has been well demonstrated that the cranio-caudal sequence exhibited during the turn contains useful information. However, it is also documented that camera-based systems have restrictions (cost, required volume of operation, occlusions) which limit their use in clinical settings (Zhou and Hu, 2008). Alternatively, inertial measurement systems have the portability required to be used outside laboratory settings, but the type of information provided by these systems is different, and thus requires data to be analyzed differently. Orientation data, expressed in a global reference frame, allow us to measure the change in orientation of the head relative to the trunk.
In this study, we investigated the possibility of capturing and characterizing the cranio-caudal signature from the orientation data provided by AHRS using a two-step process. First, the relative head to trunk angular profile is analyzed to assess the maximum angle reached. Then, the relative angular velocity profile of the head to the trunk is derived from that relative orientation information and investigated with the sigma-lognormal model. While orientation and inertial data (acceleration and angular velocity) can be used to directly characterize the turn, the choice to use a model is based on the assumption that this model will provide insights into the NMS which will help understand mobility deficits. The model has already been proven to be linked to the NMS in different situations, but had never been used on relative angular velocity. The combined analysis of the maximum relative head to trunk angle with a sigma-lognormal approach on the velocity profile of this joint, therefore, presents a promising avenue to enable cranio-caudal signature analysis with AHRS.

In order for the approach to be truly of interest, the signature metrics have to be reliable and robust to speed variations. Comparing the metrics computed during fast TUG to the ones computed for the TUG performed at normal pace has shown that velocity does not produce significant variations in the metrics. These results are in agreement with Akram et al. (2010) who demonstrated, using a camera-based system, that the cranio-caudal timing sequence is robust to walking speed variations. Furthermore, the metrics have shown moderate to good reliability over the four repeated trials. At this point, it is difficult to relate the results to other published work as this is, to our knowledge, the first time a similar approach has been used to characterize the cranio-caudal sequence. For comparison purposes, traditional metrics were also captured during each trial. These metrics (number of steps, mean turn velocity and max turn velocity) correspond to the metrics currently most used in the literature to characterize the turn behavior using inertial measurement systems (Greene et al., 2010; Salarian et al., 2010; Zampieri et al., 2010; El-Gohary et al., 2013; Galán-Mercant and Cuesta-Vargas, 2014; Sheehan et al., 2014; Mancini et al., 2015b, 2016; Zakaria et al., 2015; Smith et al., 2016; Vervoort et al., 2016). Both the number of steps and the mean turn velocity were robust to a change in speed, but the maximum turn velocity was found to be significantly higher at fast pace TUG. According to Hulbert et al. (2015), the number of steps taken during a turn relates to the strategy adopted to perform that turn. The results from the current study illustrate that the turn strategy itself was not modified with TUG speed. In the literature, turn duration was identified as an objective biomarker of the ability of the neural control system to perform postural transitions (Horak and Mancini, 2013).
Therefore, the observed increase in maximum turn velocity with increasing TUG pace, combined with the constant mean velocity, can be interpreted as an adaptive strategy to maintain the same turn duration, denoting a good ability to change motor program among the participants. However, from these results, we must be cautious when interpreting a difference in maximum velocity to differentiate populations, as the extent of the difference may also be due to a speed difference. If the instruction is not standardized (e.g., "perform the test as fast but safely as possible"), results of the maximum velocity may be biased. With respect to reliability, those traditional metrics performed worse than the signature metrics. The number of steps even showed poor reliability as assessed with an ICC. Previously, Salarian et al. (2010) had reported a strong agreement for that same metric. The difference may be explained by the small variation between individuals within our sample. Indeed, the number of steps required to perform a 180° turn lacks variability in the current study as participants were all healthy older adults. Salarian et al. (2010) used both healthy controls and Parkinson's disease patients to test for reliability, increasing the variability between individuals. In the near future, the test-retest reliability of the cranio-caudal signature parameters could be re-evaluated using a similar approach to enable better comparison with the literature. The lack of variability between healthy individuals is a good thing when trying to differentiate two groups with clearly different behavior (e.g., Parkinson's disease patients versus healthy controls). However, it raises concerns regarding the sensitivity to change of such a metric. The better reliability of the cranio-caudal signature metrics observed between individuals suggests a better resolution of the metrics, offering the potential for a better sensitivity to change.
If true, such metrics could be useful to monitor changes in motor control with age or disease progression within individuals. One limit to this study is the fact that the proposed cranio-caudal signature methodology was directly validated using an inertial system which is known to have a certain inaccuracy. In a recent study, it was demonstrated that the segment of interest here had a mean root-mean-squared difference between 3.1° and 4.4° during a turn, with peak values around 6° (Lebel et al., 2017). However, peak error will occur around maximum velocity which, in the case of the sigma-lognormal model, is defined by Eq. 13 below. The impact of this inaccuracy on timing parameters is minor as the reported agreement is good. Inaccuracy in the *V*<sub>max</sub> measurement could, however, result in inaccuracy in the estimation of parameter *D*. Nevertheless, recalling that the effect of the pace of the trial on *D* was shown to be not statistically significant across individuals, it can be assumed that the model is robust to the measurement inaccuracies:

$$V\_{\text{max}} = \frac{D}{\sigma \sqrt{2\pi}} e^{(-\mu + 0.5\sigma^2)}.\tag{13}$$
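As a brief check of this expression, and assuming the lognormal of Eq. 5 with σ² in the denominator of the exponent, the peak velocity of a single phase is obtained by evaluating the scaled lognormal at its mode, that is at the time of the maximum given by Eq. 8, *t* − *t*<sub>0</sub> = *e*<sup>µ−σ²</sup>:

$$\begin{aligned} V\_{\text{max}} &= D \Lambda \left( t\_0 + e^{\mu - \sigma^2}, t\_0, \mu, \sigma^2 \right) = \frac{D}{\sigma \sqrt{2\pi} e^{\mu - \sigma^2}} e^{-\frac{\left[ \left( \mu - \sigma^2 \right) - \mu \right]^2}{2\sigma^2}} \\ &= \frac{D}{\sigma \sqrt{2\pi}} e^{-\mu + \sigma^2} e^{-0.5\sigma^2} = \frac{D}{\sigma \sqrt{2\pi}} e^{-\mu + 0.5\sigma^2} \end{aligned}$$

Solving for the amplitude gives $D = \sqrt{2\pi} V\_{\text{max}} \sigma e^{\mu - 0.5\sigma^2}$, which is the form of Eq. 9 with the velocity at P3 taken as the maximum. For fixed µ and σ, a relative error in the measured peak velocity therefore carries over one-to-one to the estimated amplitude.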

Now that we have established the required methodology to derive the cranio-caudal signature based on AHRS data and verified the reliability of the metrics, there is a possibility of applying it to different populations to verify the sensitivity of the metrics.

The proposed algorithm allows for the characterization of the quality of a turn using AHRS in an innovative manner. It also demonstrates the power of orientation data assessed with AHRS. The full potential of such an approach will only be reached when combined with automatic recognition and segmentation of activities (Nguyen et al., 2015; Ayachi et al., 2016a,b). Additionally, this work also shows that the sigma-lognormal model can be used to fit the cranio-caudal signature. Although this model has been proven well-suited for rapid (Plamondon et al., 2014) and slow movements (Duval et al., 2015) in different situations, the movement of the head to the trunk during the turn is somewhat different and it was previously unclear if such a model could be applied here. The present results confirm this hypothesis. However, further validation of the model in this specific context of use would be beneficial in order to provide a deeper understanding of the parameter values in this particular framework.

## **CONCLUSION**

The present study has shown that the cranio-caudal signature during the turn can be captured using AHRS and a sigma-lognormal model. Metrics deduced from the signature profile were shown to be robust to speed variations and reliable. Comparison with traditional turn metrics leads us to believe that the proposed approach is a promising avenue to enhance the early identification of deficits.

## **ETHICS STATEMENT**

Participants gave their informed consent following the procedure approved by the Centre de Recherche de l'Institut Universitaire de Gériatrie de Montreal ethics board.

## **AUTHOR CONTRIBUTIONS**

KL developed the algorithm, designed the analysis, and drafted the manuscript. HN provided significant feedback on the analysis of the study and the manuscript. RP provided substantial feedback on the use of the Sigma-Lognormal model and its interpretation and reviewed the manuscript. CD conceived the experiment, helped in data interpretation, and reviewed the paper. PB helped in the conception of the algorithm, the interpretation of the data, and reviewed the analysis and the manuscript.

## **FUNDING**

This work was conducted as part of the Ecological Mobility in Aging and Parkinson (EMAP) research group supported by a Canadian Institute of Health Research (CIHR) team in Mobility in Aging grant: Quantifying, characterizing, and modeling the whole-body mobility of individuals in their natural environment; from normal aging to Parkinson's disease. KL was financially supported for this study by the Fonds de recherche du Québec – Santé (FRQS) and the Research Centre on Aging.

## **REFERENCES**

with Parkinson's disease? *J. Neurol. Phys. Ther.* 36, 25–31. doi: 10.1097/NPT.0b013e31824620d1


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2017 Lebel, Nguyen, Duval, Plamondon and Boissy. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Cellular Level In-silico Modeling of Blood Rheology with An Improved Material Model for Red Blood Cells

Gábor Závodszky<sup>1,2</sup>\*, Britt van Rooij<sup>1</sup>, Victor Azizi<sup>1</sup> and Alfons Hoekstra<sup>1,3</sup>

<sup>1</sup> Computational Science Lab, Faculty of Science, Institute for Informatics, University of Amsterdam, Amsterdam, Netherlands, <sup>2</sup> Department of Hydrodynamic Systems, Budapest University of Technology and Economics, Budapest, Hungary, <sup>3</sup> ITMO University, Saint Petersburg, Russia

Many of the intriguing properties of blood originate from its cellular nature. Therefore, accurate modeling of blood flow related phenomena requires a description of the dynamics at the level of individual cells. This, however, presents several computational challenges that can only be addressed by high performance computing. We present Hemocell, a parallel computing framework which implements validated mechanical models for red blood cells and is capable of reproducing the emergent transport characteristics of such a complex cellular system. It is computationally capable of handling large domain sizes, thus it is able to bridge the cell-based micro-scale and macroscopic domains. We introduce a new material model for resolving the mechanical responses of red blood cell membranes under various flow conditions and compare it with a well established model. Our new constitutive model has similar accuracy under relaxed flow conditions; however, it performs better for shear rates over 1,500 s<sup>−1</sup>. We also introduce a new method to generate randomized initial conditions for dense mixtures of different cell types free of initial positioning artifacts.

#### Edited by:

Timothy W. Secomb, University of Arizona, United States

#### Reviewed by:

Panagiotis Dimitrakopoulos, University of Maryland, College Park, United States Dmitry A. Fedosov, Forschungszentrum Jülich, Germany

#### \*Correspondence:

Gábor Závodszky g.zavodszky@uva.nl

#### Specialty section:

This article was submitted to Computational Physiology and Medicine, a section of the journal Frontiers in Physiology

Received: 20 April 2017 Accepted: 19 July 2017 Published: 02 August 2017

#### Citation:

Závodszky G, van Rooij B, Azizi V and Hoekstra A (2017) Cellular Level In-silico Modeling of Blood Rheology with An Improved Material Model for Red Blood Cells. Front. Physiol. 8:563. doi: 10.3389/fphys.2017.00563

Keywords: blood rheology, RBC material model, cellular flow, high-performance computing, dense cell initialization

## 1. INTRODUCTION

On the cellular level, blood is a dense suspension of various types of cells. Red blood cells (RBCs) form the primary component, with an approximate volume fraction of 42% (Davies and Morris, 1993), and determine the bulk blood rheology. They have a biconcave shape and a typical diameter of 8 µm. Platelets (PLTs), the second most numerous component with typically 1 PLT for every 10 RBCs (Björkman, 1959), form the link between transport dynamics and vital biochemical processes related to thrombus formation. In their unactivated state PLTs have a rigid ellipsoidal form. The collective behavior of RBCs and PLTs can explain the most fundamental transport phenomena in blood, for instance the non-Newtonian viscosity (Merrill and Pelletier, 1967), the margination of platelets (Beck and Eckstein, 1980; Tilles and Eckstein, 1987), the Fåhræus effect (Barbee and Cokelet, 1971), the appearance of a cell-free layer (Maeda et al., 1996; Kim et al., 2009), or the scaling of shear-induced diffusion of RBCs (Mountrakis et al., 2016). The necessity to accurately reproduce these effects grows as the typical length-scale of the examined system reaches below ≈ 200 µm, at which point the macroscopic description no longer yields accurate local dynamics (Popel and Johnson, 2005). With the development of modern medical devices more and more elements reside in the micrometer domain, such as the strut structure of flow-diverters (Lubicz et al., 2010) or woven endobridge (WEB) flow disruptor devices (Ding et al., 2011). This, together with additional complex phenomena that require detailed cellular modeling of the flow, for instance platelet aggregation (Nesbitt et al., 2009) or white blood cell (WBC) trafficking (Fay et al., 2016), triggers an increasing need to understand how the rheology and the transport of the RBCs and PLTs are influenced while acting over such small scales.

In the solutions targeting these questions the mechanical responses of the RBCs and PLTs are often expressed with constitutive models applied through their membranes accounting for the responses of the various structural elements (Ye et al., 2016). Some examples for these material models are the spectrin-link membrane model of Dao et al. (2006) or the energy model of Skalak et al. (1973). Fedosov et al. (2010a) employed the dissipative particle dynamics (DPD) method with a constitutive description gained by coarse-graining the model of Dao et al. (2006) to study various transport features of blood (Fedosov et al., 2010b, 2011b,c; Fedosov and Gompper, 2014; Yazdani and Karniadakis, 2016). A low dimensional RBC membrane model was developed by Pan et al. (2010) and was compared to the coarse-grained spectrin-link model of Fedosov et al. (2010a). More recently, a two-component RBC membrane model that consists of a separate lipid bilayer and spectrin network (Chang et al., 2016) was introduced to examine the difference in the deformation of healthy and infected RBCs. MacMeccan et al. (2009) developed a model that coupled the lattice Boltzmann method (LBM) to finite element method (FEM) based cell mechanics and investigated the viscosity behavior of blood in shear flows at various hematocrit levels. Later, Reasor et al. (2012) used the spectrin-link membrane representation rather than solving Cauchy's equation to model the trajectory and deformation of elastic deformable particles. This model was also used to study the margination of platelets (Mehrabadi et al., 2016). Moreover, Krüger et al. (2011) developed a combination of LBM and a finite element RBC membrane model based on the energy model of Skalak et al. (1973), and used the immersed boundary method (IBM) to couple the fluid and the membrane. This model was used to study the tank-treading behavior of single RBCs next to the deformation behavior and the relative viscosity of RBC suspensions (Krüger et al., 2013; Gross et al., 2014; Krüger, 2016). In addition, Shi et al. used LBM in combination with the fictitious-domain method to couple the plasma to the spectrin-link membrane model. They studied the deformation of an RBC in capillary flows, during tank-treading motion and hydrodynamic interaction between two cells (Shi et al., 2014). Hashemi and Rahnama (2016) investigated the deformation of RBCs in capillary flows with an LBM-FEM based model with IBM coupling.

In this paper, our framework called Hemocell (High pErformance MicrOscopic CELlular Library)<sup>1</sup> is presented for modeling the flow of blood on a cellular level. Hemocell is designed to be easily extendible with additional cell-types and interactions and to provide the high computational performance that enables applications up to macroscopic scales. Blood plasma is represented as a continuous fluid simulated with LBM, while the cells are represented as discrete element method (DEM) membranes coupled to the fluid flow by the immersed boundary method. Furthermore, two different material models for the RBC membrane mechanics have been investigated. One is the aforementioned coarse-grained spectrin-link model of Fedosov et al. (2010a); the other is a new model that addresses several shortcomings of the former. The validation of Hemocell, in combination with our new RBC material model, is presented through single-cell mechanical experiments (i.e., stretching and shearing cases). We demonstrate that the proposed new material model reproduces both the single-cell mechanical responses and the collective transport dynamics in very good agreement with experiments, and that it provides an accurate mechanical response and an increased structural stability under higher shear forces and strong deformations. The latter is necessary, since it is known from recent high-field-strength MRI measurements of Bouvy et al. (2016) that pulsation effects are significant even on the mesoscopic level of smaller arterioles. Moreover, it can also enable simulations of transport mechanisms in micro-fluidic settings or in the vicinity of micro-medical devices, where strong deformations and high shear values and gradients can be expected. Hemocell also forms a fundamental component in building versatile multi-scale models of arterial health and diseases (Hoekstra et al., 2016).

## 2. METHODS

The solvent (blood plasma) in Hemocell is modeled as an incompressible Newtonian fluid using the lattice Boltzmann method implemented in the Palabos library (Lagrava et al., 2012), which is known to be capable of producing accurate flow results in vascular settings (Závodszky and Paál, 2013; Anzai et al., 2014). The surfaces of RBCs and PLTs are described as boundary layers immersed into the plasma. These layers are discretized using N<sub>v</sub> vertices which are connected by N<sub>e</sub> edges yielding N<sub>t</sub> surface triangles (see **Figure 1** for an example in case of an RBC and a PLT). The connectivity and symmetries are similar to the structure of the cytoskeleton as imaged by atomic force microscopy (Swihart et al., 2001; Liu et al., 2003). In our simulations the membrane of each RBC consisted of N<sub>v</sub> = 642 vertices, N<sub>e</sub> = 1,920 edges, and N<sub>t</sub> = 1,280 faces. The mechanical behavior of a cell is expressed using this discrete membrane structure. The response to deformations is formulated as a set of forces acting on the cell membrane, which is coupled to the plasma flow through a validated in-house immersed-boundary implementation (Mountrakis et al., 2014; Mountrakis, 2015) that has an efficient parallel design. Mountrakis et al. (2015) demonstrated that the framework can be scaled up to 10<sup>6</sup> cells executing on 8,192 cores without significant loss of parallel efficiency.

<sup>1</sup>https://www.hemocell.eu/ (Accessed July 25, 2017).
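As a quick consistency check on the quoted mesh size: any closed, sphere-like triangulated surface satisfies 2E = 3F and Euler's relation V − E + F = 2. The sketch below verifies this for the quoted counts. The helper `icosphere_counts` is a hypothetical illustration that assumes a recursively subdivided icosahedron, which happens to reproduce the same numbers at three subdivision levels; the text does not state how the mesh is actually generated.

```python
def icosphere_counts(levels):
    """Vertex, edge, and face counts of an icosahedron after `levels` 1-to-4 subdivisions.

    Assumption: an icosphere-type mesh; the text only gives the final counts.
    """
    faces = 20 * 4 ** levels
    edges = 30 * 4 ** levels
    vertices = 10 * 4 ** levels + 2
    return vertices, edges, faces


def is_closed_sphere_like(vertices, edges, faces):
    """True if the counts satisfy 2E = 3F and V - E + F = 2."""
    return 2 * edges == 3 * faces and vertices - edges + faces == 2


n_v, n_e, n_t = 642, 1920, 1280  # values quoted in the text
print(is_closed_sphere_like(n_v, n_e, n_t))
print(icosphere_counts(3) == (n_v, n_e, n_t))
```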

### 2.1. Description of the Coarse-Grained Spectrin-Link Membrane Model

In Hemocell, two distinct constitutive models have been implemented for RBCs to act on the membrane mesh. The first one is based on the systematic coarse-graining of the model of Dao et al. (2006). For the detailed derivation we refer to the work of Fedosov et al. (2010a). The model is briefly outlined below:

The free-energy of a cell is described as

$$U_{total} = U_{in-plane} + U_{bend} + U_{volume} + U_{area} + U_{visc}.$$

The location x<sub>i</sub> of each vertex on the membrane mesh is updated according to the force:

$$F_i = \frac{\partial U_{total,i}}{\partial x_i}.$$

The total free-energy is composed of the following elements:

1. The in-plane potential models the compression response of the underlying cytoskeletal network along the membrane surface. The edges of the surface triangles represent the cumulative behavior of the local spectrin links using the wormlike chain (WLC) nonlinear spring description:

$$\begin{aligned} U_{in-plane} &= \sum_{i=1..N_e} U_{WLC} + \sum_{j=1..N_t} \frac{C_q}{A_j^q}, \\ U_{WLC} &= \frac{k_B T l_m}{4p} \frac{3r_i^2 - 2r_i^3}{1 - r_i}, \\ C_q &= \frac{\sqrt{3} A_{l_0}^2 k_B T (4r_0^2 - 9r_0 + 6)}{4p l_m (1 - r_0)^2}, \end{aligned}$$

where p and l<sub>m</sub> are the persistence length and the maximum length of the spectrin links, r<sub>i</sub> = l<sub>i</sub>/l<sub>m</sub> ∈ [0, 1), l<sub>0</sub> is the average length of links, r<sub>0</sub> = l<sub>0</sub>/l<sub>m</sub>, and A<sub>l<sub>0</sub></sub> = √3 l<sub>0</sub><sup>2</sup>/4.

2. The potential to account for bending rigidity:

$$U_{bend} = \sum_{i=1..N_e} \tilde{\kappa} [1 - \cos(\theta_i - \theta_0)],$$

where κ̃ = 2κ/√3, θ<sub>i</sub> is the instantaneous and θ<sub>0</sub> is the equilibrium angle between neighboring faces sharing an edge, and κ is the bending constant.

3. The volume conservation energy is a fictitious potential which accounts for the forces arising from the change of volume:

$$U_{volume} = \frac{k_V (V - V_0)^2}{2V_0},$$

where V is the current and V<sub>0</sub> is the equilibrium volume of the cell.

4. The area conservation potential is similarly a non-physical term representing the inextensible nature of the bilipid layer:

$$U_{area} = \frac{k_A (A - A_0)^2}{2A_0} + \sum_{k=1\dots N_t} \frac{k_{A_l} (A_k - A_{0,k})^2}{2A_{0,k}},$$

where A and A<sub>0</sub> are the global, and A<sub>k</sub> and A<sub>0,k</sub> the local, actual and equilibrium surface areas, respectively.

5. The additional term to correct membrane viscosity:

$$U_{visc} = \sum_{i=1\dots N_e} -\frac{1}{2} \eta_m v_{m,n}^2,$$

where v<sub>m,n</sub> denotes the relative velocity of the vertices m and n connected by edge i, and the membrane viscosity η<sub>m</sub> = 22 × 10<sup>−3</sup> Pa·s is chosen such that RBCs yield realistic tank-treading and tumbling frequencies (Fedosov et al., 2014).
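To make the worm-like chain term concrete, the following minimal sketch evaluates U<sub>WLC</sub> for a single link together with its analytic derivative with respect to the link length. The function names and the numerical values in the usage lines are illustrative assumptions (dimensionless units with k<sub>B</sub>T = 1), not parameters of Hemocell.

```python
def wlc_energy(length, l_max, p, kbt=1.0):
    """Worm-like chain energy of one link, as in the U_WLC expression above."""
    r = length / l_max
    if not 0.0 <= r < 1.0:
        raise ValueError("link length must satisfy 0 <= l < l_max")
    return kbt * l_max / (4.0 * p) * (3.0 * r ** 2 - 2.0 * r ** 3) / (1.0 - r)


def wlc_force(length, l_max, p, kbt=1.0):
    """dU_WLC/dl, i.e. the magnitude of the restoring force of one link."""
    r = length / l_max
    if not 0.0 <= r < 1.0:
        raise ValueError("link length must satisfy 0 <= l < l_max")
    return kbt / p * (0.25 / (1.0 - r) ** 2 - 0.25 + r)


# Illustrative values only: l_max = 1, p = 0.1, k_B T = 1.
for l in (0.1, 0.5, 0.9, 0.99):
    print(l, wlc_energy(l, 1.0, 0.1), wlc_force(l, 1.0, 0.1))
```

The force grows without bound as the link length approaches l<sub>m</sub>, which is what limits the extension of each edge in this model.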

The free parameters of this model (κ = 100 k<sub>B</sub>T, k<sub>V</sub> = 6,000, k<sub>A</sub> = 5,900, k<sub>A<sub>l</sub></sub> = 100) were adopted from Fedosov et al. (2010a) with the exception of the maximum link extension ratio (r<sub>0</sub> = l<sub>0</sub>/l<sub>max</sub> = 2.6), which was fine-tuned for our current discretization. The usefulness of this model was demonstrated in a series of publications (Fedosov et al., 2010a, 2011a; Fedosov and Gompper, 2014; Mehrabadi et al., 2016). However, it also has a few shortcomings. The bending response is based on Helfrich's model (Helfrich, 1973), which only accounts for the properties of the lipid bilayer and not the underlying structures. Furthermore, the coarse-graining of the bending rigidity for the triangulated mesh is based on the work of Gompper and Kroll (1996), which assumes small angles and equilateral triangles, both of which are often not fulfilled for sheared RBCs. As a consequence, the bending energy in this model yields a sinusoidal force response that becomes sub-linear for angles over π/4 and decays for even larger angles. This can lead to insufficient force responses and consequently to acute angles or collapse of neighboring faces. The resulting problems can often be mitigated by using a linear bending response that fits the slope of the sinusoidal response at low angles. Moreover, the global surface conservation potential can lead to unphysical responses, since a local stretch of the membrane instantly causes the contraction of the rest of the membrane, forcing the surface points to move toward the center of the cell.
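The weakening of the sinusoidal bending response can be illustrated numerically. Differentiating κ̃[1 − cos(θ − θ<sub>0</sub>)] with respect to the angle gives a response proportional to sin(θ − θ<sub>0</sub>); the sketch below compares it with a linear response of the same initial slope. The function names are hypothetical and κ̃ is set to 1 purely for illustration.

```python
import math


def sinusoidal_response(dtheta, kappa_tilde=1.0):
    """Derivative of kappa_tilde * (1 - cos(dtheta)) with respect to dtheta."""
    return kappa_tilde * math.sin(dtheta)


def linear_response(dtheta, kappa_tilde=1.0):
    """Linear response matching the slope of the sinusoidal one at dtheta = 0."""
    return kappa_tilde * dtheta


for dtheta in (math.pi / 12, math.pi / 4, math.pi / 2, 3 * math.pi / 4):
    ratio = sinusoidal_response(dtheta) / linear_response(dtheta)
    print(f"dtheta = {dtheta:.3f} rad, sinusoidal/linear = {ratio:.3f}")
```

At π/4 the sinusoidal response has already dropped to about 90% of the linear one, and beyond π/2 it decreases in absolute terms, which is the regime in which neighboring faces can fold onto each other.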

### 2.2. Description of the New Constitutive Model

We propose a new material model in the form of a set of forces acting on the same triangulated cellular membrane. The initial assumption for this model is that during small deformations all these forces present a linear regime with different slopes as the response types correspond to different components of the cell and are independent of each other. However, for large enough deformations the cytoskeleton adds contribution to all of them, resulting in qualitatively similar behavior. For instance, a response for small bending is assumed to be dominated by the curvature rigidity of the bilipid membrane resulting in a term linear in angle for the DEM membrane, while for larger deformation the underlying cytoskeleton deforms as well yielding an additional quickly diverging term. In the following we describe this model in two steps by separating the phenomenological description and the implementation.

1. The link force acts along links between surface points and represents the response to stretching and compression of the underlying spectrin-network beneath the representative link. The formulation of the force is similar in spirit to the worm-like-chain potential model often used to mimic the mechanical properties of polypeptide chains. It presents a linear part which corresponds to smaller deformations and a fast-diverging non-linear part which represents the limits of the material toward this type of deformation by quickly increasing the force response as the stretch approaches the persistence-length.

$$F_{link} = -\frac{\kappa_l dL}{p} \left[ 1 + \frac{1}{\tau_l^2 - dL^2} \right],$$

where dL = (L<sub>i</sub> − L<sub>0</sub>)/L<sub>0</sub> is the normal strain defined as the relative deviation from the equilibrium length L<sub>0</sub>. The value τ<sub>l</sub> = 3.0 is chosen based on the assumption that the represented spectrin network reaches its persistence length at the relative expansion ratio of 3. The persistence length of a spectrin filament was taken as p = 7.5 nm (Li et al., 2005).

2. The bending force acts between two neighboring surface elements representing the bending response of the membrane arising primarily from the non-zero thickness of the spectrin network. On each surface it points along the normal direction of that surface. As opposed to the previous model in which bending is expressed by modeling the bending rigidity of the bilipid membrane (Helfrich, 1973), the form of the employed force term here is similar to the form of the previous link force to account for increased resistances coming from additional sources, such as the connection of the membrane to the underlying cytoskeleton.

$$F_{bend} = -\frac{\kappa_b d\theta}{L_0} \left[ 1 + \frac{1}{\tau_b^2 - d\theta^2} \right],$$

where dθ = θ<sub>i</sub> − θ<sub>0</sub>. From simple geometric considerations it follows that the limiting angle τ<sub>b</sub> scales with the discretization length of the surface elements (L<sub>0</sub>). We fix the smallest representable curvature radius r<sub>min</sub> = L<sub>0</sub>/(2 sin(τ<sub>b</sub>/2)). From the micro-pipette aspiration images of Mohandas and Evans (1994) a rough approximation for the necessary curvature radius of 0.2 µm can be inferred by examining the membrane curvature at the pipette neck. For the currently employed resolution (L<sub>0</sub> = 0.5 µm) the limiting angle is chosen to be τ<sub>b</sub> = π/6. This choice prevents unrealistic sharp surface edges while allowing curvature radii as small as 0.18 µm to be represented.

3. The local surface conservation force acts locally on surface elements (i.e., triangles) and has the same form. It represents the combined surface response of the supporting spectrin network and the lipid bilayer of the membrane to stretching and compression. This force is applied to all three vertices of each face and it points toward the centroid of the corresponding surface triangle.

$$F_{area} = -\frac{\kappa_a dA}{L_0} \left[ 1 + \frac{1}{\tau_a^2 - dA^2} \right],$$

where dA = (A<sub>i</sub> − A<sub>0</sub>)/A<sub>0</sub>. Strong-deformation experiments of erythrocytes show that at around 40% of surface area change the membrane of most cells is damaged permanently (Li et al., 2013). We set τ<sub>a</sub> = 0.3, thus prohibiting surface area changes larger than 30%.

4. The volume conservation force is the only global term. It is used to maintain quasi-incompressibility of the cell. It is applied at each node of each surface element and it points along the normal of the surface.

$$F_{volume} = -\frac{\kappa_v dV}{L_0} \left[ \frac{1}{\tau_v^2 - dV^2} \right],$$

where dV = (V − V<sub>0</sub>)/V<sub>0</sub>, τ<sub>v</sub> = 0.01, and κ<sub>v</sub> = 20 k<sub>B</sub>T is chosen to be a large but numerically still stable constant.

This constitutive model has three free parameters for RBC modeling: κ<sub>l</sub>, κ<sub>b</sub>, and κ<sub>a</sub>. These are chosen to satisfy mechanical single-cell experimental results.
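All four force magnitudes above share one functional form: a term linear in the relative deformation plus a term that diverges as the deformation approaches its limit τ. A minimal sketch of that common form follows; the function name, the `linear_term` switch (set to 0 for the volume force, which has no separate linear part), and the unit choices (k<sub>B</sub>T for κ, nm or µm for the length scale) are illustrative assumptions rather than Hemocell's implementation.

```python
def nonlinear_response(kappa, strain, tau, length_scale, linear_term=1.0):
    """Evaluate -kappa * strain / length_scale * [linear_term + 1 / (tau^2 - strain^2)].

    `strain` is the relative deformation (dL, dtheta, dA, or dV) and `tau`
    its limiting value; the expression is only defined for |strain| < tau.
    """
    if abs(strain) >= tau:
        raise ValueError("deformation reached or exceeded its limit tau")
    return -kappa * strain / length_scale * (linear_term + 1.0 / (tau ** 2 - strain ** 2))


# Link force with tau_l = 3.0 and p = 7.5 nm from the text; kappa_l = 15 k_B T
# is the fitted value reported later in the validation section.
for d_l in (0.1, 1.0, 2.0, 2.9):
    print(d_l, nonlinear_response(15.0, d_l, 3.0, 7.5))

# Volume force: no separate linear term, tau_v = 0.01, kappa_v = 20 k_B T, L_0 = 0.5 um.
print(nonlinear_response(20.0, 0.005, 0.01, 0.5, linear_term=0.0))
```

For small deformations the response is close to linear, while near the limit the second term dominates, so the same function covers both regimes described in the phenomenological assumptions.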

### 2.3. Implementation of the Constitutive Forces for the New Model

The proposed forces can be realized in multiple ways on the given DEM structure, thus the implementation method is an inseparable part of the model. **Figure 2I** aids this description by showing a notation for two neighboring surface elements.

1. For each edge e<sub>i</sub>, i ∈ [1..N<sub>e</sub>], the link force F<sub>link</sub> is added to the total force acting on the end nodes of that edge (i.e., the IBM particles). Following the notation of **Figure 2I** for the edge between the nodes v<sub>1</sub> and v<sub>2</sub>, the resulting link forces are:

$$
\vec{F}_{link_{v_1}} = F_{link} \ast \frac{\vec{v}_2 - \vec{v}_1}{\|\vec{v}_2 - \vec{v}_1\|} = -\vec{F}_{link_{v_2}}.
$$

2. The bending force is applied for each edge e<sub>i</sub>, i ∈ [1..N<sub>e</sub>], on the four nodes of the two connecting surface elements. For the edge between the nodes v<sub>1</sub> and v<sub>3</sub>:

$$
\vec{F}_{bend_{v_k}} = -F_{bend} \ast \vec{n}_k, \quad k \in [1, 2]
$$

$$
\vec{F}_{bend_{v_l}} = F_{bend} \ast \frac{\vec{n}_1 + \vec{n}_2}{2}, \quad l \in [3, 4].
$$

3. The local surface conservation force acts on each surface element f<sub>j</sub>, j ∈ [1..N<sub>t</sub>]. For the face with the normal vector n<sub>1</sub> and centroid C:

$$
\vec{F}_{area_{v_m}} = F_{area} \ast (\vec{C} - \vec{v}_m), \quad m \in [1, 2, 3].
$$

4. The volume conservation force is applied on the three nodes of each surface element:

$$
\vec{F}_{volume_j} = F_{volume} \ast \frac{S_j}{S_{avg}} \ast \vec{n}_j,
$$

where S<sub>j</sub> is the surface area of the j-th element and S<sub>avg</sub> is the average surface area.
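The distribution of the scalar force magnitudes onto mesh nodes can be sketched for the two simplest cases, the link force and the local area force. The code follows the vector expressions above literally, including their sign conventions; the function names and the plain-list vector representation are assumptions made for illustration and do not reflect Hemocell's data structures.

```python
import math


def link_node_forces(f_link, v1, v2):
    """Forces on the two end nodes of an edge for a scalar link force f_link."""
    delta = [b - a for a, b in zip(v1, v2)]
    norm = math.sqrt(sum(c * c for c in delta))
    force_v1 = [f_link * c / norm for c in delta]
    force_v2 = [-c for c in force_v1]  # equal and opposite
    return force_v1, force_v2


def area_node_forces(f_area, triangle):
    """Forces on the three vertices of a face, directed along (centroid - vertex)."""
    centroid = [sum(vertex[k] for vertex in triangle) / 3.0 for k in range(3)]
    return [[f_area * (centroid[k] - vertex[k]) for k in range(3)] for vertex in triangle]


f1, f2 = link_node_forces(-2.0, [0.0, 0.0, 0.0], [0.0, 0.0, 3.0])
print(f1, f2)

tri = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
print(area_node_forces(0.5, tri))
```

In both cases the nodal contributions cancel when summed over the edge or the face, so these terms add no net force to the cell as a whole.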

### 2.4. Validation of the Mechanical RBC Responses

The free parameters of our mechanical RBC membrane model are fit to match the results of the optical-tweezer stretching experiments (Mills et al., 2004) and the Wheeler shear experiment (Yao et al., 2001). The single-cell deformation types during the measurements are shown in **Figure 3**.

In the optical-tweezers experiment small silica beads are attached on the opposing sides of the RBC. One is then fixed to the wall of the experimental container while the other is moved away by a focused laser beam. The arising forces result in a stretching of the RBC along the longitudinal axis and contraction along the transversal axes. In our simulation the same force magnitudes are used as in the experiment. They are applied on five percent of the membrane area on the opposing ends of the RBC. These areas represent the attachment surfaces of the silica beads.

The stretching curves of the two material models implemented in Hemocell are compared to the experiment of Mills et al. (2004) and to the results of Fedosov et al. (2010a) and are shown in **Figure 4**. Both constitutive membrane models can reproduce the stretching behavior of a single RBC in the given forcing regime with good accuracy. However, since the responses of the different force types are more balanced in our new model (i.e., it is less likely, that one force will dominate over the others during deformation), while the spectrin-link model is over-dominated by the in-plane link-force, our model captures the transversal contraction at higher stretches with more accuracy.

In the Wheeler experiment performed by Yao et al. (2001) an RBC is positioned in shear flow such that the axis of symmetry of the cell lies in the plane of the shear and is perpendicular to the flow velocity. The deformation of the RBC is then inferred from measuring its laser diffraction pattern in the flow. We numerically compute the behavior of a single RBC placed in pure shear flow with shear rates between 17 and 200 s<sup>−1</sup>, in accordance with the experiment. The deformation index of the RBC is defined as given in Yao et al. (2001):

$$DI = \frac{(D_{\max}/D_0)^2 - 1}{(D_{\max}/D_0)^2 + 1},$$

where D<sub>0</sub> is the original diameter of the RBC (7.82 µm) and D<sub>max</sub> is the maximal diameter during the deformation at a constant shear rate value. The results are compared to the experimental results and to simulated results of MacMeccan et al. (2009) in **Figure 5**. Both material models give a deformation index that is in agreement with the experiment of Yao et al. (2001) and with the simulations of MacMeccan et al. (2009). It is important to point out that the numerical accuracy of this simulation is more sensitive to the fluid-structure coupling compared to the previous stretching scenario; therefore, a close match with the measurements implies an accurate coupling between the plasma flow and the cell membranes. Additionally, the numerical limit of shear rate is tested for both material models with this setting. In our implementation the new material model could resist higher sustained shear rates (γ̇<sub>max</sub> = 2,500 s<sup>−1</sup>) than the spectrin-link model (γ̇<sub>max</sub> = 500 s<sup>−1</sup>) before the RBC collapsed due to insufficient force response arising from numerical errors (for the onset of such an error see the inset image in **Figure 5**).
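The deformation index is straightforward to evaluate from a measured or simulated maximal diameter. A minimal sketch, with a hypothetical function name and the undeformed diameter from the text as default:

```python
def deformation_index(d_max, d_0=7.82):
    """Deformation index DI from the maximal and the undeformed diameter (same units)."""
    ratio_sq = (d_max / d_0) ** 2
    return (ratio_sq - 1.0) / (ratio_sq + 1.0)


# Illustrative diameters in micrometers, not simulation results.
for d_max in (7.82, 8.5, 9.5, 11.0):
    print(d_max, round(deformation_index(d_max), 3))
```

An undeformed cell gives DI = 0, and DI approaches 1 only in the limit of an infinitely elongated cell.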

The fit to the experimental results yields κ<sub>l</sub> = 15 k<sub>B</sub>T, κ<sub>a</sub> = 5 k<sub>B</sub>T, and κ<sub>b</sub> = 80 k<sub>B</sub>T. These values are used throughout this work for the new constitutive model. Evans (1983) measured the bending modulus to be in the order of 50 k<sub>B</sub>T, not far from our κ<sub>b</sub>. Additionally, with the selected κ<sub>a</sub> value the local surface extensions under physiological flow conditions are smaller than the set limit of 30%, typically below 7%, which agrees with the literature (Fung, 1993). Please note that the selection of these parameters is not unique; other sets might exist that also fit the single-cell experimental results well with the proposed mechanical model. To infer further material characteristics of the model, we employed a simulation to deform a single hexagonal patch of the membrane (for an overview of the applied deformations see **Figure 2II**). The uniaxial stretching yields a surface Young modulus of E<sub>s</sub> = 27.82 µN/m. Assuming that the major deformation response arises from the membrane (the bilipid layer and the spectrin and actin filaments) and that its width is in the range of 25–50 nm (Gov and Safran, 2005; Yoon et al., 2009), the typical Young modulus for small deformations E = 1 kPa of healthy RBCs (Maciaszek and Lykotrafitis, 2011) gives a surface tensile modulus of E<sub>s</sub> = 25–50 µN/m, in agreement with our results. The shear deformation of the patch yields µ = 10.87 µN/m, close to the upper region of the measured range of 6–10 µN/m (Mohandas and Evans, 1994; Park et al., 2011). Finally, area expansion gives a compression modulus of K = 21.88 µN/m, near the reported range of 18–20 µN/m (Park et al., 2011). Assuming homogeneous isotropic linear behavior (which only holds for small deformations), the relation between the elastic constants yields a Poisson's ratio of 0.29, in the vicinity of the expected value of 1/3.
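The conversion from the bulk Young modulus to the surface modulus used in the comparison above is E<sub>s</sub> = E·h, with h the membrane thickness. The sketch below only reproduces this unit conversion; the function name is hypothetical.

```python
def surface_modulus_un_per_m(youngs_modulus_pa, thickness_nm):
    """Surface modulus E_s = E * h, returned in micronewtons per meter."""
    newton_per_meter = youngs_modulus_pa * thickness_nm * 1e-9
    return newton_per_meter * 1e6


# E = 1 kPa with the two ends of the quoted thickness range.
for h_nm in (25.0, 50.0):
    print(h_nm, surface_modulus_un_per_m(1e3, h_nm))
```

The simulated value of 27.82 µN/m lies inside the resulting interval.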

An emergent ability of RBCs traveling in small, confined flows, arising from their unique material properties, is their deformation to parachute-like shapes (Noguchi and Gompper, 2005). This behavior is necessary to pass small micro-capillaries of diameters below the diameter of the undeformed RBC (Tsukada et al., 2001). **Figure 6** shows an example of a simulated RBC that deforms toward this shape in a tight channel computed with the new material model.

### 2.5. Generating Cell Initial Conditions

An important component of simulating blood flows on a cellular level is the selection of initial conditions for the cells, such as position and orientation. These are far from trivial since, due to the biconcave shape of RBCs, their volume (≈ 71 µm<sup>3</sup>) compared to the volume of their enclosing box (≈ 224 µm<sup>3</sup>) is low. Using the densest possible packing along a regular grid thus yields a hematocrit of 32%, which is often inadequate as it does not reach the level of physiologic hematocrit of human blood. A further issue is the need for a randomized distribution to avoid initial artifacts originating from the regular positioning and orientations. To circumvent these difficulties an additional kinetic simulation was developed to compute realistic initial distributions even at high hematocrit values. Instead of the real biconcave shapes, the enclosing ellipsoids of the RBCs were used to execute a simple kinetic process for hard ellipsoid packing. The so-called force-bias model (Mościński et al., 1989; Bargieł and Mościński, 1991; Bezrukov et al., 2002) was applied to these enclosing ellipsoids. The algorithm proceeds as follows. The positions of the centers of the cells are randomly distributed in the simulation domain. Next, two scaling variables are defined for every cell type (e.g., RBC, platelet): d<sub>in</sub> represents the largest possible scaling in the system without any overlap between the cells, while d<sub>out</sub> is initially set so that the merged volume (counting overlapping volumes only once) of all the ellipsoids scaled with it equals the total volume of the enclosing ellipsoids corresponding to the desired hematocrit level. Then, a repulsive force is applied between overlapping ellipsoids, proportional to the volume of the overlapping regions:

$$
\vec{F}_{ij} = \delta_{ij} p_{ij} \frac{\vec{r}_j - \vec{r}_i}{|\vec{r}_j - \vec{r}_i|},
$$

where δ<sub>ij</sub> equals 1 if there is an overlap between particles i and j and 0 otherwise, while p<sub>ij</sub> is a chosen potential function. In our case, the potential function was selected to be proportional to the overlapping volume of the d<sub>out</sub>-scaled particles. The positions are updated following Newtonian mechanics where mass is proportional to the particle scaling radius. This ensures that larger particles will move slower than smaller ones (i.e., an RBC will push away a platelet rather than the other way around). The rearrangement of the cells has a tendency to increase d<sub>in</sub>. As a final step the size of d<sub>out</sub> is reduced every iteration according to a chosen contraction rate τ. The computation stops when d<sub>out</sub> ≤ d<sub>in</sub>, at which point the system is force-free, since there are no overlaps. Using this method, we were able to push the initial hematocrit value up to 46%, covering the physiological range. Additionally, we can fix the orientation of the cells by only allowing translation of their center of mass during this computation, thus predefining the alignment of the particles. This is beneficial for initializing higher velocity flows where the cells are expected to be lined up with the bulk flow direction. **Figure 7** presents two sample initial conditions generated with this method.
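The structure of the force-bias iteration can be sketched in a strongly simplified setting. The code below is not the Hemocell initializer: it assumes equal spheres instead of ellipsoids of several cell types, a periodic cubic box, a push proportional to the overlap distance rather than the overlap volume, equal masses, and an initial d<sub>out</sub> computed without merging overlapping volumes. All names and parameter values are hypothetical.

```python
import math
import random


def min_image(d, box):
    """Shortest periodic separation along one axis."""
    return d - box * round(d / box)


def force_bias_pack(n, box, target_fraction, contraction=0.99, push=0.5, seed=0):
    """Simplified force-bias packing of n equal spheres in a periodic cube."""
    rng = random.Random(seed)
    pos = [[rng.uniform(0.0, box) for _ in range(3)] for _ in range(n)]
    # Outer diameter: spheres of this size would fill target_fraction of the box.
    d_out = (6.0 * target_fraction * box ** 3 / (math.pi * n)) ** (1.0 / 3.0)
    while True:
        shift = [[0.0, 0.0, 0.0] for _ in range(n)]
        d_in = float("inf")  # largest diameter without any overlap
        for i in range(n):
            for j in range(i + 1, n):
                delta = [min_image(pos[i][k] - pos[j][k], box) for k in range(3)]
                dist = math.sqrt(sum(c * c for c in delta))
                d_in = min(d_in, dist)
                if 0.0 < dist < d_out:  # overlapping pair: push apart
                    scale = 0.5 * push * (d_out - dist) / dist
                    for k in range(3):
                        shift[i][k] += scale * delta[k]
                        shift[j][k] -= scale * delta[k]
        if d_out <= d_in:  # no overlaps left at the outer diameter
            return pos, d_in, d_out
        pos = [[(p[k] + s[k]) % box for k in range(3)] for p, s in zip(pos, shift)]
        d_out *= contraction  # shrink the outer diameter every iteration


positions, d_in, d_out = force_bias_pack(30, 10.0, 0.3)
achieved = 30 * math.pi / 6.0 * d_out ** 3 / 10.0 ** 3
print(f"final diameter {d_out:.3f}, packing fraction {achieved:.3f}")
```

The loop ends as soon as the shrinking outer diameter meets the growing inner one, mirroring the stopping criterion d<sub>out</sub> ≤ d<sub>in</sub> described above; the achieved packing fraction is generally lower than the initial target.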

It is possible to initialize simulations of up to 10<sup>6</sup> cells efficiently this way. These simulations are free from regular-grid positioning artifacts from the beginning, which in turn reduces the computational time significantly. Though the actual computational cost it saves varies by geometry, hematocrit, flow velocity, etc., in our simulations the warm-up phase needed to allow the initially regularly placed cells to arrange more realistically amounted to 10–30% of the total computational time, while with the randomized initialization this whole phase could be omitted.
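The regular-grid limit quoted at the start of this section follows directly from the two volumes given there: with one cell per enclosing box, the hematocrit cannot exceed the ratio of the cell volume to the box volume. A minimal check, with a hypothetical function name:

```python
def grid_packing_hematocrit(cell_volume, box_volume):
    """Upper bound on hematocrit when each cell occupies one enclosing box."""
    return cell_volume / box_volume


limit = grid_packing_hematocrit(71.0, 224.0)  # volumes in cubic micrometers
print(f"regular-grid limit: {100.0 * limit:.1f}%")
```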

## 3. RESULTS

Our ultimate goal of accurate mechanical modeling of cellular membranes in blood flows is to allow for the resolution of the collective transport dynamics and coupling this to relevant biochemical processes. In the following, these transport properties are explored using the new constitutive model in the cases of straight vessel sections of varying diameters. A snapshot from the simulation of the D = 128 µm case is visualized in **Figure 8**. The RBCs close to the wall experience much larger deformations than those in the center of the channel. In every simulation PLTs are present as well in a physiologic concentration (around 1/10th of the RBC cell count). Since the elastic response of the unactivated platelets is at least an order of magnitude stronger for small deformations than the response of RBCs (Haga et al., 1998), the platelets are simulated with the same constitutive model as RBCs; however, the constants κ<sub>l</sub>, κ<sub>a</sub>, κ<sub>b</sub> are multiplied by 10. These simulations also benefit from the above mentioned randomized initialization of the cells.

The first fundamental transport property examined is the apparent viscosity. The results are compared to the experimental results collected by Pries et al. (1992). These experimental results are aggregated for hematocrit levels of 20, 45, and 60% after a correction for temperature and medium viscosity. Based on these data, an empirical formula is also derived in Pries et al. (1992), which was used in the current work to produce the expected results for the hematocrit level of 30%. It can be insightful to briefly overview the general measurement method of blood viscosity in experimental settings. The hematocrit level refers to the discharge hematocrit present in the blood tank, from where the flow is directed through tubes of various diameters driven by hydrostatic pressure. The relation of the pressure and the resulting average flow velocity in the tube defines the viscosity. This is taken into account in the current simulations by translating the discharge hematocrit values to hematocrit values actually present in the tube during the measurements by applying Equation (8) from Pries et al. (1992). The simulations are initialized with zero velocity in the whole domain, after which the flow is started up and driven by an external body force. The results together with the experimental results are shown in **Figure 9**.
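One common way to turn a driving force and a mean velocity into an apparent viscosity is to invert the Hagen-Poiseuille relation for a circular tube, which for a body-force density g reads U<sub>avg</sub> = g D<sup>2</sup>/(32 µ). The sketch below assumes this relation; the text does not spell out the exact post-processing used, and the function names and the plasma viscosity value are illustrative assumptions.

```python
def apparent_viscosity(body_force_density, diameter, mean_velocity):
    """Viscosity of the Newtonian fluid that would give the same mean velocity.

    body_force_density in N/m^3, diameter in m, mean_velocity in m/s.
    """
    return body_force_density * diameter ** 2 / (32.0 * mean_velocity)


def relative_apparent_viscosity(body_force_density, diameter, mean_velocity, plasma_viscosity):
    """Apparent viscosity normalized by the viscosity of the suspending plasma."""
    return apparent_viscosity(body_force_density, diameter, mean_velocity) / plasma_viscosity


# Illustrative numbers only: pure plasma (assumed 1.2 mPa s) in a 128 um tube.
mu_plasma = 1.2e-3
g, d = 1.0e5, 128e-6
u_plasma = g * d ** 2 / (32.0 * mu_plasma)
print(relative_apparent_viscosity(g, d, u_plasma, mu_plasma))
```

For a suspension driven by the same force, a lower mean velocity than that of pure plasma translates directly into a relative apparent viscosity above 1.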

The results show good agreement with the measurements. For the simulations of H = 45%, after the initialization the undeformed cells create large clusters. In the current work, the notion of an RBC cluster refers to a group of RBCs having at least a single membrane point in touch with another RBC of the same cluster. In the initial phase of the simulations the elastic effects of these RBC clusters are perceivable as the viscosity during the first few milliseconds increases quickly, well above the expected values. This is caused by the deformation of the cells residing inside these large and dense clusters, and this behavior is one of the major components that leads to yield-stress. After a critical threshold in shape deformation they lose these stable structures and the viscosity quickly settles back to the expected level. For more details see Sections 3.1 and 3.2.
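With the cluster definition above, clusters are the connected components of the graph whose nodes are RBCs and whose edges are membrane contacts. A minimal union-find sketch follows; how contacts are detected is left out, and the function name and input format (a list of index pairs) are assumptions.

```python
def rbc_clusters(n_cells, contacts):
    """Group cell indices into clusters given pairs of touching cells."""
    parent = list(range(n_cells))

    def find(i):
        # Follow parent links to the representative, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in contacts:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for i in range(n_cells):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values(), key=len, reverse=True)


# Five cells: 0, 1, 2 touch in a chain; 3 and 4 are isolated.
print(rbc_clusters(5, [(0, 1), (1, 2)]))
```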

Another distinctive feature of cellular suspension flows is the formation of a cell-free layer (CFL) close to the walls as a result of the lift force acting on the cells. The width of the resulting cell-free zone is defined using the density distribution of the cells: it is the distance from the wall at which the density distribution, averaged along the vessel section, reaches 5%. The results are compared to the in vitro experiments of Tateishi et al. (1994) at different hematocrit levels in **Figure 10**.
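The CFL definition can be sketched as a threshold crossing on a sampled density profile. The example below assumes that the axially averaged cell density is available at increasing distances from the wall and that the 5% threshold is an absolute value of that density; the text does not specify the normalization, so this is an assumption, as are the function name and the linear interpolation.

```python
def cfl_width(wall_distance, density, threshold=0.05):
    """Distance from the wall at which the averaged cell density first
    reaches `threshold`, using linear interpolation between samples.

    `wall_distance` must be sorted in increasing order; returns None if
    the threshold is never reached.
    """
    for k, value in enumerate(density):
        if value >= threshold:
            if k == 0:
                return wall_distance[0]
            x0, x1 = wall_distance[k - 1], wall_distance[k]
            y0, y1 = density[k - 1], density[k]
            # y0 < threshold <= y1, so the denominator is positive
            return x0 + (threshold - y0) * (x1 - x0) / (y1 - y0)
    return None


if __name__ == "__main__":
    # Illustrative profile only (distances in micrometers), not simulation data
    distance = [0.5, 1.5, 2.5, 3.5]
    density = [0.0, 0.0, 0.10, 0.40]
    print(cfl_width(distance, density))
```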

While our simulated diameter range surpasses the bounds of the experimental range, the overlapping region shows good agreement for the hematocrit levels of 30 and 45%. The level of 20% does not have a directly corresponding measurement; however, it is situated between the experimental results of 16% and 30%, as expected. For a given diameter the CFL width decreases with increasing hematocrit as more RBCs are packed into the same domain volume.

Finally, to validate the flow profile in stationary flow, a straight, rectangular channel was set up to recreate the flow environment of the experimental work of Carboni et al. (2016). The hematocrit level was set to 35%, and the driving body force was calibrated to yield a volumetric flow rate matching the experiment. In **Figure 11**, the velocity profile is compared to the profile obtained from the PIV measurements.

The simulated profile fits the measurement well and has the same plug-shape along with similar widths of high-shear regions at the sides of the channel.

### 3.1. Break-up of the RBC Structures At Increasing Shear-Rates

It is a well-known phenomenon that toward low shear-rate values the viscosity of blood increases steeply (Chien, 1970). This is caused by the formation of dense clusters of RBCs, including rouleaux structures. In our simulations, aggregation interactions between cells were not included; thus, these structures arise from the various alignments and high density of the cells. This effect was investigated in the case of the D = 128 µm vessel section at H = 45%. The whole system is initialized to be still. Then, it is driven by a constant body force, and once the average velocity equilibrates (typically after a few hundred ms) the relative apparent viscosity is recorded. The shear is not constant along the radius of the vessel; however, for slow flows its local value scales approximately linearly with the average velocity. **Figure 12** shows the relative apparent viscosity of the whole vessel section at low average velocities.
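The measurement procedure described above can be sketched in a few lines. The example assumes that the average velocity is sampled at regular intervals, that equilibration is judged by the spread of the most recent samples, and that the shear scale is estimated with the Newtonian Poiseuille wall shear rate 8u/D. All names, the window length and the tolerance are hypothetical choices, not the criteria used in the simulations.

```python
import math


def has_equilibrated(velocities, window=50, rel_tol=1e-3):
    """True if the last `window` samples vary by less than `rel_tol`
    relative to their mean."""
    if len(velocities) < window:
        return False
    recent = velocities[-window:]
    mean = sum(recent) / window
    if mean == 0.0:
        return False
    return (max(recent) - min(recent)) / abs(mean) < rel_tol


def relative_apparent_viscosity(u_plasma, u_suspension):
    """For identical geometry and body force, Poiseuille's law gives
    mu_rel = u_plasma / u_suspension (ratio of mean velocities)."""
    return u_plasma / u_suspension


def newtonian_wall_shear_rate(mean_velocity, diameter):
    """Wall shear rate of Poiseuille flow, 8 u / D, used here only as a
    velocity-proportional estimate of the shear scale."""
    return 8.0 * mean_velocity / diameter


if __name__ == "__main__":
    # Synthetic start-up curve approaching a steady mean velocity
    series = [1.0 - math.exp(-t / 20.0) for t in range(400)]
    print(has_equilibrated(series[:60]), has_equilibrated(series))
    print(newtonian_wall_shear_rate(0.01, 128e-6))  # 1 cm/s in D = 128 um
```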

The relation appears to be logarithmic (see the fitted exponential decay), which is in agreement with the literature (Baskurt and Meiselman, 1997). Around the average velocity of 1 cm/s the apparent viscosity of the vessel section already settles, suggesting that the majority of the RBC structures have broken up. A further increase in velocity beyond this point results only in a minor change of the bulk viscosity.
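One simple way to fit an exponential decay to such viscosity data is a log-linear least-squares fit. The sketch below assumes the functional form eta(u) = eta_inf + A exp(-u / u0) with a known plateau value eta_inf; the exact form and fitting procedure behind **Figure 12** are not stated in the text, and the data used here are synthetic, not values read from the figure.

```python
import math


def fit_exponential_decay(u, eta, eta_inf):
    """Fit eta(u) = eta_inf + A * exp(-u / u0) for a given plateau eta_inf.

    Linearizes to ln(eta - eta_inf) = ln(A) - u / u0 and solves the
    ordinary least-squares problem. Requires eta > eta_inf for all points.
    Returns (A, u0).
    """
    ys = [math.log(e - eta_inf) for e in eta]
    n = len(u)
    mean_u = sum(u) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_u) ** 2 for x in u)
    sxy = sum((x - mean_u) * (y - mean_y) for x, y in zip(u, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_u
    return math.exp(intercept), -1.0 / slope


if __name__ == "__main__":
    # Synthetic data generated from known parameters (velocities in cm/s)
    true_a, true_u0, plateau = 3.0, 0.25, 2.0
    velocities = [0.05, 0.1, 0.2, 0.4, 0.8]
    viscosities = [plateau + true_a * math.exp(-v / true_u0) for v in velocities]
    print(fit_exponential_decay(velocities, viscosities, plateau))
```

The log-linear form weights small deviations near the plateau strongly, so a nonlinear fit may be preferable for noisy data.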

### 3.2. Effects Of the Initial RBC Deformation

Due to the elastic, deformable nature of RBCs, blood can exhibit yield-stress behavior if the hematocrit level is high enough (Picart et al., 1998). In such a dense suspension of cells under low shear stress, the clusters can behave similarly to deformable solids. The relative positions of the RBCs within these clusters remain the same while they deform. At a critical stress value the force required to further deform the cells becomes larger than the force needed to separate them, thus breaking the structure. From that point blood transitions to fluid-like behavior. The stability of these clusters depends on several variables, for instance the level of hematocrit and the concentration of fibrinogen in blood plasma (Baskurt and Meiselman, 1997). However, a weaker yield-stress behavior still arises in the absence of fibrinogen (and other endogenous proteins) at high hematocrit values (> 30%) (Blackshear et al., 1983; Morris et al., 1989). This effect is perceivable during some phases of the simulations, such as the initial start-up of the flows in our straight vessel sections. To investigate it, the plasma without any cells was first brought up to the stationary velocity by an external body force. This is necessary to separate the effects of initial cell deformations from the effects of initially driving the fluid up to the desired velocity. The velocity is set to a high enough value (e.g., 6 mm/s for the D = 64 µm, H = 45% case and 1.5 cm/s for the D = 128 µm, H = 45% case) for the large RBC structures to break. The undeformed cells with randomized positions and alignments are then placed into the flow while the driving force is kept constant. This moment is denoted as t = 0 s. **Figure 13** shows the progression of the relative viscosity from this point in the case of D = 64 µm, H = 45%.

During the first 3 ms the relative viscosity rises steeply from the value of 1 while the plasma flow slows down. At this stage, the RBCs do not flow but deform. The local velocity in the fluid corresponds to the deformation velocity of the cells. Around 4 ms, the relative viscosity reaches its peak value and the clusters start to break up, i.e., the relative positions of the RBCs start to change and the suspension no longer displays solid-like features. After this point blood quickly settles back to its stable final relative viscosity. The same viscosity pattern is observable during the initial phase for all simulations with H = 45%; however, for smaller diameters the phenomenon is less significant. It must be noted that in our case both the surface and the cytoplasmic viscosity were the same as the plasma viscosity, while experimental results suggest higher values of 2–6 mPa·s for the cytoplasm (Park et al., 2011) and 10<sup>−10</sup> to 10<sup>−9</sup> N s/m for the bilipid membrane (Waugh, 1982; Evans and Yeung, 1994). This difference is likely to have a strong effect on the characteristic times of cell deformation, which is not investigated here.
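The transient described above can be characterized by its peak and by the time at which the signal settles. The following sketch assumes a sampled time series of the relative viscosity and a hypothetical settling criterion (staying within 2% of the final value); the numbers in the usage example are illustrative only and are not taken from **Figure 13**.

```python
def transient_summary(times, eta_rel, settle_tol=0.02):
    """Return (peak_time, peak_value, settle_time) for a viscosity transient.

    The settle time is the first sample after the peak from which the
    signal stays within `settle_tol` (relative) of its final value.
    """
    peak_idx = max(range(len(eta_rel)), key=lambda k: eta_rel[k])
    final = eta_rel[-1]
    settle_idx = peak_idx
    for k in range(peak_idx, len(eta_rel)):
        if abs(eta_rel[k] - final) > settle_tol * abs(final):
            settle_idx = k + 1  # sample following the last excursion
    settle_idx = min(settle_idx, len(times) - 1)
    return times[peak_idx], eta_rel[peak_idx], times[settle_idx]


if __name__ == "__main__":
    # Illustrative series: time in ms, relative viscosity (dimensionless)
    t_ms = [0, 1, 2, 3, 4, 5, 6, 7, 8]
    eta = [1.0, 2.0, 3.5, 4.8, 5.2, 3.0, 2.2, 2.0, 2.0]
    print(transient_summary(t_ms, eta))
```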

#### 4. CONCLUSIONS

The novel material model produces results in good agreement with several experiments targeting both single-cell mechanics and collective transport behavior. It also performs well for higher shear-rate values where the other investigated model might fall short. It is capable of capturing the emerging solid-like behavior of dense RBC suspensions under low shear-rates. Furthermore, since our RBC material model is able to handle strong deformations, coupling it with the LBM for the plasma flow, which operates at very small time-steps (in the order of 10<sup>−8</sup> s for the demonstrated flows), allows small-scale transient effects such as flow instabilities behind obstacles (e.g., stenosis or micro medical devices) to be simulated as well.
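The order of magnitude of the time-step follows from the standard BGK relation between the relaxation time and the kinematic viscosity, nu = c_s^2 (tau - 1/2) dx^2 / dt with c_s^2 = 1/3 in lattice units. The sketch below evaluates it for assumed values of the lattice spacing, plasma viscosity and relaxation time; these three numbers are illustrative assumptions and are not taken from this section.

```python
def lbm_time_step(dx, nu_phys, tau):
    """Physical time-step [s] of a BGK lattice Boltzmann scheme.

    dx      : lattice spacing [m]
    nu_phys : kinematic viscosity of the fluid [m^2/s]
    tau     : dimensionless relaxation time (must exceed 0.5)
    """
    if tau <= 0.5:
        raise ValueError("tau must be larger than 0.5 for positive viscosity")
    nu_lattice = (tau - 0.5) / 3.0  # viscosity in lattice units
    return nu_lattice * dx**2 / nu_phys


if __name__ == "__main__":
    # Assumed values: 0.5 um lattice spacing, plasma-like viscosity, tau = 1.1
    dt = lbm_time_step(dx=0.5e-6, nu_phys=1.1e-6, tau=1.1)
    print(f"{dt:.2e} s")
```

With these assumed inputs the time-step is a few times 10<sup>−8</sup> s, consistent with the order of magnitude stated above; halving the lattice spacing reduces it by a factor of four.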

The framework itself is structured to be easy to extend with additional material models and cell types, e.g., white blood cells, and with other fields, such as concentrations of different chemical components as well as with new biophysical processes, for instance bond formations. The efficient highly parallel implementation is capable of handling large domain sizes, thus it is able to cover the range between cell-based micro-scale and macroscopic domains.

The demonstrated capabilities make this framework, in combination with our constitutive model, an ideal environment for exploring the transport effects of blood flows in silico. It forms a solid ground for resolving accurate transport mechanics in vascular flows as a necessary component for modeling complex phenomena such as cell aggregation around micro-medical devices, thrombus formation, and the rheological response to diseases affecting RBC mechanical properties.

### AUTHOR CONTRIBUTIONS

GZ conceived the research, designed the model, and wrote 50% of the paper; BvR collected and analyzed the experimental data, validated the Dao/Suresh model in our implementation, revised the new material model, and wrote 50% of the paper; VA contributed to the technical realization of Hemocell and revised the final version of the paper; AH conceived and supervised the research, and revised the manuscript. All authors read and approved the final version of the manuscript.

### FUNDING

This work was supported by the European Union Horizon 2020 research and innovation programme under grant agreement no. 675451, the CompBioMed project and grant agreement no. 671564, the ComPat project. This work was sponsored by NWO Exacte Wetenschappen (Physical Sciences) for the use of supercomputer facilities, with financial support from the Nederlandse Organisatie voor Wetenschappelijk Onderzoek (Netherlands Organization for Science Research, NWO).

### REFERENCES

and confinement effects. Soft Matter 10, 4360–4372. doi: 10.1039/c4sm00081a


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Závodszky, van Rooij, Azizi and Hoekstra. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
