Topological data analysis of vascular disease: A theoretical framework

Vascular disease is a leading cause of death world wide and therefore the treatment thereof is critical. Understanding and classifying the types and levels of stenosis can lead to more accurate and better treatment of vascular disease. Some clinical techniques to measure stenosis from real patient data are invasive or of low accuracy. In this paper, we propose a new methodology, which can serve as a supplementary way of diagnosis to existing methods, to measure the degree of vascular disease using topological data analysis. We first proposed the critical failure value, which is an application of the 1-dimensional homology group to stenotic vessels as a generalization of the percent stenosis. We demonstrated that one can take important geometric data including size information from the persistent homology of a topological space. We conjecture that we may use persistent homology as a general tool to measure stenosis levels for many different types of stenotic vessels. We also proposed the spherical projection method, which is meant to allow for future classification of different types and levels of stenosis. We showed empirically using the spectral approximation of different vasculatures that this projection could provide a new medical index that measures the degree of vascular disease. Such a new index is obtained by calculating the persistence of the 2-dimensional homology of flows. We showed that the spherical projection method can differentiate between different cases of flows and reveal hidden patterns about the underlying blood flow characteristics, that is not apparent in the raw data. We showed that persistent homology can be used in conjunction with this technique to classify levels of stenosis. The main interest of this paper is to focus on the theoretical development of the framework for the proposed method using a simple set of vascular data.

Topological data analysis (TDA) has been proven to provide a new perspective and a 2 new analytic tool in data analysis, inspiring researchers in various applications [4,27,28]. 3 The analysis with TDA is based on persistent homology driven by the given topological 4 space. Various forms of data from various applications are actively being used by 5 researchers via TDA for possibly finding new knowledges out of the given data set. 6 The work described in this paper is motivated by the clinical problem of the 7 diagnosis of vascular disease. In this paper we explored how TDA could be used to 8 understand the complexity of complex flows, particularly vascular flows and proposed 9 April 27, 2019 1/41 and developed a theoretical framework of the new method that could characterize and 10 classify the vascular flow conditions. 11 As the degree of vessel deformation increases, the complexity of vascular flows also 12 increases. As the complexity increases it becomes difficult to fully characterize the flow 13 behaviors. In this paper we are mainly interested in stenotic vascular flows. Stenotic 14 vessels can predispose to angina, strokes, transient ischemic episodes and flow limitation 15 causing serious vascular disease and it is crucial to understand the complexity of 16 stenotic flows for proper treatments. 17 Vascular disease is the primary cause of human mortality in the United States and 18 worldwide. Coronary heart disease is the single leading cause of death in America 19 today [2]. Each year about 1 million people die of heart disease (one in three deaths and 20 another 17 million are at risk for heart attacks. That is, 1.3 million undergo coronary 21 interventions, either a bypass (∼ 500,000), or angioplasty and/or stenting (∼ 1.3 22 million) [1]. About 23.6 million deaths of cardiovascular disease are expected by 2030. 23 Nearly 787, 000 people in the US died from heart disease, stroke and other cardiovascular 24 diseases in 2011 (one of every three deaths in America) -2, 150 Americans die each day 25 from these disease, one every 40 seconds. Heart disease -once every 90 seconds. Direct 26 and indirect costs of cardiovascular diseases and strokes are total more than $320.1 27 billon [2]. As these statistics imply, accurate diagnosis for the prediction and treatment 28 of vascular disease is crucial. Increasing the diagnosis success rate even by a few percent 29 would result in saving a significant number of human lives. For this reason, a great deal 30 of manpower and funding are used up for vascular research each year in the US.

31
In the following, we first briefly explain those two main methodologies used clinically 32 today. We proposed, in this paper, a new additional diagnosis methodology using TDA. 33 The proposed new method can be used with the existing methodology to increase the 34 diagnosis accuracy.  The most intuitive diagnosis method is the anatomic or geometric approach. This The FFR is defined as the ratio of the maximum attainable flow in the presence of a 72 stenosis to the normal maximum flow, which is uniquely given by the measure of the 73 pressure gradient around the single lesion. The FFR as a functional index is then 74 defined by the ratio of the proximal pressure to the distal pressure at maximum 75 coronary vasodilation as shown in the left figure in Figure 1: where P p , P d , P v are the proximal, distal and vasodilation pressures, respectively. In 77 most cases, P v is not elevated and considered P v ∼ 0. Thus the FFR becomes The FFR analysis of today is based on the following assumptions: 1) the pressures 79 P p and P d used for the evaluation of the FFR are evaluated simultaneously by the flow 80 wire, 2) the obtained FFR value represents the functional index for a single stenosis and 81 3) the interaction of the flow wire device with the local flow movements is negligible in 82 the measurements of P p and P d . The clinical norm of the FFR as an immediate functional index is roughly as follows: 84 F F R <∼ 0.75(0.8) implies that the lesion is functionally significant requiring 85 intervention and F F R ≥∼ 0.85(0.9) implies that the lesion is functionally not 86 significant. Measurement of the FFR is not required for a stenosis of emergent severity 87 (> 70% stenosis). However, for the lesion of intermediate severity the FFR plays a 88 critical role because the geometry of the stenosis alone does not deliver enough 89 information of the functional significance. Figure 1 shows the catheterization with a 90 flow wire for the determination of the FFR before (middle) and after (right) the 91 April 27, 2019 3/41 stenosis. The main advantage of using the FFR is its ability to measure the pressure 92 gradient inside the stenotic region directly. 93 Despite such an advantage, it requires additional costs and risk because it is more 94 invasive. Furthermore, the functional analysis using FFR does not fully utilize the 95 functional information of vascular flows because it measures the pressure drop only 96 throughout the stenotic vessel. For this reason not every value of FFR provides a direct 97 interpretation of the vasculature. Alternatively there have been investigations that use 98 computational fluid dynamics (CFD) solutions to measure the FFR using the 99 patient-specific CFD solutions [19]. Even with this approach, the FFR is only derived As shown in the previous section, the anatomical analysis and the functional analysis or their combinatorial approach are widely used and useful, but they yet may carry debatable diagnosis results for some vascular situations, particularly for the intermediate situations. Thus more refined analysis is still demanded that could deliver more functional measures than the percent stenosis and/or FFR of the pressure drop and that could predict the future development of stenosis. The ideal method is to use the complete knowledge of how all the hemodynamic variables are related, which can be given by a master function, f where v x , v y , v z are the velocity fields, P the pressure, µ viscosity and ω vorticity.

106
However, it is not even possible to relate every variable into a single numeric measure. 107 The proposed research is to investigate how the information of f can be extracted 108 through TDA. This is a new approach and can be used to reveal the clinical difference 109 between two vasculatures which have similar FFR and/or percent stenosis.

110
Our primary approach to this problem is to attempt to use the relatively recent 111 concept of TDA based on persistent homology. In this paper we explore the applications 112 of persistent homology to the problem of stenotic blood vessels based on the preliminary 113 work of [21]. In this paper we will first explain the concepts of simplicial complexes and 114 simplicial homology, followed by persistent homology. We then apply 1-dimensional 115 persistent homology to a geometric model of a stenosed vessel's boundary, the vascular 116 wall, to estimate the stenosed radius of the vessel using what we call the critical failure 117 value of the vessel. We show that this critical failure has a close relationship with the 118 disease level of the vessel. While the homology of a topological space is unaffected by 119 how the space is stretched and deformed, we see that the persistent homology captures 120 size information about an underlying space, as well as homology data. We also 121 conjecture at additional applications of this approach to other problems, such as 122 measuring aneurysms.

123
A second application of persistent homology uses velocity data generated using a 124 three dimensional spectral method projected onto the unit 2-dimensional sphere, S 2 , to 125 quantify the stenosis level and type of the given stenotic vascular flows -defined as the 126 fundamental projection in this paper. This approach is based on 2-dimensional 127 persistent homology. The justification for this approach is that considering both spatial 128 and velocity data requires understanding of high dimensional data. Instead we see that 129 restricting ourselves to only the direction of velocity while ignoring both spatial data 130 and speed allows us to find otherwise hidden trends. We show empirically that this 131 spherical projection yields different topological properties for differing levels of stenosis. 132 We also observe that this spherical projection has apparently differing geometric 133 properties for symmetric stenosis compared to asymmetric stenosis. We go on to 134 conjecture that applying these techniques to larger and more varied data sets may allow 135 for partial or complete classification of stenosis, which will be investigated in our 136 following paper. We further conjecture that these techniques may allow to understand 137 the advantages of different designs of stents.

138
In our preliminary research, we found that it is possible to reveal the topological 139 difference between the two vasculatures, the difference that can not be seen with the 140 current anatomical and functional approaches. We further found that such a difference 141 can be measured in a single numeric index through TDA if the data is presented in a 142 proper way. This new functional analysis for the stenotic vascular flows will significantly 143 improve the existing analysis.

144
The paper is composed of the following sections. In Section 2, we briefly explain 145 hemodynamic models and numerical approximation methods that we use for this 146 research. In Section 3, we will explain some basic concepts of simplicial homology.

147
Using simplicial homology, we will explain persistent homology, which is the key to our 148 research, in Section 4. In Section 5, we will explain the first proposed research, the 149 critical failure value analysis, which is the generalization of the percent stenosis. In 150 Section 6, we propose the n-spherical projection. If the first three velocity variables are 151 used for the projection, we will define such a projection as a fundamental projection. In 152 Section 7, we will provide a brief summary and explain briefly about our future research.  To model the stenotic vascular flows, we use the incompressible Navier-Stokes equations. 160 We also use the no-slip boundary conditions at the blood vessel walls. A more precise 161 description needs to consider the compressibility and more general types of boundary 162 conditions such as Navier boundary conditions and boundary conditions based on the 163 molecular model. However, as the main focus of this paper is more in the global 164 behavior of stenotic vascular flows, the incompressible equations with no-slip boundary 165 conditions suffice to consider.

166
To introduce the governing equations we consider in this paper, let ρ = ρ(x, t) be the 167 density, P = P (x, t) the pressure, u = (u, v, w) T the velocity vector for the position 168 vector x = (x, y, z) T ∈ Ω and time t ∈ R + . Here Ω is the closed domain in R 3 . We 169 assume that the blood flow we consider is Newtonian. From the mass conservation we 170 have the following equations where µ ∈ R + , the kinematic viscosity, and λ ∈ R is the bulk viscosity constant, and f 172 be the external force. Further we assume that the pressure is homogeneous in x and t and is incompressible. Then the above equations are reduced to where we also assume that there is no external force term f . For the actual numerical simulation we use the normalized equations. For example, the length scale is x o = 0.26, the baseline velocity is u o = 30, the time scale t o = 6.7 × 10 −3 , the unit pressure P o = 900, the unit density ρ o = 1 and the unit kinematic viscosity µ = 0.0377 (all in cgs units) [18]. For the incompressible Navier-Stokes equations, we need to find the unknown pressure P . In this paper, we used the Chorin's method, i.e. the artificial compressibility method [5,6]. For the Chorin's approach, we seek a steady-state solution at each time such that u t → 0, P t → 0, t > t s for ∇ · u → 0. Then for the artificial compressibility, we introduce an auxiliary equation for p such that where τ is the pseudo-time. The pseudo-time is the time for which we solve the above 175 equation for the given value of t until ∇ · u → 0 at each t. To solve the governing equations numerically, we adopt the spectral method based on 178 the Chebyshev spectral method. We use a total of N t elements. Each element is a linear 179 deformation of the unit cube, Ω c = [−1, 1] 3 . We expand the solution in each domain as 180 a Chebyshev polynomial. Let ξ be ξ ∈ [−1, 1] and T l (ξ) be the Chebyshev polynomial of 181 degree . Then in each element, the solution u is given by the tensor product of T l (ξ). 182 To explain this further, we consider the 1D Chebyshev expansion. The 3D is simply a tensor product of the 1D expansion. The 1D Chebyshev expansion is given by where P N is the projection operator which maps the solution u(ξ, t) to the polynomial space of degree N andû are the expansion coefficients. Once the expansion coefficients are found, the solution is obtained as a linear combination of the Chebyshev polynomials with the expansion coefficients. For the spectral methods, we adopt the spectral collocation method so that the expansion coefficients are given by the individual solutions at collocation points. For the collocation points, we use the Gauss-Lobatto collocation points. That is, for the collocation points, ξ i for the degree N , we have We solve the incompressible Navier-Stokes equations on x(ξ i ) and the expansion coefficients are given by the quadrature rule based on the Gauss-Lobatto quadraturê where c n = 2 if n = 0 and c n = 1 otherwise [16].

183
The 3D Chebyshev approximation is given by a tensor product of the 1D Chebyshev 184 expansion. Figure 2 shows some vessels we use for the numerical simulation. The left   In this section, we briefly explain simplicial homology that is applied to the CFD data 195 obtained by the spectral method described in the previous section. We refer [4,7,9] for 196 details. To understand persistent homology we must understand homology. In this 197 paper we will be mainly concerned with the simplicial homology of a simplicial complex. 198 This concept of a simplicial complex is fundamentally tied to the concept of persistent 199 homology and thus we must first understand what a simplicial complex is.  2. If s 1 and s 2 are in S, then s 1 s 2 is either the empty set or a face of both.

204
Speaking informally, a simplicial complex is topological space made of vertices, edges, 205 triangles, tetrahedrons and higher dimensional equivalents attached to one another by 206 their edges, vertices, faces and so on. Generally speaking, simplicial complexes are a 207 tool for building simple topologies. As we shall see later, a simplicial complex is simple 208 enough that certain important topological features can be calculated numerically,  In the left figure of Figure 4 we see three edges and three vertices. In the middle, we 211 have filled in the hole from the left figure with a triangle. Therefore we have one two 212 simplex (triangle), three one simplexes (edges) and three one simplexes (vertices). In 213 the right figure, we have a tetrahedron ABCD. This is the three dimensional equivalent 214 to a triangle. It has four triangles as faces, six edges and four vertices.

Simplicial homology 216
The basic idea of homology is that homology describes the holes in a topological space. 217 We shall see this clearly after we give a precise definition. To define homology we will 218 need to define some intermediate objects.

219
Definition 2.2. Let S be a simplicial complex, k, N ≥ 0 be integers and R be a ring with unit. A simplicial k-chain is a formal sum where the c i are elements of R and the s i are k-simplices of S.

220
When referring to a simplex, one specifies the simplex and an orientation of that 221 simplex. This is done by specifying the vertices and an ordering of those vertices.

222
Permuting the vertices represents the same simplex multiplied by the sign of that 223 permutation.

224
Definition 2.3. The free R-module C k (S, R) is the set of all k-chains.

225
For technical reasons, we take C −1 (S, R) to be the trivial module. We will write 226 C k (S, R) as simply C k for this paper out of convenience. There is a natural map 227 between these R-modules called the boundary map. Speaking imprecisely, the boundary 228 map takes a simplex to its boundary. The precise definition follows:

229
Definition 2.4. The boundary map We take δ 0 to be the trivial map. It is not difficult to verify that δ k−1 • δ k = 0 for all 230 k. Therefore the kernel of δ k−1 contains the image of δ k . This leads to the definition of 231 homology.

232
Definition 2.5. The kth homology module H k (S, R) with coefficients in R is 233 ker(δ k−1 )/Im(δ k ) [15]. While it is true that H k is a module, we will simply refer to them as groups for convenience. As an example, let us take the simplicial complex X in the left figure of Figure 4 with the vertices labelled from left to right v 1 , v 2 and v 3 . We shall take our ring R to be the rational numbers, Q. The module C 0 is the set: For C 1 we must choose an orientation for our edges and will therefore orient them according to the index of their vertices. Thus we have that C 1 is given by We have no higher dimensional simplices and thus C k = 0 for k ≥ 2. The boundary map δ k is necessarily the zero map for k = 1. The boundary map δ 1 is defined by Thus we have that

Simple algebra reveals that
It is easy to see that the kernel of δ 1 is the submodule generated by Finally it is easy to see that all other homology modules are trivial. 235 We need to understand what homology represents. As mentioned earlier, the 236 homology groups represent holes. If our ring R is not a field, then the homology groups 237 may have torsion. In this work, we will always calculate homology relative to a field, 238 namely the rational numbers, Q. Therefore the homology groups will be isomorphic to 239 the product of some number of copies of R.

240
Let us determine the homology of the complex in Figure 5 with coefficients in Q. hole will give the homology group a copy of our ring R. Thus H 1 (X, Q) = Q 3 .There are 247 no higher dimensional simplices and thus the higher homology groups are trivial.

248
If we fill in one of the holes with a triangle, Figure 6, then we now have only two The space X with a hole filled in is the space Y . homology group a copy of our ring R. This torus has one hollow, thus its second 256 homology group is Q (see Figure 7).

257
In general, the nth homology group measures n dimensional holes, i.e. holes similar 258 to the interior of the n dimensional sphere. In this work we will not go higher than 259 second dimensional homology because our analysis in this paper does not require higher 260 dimensions.

261
Definition 2.6. The kth Betti number for a topological space X, β k , is the rank of the 262 kth homology group.

263
As we have seen above, for each n dimensional topological space there is a copy of 264 the ring R in our homology group. The Betti numbers therefore represent the number 265 of holes in each dimension. The primary topological feature we explore is that of persistent homology [4]. We will 268 first give a brief overview of this concept. Given some point data, called a point cloud, 269 that are points on some surface or other manifold, we generate a complex that is 270 hopefully a reasonable approximation of the original manifold. Given the points, we will 271 assign edges and triangles (and higher order simplices) to pairs, triples, etc. of the point 272 cloud. Roughly speaking, if a pair of points are close to each other, we add an edge 273 between them and similarly for faces. To assign edges and higher order simplices, we 274 introduce a parameter t, called the filtration value. This t is the length of the largest 275 edge that may be included in our simplex. We will let t vary and at each value we will 276 create the complex, calculating the homology of the complex at each t value.

277
There are, generally, at least three strategies to make use of this parameter to assign 278 simplices. The Vietoris-Rips strategy [11] places an edge between two points if their 279 distance is less than t and a face between three points if their pairwise distance is less 280 than t and so on. This strategy is fine, but computationally expensive. The next 281 strategy is the witness strategy [22] which takes two subsets of the points, called 282 landmark points and witness points. The landmark points serve as vertices of our 283 complex. We will place an edge between two landmark points if there is a witness point 284 within distance t of both points, a face if there is a witness point within t of all three 285 points and so on. Usually all of the points in the point cloud are used as witnesses. The 286 last strategy is the lazy-witness strategy [22], where edges are assigned identically to the 287 witness strategy, however faces and higher simplices are assigned anywhere there are n 288 points that are all pairwise connected with edges. We will be using the lazy-witness 289 streams for our computation due to the reduced complexity of the calculation.

290
It is also worth noting that in the witness and lazy witness methods, there is 291 sometimes an extra mechanism used to help decrease noise. For this mechanism, rather 292 than compare distances to t, we compare distances to t + η n (p) where η n (p) is the 293 distance from p to its nth nearest neighbor and p is the witness point being considered. 294 This tends to remove some noise for low t values. We see the same point cloud at t = .25 ( Figure 9). It is important to understand 299 that although the points all lie on a plane, the edges and triangles should be considered 300 to pass through higher dimensions so as to not intersect, except at their common faces. 301 In this figure, we have added arrows to indicate the five obvious holes that are present. 302 It is conceivable that there may be more holes hidden, but we will see that this is not 303 the case.

304
In Figures 10 thorugh 12, we see that the point cloud now is topologically the same 305 as an annulus, with only the middle hole present. It is also worth observing that as t 306 increases, the central hole is gradually becoming smaller and will eventually be closed 307 with a large enough t value.    have multiple stacked intervals graphed that correspond to individual generators of the 318 homology groups. In the zeroth dimension we see many generators that correspond to 319 many disconnected components when t is small, which eventually are connected into a 320 single component when t is larger. In the first dimension we see a number of small 321 circles that are quickly closed up and one circle that lasts a long time corresponding to 322 the one large hole in the center of the annulus. We have placed rectangles about the five 323 intervals that are present at t = .25 that correspond to the five holes seen in Figure 9. 324 We have only generated barcodes for the zeroth and first dimensions.   In Figures 14 and 15, we have witness and lazy witness barcodes. Because there is a 343 choice inherent in the witness and lazy witness methods, choosing the landmark points, 344 these are not unique. Performing these calculations a second time will generate a 345 different barcode. It is also worth pointing out that if an interval is shorter than the 346 minimum t step, then those intervals are not shown. In Figure 14, we have several 347 connected components at t = 0, which quickly become a single connected component at 348 approximately t = 0.1. We also see a number of one dimensional holes when t is small, 349 but by t = 0.2 there is exactly one hole left, the annulus center. In Figure 15, we see a 350 number of intervals in the zeroth dimension, but again we see that eventually we only 351 have one connected component. In the first dimension we only see one hole, which lasts 352 for a wide interval. This again corresponds to the central hole of the annulus.  The complete algorithm for calculating persistent homology can be found in [28]. We will briefly summarize the algorithm here. To compute the homology of a simplicial complex, one must understand the boundary operator δ. Since we are looking at homology relative to a field, the chain groups and homology groups, C k and H k , are vector spaces. Therefore, one can consider δ k to a linear map between vector spaces. Because we are interested in homology, we are interested in the kernel of this map, as well as the image. We may use the standard basis of the chain groups, specifically the k-chains, as our basis. Let us use as an example the complex in Figure 16. The basis for C 1 is {a, b, c, d} and the basis for C 2 is {bc, cd}. Here we are referring to the edges by their endpoints. If we write the standard matrix for δ 0 , it would simply be the zero matrix. If we write the matrix for δ 1 , relative to the bases in the order given above, then we would get To compute the kernel of our matrix, we will transform the matrix into Smith normal form. To do this, we use elementary row and column operations. Specifically, we may swap two columns, add a multiple of one column to another and multiply one column by a non zero constant. The row operations are similar. Each of these operations correspond to a change of basis in either C 1 or C 0 . If we add row two to row three and then row three to row four, our new matrix will be Our new basis for C 0 will be {a, b-c, c-d, d}. The basis for C 1 is unchanged because we 359 used no column operations. form, we may simply read off the dimensions of kernels and images of the δ k and simply 370 take their difference to find the dimensions of our homology group.

371
To calculate persistent homology is a harder task. For this we must have the definition of a persistence module. In our case, we will receive as input a number of complexes, all representing the same point cloud for different values of our filtration variable t. Let us call the chain complex at the kth timestep C k n . We will similarly call the kernels and boundary groups Z k n and B k n , respectively. For each C k n , there is a natural inclusion map Here, R is our field.

372
It is shown in the same paper that calculating the homology of this persistence module is equivalent to calculating the intervals that appear in the barcode. Due to the structure theorem [27], that every graded module M over a graded PID, R[t], decomposes uniquely into the form where Σ α represents an upward shift in grading. Thus, our persistence modules C k n , will 373 give rise to persistent homology groups H k n , which will decompose in the above manner. 374 It was shown in [27], that factors of the form Σ i R[t]/t j−i correspond to persistent 375 intervals of the form (i, j) in our barcode. Similarly, each free factor, Σ i R[t] corresponds 376 to an interval of the form (i, ∞). Thus the crux of the problem comes down to finding 377 and decomposing the homology of our persistence modules. To accomplish this, we may 378 simply use row and column reduction as in the example above. The finer details, along 379 with some simplifications, are laid out in the original paper.  Figure 13, we had only one interval in the zeroth dimension that that was rather long 383 and many shorter intervals. Similarly, in the first dimension we again had one long 384 interval and many shorter ones. By making use of our prior knowledge that the point 385 cloud was coming from an annulus, we see that the "real" features, namely one 386 connected component and one circle, correspond to the long intervals and the shorter 387 intervals are noise. It is reasonable to assume that this holds frequently (though not 388 certainly) in general data sets. Usually longer intervals will correspond to "real" While we have not given any proof that longer intervals tend to be more important 394 than shorter intervals, in [4], there is an argument given that solidifies this mindset.

395
Speaking imprecisely, the various methods of building simplicial complexes out of point 396 clouds fall inside a hierarchy under inclusion. The lower methods (lazy witness and 397 witness) are easier to compute but less accurate. The higher methods (Vietoris-Rips 398 and other methods not discussed in this paper) are more complex to calculate, but more 399 accurate. If a barcode has an interval with sufficient length, then this guarantees a 400 corresponding interval in the higher complexity methods. We refer the reader 401 to [4,7,11] for more details. Our stated goal is to explore the applications of topological features and data to 404 understanding stenosed blood vessels. First we will use a simple model of a stenosed 405 vessel to explore the topological structure of a vessel. We will initially only consider the 406 exterior of a vessel, which topologically speaking is a cylinder. A stenosed blood vessel 407 is characterized by a narrowing of that vessel. 408 We shall consider the typical radius of our vessel to be r 0 = r healthy and will assume that the stenosis takes the shape of a Gaussian distribution. We will also assume a small amount of noise, in the form of a uniform random variable ∈ [0, 0.1]. We will take r st to be the difference between the normal radius and the stenosed radius. Therefore our model will be, in cylindrical coordinates where the stenosis is at z = 0 and r 0 − r st is the radius of the cylinder at the stenosis. Thus r st is a measure of how stenosed the vessel is, specifically the vessel has a stenosis percentage of 100 r 0 − r st r 0 .
When working with real data we will not have any equation that represents the surface 409 of the vessel, rather we will have point data approximating the surface. Therefore for  For our first application example we will use 1500 points for our blood vessel model 417 above, with 100 landmark points. We use the lazy witness method described above.

418
The points are selected randomly on our surface and the landmark points are selected 419 using an algorithm that selects points uniformly according to pointwise distance. An 420 image of the point cloud with the landmark points circled in red is included in the right 421 figure of Figure 17, and points are colored to give depth.  When we use the lazy witness strategy to graph the barcode for Figure 17, we get 423 Figure 18. As we saw above, each interval corresponds to a generator of the homology 424 in the corresponding dimension. We can see that the homology is for the most part the 425 homology of a cylinder. Initially there is some noise when t is small, but until about   This critical failure value is of particular importance to us because we shall see that 438 it approximates the stenosis of the vessel. The critical failure value is a generalization of 439 percent stenosis. The exterior of the blood vessel is a cylinder. The ends of the cylinder 440 are open and thus we have a one dimensional hole, the hollow center of the cylinder. If 441 the ends were capped then we would have a two dimensional hole instead. We are using 442 persistent homology and thus we are approximating the point cloud with simplicial 443 complexes as described above. As t increases we add more and more edges and triangles 444 April 27, 2019 18/41 to our complex. Eventually, as we saw in the annulus example above, we will have 445 triangles that span the hollow of our cylinder. When this occurs, our simplicial complex 446 no longer is a hollow cylinder and thus has different homology.

447
Definition 4.2. Suppose P is a point cloud with points chosen from the topological 448 space S. The principal critical failure value of P is the critical failure value of S.

449
Speaking generally, suppose we have a point cloud of data that is contained in a 2D 450 shape (or 3D solid, etc.) S with n points. We call this point cloud S n . If n is very large, 451 and the points are more or less evenly spread out, then it is reasonable to expect that 452 CF V (S n ) ≈ CF V (S), assuming some reasonable conditions regarding S. In fact, it is 453 reasonable to write that lim n→∞ CF V (S n ) = CF V (S), again assuming some 454 reasonable conditions about S and the method under which the points are chosen.

455
The critical failure value and principal critical failure values will depend heavily on 456 which method one uses to calculate the persistent homology. We will use subscripts to 457 indicate which method is being referred to. We mentioned that when calculating 458 persistent homology using the lazy witness method, sometimes one may choose to 459 include the complexity of considering the nth nearest neighbor for points when 460 constructing our simplexes. When one is constructing the persistent homology of a 461 topological space, rather than a point cloud, there will not usually be an nth nearest 462 neighbor, as any point will usually have infinitely many points arbitrarily near it. Thus, 463 when constructing the persistent homology of such a set, we will not include the nearest 464 neighbor complexity.

465
Theorem 1. For S a circle of radius R, CF V w (S) = CF V lw (S) = R.

466
Here the subscripts w and lw denote the witness and lazy witness methods, 467 respectively.

468
Proof: First, observe that in the zeroth and first dimensions, the complexes created 469 using the witness and lazy witness methods are identical, and therefore the principal 470 critical values for these two methods will be identical.

471
Let us consider a circle of radius R centered at the origin and an inscribed regular 472 hexagon, as pictured in Figure 19. The reader can verify that the distance between 473 neighboring vertices of the hexagon is R. Let us suppose that our parameter τ = t < R. 474 We will consider a two simplex on the circle that may be generated using the witness 475 and lazy witness methods under this setup. We will show that such a simplex does not 476 contain the center of the circle and therefore the first dimensional homology of our 477 complex is nontrivial. This will imply that the critical failure value of our circle is 478 greater than t. Let p and q be the vertices of the inscribed hexagon adjacent to (R, 0). We suppose 480 that we have a one simplex with vertices p and q . We assume without loss of 481 generality that the witness point for p and q is (R, 0). If not, we may rotate the circle 482 until these two vertices straddle (R, 0) and then necessarily can take (R, 0) to be our 483 witness. Because τ < R, it must be the case that p and q lie between p and (R, 0) and 484 q and (R, 0), respectively. For the moment, we assume that the distance between p and 485 (R, 0) and q and (R, 0) is exactly t, as pictured in Figure 20. Similarly, any point that may be connected to q to form a simplex would be 500 contained in a similar arc around q . Any point that may be connected to both p and 501 q must be contained within both arcs. The intersection of these two arcs is precisely 502 the arc from p to q , as pictured in Figure 22. However, any simplex with vertices p , q 503 and a third vertex within this arc does not contain the origin. Now suppose the points p and q have distance to (R, 0) less than t. Repeating the 505 above construction, the arc that surrounds p of points that may be connected to p is 506 simply rotated clockwise by the same amount that p has been rotated. Similarly the 507 arc about q is rotated counterclockwise the same amount that q has been rotated.

508
This rotation of these two arcs can widen their intersection, but still cannot generate a 509 simplex containing the origin. To see that this is true, notice that for a simplex to  In any case, we have shown that if τ < R, then there is no one simplex that contains 516 the origin and therefore the homology in the first dimension of our complex contains at 517 least one generator.

518
To see that the critical failure value is at most R, observe that if τ = R, then we 519 may construct a simplex with landmark points and witness points at alternating 520 vertices of the inscribed hexagon. This simplex includes the origin, and all other points 521 can easily be covered. Thus the critical failure value of a circle of radius R is R.
To prove this, we will need two brief lemmas. 525 Lemma 3. Let C 1 and C 2 be two circles centered at the origin, with the radius of C 1 = 526 r 1 < r 2 = radius of C 2 . Let p be the point (r 1 , θ 1 ) and q be the point (r 2 , θ 2 ) in polar 527 coordinates. Also let q be the point (r 1 , θ 2 ). Then the distance from p to q is greater 528 than the distance from p to q . Proof: Let us calculate the distance between p and q using the law of cosines and the triangle with vertices at p, q and the origin. By the law of cosines, if d 1 is the distance between p and q, then d 2 1 = r 2 1 + r 2 2 − 2r 1 r 2 cos(γ) where γ is the angle ∠poq. If we rearrange slightly, we get d 2 1 = (r 2 − r 1 ) 2 + 2r 1 r 2 (1 − cos(γ)).
If we replace r 2 with r 1 , we get d 2 , the distance between p and q .
Observing that we have made one positive term zero and the second non-negative term 530 becomes either strictly smaller or remains zero, thus we have the lemma proven.

532
Also let p = (r 1 , θ 1 ) and q = (r 1 , θ 2 ). Then the distance between p and q is larger than 533 the distance between p and q . Proof of Corollary: To prove this, first observe that, due to Theorem 1, the circle at 537 z = 0 is filled in exactly when τ = r 0 − r st . Thus the critical failure value of our set is 538 at most r 0 − r st . Next we show that our critical failure value is at least r 0 − r st .

539
First, suppose that τ = t and that our first homology group is trivial. It must be the 540 case that some simplex of our complex intersects the z axis. Let p i = (r i , θ i , z i ), 541 i = 1 . . . 6, be the landmark points and witness points of such a simplex. It is a 542 relatively simple exercise to show that replacing these six points with p i = (r i , θ i , 0) 543 does not increase any pairwise distance (though now these projections may not be 544 contained within S). Further, observe that the simplex with vertices given by these 545 projected vertices still intersects the z axis at the origin. 546 Next, notice that the closest our set S to the z axis is on the circle made up of the 547 intersection of S with the plane z = 0. We will call this circle C. Observe that, we 548 simply moved our original points vertically and therefore did not change their distance 549 from the z axis. Thus, our projected points are at least as far from the z axis as the 550 radius of C. Therefore we see that r i ≥ r 0 − r st . Now, using our two lemmas repeatedly, 551 we see that replacing our points p i with p i = (r 0 − r st , θ i , 0) does not increase any 552 pairwise distance. Further, the new simplex still intersects the z axis. Thus, we see that 553 these new points give us a simplex that fills in the center of C, with landmark and 554 witness points given by the p i . By the theorem, it is not possible for such a simplex to 555 exist if τ < r 0 − r st , and therefore τ = t ≥ r 0 − r st . The above corollary demonstrates that, without noise, if we have enough points on 557 the surface of our model, we expect that the critical failure value will give us our stenosis 558 radius almost exactly. To see empirically how well our estimate works with noise, we 559 will generate a point cloud on our model of a stenotic vessel, with noise, and calculate 560 the persistent homology as outlined above, many times. We will take r 0 = 1, and r st a 561 random variable taking values from [0, 1] and will plot the critical values against r st .

562
For this experiment, in each iteration, we use a total of 1500 randomly chosen points 563 and 400 landmark points, approximately evenly spaced. We performed this experiment 564 a total of 80 times. The results of the experiment are pictured in Figure 26.  Figure 26, we can see, as expected, that the relationship between the critical value and r 0 − r st is approximately linear. There is some nonlinearity for large values of r st . This is due to the minimal radius being approximately the same size as the noise, which is caused by gaps in the model due to too few points. Our expectation is that as the number of points and landmark points are increased the relationship between the critical value and r st will be approximately CFV = 1 − r st . This is due to our blood vessel having a normalized healthy radius of 1. For a general vessel, we expect the critical value to be approximately CFV = r healthy − r st .
For different shapes of stenosis, we conjecture that this critical failure value would still 566 be a measure of stenosis, though the exact relationship between stenosis and CFV 567 would depend on the shape of the vessel. 568 We see from our above calculations that the critical failure value is related to the 569 minimal radius of the vessel, thus giving us an idea of the stenosis of the vessel that is 570 determined entirely by the point data of the vessel. This stenosis level can potentially 571 be calculated directly from the vessel without the use of this critical failure value in 572 some cases, and therefore the question of the usefulness of this critical failure value must 573 be considered in our future work. The critical failure value will be defined for 574 practically any shape of vessel, whether or not the stenosis is shaped like a Gaussian or 575 any number of other symmetric or asymmetric shapes. Further, this critical value does 576 not require exact knowledge of the location of the stenosis. In our model above, the 577 stenosis was at z = 0, but if we instead had it at z = 1, or any other height, these 578 calculations would generate the same results. This suggests that the critical value may 579 be useful in automating the diagnosis of stenosis. We also expect that the critical value 580 method may act as a sort of universal measurement for all different types of stenosis.
We have demonstrated that size data is encapsulated within the barcodes. Not only 582 do the barcodes give the homology of a topological space, but also measurement data as 583 well. It is therefore reasonable to suggest that in this problem as well as many others, 584 reading off this size information may be critical. For example, we speculate that a 585 similar method may be used to measure the widening of vessels, aneurysms.

586
If we replaced our model above with a model that had Gaussian widening instead of 587 narrowing, thus modeling an aneurysm, then we would be interested in how wide the 588 widest portion of the vessel is. This could potentially be estimated by looking for the 589 largest t value where there is a second dimensional generator. To understand this 590 geometrically, we realize that at a relatively small time step the two ends of the vessel 591 will be capped off by triangles spanning their diameter, due to having a smaller radius 592 than the aneurysm. When these ends are capped, we will have a large hollow, namely 593 the interior of the vessel. This hollow will eventually be filled with tetrahedrons, and 594 therefore become trivial. The t value where this happens will depend on how wide the 595 vessel has become, and therefore that t value would be a measure of the wideness of the 596 vessel.

597
Because the definition of persistent homology depends on the distances between 598 points, the fact that persistent homology encapsulates not only homology information 599 but size and diameter information is reasonable. Taking radius and size data from 600 persistent homology can potentially have significant applications in many different real 601 world problems, not just in the context of stenotic vessels. In this section we propose the spherical projection and TDA with the spherical 604 projection based on the two dimensional homology. We will use vascular data calculated 605 using the incompressible Navier-Stokes equations as outlined in Section 1. The spectral 606 method used gives far more detailed results about the vessel, including pressure and 607 velocity data for the interior of the vessel.

608
Of particular interest to us is the velocity fields of blood flows moving in the vessel. 609 When the blood vessel is healthy one has essentially laminar flow. All of the blood is 610 moving in parallel in the same direction. When the vessel is diseased, we may see 611 turbulence. In this work, we propose to analyze the given data in the phase space. First 612 we investigate the data in the phase space with the first three velocity components. In 613 the left figure of Figure 27 the data is visualized in the phase space. Each axis is 614 corresponding to each velocity component.  There are no hollow portions or apparent significant circles that would give interesting 619 homology. Thus both would be topologically trivial.

620
To construct a meaningful topological space, we found that the projection of the raw 621 data onto the n-unit sphere -so-called an n-spherical projection is the key element of 622 TDA of vascular disease [21]. To understand why the projection approach works, we 623 consider the case of random fields where the 3D velocity and pressure are all randomly 624 generated. The left figure of Figure 28 shows the spherical projection of the random 625 velocity fields on S 2 . The right figure of Figure 28 shows the spherical projection of the 626 random velocity and pressure fields on S 3 . The right figure shows the pressure contour 627 on the velocity fields. The color represents the pressure distribution. Notice that the 628 sphere S 2 in the left figure is hollow but the sphere in the right is not.  Figure 29 shows the corresponding barcode to the spherical projection of the random velocity fields on S 2 (left) and the spherical projection of the random velocity and pressure fields on S 3 (right). As shown in the figures, a hole appears at t ≈ 0.4 and disappears at t ≈ 0.9 in the second dimension (left figure for S 2 ) and in the third dimension (right figure for S 3 ). The interval where hole is existent in S 2 (left) and in S 3 (right) in the barcodes is significant and it represents the underlying topology well. We define the persistence of an n-dimension interval, Π n as where a is the value of t when the hole starts to appear and b is t when the hole 630 disappears. In Figure 29, Π 2 ≈ 0.5 for the left figure and Π 3 ≈ 0.4 for the right.

631
The parameter, Π n serves as a measure of the complexity and we hypothesize that 632 Π n is directly related to the level of disease.

633
Definition 5.1. Fundamental projection: Let v = (v x , v y , v z ) be a non-zero velocity vector. We first normalize v. The fundamental projection is the projection of the normalized velocity fields onto the unit sphere, where ||v|| is the norm of velocity fields, e.g. ||v|| = v 2 x + v 2 y + v 2 z .

634
Definition 5.2. n-spherical projection: The n-spherical projection is the general projection that involves more variables, including the velocity fields, such as the pressure, P . If the pressure data is included, the spherical projection is done by The physical implication of the topological structure for the fundamental projection 635 seems obvious but the one for the general projection n ≥ 3 is not obvious and we need 636 to conduct a parameter study using the CFD solutions in our future work.

643
It is worth observing that our vascular data is of the form < s, U , P >, with s being 644 spatial data, U being velocity data and P being scalar pressure data. The fundamental 645 projection therefore reduces our topology from a seven dimensional space to a two 646 dimensional space, namely the surface of the sphere. An advantage of this projection is 647 that three dimensional data can be readily visualized -however, a lot of information 648 may be lost in this reduction. The difference between the left and right figures in Figure 30 is clear, one is a sphere, 650 the other is not. More precisely, the seventy percent stenosed projection has points all 651 over the sphere, whereas the ten percent stenosed projection only has points on the pole 652 of the sphere. To see the topological difference between these two, we can use our tool 653 of persistent homology. We expect the first to have no generators in the second 654 dimension and the second to have one. Our next section will go into this in detail. The process of the proposed method should be clear already; take the velocity data 657 from a stenosed vessel and calculate its projection on the unit sphere, which we call its 658 spherical projection (fundamental projection as we use the first three velocity 659 components). Next we calculate its persistent homology. Because this calculation has 660 high complexity, we use a fraction of our points as in the earlier calculations. We use 661 the lazy witness strategy with 200 witness points and 150 landmark points. 662 We make a choice of points when we perform our computation, and therefore there is 663 a measure of imprecision inherent in our results. If we simply choose our points 664 randomly, there is a chance that we will choose badly. If we wrongly chose a set of 665 landmark points that were clustered together near a pole, we would make a poor 666 deduction as to the coverage of the sphere. Because the sphere is a fixed scale, it does 667 not take many points that are evenly spaced to properly cover the sphere. Therefore if 668 we make sure to choose our landmark points to be as evenly spaced as possible out of all 669 possible choices, we can be confident that we avoid this case.

670
In the following, we will show the unprojected velocity, the spherical projections and 671 the barcode associated with the persistent homology for various cases. Again we have 672 used the Javaplex package from [23] to calculate the barcodes. It is worth pointing out 673 that when calculating the barcodes using Javaplex, if there are no intervals calculated 674 above a certain dimension, then that dimension may not be graphed. As we should suspect from the earlier Figure 30, the spherical projection for the 676 vessel with 10% stenosis has no meaningful homology in the higher dimensions. In Figure 34, the points have begun to spread across the sphere slightly, but there is 682 still no meaningful homology. In Figure 35, we obviously see that there is no homology in the higher dimensions for 684 the spherical projection of the vessel with 30% stenosis. In Figure 36, we now see that points have begun to cover the sphere. There are not 686 enough points across the sphere to result in a really good two dimensional hole, but we 687 expect we will eventually see one. We also expect that there will be a number of circles, 688 due to the gaps between points. In Figure 37, our barcode matches our intuition. We see some small circles and some 690 two dimensional holes. It is worth observing that a vessel stenosed 40% is considered 691 diseased, whereas lesser stenosis levels are not. Therefore we have shown that the 692 spherical projection reveals a topological difference between diseased and undiseased 693 vessels. In Figure 38, we start to see the sphere become more uniformly covered due to the 695 increasingly chaotic blood flow. We expect to get a two dimensional hole and some 696 number of one dimension holes.   In Figure 40, we now have the sphere is almost uniformly covered, except at the 700 poles. We therefore expect several one dimensional circles and one long lasting two 701 dimensional hole. In Figure 41, we see many one dimensional circles and one long lasting two 703 dimensional hole.  asymmetric vessel may have more circulation that a symmetric vessel and therefore even 718 the less stenosed vessels will have circulation. In Figure 43, we see that already we have the sphere completely covered and there is 720 an additional feature in the form of a ring about the equator. This ring was not present 721 on the symmetric vessel cases and therefore immediately suggests that the spherical 722 projection may be useful in differentiating between different types of stenosis.  In Figure 45, we again see significant circulation and an equatorial ring. We expect 726 that there will be a number of circles and a two dimensional hole. In Figure 47, we see many circles and one two dimensional hole. In Figure 48, we notice that the two dimensional hole has a shorter interval than in 731 the previous barcode. This is apparently caused by a thinning of points near the north 732 and south poles of the sphere. Thus a larger value for t is required before the sphere is 733 completely covered. In Figure 49, we again observe an equatorial ring, along with a more uniformly 735 covered sphere. In Figure 50, we see many circles and a long two dimensional hole.  In Figures 51 and 52, we again see many one dimensional circles, a two dimensional 738 hole and an equatorial ring. The two dimensional hole is very short due to a less 739 uniform covering of the sphere.

740
One feature that we easily observed is that the spherical projections of these 741 asymmetric spherical projections seem to have a denser ring of points about the equator. 742 This feature is not present on the projections of the symmetric vascular data and 743 therefore we presume is caused by the asymmetry. This feature may be important and 744 would likely not be difficult to detect. If the ring is present, a simple least squares or 745 least absolute values approximation of the points using a plane may be used to find the 746 ring. We suspect similar features may be present for differing types of asymmetry which 747 may allow for future classification of the types of asymmetry.

748
The above pictures also show a trend for the spherical projections, which can be seen 749 by looking at the uniformity of the points on the sphere. For the 10% by 40% vessel, 750 the sphere is relatively uniformly covered. For the 40% by 60% vessel, many of the 751 points on the sphere have migrated, with fewer points near the north and south poles. 752 This can be seen by looking at the spherical projections, or perhaps more clearly by 753 looking at the barcodes. In the barcode for the 40% by 60% vessel, there is a two 754 dimensional hole, but the length of that hole is short, because t must be large before we 755 cover the sphere due to the migration of points. 756 Another important feature that is in all of these asymmetrical vessels, as well as in 757 the high stenosis symmetric vessels, is a clusters of points near the positive and negative 758 Fig 53. A blood vessel with a ring stent inserted. The velocity of the blood is also shown.
y-direction poles. These represent the majority of blood flowing forward and a smaller 759 but significant amount of blood flowing backward.

760
As shown in [14], it is natural to develop asymmetry when the stenosis is being  In Figures 53 through 55, we see that there is much circulation of the blood in the 768 vessel and therefore there is a two dimensional hole. In the previous sections, we have focussed solely on velocity data. Velocity of blood flow 771 somewhat naturally maps to the unit sphere precisely because circulation naturally 772 corresponds to a spherical projection covering the unit sphere. However, there may be 773 useful data to be extracted from projecting some or all of the higher dimensional data 774 onto higher dimensional spheres, or even other topological spaces. This leads to the 775 n-spherical projection of an n-dimensional non-zero vector (Definition 5.2).

776
How useful this idea of higher dimensional spherical projections remains to be seen 777 and will be considered in our figure work. For example, one might consider the four 778 dimensional projection of velocity and pressure data onto the three-sphere. There may 779 be important information encoded here, but what that information is and how it is 780 encoded is less clear, in part because this higher dimensional projection is much less 781 natural than the fundamental projection.

782
Perhaps a more natural alternative would be to project velocity and pressure data 783 onto S 2 × I, where I is an interval. Pressure is a scalar in our data and certainly 784 non-negative and therefore this projection may be more natural. Exploring the 785 information present within these higher dimensional projections is a topic for a later 786 paper; we merely include it here for completeness.

793
In this paper, first we introduced the concepts of homology and persistent homology 794 and gave an example of their use. We applied this concept of persistent homology to the 795 geometry data of the exterior of the vessel thereby generating the so called critical 796 failure value. We demonstrated empirically that this critical failure value has a close 797 relationship with the stenosis level of a vessel and therefore may be used to measure 798 stenosis. This method may be used for various vessel shapes and therefore may help 799 serve as a general method of measuring stenosis. Further, this method demonstrates the 800 April 27, 2019 38/41 potential general application of persistent homology to determine size information about 801 a topological object, which may have many applications. 802 We next developed the concept of the spherical projection to understand and 803 quantify vascular flow. We demonstrated that the spherical projection reveals important 804 information and patterns about the vascular flow that are not apparent to the naked eye. 805 We applied this method to a varied data set and demonstrated the differences thereof. 806 This concept of spherical projection may be critical to understanding and classifying 807 the different types and levels of stenosis. We applied this spherical projection to many 808 different sets of vascular data and showed clear differences between the barcodes for the 809 different stenosis levels and types. The barcodes for the vessels with high stenosis were 810 different compared to the less stenosed vessels. Additionally the asymmetric vessels 811 were different from the symmetric vessels in their spherical projections, due to the 812 presence of the equatorial ring.

813
A potential future application of both these concepts, spherical projection and 814 critical failure value, are to those cases of unusual vascular geometry, such as bifurcation 815 or curved vessels. The critical failure value should be able to determine stenosis level in 816 both these cases and the spherical projection would make sense as well. The spherical 817 projection would allow for better understanding of the underlying vascular flow in these 818 cases as well.

819
An important question to ask is how important is the persistent homology in the 820 spherical projection. In our above calculations, every important piece of information 821 given by the barcodes was readily observed from the spherical projections. In fact, the 822 persistent homology lost some information, because the persistent homology did not 823 reflect the equatorial rings. Therefore, while the persistent homology in three 824 dimensions may be less useful, due to being able to see the spherical projection, 825 persistent homology would be vital in these higher dimensional data sets.

826
The spherical projection may be generalized to data from higher dimensions. We 827 have only used the velocity data for our calculations, but we have pressure data as well. 828 Curvature may also be calculated from the vessel geometry, which could be another 829 useful piece of data. Projecting some or all of this data on a higher dimensional sphere 830 or other high dimension object and calculating the persistent homology of these 831 projections may give good results. The spherical projection is naturally physical since 832 much of the important information of velocity is in the direction, which this projection 833 preserves. However, simply projecting higher dimensional data onto a sphere may not 834 be the best projection to consider. Rather, we conjecture that the particular projection 835 used should be targeted, based on intelligent analysis and understanding of the data in 836 question.

837
In this paper, we focused on developing the theoretical framework of the proposed 838 method. And the data set of vascular flows is from the simple CFD calculations with 839 rather simplified vessel configurations. We applied the proposed method to real patient 840 data and obtained desired data, which will be presented in our upcoming paper.