This is an open-access article distributed under the terms of the

With the global rise of cardiovascular disease, including atherosclerosis, there is a high demand for accurate diagnostic tools that can be used during a short consultation. From a pathological standpoint, abnormal blood flow patterns have been demonstrated to be strong predictors of atherosclerotic lesion incidence, location, progression, and rupture. Prediction of patient-specific blood flow patterns can hence enable fast clinical diagnosis. However, the current state of the art relies on 3D-imaging-based Computational Fluid Dynamics (CFD), whose high computational cost renders it impractical for this purpose. In this work, we present a novel method to expedite the reconstruction of 3D pressure and shear stress fields by combining a reduced-order CFD modelling technique with non-linear regression tools from the Machine Learning (ML) paradigm. Specifically, we develop a proof-of-concept automated pipeline that uses randomised perturbations of an atherosclerotic pig coronary artery to produce a large dataset of unique mesh geometries with variable blood flow. A total of 1,407 geometries were generated from seven reference arteries and were used to simulate blood flow using the CFD solver Abaqus. This CFD dataset was then post-processed using the mesh-domain common-base Proper Orthogonal Decomposition (cPOD) method to obtain eigenfunctions and principal coefficients, the latter being the product of the individual mesh flow solutions with the POD eigenvectors. Being a data-reduction method, the POD enables the data to be represented using only the ten most significant modes, which cumulatively capture more than 95% of the variance of flow features due to mesh variations. Next, the node coordinate data of the meshes were embedded in a two-dimensional coordinate system using the t-distributed Stochastic Neighbor Embedding (

Atherosclerosis is the leading cause of death in the developed world, accounting for more than 40% of total mortalities per year. While it is accepted that risk factors such as hypertension, high cholesterol and diabetes play a pivotal role in the progression of the disease, they do not explain the predilection of atherosclerotic plaque formation for sites of arterial bifurcation, side branching and curvature (

To overcome the above, physics-based machine learning technologies have recently attracted interest. These methods are predicated on capturing the underlying physics either via incorporation of the actual conservation laws (

In light of these advances in closely related fields of research, this paper establishes the foundation of our novel method amalgamating these techniques and applies it to a well-characterised experimental dataset of atherosclerotic pig coronary arteries (

We have developed an automatic pipeline which generates synthetic data from existing 3D reconstructed blood vessels (

The data processing pipeline is summarized in this flowchart. OCT images are obtained in the cath. lab. and used to extrapolate a 3D contour. Mesh generation and Computational Fluid Dynamics are done through an automatic pipeline. The velocity profiles obtained from CFD will act as the ground truth. Synthetic data generation (

Synthetic data has been proposed to meet the huge data requirement of artificial intelligence (AI) (

Ten randomly selected phantom geometries from the dataset are visualised. All phantoms shown were generated from the same OCT image. Variation in shape is due to random synthetic perturbations applied to the artery diameter. The perturbation function is a composite of two sinusoids with randomised amplitude, frequency, phase and vertical displacement, which ensures smooth, continuous variation along the length of the artery regardless of input parameters.
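The perturbation described in this caption can be illustrated with a minimal sketch. The function name `perturb_diameter`, the parameter bounds and the choice of a single shared vertical offset are assumptions made for illustration; the actual ranges and implementation used in the study are not given here.

```python
import numpy as np

def perturb_diameter(s, base_diameter, rng, max_amp=0.15, max_freq=3.0):
    """Scale a diameter profile by a composite of two random sinusoids.

    s             : normalised arc length along the artery, values in [0, 1]
    base_diameter : reference diameter at each position in s
    rng           : numpy random Generator
    max_amp, max_freq and the offset range are hypothetical bounds,
    not values taken from the study.
    """
    s = np.asarray(s, dtype=float)
    scale = np.ones_like(s)
    for _ in range(2):  # two sinusoids, each with its own random parameters
        amp = rng.uniform(0.0, max_amp)
        freq = rng.uniform(0.5, max_freq)
        phase = rng.uniform(0.0, 2.0 * np.pi)
        scale += amp * np.sin(2.0 * np.pi * freq * s + phase)
    # Vertical displacement: one constant offset shared by the composite (assumed).
    scale += rng.uniform(-0.1, 0.1)
    return np.asarray(base_diameter, dtype=float) * scale
```

With these bounds the scale factor stays between 0.6 and 1.4, so the perturbed diameter remains positive, and because the factor is a sum of smooth functions the profile varies continuously along the vessel.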

Parameters of the Carreau-Yasuda model.

3.45 | 56 | 3.313 | 2 | 0.3568 |
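For orientation, the Carreau-Yasuda law is commonly written as μ(γ̇) = μ∞ + (μ0 − μ∞)[1 + (λγ̇)^a]^((n−1)/a). The sketch below evaluates this form. The mapping of the five tabulated values to μ∞ = 3.45 mPa·s, μ0 = 56 mPa·s, λ = 3.313 s, a = 2 and n = 0.3568 is an assumption, since the column headers are not shown above; the names in the code are illustrative.

```python
import numpy as np

# Assumed mapping of the tabulated values (column headers are not available):
MU_INF = 3.45e-3  # infinite-shear viscosity, Pa.s (assumed 3.45 mPa.s)
MU_0 = 56.0e-3    # zero-shear viscosity, Pa.s (assumed 56 mPa.s)
LAMBDA = 3.313    # relaxation time, s (assumed)
A = 2.0           # Yasuda exponent (assumed)
N = 0.3568        # power-law index (assumed)

def carreau_yasuda(shear_rate, mu_inf=MU_INF, mu_0=MU_0, lam=LAMBDA, a=A, n=N):
    """Apparent viscosity in Pa.s for a shear rate given in 1/s."""
    shear_rate = np.asarray(shear_rate, dtype=float)
    return mu_inf + (mu_0 - mu_inf) * (1.0 + (lam * shear_rate) ** a) ** ((n - 1.0) / a)
```

Under this mapping the model is shear-thinning: the viscosity equals μ0 at rest and decreases monotonically towards μ∞ as the shear rate grows.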

POD is a CFD post-processing tool derived from the Singular Value Decomposition (SVD) method for matrix factorisation commonly used in statistical analysis. The method finds correlations in the vector flow solution field, which contains small linear perturbations, to obtain an eigenbasis onto which the mesh flow data can be projected. In classical POD, the correlations are obtained in the time domain to identify the flow structures that are most dynamically important during the evolution of turbulence. The same methodology has also been extended to varying flow cases based on different experimental setups (e.g. a number of unsteady flow experiments performed on the same CFD mesh); this is known as common-base POD (cPOD) (

Several shape optimizers have been proposed in the literature, of which

As a next step, a reduced order mapping is obtained by minimizing the Kullback-Leibler divergence between the Gaussian distribution of the original points and a Student’s

SVD orders the modes by their energy content, and the set of modes is truncated once more than 95% of the variance of the field is preserved. For the dataset used in this study, this resulted in the first 10 modes being retained for both the pressure field and the shear stress field; reconstructing the solution from these modes leads to a root-mean-squared error of less than 5%. To interpolate the POD principal coefficient field, which enables predictions for previously unseen geometries, simple feed-forward neural networks and classical machine learning methods were compared. The RFR algorithm combined with the Regressor Chain algorithm was found to be best suited to this task.
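The truncation criterion described above can be sketched with NumPy's SVD. This is a minimal illustration, not the authors' implementation: the snapshot layout (one column per mesh, one row per node), the absence of mean subtraction and the function name `truncated_pod` are all assumptions.

```python
import numpy as np

def truncated_pod(snapshots, energy=0.95):
    """Truncated POD of a snapshot matrix via SVD.

    snapshots : array of shape (n_nodes, n_meshes), one flow solution per column
    energy    : fraction of the total squared singular values to exceed
    Returns (modes, coeffs, singular_values, r), where r modes are retained.
    """
    U, S, _ = np.linalg.svd(snapshots, full_matrices=False)
    cumulative = np.cumsum(S**2) / np.sum(S**2)
    # First mode count whose cumulative energy strictly exceeds the threshold.
    r = int(np.searchsorted(cumulative, energy, side="right")) + 1
    r = min(r, S.size)
    modes = U[:, :r]                # orthonormal spatial modes
    coeffs = modes.T @ snapshots    # principal coefficients, shape (r, n_meshes)
    return modes, coeffs, S, r
```

A field is reconstructed as `modes @ coeffs`. The relative squared Frobenius error of that reconstruction equals the discarded energy fraction, which is why a 95% threshold bounds the reconstruction error.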

The RFR algorithm is a supervised machine learning technique that trains multiple independent decision trees on a training data set; their outputs are combined into a single model that is more robust than any individual tree (

An automatic pipeline was implemented to perform highly accurate 3D reconstruction from biplane angiograms and an OCT pullback (

A collection of meshes generated using various OCT images and perturbation parameters, coloured by the pressure (left) and wall shear (right) solutions from CFD simulations. The mesh dimensions are normalised for the sake of visualisation.

(left) Root-mean-squared error for the reconstruction of the original mesh-wise pressure solution from a truncated set of 10 principal coefficients per mesh. The error is normalised against the range of pressure values across all meshes. (right) Singular values for the decomposition of the pressure solution, normalised against the largest value. These singular values are ordered by magnitude and represent the relative contribution of each POD mode to the energy of the overall pressure solution. Subsequent values quickly decay to <1% of the highest value, as the first several modes represent the overwhelming majority of the information in the pressure field. This indicates that many of these trailing modes can be safely discarded from the dataset without losing a significant amount of information.

The mesh-wise reconstruction error for wall shear (left) is much lower than that for pressure using the same number of coefficients. Additionally, the singular values (right) decay to zero within fewer modes than in the pressure decomposition. These factors indicate that the wall shear solution is easier for the POD method to decompose than static pressure, possibly because it is computed on fewer CFD nodes.

Next was a reduction in the dimensions of the mesh topology using

The distribution of all meshes in the database embedded in 2D

The distribution of all meshes in the database embedded in 2D

The 1407

Predictions of POD principal coefficients of shear stress for first two modes using the proposed framework, compared to the ground truth for the test data set. The first part of the same data set was used for training via the RFR. The regression was performed on the 2D

Predictions of POD principal coefficients of pressure for the first two modes using the proposed framework, compared to the ground truth for the test data set. Training and testing of the RFR model for pressure utilised the same algorithm, configuration, and optimization as for shear stress.
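The regression of principal coefficients on the 2D embedded coordinates could be sketched with scikit-learn, which provides both `RandomForestRegressor` and `RegressorChain`. Whether the study used this library, and with which hyperparameters, is not stated here; the function name `fit_coefficient_regressor` and the tree count are assumptions for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.multioutput import RegressorChain

def fit_coefficient_regressor(embedding, coeffs, n_trees=100, seed=0):
    """Fit a chain of random forests mapping embeddings to POD coefficients.

    embedding : array (n_meshes, 2), low-dimensional coordinates of each mesh
    coeffs    : array (n_meshes, n_modes), principal coefficients per mesh
    In a regressor chain, the model for mode k receives the embedding plus
    the values of modes 0..k-1 as additional input features.
    """
    forest = RandomForestRegressor(n_estimators=n_trees, random_state=seed)
    chain = RegressorChain(forest)  # default order: columns as given
    return chain.fit(embedding, coeffs)
```

Calling `model.predict(new_embedding)` returns one row of predicted coefficients per mesh. Multiplying the retained POD modes by these coefficients then gives the reconstructed field, without running a new CFD simulation.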

With the regression for cPOD principal coefficients completed, the mesh-wise modes previously generated by the cPOD method are combined with the newly predicted coefficients to reconstruct the flow field. Results of the 3D reconstruction of the shear stress and pressure fields for the CFD method (“ground truth”), the cPOD reconstruction alone, and the RFR prediction are shown in

A visualisation of the flow field solution for pressure (left) and wall shear (right) of two test meshes. Shown is the ground truth CFD simulation data (top), the reconstructed POD solution using the 10 most dominant coefficients calculated from the CFD solution (middle) and the reconstruction using the RFR predicted coefficients (bottom).

Mean errors and standard deviations of the reconstructed pressure solution (left) and of the reconstructed shear stress solution (right). All values are normalized against the corresponding range of values in the full dataset.

Pressure NMAE, % | Pressure NRMSE, % | Shear stress NMAE, % | Shear stress NRMSE, % |
---|---|---|---|
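As a hedged illustration of the two metrics in this table, the sketch below computes a normalised mean absolute error (NMAE) and a normalised root-mean-squared error (NRMSE) as percentages of a supplied value range. The function name and signature are assumptions; the range is passed in explicitly because the caption normalises against the range over the full dataset rather than over a single mesh.

```python
import numpy as np

def normalised_errors(truth, prediction, value_range):
    """Return (NMAE, NRMSE) in percent of value_range.

    truth, prediction : arrays of the same shape (e.g. nodal pressure values)
    value_range       : max minus min of the quantity over the full dataset
    """
    err = np.asarray(prediction, dtype=float) - np.asarray(truth, dtype=float)
    nmae = 100.0 * np.mean(np.abs(err)) / value_range
    nrmse = 100.0 * np.sqrt(np.mean(err**2)) / value_range
    return nmae, nrmse
```

NRMSE is never smaller than NMAE, and the gap between them widens when the error is concentrated in a few nodes, so reporting both gives a rough indication of how localised the reconstruction error is.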

Rheological theories of atherosclerosis have been shown to successfully predict plaque location, plaque progression, and plaque rupture (

Synthetic manipulations have recently been introduced to Machine Learning to overcome the excessive requirement of well-annotated data for AI algorithms (

Dimensionality reduction helps retain defining features whilst drastically reducing the volume of data required to represent them. This makes machine learning algorithms more likely to identify such features, along with being more computationally efficient. Additionally, it aids in removing noise and extraneous features which can confound important signals (

In unsteady fluid mechanics problems on a fixed mesh, a 1D time coordinate is typically used as an evolutionary variable to characterise the snapshots of the POD method. Here, this approach is generalised to a set of 2D

The standard RFR algorithm was found to be a suitable option for non-linear regression to reconstruct the POD signals from the

To translate the current method to clinical applications, several limitations must be addressed. First, the current implementation assumes that shape variations are the most important factor affecting velocity fields and their derived parameters. This is corroborated by theoretical arguments, as well as observations that velocity, shear stress and pressure drop strongly scale with diameter. However, the artery flow field also scales with the inflow velocity, which changes throughout the cardiac cycle. To systematically account for the unsteady velocity variation, future developments include extending the scope of the AI model by re-adding the time evolution input. In the meantime, the current simplified steady model may already be sufficient if the flow features of interest are slow compared to the viscous effects, i.e. the flow in the coronary vessel is quasi-steady. In this case, the time history of inflow velocity variation can be decoupled into a series of time frames, where each frame may be represented by a steady process at a different inlet velocity scale. In turn, the shear stress and pressure fields at each frame can be rapidly reconstructed from the inflow velocity and the shear stress and pressure fields of a baseline dataset using the scaling law introduced by Taylor et al. (

A more serious limitation of the current study is that the natural flexibility and heterogeneity of vessel walls are neglected in the flow modelling process. Whilst the rigid wall assumption significantly accelerates the solution of the governing Navier-Stokes equations, modelling of the Fluid-Structure Interaction (FSI) is essential to correctly capture the coronary artery flow behaviour (

Despite the overall favourable results of the RFR method, the accuracy of the machine learning predictions could be refined further in future by replacing the RFR algorithm with more advanced methods, such as those based on Gaussian processes. One advantage of such methods is uncertainty quantification, which provides the user with an overall error estimate. Such estimates would be an invaluable addition to a model intended for use as a diagnostic tool by clinicians.

Finally, in line with many recent works devoted to the proof-of-concept data-driven modelling of cardiovascular flows (

Despite the above-mentioned limitations of the current work, it can be concluded, using

To conclude, we developed a method that produces a very fast solution to the Navier-Stokes equations, with a view to applying it in clinical environments where rapid solutions are in high demand. We are currently working on extensions to time-dependent flows that incorporate fluid-structure interaction, as well as on more accurate AI models with corresponding error estimates.

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

All authors listed have made a substantial, direct, and intellectual contribution to the work and approved it for publication.

This project was funded by the BHF (FS/PhD/22/29316). This study was conducted with the assistance of the Research Software Engineering team in ITS Research at Queen Mary University of London.

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.