In-stent restenosis in acute coronary syndrome—a classic and a machine learning approach

Background In acute coronary syndrome (ACS), a number of previous studies tried to identify the risk factors that are most likely to influence the rate of in-stent restenosis (ISR), but the contribution of these factors to ISR is not clearly defined. Thus, the need for a better way of identifying the independent predictors of ISR, which comes in the form of Machine Learning (ML). Objectives The aim of this study is to evaluate the relationship between ISR and risk factors associated with ACS and to develop and validate a nomogram to predict the probability of ISR through the use of ML in patients undergoing percutaneous coronary intervention (PCI). Methods Consecutive patients presenting with ACS who were successfully treated with PCI and who had an angiographic follow-up after at least 3 months were included in the study. ISR risk factors considered into the study were demographic, clinical and peri-procedural angiographic lesion risk factors. We explored four ML techniques (Random Forest (RF), support vector machines (SVM), simple linear logistic regression (LLR) and deep neural network (DNN)) to predict the risk of ISR. Overall, 21 features were selected as input variables for the ML algorithms, including continuous, categorical and binary variables. Results The total cohort of subjects included 340 subjects, in which the incidence of ISR observed was 17.68% (n = 87). The most performant model in terms of ISR prediction out of the four explored was RF, with an area under the receiver operating characteristic (ROC) curve of 0.726. Across the predictors herein considered, only three predictors were statistically significant, precisely, the number of affected arteries (≥2), stent generation and diameter. Conclusion ML models applied in patients after PCI can contribute to a better differentiation of the future risk of ISR.


Introduction
Coronary artery disease (CAD), especially acute myocardial infarction (AMI), is one of the most common causes of death worldwide despite the developments in diagnosis and revascularization treatment (1,2).In-stent restenosis (ISR), defined by lumen reduction following a percutaneous coronary intervention (PCI) by more than 50% or at least 5 mm in a stent edge (3), is determined by one of the two processes: neo-atherosclerosis, defined by the accumulation of lipids and macrophages within the neointima, with or without necrotic core and calcifications, and neointimal proliferation, caused by the proliferation and migration of vascular smooth muscle cells into the tunica intima layer (4).In-stent restenosis is a progressive process that might begin early (within days) from the PCI or late (weeks to months) and it correlates with three major underlying mechanisms: elastic recoil, vascular remodeling and neointimal hyperplasia (5).In-stent restenosis continues to pose a significant problem even in the age of drug-eluted stents (DES), although compared to bare-metal stents (BMS), DES proved to reduce restenosis and postpone later revascularization by a total percent of 50%-70% (3).The incidence rate of ISR in the case of DES, although lower than BMS remains as high as 10% (6).For this reason, ISR has been thoroughly studied throughout the years, in order to find out the best approach to prevent and treat restenosis.Unfortunately, the risk factors for restenosis remain controversial, only a few of them (diabetes mellitus, stent length and stent diameter) being consistently found to influence the rate of coronary ISR (7)(8)(9).
In some studies, classic cardiovascular risk factors are deemed to increase the likelihood of ISR (9,10), but in some others this relationship is not very clearly validated (11).Many studies investigated the role of various cardiovascular risk factors in the development of restenosis; diabetes mellitus was incriminated in many researches as an independent risk factor for ISR, along with arterial hypertension, active smoking and low-density lipoprotein cholesterol (LDL-C) (12)(13)(14).Erkan Yıldırım et al. (15) showed in multivariate logistic regression that ISR was independently associated with Gensini score, stent diameter and length, left ventricle ejection fraction, and LDL-C.In another study concentrating on the risk of restenosis in patients with sirolimus-eluting stent implantation, age, arterial hypertension (HTN), diabetes mellitus (DM), LDL-C, high sensitivity C-reactive protein (hsCRP), and target lesion on left circumflex artery (LCX) were independent predictive factors for restenosis (12).Other reports concentrated on the angiographic risk factors of ISR; several studies (9,12) as well as a meta-analysis (7) showed that increased stent length is correlated with ISR, while the stent diameter implication in restenosis is debatable with two studies pinpointing increase of ISR with reduced stent diameter (13, 15), while another study found the opposite, that ISR correlates with increased stent diameter (9).Multi-vessel disease or multiple stents implanted also increases the risk for ISR (14).The location of the implanted stent is also correlated with the risk of restenosis, left anterior descending artery (LAD) being the most often cited as the location with the highest risk of ISR (9), although there are many studies citing either LCX or right coronary artery (RCA) (7,12).Inflammatory markers such as CRP or hsCRP also correlated with ISR (12,14).Moreover, single nucleotide polymorphisms such as: fibrinogen factor I (rs1800790), monocyte differentiation antigen CD14 (rs2569190), and nitric oxide synthase 3 (rs1799983) have been correlated with ISR (16).Even if the precise value of these cardiovascular risk factors is not clear due to some contradictory findings, we have integrated some of these classic risk factors and completed with other parameters, related to the angiographic findings using a new approach based on machine learning (ML) that may produce a better differentiation of the future risk of ISR.
Recently, ML technology started to be used in medical studies due to its ability to quickly scan through large databases, find patterns that would have been otherwise hard to detect through other means, and use these patterns in order to accurately predict outcomes when given new sets of similar data (17).Thus, it becomes clear that ML has an increasing potential not only in the world of medical studies, but also in clinical practice.To the best of our knowledge, there is only one study that used ML in order to predict the probability of ISR occurrence by Jesús Sampedro-Gómez et al. (18).
In the present study, we used ML to build a prediction model which uses demographic, clinical and angiographic data in order to predict the appearance of ISR and the necessity of target lesion revascularization.

Materials and methods
Patients with ACS who underwent PCI at Clinical Emergency Hospital, Bucharest, followed by a second coronary angiography, that occurred in the context of a new ACS, both between 2011 and 2019, were enrolled in this retrospective case-control study.We excluded patients whose data about the main outcome, ISR, could not be found.For the prediction of ISR we collected the following type of data: demographic: age and gender; cardiovascular risk factors: active smoking, smoke quitting compliance, dyslipidemia, HTN, DM; angiographic: extent of coronary artery disease (number of major coronary arteries with at least 70% stenosis, respectively 50% stenosis for lesions in the left main), location, length, diameter and type of stent, stent inflation pressure, the need for pre-dilation and/or post-dilation, presence of artery calcification and culprit lesion-whether the stented lesion was the one that cause the acute coronary syndrome(ACS) or not; other data: time passed between the PCI and the second angiography, whether the ACS was AMI or unstable angina, previous PCI or coronary artery by-pass graft (CABG).All the angiographic measurements were done by Quantitative Coronary Angiography in the core laboratory formed by interventional cardiologists with at least ten years of experience.The primary outcome evaluated was coronary in-stent restenosis, defined as a 50% or higher stenosis diameter in a coronary segment in which a stent had been previously implanted.The secondary outcome was the necessity of revascularization of the target lesion, defined as indication of PCI in a coronary segment with ISR.Addressing ISR severity, Mehran classification dividing ISR into four patterns was considered (19).
All analyses were completed using the statistical software program SPSS version 23.Data were presented as mean ± SD for continuous variables and as number and percentage for categorical counterparts.Differences between groups if normally distributed were compared with Student's t-test or chi-square, while for non-parametric variables Mann-Whitney U-test was assessed.P-values were two-tailed with a cut-off of less than 0.05 considered statistically significant.
We herein explored various ML techniques to predict the risk of ISR development within a certain timeframe, based on demographic, clinical parameters and risk factors.Overall, 21 features were selected as input variables for the ML algorithms, including continuous, categorical and binary variables.Therefore, in the preprocessing phase continuous features were rescaled to [0,1] and the categorical features have been one hot encoded.
Experiment setup-given the relatively small number of available samples, the trained models may lack generalizability, and a regular performance evaluation on a held-out test set might not be representative.Therefore, as depicted in Figure 1 we propose a strategy to assess the performance of various ML algorithms as a function of the amount of data available for training.
First, for each experiment a random train-test split was performed designating 20% of the available samples as an evaluation set.Next, from the remaining 80% of the data allocated to the training set, we randomly select 25%, 50%, 75% and 100% of the samples to train a Random Forest (RF) model ( 20), a support vector machines (SVM) model ( 21), a simple linear logistic regression (LLR) and a deep neural network (DNN).To reduce potential biases stemming from the data, we repeated each experiment 100 times with different random splits and store the area under the receiver operating characteristic curve (AUC) for the binary classification problem.This allowed us to assess the performance of the ML algorithms more accurately, by computing the mean of the AUC scores obtained across the experiments, as well as to assess the performance trend with respect to the number of training examples.
Four different types of ML algorithms-as displayed in Figure 1 -were trained to predict ISR.Deep neural networks exhibit the greatest potential in modelling non-linear patterns in the data that could lead to a performant classifier.However, when provided with insufficient number of training samples they become difficult to train due to overfitting effects.Therefore, to assess the true modelling potential we concurrently employed 3 established machine learning algorithms that can be optimized on relatively fewer data-points as compared to DNNs.[PA (DERIAD11] The RF, SVM and LLR algorithms have been trained using the scikit-learn library (22) available in the Python programming language, using the default parameters.
The deep learning model has a fully connected architecture with three hidden layers of 64, 64 and 32 neurons respectively, where non-linearities are provided by hyperbolic tangent activation functions.The output layer is composed of only one neuron with a sigmoid activation, hence the model will output the probability for the input pertaining to the positive class.Given the reduced number of training examples, we regularized the model by a drop-out layer placed between the last hidden and the output, randomly removing 15% of the connections, thus forcing the model to perform predictions more objectively.
For the DNN model the fraction of data used for training was further randomly split into a train set (80% of the samples) and a validation set.The training was enabled by the Adam optimizer with an initial learning rate of 10 −4 , minimizing a binary crossentropy cost function for 200 epochs, with a batch size of 64.Given the reduced number of training examples, for each experiment we chose the best performing model with respect to the validation loss to avoid the effect of overfitting.To count for the severe class imbalance (only 18% positive cases) we penalize the model more when misclassifying a positive sample, as compared to a negative one, by setting class weights computed using the scikit-learn library (22).

Results
A number of 340 patients was admitted into the hospital for two diagnoses, namely unstable angina (37.20%) and acute myocardial infarction (62.80%).From this cohort, 72.76% were males with a mean age of 59.95 ± 11.07 years old, and 27.24% were females with a mean age of 64.93 ± 9.26 years.A total of 492 stents were implanted in these patients, out of which 158 (32.11%) were BMS, 18 (3.66%)were first-generation DES, 125 (25.41%) secondgeneration, and 132 (26.83%) third-generation DES.In-stent restenosis was found in 87 (17.68%) of the coronary lesions, out of which 42 (48.28%)required a second PCI with stent implantation.The demographic, clinical, angiographic and stent predictors for ISR considered into our study are displayed in Table 1.
Out of the stents, 17 (3.46%)were implanted on the left main (LM) coronary artery, 187 (38.01%) on the RCA, 184 (37.40%) on LAD artery, 74 (15.04%) on LCX and 30 (6.10%) on other smaller branches or by-pass grafts.The average time spent between the two angiographies was 1.25 years.The characteristics of the implanted stents were also evaluated in terms of length, diameter, generation, and maximum pressure (Table 1).Concerning the angiographic Mehran classification of ISR cases, pattern I was found in 35% of patients, pattern II in 25%, pattern III in 31% and IV in 9%, p = 0.72.
Taking into account all the variables introduced in the model, the following regression equation was obtained: In-stent restenosis probability ¼ 3:50 þ 0:000043(TtF) Where TtF- Simplifying Equation 1 by removing the terms with null coefficient, we obtain the following relation: From this set of predictors only 3 predictors were statistically significant (p < 0.05) at a confidence interval of 95%, precisely, the number of affected arteries, stent generation and diameter.
Corroborating the information from Table 1 and Equation 2, it can be concluded that stents from older generations increase the risk of in-stent restenosis, while generations 2 and 3 diminish the probability of this condition.In our study, 62.07% of in-stent restenosis occurred in patients with BMS, whereas generations 2 Frontiers in Cardiovascular Medicine and 3 generated 13.79%, and respectively, 12.64% of the restenosis cases.Furthermore, reduced diameter of the stent, precisely less than 3.5 mm related with the risk of ISR.Concerning the ML results, as described in the subsection Experiment setup we aimed to assess the effect of increasing the number of training samples on the model performance.Figure 2 shows the mean AUCs for each experiment.
All algorithms have shown a statistically significant (p-value < 0.05) improvement in AUC when increasing the data fraction from 25% to 100%, but none of them has reached a statistically significant improvement when increasing the fraction of data from 75% to 100%.
As depicted in Table 2, when using the entire training set to train the models, in terms of AUC scores, the best performing  algorithm is the random forest, achieving a mean AUC of 0.721 [95% CI (0.709-0.733)].In terms of the algorithm performance trend, although not statistically significant (p-value = 0.47), the DNN is the only one that seems to maintain an ascending trendline when increasing the fraction of data used for optimization from 75% to 100%, while all other algorithms seem to saturate.When assessing the impact of increasing the fraction used for training from 50% to 100%, random forest algorithm still shows no statistically significant improvement (p-value = 0.32) while the DNN does (p-value = 3.87 × 10 −5 ).

Discussions
The implementation of specific recommendations in a personalized approach is a desirable goal for any disease.Machine learning algorithms aiming to predict high vs. low risk subjects for ISR could avoid unnecessary follow-ups and consequently costs.
The primary aim of the study was to investigate the risk factors for in-stent restenosis considering multiple clinical and angiographic variables from both a classic statistical perspective, but also from a ML approach.
The statistical analysis pinpointed three elements increasing the risk for ISR: stent generation and diameter, as well as the number of affected arteries, subjects with multiple coronary lesions being at a higher risk.Similarly to findings from literature (23), the majority of ISR cases were registered in subjects with BMS, compared with second and third generation of DES.Moreover, ISR was more frequently in small diameter stents and in multi-vessel coronary disease patients; 94.25% of the total individuals with ISR had stents with diameters between 2 and 3.5 mm in agreement with the other reports (9).Compared with previous studies that revealed a relation between ISR and other risk factors, such as smoking, DM, HTN, dyslipidemia, impaired renal function, etc., this was not identified in our cohort (24).Referring to the angiographic morphologic classification of ISR, 60% of our cases were Mehran pattern I or II.Besides the Mehran classification applied in the present study that has some inconvenient as it was designed in the BMS area, there are other methods of ISR classification, such as the use of intracoronary imaging, such as intravascular ultrasound (IVUS) or optical coherence tomography (OCT), Waksman classification (25), that allows the determination of the underlying mechanisms involved in the development of ISR, for guidance and optimization of the results.Unfortunately, in the current study only a minority of patients (around 20%) beneficiated of intracoronary imaging, reason for which this predictor could not be included in our model.
Starting from the same cohort we have built four ML algorithms to predict ISR, from which RF showed the best areas under the curve with increasing fraction of data used for training, but even if DNN seem to perform worse on small

Limitations
Firstly, our study is limited by its retrospective, single-center design.The indication for repeating the coronary angiography was based on new ischemic events, therefore the patients included could represent a selected high-risk group questioning the validation of our cohort.Secondly, there are other predictors, such as intravascular imaging, biomarkers, genetic elements that were not included into our model.Thirdly, there are patients with asymptomatic ISR that have not been included into our study, bias inducing an underestimation of the accuracy of the displayed model.

Conclusions
Using routine clinical and angiographic data in a retrospective analysis we have built 4 machine learning models to predict the risk factors of ISR with good prediction scores for RF model that could help clinicians in deciding coronary stenting.

FIGURE 1
FIGURE 1Schematic overview of the proposed analysis.The outer loop sets the fraction of data used and the inner loop randomly splits the dataset into a training and a testing set.

FIGURE 2
FIGURE 2Machine learning performance as a function of the dataset size.y-axis shows the ROC-AUC while the x-axis shows the fraction of data used for training a DNN (blue), a LLR (orange), a SVM (green) and a RF (red) prediction model.

TABLE 1
The clinical, angiographic, and stent predictors considered for in-stent restenosis probability.

TABLE 2
(18)ine learning experiment area under the curve (AUC) score results.ascompared to simpler ML algorithms such as LLR or RF, it has greater potential in achieving a better performance when the number of samples available for training increases.Compared with the only study that utilized ML to predict restenosis(18), our models have external validation at least for the subjects that presented with another ACS and the RF algorithm from the present research displayed better AUCs compared with GRACIA-3, EVENT or PRESTO studies.It is to emphasize that ML models work properly when extensive balanced data are provided.Nevertheless our models included small, unbalanced data sets that reflect better the daily clinical scenarios, but impacting the generality of the model.Furthermore it is also important to compare ML algorithms with conventional methods of ISR risk assessment, such as coronary angiography or intracoronary imaging (optical coherence tomography-OCT and intravascular ultrasound-IVUS).Proper stent expansion is a well-known factor for preventing ISR and several studies showed that smaller minimal stent area at IVUS are relative good predictor for ISR (26) (AUC varying around 0.8 depending on the coronary artery location) mainly in left main stenting, but less studied for other locations, reason for which it is a 2a recommendation only for left main and complex coronary artery stenting in the 2021 American Heart Association/American College of Cardiology, and Society of Cardiovascular Angiography and Interventions (AHA/ACC/SCAI) Guideline for Coronary Artery Revascularization.Even if intracoronary imaging has many advantages in predicting ISR its use is recommended for optimizing stenting mainly in the left main and its wide application restricted due to increased costs.The proposed ML model combining of clinical and angiographic characteristics offers clinicians a tool for early identification of high-risk patients for ISR allowing clinicians to early detect risk factors. datasets