ORIGINAL RESEARCH article
A High-Precision Machine Learning Algorithm to Classify Left and Right Outflow Tract Ventricular Tachycardia
- 1Computational and Data Science, Chapman University, Orange, CA, United States
- 2Department of Cardiology, Ningbo First Hospital of Zhejiang University, Hangzhou, China
- 3Department of Cardiology, Loma Linda University, Loma Linda, CA, United States
- 4Harefield Heart Science Center, Imperial College London, London, United Kingdom
- 5CHOC Children’s Hospital, Orange, CA, United States
- 6Department of Environmental Sciences, Faculty of Science, Alexandria University, Alexandria, Egypt
- 7Zhejiang Cachet Jetboom Medical Devices Co., Ltd., Hangzhou, China
Introduction: Multiple algorithms based on 12-lead ECG measurements have been proposed to identify the right ventricular outflow tract (RVOT) and left ventricular outflow tract (LVOT) locations from which ventricular tachycardia (VT) and frequent premature ventricular complex (PVC) originate. However, a clinical-grade machine learning algorithm that automatically analyzes characteristics of 12-lead ECGs and predicts RVOT or LVOT origins of VT and PVC is not currently available. The effective ablation sites of RVOT and LVOT, confirmed by a successful ablation procedure, provide evidence to create RVOT and LVOT labels for the machine learning model.
Methods: We randomly sampled training, validation, and testing data sets from 420 patients who underwent successful catheter ablation (CA) to treat VT or PVC, containing 340 (81%), 38 (9%), and 42 (10%) patients, respectively. We iteratively trained a machine learning algorithm supplied with 1,600,800 features extracted via our proprietary algorithm from 12-lead ECGs of the patients in the training cohort. The area under the curve (AUC) of the receiver operating characteristic curve was calculated from the internal validation data set to choose an optimal discretization cutoff threshold.
Results: The proposed approach attained the following performance: accuracy (ACC) of 97.62 (87.44–99.99), weighted F1-score of 98.46 (90–100), AUC of 98.99 (96.89–100), sensitivity (SE) of 96.97 (82.54–99.89), and specificity (SP) of 100 (62.97–100).
Conclusions: The proposed multistage diagnostic scheme attained clinical-grade precision of prediction for LVOT and RVOT locations of VT origin with fewer applicability restrictions than prior studies.
One population-based study (Dukes et al., 2015) of 1,139 older adults without any heart-failure signs or systolic dysfunction shows that premature ventricular complex (PVC) and ventricular tachycardia (VT) burden is significantly associated with an increased adjusted risk of decreased left ventricular ejection fraction (odds ratio, 1.13), incident heart failure (hazard ratio, 1.06), and death (hazard ratio, 1.04). Catheter ablation (CA) is a commonly considered treatment for VT patients with and without structural heart disease when drugs are ineffective or have unacceptable side effects (Cronin et al., 2019). It has a class I indication for treatment of idiopathic outflow tract ventricular tachycardia (OTVT) (Joshi and Wilber, 2005; Latchamsetty et al., 2015). OTVT stems from the right ventricular outflow tract (RVOT) in 60–80% of cases and from the left ventricular outflow tract (LVOT) in the remainder (Bunch and Day, 2006). An accurate prediction of RVOT and LVOT origins of OTVT can optimize the CA strategy, reduce ablation duration, and avoid operative complications. Previous studies (Kamakura et al., 1998; Hachiya et al., 2000; Ito et al., 2003; Joshi and Wilber, 2005; Tanner et al., 2005; Haqqani et al., 2009; Zhang et al., 2009; Betensky et al., 2011; Yoshida et al., 2011, 2014; Cheng et al., 2013, 2018; Nakano et al., 2014; Efimova et al., 2015; He et al., 2018; Xie et al., 2018; Di et al., 2019; Enriquez et al., 2019; Yamada, 2019) propose several criteria or models to estimate RVOT and LVOT origins. However, these results have been limited by sample size, scope of studies, ECG measurement efficiency, and generalizability of the models.
In contrast, we develop an optimal multistage scheme that automatically extracts features from standard 12-lead ECGs, incorporates these features into a machine learning model to predict RVOT and LVOT origins of VT or PVC with clinical-grade precision, and provides a multi-perspective analysis of the most important ECG features.
Materials and Methods
The institutional review board of Ningbo First Hospital of Zhejiang University has approved this retrospective study and granted a waiver of the requirement to obtain informed consent. The study was conducted in accordance with the Declaration of Helsinki.
From each patient’s entire ECG recording, three cardiac electrophysiologists (EPs) unanimously selected one QRS complex during the sinus rhythm (SR) and one QRS complex during the PVC or VT as the initial input. The features extracted from the two QRS complexes are supplied to an optimal machine learning classification model that provides two possible prediction outputs: RVOT or LVOT. For the purposes of the classification scheme, RVOT is considered a positive outcome and LVOT a negative one. This study employed a training–validation–testing design to correctly assess the performance of the algorithm. This study consists of four phases (shown in Figure 1): (1) a feature extraction phase in which two feature extraction methods are studied and compared: our proprietary automated ECG feature extraction method and a method based on conventional QRS morphological ECG measurements; (2) a training phase in which the extreme gradient boosting tree classification model is supplied with the features generated in the feature extraction phase; (3) a validation phase aimed at finding important features as optimal model input and deciding the optimal discretization cutoff threshold that was applied in the testing phase; and (4) a testing phase aimed at evaluating, interpreting, and reporting the model performance.
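The article does not state which criterion was used to pick the discretization cutoff on the validation cohort. As a minimal sketch, assuming Youden's J statistic (sensitivity + specificity − 1) as the criterion and using hypothetical function and variable names, the validation-phase cutoff selection could look like this:

```python
def choose_cutoff(scores, labels):
    """Return the probability cutoff that maximizes Youden's J on validation data.

    scores: predicted probability of RVOT (the positive class) per patient.
    labels: 1 for RVOT, 0 for LVOT.
    Youden's J is an assumed criterion; the article does not name one.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes are needed to choose a cutoff")
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        # A patient is predicted RVOT when the score reaches the cutoff.
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j
```

The cutoff chosen this way would then be frozen and applied unchanged to the testing cohort, which is what keeps the test-set estimates independent of the tuning step.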
We reviewed patients who underwent mapping and ablation for frequent PVC or VT that originated from either LVOT or RVOT at the Ningbo First Hospital of Zhejiang University from March 2007 to September 2019. A PVC or VT burden above 10% of total test duration was required for a study entry. A total of 420 patients with OTVT were included in this study. Origin sites of OTVT were confirmed by a successful CA, which means the frequent PVC and VT did not occur above 5% of the total test duration in the first 6-month follow-up after CA.
Classification of Anatomic Sites
The anatomical structure of RVOT and LVOT is depicted in Figure 2, and the demographic data of the anatomic sites are shown in Supplementary Section A and Table 1. This study only focuses on the prediction of RVOT and LVOT rather than the subsites (shown in Figure 2) under RVOT and LVOT. The effective ablation sites of RVOT and LVOT confirmed by ablation provide evidence to create RVOT and LVOT labels for the subsequent machine learning model development.
Figure 2. Anatomic structure of LVOT and RVOT. LVOT includes left coronary cusp (LCC), right coronary cusp (RCC), non-coronary cusp (NCC), aortomitral continuity (AMC), and LVOT summit. RVOT includes anterior cusp (AC), left cusp (LC), right cusp (RC), RVOT freewall, and RVOT septal.
Mapping and Ablation Procedure
Anti-arrhythmic drugs were stopped for at least five half-lives before the inception of the ablation procedure. A 4.0-mm 7F irrigated ablation catheter (Navistar; Biosense Webster, Diamond Bar, CA, United States) was initially placed in the RVOT for mapping. Both fluoroscopy and electroanatomic mapping systems (CARTO, Biosense Webster, Diamond Bar, CA, United States or NavX Velocity, St. Jude Medical, St. Paul, MN, United States) were used to localize the anatomic position of the ablation catheter within the outflow tract. The intracardiac echo was used to identify specific anatomical structures, such as cusps and papillary muscles. For example, Figure 3 presents the fluoroscopy, 3-D mapping, intracardiac echocardiography, and activation mapping for a case with the origin site in commissure of aortic sinus of valsalva LVOT. Using point-by-point mapping, anatomic aggregated maps were created. Activation mapping was performed in all patients during VT and PVC. Pace mapping was also performed with the lowest pacing output (2–20 mA) and pulse width (0.5–10 ms) to capture the ventricular myocardium at the site of the earliest activation. If suitable ablation sites for the RVOT were not located or ablation failed to abolish the arrhythmia, extended mapping to the LVOT site was deployed via a retrograde aortic approach. After target sites were located, radiofrequency energy was delivered up to a maximum power of 35 W and a maximum electrode-tissue interface temperature of 43°C. If the VT or PVC disappeared or the frequency of arrhythmias diminished after the first 30 s of ablation, the energy was delivered continuously from 60 to 180 s. Ablation success was defined as the absence of spontaneous or induced VT or PVC at 30 min after the last energy delivery and confirmed by continuous cardiac telemetry in the subsequent 24 h of inpatient care.
Figure 3. Activation map and fluoroscopic map for VA originating from commissure of aortic sinus of Valsalva in LVOT. (A) Right and left anterior oblique fluoroscopic views show an ablation catheter in the LVOT. Ablation in the LVOT (LCC–RCC commissure) eliminated the PVC within 3 s. (B) The 3-D anatomic representation of the RV endocardium, LV endocardium, and venous system with the ablation catheter positioned at the anterior interventricular vein. (C) The green circle indicates the tip of the ablation catheter in LCC–RCC commissure. (D) The earliest bipolar and unipolar activation time (–30 ms) are shown.
The Procedure to Assess the Catheter Ablation Outcomes
In the subsequent 24 h of inpatient care after the ablation procedure, every patient received continuous ECG monitoring. After discharge, the patients underwent a follow-up 2 weeks after the ablation and then every month at the cardiology clinic. A 12-lead surface ECG test was obtained on each clinic visit, and 24-h Holter monitoring was also prescribed at 3 and 6 months after the ablation.
ECG Measurement Protocol
Noise Reduction and QRS Sample Selection
With chest and limb leads placed carefully in a standard position, the 12-lead surface ECGs were collected by the EP workmate system (EP-WorkMate™ System, Abbott, St. Paul, MN, United States) at a sampling rate of 2,000 Hz before the ablation procedure. The noise sources impacting the ECG database were power line interference, baseline wandering, and random noise. Wavelet transform yields better time–frequency localization results than windowed Fourier transform and naturally has an advantage in noise reduction applications (Abi-Abdallah et al., 2006). Thus, a wavelet technique was used to remove the noise components mentioned above. The coif5 wavelet (Lahmiri, 2014) and a threshold based on Stein’s Unbiased Risk Estimator (SURE) (Stein, 1981; David and Johnstone, 1995) were implemented in MATLAB to carry out the noise reduction steps. To get a full understanding of the techniques and schemes that were adopted in this work, please refer to the code availability section. After noise components were removed, three cardiac EPs unanimously selected one QRS complex during the SR and one QRS complex during the PVC or VT to classify RVOT and LVOT.
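The denoising itself was carried out in MATLAB and is not reproduced here. To illustrate only the thresholding idea, the following sketch selects a SURE threshold for one vector of wavelet detail coefficients and applies soft thresholding. It assumes the coefficients have already been rescaled to unit noise variance; the coif5 decomposition and reconstruction are not shown, and the function names are hypothetical.

```python
def sure_threshold(coeffs):
    """Pick the soft threshold that minimizes Stein's unbiased risk estimate.

    Assumes the coefficients are already scaled to unit noise variance.
    Risk for threshold t: n - 2 * #{|x| <= t} + sum(min(|x|, t)^2).
    Candidate thresholds are the coefficient magnitudes themselves.
    """
    magnitudes = sorted(abs(c) for c in coeffs)
    n = len(magnitudes)
    best_t, best_risk = 0.0, float("inf")
    for t in magnitudes:
        below = sum(1 for m in magnitudes if m <= t)
        risk = n - 2 * below + sum(min(m, t) ** 2 for m in magnitudes)
        if risk < best_risk:
            best_t, best_risk = t, risk
    return best_t


def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; values within [-t, t] become 0."""
    out = []
    for c in coeffs:
        shrunk = max(abs(c) - t, 0.0)
        out.append(shrunk if c >= 0 else -shrunk)
    return out
```

In a full pipeline this step would be applied to the detail coefficients at each decomposition level before the signal is reconstructed.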
Automated ECG Feature Extraction Method
We applied the following measurements and transformation protocol to automatically extract ECG morphological features and supply them to the machine learning model. We used the R-wave peak points of PVC and SR heartbeat in lead V6 as reference lines because they are easy to identify in most conditions. At the first step, for one SR heartbeat, 215 data points (0.11 s) before and after the reference line were truncated, and 335 data points (0.17 s) before and after the reference line were cut for one PVC. The above lengths of 430 and 670 were the means of QRS complex duration plus four times the standard deviation of that for SR beat and PVC. They should cover 99.99% of the QRS complexes in any data due to the normality of the QRS duration distribution and the empirical rule. The mean and standard deviation of QRS duration were computed from the samples in this study; the maximum length of QRS complex for SR beat is 405 data points, and the maximum for PVC is 607 data points. Second, for every lead, we selected the first peak/valley (local maximum or minimum) closest to the reference line (shown in Figure 4A) defined in the first step. Third, the three peaks or valleys before the first peak/valley identified in the second step and the four peaks or valleys after the first peak/valley were selected from all peaks and valleys of SR heartbeat and PVC separately. Thus, in every lead, eight peaks and valleys were extracted to represent the SR heartbeat and PVC basic features. The zero-padding method was applied for the cases that did not have eight peaks and valleys around the reference line. The total number of peaks and valleys, eight, is equal to the means of the number of peaks and valleys in all leads plus four times the standard deviation of that for SR beat and PVC, respectively. This automated feature extraction method was verified manually to make sure it captured essential QRS morphological characteristics.
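The windowing and selection steps above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the helper names are hypothetical, extrema are taken as strict local maxima or minima, and missing slots are returned as `None` so that a caller can zero-pad the corresponding measurements. The last helper checks the coverage claim for a mean-plus-four-standard-deviations bound under normality.

```python
import math

SAMPLING_RATE_HZ = 2000  # stated in the article
SR_HALF_WINDOW = 215     # samples on each side of the V6 R peak for an SR beat
PVC_HALF_WINDOW = 335    # samples on each side of the V6 R peak for a PVC


def cut_window(signal, reference_index, half_window):
    """Samples within half_window of the reference; zeros beyond the recording ends."""
    window = []
    for i in range(reference_index - half_window, reference_index + half_window):
        window.append(signal[i] if 0 <= i < len(signal) else 0.0)
    return window


def local_extrema(window):
    """Indices of strict local maxima and minima (interior samples only)."""
    found = []
    for i in range(1, len(window) - 1):
        is_peak = window[i - 1] < window[i] > window[i + 1]
        is_valley = window[i - 1] > window[i] < window[i + 1]
        if is_peak or is_valley:
            found.append(i)
    return found


def select_eight(window, reference_index, before=3, after=4):
    """Extremum nearest the reference, plus 3 earlier and 4 later ones.

    reference_index is the position of the reference line inside the window.
    Missing slots are None so that the caller can zero-pad their features.
    """
    extrema = local_extrema(window)
    if not extrema:
        return [None] * (before + 1 + after)
    nearest = min(range(len(extrema)),
                  key=lambda k: abs(extrema[k] - reference_index))
    picked = []
    for k in range(nearest - before, nearest + after + 1):
        picked.append(extrema[k] if 0 <= k < len(extrema) else None)
    return picked


def normal_upper_coverage(k):
    """P(X <= mean + k * sd) for a normally distributed X."""
    return 0.5 * (1.0 + math.erf(k / math.sqrt(2.0)))
```

With these constants an SR window spans 430 samples (0.215 s at 2,000 Hz) and a PVC window 670 samples (0.335 s), and `normal_upper_coverage(4)` exceeds 0.9999, consistent with the coverage stated above.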
Figure 4. Description of automated ECG feature extraction method. The proposed feature extraction method automatically finds peaks presented by P# and valleys presented by V# in panel (A) through 430 data points of one SR beat in 12 leads. Panel (B) presents the numerical measurements that capture essential information of a peak, including location = sample points at P3, prominence = distance from P2 to P3, distance from peak or valley location to left prominence boundary = distance from P1 to P3, distance from peak or valley location to right prominence boundary = distance from P3 to P4, width at half of the prominence = the length of green line, distance from left prominence boundary to right prominence boundary = distance from P1 to P4, amplitude = distance from P2 to zero baseline, contour height = prominence – amplitude. X-axis presents sampling data points, and Y-axis presents voltage.
The numerical measurements (shown in Figure 4B) of each peak and valley include location, prominence, the distance from peak or valley location to left prominence boundary, the distance from peak or valley location to right prominence boundary, width at half of the prominence, the distance from left prominence boundary to right prominence boundary, amplitude, contour height, and a logic variable to present peak or trough. The prominence of a peak or a valley measures how much the peak or valley stood out due to its intrinsic height and location relative to neighbor peaks or valleys. Thus, the prominence of a peak was defined as the vertical distance between the peak point and its lowest contour line. The measurement of valleys adopted the same method with peaks.
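As an illustration of the prominence measurement, the sketch below follows a common signal-processing definition: from the peak, move outward on each side until a higher sample or the end of the signal is reached, take the lowest sample found on each side, and measure the peak height above the higher of those two minima. Whether the authors' program handles edge cases in exactly this way is an assumption, and the function names are hypothetical.

```python
def peak_prominence(signal, peak):
    """Prominence of the local maximum at index `peak`.

    The higher of the two side minima plays the role of the lowest
    contour line that still encloses only this peak.
    """
    height = signal[peak]

    # Scan left until a sample higher than the peak, or the signal start.
    left_min = height
    i = peak - 1
    while i >= 0 and signal[i] <= height:
        left_min = min(left_min, signal[i])
        i -= 1

    # Scan right until a sample higher than the peak, or the signal end.
    right_min = height
    j = peak + 1
    while j < len(signal) and signal[j] <= height:
        right_min = min(right_min, signal[j])
        j += 1

    return height - max(left_min, right_min)


def valley_prominence(signal, valley):
    """Valleys use the same rule applied to the sign-flipped signal."""
    return peak_prominence([-v for v in signal], valley)
```

For the toy signal `[0, 3, 1, 5, 2, 4, 0]`, the dominant peak at index 3 has prominence 5, while the smaller peaks at indices 1 and 5 each have prominence 2 because a higher neighbor bounds them on one side.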
After the above eight numerical measurements of eight peaks or valleys for both SR beat and PVC at every lead were collected, we generated a feature matrix with the size of 192 (2 beats × 12 leads × 8 peaks or valleys) by 8 (the number of numerical measurements). We transformed the feature matrix using ratios of features in the rows and columns of the matrix to create a new level of features that can reveal vital details of the ECG morphology. Finally, 1,600,800 features were automatically obtained, and their definitions can be found in Supplementary Section B.2. The estimated 95% CI of each numerical measurement in the feature matrix is documented in Supplementary Section B.2 and Supplementary Table 5.
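The exact transformation that yields the 1,600,800 features is defined in Supplementary Section B.2 and is not reproduced here. Purely to illustrate the idea of deriving ratio features from a measurement matrix, the sketch below forms ratios between every pair of rows within each column. The function names, the restriction to within-column pairs, and the convention of returning 0 when a zero-padded denominator is encountered are all assumptions.

```python
ROWS = 2 * 12 * 8  # beats x leads x peaks or valleys = 192 rows, as stated above


def safe_ratio(a, b):
    """Ratio a / b, returning 0.0 when the denominator is a zero-padded entry."""
    return a / b if b != 0 else 0.0


def pairwise_ratio_features(matrix):
    """Ratios between every pair of rows within each measurement column.

    matrix: list of rows (one per beat/lead/extremum slot), each a list of
    numerical measurements. Returns {(column, row_i, row_j): ratio}.
    """
    features = {}
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if matrix else 0
    for col in range(n_cols):
        for i in range(n_rows):
            for j in range(i + 1, n_rows):
                features[(col, i, j)] = safe_ratio(matrix[i][col], matrix[j][col])
    return features
```

This simplified scheme does not reproduce the article's feature count; it only shows how a modest matrix of raw measurements expands combinatorially once pairwise relationships are enumerated.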
Conventional QRS Morphological Feature Extraction
Even though we intended to develop an automated ECG measurement system that is favored by the machine learning algorithm, the conventional QRS morphological ECG measurement method, such as metrics of Q-, R-, and S-waves; segments among them; and the ratios among segments, is studied and compared in this work. The conventional QRS morphological ECG measurement protocol is defined below. SR and VT ECG morphology were measured on the same 12-lead ECG by a customized MATLAB program. During the clinical arrhythmia, the following measurements (presented in Supplementary Section B.3 and Figure 1) were obtained from both one SR beat and one PVC: (1) amplitude of Q-, R-, and S-waves; (2) duration of Q-, R-, and S-waves as well as QRS complex; and (3) R/S amplitude ratio (Kamakura et al., 1998; Ito et al., 2003), transitional zone (Hachiya et al., 2000; Tanner et al., 2005), V2 transition ratio (Betensky et al., 2011), transitional zone index (Yoshida et al., 2011; Di et al., 2019), R-wave deflection interval (Cheng et al., 2013), V2S/V3R index (Yoshida et al., 2014), R-wave duration index (Ouyang et al., 2002), and R/S amplitude index (Ouyang et al., 2002). The T-P segment was considered one of the isoelectric baselines to measure R- and S-wave amplitudes. The QRS duration was measured from the site of the earliest initial deflection from the isoelectric line to the time of the latest activation. The R-wave length was calculated from the site of the earliest initial deflection from the isoelectric line to the time at which the R-wave intersected the isoelectric line. For all cases, QRS measurements were performed on an isolated PVC representative of the clinical VT before the induction of sustained VT and compared with the SR QRS complex.
All measurements above were used to compare our approach against methods from 12 prior studies (Kamakura et al., 1998; Zhang et al., 2009; Betensky et al., 2011; Yoshida et al., 2011, 2014; Cheng et al., 2013, 2018; Nakano et al., 2014; Efimova et al., 2015; He et al., 2018; Xie et al., 2018; Di et al., 2019).
In addition to the above conventional ECG measurements, we developed the following protocol to generate features to supply to the machine learning model. Amplitudes of Q-, R-, and S-waves based on the voltage at the onset of Q-wave, the offset of S-wave, the Q-wave, and the S-wave were also input variables in the machine learning model. To give the same length input to the machine learning model, we set the measures of Q-, R-, and S-waves for these waves’ missing cases to zeros, such as QS morphology in the V1 lead and RS morphology in the V5 or V6 lead. As we implemented the automated feature extraction method, we also transformed the measurements mentioned above into new variables and put them into the machine learning model. The total number of features generated by this method is 155,784, and the entire definition of features can be found in Supplementary Section B.3. The 95% CI of each numerical measurement are listed in Supplementary Section B.3 and Supplementary Table 7.
For the continuous variables of age and ECG measurements, we calculated the mean and standard deviation. For all count variables, total sample size, number of males, number of subjects with frequent PVC, sustained VT, and sublocations under RVOT or LVOT, we calculated frequency counts and percentages. One-sample test for proportions, two-sample t test, two-sample test for proportions, and Fisher’s exact test were adopted to test the difference of the sample numbers, average ages, genders, and the number of frequent PVC or sustained VT between RVOT and LVOT groups. The Cramér–von Mises, Anderson–Darling, and Shapiro–Wilk tests did not reject the data normality hypothesis, and a two-sample t test was used to test for equal means of continuous variables between RVOT and LVOT. Statistical optimization of the gradient boosting tree model was done through iterative training using the extreme gradient booster (XGBoost) package. The following performance measures were formally analyzed: the area under the curve (AUC) of the receiver operating characteristic (ROC) curve, accuracy (ACC), sensitivity (SE), specificity (SP), and F1-score. A two-sided 95% CI summarizes the sample variability in the estimates. The CI for the AUC was estimated using the Sun and Xu optimization of the DeLong method implemented in the pROC package. In contrast, CIs for F1-score, SE, and SP were obtained by the bootstrap method with 20,000 replications. All analyses were done in R version 3.5.3.
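The analyses were run in R, so the following is only a sketch of the two ideas in this paragraph: the AUC as a rank statistic and a bootstrap CI for a classification metric. It assumes a percentile bootstrap (the article does not say which bootstrap variant was used), uses fewer replications than the 20,000 reported to keep the example fast, and does not implement the DeLong CI. All names are hypothetical.

```python
import random


def auc(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            wins += 1.0 if p > q else 0.5 if p == q else 0.0
    return wins / (len(pos) * len(neg))


def sensitivity(predictions, labels):
    """Fraction of positive cases (label 1, RVOT) that were predicted positive."""
    true_pos = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 1)
    return true_pos / sum(labels)


def bootstrap_ci(metric, predictions, labels, replications=2000, level=0.95, seed=0):
    """Percentile bootstrap CI: resample patients with replacement, recompute the metric."""
    rng = random.Random(seed)
    n = len(labels)
    values = []
    while len(values) < replications:
        idx = [rng.randrange(n) for _ in range(n)]
        sample_labels = [labels[i] for i in idx]
        if sum(sample_labels) == 0:
            continue  # sensitivity is undefined without positives; draw again
        values.append(metric([predictions[i] for i in idx], sample_labels))
    values.sort()
    lower = values[int((1 - level) / 2 * replications)]
    upper = values[min(replications - 1, int((1 + level) / 2 * replications))]
    return lower, upper
```

With a test cohort of only 42 patients, resampled metrics take few distinct values, which is one reason intervals for SE and SP on small cohorts are wide.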
We analyzed data from 420 patients who underwent CA of OTVT at the Ningbo First Hospital of Zhejiang University from March 2007 to September 2019. After the CA procedure, two (0.5%) patients developed slight ecchymosis. A total of five (1.2%) patients were excluded from this study because of frequent PVC or VT recurrence in the first 6-month follow-up.
Patient demographic and clinical characteristics data for the RVOT and LVOT groups are shown in Table 1. We compare the distributions of these background characteristics in the RVOT and LVOT groups and list the associated p-values in the table. The RVOT cohort consists of 20.95% left cusp, 17.62% posterior septal, 14.29% anterior septal, 10% anterior cusp, 7.86% free wall, and 7.14% right cusp. The LVOT cohort consists of 10.71% left coronary cusp, 5.71% aortomitral continuity, 2.62% left coronary cusp and right coronary cusp commissure, 1.67% right coronary cusp, and 1.43% summit (shown in Supplementary Section A and Table 1).
The patients were assigned to training, validation, and testing cohorts, consisting of 340 (81%), 38 (9%), and 42 (10%) patients, respectively, using random proportional allocation (demographic summary shown in Table 1). For a fair comparison, the machine learning model was supplied with different features from two feature extraction methods. The performance was assessed using the same training, validation, and testing cohorts.
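A minimal sketch of such an allocation is shown below. It assumes a single shuffle of patient identifiers followed by fixed-size cuts; whether the original allocation was additionally stratified by RVOT/LVOT label is not stated, so no stratification is done here. The function name and seed are hypothetical.

```python
import random


def split_patients(patient_ids, n_validation=38, n_test=42, seed=2020):
    """Shuffle patient IDs once and cut them into three disjoint cohorts."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    test = ids[:n_test]
    validation = ids[n_test:n_test + n_validation]
    training = ids[n_test + n_validation:]
    return training, validation, test


training, validation, test = split_patients(range(420))
# Cohort sizes 340, 38, and 42 correspond to about 81%, 9%, and 10% of 420.
```

Splitting at the patient level keeps both beats of a patient in the same cohort, so no patient contributes to both training and evaluation.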
We used 1,600,800 automatically generated ECG features as machine learning model input. The proposed approach achieved an ACC of 97.62 (87.44–99.99), F1-score of 98.46 (90–100), prediction of RVOT origins with SE of 96.97 (82.54–99.89) and SP of 100 (62.97–100) (shown in Table 2), and AUC of 98.99 (96.89–100) (presented in Figure 5). Among the 1,600,800 initial automatically generated ECG features, we found a total of 1,352 critically important features with non-zero Shapley additive explanations (SHAP) values (Lundberg and Lee, 2017), showing the importance of their contributions to RVOT and LVOT prediction. The detailed interpretation of SHAP value is introduced in Supplementary Section C.1. We chose and analyzed the top three important features (shown in Figure 6) that have significant classification capability: (1) the ratio between the location of the 5th peak or valley at the SR beat V1 lead and the right boundary of the 5th peak or valley at the V1 lead of PVC, (2) the ratio between the prominence of the 5th peak or valley at the V1 lead of PVC and the prominence of the 5th peak or valley at the V3 lead of PVC, and (3) the difference between the distance of the 5th peak or valley to the left boundary at the V1 lead of PVC and the distance of the 5th peak or valley to the left boundary at the V1 lead of the SR beat.
Figure 5. Receiver-operating characteristic curve generated by the optimal machine learning model supplied with two feature extraction methods. The CI for the AUC was estimated using the Sun and Xu optimization of the DeLong method. Sensitivity and specificity of RVOT prediction are indicated for different thresholds.
When the machine learning model was trained using 155,784 features extracted from conventional QRS morphological ECG measurements, the proposed method attained an ACC of 92.86 (80.35–98.85), F1-score of 95.38 (86.62–98.86), prediction of RVOT origins with SE of 93.94 (78.64–98.99) and SP of 88.89 (50.86–99.45) (shown in Table 2), and AUC of 95.62 (89.78–100) (presented in Figure 5). Among the initial 155,784 features, we found a total of 1,003 critically important features with non-zero SHAP values (Lundberg and Lee, 2017), showing the importance of their contributions to RVOT and LVOT prediction. The top three important features (shown in Supplementary Section C.1 and Figure 2) that show significant classification capability are (1) the ratio between R-wave amplitude based on the zero isoelectric baselines at lead III PVC and the R-wave amplitude based on the offset of S-wave at V1 lead PVC, (2) the ratio between the R-wave amplitude based on R-wave onset at V2 lead SR beat and the R-wave amplitude based on zero isoelectric baseline at V3 lead PVC, and (3) the ratio between the R-wave amplitude based on the zero isoelectric baseline at aVL lead SR beat and the R-wave amplitude based on S-wave offset at V1 lead PVC. The statistical summary of conventional QRS morphological measurements for leads V1 to V6 is listed in Supplementary Section A and Table 2.
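As a consistency check on the point estimates reported for the two feature sets, the sketch below recomputes ACC, SE, SP, and F1-score from confusion counts with RVOT as the positive class. The counts are an assumption: they are inferred from the reported percentages and the 42-patient testing cohort, not copied from the supplementary confusion matrices.

```python
def metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, specificity, and F1 in percent, RVOT as positive."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return tuple(round(100 * v, 2) for v in (accuracy, sensitivity, specificity, f1))


# Inferred counts for a test cohort of 33 RVOT and 9 LVOT patients (assumption).
automated = metrics(tp=32, fn=1, tn=9, fp=0)
conventional = metrics(tp=31, fn=2, tn=8, fp=1)

print(automated)     # (97.62, 96.97, 100.0, 98.46)
print(conventional)  # (92.86, 93.94, 88.89, 95.38)
```

Under these inferred counts, the automated method misclassifies one RVOT patient and the conventional feature set misclassifies three patients, so the gap between the two point estimates rests on two test cases, in line with the wide confidence intervals reported.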
Finally, the average performance of eight cardiologists who determined RVOT and LVOT using the same ECG samples in this study is presented in Table 2. The classification confusion matrix for these three methods shows correct and incorrect frequency counts in Supplementary Section A and Table 3. Furthermore, we compared our approach against related methods from 12 prior studies (Kamakura et al., 1998; Zhang et al., 2009; Betensky et al., 2011; Yoshida et al., 2011, 2014; Cheng et al., 2013, 2018; Nakano et al., 2014; Efimova et al., 2015; He et al., 2018; Xie et al., 2018; Di et al., 2019). ACC, F1-score, SE, SP, positive predictive value, negative predictive value, and AUC were used to compare performances and are shown in Table 3.
We designed and implemented a high-accuracy algorithm for LVOT and RVOT origins of OTVT classification, using 1,600,800 ECG measurements automatically extracted from 12-lead ECGs using our proprietary method. The prediction accuracy comparison among our method combined with the XGBoost classifier, a conventional QRS feature extraction method combined with XGBoost, and the performance of human experts (shown in Table 2) shows that the machine learning model with the automated ECG feature extraction method was uniformly superior. We used DeLong’s test (DeLong et al., 1988) to demonstrate that the automated ECG feature extraction method had a significantly higher AUC compared with that attained by the conventional QRS morphological feature extraction approach with a P-value = 0.035. The comparison of our approach against methods from 12 prior studies (Kamakura et al., 1998; Zhang et al., 2009; Betensky et al., 2011; Yoshida et al., 2011, 2014; Cheng et al., 2013, 2018; Nakano et al., 2014; Efimova et al., 2015; He et al., 2018; Xie et al., 2018; Di et al., 2019) shows that our algorithm achieved the highest performance scores (shown in Table 3). Additionally, we evaluated the general classification capability of each criterion proposed by previous studies using the database in this study. Not surprisingly, we observed significant differences between previously reported performances and the reproduced results of these methods because most of the prior studies used the univariate analysis to make predictions (shown in Table 3).
The excellent performance of our machine learning algorithm demands an enormous volume of data and features. Generating such a number of features with the conventional ECG QRS morphological measurements introduced in prior studies is extremely time-consuming and costly because these measurements are obtained manually. Thus, we did not make any assumptions about ECG criteria before training the machine learning algorithm and intended to exhaust all possible relationships among morphological measures of Q-, R-, and S-waves as well as the entire QRS complex. We designed and implemented an automated ECG feature extraction method that can generate 1,600,800 ECG signal characteristics. Not only did these features contain a considerable amount of the classical statistics from 12 prior studies (Kamakura et al., 1998; Zhang et al., 2009; Betensky et al., 2011; Yoshida et al., 2011, 2014; Cheng et al., 2013, 2018; Nakano et al., 2014; Efimova et al., 2015; He et al., 2018; Xie et al., 2018; Di et al., 2019), but they also captured morphological measures not considered by previous studies, such as rsR’ waves and rsr’s’ waves. However, one may be concerned that such a feature extraction method will include the P- and T-waves within SR beats and retrograde P-waves within PVC. The machine learning model captures and analyzes a large amount of information from every beat but filters out all unimportant features based on their contribution to classification accuracy. As the top three important features (shown in Figure 6) selected by the machine learning model show, none of the features representing the waves mentioned above played a role in the prediction. The morphological features of the rsR’ and rsr’s’ waves may be caused by noise and by misplacement of the 12-lead ECG electrodes, which are frequently displaced by the mapping patches used during the ablation procedure.
In this study, we avoid such a problem because chest and limb leads were placed carefully in a standard position when the 12-lead surface ECGs were collected before the procedure.
Figure 6. Analysis of top three significant ECG measurements found by machine learning model with automated feature extraction method. The univariate analysis (A) shows that features 1 (A.1) and 2 (A.2) have significant capability to separate RVOT and LVOT. The bivariate analysis (B) indicates the classification ability of one–one interaction of the top 3 significant features. In the multivariate analysis (C), the smaller feature 1 (C.1), feature 2 (C.2), and feature 3 (C.3) generate a higher probability of LVOT, but the magnitude of influence varies across features. The color in panel (C) represents the feature value (red high, blue low).
Moreover, before the machine learning model can be safely applied in practice, cardiologists need an unambiguous interpretation of this advanced tool, such as an explanation of what the crucial criteria are and why they play vital roles. For instance, the machine learning model shows that the smaller the magnitude of the first important feature (shown in Figure 6C.1), the higher the possibility of LVOT origin of OTVT. The first important feature is the ratio of the location of the 5th peak or valley at the V1 lead SR beat and the right boundary of the 5th peak or valley at the V1 lead of PVC. In our feature extraction system, the 5th peak or valley at the V1 lead of PVC is an S-wave in most cases. The key ECG lead in the initial site prediction of VT origin is the V1 lead because it is located nearly orthogonal to the septal plane and, thus, is the best lead to resolve initial right- vs. left-sided activation. When the V1 lead has a positive QRS (R > s), the VT is considered to have the right bundle branch block (RBBB) configuration. Conversely, net negative QRS (r < S) defines a left bundle branch block (LBBB) configuration (Haqqani and Marchlinski, 2019). The top three important features (shown in Figure 6) are precisely measures of activation time and of RBBB and LBBB configuration. Such an interpretation means that the machine learning decision process is no longer a black box.
Last but not least, the machine learning model proposed in this study can be immediately and effortlessly deployed to EP labs. The pretrained model, source code, and data are available online and found in the “Data Availability Statement” section. The model inputs are only two QRS complexes, one for PVC and one for SR beat, and they can be easily acquired from 12-lead standard ECG. The analysis of one patient’s data takes less than a second provided every step of measurement and computation is automatically done by the model and the preprocessing approach. The precise prediction of origins can significantly reduce CA duration and reduce the risk of complications.
Because the data set did not contain enough well-labeled data to train a machine learning model on finer categories, the algorithm currently predicts only LVOT and RVOT rather than their subsites. For instance, the origin of a PVC is sometimes in the middle of the septal RVOT/LVOT. Expertly labeled data for three categories (RVOT, LVOT, and septal) would allow the machine learning model to predict the origins with higher accuracy. Although this study includes patients with a comprehensive range of anatomical sites within the RVOT and LVOT, the performance of the method could improve with more cases of RCC and summit origins under LVOT. Moreover, some conditions, such as cardiomyopathies, reentrant VT, coronary heart disease, and prior structural and congenital abnormalities, are underrepresented or absent from the study. Thus, the algorithm may be limited when applied in such scenarios.
Considering its prediction performance, its capacity to extract vital information from the 12-lead ECG, and its robustness in application, our machine learning approach provides promising and reliable decision support to guide successful CA treatment of ventricular arrhythmia.
Data Availability Statement
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://doi.org/10.6084/m9.figshare.c.4668086.v2.
JZ, GF, XD, BH, HC, and CR processed the data for analysis. JZ, HC, GF, XD, IA, and CR performed the statistical analysis. All authors contributed to the study design, data interpretation, and writing of the report.
This work was supported by the 2020 Natural Science Foundation of Zhejiang Province (ID H0205-3202410).
Conflict of Interest
HC, of the Department of Cardiology, Ningbo First Hospital of Zhejiang University, has served as a consultant for Biosense Webster, Boston Scientific, and Abbott. HY was employed by the company Zhejiang Cachet Jetboom Medical Devices Co., Ltd.
The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
We are grateful for the support from the arrhythmia center of Ningbo First Hospital of Zhejiang University.
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fphys.2021.641066/full#supplementary-material
Abi-Abdallah, D., Chauvet, E., Bouchet-Fakri, L., Bataillard, A., Briguet, A., and Fokapu, O. (2006). Reference signal extraction from corrupted ECG using wavelet decomposition for MRI sequence triggering: application to small animals. Biomed. Eng. Online 5:11.
Betensky, B. P., Park, R. E., Marchlinski, F. E., Hutchinson, M. D., Garcia, F. C., Dixit, S., et al. (2011). The V(2) transition ratio: a new electrocardiographic criterion for distinguishing left from right ventricular outflow tract tachycardia origin. J. Am. Coll. Cardiol. 57, 2255–2262.
Bunch, T. J., and Day, J. D. (2006). Right meets left: a common mechanism underlying right and left ventricular outflow tract tachycardias. J. Cardiovasc. Electrophysiol. 17, 1059–1061. doi: 10.1111/j.1540-8167.2006.00577.x
Cheng, D., Ju, W., Zhu, L., Chen, K., Zhang, F., Chen, H., et al. (2018). V3R/V7 index: a novel electrocardiographic criterion for differentiating left from right ventricular outflow tract arrhythmias origins. Circ. Arrhythm. Electrophysiol. 11:e006243.
Cheng, Z., Cheng, K., Deng, H., Chen, T., Gao, P., Zhu, K., et al. (2013). The R-wave deflection interval in lead V3 combining with R-wave amplitude index in lead V1: a new surface ECG algorithm for distinguishing left from right ventricular outflow tract tachycardia origin in patients with transitional lead at V3. Int. J. Cardiol. 168, 1342–1348. doi: 10.1016/j.ijcard.2012.12.013
Cronin, E. M., Bogun, F. M., Maury, P., Peichl, P., Chen, M., Namboodiri, N., et al. (2019). HRS/EHRA/APHRS/LAHRS expert consensus statement on catheter ablation of ventricular arrhythmias. Europace 21, 1143–1144.
DeLong, E. R., DeLong, D. M., and Clarke-Pearson, D. L. (1988). Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 44, 837–845. doi: 10.2307/2531595
Di, C., Wan, Z., Tse, G., Letsas, K. P., Liu, T., Efremidis, M., et al. (2019). The V1-V3 transition index as a novel electrocardiographic criterion for differentiating left from right ventricular outflow tract ventricular arrhythmias. J. Interv. Card. Electrophysiol. 56, 37–43. doi: 10.1007/s10840-019-00612-0
Dukes, J. W., Dewland, T. A., Vittinghoff, E., Mandyam, M. C., Heckbert, S. R., Siscovick, D. S., et al. (2015). Ventricular ectopy as a predictor of heart failure and death. J. Am. Coll. Cardiol. 66, 101–109. doi: 10.1016/j.jacc.2015.04.062
Efimova, E., Dinov, B., Acou, W. J., Schirripa, V., Kornej, J., Kosiuk, J., et al. (2015). Differentiating the origin of outflow tract ventricular arrhythmia using a simple, novel approach. Heart Rhythm 12, 1534–1540. doi: 10.1016/j.hrthm.2015.04.004
Enriquez, A., Baranchuk, A., Briceno, D., Saenz, L., and Garcia, F. (2019). How to use the 12-lead ECG to predict the site of origin of idiopathic ventricular arrhythmias. Heart Rhythm 16, 1538–1544. doi: 10.1016/j.hrthm.2019.04.002
Hachiya, H., Aonuma, K., Yamauchi, Y., Harada, T., Igawa, M., Nogami, A., et al. (2000). Electrocardiographic characteristics of left ventricular outflow tract tachycardia. Pacing Clin. Electrophysiol. 23(11 Pt 2), 1930–1934. doi: 10.1111/j.1540-8159.2000.tb07055.x
Haqqani, H. M., Morton, J. B., and Kalman, J. M. (2009). Using the 12-lead ECG to localize the origin of atrial and ventricular tachycardias: part 2–ventricular tachycardia. J. Cardiovasc. Electrophysiol. 20, 825–832. doi: 10.1111/j.1540-8167.2009.01462.x
He, Z., Liu, M., Yu, M., Lu, N., Li, J., Xu, T., et al. (2018). An electrocardiographic diagnostic model for differentiating left from right ventricular outflow tract tachycardia origin. J. Cardiovasc. Electrophysiol. 29, 908–915. doi: 10.1111/jce.13493
Ito, S., Tada, H., Naito, S., Kurosaki, K., Ueda, M., Hoshizaki, H., et al. (2003). Development and validation of an ECG algorithm for identifying the optimal ablation site for idiopathic ventricular outflow tract tachycardia. J. Cardiovasc. Electrophysiol. 14, 1280–1286. doi: 10.1046/j.1540-8167.2003.03211.x
Kamakura, S., Shimizu, W., Matsuo, K., Taguchi, A., Suyama, K., Kurita, T., et al. (1998). Localization of optimal ablation site of idiopathic ventricular tachycardia from right and left ventricular outflow tract by body surface ECG. Circulation 98, 1525–1533. doi: 10.1161/01.cir.98.15.1525
Lahmiri, S. (2014). Comparative study of ECG signal denoising by wavelet thresholding in empirical and variational mode decomposition domains. Healthc. Technol. Lett. 1, 104–109. doi: 10.1049/htl.2014.0073
Latchamsetty, R., Yokokawa, M., Morady, F., Kim, H. M., Mathew, S., Tilz, R., et al. (2015). Multicenter outcomes for catheter ablation of idiopathic premature ventricular complexes. JACC Clin. Electrophysiol. 1, 116–123.
Lundberg, S., and Lee, S.-I. (2017). A unified approach to interpreting model predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems. New York, NY: ACM.
Nakano, M., Ueda, M., Ishimura, M., Kajiyama, T., Hashiguchi, N., Kanaeda, T., et al. (2014). Estimation of the origin of ventricular outflow tract arrhythmia using synthesized right-sided chest leads. Europace 16, 1373–1378. doi: 10.1093/europace/eut355
Ouyang, F., Fotuhi, P., Ho, S. Y., Hebe, J., Volkmer, M., Goya, M., et al. (2002). Repetitive monomorphic ventricular tachycardia originating from the aortic sinus cusp: electrocardiographic characterization for guiding catheter ablation. J. Am. Coll. Cardiol. 39, 500–508.
Tanner, H., Hindricks, G., Schirdewahn, P., Kobza, R., Dorszewski, A., Piorkowski, C., et al. (2005). Outflow tract tachycardia with R/S transition in lead V3: six different anatomic approaches for successful ablation. J. Am. Coll. Cardiol. 45, 418–423. doi: 10.1016/j.jacc.2004.10.037
Xie, S., Kubala, M., Liang, J. J., Hayashi, T., Park, J., Padros, I. L., et al. (2018). Lead I R-wave amplitude to differentiate idiopathic ventricular arrhythmias with left bundle branch block right inferior axis originating from the left versus right ventricular outflow tract. J. Cardiovasc. Electrophysiol. 29, 1515–1522. doi: 10.1111/jce.13747
Yoshida, N., Inden, Y., Uchikawa, T., Kamiya, H., Kitamura, K., Shimano, M., et al. (2011). Novel transitional zone index allows more accurate differentiation between idiopathic right ventricular outflow tract and aortic sinus cusp ventricular arrhythmias. Heart Rhythm 8, 349–356. doi: 10.1016/j.hrthm.2010.11.023
Yoshida, N., Yamada, T., McElderry, H. T., Inden, Y., Shimano, M., Murohara, T., et al. (2014). A novel electrocardiographic criterion for differentiating a left from right ventricular outflow tract tachycardia origin: the V2S/V3R index. J. Cardiovasc. Electrophysiol. 25, 747–753. doi: 10.1111/jce.12392
Zhang, F., Chen, M., Yang, B., Ju, W., Chen, H., Yu, J., et al. (2009). Electrocardiographic algorithm to identify the optimal target ablation site for idiopathic right ventricular outflow tract ventricular premature contraction. Europace 11, 1214–1220. doi: 10.1093/europace/eup231
Keywords: outflow tract ventricular tachycardia, catheter ablation, electrocardiography, classification, artificial intelligence algorithm
Citation: Zheng J, Fu G, Abudayyeh I, Yacoub M, Chang A, Feaster WW, Ehwerhemuepha L, El-Askary H, Du X, He B, Feng M, Yu Y, Wang B, Liu J, Yao H, Chu H and Rakovski C (2021) A High-Precision Machine Learning Algorithm to Classify Left and Right Outflow Tract Ventricular Tachycardia. Front. Physiol. 12:641066. doi: 10.3389/fphys.2021.641066
Received: 13 December 2020; Accepted: 18 January 2021; Published: 25 February 2021.
Edited by: Xiaopeng Zhao, The University of Tennessee, Knoxville, United States
Reviewed by: Peter Van Dam, Radboud University Nijmegen, Netherlands
Marianna Meo, Institut de Rythmologie et Modélisation Cardiaque (IHU-Liryc), France
Copyright © 2021 Zheng, Fu, Abudayyeh, Yacoub, Chang, Feaster, Ehwerhemuepha, El-Askary, Du, He, Feng, Yu, Wang, Liu, Yao, Chu and Rakovski. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Huimin Chu, email@example.com
†These authors have contributed equally to this work