A High-Precision Machine Learning Algorithm to Classify Left and Right Outflow Tract Ventricular Tachycardia

Zheng, Jianwei; Fu, Guohua; Abudayyeh, Islam; Yacoub, Magdi; Chang, Anthony; Feaster, William W.; Ehwerhemuepha, Louis; El-Askary, Hesham; Du, Xianfeng; He, Bin; Feng, Mingjun; Yu, Yibo; Wang, Binhao; Liu, Jing; Yao, Hai; Chu, Huimin; Rakovski, Cyril

doi:10.3389/fphys.2021.641066

ORIGINAL RESEARCH article

Front. Physiol., 25 February 2021

Sec. Cardiac Electrophysiology

Volume 12 - 2021 | https://doi.org/10.3389/fphys.2021.641066

This article is part of the Research Topic: Nonlinear Analysis and Machine Learning in Cardiology

Jianwei Zheng¹†

Guohua Fu²†

Islam Abudayyeh³

Magdi Yacoub⁴

Anthony Chang⁵

William W. Feaster⁵

Louis Ehwerhemuepha⁵

Hesham El-Askary¹,⁶

Xianfeng Du²

Bin He²

Mingjun Feng²

Yibo Yu²

Binhao Wang²

Jing Liu²

Hai Yao⁷

Huimin Chu²*

Cyril Rakovski¹

¹Computational and Data Science, Chapman University, Orange, CA, United States
²Department of Cardiology, Ningbo First Hospital of Zhejiang University, Hangzhou, China
³Department of Cardiology, Loma Linda University, Loma Linda, CA, United States
⁴Harefield Heart Science Center, Imperial College London, London, United Kingdom
⁵CHOC Children’s Hospital, Orange, CA, United States
⁶Department of Environmental Sciences, Faculty of Science, Alexandria University, Alexandria, Egypt
⁷Zhejiang Cachet Jetboom Medical Devices Co., Ltd., Hangzhou, China

Introduction: Multiple algorithms based on 12-lead ECG measurements have been proposed to identify the right ventricular outflow tract (RVOT) and left ventricular outflow tract (LVOT) locations from which ventricular tachycardia (VT) and frequent premature ventricular complex (PVC) originate. However, a clinical-grade machine learning algorithm that automatically analyzes characteristics of 12-lead ECGs and predicts RVOT or LVOT origins of VT and PVC is not currently available. The effective ablation sites of RVOT and LVOT, confirmed by a successful ablation procedure, provide evidence to create RVOT and LVOT labels for the machine learning model.

Methods: We randomly sampled training, validation, and testing data sets from 420 patients who underwent successful catheter ablation (CA) to treat VT or PVC, containing 340 (81%), 38 (9%), and 42 (10%) patients, respectively. We iteratively trained a machine learning algorithm supplied with 1,600,800 features extracted via our proprietary algorithm from 12-lead ECGs of the patients in the training cohort. The area under the curve (AUC) of the receiver operating characteristic curve was calculated from the internal validation data set to choose an optimal discretization cutoff threshold.

Results: The proposed approach attained the following performance: accuracy (ACC) of 97.62 (87.44–99.99), weighted F1-score of 98.46 (90–100), AUC of 98.99 (96.89–100), sensitivity (SE) of 96.97 (82.54–99.89), and specificity (SP) of 100 (62.97–100).

Conclusions: The proposed multistage diagnostic scheme attained clinical-grade precision of prediction for LVOT and RVOT locations of VT origin with fewer applicability restrictions than prior studies.

Introduction

One population-based study (Dukes et al., 2015) of 1,139 older adults without any heart-failure signs or systolic dysfunction shows that premature ventricular complex (PVC) and ventricular tachycardia (VT) burden is significantly associated with an increased adjusted risk of decreased left ventricular ejection fraction (odds ratio, 1.13), incident heart failure (hazard ratio, 1.06), and death (hazard ratio, 1.04). Catheter ablation (CA) is a commonly considered treatment for VT patients with and without structural heart disease when drugs are ineffective or have unacceptable side effects (Cronin et al., 2019). It has a class I indication for treatment of idiopathic outflow tract ventricular tachycardia (OTVT) (Joshi and Wilber, 2005; Latchamsetty et al., 2015). OTVT stems from the right ventricular outflow tract (RVOT) in 60–80% of cases and from the left ventricular outflow tract (LVOT) in the rest (Bunch and Day, 2006). An accurate prediction of RVOT and LVOT origins of OTVT can optimize the CA strategy, reduce ablation duration, and avoid operative complications. Previous studies (Kamakura et al., 1998; Hachiya et al., 2000; Ito et al., 2003; Joshi and Wilber, 2005; Tanner et al., 2005; Haqqani et al., 2009; Zhang et al., 2009; Betensky et al., 2011; Yoshida et al., 2011, 2014; Cheng et al., 2013, 2018; Nakano et al., 2014; Efimova et al., 2015; He et al., 2018; Xie et al., 2018; Di et al., 2019; Enriquez et al., 2019; Yamada, 2019) propose several criteria or models to estimate RVOT and LVOT origins. However, these results have been limited by sample size, scope of studies, ECG measurement efficiency, and generalizability of the models. In contrast, we develop a multistage scheme that automatically extracts features from standard 12-lead ECGs and incorporates them into a machine learning model to predict RVOT and LVOT origins of VT or PVC with clinical-grade precision, and that provides a multiperspective analysis of the most important ECG features.

Materials and Methods

Study Design

The institutional review board of Ningbo First Hospital of Zhejiang University has approved this retrospective study and granted a waiver of the requirement to obtain informed consent. The study was conducted in accordance with the Declaration of Helsinki.

From each patient’s entire ECG recording, three cardiac electrophysiologists (EPs) unanimously selected one QRS complex during the sinus rhythm (SR) and one QRS complex during the PVC or VT as the initial input. The features extracted from the two QRS complexes are supplied to an optimal machine learning classification model that provides two possible prediction outputs: RVOT or LVOT. For the purposes of the classification scheme, RVOT is considered a positive outcome and LVOT a negative one. This study employed a training–validation–testing design to correctly assess the performance of the algorithm. The study consists of four phases (shown in Figure 1): (1) a feature extraction phase in which two feature extraction methods are studied and compared, namely our proprietary automated ECG feature extraction method and a method based on conventional QRS morphological ECG measurements; (2) a training phase in which the extreme gradient boosting tree classification model is supplied with the features generated in the feature extraction phase; (3) a validation phase aimed at finding important features as optimal model input and deciding the optimal discretization cutoff threshold to be applied in the testing phase; and (4) a testing phase aimed at evaluating, interpreting, and reporting the model performance.
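
The article does not state which criterion was used to pick the discretization cutoff in the validation phase. As a minimal sketch, assuming the cutoff is the one that maximizes Youden's J (SE + SP − 1) on the validation cohort, the selection could look as follows. The function name `choose_cutoff` and the scores are hypothetical and are not taken from the study.

```python
def choose_cutoff(scores, labels):
    """Return (cutoff, J) maximizing Youden's J = SE + SP - 1.

    scores: predicted probability of RVOT (the positive class).
    labels: 1 for RVOT, 0 for LVOT.
    A case is predicted RVOT when its score is >= cutoff.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:  # ties keep the lowest cutoff
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical validation-set scores (assumption, not study data).
val_scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.10]
val_labels = [1, 1, 1, 0, 1, 0, 0, 0]
cutoff, j = choose_cutoff(val_scores, val_labels)
print(cutoff, j)
```

The cutoff chosen this way would then be frozen and applied unchanged to the testing cohort, which is what keeps the test-set estimates unbiased by threshold tuning.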

Figure 1. Central illustration.

Patient Selection

We reviewed patients who underwent mapping and ablation for frequent PVC or VT that originated from either LVOT or RVOT at the Ningbo First Hospital of Zhejiang University from March 2007 to September 2019. A PVC or VT burden above 10% of total test duration was required for a study entry. A total of 420 patients with OTVT were included in this study. Origin sites of OTVT were confirmed by a successful CA, which means the frequent PVC and VT did not occur above 5% of the total test duration in the first 6-month follow-up after CA.

Classification of Anatomic Sites

The anatomical structure of RVOT and LVOT is depicted in Figure 2, and the demographic data of the anatomic sites are shown in Supplementary Section A and Table 1. This study only focuses on the prediction of RVOT and LVOT rather than the subsites (shown in Figure 2) under RVOT and LVOT. The effective ablation sites of RVOT and LVOT confirmed by ablation provide evidence to create RVOT and LVOT labels for the subsequent machine learning model development.

Figure 2. Anatomic structure of LVOT and RVOT. LVOT includes left coronary cusp (LCC), right coronary cusp (RCC), non-coronary cusp (NCC), aortomitral continuity (AMC), and LVOT summit. RVOT includes anterior cusp (AC), left cusp (LC), right cusp (RC), RVOT freewall, and RVOT septal.

Table 1. Summary statistics of demographic data and clinical characteristics of all patients.

Mapping and Ablation Procedure

Anti-arrhythmic drugs were stopped for at least five half-lives before the ablation procedure. A 4.0-mm 7F irrigated ablation catheter (Navistar; Biosense Webster, Diamond Bar, CA, United States) was initially placed in the RVOT for mapping. Both fluoroscopy and electroanatomic mapping systems (CARTO, Biosense Webster, Diamond Bar, CA, United States or NavX Velocity, St. Jude Medical, St. Paul, MN, United States) were used to localize the anatomic position of the ablation catheter within the outflow tract. Intracardiac echocardiography was used to identify specific anatomical structures, such as cusps and papillary muscles. For example, Figure 3 presents the fluoroscopy, 3-D mapping, intracardiac echocardiography, and activation mapping for a case with the origin site in the commissure of the aortic sinus of Valsalva in the LVOT. Using point-by-point mapping, anatomic aggregated maps were created. Activation mapping was performed in all patients during VT and PVC. Pace mapping was also performed with the lowest pacing output (2–20 mA) and pulse width (0.5–10 ms) that captured the ventricular myocardium at the site of the earliest activation. If suitable ablation sites in the RVOT were not located or ablation failed to abolish the arrhythmia, mapping was extended to the LVOT via a retrograde aortic approach. After target sites were located, radiofrequency energy was delivered up to a maximum power of 35 W and a maximum electrode-tissue interface temperature of 43°C. If the VT or PVC disappeared or the frequency of arrhythmias diminished within the first 30 s of ablation, energy delivery was continued for 60 to 180 s. Ablation success was defined as the absence of spontaneous or induced VT or PVC at 30 min after the last energy delivery and confirmed by continuous cardiac telemetry in the subsequent 24 h of inpatient care.

Figure 3. Activation map and fluoroscopic map for VA originating from the commissure of the aortic sinus of Valsalva in the LVOT. (A) Right and left anterior oblique fluoroscopic views show an ablation catheter in the LVOT. Ablation in the LVOT (LCC–RCC commissure) eliminated the PVC within 3 s. (B) The 3-D anatomic representation of the RV endocardium, LV endocardium, and venous system with the ablation catheter positioned at the anterior interventricular vein. (C) The green circle indicates the tip of the ablation catheter in the LCC–RCC commissure. (D) The earliest bipolar and unipolar activation times (–30 ms) are shown.

The Procedure to Assess the Catheter Ablation Outcomes

In the subsequent 24 h of inpatient care after the ablation procedure, every patient received continuous ECG monitoring. After discharge, the patients underwent a follow-up 2 weeks after the ablation and then every month at the cardiology clinic. A 12-lead surface ECG test was obtained on each clinic visit, and 24-h Holter monitoring was also prescribed at 3 and 6 months after the ablation.

ECG Measurement Protocol

Noise Reduction and QRS Sample Selection

With chest and limb leads placed carefully in a standard position, the 12-lead surface ECGs were collected by the EP-WorkMate™ system (Abbott, Saint Paul, MN, United States) at a sampling rate of 2,000 Hz before the ablation procedure. The noise sources affecting the ECG database were power line interference, baseline wandering, and random noise. The wavelet transform yields better time–frequency localization than the windowed Fourier transform and naturally has an advantage in noise reduction applications (Abi-Abdallah et al., 2006). Thus, wavelet techniques were used to remove the noise components mentioned above. The coif5 wavelet (Lahmiri, 2014) and a Stein’s Unbiased Risk Estimator (SURE)-based threshold (Stein, 1981; David and Johnstone, 1995) were implemented in MATLAB to carry out the noise reduction steps. For full details of the techniques and schemes adopted in this work, please refer to the “Data Availability Statement” section. After noise components were removed, three cardiac EPs unanimously selected one QRS complex during the SR and one QRS complex during the PVC or VT to classify RVOT and LVOT.
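
The thresholding step can be illustrated in isolation. The sketch below is not the MATLAB implementation used in this work: it assumes that wavelet detail coefficients have already been computed and rescaled to unit noise variance, and it shows only SURE-based threshold selection followed by soft thresholding. The function names and coefficient values are made up for illustration.

```python
def sure_threshold(coeffs):
    """Threshold minimizing Stein's unbiased risk estimate for soft thresholding.

    Assumes the coefficients are scaled so that the noise has unit variance.
    SURE(t) = n - 2 * #{|x_i| <= t} + sum(min(|x_i|, t)^2).
    """
    n = len(coeffs)
    magnitudes = sorted(abs(c) for c in coeffs)
    best_t, best_risk = 0.0, float("inf")
    for t in magnitudes:  # the minimum is attained at one of the |x_i|
        below = sum(1 for m in magnitudes if m <= t)
        risk = n - 2 * below + sum(min(m, t) ** 2 for m in magnitudes)
        if risk < best_risk:
            best_t, best_risk = t, risk
    return best_t


def soft_threshold(coeffs, t):
    """Shrink every coefficient toward zero by t; zero out those within [-t, t]."""
    shrunk = []
    for c in coeffs:
        if abs(c) > t:
            shrunk.append((abs(c) - t) * (1.0 if c > 0 else -1.0))
        else:
            shrunk.append(0.0)
    return shrunk


# Hypothetical detail coefficients: four small (noise-like), two large (signal-like).
detail = [0.1, -0.2, 0.05, 5.0, -4.0, 0.15]
t = sure_threshold(detail)
denoised = soft_threshold(detail, t)
print(t, denoised)
```

In this toy input the selected threshold removes the four small coefficients and only slightly shrinks the two large ones, which is the behavior that makes the approach attractive for preserving sharp QRS deflections.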

Automated ECG Feature Extraction Method

We applied the following measurements and transformation protocol to automatically extract ECG morphological features and supply them to the machine learning model. We used the R-wave peak points of PVC and SR heartbeat in lead V₆ as reference lines because they are easy to identify in most conditions. At the first step, for one SR heartbeat, 215 data points (0.11 s) before and after the reference line were truncated, and 335 data points (0.17 s) before and after the reference line were cut for one PVC. The above lengths of 430 and 670 were the means of QRS complex duration plus four times the standard deviation of that for SR beat and PVC. They should cover 99.99% of the QRS complexes in any data due to the normality of the QRS duration distribution and the empirical rule. The mean and standard deviation of QRS duration were computed from the samples in this study; the maximum length of QRS complex for SR beat is 405 data points, and the maximum for PVC is 607 data points. Second, for every lead, we selected the first peak/valley (local maximum or minimum) closest to the reference line (shown in Figure 4A) defined in the first step. Third, the three peaks or valleys before the first peak/valley identified in the second step and the four peaks or valleys after the first peak/valley were selected from all peaks and valleys of SR heartbeat and PVC separately. Thus, in every lead, eight peaks and valleys were extracted to represent the SR heartbeat and PVC basic features. The zero-padding method was applied for the cases that did not have eight peaks and valleys around the reference line. The total number of peaks and valleys, eight, is equal to the means of the number of peaks and valleys in all leads plus four times the standard deviation of that for SR beat and PVC, respectively. This automated feature extraction method was verified manually to make sure it captured essential QRS morphological characteristics.
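
The windowing and extremum-selection steps above can be sketched as follows. This is an illustrative reading of the description, not the released implementation: the helper names are hypothetical, extrema are taken to be strict local maxima or minima, and `None` marks a slot whose measurements would be zero-padded.

```python
SAMPLING_RATE_HZ = 2000
HALF_WINDOW = {"SR": 215, "PVC": 335}  # samples kept on each side of the reference line


def cut_window(signal, reference, beat_type):
    """Keep HALF_WINDOW samples before and after the reference (R-peak) index."""
    half = HALF_WINDOW[beat_type]
    start = max(0, reference - half)
    return signal[start:reference + half]


def local_extrema(signal):
    """Indices of strict local maxima (peaks) and minima (valleys)."""
    indices = []
    for i in range(1, len(signal) - 1):
        left, mid, right = signal[i - 1], signal[i], signal[i + 1]
        if (mid > left and mid > right) or (mid < left and mid < right):
            indices.append(i)
    return indices


def select_eight(extrema, reference):
    """Extremum closest to the reference, 3 before it and 4 after it (8 slots)."""
    if not extrema:
        return [None] * 8
    k = min(range(len(extrema)), key=lambda j: abs(extrema[j] - reference))
    return [extrema[j] if 0 <= j < len(extrema) else None
            for j in range(k - 3, k + 5)]


# Window lengths implied by the text: 430 samples (SR) and 670 samples (PVC).
sr_window = cut_window(list(range(1000)), 500, "SR")
pvc_window = cut_window(list(range(1000)), 500, "PVC")

# Toy single-lead beat (assumption) with too few extrema, so padding is needed.
toy_beat = [0, 1, 0, -2, 0, 3, 0, -1, 0]
extrema = local_extrema(toy_beat)
selected = select_eight(extrema, reference=5)
print(len(sr_window), len(pvc_window), extrema, selected)
```

At 2,000 Hz the half-windows correspond to 215 / 2000 = 0.1075 s and 335 / 2000 = 0.1675 s, matching the rounded 0.11 s and 0.17 s quoted in the text.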

Figure 4. Description of automated ECG feature extraction method. The proposed feature extraction method automatically finds peaks presented by P# and valleys presented by V# in panel (A) through 430 data points of one SR beat in 12 leads. Panel (B) presents the numerical measurements that capture essential information of a peak, including location = sample points at P3, prominence = distance from P2 to P3, distance from peak or valley location to left prominence boundary = distance from P1 to P3, distance from peak or valley location to right prominence boundary = distance from P3 to P4, width at half of the prominence = the length of green line, distance from left prominence boundary to right prominence boundary = distance from P1 to P4, amplitude = distance from P2 to zero baseline, contour height = prominence – amplitude. X-axis presents sampling data points, and Y-axis presents voltage.

The numerical measurements (shown in Figure 4B) of each peak and valley include location, prominence, the distance from peak or valley location to left prominence boundary, the distance from peak or valley location to right prominence boundary, width at half of the prominence, the distance from left prominence boundary to right prominence boundary, amplitude, contour height, and a logical variable indicating whether the point is a peak or a valley. The prominence of a peak or a valley measures how much the peak or valley stands out due to its intrinsic height and location relative to neighboring peaks or valleys. Thus, the prominence of a peak was defined as the vertical distance between the peak point and its lowest contour line. Valleys were measured with the same method as peaks.
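
To make the prominence definition concrete, the sketch below computes it with the usual signal-processing convention: from the peak, extend left and right until a higher sample or the end of the signal is reached, take the minimum on each side, and measure the peak height above the higher of the two minima. The function names are hypothetical, and this is an assumption about the convention rather than a copy of the study's code.

```python
def peak_prominence(signal, peak):
    """Height of signal[peak] above its lowest contour line."""
    height = signal[peak]

    # Scan left until a sample higher than the peak, tracking the minimum.
    left_min = height
    i = peak - 1
    while i >= 0 and signal[i] <= height:
        left_min = min(left_min, signal[i])
        i -= 1

    # Scan right in the same way.
    right_min = height
    i = peak + 1
    while i < len(signal) and signal[i] <= height:
        right_min = min(right_min, signal[i])
        i += 1

    # The contour line sits at the higher of the two side minima.
    return height - max(left_min, right_min)


def valley_prominence(signal, valley):
    """Valleys are handled by flipping the sign and reusing the peak logic."""
    return peak_prominence([-v for v in signal], valley)


# Toy trace (assumption): peaks at indices 1, 3, 5 and a valley at index 4.
trace = [0, 2, 1, 4, 0.5, 3, 0]
print(peak_prominence(trace, 3))    # tallest peak
print(peak_prominence(trace, 1))    # small peak next to a taller neighbor
print(valley_prominence(trace, 4))  # valley between the peaks at indices 3 and 5
```

The small peak at index 1 has a lower prominence than its raw height because a taller neighbor limits how far its contour line can drop, which is the "location relative to neighboring peaks" effect described above.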

After the above eight numerical measurements of eight peaks or valleys for both SR beat and PVC at every lead were collected, we generated a feature matrix with the size of 192 (2 beats × 12 leads × 8 peaks or valleys) by 8 (the number of numerical measurements). We transformed the feature matrix using ratios of features in the rows and columns of the matrix to create a new level of features that can reveal vital details of the ECG morphology. Finally, 1,600,800 features were automatically obtained, and their definitions can be found in Supplementary Section B.2. The estimated 95% CI of each numerical measurement in the feature matrix is documented in Supplementary Section B.2 and Supplementary Table 5.

Conventional QRS Morphological Feature Extraction

Although we intended to develop an automated ECG measurement system suited to the machine learning algorithm, the conventional QRS morphological ECG measurement method (metrics of Q-, R-, and S-waves; segments among them; and the ratios among segments) is also studied and compared in this work. The conventional QRS morphological ECG measurement protocol is defined below. SR and VT ECG morphology were measured on the same 12-lead ECG by a customized MATLAB program. During the clinical arrhythmia, the following measurements (presented in Supplementary Section B.3 and Figure 1) were obtained from both one SR beat and one PVC: (1) amplitude of Q-, R-, and S-waves; (2) duration of Q-, R-, and S-waves as well as QRS complex; and (3) R/S amplitude ratio (Kamakura et al., 1998; Ito et al., 2003), transitional zone (Hachiya et al., 2000; Tanner et al., 2005), V₂ transition ratio (Betensky et al., 2011), transitional zone index (Yoshida et al., 2011; Di et al., 2019), R-wave deflection interval (Cheng et al., 2013), V₂S/V₃R index (Yoshida et al., 2014), R-wave duration index (Ouyang et al., 2002), and R/S amplitude index (Ouyang et al., 2002). The T-P segment was considered one of the isoelectric baselines to measure R- and S-wave amplitudes. The QRS duration was measured from the site of the earliest initial deflection from the isoelectric line to the time of the latest activation. The R-wave length was calculated from the site of the earliest initial deflection from the isoelectric line to the time at which the R-wave intersected the isoelectric line. For all cases, QRS measurements were performed on an isolated PVC representative of the clinical VT before the induction of sustained VT and compared with the SR QRS complex. All measurements above were used to compare our approach against methods from 12 prior studies (Kamakura et al., 1998; Zhang et al., 2009; Betensky et al., 2011; Yoshida et al., 2011, 2014; Cheng et al., 2013, 2018; Nakano et al., 2014; Efimova et al., 2015; He et al., 2018; Xie et al., 2018; Di et al., 2019).

In addition to the above conventional ECG measurements, we developed the following protocol to generate features to supply to the machine learning model. Amplitudes of Q-, R-, and S-waves based on the voltage at the onset of Q-wave, the offset of S-wave, the Q-wave, and the S-wave were also input variables in the machine learning model. To give the machine learning model input of a fixed length, we set the measures of Q-, R-, and S-waves to zero whenever the corresponding wave was absent, as in QS morphology in the V₁ lead and RS morphology in the V₅ or V₆ lead. As with the automated feature extraction method, we also transformed the measurements mentioned above into new variables and supplied them to the machine learning model. The total number of features generated by this method is 155,784, and the full definition of the features can be found in Supplementary Section B.3. The 95% CIs of the numerical measurements are listed in Supplementary Section B.3 and Supplementary Table 7.

Statistical Analysis

For the continuous variables of age and ECG measurements, we calculated the mean and standard deviation. For all count variables (total sample size, number of males, number of subjects with frequent PVC, sustained VT, and sublocations under RVOT or LVOT), we calculated frequency counts and percentages. One-sample test for proportions, two-sample t test, two-sample test for proportions, and Fisher’s exact test were adopted to test the difference of the sample numbers, average ages, genders, and the number of frequent PVC or sustained VT between RVOT and LVOT groups. The Cramér–von Mises, Anderson–Darling, and Shapiro–Wilk tests did not reject the data normality hypothesis, and a two-sample t test was used to test for equal means of continuous variables between RVOT and LVOT. Statistical optimization of the gradient boosting tree model was done through iterative training using the extreme gradient booster (XGBoost) package. The following performance measures were formally analyzed: the area under the curve (AUC) of the receiver operating characteristic (ROC) curve, accuracy (ACC), sensitivity (SE), specificity (SP), and F₁-score. A two-sided 95% CI summarizes the sample variability in the estimates. The CI for the AUC was estimated using the Sun and Xu optimization of the DeLong method implemented in the pROC package, whereas CIs for F₁-score, SE, and SP were obtained by the bootstrap method with 20,000 replications. All analyses were done in R version 3.5.3.
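
The bootstrap step can be sketched as below. The analyses in this study were run in R; this Python version is only an illustration and assumes a percentile bootstrap, which the text does not specify. The input of 32 correct predictions out of 33 RVOT test cases is inferred from the reported SE of 96.97% and is an assumption, so the resulting interval is not expected to match the published one.

```python
import random


def bootstrap_ci(outcomes, replications=20000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for a proportion.

    outcomes: list of 0/1 values (1 = correctly classified case).
    Returns (lower, upper) bounds of the two-sided (1 - alpha) interval.
    """
    rng = random.Random(seed)
    n = len(outcomes)
    estimates = []
    for _ in range(replications):
        # Resample n cases with replacement and recompute the proportion.
        resample = [outcomes[rng.randrange(n)] for _ in range(n)]
        estimates.append(sum(resample) / n)
    estimates.sort()
    lower = estimates[int((alpha / 2) * replications)]
    upper = estimates[int((1 - alpha / 2) * replications) - 1]
    return lower, upper


# Inferred sensitivity data (assumption): 32 of 33 RVOT cases predicted as RVOT.
rvot_outcomes = [1] * 32 + [0]
lower, upper = bootstrap_ci(rvot_outcomes, replications=2000)
print(lower, upper)
```

With a test cohort this small, each resample can contain only a handful of distinct proportions, so the interval is coarse and wide; that is consistent with the broad CIs reported for SE and SP.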

Results

We analyzed data from 420 patients who underwent CA of OTVT at the Ningbo First Hospital of Zhejiang University from March 2007 to September 2019. After the CA procedure, two (0.5%) patients developed slight ecchymosis. A total of five (1.2%) patients were excluded from this study because of frequent PVC or VT recurrence in the first 6-month follow-up.

Patient demographic and clinical characteristics data for the RVOT and LVOT groups are shown in Table 1. We compare the distributions of these background characteristics in the RVOT and LVOT groups and list the associated p-values in the table. The RVOT cohort consists of 20.95% left cusp, 17.62% posterior septal, 14.29% anterior septal, 10% anterior cusp, 7.86% free wall, and 7.14% right cusp. The LVOT cohort consists of 10.71% left coronary cusp, 5.71% aortomitral continuity, 2.62% left coronary cusp and right coronary cusp commissure, 1.67% right coronary cusp, and 1.43% summit (shown in Supplementary Section A and Table 1).

The patients were assigned to training, validation, and testing cohorts, consisting of 340 (81%), 38 (9%), and 42 (10%) patients, respectively, using random proportional allocation (demographic summary shown in Table 1). For a fair comparison, the machine learning model was supplied with different features from two feature extraction methods. The performance was assessed using the same training, validation, and testing cohorts.
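
A minimal sketch of such an allocation is shown below. It assumes a plain random shuffle of patient identifiers; if "random proportional allocation" also preserved the RVOT/LVOT ratio within each cohort, a stratified variant would be needed. The function name, seed, and identifiers are hypothetical.

```python
import random


def split_cohorts(patient_ids, n_train=340, n_val=38, n_test=42, seed=2021):
    """Randomly assign patients to training, validation, and testing cohorts."""
    if n_train + n_val + n_test != len(patient_ids):
        raise ValueError("cohort sizes must add up to the number of patients")
    rng = random.Random(seed)
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Hypothetical identifiers for the 420 patients.
train_ids, val_ids, test_ids = split_cohorts(range(420))
print(len(train_ids), len(val_ids), len(test_ids))
```

Fixing the seed makes the split reproducible, which matters here because both feature extraction methods are evaluated on the same three cohorts.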

We used 1,600,800 automatically generated ECG features as machine learning model input. The proposed approach achieved an ACC of 97.62 (87.44–99.99), F₁-score of 98.46 (90–100), SE of 96.97 (82.54–99.89) and SP of 100 (62.97–100) for prediction of RVOT origins (shown in Table 2), and AUC of 98.99 (96.89–100) (presented in Figure 5). Among the 1,600,800 initial automatically generated ECG features, we found a total of 1,352 critically important features with non-zero Shapley additive explanations (SHAP) values (Lundberg and Lee, 2017), showing the importance of their contributions to RVOT and LVOT prediction. The detailed interpretation of SHAP values is introduced in Supplementary Section C.1. We chose and analyzed the top three important features (shown in Figure 6) that have significant classification capability: (1) the ratio between the location of the 5th peak or valley at the V₁ lead of the SR beat and the right boundary of the 5th peak or valley at the V₁ lead of PVC, (2) the ratio between the prominence of the 5th peak or valley at the V₁ lead of PVC and the prominence of the 5th peak or valley at the V₃ lead of PVC, and (3) the difference between the distance of the 5th peak or valley to the left boundary at the V₁ lead of PVC and the distance of the 5th peak or valley to the left boundary at the V₁ lead of the SR beat.

Table 2. Classification performance comparison with 95% CI.

Figure 5. Receiver-operating characteristic curves generated by the optimal machine learning model supplied with the two feature extraction methods. The CI for the AUC was estimated using the Sun and Xu optimization of the DeLong method. Sensitivity and specificity of RVOT prediction are indicated for different thresholds.

When the machine learning model was trained using the 155,784 features extracted from conventional QRS morphological ECG measurements, it attained an ACC of 92.86 (80.35–98.85), F₁-score of 95.38 (86.62–98.86), SE of 93.94 (78.64–98.99) and SP of 88.89 (50.86–99.45) for prediction of RVOT origins (shown in Table 2), and AUC of 95.62 (89.78–100) (presented in Figure 5). Among the initial 155,784 features, we found a total of 1,003 critically important features with non-zero SHAP values (Lundberg and Lee, 2017), showing the importance of their contributions to RVOT and LVOT prediction. The top three important features (shown in Supplementary Section C.1 and Figure 2) that show significant classification capability are (1) the ratio between the R-wave amplitude based on the zero isoelectric baseline at lead III of PVC and the R-wave amplitude based on the offset of S-wave at the V₁ lead of PVC, (2) the ratio between the R-wave amplitude based on R-wave onset at the V₂ lead of the SR beat and the R-wave amplitude based on the zero isoelectric baseline at the V₃ lead of PVC, and (3) the ratio between the R-wave amplitude based on the zero isoelectric baseline at the aVL lead of the SR beat and the R-wave amplitude based on S-wave offset at the V₁ lead of PVC. The statistical summary of conventional QRS morphological measurements for leads V₁ to V₆ is listed in Supplementary Section A and Table 2.
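
As a consistency check, the reported point estimates for both feature sets can be reproduced from simple confusion-matrix counts on the 42 test patients. The counts below are inferred from the published percentages (33 RVOT and 9 LVOT test cases) and are an assumption; they are not copied from Supplementary Table 3. RVOT is the positive class, and the F₁ value is the ordinary binary F₁-score for that class.

```python
def classification_metrics(tp, fn, tn, fp):
    """ACC, SE, SP, and F1 in percent, rounded to two decimals."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return {
        "ACC": round(100 * accuracy, 2),
        "SE": round(100 * sensitivity, 2),
        "SP": round(100 * specificity, 2),
        "F1": round(100 * f1, 2),
    }


# Inferred counts (assumption): tp/fn are RVOT cases, tn/fp are LVOT cases.
automated = classification_metrics(tp=32, fn=1, tn=9, fp=0)
conventional = classification_metrics(tp=31, fn=2, tn=8, fp=1)
print(automated)     # {'ACC': 97.62, 'SE': 96.97, 'SP': 100.0, 'F1': 98.46}
print(conventional)  # {'ACC': 92.86, 'SE': 93.94, 'SP': 88.89, 'F1': 95.38}
```

Under these inferred counts, the difference between the two feature sets on the test cohort amounts to two additional misclassified patients, which helps explain why the confidence intervals of the two methods overlap substantially.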

Finally, the average performance of eight cardiologists who determined RVOT and LVOT using the same ECG samples in this study is presented in Table 2. The classification confusion matrix for these three methods shows correct and incorrect frequency counts in Supplementary Section A and Table 3. Furthermore, we compared our approach against related methods from 12 prior studies (Kamakura et al., 1998; Zhang et al., 2009; Betensky et al., 2011; Yoshida et al., 2011, 2014; Cheng et al., 2013, 2018; Nakano et al., 2014; Efimova et al., 2015; He et al., 2018; Xie et al., 2018; Di et al., 2019). ACC, F₁-score, SE, SP, positive predictive value, negative predictive value, and AUC were used to compare performances and are shown in Table 3.

Table 3. Comparison with prior studies to localize the origins of outflow tract arrhythmia.

Discussion

We designed and implemented a high-accuracy algorithm for LVOT and RVOT origins of OTVT classification, using 1,600,800 ECG measurements automatically extracted from 12-lead ECGs using our proprietary method. The prediction accuracy comparison among our method combined with the XGBoost classifier, a conventional QRS feature extraction method combined with XGBoost, and the performance of human experts (shown in Table 2) shows that the machine learning model with the automated ECG feature extraction method was uniformly superior. We used DeLong’s test (DeLong et al., 1988) to demonstrate that the automated ECG feature extraction method had a significantly higher AUC compared with that attained by the conventional QRS morphological feature extraction approach with a P-value = 0.035. The comparison of our approach against methods from 12 prior studies (Kamakura et al., 1998; Zhang et al., 2009; Betensky et al., 2011; Yoshida et al., 2011, 2014; Cheng et al., 2013, 2018; Nakano et al., 2014; Efimova et al., 2015; He et al., 2018; Xie et al., 2018; Di et al., 2019) shows that our algorithm achieved the highest performance scores (shown in Table 3). Additionally, we evaluated the general classification capability of each criterion proposed by previous studies using the database in this study. Not surprisingly, we observed significant differences between previously reported performances and the reproduced results of these methods because most of the prior studies used the univariate analysis to make predictions (shown in Table 3).

The excellent performance of our machine learning algorithm demands an enormous volume of data and features. Generating such a number of features with the conventional ECG QRS morphological measurements introduced in prior studies is extremely time-consuming and costly because these measurements are obtained manually. Thus, we did not make any assumptions about ECG criteria before training the machine learning algorithm and intended to exhaust all possible relationships among morphological measures of Q-, R-, and S-waves as well as the entire QRS complex. We designed and implemented an automated ECG feature extraction method that can generate 1,600,800 ECG signal characteristics. Not only did these features contain a considerable amount of the classical statistics from 12 prior studies (Kamakura et al., 1998; Zhang et al., 2009; Betensky et al., 2011; Yoshida et al., 2011, 2014; Cheng et al., 2013, 2018; Nakano et al., 2014; Efimova et al., 2015; He et al., 2018; Xie et al., 2018; Di et al., 2019), but they also captured morphological measures not considered by previous studies, such as rsR’ waves and rsr’s’ waves. However, one may be concerned that such a feature extraction method will include the P- and T-waves within SR beats and retrograde P-waves within PVC. The machine learning model captures and analyzes a large amount of information from every beat but filters out all unimportant features based on their contribution to classification accuracy. As the top three important features (shown in Figure 6) selected by the machine learning model show, none of the features representing the waves mentioned above played a role in the prediction. The important morphological features of the rsR’ and rsr’s’ waves may be caused by noise and by the placement of the 12-lead ECG electrodes, which are frequently misplaced because of the mapping patches used during the ablation procedure. In this study, we avoid such a problem because chest and limb leads were placed carefully in a standard position when the 12-lead surface ECGs were collected before the procedure.

Figure 6. Analysis of the top three significant ECG measurements found by the machine learning model with the automated feature extraction method. The univariate analysis (A) shows that features 1 (A.1) and 2 (A.2) have significant capability to separate RVOT and LVOT. The bivariate analysis (B) indicates the classification ability of the pairwise interactions of the top three significant features. In the multivariate analysis (C), smaller values of feature 1 (C.1), feature 2 (C.2), and feature 3 (C.3) generate a higher probability of LVOT, but the magnitude of influence varies across features. The color in panel (C) represents the feature value (red high, blue low).

Moreover, before the machine learning model can be safely applied in practice, cardiologists need an unambiguous interpretation of this advanced tool, such as an explanation of what the crucial criteria are and why they play vital roles. For instance, the machine learning model shows that the smaller the magnitude of the first important feature (shown in Figure 6C.1), the higher the probability of an LVOT origin of OTVT. The first important feature is the ratio between the location of the 5th peak or valley at the V₁ lead of the SR beat and the right boundary of the 5th peak or valley at the V₁ lead of PVC. In our feature extraction system, the 5th peak or valley at the V₁ lead of PVC is an S-wave in most cases. The key ECG lead in the initial site prediction of VT origin is the V₁ lead because it is located nearly orthogonal to the septal plane and, thus, is the best lead to resolve initial right- vs. left-sided activation. When the V₁ lead has a positive QRS (R > s), the VT is considered to have the right bundle branch block (RBBB) configuration. Conversely, a net negative QRS (r < S) defines a left bundle branch block (LBBB) configuration (Haqqani and Marchlinski, 2019). The top three important features (shown in Figure 6) capture exactly these quantities: activation time and the RBBB and LBBB configurations. Such an interpretation means that the machine learning decision process is no longer a black box.

Last but not least, the machine learning model proposed in this study can be readily deployed in EP labs. The pretrained model, source code, and data are available online; see the “Data Availability Statement” section. The model inputs are only two QRS complexes, one from a PVC and one from an SR beat, and they can be easily acquired from a standard 12-lead ECG. The analysis of one patient’s data takes less than a second, given that every step of measurement and computation is performed automatically by the model and the preprocessing approach. The precise prediction of origins can significantly reduce CA duration and reduce the risk of complications.
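
As a rough illustration of the input described above, the following Python sketch defines a container for the two QRS complexes and checks that each one covers the 12 standard leads. The class name, field names, and validation rules are hypothetical and do not reflect the interface of the released model.

```python
from dataclasses import dataclass
from typing import Dict, List

# The 12 standard ECG leads.
STANDARD_LEADS = (
    "I", "II", "III", "aVR", "aVL", "aVF",
    "V1", "V2", "V3", "V4", "V5", "V6",
)


@dataclass(frozen=True)
class OutflowTractInput:
    """Hypothetical container: one PVC and one SR QRS complex per patient.

    Each beat maps a lead name to that lead's sampled voltages.
    """
    pvc_qrs: Dict[str, List[float]]
    sr_qrs: Dict[str, List[float]]

    def __post_init__(self) -> None:
        # Reject beats that lack a standard lead or contain an empty lead.
        for name, beat in (("pvc_qrs", self.pvc_qrs), ("sr_qrs", self.sr_qrs)):
            missing = [lead for lead in STANDARD_LEADS if lead not in beat]
            if missing:
                raise ValueError(f"{name} is missing leads: {missing}")
            if any(len(beat[lead]) == 0 for lead in STANDARD_LEADS):
                raise ValueError(f"{name} contains an empty lead")


if __name__ == "__main__":
    # Placeholder samples, not real ECG data.
    beat = {lead: [0.0, 0.1, 0.0] for lead in STANDARD_LEADS}
    sample = OutflowTractInput(pvc_qrs=beat, sr_qrs=beat)
    print(len(sample.pvc_qrs), len(sample.sr_qrs))
```

Validating the input at construction time is one way to make sure incomplete recordings are rejected before any measurement or prediction step runs.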

Study Limitations

Because the data set did not contain enough well-labeled data to train a machine learning model on subsites, the algorithm currently predicts only LVOT and RVOT rather than their subsites. For instance, the origin of PVC is sometimes in the middle of the septal RVOT/LVOT. Expertly labeled data for three categories (RVOT, LVOT, and septal) would allow the machine learning model to predict the origins with higher accuracy. Although this study includes patients with a comprehensive range of anatomical sites under RVOT and LVOT, the performance of the method could improve with more cases of RCC and summit origin under LVOT. Moreover, some conditions, such as cardiomyopathies, reentrant VT, coronary heart disease, and prior structural and congenital abnormalities, are underrepresented or absent from the study. Thus, the algorithm may be limited when applied in such scenarios.

Conclusion

Considering its prediction performance, its capacity to extract vital information from the 12-lead ECG, and its robustness in application, our results show that machine learning technology can provide promising and reliable decision support to guide successful CA treatment of ventricular arrhythmia.

Data Availability Statement

The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: https://doi.org/10.6084/m9.figshare.c.4668086.v2.

Author Contributions

JZ, GF, XD, BH, HC, and CR processed the data for analysis. JZ, HC, GF, XD, IA, and CR performed the statistical analysis. All authors contributed to the study design, data interpretation, and writing of the report.

Funding

This work was supported by the 2020 Natural Science Foundation of Zhejiang Province (ID H0205-3202410).

Conflict of Interest

HC, from the Department of Cardiology, Ningbo First Hospital of Zhejiang University, has served as a consultant for Biosense Webster, Boston Scientific, and Abbott. HY was employed by the company Zhejiang Cachet Jetboom Medical Devices Co., Ltd.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

We are grateful for the support from the arrhythmia center of Ningbo First Hospital of Zhejiang University.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fphys.2021.641066/full#supplementary-material

References

Abi-Abdallah, D., Chauvet, E., Bouchet-Fakri, L., Bataillard, A., Briguet, A., and Fokapu, O. (2006). Reference signal extraction from corrupted ECG using wavelet decomposition for MRI sequence triggering: application to small animals. Biomed. Eng. Online 5:11.

Betensky, B. P., Park, R. E., Marchlinski, F. E., Hutchinson, M. D., Garcia, F. C., Dixit, S., et al. (2011). The V(2) transition ratio: a new electrocardiographic criterion for distinguishing left from right ventricular outflow tract tachycardia origin. J. Am. Coll. Cardiol. 57, 2255–2262.

Bunch, T. J., and Day, J. D. (2006). Right meets left: a common mechanism underlying right and left ventricular outflow tract tachycardias. J. Cardiovasc. Electrophysiol. 17, 1059–1061. doi: 10.1111/j.1540-8167.2006.00577.x

Cheng, D., Ju, W., Zhu, L., Chen, K., Zhang, F., Chen, H., et al. (2018). V3R/V7 index: a novel electrocardiographic criterion for differentiating left from right ventricular outflow tract arrhythmias origins. Circ. Arrhythm. Electrophysiol. 11:e006243.

Cheng, Z., Cheng, K., Deng, H., Chen, T., Gao, P., Zhu, K., et al. (2013). The R-wave deflection interval in lead V3 combining with R-wave amplitude index in lead V1: a new surface ECG algorithm for distinguishing left from right ventricular outflow tract tachycardia origin in patients with transitional lead at V3. Int. J. Cardiol. 168, 1342–1348. doi: 10.1016/j.ijcard.2012.12.013

Cronin, E. M., Bogun, F. M., Maury, P., Peichl, P., Chen, M., Namboodiri, N., et al. (2019). HRS/EHRA/APHRS/LAHRS expert consensus statement on catheter ablation of ventricular arrhythmias. Europace 21, 1143–1144.

Donoho, D. L., and Johnstone, I. M. (1995). Adapting to unknown smoothness via wavelet shrinkage. J. Am. Stat. Assoc. 90, 1200–1224. doi: 10.1080/01621459.1995.10476626

DeLong, E. R., DeLong, D. M., and Clarke-Pearson, D. L. (1988). Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 44, 837–845. doi: 10.2307/2531595

Di, C., Wan, Z., Tse, G., Letsas, K. P., Liu, T., Efremidis, M., et al. (2019). The V1-V3 transition index as a novel electrocardiographic criterion for differentiating left from right ventricular outflow tract ventricular arrhythmias. J. Interv. Card. Electrophysiol. 56, 37–43. doi: 10.1007/s10840-019-00612-0

Dukes, J. W., Dewland, T. A., Vittinghoff, E., Mandyam, M. C., Heckbert, S. R., Siscovick, D. S., et al. (2015). Ventricular ectopy as a predictor of heart failure and death. J. Am. Coll. Cardiol. 66, 101–109. doi: 10.1016/j.jacc.2015.04.062

Efimova, E., Dinov, B., Acou, W. J., Schirripa, V., Kornej, J., Kosiuk, J., et al. (2015). Differentiating the origin of outflow tract ventricular arrhythmia using a simple, novel approach. Heart Rhythm 12, 1534–1540. doi: 10.1016/j.hrthm.2015.04.004

Enriquez, A., Baranchuk, A., Briceno, D., Saenz, L., and Garcia, F. (2019). How to use the 12-lead ECG to predict the site of origin of idiopathic ventricular arrhythmias. Heart Rhythm 16, 1538–1544. doi: 10.1016/j.hrthm.2019.04.002

Hachiya, H., Aonuma, K., Yamauchi, Y., Harada, T., Igawa, M., Nogami, A., et al. (2000). Electrocardiographic characteristics of left ventricular outflow tract tachycardia. Pacing Clin. Electrophysiol. 23(11 Pt 2), 1930–1934. doi: 10.1111/j.1540-8159.2000.tb07055.x

Haqqani, H. M., and Marchlinski, F. E. (2019). The surface electrocardiograph in ventricular arrhythmias: lessons in localisation. Heart Lung Circ. 28, 39–48. doi: 10.1016/j.hlc.2018.08.025

Haqqani, H. M., Morton, J. B., and Kalman, J. M. (2009). Using the 12-lead ECG to localize the origin of atrial and ventricular tachycardias: part 2–ventricular tachycardia. J. Cardiovasc. Electrophysiol. 20, 825–832. doi: 10.1111/j.1540-8167.2009.01462.x

He, Z., Liu, M., Yu, M., Lu, N., Li, J., Xu, T., et al. (2018). An electrocardiographic diagnostic model for differentiating left from right ventricular outflow tract tachycardia origin. J. Cardiovasc. Electrophysiol. 29, 908–915. doi: 10.1111/jce.13493

Ito, S., Tada, H., Naito, S., Kurosaki, K., Ueda, M., Hoshizaki, H., et al. (2003). Development and validation of an ECG algorithm for identifying the optimal ablation site for idiopathic ventricular outflow tract tachycardia. J. Cardiovasc. Electrophysiol. 14, 1280–1286. doi: 10.1046/j.1540-8167.2003.03211.x

Joshi, S., and Wilber, D. J. (2005). Ablation of idiopathic right ventricular outflow tract tachycardia: current perspectives. J. Cardiovasc. Electrophysiol. 16(Suppl. 1), S52–S58.

Kamakura, S., Shimizu, W., Matsuo, K., Taguchi, A., Suyama, K., Kurita, T., et al. (1998). Localization of optimal ablation site of idiopathic ventricular tachycardia from right and left ventricular outflow tract by body surface ECG. Circulation 98, 1525–1533. doi: 10.1161/01.cir.98.15.1525

Lahmiri, S. (2014). Comparative study of ECG signal denoising by wavelet thresholding in empirical and variational mode decomposition domains. Healthc Technol. Lett. 1, 104–109. doi: 10.1049/htl.2014.0073

Latchamsetty, R., Yokokawa, M., Morady, F., Kim, H. M., Mathew, S., Tilz, R., et al. (2015). Multicenter outcomes for catheter ablation of idiopathic premature ventricular complexes. JACC Clin. Electrophysiol. 1, 116–123.

Lundberg, S., and Lee, S.-I. (2017). A unified approach to interpreting model predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems. New York, NY: ACM.

Nakano, M., Ueda, M., Ishimura, M., Kajiyama, T., Hashiguchi, N., Kanaeda, T., et al. (2014). Estimation of the origin of ventricular outflow tract arrhythmia using synthesized right-sided chest leads. Europace 16, 1373–1378. doi: 10.1093/europace/eut355

Ouyang, F., Fotuhi, P., Ho, S. Y., Hebe, J., Volkmer, M., Goya, M., et al. (2002). Repetitive monomorphic ventricular tachycardia originating from the aortic sinus cusp: electrocardiographic characterization for guiding catheter ablation. J. Am. Coll. Cardiol. 39, 500–508.

Stein, C. M. (1981). Estimation of the mean of a multivariate normal distribution. Ann. Stat. 9, 1135–1151. doi: 10.1214/aos/1176345632

Tanner, H., Hindricks, G., Schirdewahn, P., Kobza, R., Dorszewski, A., Piorkowski, C., et al. (2005). Outflow tract tachycardia with R/S transition in lead V3: six different anatomic approaches for successful ablation. J. Am. Coll. Cardiol. 45, 418–423. doi: 10.1016/j.jacc.2004.10.037

Xie, S., Kubala, M., Liang, J. J., Hayashi, T., Park, J., Padros, I. L., et al. (2018). Lead I R-wave amplitude to differentiate idiopathic ventricular arrhythmias with left bundle branch block right inferior axis originating from the left versus right ventricular outflow tract. J. Cardiovasc. Electrophysiol. 29, 1515–1522. doi: 10.1111/jce.13747

Yamada, T. (2019). Twelve-lead electrocardiographic localization of idiopathic premature ventricular contraction origins. J. Cardiovasc. Electrophysiol. 30, 2603–2617. doi: 10.1111/jce.14152

Yoshida, N., Inden, Y., Uchikawa, T., Kamiya, H., Kitamura, K., Shimano, M., et al. (2011). Novel transitional zone index allows more accurate differentiation between idiopathic right ventricular outflow tract and aortic sinus cusp ventricular arrhythmias. Heart Rhythm 8, 349–356. doi: 10.1016/j.hrthm.2010.11.023

Yoshida, N., Yamada, T., McElderry, H. T., Inden, Y., Shimano, M., Murohara, T., et al. (2014). A novel electrocardiographic criterion for differentiating a left from right ventricular outflow tract tachycardia origin: the V2S/V3R index. J. Cardiovasc. Electrophysiol. 25, 747–753. doi: 10.1111/jce.12392

Zhang, F., Chen, M., Yang, B., Ju, W., Chen, H., Yu, J., et al. (2009). Electrocardiographic algorithm to identify the optimal target ablation site for idiopathic right ventricular outflow tract ventricular premature contraction. Europace 11, 1214–1220. doi: 10.1093/europace/eup231

Zheng, J., Fu, G., Anderson, K., Chu, H., and Rakovski, C. (2020). A 12-Lead ECG database to identify origins of idiopathic ventricular arrhythmia containing 334 patients. Sci. Data 7:98.

Keywords: outflow tract ventricular tachycardia, catheter ablation, electrocardiography, classification, artificial intelligence algorithm

Citation: Zheng J, Fu G, Abudayyeh I, Yacoub M, Chang A, Feaster WW, Ehwerhemuepha L, El-Askary H, Du X, He B, Feng M, Yu Y, Wang B, Liu J, Yao H, Chu H and Rakovski C (2021) A High-Precision Machine Learning Algorithm to Classify Left and Right Outflow Tract Ventricular Tachycardia. Front. Physiol. 12:641066. doi: 10.3389/fphys.2021.641066

Received: 13 December 2020; Accepted: 18 January 2021;
Published: 25 February 2021.

Edited by:

Xiaopeng Zhao, The University of Tennessee, Knoxville, United States

Reviewed by:

Peter Van Dam, Radboud University Nijmegen, Netherlands
Marianna Meo, Institut de Rythmologie et Modélisation Cardiaque (IHU-Liryc), France

Copyright © 2021 Zheng, Fu, Abudayyeh, Yacoub, Chang, Feaster, Ehwerhemuepha, El-Askary, Du, He, Feng, Yu, Wang, Liu, Yao, Chu and Rakovski. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Huimin Chu, mark.chuhuimin@gmail.com

†These authors have contributed equally to this work

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.