P-Wave Area Predicts New Onset Atrial Fibrillation in Mitral Stenosis: A Machine Learning Approach

Introduction: Mitral stenosis is associated with an atrial cardiomyopathic process, leading to abnormal atrial electrophysiology, manifesting as prolonged P-wave duration (PWD), larger P-wave area, increased P-wave dispersion (PWDmax—PWDmin), and/or higher P-wave terminal force on lead V1 (PTFV1) on the electrocardiogram. Methods: This was a single-center retrospective study of Chinese patients, diagnosed with mitral stenosis in sinus rhythm at baseline, between November 2009 and October 2016. Automated ECG measurements from raw data were determined. The primary outcome was incident atrial fibrillation (AF). Results: A total 59 mitral stenosis patients were included (age 59 [54–65] years, 13 (22%) males). New onset AF was observed in 27 patients. Age (odds ratio [OR]: 1.08 [1.01–1.16], P = 0.017), systolic blood pressure (OR: 1.03 [1.00–1.07]; P = 0.046), mean P-wave area in V3 (odds ratio: 3.97 [1.32–11.96], P = 0.014) were significant predictors of incident AF. On multivariate analysis, age (OR: 1.08 [1.00–1.16], P = 0.037) and P-wave area in V3 (OR: 3.64 [1.10–12.00], P = 0.034) remained significant predictors of AF. Receiver-operating characteristic (ROC) analysis showed that the optimum cut-off for P-wave area in V3 was 1.45 Ashman units (area under the curve: 0.65) for classification of new onset AF. A decision tree learning model with individual and non-linear interaction variables with age achieved the best performance for outcome prediction (accuracy = 0.84, precision = 0.84, recall = 0.83, F-measure = 0.84). Conclusion: Atrial electrophysiological alterations in mitral stenosis can detected on the electrocardiogram. Age, systolic blood pressure, and P-wave area in V3 predicted new onset AF. A decision tree learning model significantly improved outcome prediction.


INTRODUCTION
Inter-atrial block (IAB) results from impaired conduction of action potentials along Bachmann's bundle that connects the right and left atria (Tse et al., 2016). It is characterized electrocardiographically by a prolonged P-wave duration of >120 ms. This condition results in delayed and asynchronous activation of the left atrium (Agarwal et al., 2003;Budeus et al., 2005;Caldwell et al., 2014). IAB has been associated with higher incidence of stroke as well as cardiovascular and allcause mortality (Ariyarajah et al., 2007;Magnani et al., 2011). However, it is unclear any benefit derived from early initiation of anti-coagulation in IAB before the development of atrial fibrillation (AF), and the risk may differ depending on the severity of IAB and the presence of other cardio-metabolic co-morbidities. Two other measures have been used to assess atrial electrophysiological remodeling. Firstly, P-wave dispersion, defined as the difference between maximum and minimum P-wave duration (PWD), is a measure of heterogeneous and discontinuous atrial activation. Secondly, P-wave terminal force in V1 (PTFV1) is a marker of left atrial disease independently of structural or pressure changes in the left atrium (Morris and Thompson, 1964) and has been shown to be a predictor of future incident AF (Martin Garcia et al., 2012). Prolonged PWDs, measured from amplified and digitized ECG signals obtained in sinus rhythm, predicted AF recurrence after pulmonary isolation procedures (Jadidi et al., 2018). Moreover, the area of the P-wave initial portion was independently associated with the development of AF in patients with left atrial overload (Ishida et al., 2010). AF complexity parameters derived from the ECG also predicted long-term outcomes following catheter ablation (Lankveld et al., 2016).
Mitral stenosis is a valvular disease frequently seen in parts of Asia, causing significant morbidity and mortality. In this condition, the most common arrhythmia encountered is AF, but there are limited data on electrocardiographic changes that reflect ongoing atrial cardiomyopathic process that precedes the development of fibrillation. Mitral stenosis patients have longer PWDs and higher P-wave dispersion than control subjects without mitral stenosis (Guntekin et al., 2008). Another study confirmed this observation and further demonstrated a significant correlation between maximum PWDs and left atrial size, transmitral valve gradient, and a negative correlation with mitral valve area (Rezaian et al., 2007). PTFV1 is higher in mitral stenosis and is a predictor of disease severity (Yuce et al., 2011). However, there are limited published data regarding the incidence of IAB, the relative contributions of partial and advanced IAB, and whether these indices predict incident AF in mitral stenosis.

METHODS
This study received approval from The Joint Chinese University of Hong Kong-New Territories East Cluster Clinical Research Ethics Committee. Clinical and electrocardiographic details of a cohort of Chinese patients referred to our center, which is a tertiary referral center and teaching hospital, between November 2009 and October 2016, for echocardiography, were analyzed retrospectively. Inclusion criteria were mitral stenosis patients with raw ECG data files available for analysis.

Definitions, Data Extraction, Electrocardiographic Measurements, and Primary Outcome
The following clinical details were obtained from the patients: age, gender, blood pressure, smoking status, diabetes mellitus, hypertension, hypercholesterolemia, and ischemic heart disease. For electrocardiographic parameters, data were extracted from patients who had ECGs that did not show atrial fibrillation (AF). The following parameters were manually measured by two investigators from the ECGs showing sinus rhythm. The following P-wave variables were determined from ECGs of patients in sinus rhythm. Automated measurements from raw ECG data were extracted from the Philips ECGVue program (Standard Edition). The ECG waveform data is captured at a sample rate of 4 MHz and reduced to 500 samples per second with 5 µV resolution. The mean, minimum, maximum, and standard deviation of different P-wave variables were calculated from values from all 12 leads (Figure 1). P-wave dispersion was defined as the maximum difference in PWD. P-wave terminal force in V1 (PTFV1) was defined as the area subtended by the terminal negative component of a biphasic P-wave in lead V1, with the area calculated by multiplication of the duration and depth of the waveform (He et al., 2017). The primary endpoint of this study was new onset persistent or permanent atrial fibrillation (AF). Paroxysmal AF at baseline or detected followup was excluded. The endpoint was met if AF was detected in at least two ECGs on follow-up 1 year apart in an absence of sinus rhythm detection in the intervening period.

Statistical Analysis, Non-linear Variables, and Decision Tree Learning
Data were expressed as median [lower quartile to upper quartile]. Categorical data were analyzed by Fisher's exact test. Differences between study groups were tested using Kruskal-Wallis ANOVA. P < 0.05 was considered statistically significant. Non-linear interactions (e.g., interactions formed by some important individual variable) play an important role in predicting the outcome. The consideration of non-linear interactions overcomes linear model's assumption that the dependent and independent variables are linearly related. In this study, the logarithmic form of the multiplication nonlinear items formed by age (important individual variable) and other continuous variables were considered, i.e., log(age * x i ), where x i denotes the ith continuous variable. The adoption of logarithmic transformation is to obtain equivalent inference on variable-outcome associations while avoid the bias due to exponentiation on some squared and cubed variables. The non-linear variables considered were log(age * systolic blood pressure), log(age * diastolic blood pressure), log(age * diabetes mellitus), log(age * hypercholesterolaemia, log(age * ischaemic heart disease), log(age * left atrial diameter), log(age * mitral valve area), log(age * mitral valve gradient), log(age * mitral FIGURE 1 | Normal inter-atrial conduction, partial and advanced inter-atrial block and left atrial enlargement. P-wave terminal force in V1 (PTFV1), P-wave duration (PWD), and P-wave area. Adapted from (He et al., 2017) with permission. stenosis severity), log(age * P-wave area in v3). For instance, for a patient whose age is 58 years old and has systolic blood pressure 110 mmHg, we generate the value of nonlinear variable log(age * systolic blood pressure) by calculating log(58 * 110) = 3.8048. The values for the other non-linear variables are obtained in a similar way. Then these non-linear variables and the individual variables are together used as input in the risk prediction model. The reproduction of the non-linear variables can be obtained since there exists one-one mapping between the non-linear variables and the individual variable pairs that form them.
Decision tree learning uses a decision tree module (as a predictive model) to determine the outcome (or target value, represented by leaves) of a sample based on the associated observations (represented by branches) for model classification and prediction. The principles of decision tree are illustrated in Figure 2. In this study, a decision tree learning approach (classification and regression tree, CART Rutkowski et al., 2014 was used to predict new onset AF. Specifically, the nonlinear variables (including log(age * systolic blood pressure), log(age * diastolic blood pressure), log(age * diabetes mellitus), log(age * hypercholesterolaemia), log(age * ischaemic heart disease), log(age * left atrial diameter), log(age * mitral valve area), log(age * mitral valve gradient), log(age * mitral stenosis severity), log(age * P-wave area in v3) together with individual variables (including sex, age, systolic blood pressure, diastolic blood pressure, diabetes mellitus, hypercholesterolaemia, ischaemic heart disease, left atrial diameter, mitral valve area, mitral valve gradient, mitral stenosis severity, P-wave area in v3) were used as input to the DTL model, in order to predict the new onset of AF outcome in mitral stenosis. In the DTL model, leaves represent class label of new onset AF and branches represent feature conjunctions (both of non-linear variables and individual variables) that lead to new onset AF. DTL uses Gini index to construct a decision tree, which is calculated by the formula representing the probability of a particular patient being wrongly classified when it is randomly chosen and c denotes the number of class (c = 2 for new onset AF classification in this study). The Gini index is used to create split points by considering a binary split for each variable in DTL. It varies between 0 and 1, where 0 denotes that all elements belong to a certain class or if there exists only one class, and 1 denotes that the elements are randomly distributed across various classes. A Gini Index of 0.5 denotes equally distributed elements into some classes. Here we present an example to show the construction process of DTL with Gini index. Firstly, the frequency table of new onset AF outcome (27 with "yes" and 32 with "no") was considered: there are 19 of 46 females with new onset AF, while eight of 13 males with new onset AF. The Gini index for sex as a decision node to split the tree was calculated: Gini index (new onset AF, male) = 2 * (8/13) * (1-8/13) =0.4734, Gini index (new onset AF, female) = 2 * (19/46) * (1-19/46) = 0.4849. In addition, the frequency table of ischaemic heart disease was considered: eight of 12 with ischaemic heart disease had new onset AF, and 19 of 47 without ischaemic heart disease had new onset AF. The corresponding Gini index was calculated by Gini(new onset AF, ischaemic heart disease as "yes") = 2 * (8/12) * (1-8/12)=0.4444, Gini index(new onset AF, ischaemic heart disease as "no")=2 * (19/47) * (1-19/47) = 0.4817. In the same manner, the Gini indices of all the variables (the Information Gain of a continuous variable, such as age, was discretized first) was calculated. DTL splits the dataset on different variables by referring to the Gini indices, and the variable with the lowest Gini index value was selected as the decision node. DTL divided the dataset by its branches and repeat the same process on every branch. A branch with zero Gini index becomes a leaf node, while a branch with Gini index larger than zero needs further splitting. DTL is run recursively in a similar way on the non-leaf branches, until all data were classified.

RESULTS
Electrocardiograms of patients with mitral stenosis (n = 155) were screened. In this cohort, 96 had atrial fibrillation on admission to our hospital without prior ECGs in sinus rhythm available for analysis. The remaining 59 patients were in sinus rhythm and were analyzed further. A graphical representation of the different P-wave indices is shown in Figure 1. The baseline characteristics of this cohort are shown in Supplementary Table 3). On multivariate analysis, age (OR: 1.08 [1.00-1.16], P = 0.037) and P-wave area in V3 (OR: 3.64 [1.10-12.00], P = 0.034) remained significant predictors of AF. Receiver-operating characteristic (ROC) analysis showed that the optimum cut-off for P-wave area in V3 was 1.45 Ashman units with an area under the curve of 0.65 for classifying new onset AF.
A decision tree learning (DTL) model was then employed to generate the decision rules based on only individual variables as shown in Figure 3A, while the decision rules generated by DTL model based on both individual and non-linear interaction items are shown in Figure 3B. In each model, 80% of the sample (n = 47) were randomly selected and the remaining 20% (n = 12) were used for validation. For the decision tree without interacting variables, P-wave area in V3 is the first predictor generated with a Gini index of 0.496 ( Figure 3A). In addition, the interaction between age and left atrial diameter is the first variable in the generated decision rule with a Gini index of 0.495, while Pwave area in V3 as the second most important variable with Gini index 0.278 ( Figure 3B). Both decision trees generated by machine learning can be used as an efficient tool for accurate risk stratification in mitral stenosis. The decision rule incorporating interactions between variables is more accurate.
To determine the out-of-sample prediction ability of the model, five-fold cross-validation was adopted. The evaluation metrics evaluated are accuracy, precision, recall, and F-measure. Comparisons of the prediction performance of logistic regression and DTL are shown in Supplementary Table 4. DTL with individual and non-linear variables outperforms the logistic regression for predicting incident AF in mitral stenosis. The non-linear variable formed by P-wave area in V3 and age was significantly predictive for classifying new onset AF outcome in mitral stenosis. We can also observe that several non-linear variables are more predictive than individual variables, indicating the importance of considering the non-linear patterns in clinical characteristics to improve the performance of predicting new onset AF.

DISCUSSION
The major findings of this study are that (i) a high proportion of patients with mitral stenosis had IAB, (ii) age and P-wave area in V3 predicted new onset AF, and (iii) a stepwise improvement in the predictive performance after incorporation of interaction variables and machine learning using a decision tree approach.
Prior studies have investigated alterations in P-wave morphologies and indices in non-valvular atrial fibrillation. Whilst some reports have dealt with the relationship between mitral stenosis and P-wave indices, few studies have examined whether these indices can predict new onset AF. Atrial electrophysiology in mitral stenosis is abnormal due to a complex process of electrophysiological remodeling. IAB is the conduction delay along the Bachmann's bundle between left and right atria, diagnosed by its characteristic ECG pattern (Agarwal et al., 2003;Bayes de Luna et al., 2012). Partial IAB and advanced IAB are defined as PWD ≥ 120 ms in the presence and absence of biphasic P-waves in the inferior leads. The association between IAB and supraventricular tachyarrhythmias, especially AF, is known as Bayés syndrome. Mitral stenosis is a major risk factor for AF through atrial dilatation with progressive structural remodeling and interstitial fibrosis, predisposing to re-entrant activity within the atrium (Markides and Schilling, 2003;O'neal et al., 2016). In addition to IAB, abnormal atrial electrophysiology can be detected by alterations in P-wave morphology on the electrocardiogram (ECG), including Pwave dispersion and abnormal P-wave terminal force in V1 (PTFV1) (Yamada et al., 1999;Dogan et al., 2004;Wong et al., 2004;Koide et al., 2008;Tsioufis et al., 2010;Yoshizawa et al., 2014). A previous study involving 30 mitral stenosis patients found that maximum P-wave duration and P-wave dispersion were significantly higher than patients without mitral stenosis (Guntekin et al., 2008). In the same previous study, baseline maximum and minimum PWDs and P-wave dispersion all correlated with mitral valve area and mean mitral gradient (Guntekin et al., 2008). Another study involving a prospective follow-up of 116 mitral stenosis patients similarly reported these associations, and additionally correlated these ECG parameters with increased pulmonary artery pressure, and a poor NYHA class (Yuce et al., 2011). Moreover, the extent to which atrial electrical abnormalities can predict incident AF remains less explored in these previous studies. Whilst previous studies have demonstrated the predictive value of various P-wave indices for incident AF or progression from paroxysmal to persistent AF (Koide et al., 2008), to date there are no studies specifically on their values in mitral stenosis. In our cohort, we utilized automated ECG measurements and found that P-wave area significantly predicted new onset AF.
In clinical practice, it is often useful to use cut-off values to identify categorize whether a patient is at high or low risk of adverse events. For example, a previous study identified that PWDs longer than 150 ms predicted AF recurrence after pulmonary isolation procedures (Jadidi et al., 2018). It should be noted that in the prior study, the ECG signals were amplified to 0.2-0.25 mV/cm before manual measurements were made, and the methodology therefore differs from that used here. In a cohort of patients with left atrial overload, area of the initial portion of the P-wave ≥ 65 µV.ms was associated with a four-fold increased risk of developing future AF (Ishida et al., 2010). In our study, the optimum cut-off for P-wave area in V3 was 1.45 Ashman units, with an area under the ROC curve of 0.65. Our novelty lies with the demonstrations that P-wave area can be used to predict incident AF in mitral stenosis and the use of machine learning approaches significantly improve outcome classification.

LIMITATIONS
Some limitations should be recognized. Firstly, this was a single center study with a small sample size and retrospective analyses. Some inherent bias could affect the results. However, electronic health records in Hong Kong are comprehensive with accessible information across different hospitals within the public system and multiple follow-ups per year in the outpatient and inpatient settings. Secondly, the effects of drugs were not explored in the current study. Thirdly, left atrial diameter was the only available metric on atrial dimensions, as this was the only variable described in the echocardiography reports. Future work could explore (i) whether left atrial area, volume or volume index can predict incident AF and (ii) the potential effects on atrial reverse remodeling by drugs. Our main conclusion that Pwave area predicts incident AF needs to be validated by larger prospective studies.

CONCLUSION
Atrial electrophysiological alterations in mitral stenosis can detected on the electrocardiogram. Age, systolic blood pressure, and mean PWD predicted new onset AF. A decision tree learning model significantly improve outcome prediction.

DATA AVAILABILITY STATEMENT
All datasets generated for this study are included in the article/Supplementary Material.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by The Joint Chinese University of Hong Kong-New Territories East Cluster Clinical Research Ethics Committee. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

AUTHOR CONTRIBUTIONS
GT: conception of study and literature search, figures, study design, data collection, data analysis, data contribution, manuscript drafting, and critical revision of manuscript. IL: literature search, data analysis, data contribution, manuscript drafting, and critical revision of manuscript. JZ and KLi: data analysis, manuscript drafting, and critical revision of manuscript. SL, YL, KLe, and TL: data interpretation, and critical revision of manuscript. AB: literature search, data analysis, critical revision of manuscript, and study supervision. QZ: literature search, figures, study design, data analysis, manuscript drafting, critical revision of manuscript, and study supervision.