Machine Learning for Electrocardiographic Features to Identify Left Atrial Enlargement in Young Adults: CHIEF Heart Study

Background: Left atrial enlargement (LAE) is associated with cardiovascular events. Machine learning for ECG parameters to predict LAE has been performed in middle- and old-aged individuals but has not been performed in young adults.
Methods: In a sample of 2,206 male adults aged 17–43 years, three machine learning classifiers, multilayer perceptron (MLP), logistic regression (LR), and support vector machine (SVM), trained on 26 ECG features with or without 6 biological features (age, body height, body weight, waist circumference, and systolic and diastolic blood pressure), were compared with the P wave duration of lead II, the traditional ECG criterion for LAE. LAE was defined as an echocardiographic left atrial dimension > 4 cm in the parasternal long axis window.
Results: The greatest area under the receiver operating characteristic curve was achieved by the SVM for ECG features only (77.87%) and by the MLP for all biological and ECG features (81.01%), both of which were superior to the P wave duration (62.19%). With the sensitivity fixed at 70–75%, the specificity of the SVM for ECG features only reached 72.4%, and that of the MLP for all biological and ECG features reached 81.1%, both of which were higher than the 48.8% obtained with the P wave duration.
Conclusions: This study suggests that machine learning for ECG and biological features is a reliable method to predict LAE in young adults. The proposed MLP, LR, and SVM methods provide early detection of LAE in young adults and may help guide preventive action against cardiovascular diseases.


INTRODUCTION
Machine learning, an artificial intelligence (AI)-based computational statistical approach, has been broadly applied in clinical medicine to assess disease risk and support diagnosis (1–12). For instance, Lin et al. (12) trained a support vector machine (SVM) classifier on ECG features to identify echocardiographic left ventricular hypertrophy, and the SVM outperformed the conventional ECG voltage criteria. The impact of machine learning in medicine is growing rapidly, and it has become a cost-effective and practical tool for physicians.
Left atrial enlargement (LAE) is related to high blood volume status (e.g., in mitral regurgitation and in elite endurance athletes) (13,14) and to elevated left ventricular (LV) diastolic pressure (e.g., in obesity, hypertension, and great LV mass) (15–17). LAE is a precursor of left atrial dysfunction and has been associated with incident atrial fibrillation, ischemic stroke, and cardiovascular events in middle- and old-aged individuals (18–21). The prevalence of LAE increases with age (18), and in young adults LAE is usually observed in those undergoing rigorous physical training, particularly those with an accumulated lifetime training of >3,600 h (22–24). The Coronary Artery Risk Development in Young Adults (CARDIA) study also revealed that the presence of LAE at a young age is a risk factor for incident cardiovascular events occurring in midlife (25). Therefore, early detection of LAE is vital to prevent the development of cardiovascular diseases and related sequelae.
A P wave duration in lead II ≥ 120 milliseconds is currently the most commonly used ECG criterion in the general population to screen for echocardiographic LAE, which is mainly defined as a diastolic left atrial dimension >4 cm in the parasternal long axis window (26). The P wave duration is also a predictor of atrial fibrillation, cardiovascular death, and early vascular aging (27,28). Over the past 5 years, only a few hospital-based studies have used machine learning for ECG features to detect LAE, and the area under the curve (AUC) of the receiver operating characteristic (ROC) curve in these studies varied widely, from 0.62 to 0.98 (29–31). However, no previous reports have been performed in the general population. The aim of this study was to investigate the performance of machine learning for ECG features to identify LAE in a military cohort of young male adults.
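
To make the single-threshold screening rule and the AUC metric mentioned above concrete, the following is a minimal sketch in plain Python. The function names and the toy P wave durations and LAE labels are hypothetical and are not taken from the CHIEF Heart Study data; the sketch only illustrates how sensitivity, specificity, and AUC could be computed for a cutoff-based criterion.

```python
from typing import List, Tuple


def screen_by_p_wave(durations_ms: List[float], cutoff_ms: float = 120.0) -> List[int]:
    """Return 1 (screen-positive) when the P wave duration is >= cutoff_ms, else 0."""
    return [1 if d >= cutoff_ms else 0 for d in durations_ms]


def sensitivity_specificity(y_true: List[int], y_pred: List[int]) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: List[int], scores: List[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: P wave duration (ms) and echocardiographic LAE label (1 = LAE).
durations = [98.0, 104.0, 110.0, 118.0, 122.0, 126.0, 130.0, 112.0]
lae = [0, 0, 0, 1, 0, 1, 1, 0]

predicted = screen_by_p_wave(durations)            # conventional >= 120 ms rule
sens, spec = sensitivity_specificity(lae, predicted)
auc = roc_auc(lae, durations)                      # the duration itself serves as the score
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc:.3f}")
```

Lowering the cutoff in such a rule trades specificity for sensitivity, which is why the AUC, being threshold-free, is a convenient common yardstick for comparing a fixed criterion with a trained classifier.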

Study Population
A population of 2,268 military males aged 17–43 years was obtained from the cardiorespiratory health in eastern armed forces study (CHIEF Heart Study) for the machine learning experiment (32–35). All the participants received an annual health examination, including demographic, anthropometric, and hemodynamic measurements, in the Hualien Armed Forces General Hospital of Taiwan from 2016 to 2021. Anthropometric parameters, including body height, weight, and waist circumference, were measured in the standing position. Blood pressure was measured once over the right upper arm in a sitting position after at least 15 min of rest by an automatic oscillometric monitor (PARAMA TECH FT-201, Fukuoka, Japan). In addition, all the participants received 12-lead ECG and echocardiography to assess their cardiac structure and function during the same period. Sixty-two participants were excluded because they lacked relevant data (n = 36) or were unwilling to sign informed consent (n = 26), leaving a sample of 2,206 males for analysis.

ECG and Echocardiographic Measurements
A 12-lead ECG was performed for each participant (Schiller AG CARDIOVIT MS-2015, Baar, Switzerland). If the ECG was not interpretable (e.g., because of baseline wandering), the technician repeated the recording. The ECG parameters, such as the heart rate and the P-QRS-T wave durations and intervals, were analyzed by the software in the ECG machine and interpreted by a board-certified cardiologist.
Transthoracic echocardiography using a 1–5 MHz transducer (iE33; Philips Medical Systems, Andover, MA, USA) was performed following the ECG procedure at the Hualien Armed Forces General Hospital. Measurements of left atrial dimensions were based on the recommendations of the American Society of Echocardiography (36). LAE was defined for men as a left atrial diameter > 4 cm on 2-D or M-mode imaging, measured from the posterior aortic wall to the posterior left atrial wall in the parasternal long-axis view at end-ventricular systole. The prevalence of LAE in the young males was 4.85% (107/2,206). The profiles of those with and without LAE are shown in Table 1 and were compared by ANOVA, where p < 0.05 was considered significant. The study design and protocol were approved by the Institutional Review Board of Mennonite Christian Hospital (No. 16-05-008) in Hualien City, Taiwan.

Machine Learning Procedures
Three machine learning classifiers, the multilayer perceptron (MLP) (37), logistic regression (LR) (38), and support vector machine (SVM) with a linear kernel (39), were trained on 26 ECG features (heart rate; P wave duration in lead II; intervals of PR, QRS, and QT in lead II; axes of P, QRS, and T waves in lead II; voltages of the R wave in limb leads I, II, III, aVR, aVL, and aVF; voltages of both the R and S waves in precordial leads V1–V6), with or without six biological features (age, body height, body weight, waist circumference, and systolic and diastolic blood pressure), to identify the presence of LAE in young military males in Taiwan. Min-Max scaling, a linear transformation, was used to normalize the input data (40), so that the original values of all 32 ECG and biological features were mapped to a range between 0 and 1. The MLP model includes an input layer, hidden layers, and an output layer (37). In the hidden layers, the rectified linear unit (ReLU) activation function is used for each node, and the logistic function is used in the output layer. LR is a linear model that transforms its output using the logistic sigmoid function to return a probability value (38). Its loss function includes a loss term and a regularization term: the loss term for learning the weight vector is the negative log-likelihood, and the regularization term is used to avoid overfitting. In the SVM (39), a maximum-margin hyperplane is constructed to maximize the distance from the hyperplane to the nearest training data points (support vectors) of the LAE and non-LAE classes. A soft-margin SVM, with regularization weighted by a hyperparameter, is adopted to allow a wider decision margin (39). The optimized hyperparameters for the three machine learning classifiers were chosen by grid search based on the average AUC of the ROC curves in cross validation.
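
The Min-Max scaling and the regularized logistic loss described above can be sketched in plain Python as follows. This is only an illustrative sketch, not the study's scikit-learn code: the helper names and the toy feature values are hypothetical, and the use of an L2 penalty with strength `l2` is an assumption, since the text does not state which regularization term was used.

```python
import math
from typing import List, Sequence, Tuple


def fit_min_max(rows: Sequence[Sequence[float]]) -> List[Tuple[float, float]]:
    """Learn the per-feature (min, max) from the training rows."""
    return [(min(col), max(col)) for col in zip(*rows)]


def apply_min_max(rows: Sequence[Sequence[float]],
                  bounds: Sequence[Tuple[float, float]]) -> List[List[float]]:
    """Map each feature linearly so that the training range becomes [0, 1].

    A feature that is constant in the training data is mapped to 0.0.
    """
    return [
        [(x - lo) / (hi - lo) if hi > lo else 0.0 for x, (lo, hi) in zip(row, bounds)]
        for row in rows
    ]


def sigmoid(z: float) -> float:
    """Logistic sigmoid; adequate for the moderate |z| that scaled inputs produce."""
    return 1.0 / (1.0 + math.exp(-z))


def regularized_log_loss(weights: Sequence[float], bias: float,
                         rows: Sequence[Sequence[float]], labels: Sequence[int],
                         l2: float = 0.1) -> float:
    """Mean negative log-likelihood plus an (assumed) L2 penalty on the weights."""
    nll = 0.0
    for row, y in zip(rows, labels):
        p = sigmoid(sum(w * x for w, x in zip(weights, row)) + bias)
        nll -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return nll / len(rows) + l2 * sum(w * w for w in weights)


# Hypothetical toy rows: [heart rate (bpm), P wave duration (ms)].
train_rows = [[60.0, 100.0], [75.0, 110.0], [90.0, 130.0]]
train_labels = [0, 0, 1]

bounds = fit_min_max(train_rows)
scaled = apply_min_max(train_rows, bounds)
baseline_loss = regularized_log_loss([0.0, 0.0], 0.0, scaled, train_labels)
print(scaled, baseline_loss)
```

Learning the bounds on the training rows and reusing them for validation and test rows keeps the transformation identical across sets; with all-zero weights the loss reduces to ln 2, which is a useful sanity check before any fitting.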

Data Augmentation and Cross Validation
The data of the 2,206 participants were randomly split in a 3:1 ratio into a training/validation set (n = 1,654) and a test set (n = 552). The training/validation set was divided into three subgroups of equal size; two subgroups were used for training, and the remaining subgroup was used for validation. The numbers of cases in the three folds are shown in Table 2. Because there was an imbalance in sample size between LAE and non-LAE cases, the synthetic minority oversampling technique (SMOTE) (41) was applied to artificially augment the LAE cases. To create sufficient new minority class cases, SMOTE randomly chose a near neighbor of each index case from the minority class and interpolated between the two. This enlarged the decision space for the LAE cases and balanced the number of cases in each category. After data augmentation, the three subgroups were rotated to repeat the process: two for training and one for validation. The average of the three AUCs of the ROC curves from the 3-fold cross validation was treated as a single performance measure. The raw data that were not preprocessed by SMOTE were also used for machine learning to confirm the validity of SMOTE. This study used scikit-learn v0.20.2 and the Python programming language for the proposed methods. The flow chart for data preprocessing and machine learning is shown in Figure 1.

[...] of the P wave duration in lead II was ≥ 106 ms for LAE, the sensitivity, specificity, and accuracy were 73.68, 48.78, and 49.64%, respectively. The AUCs of the ROC curves shown in Figure 2 were 72.93, 77.09, and 77.87% using the MLP, LR, and SVM, respectively, for 26 ECG features and 81.01, 78.99, and 76.74% using the MLP, LR, and SVM, respectively, for 32 ECG and biological features, all of which were much greater than the 62.19% for the P wave duration in lead II.
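
The SMOTE interpolation step described in the Data Augmentation and Cross Validation section can be sketched in plain Python as below. This is a simplified illustration under stated assumptions, not the implementation used in the study: the function name `smote_sketch`, the neighbor count `k`, the fixed `seed`, and the toy minority samples are all hypothetical.

```python
import random
from typing import List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def smote_sketch(minority: List[List[float]], n_new: int,
                 k: int = 2, seed: int = 0) -> List[List[float]]:
    """Create n_new synthetic minority samples.

    For each new sample: pick an index case at random, pick one of its k nearest
    minority-class neighbors at random, and interpolate between the two at a
    random fraction in [0, 1).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [m for m in minority if m is not base]
        neighbors = sorted(others, key=lambda m: euclidean(base, m))[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Hypothetical Min-Max-scaled LAE (minority) samples with two features each.
lae_cases = [[0.20, 0.80], [0.30, 0.70], [0.40, 0.90], [0.25, 0.75]]
new_cases = smote_sketch(lae_cases, n_new=6)
print(len(lae_cases) + len(new_cases), "minority cases after augmentation")
```

Because every synthetic point lies on a line segment between two existing minority cases, the augmented class stays within the region already occupied by real LAE cases while filling it more densely.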

DISCUSSION
This study is the first to show that machine learning outperforms the traditional ECG criterion of P wave duration in predicting echocardiographic LAE in young male adults who were healthy and without multiple comorbidities. Prior studies (29–31) have revealed that machine learning for ECG features could detect most LAE cases among hospitalized patients, probably because those patients with LAE were likely to have other cardiac comorbidities, such as heart failure, that are easily reflected in ECG features; thus, the results might not be applicable to healthy individuals. Some studies have shown that, in young adults, particularly physically fit people, an enlarged cardiac chamber is likely, and the typical ECG features for LAE might not be the same as those in middle-aged and elderly individuals who have several cardiovascular comorbidities, e.g., hypertension. This study revealed that the P wave axis rather than the P wave duration was a strong indicator for LAE. In addition, a greater R wave in leads aVL and I and a greater S wave in lead V1, representing an enhanced left lateral electrical force in the heart (42), and a greater QT interval, representing a longer diastolic phase of electrical repolarization and left ventricular relaxation, were vital predictors of LAE. These findings emphasize the necessity of performing machine learning specifically for physically fit young adults to identify LAE. With ECG features only, the SVM was the best machine learning classifier to detect LAE in young males, achieving an AUC of 78%. In contrast, when biological features were added, the MLP was the best classifier, with its AUC improving from 73 to 81%. By comparison, the addition of biological features yielded little or no improvement in the predictive performance of the LR (77.09 to 78.99%) and SVM (77.87 to 76.74%) classifiers.

Study Strengths and Limitations
The main strengths of this study included the following. First, the military males were physically active, and their training program was conducted in Eastern Taiwan. Second, since the living environment is a closed system, the participants had a similar daily schedule, and unmeasured bias could be minimized. Third, this was the first study using machine learning for ECG and biological features to predict LAE early in young adults. However, this study also had several limitations. First, the data were obtained only from males, and the results might not be the same for females. Second, other feature learning methods, such as convolutional neural networks for ECG training to predict LAE, were not performed, which may be a focus of future work. Third, since LAE is highly associated with atrial fibrillation, follow-up studies are needed to draw conclusions related to atrial fibrillation. Finally, oxidative stress is also related to the occurrence of atrial fibrillation (43), and this was not considered in this study.

CONCLUSION
This study suggests that machine learning for ECG features and biological features is a reliable approach to predict LAE in young adults. The proposed MLP, LR, and SVM methods could provide early detection of LAE in young adults in clinical settings and may be useful in screening young adults at high risk of cardiovascular diseases, e.g., atrial fibrillation, which has an important relationship with LAE.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author/s.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Institutional Review Board of Mennonite Christian Hospital (No. 16-05-008). The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
C-YH wrote the paper. P-YL collected the data. S-HL, YK, and CL raised critical comments for the paper. G-ML analyzed data and edited the manuscript and was the principal investigator for the CHIEF study. All authors contributed to the article and approved the submitted version.