- Faculty of Electrical Engineering and Computer Science, VSB-Technical University of Ostrava, Ostrava, Czechia
Background: Early detection and diagnosis of myocardial infarction (MI) help physicians save lives through timely treatment. Vectorcardiography (VCG) is an alternative to the 12-lead electrocardiography, providing not only characteristic changes in cardiac electrical activity in MI patients but also unique spatial information often overlooked by traditional methods. Despite its potential, comprehensive comparative studies applying machine learning (ML) techniques specifically to VCG data remain limited.
Methods: This study proposes a novel VCG processing methodology using a comparative analysis of machine learning-based algorithms for the automated detection of MI patients from VCG recordings, utilizing extracted domain-knowledge VCG features that monitor morphological changes in cardiac activity. For this purpose, records from the PTB Diagnostic dataset were used. The extracted domain-knowledge dataset of morphological features was then fed into a diverse set of 210 machine learning configurations, including K-nearest neighbor, Support Vector Machine, Discriminant Analysis, Artificial Neural Network, Decision Tree, Random Forest, Naive Bayes, Logistic Regression, and Ensemble Methods. To further improve classification performance, we combined the high-performing models using a stacking ensemble strategy, which integrates multiple base classifiers into a meta-classifier.
Results: The stacking-based decision-level fusion achieved high accuracy of 95.55%, sensitivity of 97.70%, specificity of 86.25%, positive predictive value of 96.86%, negative predictive value of 89.61% and f1-score of 97.27%.
Conclusion: The results demonstrate that decision-level fusion via stacking improves classification performance and enhances the reliability of MI detection from VCG recordings, supporting cardiologists in decision-making.
1 Introduction
Myocardial infarction (MI) is a heart disease caused by a blockage of a coronary artery, which leads to insufficient flow of oxygenated blood to the heart muscle. Frequent clinical manifestations are long-lasting chest pain, shortness of breath, or dizziness of varying intensity, depending on where exactly in the coronary artery basin the blockage occurred (Thygesen et al., 2007; Zhao et al., 2021). However, MI can also occur as a clinically silent event without obvious symptoms, which can cause more serious problems for patients in the future. The American Heart Association estimates that approximately 750,000 residents of the United States have a heart attack each year, of which 210,000 are recurrent heart attacks (Mozaffarian et al., 2016; Acharya et al., 2017). There is also a serious risk that patients suffer a silent heart attack, in which the myocardium is damaged without any clinical symptoms, resulting in a very high risk of mortality. Mortality due to MI is increasing every year, while the age of onset of MI is decreasing (Acharya et al., 2017; Amini et al., 2021; He et al., 2022).
The result of MI is irreversible damage to the myocardium, so early diagnosis is necessary. Among non-invasive diagnostic methods, the 12-lead ECG is the most commonly used in clinical practice for diagnosing MI and other heart diseases. It measures the electrical activity of the heart using electrodes placed on the patient’s body. The diagnosis is made based on changes in the shape of the curve, rhythm, and other characteristics that correspond to a specific pathology. As part of automated detection, the ECG is analyzed using machine learning (ML) methods to support diagnostics. Various ML techniques are used (Hammad et al., 2022), such as K-nearest neighbor (KNN) (Sharma and Sunkaria, 2018), ensemble bagged trees (Fatimah et al., 2021), support vector machine (SVM) (Sharma and Sunkaria, 2018; Dhawan et al., 2012; Dohare et al., 2018), a combination of the Fourier decomposition method (FDM) and SVM (Fatimah et al., 2021), or convolutional neural networks (CNN) (Śmigiel et al., 2021). The use of 12-lead ECG for MI detection has been studied by several authors using different methods (Acharya et al., 2017; Zhang et al., 2021; Sharma et al., 2018; Zeng et al., 2020; Barmpoutis et al., 2019; Han and Shi, 2019; Tadesse et al., 2021). Murat et al. (2021) extracted ECG features from lead II, which were subsequently input to nine different classification methods for arrhythmia detection. Using ECG features, the random forest classifier achieved the highest accuracy of 90.30%. Zhang et al. (2021) achieved a detection accuracy of 99.40% using the Tree Bagger classifier to locate 11 kinds of MI. Acharya et al. (2017) achieved a detection accuracy of 93.53% from unfiltered ECG recordings using a CNN. An earlier study (Sharma et al., 2018) also applied the KNN classifier to extracted features. SVM was also used for MI classification by Dohare et al. (2018), whose proposed model does not include signal pre-processing in the form of filtering. Weng et al. (2014) combined Principal Component Analysis (PCA) with polynomial approximation in the feature extraction phase of MI classification to increase the accuracy of the SVM classifier, achieving an accuracy of 98.07%. Zhou and Tan (2020) proposed a combination of CNN and extreme learning machine (ELM) for ECG classification, while Ripoll et al. (2016) derived a new CNN-based screening method to assess whether a patient should be referred to a cardiology service from ambulatory care or the emergency department.
In addition to the commonly used 12-lead ECG, there is also the lesser-known method of 3-lead vectorcardiography (VCG), which provides spatial information about the electrical activity of the heart. VCG measurement is performed in three mutually perpendicular anatomical planes: sagittal, horizontal and frontal. Figure 1 shows individual VCG leads X, Y, Z together with spatial visualization. Compared to a standard 12-lead ECG, VCG achieves higher sensitivity and is more suitable for automated processing to detect heart diseases due to the smaller number of leads, no redundant information and higher measurement accuracy with corrected orthogonal leads (Malmivuo and Plonsey, 1995; Kijonka et al., 2022). In clinical practice, vectorcardiography is most often measured using the Frank lead system (Frank, 1956). Other lead systems also exist, such as the McFee-Parungao system (McFee and Parungao, 1961), SVEC III (Schmitt and Simonson, 1955), and hybrid lead systems (Dellborg et al., 1995), but these were only used in specialized workplaces.
Figure 1. VCG recording for HC (A) and for MI (B) in individual leads X, Y, Z and three-dimensional recurrent loop.
Early detection of heart diseases is a frequently discussed topic (Hammad et al., 2018; Ketu and Mishra, 2022; Jambukia et al., 2015; Zhang X. et al., 2022; Zhang et al., 2020). There are known cases where automated detection from ECG recordings does not achieve sufficient accuracy (Filos et al., 2017). For example, studies dealing with the diagnosis of atrial enlargement (Bartall et al., 1978), right ventricular hypertrophy (Zhou et al., 2001; Cowdery et al., 1980), or posterior MI (PMI) (Khan and Pachori, 2021a; Prabhakararao and Dandapat, 2020; Khan and Pachori, 2021b) found that VCG achieves a higher detection success than classical ECG. A reevaluation of the frequency of ECG use in favor of VCG has also been suggested (Van Bemmel et al., 1992; Khan and Pachori, 2021a). To detect MI from VCG recordings, authors have used machine learning principles similar to those applied to ECG (Dehnavi et al., 2011; Dima et al., 2013; Panagiotou et al., 2013; Khan and Pachori, 2021a; Correa et al., 2016; Tripathy and Dandapat, 2017). For example, Dehnavi et al. (2011) extracted 22 VCG features, which were subsequently fed to an artificial neural network (ANN). To detect myocardial scar as a consequence of MI, Dima et al. (2013) used ECG and VCG recordings from which they extracted features analyzing the pathophysiological consequences of a scarred myocardium causing electrical conduction failure, and classified the extracted features with SVM. SVM was also applied to extracted features by Panagiotou et al. (2013) and Khan and Pachori (2021a). Sharma and Sunkaria (2018) compared KNN and SVM for the detection of inferior MI to determine which of the two methods achieves more accurate results. The information extracted from VCG led Chuang et al. (2023) to develop a patient-specific transformation for VCG synthesis. They used temporal convolutional networks in the variational mode decomposition domain to potentially differentiate MI patients and localize the infarct area. Perkins et al. (2024) used LASSO logistic regression to predict heart failure in patients after STEMI, using data from VCG and clinical information. Their model performed best at predicting heart failure within 90 days. They focused on important features that track changes in the heart’s shape, such as the 3D sum or the shape of the heart in specific areas. In addition, VCG recordings were also used in the analysis of the same MI patient before and during percutaneous transluminal coronary angioplasty (PTCA) (Hernandez et al., 2018). The results of this study indicate the potential for the diagnosis of acute myocardial infarction (AMI) using fewer leads.
1.1 Limitations of existing work
Machine learning methods have become an integral part of the automated processing of both ECG and VCG recordings for the early detection of various heart diseases. The limitation of the aforementioned publications is mainly the inconsistency of the analyzed machine learning methods for MI detection, where different authors achieve different detection accuracy results with different methods, even if many authors use similar features. Most authors do not dwell further on the choice or reason for using a given machine learning method. Furthermore, the spatial information of the VCG is not fully utilized in the processing of the measured signals. Another problem is the insufficient analysis of VCG features, especially within the framework of the analysis of the morphological properties of the given curve. A more detailed analysis of these features can contribute to a better understanding of the morphological changes that arise in individual patient records.
Comparison of different classification methods can increase the accuracy, reliability and efficiency of myocardial infarction detection and contribute to better care of patients suffering from this serious disease. The benefit of this study is the consolidation of knowledge in the processing of VCG records using machine learning methods for MI detection. These insights can help improve automated detection to support diagnostics and can guide the selection of a suitable classifier for similar problems, which have recently been addressed frequently in the literature. The main novelty lies in the development of a unique methodology combining structured feature extraction, comparative analysis of multiple machine learning classifiers, and the final implementation of a stacking model that integrates their strengths. The key contributions of this work can be summarized as follows:
1. Extraction of domain-specific VCG features capturing morphological changes in VCG loops relevant to myocardial infarction.
2. Analysis of the contribution and relevance of VCG features for MI detection using statistical testing and permutation feature importance.
3. Comparative analysis and tuning of various machine learning classifiers to assess their suitability for MI detection.
4. Enhancing MI detection performance through a novel methodology that integrates feature-based analysis and the fusion of machine learning models using a stacking approach to achieve higher accuracy in automated diagnosis.
1.2 Paper organization
The rest of the article is organized as follows: Section 2 describes the methods and data used with the analysis of extracted VCG features, Section 3 deals with the comparative analysis of individual machine learning methods within the MI detection accuracy, Section 4 focuses on the discussion of the achieved results and Section 5 concludes the paper.
2 Materials and methods
The main goal of this section is to design a methodology for improving the accuracy of MI detection from VCG recordings. The proposed methodology includes signal preprocessing, feature extraction for analyzing the morphological properties of VCG loops, feature relevance analysis for MI detection, and a comparative analysis of machine learning methods for automatic MI classification.
The individual sub-steps of processing VCG records for the purpose of MI detection are shown in Figure 2. First, each VCG record obtained from the Physikalisch-Technische Bundesanstalt (PTB) database was filtered in order to remove interfering components that may affect the subsequent analysis. For this purpose, the procedures applied in previous work (Vondrak and Penhaker, 2022b) were used. After data preprocessing, VCG features describing the morphological characteristics of cardiac activity were extracted. To verify the relevance of the extracted features, each feature was analyzed by the Mann-Whitney (M-W) statistical test to verify its predictive value. After statistical verification of significance, the features were fed into a total of 210 classifier configurations with variously set hyperparameters. The individual sub-steps mentioned above are discussed in more detail in the following subsections.
2.1 Study population
For the purposes of this study, the public PTB diagnostic database was used, which contains records from healthy subjects and patients with various heart diseases (MI, cardiomyopathy, bundle branch block, dysrhythmia). Recordings were taken at the Department of Cardiology of University Clinic Benjamin Franklin in Berlin, Germany. Each record is sampled at a frequency of 1 kHz and contains 15 simultaneously recorded signals: the conventional 12-lead ECG and the 3-lead Frank orthogonal VCG. The signals were acquired for 2 min with a 16-bit resolution in the range of
2.2 Data preprocessing
ECG and VCG recordings are often affected by disturbing components such as baseline wandering or high-frequency transients during measurement. This interference needs to be removed in order to obtain valid data from the desired biological signal. Since the data used mainly contain baseline wandering interference in all measured leads, a second-order Savitzky-Golay (SG) filter with a window length of 1,201 samples was used. Unlike Butterworth or Chebyshev filters, the cut-off frequency of the SG filter is not strictly defined. Its value depends on the window length and the filter order and, according to Equation 1, is approximately 0.416 Hz.
This type of filtering is often used to remove the interfering component for subsequent analysis (Khan and Pachori, 2021b) and was also used in our previous works (Vondrak and Penhaker, 2022b). For illustrative purposes, Figure 3 shows the curve before filtering (blue), after filtering (red) and the fluctuating component (black). In addition, the effect of filtering on VCG spatial loops is shown in Figure 4. It can be noticed that this filter reliably removes interfering components in the signal that can affect the success of automatic detection.
Figure 3. An illustrative example of signal filtering, where the red curve represents the filtered signal, the blue curve represents the original signal, and the black curve represents the detected fluctuating component.
Figure 4. An example of spatial filtering of VCG loops, where the red curve is the filtered signal and the blue curve is the original signal.
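The baseline removal described in this subsection can be sketched in a few lines. The example below is a minimal illustration, not the authors' implementation: it assumes that the SG filter output is used as the baseline estimate and subtracted from the lead, uses SciPy's `savgol_filter`, and runs on a synthetic signal rather than PTB data. The function name `remove_baseline` and the test signal are hypothetical.

```python
import numpy as np
from scipy.signal import savgol_filter

FS = 1000        # sampling frequency in Hz (as stated for the PTB records)
WINDOW = 1201    # SG window length in samples (must be odd)
ORDER = 2        # polynomial order of the SG filter


def remove_baseline(lead):
    """Estimate the slow baseline with an SG filter and subtract it.

    Assumption: the smoothed output is treated as the baseline estimate.
    """
    baseline = savgol_filter(lead, window_length=WINDOW, polyorder=ORDER)
    return lead - baseline, baseline


# Synthetic stand-in for one lead: a 10 Hz component plus slow drift.
t = np.arange(0, 10, 1 / FS)
clean = np.sin(2 * np.pi * 10 * t)
drift = 0.5 * np.sin(2 * np.pi * 0.2 * t) + 0.3
corrected, baseline = remove_baseline(clean + drift)
```

Applied to each of the leads X, Y, and Z in turn, the same call would yield the corrected loop; the window and order follow the values given in the text.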
2.3 Feature extraction
Based on our previous work, features extracted from VCG signals have proven effective for the detection of MI (Vondrak and Penhaker, 2022a; Vondrak et al., 2024). A total of 12 VCG features describing the morphological properties of the QRS and T loops were used for the following analysis. These VCG features can be regarded as mathematical operations applied to the measured signals; they are computationally inexpensive and can be applied to any recording. Furthermore, these features encapsulate domain knowledge, as they are derived from predefined characteristics based on specific insights into heart physiology (Rossello et al., 2013). Each of the extracted features was tested with the Mann-Whitney test to verify its predictive value. The following features were extracted:
The length of the QRS curve (arcQRS): The length of the QRS loop was calculated from the projection to the optimal plane (OP) using the PCA method. This projection measures the total length of the loop with the ability to detect changes in the contour of the loop. The PCA method here represents the dimension reduction from a spatial loop to an optimal two-dimensional curve.
The principle of projection into the OP is based on the decomposition of the centered matrix
where
The columns of the matrix
The optimal plane can be written in parametric form as $A = T + t\,w_1 + s\,w_2$, where A is an arbitrary point of the optimal plane, T is a known point of the plane (the centroid of the QRS loop), w1 and w2 are the first two vectors of the matrix W, and t and s are the parameters of the optimal plane equation.
The first two principal components of the matrix P correspond to the projection of the loop points onto the optimal plane. The length of the curve can be expressed as shown in Equation 5.
where
Maximum QRS and T loop vector (maxVecQRS, maxVecT): The maximum magnitude of the vector in the QRS complex occurs during depolarization of the ventricles, marked by the R wave. The QRS complex is followed by the T wave, which represents the repolarization of the ventricles. The maximum vectors for both loops were calculated from the maximum values in the corresponding parts of the cardiac cycle (R peak, T peak) in the individual leads X, Y, Z according to Equation 6.
where
Maximum distance from the center of gravity (MaxGravQRS): The maximum distance from the centroid was calculated for individual points in the QRS loop according to Equation 7:
where
Velocity of the QRS and T loop (VelQRS, VelT): The propagation velocity of the electrical vector was calculated for the QRS and T loop and can be obtained as the derivative of the curve K with respect to time t according to Equation 8.
where VelQRS/T(t) is the velocity of the loop, K is the analyzed vectorcardiographic QRS or T loop and t is time. Additional features were then calculated from the velocity obtained this way, namely the maximum velocity (maxVelQRS/T), mean velocity (meanVelQRS/T) and standard deviation of the velocity (stdVelQRS/T). All of these features were obtained for both the QRS loop and the T loop.
The area under the QRS and T loop (areaQRS/T): The area under the QRS and T loops was calculated from the projection in the OP according to Equation 9.
where
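As a rough illustration of the loop features listed in this subsection, the sketch below computes the projection onto the optimal plane, the arc length, the maximum vector, the maximum centroid distance, and the velocity statistics with NumPy. It is a sketch under stated assumptions rather than the authors' code: the PCA step is done through a singular value decomposition of the centered loop, the maximum vector is taken as the largest vector magnitude over the loop, and the velocity is taken as the magnitude of the numerical time derivative. All function names are hypothetical, and the area feature is omitted because its exact formula is not recoverable from the text.

```python
import numpy as np


def project_to_optimal_plane(loop):
    """Project an (n, 3) loop onto the plane of its first two principal axes."""
    centered = loop - loop.mean(axis=0)
    # Rows of vt are the principal directions of the centered loop.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T          # (n, 2) coordinates in the plane


def arc_length(points):
    """Sum of Euclidean distances between consecutive points."""
    return float(np.sum(np.linalg.norm(np.diff(points, axis=0), axis=1)))


def max_vector(loop):
    """Largest vector magnitude sqrt(x^2 + y^2 + z^2) over the loop (assumed)."""
    return float(np.max(np.linalg.norm(loop, axis=1)))


def max_centroid_distance(loop):
    """Largest distance of any loop point from the loop centroid."""
    return float(np.max(np.linalg.norm(loop - loop.mean(axis=0), axis=1)))


def velocity_stats(loop, fs):
    """Maximum, mean, and standard deviation of the speed |dK/dt| (assumed)."""
    speed = np.linalg.norm(np.gradient(loop, 1.0 / fs, axis=0), axis=1)
    return float(speed.max()), float(speed.mean()), float(speed.std())
```

For a segmented QRS loop stored as an array `qrs` of shape (n, 3), `arc_length(project_to_optimal_plane(qrs))` would correspond to arcQRS under these assumptions.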
2.4 Classifications
The classification methods most commonly used in the literature for distinguishing between MI patients and healthy individuals were selected. These include K-nearest neighbor (KNN) (Mucherino et al., 2009), Support Vector Machine (SVM) (Hearst et al., 1998), Discriminant Analysis (DA) (Balakrishnama and Ganapathiraju, 1998), Artificial Neural Network (ANN) (Dreiseitl and Ohno-Machado, 2002), Decision Tree (DT) (Mantas and Abellán, 2014), Random Forest (RF) (Belgiu and Drăguţ, 2016), Naive Bayes (NB) (Murphy, 2006), Logistic Regression (LR) (Ng and Jordan, 2001), and Ensemble Methods (EM) (Dietterich, 2000). However, several factors must be taken into account when analyzing individual classifiers. These factors include the nature of the data distribution, the variability of the data, and the diversity or existence of noise. It is also necessary to keep in mind the advantages and possible limitations of individual classifiers. For example, KNN relies on memory-based learning, so the training data must remain available at prediction time. For noisy data, DT is often preferred (Mantas and Abellán, 2014), while SVM classifiers are effective for both linear and non-linear features (Noble, 2006).
In this study, we experimented with the above-mentioned classifiers using 10-fold cross-validation, while each of the used classifiers was tuned with respect to the relevant parameters. The DT has been tuned with respect to the split criterion (Cross entropy and Gini diversity index (GDI)), minimum number of leaves as
In the case of SVM, the classifier has been tuned with regard to kernels, namely Linear kernel, Polynomial kernel [with degree (2, 3, 4)], Radial Basis Function (RBF) kernel, and Gaussian kernel. A kernel scale from 1 to 5 was used for each of the kernels.
KNN was tuned with respect to the size of K, which was chosen between 5 and 20 with a step of 5. The distance type was chosen as Euclidean, City block, or Chebyshev, and the distance weighting as Equal, Inverse, or Squared inverse.
DA has been tuned with respect to the type of discriminant analysis, namely linear, diagonal linear, pseudo-linear, quadratic, diagonal quadratic and pseudo-quadratic.
The ANN has been tuned by adjusting both the number of hidden layers (ranging from 1 to 2 layers) and the size of each hidden layer (with units set at 5, 10, 20, and 50 neurons). This tuning is focused on finding the right balance between model complexity and generalization, helping the network better capture complex patterns in the data.
NB was tuned with respect to the Prior (uniform, empirical) and the Distribution (normal, kernel); for the kernel distribution, the kernel type was additionally varied (normal, box, Epanechnikov, and triangle).
Random Forests are based on a similar principle as Decision Trees. For this method, the tuned parameters were the Number of Trees, ranging from 100 to 500 in steps of 100, and the Minimum Number of Leaves, ranging from 1 to 5 in steps of 1. Higher values were not considered due to the risk of potential overfitting.
The Logistic Regression was tuned with the Prior parameter set to uniform and empirical, which influences how class weights are assigned in the training data. For Regularization, Least Absolute Shrinkage and Selection Operator (lasso) and ridge methods were selected to prevent overfitting by penalizing overly complex models. Additionally, various optimization algorithms (Solver) such as Stochastic gradient descent (sgd) (Shalev-Shwartz et al., 2007), Average stochastic gradient descent (asgd) (Xu, 2011), Dual sgd for SVM (dual) (Hsieh et al., 2008), Broyden-Fletcher-Goldfarb-Shanno quasi-Newton algorithm (bfgs) (Nocedal and Wright, 1999), Limited-memory bfgs (lbfgs) (Nocedal and Wright, 1999), and Sparse Reconstruction by Separable Approximation (sparsa) (Wright et al., 2009) were tested to find the optimal model parameters.
Next, the Ensemble Methods were tuned with regard to different techniques: Adaptive Boosting, which enhances weak learners by focusing on misclassified examples; Robust Boosting, designed to handle noisy data by adjusting the impact of outliers; and Random Undersampling Boosting, which balances class distribution by reducing the size of the majority class. The Learner type was set to either Tree or Discriminant, depending on the model’s requirements. Learning cycles were tested with values of 50, 100, 150, and 200 to optimize the model’s performance.
According to the aforementioned hyperparameter settings, a total of 210 different configurations of machine learning methods are analyzed. These hyperparameters were chosen with consideration for preventing overfitting of individual classifiers, computational complexity, and appropriate optimizations for each model.
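To make the idea of enumerating classifier configurations concrete, the following sketch walks through the KNN grid described above (four values of K, three distance types, three distance weightings) with 10-fold cross-validation. It is an illustration only: it uses scikit-learn, which is an assumption since the text does not name the software used, the mapping of the parameter names to scikit-learn options is ours, and the data are synthetic placeholders rather than the extracted VCG features.

```python
from itertools import product

from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier


def squared_inverse(distances):
    """Weight neighbors by 1 / d^2 (small epsilon avoids division by zero)."""
    return 1.0 / (distances ** 2 + 1e-12)


# Placeholder data standing in for the 12 VCG features (not PTB data).
X, y = make_classification(n_samples=300, n_features=12, n_informative=6,
                           weights=[0.2, 0.8], random_state=0)

neighbors = [5, 10, 15, 20]
metrics = {"Euclidean": "euclidean", "City block": "manhattan",
           "Chebyshev": "chebyshev"}
weights = {"Equal": "uniform", "Inverse": "distance",
           "Squared inverse": squared_inverse}

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
results = {}
for k, (m_name, metric), (w_name, weight) in product(
        neighbors, metrics.items(), weights.items()):
    model = KNeighborsClassifier(n_neighbors=k, metric=metric, weights=weight)
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    results[(k, m_name, w_name)] = scores.mean()

# Configuration with the highest mean cross-validated accuracy.
best_config = max(results, key=results.get)
print(len(results), "configurations; best:", best_config)
```

The same loop structure extends to the other classifiers by swapping the estimator and its parameter lists.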
Finally, stacking ensemble learning was used for classification. Stacking is a machine learning method that combines the outputs of multiple base models and uses a meta-classifier to achieve better performance (Alexandropoulos et al., 2019; Yoon and Kang, 2023; Zhou, 2025). Each base model learns from the data from a different perspective, which reduces the risk of overfitting and improves generalization. The meta-classifier, here logistic regression, takes the outputs of the best-performing trained models as its input and learns to weigh their predictions based on their accuracy. This approach represents a form of decision-level fusion, as it integrates multiple model predictions to leverage the strengths of different algorithms and increase the overall robustness of the classification. The aim is a model with higher accuracy and reliability than the individual classifiers. To prevent data leakage and ensure that the meta-classifier learned only from unseen samples, the stacking layer was trained using out-of-fold (OOF) predictions obtained during cross-validation. Each base model produced predictions exclusively for data that were not part of its training fold. These predictions were subsequently used as input features for training the meta-classifier.
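The out-of-fold training scheme for the stacking layer can be illustrated as follows. This is a minimal sketch under several assumptions: it uses scikit-learn's `cross_val_predict`, three arbitrarily chosen base models with illustrative hyperparameters, class probabilities as the meta-features, and synthetic data in place of the VCG feature set. None of these choices is claimed to match the study's exact setup.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Placeholder data standing in for the 12 VCG features (not PTB data).
X, y = make_classification(n_samples=300, n_features=12, n_informative=6,
                           random_state=0)

# Illustrative base models; the hyperparameters here are assumptions.
base_models = {
    "svm": SVC(kernel="poly", degree=3, probability=True, random_state=0),
    "knn": KNeighborsClassifier(n_neighbors=10),
    "rf": RandomForestClassifier(n_estimators=100, random_state=0),
}

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# Each column holds one base model's out-of-fold probability of class 1:
# every sample is scored by a model that did not see it during fitting.
oof = np.column_stack([
    cross_val_predict(model, X, y, cv=cv, method="predict_proba")[:, 1]
    for model in base_models.values()
])

# The meta-classifier is trained only on the out-of-fold predictions.
meta = LogisticRegression()
meta.fit(oof, y)
print(dict(zip(base_models, meta.coef_[0].round(3))))
```

An unbiased estimate of the stacked model's performance would additionally require evaluating the meta-classifier on data held out from this whole procedure, for example with an outer cross-validation loop.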
2.5 Performance parameters
In this study, the performance of the classifiers was evaluated by calculating accuracy (Acc) (Equation 10), sensitivity (Sens) (Equation 11), specificity (Spec) (Equation 12), positive predictive value (PPV) (Equation 13), negative predictive value (NPV) (Equation 14), and f1-score (f1) (Equation 15). Here, TN (True Negative) represents the number of correctly detected healthy records, and TP (True Positive) represents the number of correctly detected MI records. FN (False Negative) represents the number of false-negative samples, i.e., the number of MI records that were not correctly detected. FP (False Positive) represents the number of false-positive samples, i.e., the number of healthy records that were incorrectly detected as MI.
In addition to the above parameters a confusion matrix is also included, which gives a count of how well the classification model works by showing the number of correct and incorrect predictions for each class. It allows distinguishing between true positive, false positive, true negative and false negative predictions, which is key to evaluating model performance.
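The six performance parameters follow directly from the four confusion-matrix counts. The short sketch below uses the standard definitions of these metrics; the function name is hypothetical and the counts in the example are made up for illustration and are not results from this study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return Acc, Sens, Spec, PPV, NPV and F1 from confusion-matrix counts."""
    return {
        "Acc": (tp + tn) / (tp + tn + fp + fn),
        "Sens": tp / (tp + fn),
        "Spec": tn / (tn + fp),
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
        "F1": 2 * tp / (2 * tp + fp + fn),
    }


# Illustrative counts only; these are not results from the study.
example = classification_metrics(tp=90, fn=10, tn=40, fp=10)
for name, value in example.items():
    print(f"{name}: {value:.2%}")
```

With an imbalanced dataset, Sens and Spec are worth reading together with Acc, since a high accuracy can be driven mainly by the majority class.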
3 Results
The VCG signals in individual leads and the three-dimensional loop corresponding to HC and MI can be seen in Figure 1. From Figure 1, morphological changes are evident both in individual leads and in the spatial display. These morphological changes indicate the non-stationarity of the VCG signal. In this study, we used VCG features to analyze these morphological changes. We extracted a total of 12 VCG features that focus on morphological changes in both the QRS and the T segment.
3.1 Relevance of VCG features
To verify the informative value of individual features, each feature was tested with the M-W statistical test, due to the fact that for each feature we are dealing with two-sample data (MI and HC), where a statistically significant difference is sought. The use of M-W statistical test was expedient, because according to the Shapiro-Wilk (S-W) test, the data do not come from a normal distribution, the p-values of the S-W test for all VCG features are
The results of the M-W test confirmed the significance (p-value
Clinically, the observed differences in VCG features reflect alterations in the processes of ventricular depolarization and repolarization that occur as a result of ischemic injury. The lower T-loop velocity and smaller QRS loop area found in MI patients indicate delayed or weakened propagation of electrical activity within the infarcted myocardium. These findings suggest that the extracted VCG features not only provide statistically significant discrimination between HC and MI recordings but also have clear physiological relevance, supporting their use in automated MI detection.
A similar comparison is shown in Figure 5, where boxplots of individual features for the MI and HC groups are shown. We prioritized features for which the distributions of the MI and HC groups showed minimal overlap. For each group, there are also outlier observations that are usually removed from the observed dataset in common statistical tests. However, these values were not removed because they are real data that can occur in clinical practice. Since all the extracted VCG features show a significant ability to distinguish the analyzed groups, we came to the conclusion that for further processing all 12 features will be fed into the classification phase. The relevance of individual features can be seen in Table 1, sorted according to the p-value of the M-W test, while the most relevant feature is the one with the lowest p-value. Since all features show a result of statistical tests lower than the level of significance
Figure 5. Boxplots of the values of individual VCG features for the analyzed MI and HC records, where the red boxes belong to MI and the blue boxes belong to HC.
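The statistical screening of a single feature described in this subsection can be sketched with SciPy. The example is an assumption-laden illustration: the two samples are synthetic placeholders for the MI and HC values of one feature, the significance level of 0.05 is assumed, and a two-sided alternative is used for the Mann-Whitney test.

```python
import numpy as np
from scipy.stats import mannwhitneyu, shapiro

rng = np.random.default_rng(0)
# Placeholder samples for one feature in each group (not PTB values).
feature_mi = rng.lognormal(mean=0.0, sigma=0.5, size=300)
feature_hc = rng.lognormal(mean=0.6, sigma=0.5, size=80)

ALPHA = 0.05  # assumed significance level

# Shapiro-Wilk: a small p-value rejects normality, motivating a rank test.
_, p_norm_mi = shapiro(feature_mi)
_, p_norm_hc = shapiro(feature_hc)

# Two-sided Mann-Whitney U test for a difference between the two groups.
_, p_mw = mannwhitneyu(feature_mi, feature_hc, alternative="two-sided")

print(f"S-W p (MI) = {p_norm_mi:.3g}, S-W p (HC) = {p_norm_hc:.3g}")
print(f"M-W p = {p_mw:.3g}, significant = {p_mw < ALPHA}")
```

Repeating the test for each of the 12 features and sorting by the M-W p-value would give a ranking of the kind described for Table 1.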
3.2 Analysis of classification methods
The results of the individual classification methods are shown in Tables 2–10. Note that all results were obtained using 10-fold cross-validation. This approach helps to limit overfitting caused by data imbalance and by the inclusion of multiple segments from the same patient, which can bias the results. To determine the best results, accuracy is used as the primary performance metric for each classification method. The setting that achieves the highest accuracy is highlighted in the corresponding tables.
In the literature, the Support Vector Machine method is frequently used for classification problems due to its ability to work with various kernel types and effectively separate nonlinearly separable data classes. In this analysis, different SVM classifier configurations with various hyperparameters were tested, and their results are summarized in Table 2. The best performance was achieved with a polynomial kernel of order 3 and a scale parameter set to 1, reaching the highest accuracy of 91.10%.
Table 2. Performance of Support Vector Machine classifier with different set hyperparameters using 10-fold cross-validation.
The results for the individual types of discriminant analysis are shown in Table 3, where linear discriminant analysis (LDA) achieves the most accurate results with an accuracy of 90.40%. This accuracy matches that of the best DT classifier, but with different sensitivity and specificity values.
Table 3. Performance of Discriminant Analysis classifier with different set hyperparameters using 10-fold cross-validation.
Table 4 shows the results of the KNN classification method for the experimentally set hyperparameters. The setting K = 10 with City block distance and Squared inverse distance weighting achieved the most accurate results, with a detection accuracy of 89.93%.
Table 4. Performance of KNN classifier with different set hyperparameters using 10-fold cross-validation.
Table 5 shows that the decision tree classifier achieves its highest accuracy of 90.40% using Cross entropy as the split criterion, with a minimum of 4 leaves and a maximum of 5 splits. Alternative settings, such as those using the Gini criterion or higher leaf counts, yield high sensitivity but often lower specificity, leading to reduced accuracy. Therefore, Cross entropy with 4 leaves and 5 splits emerges as the optimal setup for balanced classification on this dataset.
Table 5. Performance of Decision Tree classifier with different set hyperparameters using 10-fold cross-validation.
Another tested method was ANN, see Table 6. It achieved its most accurate results, with an accuracy of 89.93%, for two hidden layers, where the size of the first layer is 10 and the size of the second layer is 20. Larger layer sizes and additional hidden layers led to worse classification results and were not analyzed further.
Table 6. Performance of ANN classifier with different set hyperparameters using 10-fold cross-validation.
Another widely used method is Naive Bayes, with its results presented in Table 7. This method achieved the highest accuracy of 85.25% with the hyperparameter settings ’Prior empirical’ and a ’Distribution Kernel’ of the epanechnikov type. Despite the fact that this method is frequently analyzed, it achieves the worst results in our study compared to other classification methods.
Table 7. Performance of Naive Bayes classifier with different set hyperparameters using 10-fold cross-validation.
In the Random Forest classification method, which operates similarly to decision trees, the highest accuracy was achieved with hyperparameters set to 300 Trees and a Minimum Leaf Size of 1. Increasing the number of trees beyond 500 had no further effect on the overall accuracy of the classifier. It’s worth noting that the best accuracy was obtained with a Minimum Leaf Size of 1, which allows the trees to fit the specific patterns in randomly selected data subsets. However, this can result in trees that are too specialized. Therefore, 10-fold cross-validation was used, with the results shown in Table 8.
Table 8. Performance of Random Forest classifier with different set hyperparameters using 10-fold cross-validation.
Within the Logistic Regression method, the highest detection accuracy of 87.61% was achieved with the hyperparameters: Prior: empirical (class weights are determined based on their frequency in the training data), Regularization: lasso (uses regularization to eliminate irrelevant variables, simplifying the model), and Solver: sparsa (an optimization algorithm efficient for sparse data and large datasets, based on gradient descent). The results of individual logistic regression settings are shown in Table 9. The Table also lists empty values as NaN because these types of hyperparameters settings are not supported.
Table 9. Performance of Logistic Regression classifier with different set hyperparameters using 10-fold cross-validation.
Finally, the last type tested is Ensemble Methods, which are machine learning techniques that combine the outputs of several models (base models) to achieve better performance than any single model alone. Among these methods is Adaptive Boosting, which iteratively improves classification by placing greater emphasis on examples that were misclassified in previous iterations.
In our case, the Tree Learner Template was used, meaning that the base models in the ensemble are Decision Trees. The number of Learning Cycles is set to 200, which determines the number of iterations during which the models in the ensemble learn and improve their predictions. Higher values for Learning Cycles were not used to prevent potential overfitting of the classifier. With these hyperparameter settings, the model achieves an accuracy of 92.74%, the highest among the individual tested methods. Overall, the individual types of Ensemble Methods, see Table 10, achieve better results than the other methods used in this work, which are built on different principles.
Table 10. Performance of Ensemble Method classifiers with different set hyperparameters using 10-fold cross-validation.
The best results for each machine learning method are highlighted in bold in the respective tables. Of all the tested individual classification methods with experimentally set hyperparameters, the Ensemble Methods achieved the highest accuracy of 92.74%, with a sensitivity of 96.83% and a specificity of 75.00%. Due to the class imbalance (80 HC vs. 347 MI) and the chosen ROC operating point, the specificity metric is highly sensitive to even a small number of false positives from the HC group. PPV and NPV are also reported: PPV is the proportion of subjects with a positive test result who truly have the outcome of interest, while NPV is the proportion of subjects with a negative test result who truly do not have it.
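To make the effect of the small HC group concrete, the following minimal sketch computes the reported metrics from confusion-matrix counts. The counts used here are an assumption: they are back-calculated from the reported percentages and the class sizes (80 HC, 347 MI), not copied from the confusion matrices in the tables, and the function name is illustrative only.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Return the six reported metrics, with MI as the positive class."""
    return {
        "sensitivity": tp / (tp + fn),            # fraction of MI records detected
        "specificity": tn / (tn + fp),            # fraction of HC records recognized
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "ppv": tp / (tp + fp),                    # positive predictive value
        "npv": tn / (tn + fn),                    # negative predictive value
        "f1": 2 * tp / (2 * tp + fp + fn),
    }

# Assumed counts, inferred from 96.83% sensitivity, 75.00% specificity
# and 92.74% accuracy on 347 MI and 80 HC records.
adaboost = classification_metrics(tp=336, fn=11, tn=60, fp=20)

# The same classifier with a single additional false positive from the HC group.
one_more_fp = classification_metrics(tp=336, fn=11, tn=59, fp=21)

drop = adaboost["specificity"] - one_more_fp["specificity"]
print(f"specificity drop per extra false positive: {drop * 100:.2f} points")
```

With only 80 HC records, each additional false positive lowers specificity by 1/80, that is 1.25 percentage points, while accuracy changes by only 1/427.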
To better understand which features contributed most to model performance, permutation importance of features (PIF) was computed for all nine base classifiers, see Figure 6. This method estimates the importance of each feature by measuring the increase in classification error when the feature values are randomly shuffled. The higher the increase in loss, the more critical the feature is to the model’s predictions. Positive PIF values indicate that shuffling the feature increases the model’s error, suggesting that the feature is important for prediction. In contrast, negative PIF values imply that randomizing the feature slightly improves the model performance, which may indicate redundancy or noise. As shown in Figure 6, certain features, such as maxVelT, arcQRS, and maxVecQRS, consistently showed higher importance across multiple classifiers, suggesting their robust discriminative power for myocardial infarction detection. Notably, logistic regression, decision tree, and artificial neural network placed strong emphasis on a small subset of features, while methods like Random Forest and KNN demonstrated a more balanced distribution of feature importance.
Figure 6. Permutation Importance of Features across all nine classifiers. The y-axis represents the increase in classification loss when a given feature is randomly permuted; higher values indicate greater importance.
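The permutation procedure described above can be sketched in a few lines. This is a minimal illustration on synthetic data, assuming the misclassification rate as the loss and a hypothetical toy model; it is not the implementation used to produce Figure 6.

```python
import random

def error_rate(predict, rows, labels):
    """Fraction of samples the model misclassifies."""
    return sum(predict(r) != y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(predict, rows, labels, n_repeats=20, seed=0):
    """Mean increase in error when one feature column is shuffled."""
    rng = random.Random(seed)
    baseline = error_rate(predict, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between feature j and the label
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(error_rate(predict, shuffled, labels) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Synthetic data (assumption): feature 0 decides the label, feature 1 is noise.
data_rng = random.Random(1)
rows = [[data_rng.random(), data_rng.random()] for _ in range(200)]
labels = [int(r[0] > 0.5) for r in rows]

def toy_model(row):
    """Hypothetical classifier that only looks at feature 0."""
    return int(row[0] > 0.5)

pif = permutation_importance(toy_model, rows, labels)
print(pif)
```

Here the informative feature receives a clearly positive importance and the ignored feature receives exactly zero. A negative value would only arise when shuffling happens to reduce the error, as noted for redundant or noisy features.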
Finally, the stacking ensemble learning method was employed for classification, utilizing decision-level fusion by combining predictions from multiple base models to enhance classification performance. The input for logistic regression as a meta-classifier consisted of the best-performing trained models (base models from Table 11), ensuring that only the most informative classifiers contributed to the final prediction. To avoid data leakage, out-of-fold (OOF) predictions from the base models were used as inputs. This means that each base classifier generated predictions only for samples that were not included in its training fold. The strong performance of the stacking approach can be attributed to the behavior of the individual base classifiers, which show diverse error patterns across different patient groups. This diversity allowed the meta-classifier to effectively learn how to weigh individual model outputs. In particular, methods such as SVM, DT and ANN contributed different decision boundaries and sensitivities to feature variations. Together, these different boundaries can improve the final prediction. The results demonstrate high sensitivity of 97.70%, specificity of 86.25%, accuracy of 95.55%, PPV of 96.86%, NPV of 89.61%, and f1-score of 97.27%. These results confirm the effectiveness of the fusion-based approach, where decision-level integration of multiple classifiers improved robustness. This analysis also supports the conclusion that the success of a stacking model lies not only in the combination of strong individual classifiers, but also in exploiting their diversity to mitigate individual weaknesses.
Table 11. Summary of the performance parameters of individual classification methods that achieved the most accurate results.
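The leakage-free construction of meta-classifier inputs can be sketched as follows. This is an illustration under stated assumptions: the interleaved fold assignment, the threshold-based base learner, and all function names are hypothetical stand-ins, not the base classifiers or the exact procedure of this study.

```python
def out_of_fold_predictions(rows, labels, fit, predict, k=10):
    """Predict every sample with a model that never saw it during training."""
    n = len(rows)
    oof = [None] * n
    for fold in range(k):
        test_idx = [i for i in range(n) if i % k == fold]   # simple interleaved folds
        train_idx = [i for i in range(n) if i % k != fold]
        model = fit([rows[i] for i in train_idx], [labels[i] for i in train_idx])
        for i in test_idx:
            oof[i] = predict(model, rows[i])
    return oof

def make_mean_threshold_learner(feature):
    """Hypothetical base learner: threshold halfway between the class means."""
    def fit(rows, labels):
        pos = [r[feature] for r, y in zip(rows, labels) if y == 1]
        neg = [r[feature] for r, y in zip(rows, labels) if y == 0]
        mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
        return (mean_pos + mean_neg) / 2, mean_pos > mean_neg
    def predict(model, row):
        threshold, positive_is_high = model
        return int((row[feature] > threshold) == positive_is_high)
    return fit, predict

# Made-up toy data with two features; 1 = MI, 0 = HC.
rows = [[0.1, 0.9], [0.2, 0.8], [0.8, 0.3], [0.9, 0.1], [0.3, 0.7], [0.7, 0.2]]
labels = [0, 0, 1, 1, 0, 1]

# One OOF prediction column per base learner; each row of meta_features
# is the input vector the meta-classifier would be trained on.
columns = [
    out_of_fold_predictions(rows, labels, *make_mean_threshold_learner(f), k=3)
    for f in range(2)
]
meta_features = [list(sample) for sample in zip(*columns)]
print(meta_features)
```

A meta-classifier such as logistic regression would then be fitted on `meta_features` against `labels`. Because every entry comes from a model trained without the corresponding sample, the meta-classifier sees base-model behavior on unseen data rather than memorized training outputs.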
Figure 7 shows a heatmap comparing the performance metrics of all classifiers. It highlights the superior results of the stacking method, followed by strong performance from SVM, DA, and RF, while NB and LR performed worse in specificity and NPV. This graphical representation helps illustrate the overall superiority of ensemble-based methods compared to single classifiers.
Figure 7. Heatmap visualization of classification performance metrics for all evaluated methods. The stacking and ensemble models achieved consistently higher scores across all metrics.
Figure 8 illustrates the principle of the Stacking method, which integrates multiple M-L models into a single meta-classifier to enhance detection accuracy. This meta-classifier receives the outputs of the base M-L models, providing it with additional information for better decision-making. In cases of misclassifications by the base M-L models, the meta-classifier learns to recognize patterns in these errors and optimizes the final prediction.
Figure 8. Block diagram of the Stacking method principle combining M-L models into a meta-classifier.
The aforementioned M-L methods were analyzed using 10-fold cross-validation, where each iteration produces a specific accuracy value for detection. This variability is shown in Figure 9 for the M-L methods that achieved the best results, as listed in Table 11. It can be observed that the accuracy variability in the Stacking method is lower compared to individual M-L models, highlighting the robustness of this approach.
Table 11 summarizes the performance parameters for the classification methods with the hyperparameter settings that achieved the most accurate results. Correspondingly, Table 12 presents the confusion matrices for the methods listed in Table 11. From the confusion matrices, it can be observed that the base classifiers made different types of errors. For example, KNN and DT tended to misclassify borderline HC as MI, while SVM and DA performed better in these cases but occasionally failed to detect atypical MI. ANN showed higher sensitivity but slightly lower specificity, reflecting its tendency to overestimate positive cases. LR achieved high sensitivity, but also a higher false positive rate. These variations in error patterns indicate a high degree of model diversity, which the stacking ensemble effectively exploited.
Table 12. The confusion matrices for the analyzed machine learning methods that achieved the most accurate results.
Another way to illustrate the results is shown in Figure 10, which corresponds to the data presented in Table 12. From these visuals, it is evident that the Ensemble Method delivers the highest accuracy among all the tested methods. Given the class imbalance in the dataset (347 MI vs. 80 healthy), we also show ROC curves and AUC (area under curve) as performance metrics. Unlike accuracy, AUC is less sensitive to class imbalance and provides a more reliable evaluation of model performance across thresholds, see Figure 11, with the marked cut-point of the ROC curve (Habibzadeh et al., 2016). This cut-point is obtained from the true ROC curve representing the MIs. ROC curves describe the quality of a binary classifier depending on the setting of its classification threshold. Similarly, the cut-point can be obtained from the false ROC curve, which in this case would be the mirror values of the true curve. Table 13 presents the AUC values for the analyzed classifiers that achieved the highest accuracy. The stacking method reached the highest AUC value of 99.04%, confirming its ability to distinguish between classes. Although techniques such as SMOTE or cost-sensitive learning could be applied to mitigate class imbalance, we opted to evaluate model performance without resampling in order to preserve the natural class distribution. Nonetheless, we highlight this as a limitation and propose exploring such methods in future work.
Figure 10. Comparison of values of sensitivity, specificity, accuracy, PPV, NPV and f1-score for individual analyzed machine learning methods with set hyperparameters that achieved the most accurate results.
Figure 11. ROC with indication of the true and false operating point of classifiers: KNN with K = 10, distance type City Block, distance weight Squared Inverse; SVM with kernel type Polynomial and Order 3, Scale 1; DT with split criterion Cross entropy, Min. leaf 4, Max. splits 5; DA with discriminant type Linear; ANN with 2 hidden layers, layer 1 size 10 and layer 2 size 20, which achieve the highest accuracy; NB with Prior empirical, Distribution Kernel type of epanechnikov; RF with Number of trees 300, Min leaf size 1; LR with Prior empirical, Regularization lasso, Solver sparsa; EM as Adaptive Boosting, Learner template Tree, learning cycles 200; Stacking ensemble learning combining M-L method models.
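For readers who want to reproduce the ROC-based evaluation, the sketch below computes the AUC as the probability that a randomly chosen MI record receives a higher score than a randomly chosen HC record, and selects a cut-point. The cut-point criterion is an assumption: Youden's index (sensitivity + specificity − 1) is used here as one common choice, and the scores are made-up values.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)  # ties count half
    return wins / (len(pos) * len(neg))

def youden_cut_point(scores, labels):
    """Threshold maximizing sensitivity + specificity - 1 (assumed criterion)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_j, best_t = None, None
    for t in sorted(set(scores)):
        tp = sum(s >= t and y == 1 for s, y in zip(scores, labels))
        tn = sum(s < t and y == 0 for s, y in zip(scores, labels))
        j = tp / n_pos + tn / n_neg - 1
        if best_j is None or j > best_j:
            best_j, best_t = j, t
    return best_t

# Made-up classifier scores for four MI (1) and two HC (0) records.
scores = [0.9, 0.8, 0.75, 0.4, 0.7, 0.3]
labels = [1, 1, 1, 1, 0, 0]

print(auc(scores, labels))               # 0.875
print(youden_cut_point(scores, labels))  # 0.75
```

Because the AUC is defined over positive-negative pairs, duplicating every record of one class leaves it unchanged, which is why it is less affected by the 347 vs. 80 imbalance than accuracy.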
4 Discussion
VCG features are a suitable parameter that can capture morphological changes in physiological and pathological recordings. The advantage of the used domain knowledge VCG features is the capture of morphological changes in the QRS and T loop, which is changed in the presence of pathology. We analyzed a total of 12 VCG features that were extracted from 80 physiological and 347 pathological recordings from PTB Physionet database. Each of these features was tested by the M-W statistical test to verify its predictive value. Since all features had a p-value less than 0.05, see Table 1, all features were fed into the classification methods.
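The per-feature significance check can be illustrated with a short sketch, assuming that M-W denotes the Mann-Whitney U test. The version below uses the normal approximation without tie correction, which is a simplification, and the feature values are made up for illustration.

```python
import math

def mann_whitney_u(a, b):
    """Two-sided Mann-Whitney U test (normal approximation, no tie correction)."""
    n1, n2 = len(a), len(b)
    # Count pairs in which the value from group a exceeds the value from group b.
    u_a = sum((x > y) + 0.5 * (x == y) for x in a for y in b)
    u = min(u_a, n1 * n2 - u_a)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u, p_value

# Made-up values of one feature in the two groups.
hc_values = [1.0, 2.0, 3.0]
mi_values = [4.0, 5.0, 6.0]
u_stat, p = mann_whitney_u(hc_values, mi_values)
print(u_stat, round(p, 4))
```

A feature would be retained when `p` falls below 0.05. The normal approximation is crude for the three-sample toy groups shown here but becomes reasonable for group sizes such as 80 and 347.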
Various methods have been investigated for classification, including KNN, SVM, ANN, DA, DT, NB, RF, LR, EM, and Stacking Ensemble Learning. Each of these classifiers was trained using 10-fold cross-validation. To maximize the information obtained, each of these classifiers was analyzed with experimentally adjusted hyperparameters. The performance of individual classifiers was investigated using various parameters, including Sensitivity, Specificity, Accuracy, Positive Predictive Value, and Negative Predictive Value. It can be observed from Table 11 that the most accurate classification results were achieved by the Stacking Ensemble Learning method, which combined the predictions of multiple base classifiers and used logistic regression as a meta-classifier. This approach yielded an accuracy of 95.55%, a sensitivity of 97.70%, a specificity of 86.25%, a positive predictive value of 96.86%, a negative predictive value of 89.61% and f1-score of 97.27%. To ensure the robustness of the used classifiers and to prevent overfitting, the 10-fold cross-validation method was applied, where the analyzed data was divided into 10 subsets with samples having approximately equal distribution in each group.
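The division into 10 subsets with approximately equal class distribution can be sketched as a stratified split. This is a minimal illustration using only the class sizes from the text; the function name and the round-robin assignment are assumptions, and the sketch does not keep multiple segments of one patient in the same fold.

```python
import random

def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds with near-equal class proportions."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        indices = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(indices)
        for position, i in enumerate(indices):
            folds[position % k].append(i)  # deal each class round-robin
    return folds

labels = [0] * 80 + [1] * 347  # 0 = HC, 1 = MI
folds = stratified_folds(labels)

for number, fold in enumerate(folds):
    n_hc = sum(labels[i] == 0 for i in fold)
    print(number, n_hc, len(fold) - n_hc)
```

Each fold then holds 8 HC records and 34 or 35 MI records, so every validation fold reflects the overall class ratio.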
4.1 Comparison with existing studies
The following paragraphs compare the achieved results with existing studies. The focus is primarily on a similar problem, namely the extraction of VCG features and machine learning methods for the possibility of automatic pathology detection.
Among recent studies focused on MI detection, the authors in Correa et al. (2013b) analyzed a total of seven QRS features and their classification ability, achieving a sensitivity of 88.5% and a specificity of 92.1%. The authors further expanded their analysis in Correa et al. (2013a), where they extracted 4 features analyzing the QRS loop and 3 features analyzing the T loop for 51 ischemic and 52 healthy subjects. Using the LDA classifier, they achieved a sensitivity of 95.4% and a specificity of 95.2%. The authors also devoted their attention to the classification of anterior and inferior infarctions in Correa et al. (2016), where they achieved an accuracy of 89.8% using nine features and the LDA classifier. The spatial velocity dynamics of the QRS loop VCG of patients with AMI was also analyzed by Ghosal et al. (2024). Using quasi-orthogonal leads I, aVF, and V2, they constructed the VCG and examined spatial velocity, spatial distance and spatial magnitude. The results showed decreased spatial velocity and spatial distance values in patients with AMI. Another important analysis of VCG features was performed by the authors in De la Garza Salazar and Egenriether (2024), where they analyzed 315 VCG parameters from the P, QRS and T loops using derived VCG via the Kors transformation (Kors et al., 1990). They identified significant associations between VCG features and patient characteristics, such as age, sex, BMI, hypertension, and echocardiographic findings. The results highlight the potential of VCG analysis in cardiovascular disease assessment.
Dehnavi et al. (2011) compared the success of MI detection from ECG and VCG recordings. For this purpose, they extracted 22 features from 60 ischemic and 10 healthy recordings and used a neural network for classification. They observed a significant improvement in the processing of VCG data, achieving an accuracy of 86%, compared to 73% for ECG. Zhao et al. (2022) used a combination of features extracted from ECG and VCG and achieved an accuracy of 90.3%, sensitivity of 90.3% and specificity of 90.5% using an SVM classifier. Additionally, Dima et al. (2013) used 25 features extracted from 158 MI and 52 HC recordings to detect MI. For classification, they employed an SVM classifier, which resulted in an accuracy of 89.22%, sensitivity of 76%, and specificity of 87.5%. SVM was also used as a classification method by Panagiotou et al. (2013) and Khan and Pachori (2021a). Panagiotou et al. (2013) extracted 27 morphological features from 158 pathological and 102 healthy records, achieving an accuracy of 82.36%, sensitivity of 84.31%, and specificity of 77.36%. In contrast, Khan and Pachori (2021a) used 12 features based on the frequency components of the signal, achieving an accuracy of 95.52%, sensitivity of 91.08%, and specificity of 97.45%. Features based on the wavelet transform of the signal were also used by Keshtkar et al. (2013), who proposed the evaluation of the wavelet coefficients obtained over the averaged ECG signal. From the obtained set of parameters and using a neural network, they achieved an accuracy of 89.5%, a sensitivity of 93% and a specificity of 86%. Tripathy and Dandapat (2017) also obtained VCG features using wavelet transformation. In their analysis, they employed the relevance vector machine method, using 100 MI and 50 HC records, which resulted in a sensitivity of 98.40% and specificity of 98.66%. Furthermore, Zhang J. et al. (2022) detected MI localization with a detection success rate exceeding 99% using wavelet decomposition of the detected individual beats. However, the high detection success rates in the above-mentioned works based on wavelet decomposition come at the cost of high computational complexity and a large memory requirement, as stated for example in Zhang J. et al. (2022).
The concept of ensemble learning was also applied by the authors in Sun et al. (2022), who extracted features from transformed VCGs based on the spectral fitting exponent, Lyapunov exponent, and Lempel-Ziv complexity. These features were processed through an ensemble learning algorithm based on bagging, combining various classifiers using weighted voting. The proposed algorithm achieved a detection accuracy of 91.11%, sensitivity of 90.49%, and specificity of 92.88% on the PTB database. Hafshejani et al. (2021) applied the Classification and Regression Tree (CART) method on a balanced dataset consisting of 80 MI and 80 HC records. Using octant theory, they extracted a total of 48 features. With the dataset split into 90% for training and 10% for testing, the authors achieved an accuracy of 98.1%, sensitivity of 98.8%, and specificity of 97.5%. Karisik and Baumert (2019) extracted VCG features based on the morphological properties of the QRS and T loops, and using a long short-term memory network, they achieved MI detection accuracy of 89.1%, with a sensitivity of 89.1% and a specificity of 90%. In their analysis, they used 78 MI and 69 HC records from the PTB database. The authors in Aranda Hernandez et al. (2023) also applied ensemble methods for detecting various types of MI, focusing on the automatic localization of MI using four different M-L methods with 98 VCG features as input data. They report that Lasso or ensemble models can achieve better detection results in situations with limited data, compared to more complex models based on deep learning approaches.
Attention was also paid to comparing different types of classifiers using VCG features derived from signal decompositions based on wavelet variations (Sharma and Sunkaria, 2018; Khan and Pachori, 2021b). Sharma and Sunkaria (2018) analyzed two classification methods (SVM and KNN) to see which one would perform better. From the dataset of features based on signal decomposition, they achieved the higher detection accuracy with the SVM method, which reached a sensitivity of 99.35%, a specificity of 98.29% and an accuracy of 98.41%. However, it should be mentioned that the authors in Sharma and Sunkaria (2018) used 30 pathological and 52 healthy subjects. Gragnaniello et al. (2024) compared two methods for MI detection based on M-L and deep learning (D-L) techniques from a proposed single microcontroller-based system. They used 4-s recordings from the PTB database for analysis and achieved detection accuracy of 89.40% for M-L and 94.76% for D-L.
A comparison with relevant publications analyzing a similar problem is shown in Table 14, where attention is mainly paid to the number of extracted features, the number of records used and the M-L method used.
Table 14. A summary of selected relevant publications related to the topic of this work analyzing VCG features using machine learning methods for myocardial infarction detection.
4.2 Advantages and limitations of proposed study
It is worth mentioning that the authors in Khan and Pachori (2021b) analyzed three different classifiers for posterior MI (PMI) detection by using derived VCG from a 12 lead ECG, whereas, in the present study, attention is paid to the possibility of MI detection from directly measured VCG records. In contrast, in the present study, we extracted 12 features that analyze the morphological properties of cardiac revolution caused by infarct conditions, while in Khan and Pachori (2021b) authors used features based on signal decomposition using a specialized wavelet transform. In this work, VCG features can be considered as mathematical operations applied to VCG signals that do not require high computational complexity and can be applied to any recording. Compared to other publications, this work ranks among the publications using a lower number of features, which may affect the subsequent computational complexity. These features were analyzed to verify their informative value and their relevance was confirmed using statistical analysis.
The present work also uses a larger or comparable number of patient records compared to the previous studies, as indicated in Table 14 and in the section comparing our results with existing studies. It is evident from the table that different authors use varying numbers of records from the same database for their analyses. This inconsistency can be misleading, as some studies do not specify which subset of data was used. We are also aware of the imbalance between the MI and HC groups, which is subsequently reflected in the specificity values in our study. Achieving consistently high specificity is particularly challenging when working with a relatively small number of HC. Even a few misclassified cases can influence this metric. Nevertheless, the obtained values correspond to the selected operating point on the ROC curve, which reflects the trade-off between sensitivity and specificity and was chosen to favor higher sensitivity due to its greater clinical importance in MI detection. This choice inevitably leads to a slight decrease in specificity, which becomes more apparent given the small HC cohort. Furthermore, due to the imbalance between the analyzed groups (MI and HC), additional metrics such as NPV, PPV, and f1-score were used, which are more appropriate in this case. Another challenge in the literature is the unavailability of data used by some authors who do not rely on publicly accessible databases. The limited availability of VCG data remains a barrier to comprehensive analyses of VCG recordings. Future research may focus on developing a modern database of ECG and VCG records to address this limitation.
Furthermore, this work examines commonly used M-L methods for detecting various heart diseases based on ECG and VCG recordings. A review of methods applied in the last 5 years was conducted. The analyzed M-L methods were experimentally optimized to determine their best performance through a comprehensive comparative analysis. The results of MI detection using M-L methods, combined in a Stacking ensemble model, are better than the state-of-the-art, particularly in the analysis of morphological features in both the time and frequency domains. To the best of our knowledge, this is the first time that the following combination has been applied to directly measured VCG recordings: features based on the morphological properties of cardiac revolution, verification of their predictive value using a statistical test, a detailed analysis of classifiers with experimentally set hyperparameters, and the subsequent creation of a Stacking ensemble learning model with a high detection success rate. The aim was to find a suitable methodology for detecting MI that could subsequently support diagnostics in clinical practice. This work contributes to the existing literature by demonstrating the benefits of MI detection based on a detailed analysis of M-L methods using morphological VCG features, followed by the design of a Stacking model of M-L methods. The results achieved have the potential to enhance clinical diagnostics.
Furthermore, this work offers additional potential directions for addressing the given issue, such as extracting additional VCG features that could enhance the informative value within classification, or applying the described methodology to transformed VCG recordings derived from 12-lead ECG, which is the most commonly measured in clinical practice. These derived VCGs could provide complementary information to existing ECG-based diagnostic systems and support clinicians in their decision-making. The 12-lead ECG, routinely used in clinical examinations, can be mathematically transformed into VCG leads using established transformation methods. Such reconstructed VCG recordings expand the scope of cardiac electrical activity analysis and can serve as a valuable supplement to conventional ECG interpretation. From these derived VCG leads, the same set of VCG features can be extracted, and the described classification approach using both standard machine learning and stacking techniques can be applied. In this way, the proposed method could be implemented on already acquired ECG data for diagnostic verification or integrated into real-time clinical analysis, thereby contributing to the improvement of the necessary treatment.
5 Conclusion
Early diagnosis and detection of myocardial infarction can help physicians provide timely and effective treatment, potentially saving the lives of at-risk patients. In this study, a methodology for processing of vectorcardiographic records for the automated detection of MI records was proposed, based on the analysis of extracted VCG features analyzing the morphological properties of the QRS and T loop. Among 210 different M-L settings, the highest accuracy was achieved using the Stacking Ensemble Learning method, which combines multiple base models and employs logistic regression as a meta-classifier. This approach yielded high accuracy of 95.55%, sensitivity of 97.70%, specificity of 86.25%, positive predictive value of 96.86%, negative predictive value of 89.61%, and f1-score of 97.27%. These results confirmed the usefulness of the VCG method and can aid physicians in decision-making regarding subsequent treatment plans.
Data availability statement
Publicly available datasets were analyzed in this study. This data can be found here: https://physionet.org/content/ptbdb/1.0.0/.
Ethics statement
The studies involving humans were approved by https://physionet.org/content/ptbdb/1.0.0/. The studies were conducted in accordance with the local legislation and institutional requirements. Written informed consent for participation was not required from the participants or the participants’ legal guardians/next of kin in accordance with the national legislation and institutional requirements.
Author contributions
JV: Conceptualization, Writing – review and editing, Methodology, Investigation, Writing – original draft, Visualization, Formal Analysis, Validation. MP: Writing – original draft, Funding acquisition, Investigation, Supervision, Formal Analysis, Writing – review and editing, Project administration, Conceptualization, Resources.
Funding
The authors declare that financial support was received for the research and/or publication of this article. This article has been produced with the financial support of the European Union under the LERCO project number CZ.10.03.01/00/22_003/0000003 via the Operational Programme Just Transition. The work and the contributions were supported by the project SP2025/032 “Biomedical Engineering systems XXI”.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The authors declare that no Generative AI was used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Supplementary material
The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fphys.2025.1683956/full#supplementary-material
References
Acharya U. R., Fujita H., Oh S. L., Hagiwara Y., Tan J. H., Adam M. (2017). Application of deep convolutional neural network for automated detection of myocardial infarction using ECG signals. Inf. Sci. 415, 190–198. doi:10.1016/j.ins.2017.06.027
Alexandropoulos S.-A. N., Aridas C. K., Kotsiantis S. B., Vrahatis M. N. (2019). Stacking strong ensembles of classifiers. Springer, 545–556.
Amini M., Zayeri F., Salehi M. (2021). Trend analysis of cardiovascular disease mortality, incidence, and mortality-to-incidence ratio: results from global burden of disease study 2017. BMC Public Health 1, 12–412. doi:10.1186/s12889-021-10429-0
Aranda Hernandez A., Bonizzi P., Peeters R., Karel J. (2023). Automatic localization of myocardial infarction using vectorcardiography.
Balakrishnama S., Ganapathiraju A. (1998). Linear discriminant analysis-a brief tutorial. Inst. Signal Information Process. 18, 1–8.
Barmpoutis P., Dimitropoulos K., Apostolidis A., Grammalidis N. (2019). Multi-lead ECG signal analysis for myocardial infarction detection and localization through the mapping of Grassmannian and euclidean features into a common hilbert space. Biomed. Signal Process. Control 52, 111–119. doi:10.1016/j.bspc.2019.04.003
Bartall H., Desser K. B., Benchimol A., Massey B. J. (1978). Echocardiographic left atrial enlargement. Comparison of vectorcardiogram and electrocardiogram for detection. J. Electrocardiol. 11, 355–359. doi:10.1016/s0022-0736(78)80141-1
Belgiu M., Drăguţ L. (2016). Random forest in remote sensing: a review of applications and future directions. ISPRS Journal Photogrammetry Remote Sensing 114, 24–31. doi:10.1016/j.isprsjprs.2016.01.011
Bousseljot R., Kreiseler D., Schnabel A. (1995). Nutzung der EKG-Signaldatenbank CARDIODAT der PTB über das Internet. Biomed. Technik/Biomedical Eng. 40, 317–318. doi:10.1515/bmte.1995.40.s1.317
Chuang Y.-H., Lee C.-Y., Chen Y.-H., Chang W.-W. (2023). A hybrid system for myocardial infarction classification with derived vectorcardiography. IEEE, 468–473. doi:10.1109/icufn57995.2023.10200755
Correa R., Arini P. D., Correa L., Valentinuzzi M. E., Laciar E. (2013a). Acute myocardial ischemia monitoring before and during angioplasty by a novel vectorcardiographic parameter set. J. Electrocardiology 46, 635–643. doi:10.1016/j.jelectrocard.2013.06.001
Correa R., Arini P. D., Valentinuzzi M. E., Laciar E. (2013b). Novel set of vectorcardiographic parameters for the identification of ischemic patients. Med. Engineering and Physics 35, 16–22. doi:10.1016/j.medengphy.2012.03.005
Correa R., Arini P. D., Correa L. S., Valentinuzzi M., Laciar E. (2016). Identification of patients with myocardial infarction. Vectorcardiographic and electrocardiographic analysis. Methods Information Medicine 55, 242–249. doi:10.3414/ME15-01-0101
Cowdery C. D., Wagner G., Starr J., Rogers G., Greenfield Jr J. (1980). New vectorcardiographic criteria for diagnosing right ventricular hypertrophy in mitral stenosis: comparison with electrocardiographic criteria. Circulation 62, 1026–1032. doi:10.1161/01.cir.62.5.1026
De la Garza Salazar F., Egenriether B. (2024). Exploring vectorcardiography: an extensive vectocardiogram analysis across age, sex, BMI, and cardiac conditions. J. Electrocardiol. 82, 100–112. doi:10.1016/j.jelectrocard.2023.12.004
Dehnavi A. R. M., Farahabadi I., Rabbani H., Farahabadi A., Mahjoob M. P., Dehnavi N. R. (2011). Detection and classification of cardiac ischemia using vectorcardiogram signal via neural network. J. Research Medical Sciences The Official Journal Isfahan Univ. Med. Sci. 16, 136–142.
Dellborg M., Malmberg K., Ryden L., Svensson A.-m., Swedberg K. (1995). Dynamic on-line vectorcardiography improves and simplifies in-hospital ischemia monitoring of patients with unstable angina. J. Am. Coll. Cardiol. 26, 1501–1507. doi:10.1016/0735-1097(95)00361-4
Dhawan A., Wenzel B., George S., Gussak I., Bojovic B., Panescu D. (2012). Detection of acute myocardial infarction from serial ECG using multilayer support vector machine. IEEE 2012, 2704–2707. doi:10.1109/EMBC.2012.6346522
Diker A., Cömert Z., Avci E., Velappan S. (2018). Intelligent system based on genetic algorithm and support vector machine for detection of myocardial infarction from ECG signals. IEEE, 1–4. doi:10.1109/siu.2018.8404299
Dima S.-M., Panagiotou C., Mazomenos E. B., Rosengarten J. A., Maharatna K., Gialelis J. V., et al. (2013). On the detection of myocadial scar based on ECG/VCG analysis. IEEE Trans. Biomed. Eng. 60, 3399–3409. doi:10.1109/TBME.2013.2279998
Dohare A. K., Kumar V., Kumar R. (2018). Detection of myocardial infarction in 12 lead ECG using support vector machine. Appl. Soft Comput. 64, 138–147. doi:10.1016/j.asoc.2017.12.001
Dreiseitl S., Ohno-Machado L. (2002). Logistic regression and artificial neural network classification models: a methodology review. J. Biomedical Informatics 35, 352–359. doi:10.1016/s1532-0464(03)00034-0
Fatimah B., Singh P., Singhal A., Pramanick D., Pranav S., Pachori R. B. (2021). Efficient detection of myocardial infarction from single lead ECG signal. Biomed. Signal Process. Control 68, 102678. doi:10.1016/j.bspc.2021.102678
Filos D., Chouvarda I., Tachmatzidis D., Vassilikos V., Maglaveras N. (2017). Beat-to-beat P-wave morphology as a predictor of paroxysmal atrial fibrillation. Comput. Methods Programs Biomed. 151, 111–121. doi:10.1016/j.cmpb.2017.08.016
Frank E. (1956). An accurate, clinically practical system for spatial vectorcardiography. Circulation 13, 737–749. doi:10.1161/01.cir.13.5.737
Ghosal T., Hembram A., Ahmed I., Goswami D. P., Bandyopadhyay A., Sengupta A. (2024). Exploring cardiac vector propagation in acute myocardial infarction: a spatial velocity perspective. Future Cardiol. 20, 1–10. doi:10.1080/14796678.2024.2440257
Goernig M., Hoeffling B., Lau S., Figulla H. R., Haueisen J. (2015). T vector and loop characteristics improve detection of myocardial injury after infarction. J. Med. Biol. Eng. 35, 381–386. doi:10.1007/s40846-015-0041-8
Goldberger A., Amaral L., Glass L., Hausdorff J., Ivanov P. C., Mark R., et al. (2000). PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. Circulation 101, e215–e220. doi:10.1161/01.cir.101.23.e215
Gragnaniello M., Borghese A., Marrazzo V. R., Maresca L., Breglio G., Irace A., et al. (2024). Real-time myocardial infarction detection approaches with a microcontroller-based Edge-AI device. Sensors 24, 828. doi:10.3390/s24030828
Habibzadeh F., Habibzadeh P., Yadollahie M. (2016). On determining the most appropriate test cut-off value: the case of tests with continuous results. Biochem. Medica 26, 297–307. doi:10.11613/BM.2016.034
Hafshejani N. J., Mehridehnavi A., Hajian R., Boudagh S., Behjati M. (2021). Identification of myocardial infarction using morphological features of electrocardiogram and vectorcardiogram. IET Signal Process. 15, 674–685. doi:10.1049/sil2.12072
Hammad M., Maher A., Wang K., Jiang F., Amrani M. (2018). Detection of abnormal heart conditions based on characteristics of ECG signals. Measurement 125, 634–644. doi:10.1016/j.measurement.2018.05.033
Hammad M., Chelloug S. A., Alkanhel R., Prakash A. J., Muthanna A., Elgendy I. A., et al. (2022). Automated detection of myocardial infarction and heart conduction disorders based on feature selection and a deep learning model. Sensors 22, 6503. doi:10.3390/s22176503
Han C., Shi L. (2019). Automated interpretable detection of myocardial infarction fusing energy entropy and morphological features. Comput. Methods Programs Biomedicine 175, 9–23. doi:10.1016/j.cmpb.2019.03.012
He C., Liu M., Xiong P., Yang J., Du H., Xu J., et al. (2022). Localization of myocardial infarction using a multi-branch weight sharing network based on 2-D vectorcardiogram. Eng. Appl. Artif. Intell. 116, 105428. doi:10.1016/j.engappai.2022.105428
Hearst M. A., Dumais S. T., Osuna E., Platt J., Scholkopf B. (1998). Support vector machines. IEEE Intelligent Syst. Their Applications 13, 18–28. doi:10.1109/5254.708428
Hernandez A. A., Bonizzi P., Karel J., Peeters R. (2018). Myocardial ischemia diagnosis using a reduced lead system. IEEE, 5302–5305. doi:10.1109/embc.2018.8513511
Hsieh C.-J., Chang K.-W., Lin C.-J., Keerthi S. S., Sundararajan S. (2008). A dual coordinate descent method for large-scale linear SVM, 408–415.
Jambukia S. H., Dabhi V. K., Prajapati H. B. (2015). “Classification of ECG signals using machine learning techniques: a survey,” in 2015 international conference on advances in computer engineering and applications (IEEE), 714–721.
Karisik F., Baumert M. (2019). A long short-term memory network to classify myocardial infarction using vectorcardiographic ventricular depolarization and repolarization. IEEE, 1. doi:10.22489/CinC.2019.272
Keshtkar A., Seyedarabi H., Sheikhzadeh P., Rasta S. H. (2013). Discriminant analysis between myocardial infarction patients and healthy subjects using wavelet transformed signal averaged electrocardiogram and probabilistic neural network. J. Medical Signals Sensors 3, 225–230. doi:10.4103/2228-7477.128316
Ketu S., Mishra P. K. (2022). Empirical analysis of machine learning algorithms on imbalance electrocardiogram based arrhythmia dataset for heart disease detection. Arabian J. Sci. Eng. 47, 1–23. doi:10.1007/s13369-021-05972-2
Khan S. I., Pachori R. B. (2021a). Automated detection of posterior myocardial infarction from vectorcardiogram signals using Fourier–Bessel series expansion based empirical wavelet transform. IEEE Sensors Lett. 5, 1–4. doi:10.1109/lsens.2021.3070142
Khan S. I., Pachori R. B. (2021b). Derived vectorcardiogram based automated detection of posterior myocardial infarction using FBSE-EWT technique. Biomed. Signal Process. Control 70, 103051. doi:10.1016/j.bspc.2021.103051
Kijonka J., Vavra P., Zonca P., Penhaker M. (2022). A wavelet-based VCG QRS loop boundaries and isoelectric coordinates detector. Front. Physiology 13, 941827. doi:10.3389/fphys.2022.941827
Kors J., Van Herpen G., Sittig A., Van Bemmel J. (1990). Reconstruction of the Frank vectorcardiogram from standard electrocardiographic leads: diagnostic comparison of different methods. Eur. Heart Journal 11, 1083–1092. doi:10.1093/oxfordjournals.eurheartj.a059647
Malmivuo J., Plonsey R. (1995). Bioelectromagnetism: principles and applications of bioelectric and biomagnetic fields. USA: Oxford University Press.
Mantas C. J., Abellán J. (2014). Credal-C4.5: decision tree based on imprecise probabilities to classify noisy data. Expert Syst. Appl. 41, 4625–4637. doi:10.1016/j.eswa.2014.01.017
Mantas C. J., Abellán J. (2014). Analysis and extension of decision trees based on imprecise probabilities: application on noisy data. Expert Syst. Appl. 41, 2514–2525. doi:10.1016/j.eswa.2013.09.050
McFee R., Parungao A. (1961). An orthogonal lead system for clinical electrocardiography. Am. Heart J. 62, 93–100. doi:10.1016/0002-8703(61)90488-4
Mozaffarian D., Benjamin E. J., Go A. S., Arnett D. K., Blaha M. J., Cushman M., et al. (2016). Heart disease and stroke statistics–2016 update: a report from the American Heart Association. Circulation 133, e38–e360. doi:10.1161/CIR.0000000000000350
Mucherino A., Papajorgji P. J., Pardalos P. M. (2009). “K-nearest neighbor classification,” in Data mining in agriculture, 83–106.
Murat F., Yildirim O., Talo M., Demir Y., Tan R.-S., Ciaccio E. J., et al. (2021). Exploring deep features and ECG attributes to detect cardiac rhythm classes. Knowledge-Based Syst. 232, 107473. doi:10.1016/j.knosys.2021.107473
Ng A., Jordan M. (2001). On discriminative vs. generative classifiers: a comparison of logistic regression and naive Bayes. Adv. Neural Information Processing Systems 14.
Noble W. S. (2006). What is a support vector machine? Nat. Biotechnology 24, 1565–1567. doi:10.1038/nbt1206-1565
Panagiotou C., Dima S.-M., Mazomenos E. B., Rosengarten J., Maharatna K., Gialelis J., et al. (2013). Detection of myocardial scar from the VCG using a supervised learning approach. IEEE 2013, 7326–7329. doi:10.1109/EMBC.2013.6611250
Perkins S. J., Monovoukas D., Chopra Z., Kucharski K., Powell C., Vejalla A., et al. (2024). Vectorcardiography predicts heart failure in patients following ST elevation myocardial infarction. Ann. Noninvasive Electrocardiol. 29, e70013. doi:10.1111/anec.70013
Prabhakararao E., Dandapat S. (2019). A weighted SVM based approach for automatic detection of posterior myocardial infarction using VCG signals. IEEE, 1–6. doi:10.1109/ncc.2019.8732238
Prabhakararao E., Dandapat S. (2020). Automated detection of posterior myocardial infarction from VCG signals using stationary wavelet transform based features. IEEE Sensors Lett. 4, 1–4. doi:10.1109/lsens.2020.2992760
Ripoll V. J. R., Wojdel A., Romero E., Ramos P., Brugada J. (2016). ECG assessment based on neural networks with pretraining. Appl. Soft Comput. 49, 399–406. doi:10.1016/j.asoc.2016.08.013
Rossello X., Wiegerinck R., Alguersuari J., Bardaji A., Worner F., Sutil M., et al. (2013). Improved diagnostic yielding of acute pericarditis and myocardial ischemia based on changes in QRS complex and QT interval duration. Eur. Heart J. 34, P5576. doi:10.1093/eurheartj/eht310.p5576
Schmitt O. H., Simonson E. (1955). Symposium on electrocardiography and vectorcardiography: the present status of vectorcardiography. AMA Archives Internal Medicine 96, 574–590. doi:10.1001/archinte.1955.00250160016002
Shalev-Shwartz S., Singer Y., Srebro N. (2007). Pegasos: primal estimated sub-gradient solver for SVM, 807–814.
Sharma L. D., Sunkaria R. K. (2018). Inferior myocardial infarction detection using stationary wavelet transform and machine learning approach. Signal, Image Video Process. 12, 199–206. doi:10.1007/s11760-017-1146-z
Sharma M., San Tan R., Acharya U. R. (2018). A novel automated diagnostic system for classification of myocardial infarction ECG signals using an optimal biorthogonal filter bank. Comput. Biology Medicine 102, 341–356. doi:10.1016/j.compbiomed.2018.07.005
Śmigiel S., Pałczyński K., Ledziński D. (2021). ECG signal classification using deep learning techniques based on the PTB-XL dataset. Entropy 23, 1121. doi:10.3390/e23091121
Sun Q., Liang C., Chen T., Ji B., Liu R., Wang L., et al. (2022). Early detection of myocardial ischemia in 12-lead ECG using deterministic learning and ensemble learning. Comput. Methods Programs Biomed. 226, 107124. doi:10.1016/j.cmpb.2022.107124
Tadesse G. A., Javed H., Weldemariam K., Liu Y., Liu J., Chen J., et al. (2021). DeepMI: deep multi-lead ECG fusion for identifying myocardial infarction and its occurrence-time. Artif. Intell. Med. 121, 102192. doi:10.1016/j.artmed.2021.102192
Thygesen K., Alpert J. S., White H. D., Joint ESC/ACCF/AHA/WHF Task Force for the Redefinition of Myocardial Infarction (2007). Universal definition of myocardial infarction. J. Am. Coll. Cardiol. 50, 2173–2195. doi:10.1016/j.jacc.2007.09.011
Tripathy R., Dandapat S. (2017). Detection of myocardial infarction from vectorcardiogram using relevance vector machine. Signal, Image Video Process. 11, 1139–1146. doi:10.1007/s11760-017-1068-9
van Bemmel J. H., Kors J. A., van Herpen G. (1992). Combination of diagnostic classifications from ECG and VCG computer interpretations. J. Electrocardiology 25, 126–130. doi:10.1016/0022-0736(92)90078-e
Vondrak J., Penhaker M. (2022a). Review of processing pathological vectorcardiographic records for the detection of heart disease. Front. Physiology 13, 535. doi:10.3389/fphys.2022.856590
Vondrak J., Penhaker M. (2022b). Statistical evaluation of transformation methods accuracy on derived pathological vectorcardiographic leads. IEEE J. of Transl. Eng. in Health and Med. 10, 1900208. doi:10.1109/JTEHM.2022.3167009
Vondrak J., Penhaker M., Kubicek J. (2024). Detection of myocardial infarction using analysis of vectorcardiographic loops. Measurement 226, 114094. doi:10.1016/j.measurement.2023.114094
Weng J. T.-Y., Lin J.-J., Chen Y.-C., Chang P.-C. (2014). Myocardial infarction classification by morphological feature extraction from big 12-lead ECG data. Springer, 689–699.
Wright S. J., Nowak R. D., Figueiredo M. A. (2009). Sparse reconstruction by separable approximation. IEEE Trans. Signal Processing 57, 2479–2493. doi:10.1109/tsp.2009.2016892
Xu W. (2011). Towards optimal one pass large scale learning with averaged stochastic gradient descent. arXiv preprint arXiv:1107.2490. doi:10.48550/arXiv.1107.2490
Yoon T., Kang D. (2023). Multi-modal stacking ensemble for the diagnosis of cardiovascular diseases. J. Personalized Med. 13, 373. doi:10.3390/jpm13020373
Zeng W., Yuan C. (2023). Myocardial infarction detection using ITD, DWT and deterministic learning based on ECG signals. Cogn. Neurodynamics 17, 941–964. doi:10.1007/s11571-022-09870-7
Zeng W., Yuan J., Yuan C., Wang Q., Liu F., Wang Y. (2020). Classification of myocardial infarction based on hybrid feature extraction and artificial intelligence tools by adopting tunable-Q wavelet transform (TQWT), variational mode decomposition (VMD) and neural networks. Artif. Intell. Med. 106, 101848. doi:10.1016/j.artmed.2020.101848
Zhang J., Liu A., Gao M., Chen X., Zhang X., Chen X. (2020). ECG-Based multi-class arrhythmia detection using spatio-temporal attention-based convolutional recurrent neural network. Artif. Intell. Med. 106, 101856. doi:10.1016/j.artmed.2020.101856
Zhang J., Liu M., Xiong P., Du H., Zhang H., Lin F., et al. (2021). A multi-dimensional association information analysis approach to automated detection and localization of myocardial infarction. Eng. Appl. Artif. Intell. 97, 104092. doi:10.1016/j.engappai.2020.104092
Zhang J., Liu M., Xiong P., Du H., Yang J., Xu J., et al. (2022a). Automated localization of myocardial infarction from vectorcardiographic via tensor decomposition. IEEE Trans. Biomed. Eng. 70, 812–823. doi:10.1109/TBME.2022.3202962
Zhang X., Wu H., Chen T., Wang G. (2022b). Automatic diagnosis of arrhythmia with electrocardiogram using multiple instance learning: from rhythm annotation to heartbeat prediction. Artif. Intell. Med. 132, 102379. doi:10.1016/j.artmed.2022.102379
Zhao C., Vij A., Malhotra S., Tang J., Tang H., Pienta D., et al. (2021). Automatic extraction and stenosis evaluation of coronary arteries in invasive coronary angiograms. Comput. Biol. Med. 136, 104667. doi:10.1016/j.compbiomed.2021.104667
Zhao X., Zhang J., Gong Y., Xu L., Liu H., Wei S., et al. (2022). Reliable detection of myocardial ischemia using machine learning based on temporal-spatial characteristics of electrocardiogram and vectorcardiogram. Front. Physiology 13, 854191. doi:10.3389/fphys.2022.854191
Zhou S., Tan B. (2020). Electrocardiogram soft computing using hybrid deep learning CNN-ELM. Appl. Soft Comput. 86, 105778. doi:10.1016/j.asoc.2019.105778
Zhou S. H., Liebman J., Dubin A. M., Gillette P. C., Gregg R. E., Helfenbein E. D., et al. (2001). Using 12-lead ECG and synthesized VCG in detection of right ventricular hypertrophy with terminal right conduction delay versus partial right bundle branch block in the pediatric population. J. Electrocardiology 34, 249–257. doi:10.1054/jelc.2001.28908
Keywords: myocardial infarction, vectorcardiography, electrocardiography, domain knowledge features dataset, ensemble learning, meta classifier
Citation: Vondrak J and Penhaker M (2026) Enhancing myocardial infarction detection with vectorcardiography: fusion-based comparative analysis of machine learning methods. Front. Physiol. 16:1683956. doi: 10.3389/fphys.2025.1683956
Received: 11 August 2025; Accepted: 24 November 2025;
Published: 05 January 2026.
Edited by:
Robert S. Kass, Columbia University, United States
Reviewed by:
Maruthi Venkata Chalapathi Mukkoti, VIT-AP University, India
Ahmed Khudhur, University of Kirkuk, Iraq
Copyright © 2026 Vondrak and Penhaker. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Jaroslav Vondrak, jaroslav.vondrak@vsb.cz
Marek Penhaker