- 1Department of Computer Science and Engineering, Bannari Amman Institute of Technology, Sathyamangalam, Tamil Nadu, India
- 2Department of Information Technology, Bannari Amman Institute of Technology, Sathyamangalam, Tamil Nadu, India
Introduction: Heart disease is one of the leading causes of mortality worldwide, and early detection is crucial for effective treatment. Phonocardiogram (PCG) signals have shown potential in diagnosing cardiovascular conditions. However, accurate classification of PCG signals remains challenging due to high-dimensional features, leading to misclassification and reduced performance in conventional systems.
Methods: To address these challenges, we propose a Linear Vectored Particle Swarm Optimization (LV-PSO) integrated with a Fuzzy Inference Xception Convolutional Neural Network (XCNN) for early heart risk prediction. PCG signals are analyzed to extract variations such as delta, theta, diastolic, and systolic differences. A Support Scalar Cardiac Impact Rate (S2CIR) is employed to capture disease-specific scalar variations and behavioral impacts. LV-PSO is used to reduce feature dimensionality, and the optimized features are subsequently trained using the Fuzzy Inference XCNN model to classify disease types.
Results: Experimental evaluation demonstrates that the proposed system achieves superior predictive performance compared to existing models. The method attained a precision of 95.6%, recall of 93.1%, and an overall prediction accuracy of 95.8% across multiple disease categories.
Discussion: The integration of LV-PSO with Fuzzy Inference XCNN enhances feature selection and classification accuracy, significantly improving the diagnostic capabilities of PCG-based systems. These results highlight the potential of the proposed framework as a reliable tool for early heart disease prediction and clinical decision support.
1 Introduction
One of the leading causes of death globally is heart disease, and errors in clinical identification contribute to increased fatalities. Early prediction and efficient data analysis are therefore essential for reducing patient risk (Bourouhou et al., 2019; Vahanian et al., 2021). Most existing techniques concentrate on disease properties and the feature dimensionality of the input data (Er, 2021). As feature variations increase, the growing dimensionality degrades accuracy, yielding low precision, recall, and F1 measures across parameters.
Considering these issues, optimization must improve feature selection and classification to achieve strong outcomes (Schmidt et al., 2010; Deng et al., 2020). Cardiovascular disease is the leading cause of death worldwide, affecting millions of people annually with a variety of heart conditions. Early detection and correct diagnosis are required if patients are to receive timely and efficient treatment for heart disease. Furthermore, Machine Learning (ML) algorithms provide significant assurance in clinical diagnosis, specifically in heart sound classification for diagnosing cardiac diseases (Shuvo et al., 2021).
Furthermore, many abnormalities associated with cardiovascular disease, such as heart murmurs and artifacts, can affect the heart rate, and cardiovascular disease remains the most common cause of death. This motivates new methods for early detection of heart disease (Shuvo et al., 2023; Dhiyanesh et al., 2024). Feature vectors can then be generated by extracting features directly from the recorded heart sounds using a Deep Neural Network (DNN) algorithm. Moreover, the efficacy of the suggested approach can be evaluated in a real-world setting (Springer et al., 2015).
A promising method for classifying heart sounds involves analyzing recordings of sounds created by the heart during each cardiac cycle using PCG signals. Figure 1 describes the Working Principle of PCG Signal Observation and Processing. These signals contain valuable information about heart function and can be analyzed using DL techniques to identify patterns associated with different heart states (Waaler et al., 2023). In this paper, we present an approach to improve the classification of heart diseases by utilizing a combination of Linear Vector Particle Swarm Optimization (LVPSO) and Xception Convolutional Neural Networks (XCNN). In the first step, the PCG signal is pre-processed using the proposed methods to extract the relevant features for classification. This includes techniques such as signal elimination, segmentation, and feature extraction to improve data quality and reduce noise and artifacts that can interfere with classification. Once the data is pre-processed, it is input into the LVPSO algorithm. The LVPSO algorithm is a variation of the traditional particle swarm optimization algorithm designed for linear vector optimization problems.
The LVPSO algorithm works by repeatedly updating the candidate solution particles based on the fitness values of the sound signal, which are determined by a linear vector objective function. This enables the algorithm to search the solution space efficiently and find the optimal parameters for the classification task. Using LVPSO, hyperparameters of the Xception CNN, such as learning rate, block size, and number of layers, can be effectively tuned to increase the accuracy of the classification model. Xception CNN, a DL framework, has demonstrated outstanding performance in classification tasks and has recently been utilized in clinical signal analysis with optimistic results. By combining Xception CNN with LVPSO, we aim to leverage the strengths of both algorithms to improve the accuracy and robustness of cardiovascular disease classification models. In this collaborative approach, the Xception CNN is trained on the pre-processed PCG signals to learn the underlying patterns associated with different heart states, while LVPSO optimizes the CNN's hyperparameters.
The proposed acoustic classification of cardiac disease using LVPSO and Xception CNN has the potential to significantly enhance the accuracy and efficiency of cardiac disease diagnosis. By leveraging the power of ML algorithms and DL frameworks, a robust and reliable classification model can be developed to assist healthcare professionals in the early detection and treatment of cardiac disorders. However, it is crucial to note that further research and experiments are essential to validate the efficacy of this approach. The initial results are promising, suggesting that this approach could have a significant impact on the field of heart disease.
The paper is structured into several sections to outline the cardiac sound classification research process. Section 1 provides an introduction to the research; Section 2 reviews the principles of existing methods along with their pros and cons; Section 3 explains the proposed method; Section 4 presents method comparisons; and Section 5 concludes with a discussion and final remarks, highlighting the performance of the proposed work and suggesting future developments.
2 Related work
A recent literature review (Krishnan et al., 2020) comprehensively summarizes current research on using PCG signals in predicting cardiac diseases through ML and DL methods. The potential of advanced technologies like ML and artificial intelligence to significantly enhance the precision and effectiveness of cardiac disease prediction is gaining momentum. Cardiovascular disease, a leading global cause of mortality, underscores the importance of timely identification for successful intervention and prevention. Recent analyses through various signal transformations have highlighted the importance of predicting heart disease by audio signals (Kiranyaz et al., 2020). Analyzing heart sounds with a phonocardiograph allows for recording heart sounds, which can be further explored with computational algorithms.
Early identification of abnormal heart sounds is a significant challenge in predicting heart disease. Conventional diagnostic methods for heart disease, like Electrocardiograms (ECG), are usually invasive and require specialized approaches. However, the use of non-invasive tools like stethoscopes to gather sound signals offers a convenient and promising approach to predicting heart disease.
Recent technological advances have enabled DL-based characterization of heart disease from PCG signals (Dhiyanesh et al., 2025). Modern DL methods, such as CNNs and Recurrent Neural Networks (RNNs), have successfully identified inconsistencies in PCG data. The potential of these advanced methods to revolutionize early detection and diagnosis of heart disease, reduce the burden on healthcare facilities, and improve patient outcomes is significant. However, a substantial problem with PCG signal processing lies in the limitations on feature selection. The high dimensionality of the derived features often leads to incorrect feature selection and poor accuracy.
A novel study proposed that the prediction accuracy of heart failure can be improved by combining neural networks and Particle Swarm Optimization (PSO) techniques. However, cardiovascular disease continues to be a significant issue globally, with its mortality rate on the rise (Mahalakshmi and Rout, 2023).
Moreover, the enhanced PSO algorithm identifies the optimal features and feature subsets. The optimal feature subset is carefully selected and fed into an ensemble classifier to determine the likelihood of heart disease accurately (Yuliandari et al., 2024). A new approach to a Neural Fuzzy Inference System (NFIS) for representing training data can be created using n-dimensional functions. NFIS optimizes learning algorithms by calibrating them with an error calculation module (Jha et al., 2022). The new approach aimed to detect cardiac disorders using health metrics gathered from wearable sensors integrated with a Fuzzy Logic Inference System (FLIS) (Kadu et al., 2022).
Furthermore, CNNs are extensively employed to predict heart disease in various domains, including computer vision and image identification. CNNs can accurately assist in analyzing and predicting heart disease. Furthermore, CNNs can automatically learn hierarchical representations of data (Alzubaidi et al., 2021).
PSO-based methods can be applied to optimize the parameters of stacked sparse autoencoders. Furthermore, PSO optimization permits enhancing the performance of feature learning and classification (Mienye and Sun, 2021).
Table 1 presents DL techniques, datasets, and methods derived from previous approaches for heart disease detection, outlining the constraints and accuracy of performance evaluation achieved in predicting heart diseases.
The proposed method utilizes SMOTE to manage imbalanced data in datasets effectively. Besides, these permit accurate classification of a given dataset and ensure maximum accuracy in performance evaluation results (Waqar et al., 2021). Cardiac signals can be automatically detected by decomposing them into discrete model functions utilizing the Complete Ensemble Empirical Mode Decomposition (CEEMD) method (Manuel Centeno-Bautista et al., 2023). Moreover, the signal-to-noise ratio model parts can be approximated to extract time and frequency details of the decaying mode through the EEMD analysis method (Zhao et al., 2023). Hence, the Least Mean Square (LMS) algorithm offers an optimal adaptive filter system for accurately estimating noisy signals. Likewise, a noisy signal can be processed in series with multiple adaptive filter stages (Hannah Pauline and Dhanalakshmi, 2022).
Furthermore, DL techniques have been analyzed for their ability to predict heart disease from sound signals. For example, Raza et al. (2019) developed a CNN-based model that detects heart murmurs from acoustic signals with up to 90% accuracy. Similarly, Jones et al. (2019) used an RNN to predict the onset of atrial fibrillation with an accuracy of 85%. Furthermore, DL techniques such as CNN and RNN have shown accurate results in analyzing sound signals to predict heart disease. While RNNs are more effective at capturing temporal correlations in data, CNNs are better at extracting spatial features from sound recordings. By combining the two techniques, researchers achieved greater accuracy in predicting various heart diseases. However, feature dimensionality creates worst-case scenarios during classification, as threshold changes in feature ranges can lead to lower precision and recall.
In addition to CNNs and RNNs, auto-encoders and Generative Adversarial Networks (GANs) are other DL techniques studied to predict cardiac disease from sound signals. For example, Ramkumar et al. (2024) proposed a GAN-based model for generating artificial heart sounds to improve the training data and prediction accuracy. However, the parametric performance reduces the accuracy and results in a high error rate due to high time complexity and uncorrelated feature analysis.
The literature on predicting cardiac disease using DL techniques on sound signals still needs improvement. In particular, these models require validation on large populations in real-world clinical settings to ensure their effectiveness. In the study by Amiriparian et al. (2019), the authors proposed a DL model to classify heart sounds into different categories, including normal and abnormal. The model achieved high accuracy in differentiating various types of heart sounds and demonstrated the potential of DL in analyzing sound signals for heart disease prediction. Another study by Li H. et al. (2020) focused on using DL for early detection of heart murmurs. The authors developed a DNN that can accurately classify heart murmurs based on acoustic signals, showing promising results for early diagnosis of heart disease.
In a review by Wang J. et al. (2018), the authors discussed the various DL techniques used in heart disease prediction, including CNNs and RNNs. The review highlighted the crucial role of sound signals in improving the accuracy of prediction models, ensuring the audience is well-informed about the key factors in heart disease detection.
Among the seminal works in the field, Yang et al. (2021) proposed a DL model for cardiac disease prediction using acoustic signals. The results demonstrate promising accuracy in diagnosing heart disease and highlight the potential of the DL technique. A DL model for the prediction of cardiac disease based on sound signals was established in another critical analysis (Li Y. et al., 2020). The authors combined a CNN with a Long Short-Term Memory (LSTM) algorithm to accurately predict cardiac illness by analyzing auditory data. Furthermore, they demonstrate the effectiveness of combining different DL frameworks to improve forecasting performance. In addition to these studies, several research papers have investigated using DL techniques for heart disease prediction from audio signals. For example, Wang H. et al. (2018) proposed an RNN-based DL model for heart disease prediction, and Gomathi et al. (2024) used a hybrid DL model combining CNN and SVM for the same objective.
Furthermore, researchers have explored transfer learning for heart disease prediction with sound signals. For example, Hettiarachchi et al. (2017) applied knowledge from pre-trained DL models to enhance heart disease prediction performance, showcasing the potential of transfer learning in this area. Bentley et al. (2011) assembled the PASCAL dataset of heart sounds from patients with and without heart disease and used a CNN for sound signal classification. The University of Michigan Health System (2015) presents the murmur database (MHSTP), comprising 23 heartbeat recordings totaling 1,496.8 s. CEEMD is used to detect murmurs in heart sound signals; it is more advanced than EMD because it solves the mode-mixing issue present in EMD. Extraction of the murmur and heart sounds using decomposition methods such as EMD has been performed (Oliveira et al., 2021; Dhiyanesh et al., 2024). In general, as Ali et al. (2023) show, using DL methods to examine sound signals for predicting cardiac diseases has significant potential to enhance early detection and treatment results, thereby improving patient outcomes. By employing artificial intelligence capabilities, researchers can create more precise and effective predictive models to support healthcare providers in delivering improved care to individuals with heart diseases.
Table 2 shows the methods derived from previous studies and describes their limitations. Furthermore, these can be tested as feature selection methods for predicting heart disease. The techniques listed in the table provide a systematic approach to selecting relevant features important for accurate heart disease prediction.
Some effective models can classify PCG signals using Attention-Based Bidirectional LSTM (A-BLSTM) techniques (Prabhakar and Won, 2023). Another study (Sivakami and Prabhu, 2023) used a Cuckoo Search Bio-inspired Algorithm (CSBA) with a DBN for heart disease prediction. Similarly, Muthulakshmi and Parveen (2023) developed Z-score normalization and African Buffalo Optimization (ABO) methods for effective disease prediction. Taylan et al. (2023) concentrated on classification of cardiovascular disease with the help of support vector regression (SVR) and the ANFIS algorithm. Similarly, Thakkar and Agrawal (2023) used a deep CNN with min-max normalization. Yusuf Ilu and Prasad (2023) introduced autoregressive integrated moving average (ARIMA) and K-means clustering methods for disease identification. The literature review indicates a rising interest in applying DL techniques for predicting cardiac disease based on audio signals. Various studies reviewed in this research have demonstrated the efficiency of DL models such as CNN, LSTM, and RNN in accurately predicting cardiac disease from audio signals. Besides, investigating transfer learning and hybrid models exhibits potential for further advancement in this field. In conclusion, exploring heart disease predictions through DL techniques using sound signals holds great promise in improving early detection. Through employing the capabilities of DL, researchers can create precise and effective predictive models that have the potential to save lives.
2.1 Problem identification factors and consideration
From the literature, we identified the following difficulties arising from the complex nature of heart disease prediction based on sound signals.
• One of the critical issues in PCG signal processing is the potential for improper feature selection due to identical feature dimensions. This can significantly undermine the accuracy of the results, leading to poor outcomes. Feature dimensionality creates worst-case scenarios during classification because variation in feature ranges causes low precision and recall rates.
• The simulation parameters of previous methods degraded accuracy, producing a higher false rate due to uncorrelated feature analysis, greater time complexity, and higher error rates.
2.2 Research gap
• A significant research gap exists in the understanding of complex features extracted from phonocardiogram (PCG) signals that can be used to predict cardiac disease.
• Existing work relies only on time-dependent features and does not exploit the full capabilities of ML and DL concepts for feature estimation.
• The previous algorithms focus insufficiently on early-stage heart disease prediction. Another research gap in cardiovascular disease diagnosis is the quality of the analytical data: ECG and PCG signals are sensitive to noise and artifacts. The amount of data generated can be enormous, making it difficult to manage effectively for signal processing and interpretation, and researchers are continuously working to develop powerful techniques to reduce noise and improve signal quality.
• Missing data values will result in errors; valueless data is fuzzy because it can be either true or false. Decision-making ability depends on the quality of data, and small improvements in data dimension can lead to large improvements in decision-making information.
3 Proposed methodology
This work develops a Linear Vectored Particle Swarm Optimization combined with a Fuzzy Inference Xception Convolutional Neural Network for early heart risk prediction. The first step in this approach is to utilize the Pascal dataset, which contains valuable information in the form of PCG representations. The PCG signal format converts sound waves into data, allowing for the identification of critical features such as delta, theta, diastolic, and systolic differences present in the dataset. These factors significantly influence the risk of heart disease. Model accuracy is improved by applying preprocessing techniques such as SMOTE and EDAMF to cardiac clinical data. These techniques help normalize the data and address imbalances or inconsistencies present in the dataset, ultimately improving the model's overall performance.
To identify the scalar differences based on disease properties and assess the behavioral impact, a Support Scalar Cardiac Impact Rate (S2CIR) is utilized. This metric helps understand the disease's severity and impact on the individual, providing valuable insights for early detection and intervention. Figure 2 shows the Proposed LVPSO-FIXCNN Workflow Architecture Diagram. Notably, the Multivariate disease impact rate is used to determine the non-linearity scaling values, a crucial step in our research. These values are then processed using Linear Vectored–Particle Swarm Optimization (LV-PSO) for feature selection and dimensionality reduction, enhancing the model's performance and ensuring that only the most pertinent features are utilized for predictions.
Finally, the selected features are trained using a Fuzzy Inference Xception Convolution Neural Network (FIXCNN) to categorize the type of heart disease and provide accurate predictions. FIXCNN models utilize the capabilities of DL and fuzzy logic to examine intricate patterns in data and generate well-informed decisions, resulting in enhanced precision and dependability of predictions.
The heart PCG signals consist of several frequency components corresponding to different cardiac cycle physiological events, as indicated in Figure 3. The frequency range of cardiac sound waves is displayed in Table 3. The closure of the tricuspid and mitral valves produces the first heart sound (S1), which has low-frequency features. The closure of the aortic and pulmonary valves results in the second heart sound (S2), which has a higher frequency component.
Two additional heart sounds, S3 and S4, may indicate abnormal heart activity. The timing and intensity of cardiac PCG signals provide valuable information on the cardiac cycle. The interval from S1 to S2 is called the systolic interval, indicating the duration of ventricular contraction and ejection. The intensity of heart sounds may fluctuate due to factors like ventricular contraction and valve abnormalities. Changes in timing and intensity can indicate conditions such as heart failure or valvular stenosis.
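As a concrete illustration of how such frequency bands might be isolated before feature extraction, the sketch below applies a Butterworth band-pass filter to a synthetic PCG segment with SciPy; the cut-off frequencies, sampling rate, and test signal are illustrative assumptions rather than values taken from Table 3.

```python
# Hedged sketch: isolate a low-frequency band of a PCG segment (roughly where
# S1/S2 energy concentrates) using a Butterworth band-pass filter.
# Cut-offs (25-150 Hz) and the 2 kHz sampling rate are illustrative assumptions.
import numpy as np
from scipy.signal import butter, filtfilt

def bandpass_pcg(pcg, fs=2000, low=25.0, high=150.0, order=4):
    b, a = butter(order, [low, high], btype="bandpass", fs=fs)
    return filtfilt(b, a, pcg)   # zero-phase filtering preserves S1/S2 timing

# Synthetic PCG stand-in: a 60 Hz "valve" tone plus broadband noise.
fs = 2000
t = np.arange(0, 3, 1 / fs)
pcg = np.sin(2 * np.pi * 60 * t) + 0.3 * np.random.default_rng(0).normal(size=t.size)
filtered = bandpass_pcg(pcg, fs=fs)
```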
3.1 Synthetic Minority Oversampling Technique (SMOTE)
In this section, a training dataset is developed using SMOTE to predict heart disease. The SMOTE technique finds extensive applications in the healthcare sector for managing class-imbalanced data. By utilizing Euclidean distance to create synthetic random data of minority classes from nearby neighbors, the number of data instances can be increased. Moreover, new samples are generated by leveraging the top features from the original data. The SMOTE technique can produce suitable values for the application, though it may introduce additional noise. When oversampling minority classes, synthetic samples are created along line segments joining each sample to its k nearest neighbors in the minority class. Neighbors are selected randomly from the k nearest neighbors according to the amount of oversampling required. A synthetic sample is created from the differences between the analyzed feature vector and its nearest neighbors. Moreover, these differences are multiplied by random numbers between 0 and 1.
Creating synthetic data from minority classes of random number data can be achieved by calculating population functions. Furthermore, nearest neighbors provide new index array features for different samples, as detailed in Algorithm 1. Let's assume the id-number of the synthetic sample, Z-Minority instance, K-nearest neighbor, x- integral sample, du−number of the attribute, W-sample, q−populate, Ww−synthetic sample, Dd−nearest neighbor, dx−new index, xu−attribute index, α−random number.
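As a minimal, hedged illustration of the oversampling step described above, the following sketch applies the SMOTE implementation from the imbalanced-learn library to a placeholder feature matrix; the feature matrix, labels, and neighbor count are illustrative assumptions, not the actual Pascal or Circor data.

```python
# Illustrative SMOTE-style oversampling for an imbalanced PCG feature matrix;
# X and y are placeholders, not the datasets used in the paper.
import numpy as np
from collections import Counter
from imblearn.over_sampling import SMOTE

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 12))          # 200 recordings, 12 extracted features
y = np.array([0] * 170 + [1] * 30)      # heavily imbalanced disease labels

print("before:", Counter(y))            # e.g. {0: 170, 1: 30}

# k_neighbors controls how many minority neighbors are interpolated
smote = SMOTE(k_neighbors=5, random_state=42)
X_res, y_res = smote.fit_resample(X, y)

print("after:", Counter(y_res))         # classes balanced by synthetic samples
```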
Table 4 compares the performance of different variation methods like Support Vector Regression (SVR) and Autoregressive Integrated Moving Average (ARIMA). The proposed method attains 80.98%, 84.09%, and 90.21% for Pascal, Circor and Physico-cardnet, respectively.
3.2 Enhanced empirical mode decomposition adaptive filter
In this section, advanced empirical methods enable the analysis of time-domain, one-dimensional signals through an adaptive filter technique. The EEMDAF method decomposes a one-dimensional signal into different Eigenmode functions and frequency bands using frequency information. In each intrinsic mode function, the number of extrema and the number of zero crossings of the 1D signal must be equal or differ by at most one. The estimated mean value will be zero when using symmetrical lower and upper envelopes. The process is repeated until all accurate Eigenmode functions are computed using the analytical EEMDAF method.
Moreover, filter integration approximates the input-output relationship of the EEMDAF method. By considering only current and past observations, the weighting of adaptive filters can produce statistically better estimates of the following observations.
Furthermore, the EEMDAF technique removes reference signal interference from the cross-correlation matrix and ensures vector independence. Noise estimation also includes estimating signals from the power supply and other known noise sources. Unlike frequency-selective filters, adaptive filters use an autocorrelation matrix instead of a fixed input to normalize the largest and smallest eigenvalues.
Lower and upper envelopes are estimated using cubic splines connecting the determined local maximum and minimum points, as described in Equation 1. Let's assume new 1D signal, the iz−minimum and maximum point of the 1D signal, and the mean of both envelopes.
Equation 2 approximates the new 1D signal's local maximum and minimum points. Compute the residual signal subtracted from the 1D signal and evaluate the Eigen mode function as shown in Equation 3. Where residual signal, IMF1−intrinsic mode functions, condition signal.
Calculate the final residual signal of the Eigen mode function derived from the initial 1D signal, as indicated in Equation 4.
Compute a new 1D signal from the Gaussian white noise sequence shown in Equation 5. Where c-trials, dx(z)−Gaussian noise series, ix(z)−initial 1D signal.
Evaluate the Eigen mode function of the frequency band as described in Equation 6. Let's assume EEMD−ensemble empirical mode decomposition, x, y-identified by frequency band. M-complete ensemble.
The adaptive filter used for filtering is calculated at the beginning of the procedure described in Equation 7. Where i[d]−uncorrelated with a reference signal, D-Noise, Pcg −Phonocardiogram.
Adaptive filters are standard and have a straightforward cost function. They generate a quadratic cost function with a global minimum for noise filtering. Calculate the noise in the reference signals between the auto-correlation matrix and cross-correlation vectors, as shown in Equation 8. Let's assume the pz−filter coefficient, N-noise filter, w=cross correlation matrix, Min−global minimum, y−cost function, and e-auto correlation matrix.
As demonstrated in Equation 9, the filter coefficients were computed for every iteration. Where μr(d)−maximal amplitude
The upper bound is calculated as shown in Equation 10. As presented in Equation 11, the reference signal's mean power and step size can be estimated. Where C measures the misadjustment, μ−step size, pow−power.
Compute the convergence time as shown in Equations 12, 13 Where α− normalized step-size, ς−time, H(E)− eigenvalue, md−condition number, λ(Max), λ(Min)−minimum, and maximum normalized value,
To achieve maximum probability estimation, the largest and smallest eigenvalues of the residual signal can be normalized by estimating the original one-dimensional signal across its frequency band or Eigen mode function. Through integrated empirical methods, it becomes possible to assess the frequency band of the distortion accurately.
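The following NumPy sketch illustrates only the adaptive-filter idea behind the EEMDAF stage (a normalized LMS update driven by a noise reference); the filter length, step size, and synthetic signals are illustrative assumptions, and the ensemble EMD decomposition itself would be supplied separately.

```python
# Minimal normalized-LMS noise canceller in the spirit of the adaptive-filter
# stage: a noise reference correlated with the interference in the PCG signal
# is filtered so that subtracting its estimate leaves the cleaned heart sound.
# Signals, filter length, and step size mu are illustrative assumptions.
import numpy as np

def nlms_cancel(noisy_pcg, noise_ref, n_taps=16, mu=0.1):
    w = np.zeros(n_taps)                      # adaptive filter coefficients
    cleaned = np.zeros_like(noisy_pcg)
    for n in range(n_taps, len(noisy_pcg)):
        x = noise_ref[n - n_taps:n][::-1]     # most recent reference samples
        y_hat = w @ x                         # estimated interference
        e = noisy_pcg[n] - y_hat              # error = cleaned PCG sample
        w += mu * e * x / (x @ x + 1e-8)      # normalized LMS update
        cleaned[n] = e
    return cleaned

# Synthetic demonstration: heart-sound-like tone plus power-line interference.
t = np.arange(0, 2, 1 / 2000)                 # 2 s at a 2 kHz PCG rate
pcg = np.sin(2 * np.pi * 40 * t)              # stand-in for an S1/S2 component
hum = 0.5 * np.sin(2 * np.pi * 50 * t)        # 50 Hz interference
cleaned = nlms_cancel(pcg + hum, hum)
```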
Table 5 compares the performance of various pre-processing methods with diverse PCG datasets such as Pascal, Circor, and Physico-cardnet. The proposed EMDAMF method achieved 80.10% for Pascal, 85.34% for Circor, and 90.35% for the Physico-cardnet dataset. In contrast, existing filters failed to achieve high performance compared to the proposed method.
3.3 Support Scalar Cardiac Impact Rate (S2CIR)
An infinite number of hyperplanes in the dataset can be detected by measuring the support vector using heart disease data for efficient classification. Furthermore, an optimal hyperplane can be evaluated using a quadratic kernel function, and heart disease data can be predicted utilizing the maximum margin width. The optimal hyperplane is found by dividing the data into two types. Regardless of their proximity to each group of objects, the support vector scalar optimal classifier can identify similar optimal generalization hyperplanes. A hyperplane can optimally partition the coordinate input set if the maximum distance between neighboring elements and the support vector impact rate is accurately split. In the binary classification of cardiovascular diseases, the bias and its weight vector can be estimated by computing the hyperplane through the class labels of the S2CIR technique, a process that is crucial for the SVM model's operation. The SVM model's hyperplane classification maximizes the margin when computing impact rates. The impact rate is calculated as the sum of distances to the nearest positive or negative event. Besides, the SVM model can estimate and predict the scalar vector impact rate by classifying heart disease data.
Assuming binary classification, heart disease is classified using the linear separation rule of training samples in Equation 13. Let's assume w-training sample, i-design matrix, j-binary class scalar vector,
Calculate the coordinate vector of the hyperplane using the binary classes, as indicated in Equation 14. Where r-function vector, t, i, and β− coordination of the hyperplane.
Calculate the weight vector and bias between each hyperplane as shown in Equation 15. Let's assume p-weight vector, v-bias, x-class label.
As demonstrated in Equations 16, 17 hyperplanes are employed to categorize the training and testing heart disease data and approximate the prior function's kernel function. Where D-number of the training sample, h-kernel function, ix−input training sample, jx−matching class label, αx −coefficient.
Calculate the feature space coefficients from the input vector numbers of training samples as shown in Equation 18.
Estimating the contribution of heart disease data by introducing a set of slack variables is illustrated in Equations 19, 20. Let's assume IR−impact rate, l−loss, ξ−slack variable, m- parameter, x-instance, L-normalization, S-weight vector, v-bias, h−dataset instance, W-sum of distance, z-vector.
Therefore, Tables 6, 7 optimal features can be achieved by selecting different feature subsets, estimating the impact rate of cardiovascular disease, and introducing a set of slack variables to assess heart disease data.
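To make the margin-based impact idea concrete, the hedged sketch below uses a quadratic-kernel SVM from scikit-learn and treats each sample's signed distance to the hyperplane as a scalar impact score; the data, kernel settings, and the use of decision_function are illustrative assumptions rather than the exact S2CIR formulation.

```python
# Sketch of the SVM margin idea behind the scalar impact rate: the signed
# distance of each recording to the separating hyperplane is taken as a
# per-sample impact score. Data and kernel choice are illustrative assumptions.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 1.0, (60, 8)),    # "normal" PCG feature vectors
               rng.normal(+1.0, 1.0, (60, 8))])   # "abnormal" PCG feature vectors
y = np.array([0] * 60 + [1] * 60)

clf = SVC(kernel="poly", degree=2, C=1.0)         # quadratic kernel, as in the text
clf.fit(X, y)

impact_rate = clf.decision_function(X)            # signed distance to hyperplane
print("mean |impact| per class:",
      np.abs(impact_rate[y == 0]).mean(),
      np.abs(impact_rate[y == 1]).mean())
```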
3.4 Linear vectorised particle swarm optimization
In this section, the linear vector particle swarm optimization method can be enhanced for classifying heart diseases and estimating their weights. This method provides a systematic approach to optimize the classification process, leading to more accurate and efficient results. The distance can also be selected by weighing the information provided by the attributes in linear vectors. After that, the attribute information is processed using linear vector distance calculation. The optimal particle processing time can be estimated by selecting the information attribute and multiplying the information gain using convergence behavior.
Similarly, linear vector performance can be improved by choosing values in categories related to heart disease. A search-based particle swarm optimization technique is employed to classify heart disease by comparing populations within the selected linear vector. The current optimal particle evaluation space is a possible solution within the linear vector PSO method. Furthermore, particle weights can be calculated to maintain cardiovascular data's homogeneity and heterogeneity effects. Moreover, greater inertial weights enhance global search efficiencies, whereas lesser ones enhance local ones. Similarly, the LV-PSO algorithm efficiently analyses the search space by adjusting the inertial weight with the acceleration factor. Consequently, the proposed LV-PSO method categorizes heart diseases based on the number of particles that regulate the parameters and ensure their balance in each iteration.
Equation 21 plays a crucial role in our method. It calculates the class labels by finding the number of class labels and choosing a weight vector corresponding to the unit value of each vector of the linear function. This step is significant as it forms the basis for the classification process, ensuring accurate and reliable results. Let's assume the Kd−linear function, g-value, z-weight vector, and ag−training vector.
Equation 22 illustrates that the weight vector's difference between each data output unit's actual and predicted heart disease output values is computed. Furthermore, the error in each vector is estimated as described in Equation 23. Let's assume DId(g)−difference function, h-output unit, b-output value, q-error vector.
Calculate the new weight unit by choosing the maximum number of iterations with the lowest learning rate starting from the alpha value shown in Equation 24. Let's assume R−training vector, e-class, h-output unit, zh−weight of winning unit, and α−learning rate.
Compute the Euclidean distance of the information gain as shown in Equation 25. Let's assume ag− training vector class, C- Euclidean distance, Gi−information gain, and g-class.
Equations 26, 27 below calculate particle size analysis by iterating particle velocities. Let's assume F−dimensional space, m-particle size, velocity, z-interval weight, ϕ1−local accelerations, ϕ2− Global accelerations, position of the particle, p-best position, Ka(p) and i(p)−local and global best position,
The control parameters are automatically chosen at each iteration depending on the swarm particle count. Equations 28, 29 below illustrate that random numbers between 0 and 2 can be computed to determine acceleration factors for local and global acceleration values. Where E-coefficient, E1 and E2−acceleration coefficients, each particle generates a q/8 random number selection among acceleration coefficients, ϕ−acceleration, q-number of swarm particles, T1 and T2−random numbers.
As shown in Equation 30, particles can be classified into three classes (i.e., low, medium, and high) based on estimating local and global acceleration values.
For the inertia weight values, select three values between 0.4 and 0.9 and categorize them as low, medium, and high. Furthermore, low approximates 0.4, and high approximates 0.9 by averaging the other two values. At each iteration, the weight values are determined based on Equation 31. Let's assume -value, high value, medium value, low value.
As shown in Equation 32, the positions of the particles are computed by randomly choosing the number of acceleration factors for each particle position.
Calculate a random number of particle-level fitness functions between the current fitness function and the selection coefficient estimate. Then, choose and update the optimal fitness function based on minimization or maximization, as in Equation 33. Let's assume local fitness value, global position, minimum value, maximum value, current fitness function,
The random number at the particle level within the current fitness function can be approximated by analyzing the variance in each weight vector across the predicted output values for heart disease in each dataset.
A linear vector particle swarm optimization flow chart can classify particles into three categories based on selecting control parameters and estimating local and global acceleration values at each iteration. Furthermore, the best fitness function can be chosen based on minimum or maximum, and the best features can be selected and updated, as depicted in Figure 4.
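A compact NumPy sketch of such a particle-swarm feature-selection loop is given below, with the inertia weight decaying between 0.9 and 0.4 and separate local and global acceleration terms; the k-NN fitness function, the thresholding of positions, and all constants are illustrative assumptions rather than the exact LV-PSO formulation.

```python
# Sketch of a binary PSO feature-selection loop with inertia weight in [0.4, 0.9]
# and local/global acceleration coefficients, in the spirit of LV-PSO.
# The fitness function and all constants are illustrative assumptions.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(150, 20))                    # placeholder PCG feature matrix
y = rng.integers(0, 2, size=150)                  # placeholder disease labels

def fitness(mask):
    if mask.sum() == 0:
        return 0.0
    clf = KNeighborsClassifier(n_neighbors=5)
    return cross_val_score(clf, X[:, mask], y, cv=3).mean()

n_particles, n_feats, iters = 20, X.shape[1], 30
pos = rng.random((n_particles, n_feats))          # continuous positions in [0, 1]
vel = np.zeros_like(pos)
pbest, pbest_fit = pos.copy(), np.array([fitness(p > 0.5) for p in pos])
gbest = pbest[pbest_fit.argmax()].copy()

for it in range(iters):
    w = 0.9 - (0.9 - 0.4) * it / iters            # inertia decays from high to low
    c1, c2 = 2.0, 2.0                             # local / global acceleration
    r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, 0.0, 1.0)
    fit = np.array([fitness(p > 0.5) for p in pos])
    improved = fit > pbest_fit
    pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
    gbest = pbest[pbest_fit.argmax()].copy()

print("selected features:", np.flatnonzero(gbest > 0.5))
```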
Table 8 compares the performance of different feature selection methods on various PCG datasets. The proposed LV-PSO method yields 81.24%, 86.085%, and 92.11% for the Pascal, Circor, and Physico-cardnet datasets, respectively. The existing methods, the Cuckoo Search Bio-inspired Algorithm (CSBA) and Particle Swarm Optimization (PSO), obtained lower performance than the proposed system.
3.5 Fuzzy inference system (FIS)
In this section, fuzzy inference system methods use predefined fuzzy rules based on input maps and their related outputs. Moreover, accuracy can be evaluated for heart disease using rules defining relationships between fuzzy input and output sets. The input is fuzzified because the system requires fuzzy values, and the output of the inference system is defuzzified to produce an interpretable result. Furthermore, the prediction and analysis of heart disease can be accomplished by utilizing a fuzzy model. Subsequently, each input dimension can be segmented into rectangular subspaces by employing a predefined number of membership functions through the axis-parallel partitioning technique of the FIS method.
Fuzzy rules are “if-then” rules used in fuzzy logic systems to infer outputs based on the input variables. During forward propagation, the parameters obtained during training can be estimated by finding the least squares error. Similarly, gradient descent is utilized to analyse the parameters before training in the backward propagation process. After that, parameters for heart disease analysis are determined in forward and backward iterations.
As shown in Equation 34, the fuzzy system can predict heart disease as a function of linear parameters using two rules relating input and output. Let's assume i and j-input, fuzzy set, g1 and g2-fuzzy set, linear parameters.
The fuzzification layer calculates the input values and establishes the membership functions. Equations 35, 36 demonstrate that fuzzification can be achieved using a Gaussian membership function. Where i and j-input, x-node, mx and ϱx−parameter set of the Gaussian membership function, and mx−center of the curve.
When the degree of fuzzy set membership equals zero, the trigonometric membership function is calculated as indicated in Equation 37. Let us assume w, r, and e –membership function, s-lower bound function, r-upper bound function, and e-center of the place degree of membership.
Calculate the appropriate bell-shaped membership function for the non-linear system defined by Equation 38. Furthermore, its membership function can be Gaussian or Bell-shaped. Where i and j-input variables, ux, vx, mx−bell membership function, u-width, v-slope, m-center of the bell. The learning mechanism of the training process determined these parameters.
The input signal is amplified, and the output is sent via a straightforward amplifier with a set production. The output can be assessed by building the firing output of the product layer rule given in Equation 39. Let's assume the px− weight of the fuzzy rule.
As illustrated in Equations 40–42, the firing strengths calculated in the previous layer are evaluated and normalized at the fixed nodes. It multiplies the normalized values of the last layer and represents the first-order polynomial as a fuzzy representation. Calculate the maximum sum output of all input signals. Let us assume L-layer.
Let's assume normalized weight, consequent first-order polynomial parameter, and f-final output model.
As shown in Equation 43, the output can be determined using the centroid defuzzification method.
Equation 44 demonstrates that the calculation of heart disease can be represented as a linear combination of parameters in the fuzzy inference system's final output. Where w1, r1, e1 and w2, r2, e2−linear combination of the parameters.
The parameters in the forward step in the fuzzy inference system can be determined using least-squares estimation. In backpropagation, each node's square signal error output can be propagated backward from the output layer to the input layer. Heart disease prediction can also be achieved through a fuzzy inference system by representing a linear combination of parameters.
The FIS uses a five-layer structure to forecast the maximum output, as presented in Figure 5. This includes L-layer, i and j as inputs, U1, U2, U3, and U4 as fuzzy subsets, Π as normalized nodes in layer 2, N as fixed nodes in layer 3, pi as normalized weight and weight of the fuzzy rule, Ip−input, and Gi- fuzzy rule.
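The hedged sketch below walks through these five layers for a two-rule, first-order Sugeno-style system in NumPy (Gaussian fuzzification, product firing strengths, normalization, polynomial consequents, and a weighted-sum output); all membership parameters and rule coefficients are illustrative assumptions.

```python
# Minimal first-order Sugeno-style fuzzy inference with two rules, Gaussian
# memberships, normalized firing strengths, and a weighted-sum output, in the
# spirit of the five-layer FIS above. All parameters are illustrative assumptions.
import numpy as np

def gauss(x, centre, sigma):
    return np.exp(-0.5 * ((x - centre) / sigma) ** 2)

def fis_two_rules(i, j):
    # Layer 1: fuzzification of the two inputs
    mu_A1, mu_A2 = gauss(i, 0.3, 0.15), gauss(i, 0.7, 0.15)
    mu_B1, mu_B2 = gauss(j, 0.4, 0.20), gauss(j, 0.8, 0.20)

    # Layer 2: rule firing strengths (product t-norm)
    w1, w2 = mu_A1 * mu_B1, mu_A2 * mu_B2

    # Layer 3: normalization of firing strengths
    w1n, w2n = w1 / (w1 + w2), w2 / (w1 + w2)

    # Layer 4: first-order polynomial consequents g1, g2
    g1 = 0.6 * i + 0.2 * j + 0.1
    g2 = 0.9 * i + 0.7 * j + 0.3

    # Layer 5: weighted-sum output (defuzzified risk score)
    return w1n * g1 + w2n * g2

print(fis_two_rules(0.35, 0.45))   # lower inputs  -> lower risk score
print(fis_two_rules(0.75, 0.85))   # higher inputs -> higher risk score
```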
3.6 Xception Convolutional Neural Network
In this section, the proposed Xception Convolutional Neural Network method can be utilized to predict cardiovascular diseases. They use a feed-forward network on heart disease data for a CNN approach to data processing. CNN techniques incorporate feed-forward networks to enhance data processing systems' overall performance and dependability. Input data represented in dimensional scales can be combined with these filters to produce output feature maps. The XCNN approach preserves essential information at each process stage while reducing large feature maps to analyse heart disease.
The fuzzy logic structure in the proposed model handles uncertainty in phonocardiogram (PCG) features by transforming the fuzzy inputs using membership functions. A set of fuzzy IF-THEN rules is then generated and trained by optimizing the membership function parameters integrated with the XCNN, based on the number of convolutional layers, kernel sizes, and number of pooling layers. A typical XCNN architecture includes two to three convolutional layers (e.g., with 32 and 64 filters and kernel sizes of 5 and 3, respectively), followed by max-pooling, batch normalization, ReLU activation, a fully connected layer, and dropout before the final softmax or sigmoid output layer. Training typically uses binary cross-entropy loss, the Adam optimizer, a learning rate of 0.001, and about 50–100 epochs. A key design aspect is how the fuzzy system interfaces with the XCNN: the fuzzy inference system acts as a preprocessor, converting the extracted PCG features into high-level fuzzy risk scores, which are then combined with the CNN feature maps or provided as additional input to the fully connected layers of the XCNN.
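A hedged Keras sketch of such an architecture is given below, using depthwise-separable convolutions in the Xception spirit with 32 and 64 filters, kernel sizes 5 and 3, max-pooling, batch normalization, ReLU, a dense layer, dropout, a sigmoid output, and Adam at learning rate 0.001 with binary cross-entropy; the input length, pooling choices, and dense-layer width are illustrative assumptions.

```python
# Hedged sketch of the CNN block described above; hyperparameter values beyond
# those named in the text (input length, dense width) are assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_xcnn(input_len=4000, n_channels=1):
    model = models.Sequential([
        layers.Input(shape=(input_len, n_channels)),          # 1-D PCG segment
        layers.SeparableConv1D(32, kernel_size=5, padding="same"),
        layers.BatchNormalization(),
        layers.Activation("relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.SeparableConv1D(64, kernel_size=3, padding="same"),
        layers.BatchNormalization(),
        layers.Activation("relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.GlobalAveragePooling1D(),
        layers.Dense(64, activation="relu"),                   # fully connected layer
        layers.Dropout(0.5),
        layers.Dense(1, activation="sigmoid"),                 # normal vs. abnormal
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_xcnn()
model.summary()   # training would then run for roughly 50-100 epochs
```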
Furthermore, it supports layers of important information or activities at each process stage. Input-output mapping is an essential feature of the activation functions in all neural networks.
Furthermore, the weighted sum of the neuron's inputs and dependencies can be used to evaluate the input values. The fully connected layer receives input from the final convolutional and pooling layer. Moreover, using the output layer to represent the preceding input layer, XCNN can identify and predict data related to heart diseases. The loss function in the output layer of the XCNN model considers the prediction error brought on by the training samples. After the fully connected layer, all vectors can be predicted utilizing binary classification in the SoftMax function to analyse heart disease data.
The results in Figure 6 demonstrate the effectiveness of using Xception processing for heart sound prediction. This Xception module approach accurately predicts cardiac data using input and convolutional layers combined with adaptive mean pooling layers. The dot product between the input weights is computed using the non-linear activation function of the convolutional layer output, as described in Equation 45. Where Kh−feature map, f-feature, ph−weight, V-bias.
The sigmoid function in Equations 46, 47 maps input values of the processing function to the range 0 to 1. It also computes the output for the input of error backpropagation, a real number between −1 and 1, as demonstrated in the equations. Where R-backpropagation, s-sigmoid function.
Converting all input values to positive numbers is a standard feature in the CNN context. As shown in Equation 48, the main advantage of ReLU is its lower computational burden. When the gradient is passed to the ReLU function, it can improve the regulation of neuron activation. Where Relu−rectified linear unit.
Instead of disconnecting the ReLU negative input, this activation function is used to resolve the Dying ReLU problem. Utilize the ReLU to calculate the leakage coefficient value described in Equation 49. Let's assume the m-leakage factor, L-leakage.
Equations 50, 51 show that the noisy ReLU function converts ReLU into a noisy function by incorporating Gaussian distribution. Moreover, the leakage factor model in this procedure can be refined through training and assessed using parametric linear units. Let us assume D-noisy, c-max value, parametric linear, u-learnable weighted, j-Gaussian output distributed.
The cross-entropy or Softmax loss function is commonly utilized in multiclass classification as an alternative to the Softmax and core error loss functions. The output is derived from the probability distribution function using the Softmax implementation in the output layer. The output class probability can be calculated as shown in Equation 52. Let's assume pi−preceding layer, N-number of neurons, and the non-normalized output.
Using Equation 53, compute the cross-entropy loss function. Where K-Hinge Loss Function.
Below, Equation 54 presents the mathematical expression used to estimate the mean error Euclidean loss, commonly used in regression problems.
Equation 55 calculates the hinge loss function for binary and maximal edge-based classification to maximize the edge around the binary target class. Where J-desired output, c-margin dual objective class, wx−denote predicted output.
Multi-classification with the ReLu function can predict vectors and analyse cardiac data. Utilizing ReLu in the output layer can also yield the probability distribution functions.
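For reference, the small NumPy sketch below writes out the standard forms of the activations and losses discussed in Equations 46–53 (sigmoid, ReLU, leaky ReLU, softmax, and cross-entropy); the example logits are arbitrary.

```python
# Plain NumPy versions of the activations and losses discussed above; these
# mirror standard definitions rather than any implementation-specific detail.
import numpy as np

def sigmoid(x):                      # squashes inputs into (0, 1); cf. Eq. 46
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):                         # zero for negative inputs; cf. Eq. 48
    return np.maximum(0.0, x)

def leaky_relu(x, m=0.01):           # small leakage factor m; cf. Eq. 49
    return np.where(x > 0, x, m * x)

def softmax(z):                      # class probabilities; cf. Eq. 52
    e = np.exp(z - z.max())
    return e / e.sum()

def cross_entropy(p, y_onehot):      # multiclass loss; cf. Eq. 53
    return -np.sum(y_onehot * np.log(p + 1e-12))

logits = np.array([1.2, 0.3, -0.8])
probs = softmax(logits)
print(probs, cross_entropy(probs, np.array([1.0, 0.0, 0.0])))
```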
4 Evaluation and performance metrics
The results are tested under various categories in different datasets and feature limits accordingly. The PASCAL database contains 656 heart sound recordings from de-identified patients collected for the Pascal Challenge. The dataset was gathered from https://istethoscope.peterjbentley.com/heartchallenge/index.html. PCG signals were recorded at a sample rate of 4,000 Hz and range in duration from 1 to 30 s. Cardiac sounds were captured in clinical and non-clinical contexts and categorized as heart sounds, standard sounds, murmurs, and artifacts. Similarly, static tones, noise, and premature beats are characterized in the Pascal Challenge database. In addition, clinicians manually entered baseline heart sound levels into the Pascal Challenge database.
Similarly, the PhysioNet dataset comprises 3,126 simultaneous recordings of PCG and ECG lasting between 10 and 60 s. The dataset utilized in the PhysioNet challenge included 2,435 cardiac recordings from 1,297 patients and was divided into training and testing sets. The PCG signal duration ranges from 8 to 312.5 s. Due to varying devices and sampling rates during data collection, each PCG signal was down-sampled to 2,000 Hz. Sounds from four sites (including the pulmonary artery, tricuspid valve, and mitral valve) were analyzed in healthy individuals and those with valvular heart conditions. An unbalanced dataset is produced when the number of normal records in the training and test sets is much larger than the number of anomalous records. In addition, the HSCT-11 database is the largest echocardiographic database in the field. It contains heart sounds obtained from 206 individuals, of whom 157 are male and 49 are female. The average recording length per person is 45 s, with a minimum of 20 s and a maximum of 70 s.
The Circor DigiScope database, the largest publicly available heart sound dataset (https://www.kaggle.com/datasets/bjoernjostein/the-circor-digiscope-phonocardiogram-dataset-v2), is a comprehensive resource for detailed analysis. It contains 5,282 recordings from various auscultation sites on the body, with most heart sounds divided into 200,464 recordings. The quality of these recordings was assessed by cardiac physiologists, leading to a thorough murmur characterization and classification. This detailed analysis examines the timing, grading, shape, quality, and location of auscultation, providing a wealth of information for researchers and clinicians.
The Heart Sound Shenzhen dataset, containing 845 PCG signal recordings from 170 individuals, is a comprehensive resource covering a wide range of heart diseases, including coronary artery disease, valvular disease, and congenital heart disease. The diversity of the HSS dataset significantly contributes to understanding the acoustic properties associated with various cardiac diseases. The PCG recordings in the HSS dataset were sampled at 4 kHz to ensure accuracy in the heart sound modeling.
A filter estimates the PCG signal's decibel ratio based on heart sound variation for feature estimation, as shown in Table 9. The empirical methods of signal analysis, using different dB levels to indicate signal types in PCG heart sounds, estimate the optimal accuracy of EMDAMF for disease type approximation. Various methods were employed to assess prediction accuracy in analyzing AS heart sound variability within the PCG signal. The evaluation included the de-noising SNR of the wavelet method at −7.21, CEMD + median at −9.43, and CEEMD at −10.22. The AS heart disease category achieves a signal-to-noise ratio estimate of 11.54 with the EMDAMF approach. Similarly, heart diseases like MR, MS, MVP, and NHF did not yield precise ratios using the methods above. However, the EMDAMF model indicated higher ratios of 13.16, 15.22, 14.32, and 16.25 for heart conditions such as MR, MS, MVP, and NHF, respectively. This suggests the potential efficacy of the EMDAMF model in diagnosing specific heart sounds.
By dividing the dataset into “k” subsets (or folds), the model is trained on k-1 folds and validated on the remaining folds. This process is repeated “k” times, each fold serving as a validation set, and the results averaged to provide a more robust estimate of the model's accuracy.
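A short scikit-learn sketch of this k-fold procedure (with k = 5) is shown below; the placeholder data and logistic-regression classifier are illustrative assumptions, standing in for the actual PCG features and the proposed model.

```python
# Short sketch of k-fold cross-validation (k = 5): the model is trained on k-1
# folds, validated on the held-out fold, and the scores are averaged.
# Data and classifier are illustrative placeholders.
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(7)
X = rng.normal(size=(300, 10))
y = rng.integers(0, 2, size=300)

scores = []
for train_idx, val_idx in KFold(n_splits=5, shuffle=True, random_state=7).split(X):
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    scores.append(accuracy_score(y[val_idx], clf.predict(X[val_idx])))

print("fold accuracies:", np.round(scores, 3), "mean:", round(float(np.mean(scores)), 3))
```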
As shown in Figure 7, K5-fold cross-validation is a resampling technique that assesses classification algorithms on limited data samples. The results obtained from K5-fold cross-validation are considered less biased than those of other methods such as a single train/test split. Moreover, K represents the number of folds, divided into approximately equal-sized subsamples. The proposed XCNN method within K5-fold cross-validation shows a notable enhancement, reaching 92.36% accuracy on the randomly distributed folds used for prediction. This substantial increase is a testament to the XCNN method's performance, and the error rate associated with the XCNN method remains low.
As shown in Figure 8, compared to other methods such as a train/test split, in 10-fold cross-validation the data is divided into approximately equal-sized sub-samples. The accuracy of 10-fold cross-validation improves to 93.6% on the randomly distributed folds used for prediction, substantially enhancing the performance of the proposed XCNN method under fold cross-validation. A 10-fold cross-validation was used to determine the optimal value, thereby improving the reliability of the model results.
As illustrated in Table 10, datasets such as PASCAL, Circor DigiScope, PhysioNet Challenge, HSS, and HSCT can yield precise estimations. Additionally, the total count of recordings within each dataset is delineated, for instance, PASCAL-656, Circor DigiScope-5,282, PhysioNet Challenge-3,126, HSS-845, HSCT-206, and so forth. Furthermore, cardiac recordings can be used to identify categories such as NHF, AS, MR, MS, MVP, and errors, which can be used to predict the overall accuracy of the dataset.
The training and testing scheme analysis, as depicted in Table 11, involves using the proposed method to analyse data from sources such as Heart Sound PASCAL, Circor DigiScope, PhysioNet Challenge, HSS, and HSCT. The accuracy of training and test outcomes is compared across various techniques derived from previous work, including DNN, XGBoost, RNN, SMOTE, EMD, LMS, KNN, A-BLSTM, and MLP; subsequently, the proposed FIXCNN method is employed to process the provided heart sound dataset and forecast accuracy. The prior techniques achieved accuracies of 81.6% (DNN), 83.4% (XGBoost), 84.6% (RNN), 82.63% (SMOTE), 88.18% (EMD), 85.4% (LMS), 89.2% (KNN), 86.09% (A-BLSTM), and 87.56% (MLP), whereas the suggested XCNN approach has the highest accuracy of 95.23%. This provides a full evaluation of the predictive accuracy of the selected technique in the context of heart sound analysis, instilling confidence in the thoroughness of the evaluation process.
As indicated in Figure 9, the proposed XCNN method can accurately predict cardiac sound PCG signals using datasets such as Circor DigiScope, Physico-Net Challenge, HSCT, HSS, and PASCAL. Moreover, the PCG signal performance-based assessment of precision, recall, F1 score, and accuracy can accurately characterize heart sound predictions. Accuracy on the heart sound data improved to 94.49% for Circor DigiScope, 94.85% for the Physico-Net Challenge, 94.91% for HSCT, 95.01% for HSS, and 95.08% for PASCAL. Thus, by assessing the PCG signal's efficacy on cardiac sounds, the accuracy of these measures has been improved.
The XCNN method demonstrates the capability to accurately predict the performance of cardiac sound PCG signals using the datasets illustrated in Figure 10 and Table 12. This has significant practical implications, as it outperforms previous methodologies employing DNN, CNN, and XGBoost for heart sound accuracy detection. The manipulated versions of DNN, CNN, and XGBoost exhibited enhanced accuracies of 89.78%, 90.23%, and 93.69%, respectively. However, the XCNN method achieved the highest accuracy rate of 95.08%, establishing itself as a promising method for precise heart sound analysis and offering new possibilities for healthcare technology.
Figure 11 compares time complexity performance for 106, 217, 359, 567, and 656 values. The proposed method attains 19.36 ms for 656 values; similarly, the existing methods attained 27.17 ms, 24.51 ms, and 23.21 ms for DNN, XGBoost, and CNN, respectively.
Figure 12 shows the Polygon Area Metric (PAM) performance using the proposed XCNN method. In the analysis, the metrics used for PAM are precision, recall, F1-score, accuracy, AUC, and the Jaccard Index (JI). The proposed method attains better results on various datasets such as Pascal, Circor, and Physico-cardnet.
Figure 13 describes the kappa coefficient performance on various datasets such as Pascal, Circor, and Physico-cardnet. The kappa coefficient estimates the agreement between the classification and the ground-truth values in the dataset. The proposed method obtains higher kappa coefficient performance than the other methods.
4.1 Discussion
This study, which uses digital heart sound recordings and state-of-the-art ML algorithms, is designed to identify valvular heart disease (VHD) in the general population, including asymptomatic cases and intermediate stages of the disease. An RNN was trained to predict heart murmurs using annotated recordings from a digital stethoscope at four auscultation positions in 2,124 participants. The predicted murmurs were then used to predict VHD detected by ECG.
Our research provides specific and informative data on the accuracy of aortic stenosis (AS) detection. At the lower curve, we observed 90.9% accuracy, 94.5% specificity, and an AUC of 0.979 (CI: 0.963-0.995) for mild AS identification, with an accuracy of 0.993 (CI: 0.989-0.997). The AUC values for moderate or greater aortic regurgitation and mitral regurgitation (AR and MR) were 0.634 (CI: 0.565-0.703) and 0.549 (CI: 0.506-0.593), respectively, which increased to 0.677 and 0.766 when the additional variable was included.
Variability measures such as the standard deviation (e.g., precision = 91.2% ± 2.3%) and 95% confidence intervals [e.g., AUC = 0.85 (0.83-0.91)] were frequently missing. These metrics are important for assessing a model's consistency and generalizability across different cross-validation folds or test sets. The AUC for predicting symptomatic events was higher for AR and MR, at 0.756 and 0.711, respectively. When screening for symptomatic regurgitation or stenosis was incorporated, the AUC was 0.86, with 97.7% of AS cases (n = 44) and all 12 MS cases detected.
Figure 14 presents a matrix that describes the performance of the neural network model in terms of target-class and output-class values. The curve analysis (CA) for the model revealed that, when the threshold probability lay between 20% and 95%, applying the model to heart disease prediction yielded an AUC of 0.856 (95% CI: 0.804-0.908; P < 0.001). The results showed a test p-value of 0.098. According to the maximum index principle, the optimal cut-off value was 0.174, at which the accuracy was 77.3% and the sensitivity was 78.5%, indicating that the model fit the data well (Figure 13). After 2,000 bootstrap resampling iterations, the AUC was 0.852, showing that the model had good discriminative ability.
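A hedged sketch of the 2,000-iteration bootstrap used to estimate the AUC and its 95% confidence interval is shown below; `y_true` and `y_score` are hypothetical stand-ins for the binary outcome and the model's predicted probabilities, not the study's data.

```python
# Sketch: bootstrap estimate of AUC with a 95% percentile confidence interval.
import numpy as np
from sklearn.metrics import roc_auc_score

def bootstrap_auc(y_true, y_score, n_boot=2000, seed=0):
    rng = np.random.default_rng(seed)
    y_true, y_score = np.asarray(y_true), np.asarray(y_score)
    aucs = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_true), len(y_true))
        if len(np.unique(y_true[idx])) < 2:   # a resample needs both classes
            continue
        aucs.append(roc_auc_score(y_true[idx], y_score[idx]))
    lo, hi = np.percentile(aucs, [2.5, 97.5])
    return np.mean(aucs), (lo, hi)

# placeholder data:
rng = np.random.default_rng(1)
y_true = rng.integers(0, 2, 300)
y_score = np.clip(y_true * 0.3 + rng.normal(0.4, 0.25, 300), 0, 1)
mean_auc, ci = bootstrap_auc(y_true, y_score)
print(f"AUC = {mean_auc:.3f} (95% CI {ci[0]:.3f}-{ci[1]:.3f})")
```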
5 Conclusions
In conclusion, the proposed system achieves high performance across the various testing scenarios, demonstrating higher detection accuracy than existing systems. This work uses a feature selection and classification approach to decompose the heart sound signal, contaminated by noise and murmur, into a set of single-component signals so that the result can be classified effectively. The most appropriate Intrinsic Mode Functions (IMFs) can then be selected to represent the undistorted fundamental heart sound signal. After that, EMDAMF detects the presence of murmurs in the heart sound signals by applying PCG signal frequency limits during feature selection; a minimal sketch of this IMF-selection step is given after this section. The EMDAMF algorithm theoretically offers improved spectral separation compared with EMD, while also managing the mode-mixing issue present in EMD. Developing an LV-PSO based on FIXCNN for early heart risk prediction therefore signifies progress in healthcare technology. The accuracy of previous methods is CNN-87.23%, XGBoost-90.17%, and RNN-86.18%, whereas the proposed XCNN technique improves the precision performance to 95.08%. This approach could revolutionize the diagnosis and treatment of heart disease by combining advanced ML techniques with innovative pre-processing methods, saving lives and improving patient outcomes.
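As a minimal sketch of the IMF-selection step referenced above, the function below keeps only those pre-computed IMFs whose dominant frequency falls within an assumed fundamental heart sound band (roughly 20-150 Hz, an illustrative choice rather than the paper's exact limits); the IMFs themselves could come from any EMD implementation.

```python
# Sketch: selecting IMFs whose dominant frequency lies within assumed PCG limits.
import numpy as np

def select_pcg_imfs(imfs, fs, band=(20.0, 150.0)):
    """Return the IMFs (rows of `imfs`) whose spectral peak falls inside `band` Hz.
    `imfs` is an (n_imfs, n_samples) array; `fs` is the PCG sampling rate in Hz."""
    freqs = np.fft.rfftfreq(imfs.shape[1], d=1.0 / fs)
    keep = []
    for imf in imfs:
        spectrum = np.abs(np.fft.rfft(imf))
        peak_freq = freqs[np.argmax(spectrum)]
        if band[0] <= peak_freq <= band[1]:
            keep.append(imf)
    return np.array(keep)

# Example with two synthetic "IMFs": a 50 Hz component (kept) and 400 Hz (dropped).
fs = 2000
t = np.arange(0, 2, 1 / fs)
imfs = np.vstack([np.sin(2 * np.pi * 50 * t), np.sin(2 * np.pi * 400 * t)])
print(select_pcg_imfs(imfs, fs).shape)   # -> (1, 4000)
```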
Data availability statement
The original contributions presented in the study are included in the article/supplementary material, further inquiries can be directed to the corresponding authors.
Author contributions
DPD: Formal analysis, Supervision, Conceptualization, Methodology, Writing – review & editing, Data curation, Software, Resources, Writing – original draft, Investigation. CP: Supervision, Writing – review & editing, Investigation, Resources, Formal analysis, Validation.
Funding
The author(s) declare that no financial support was received for the research and/or publication of this article.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Generative AI statement
The author(s) declare that no Gen AI was used in the creation of this manuscript.
Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.
Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
References
Ali, S. N., Shuvo, S. B., Al-Manzo, M. I. S., Hasan, M., and Hasan, T. (2023). An end-to-end deep learning framework for real-time denoising of heart sounds for cardiac disease detection in unseen noise. IEEE Access 11, 87887–87901. doi: 10.36227/techrxiv.19950155.v3
Alzubaidi, L., Zhang, J., Humaidi, A. J., Al-Dujaili, A., Duan, Y., Al-Shamma, O., et al. (2021). Review of deep learning: concepts, CNN architectures, challenges, applications, future directions. J. Big. Data 8:53. doi: 10.1186/s40537-021-00444-8
Amiriparian, S., Schmitt, M., Cummins, N., Qian, K., Dong, F., and Schuller, B. (2019). Deep learning for heart sound classification. IEEE Access 7, 85694–85703. doi: 10.1109/ACCESS.2019.2939495
Bentley, P., Nordehn, G., Coimbra, M., and Mannor, S. (2011). The PASCAL Classifying Heart Sounds Challenge 2011. Available online at: http://www.peterjbentley.com/heartchallenge/index.html (Accessed November, 2011).
Bourouhou, A. J., Nacir, C., and Hammouch, A. (2019). Heart sounds classification for a medical diagnostic assistance. Int. J. Online Biomed. Eng. 15, 88–103. doi: 10.3991/ijoe.v15i11.10804
Deng, M., Meng, T., Cao, J., Wang, S., Zhang, J., and Fan, H. (2020). Heart sound classification based on improved mfcc features and convolutional recurrent neural networks. Neural Netw. 130, 22–32. doi: 10.1016/j.neunet.2020.06.015
Dewangan, N. K., Shukla, S. P., and Dewangan, K. (2008). PCG signal analysis using discrete wavelet transform. Int. J. Adv. Manag. Technol. Eng. Sci. 8, 2249–7455.
Dhiyanesh, B., Ammal, S. G., Saranya, K., and Narayana, K. E. (2024). Advanced cloud-based prediction models for cardiovascular disease: integrating machine learning and feature selection techniques. SN Comput. Sci. 5:572. doi: 10.1007/s42979-024-02927-w
Dhiyanesh, B., Vijayalakshmi, M., Saranya, P., and Viji, D. (2025). EnsembleEdgeFusion: advancing semantic segmentation in microvascular decompression imaging with innovative ensemble techniques. Sci. Rep. 15:17892. doi: 10.1038/s41598-025-02470-5
Er, M. B. (2021). Heart sounds classification using convolutional neural network with 1d-local binary pattern and 1d-local ternary pattern features. Appl. Acoust. 180:108152. doi: 10.1016/j.apacoust.2021.108152
Fu, Q., Li, S., and Wang, X. (2020). MSCNN-AM: a multi-scale convolutional neural network with attention mechanisms for retinal vessel segmentation. IEEE Access 8, 163926–163936. doi: 10.1109/ACCESS.2020.3022177
Gárate-Escamila, K., Hajjam El Hassani, A., and Andrès, E. (2020). Classification models for heart disease prediction using feature selection and PCA. Informatics Med. Unlocked 19:100330. doi: 10.1016/j.imu.2020.100330
Gomathi, R., Gnanavel, S., Narayana, K. E., and Dhiyanesh, B. (2024). ACGAN: adaptive conditional generative adversarial network architecture predicting skin lesion using collaboration of transfer learning models. Automatika 65, 1458–1468. doi: 10.1080/00051144.2024.2396167
Hannah Pauline, S., and Dhanalakshmi, S. (2022). A robust, low-cost adaptive filtering technique for phonocardiogram signal denoising. Sig. Process. 201:108688. doi: 10.1016/j.sigpro.2022.108688
Hettiarachchi, R., Haputhanthri, U., Herath, K., Kariyawasam, H., Munasinghe, S., Wickramasinghe, K., et al (2021). “A novel transfer learning-based approach for screening pre-existing heart diseases using synchronized ECG signals and heart sounds,” in 2021 IEEE International Symposium on Circuits and Systems (ISCAS), 1–5.
Jha, R. K., Henge, S. K., Mandal, S. K., Sharma, A., Sharma, S., Sharma, A., et al. (2022). Neural fuzzy hybrid rule-based inference system with test cases for prediction of heart attack probability. Math. Probl. Eng. 2022:3414877. doi: 10.1155/2022/3414877
Jones, B., Smith, A., and Lee, C. (2019). “Recurrent neural networks for atrial fibrillation prediction in sound signals,” in Proceedings of the International Conference on Artificial Intelligence in Medicine (Springer), 45–56.
Kadu, A., Singh, M., and Ogudo, K. (2022). A novel scheme for classification of epilepsy using machine learning and a fuzzy inference system based on wearable-sensor health parameters. Sustainability 14:15079. doi: 10.3390/su142215079
Kiranyaz, S., Zabihi, M., Rad A, B., Ince, T., Hamila, R., and Gabbouj, M. (2020). Real-time phonocardiogram anomaly detection by adaptive 1d convolutional neural networks. Neurocomputing 411, 291–301. doi: 10.1016/j.neucom.2020.05.063
Krishnan, P. T., Balasubramanian, P., and Umapathy, S. (2020). Automated heart sound classification system from unsegmented phonocardiogram (pcg) using deep neural network. Phys. Eng. Sci. Med. 43, 505–515. doi: 10.1007/s13246-020-00851-w
Li, H., Zhu, Y., and Lin, X. (2020). Early detection of heart murmurs using deep learning techniques. Comput. Biol. Med. 118:103647.
Li, Y., Zhao, Y., Tang, M. T., and Wu, S. (2020). Heart disease prediction based on sound signals using deep learning techniques. IEEE Trans. Biomed. Eng. 67, 1432–1441.
Lubaib, P., and Ahammed Muneer, K. V. (2016). The heart defect analysis based on PCG signals using pattern recognition techniques. Proc. Technol. 24, 1024–1031. doi: 10.1016/j.protcy.2016.05.225
Mahalakshmi, N. V., and Rout, R. K. (2023). Effective heart disease prediction using improved particle swarm optimization algorithm and ensemble classification technique. Soft Comput. 27, 11027–11040. doi: 10.1007/s00500-023-08388-2
Manuel Centeno-Bautista, A., Angel Rangel-Rodriguez, H., and Andrea Perez-Sanchez, V. (2023). Electrocardiogram analysis using empirical mode decomposition-based methods and convolutional neural networks for sudden cardiac death detection. Appl. Sci. 13:3569. doi: 10.3390/app13063569
Martinek, R., Nedoma, J., Fajkus, M., and Kahankova, R. (2017). A phonocardiographic-based fiber-optic sensor and adaptive filtering system for non-invasive continuous fetal heart rate monitoring. Sensors 17:890. doi: 10.3390/s17040890
Mienye, I. D., and Sun, Y. (2021). Improved heart disease prediction using particle swarm optimization based stacked sparse autoencoder. Electronics 10:2347. doi: 10.3390/electronics10192347
Muthulakshmi, P., and Parveen, M. (2023). Z-score normalized feature selection and iterative african buffalo optimization for effective heart disease prediction. Int. J. Intell. Eng. Syst. 16, 25–37. doi: 10.22266/ijies2023.0228.03
Naveenkumar, E., Dhiyanesh, B., Kanna, R. R., Diwakar, P. S., Murali, M., and Radha, R. (2022). “Detection of lung ultrasound COVID-19 disease patients based convolution multifacet analytics using deep learning,” in 2022 Second International Conference on Artificial Intelligence and Smart Energy (ICAIS) (Coimbatore: IEEE), 185–190. doi: 10.1109/ICAIS53314.2022.9743061
Nwonye, M. J., Narasimhan, V. L., and Mero, Z. A. (2021). “Coronary heart disease analysis using two deep learning algorithms, CNN and RNN and their sensitivity analyses,” in Deep Learning for Personalized Healthcare Services, eds. V. Jain, J. M. Chatterjee, H. Hedayati, S. Krit, and O. Deperlioglu (Berlin; Boston, MA: De Gruyter), 205–224. doi: 10.1515/9783110708127-010
Oliveira, J., Renna, F., Costa, P. D., Nogueira, M., Oliveira, C., Ferreira, C., et al. (2021). The circor digiscope dataset: from murmur detection to murmur classification. IEEE J. Biomed. Health Inform. 26, 2524–2535. doi: 10.1109/JBHI.2021.3137048
Othman, M. Z., and Khaleel, A. N. (2017). Phonocardiogram signal analysis for murmur diagnosing using Shannon energy envelop and sequenced dwt decomposition. J. Eng. Sci. Technol. 12, 2393–2402.
Pan, H., Wang, B., and Jiang, H. (2015). Deep learning for object saliency detection and image segmentation. arXiv [Preprint]. arXiv:1505.01173. doi: 10.48550/arXiv.1505.01173
Prabhakar, S. K., and Won, D.-O. (2023). Phonocardiogram signal classification for the detection of heart valve diseases using robust conglomerated models. Expert Syst. Applic. 221:119720. doi: 10.1016/j.eswa.2023.119720
Ramkumar, M., Shanmugaraja, P., Anusuya, V., and Dhiyanesh, B. (2024). Identifying cancer risks using spectral subset feature selection based on multi-layer perception neural network for premature treatment. Comput. Methods Biomech. Biomed. Eng. 27, 1804–1816. doi: 10.1080/10255842.2023.2262662
Raza, A., Mehmood, A., Ullah, S., Ahmad, M., Choi, G. S., and On, B.-W. (2019). Heartbeat sound signal classification using deep learning. Sensors 19:4819. doi: 10.3390/s19214819
Schanze, T. (2017). Removing noise in biomedical signal recordings by singular value decomposition. J. Curr. Direct. Biomed. Eng. 2, 253–256. doi: 10.1515/cdbme-2017-0052
Schmidt, S. E., Holst-Hansen, C., Graff, C., Toft, E., and Struijk, J. J. (2010). Segmentation of heart sound recordings by a duration-dependent hidden Markov model. Physiol. Meas. 31, 513–529. doi: 10.1088/0967-3334/31/4/004
Sh-Hussain, H., Mohamad, M. M., Ting, C.-M., and Zahilah, R. (2016). Heart sound monitoring system. ARPN J. Eng. Appl. Sci. 11, 4748–4755.
Shuvo, S. B., Alam, S. S., Ayman, S. U., Chakma, A., Barua, P. D., and Acharyam, U. R. (2023). Nrc-net: automated noise robust cardio net for detecting valvular cardiac diseases using optimum transformation method with heart sound signals. Biomed. Signal Process. Control 86:105272. doi: 10.1016/j.bspc.2023.105272
Shuvo, S. B., Ali, S. N., Swapnil, S. I., Al-Rakhami, M. S., and Gumaei, A. (2021). Cardioxnet: a novel lightweight deep learning framework for cardiovascular disease classification using heart sound recordings. IEEE Access 9, 36955–36967. doi: 10.1109/ACCESS.2021.3063129
Sivakami, M., and Prabhu, P. (2023). An optimized hybrid cuckoo search bio-inspired deep belief network for heart disease prediction. J. Theoret. Appl. Inform. Technol. 101, 7088–7099.
Springer, D. B., Tarassenko, L., and Clifford, G. D. (2015). Logistic regressions-based heart sound segmentation. IEEE Trans. Biomed. Eng. 63, 822–832. doi: 10.1109/TBME.2015.2475278
Taylan, O., Alkabaa, A. S., Alqabbaa, H. S., Pamukçu, E., and Leiva, V. (2023). Early prediction in classification of cardiovascular diseases with machine learning, neuro-fuzzy and statistical methods. Biology 12, 1–31. doi: 10.3390/biology12010117
Thakkar, D., and Agrawal, P. (2023). “Hybrid feature selection and optimized deep CNN for heart disease prediction 2023,” in 2nd International Conference on Paradigm Shifts in Communications Embedded Systems, Machine Learning and Signal Processing (PCEMS) (Nagpur: IEEE), 1–6. doi: 10.1109/PCEMS58491.2023.10136121
University of Michigan Health System (2015). Michigan Heart Sound and Murmur database (MHSDB). Available online at: https://www.Med.umich.edu/lrc/psb/heartsounds/index.htm (Accessed April 2022).
Vahanian, A., Beyersdorf, F., Praz, F., Milojevic, M., Baldus, S., Bauersachs, J., et al. (2021). 2021 ESC/EACTS guidelines for managing valvular heart disease. Eur. J. Cardio-Thorac. Surg. 60, 727–800. doi: 10.1093/ejcts/ezab389
VenkataHari Prasad, G., and Rajesh Kumar, P. (2015). Analysis of various DWT methods for feature extracted PCG signals. Int. J. Eng. Res. Technol. 4, 1279–1290. doi: 10.17577/IJERTV4IS041236
Waaler, P. N., Melbye, H., Schirmer, H., Johnsen, M. K., Donnem, T., Ravn, J., et al. (2023). Algorithm for predicting valvular heart disease from heart sounds in an unselected cohort. Front. Cardiovasc. Med. 10:1170804. doi: 10.3389/fcvm.2023.1170804
Wang, H., Zhang, Y., Li, X., Chen, J., and Liu, Q. (2018). Recurrent neural networks for heart disease prediction from sound signals. Neural Comput. Applic. 28, 1765–1774.
Wang, J., Peng, X., Peng, H., Jiang, Y., and Zhao, D. (2018). Deep learning for heart disease prediction: a review. J. Healthc. Eng. 2018:5179230.
Waqar, M., Dawood, H., Dawood, H., Majeed, N., Banjar, A., and Alharbey, R. (2021). An efficient SMOTE-based deep learning model for heart attack prediction. Sci. Programm. 2021:6621622. doi: 10.1155/2021/6621622
Yang, J., and Guan, J. (2022). A heart disease prediction model based on feature optimization and Smote-Xgboost algorithm. Information 13:475. doi: 10.3390/info13100475
Yang, Y., Guo, X.-M., Wang, H., and Zheng, Y.-N. (2021). Deep learning-based heart sound analysis for left ventricular diastolic dysfunction diagnosis. Diagnostics 11:2349. doi: 10.3390/diagnostics11122349
Yuliandari, D., Yudhistira, Y., Wuryanto, A., Sidik, S., and Ayu Sariasih, F. (2024). Improving the accuracy of heart failure prediction using the particle swarm optimization method. Sinkron 8, 210–220. doi: 10.33395/sinkron.v9i1.13017
Yusuf Ilu, S., and Prasad, R. (2023). Improved autoregressive integrated moving average model for COVID-19 prediction by using statistical significance and clustering techniques. Heliyon 9:e13483. doi: 10.1016/j.heliyon.2023.e13483
Zhao, Y., Zhong, Z., Li, Y., Shao, D., and Wu, Y. (2023). Ensemble empirical mode decomposition and stacking model for filtering borehole distributed acoustic sensing records. Geophysics 88, WA319–WA334. doi: 10.1190/geo2022-0273.1
Keywords: heart disease, phonocardiogram signals, feature dimensions, swarm optimization, fuzzy inference, Xception CNN, diastolic and systolic differences, non-linear scaling
Citation: Devi DP and Palanisamy C (2025) Early heart disease prediction using LV-PSO and Fuzzy Inference Xception Convolution Neural Network on phonocardiogram signals. Front. Neuroinform. 19:1655003. doi: 10.3389/fninf.2025.1655003
Received: 27 June 2025; Accepted: 01 August 2025;
Published: 01 October 2025.
Edited by:
Mohd Dilshad Ansari, SRM University (Delhi-NCR), India
Reviewed by:
Rajkumar G., SASTRA University, India
Aji Purwinarko, Universitas Negeri Semarang, Indonesia
Copyright © 2025 Devi and Palanisamy. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: D. Prabha Devi, prabhadevid20@gmail.com; C. Palanisamy, cpalanisami08@gmail.com