FPGA-Based High-Performance Phonocardiography System for Extraction of Cardiac Sound Components Using Inverse Delayed Neuron Model

The study focuses on the extraction of cardiac sound components using a multi-channel micro-electromechanical system (MEMS) microphone-based phonocardiography system. The proposed multi-channel phonocardiography system classifies the cardiac sound components using artificial neural networks (ANNs) and synaptic weights that are calculated using the inverse delayed (ID) function model of the neuron. The proposed ANN model was simulated in MATLABR and implemented in a field-programmable gate array (FPGA). The proposed system examined both abnormal and normal samples collected from 30 patients. Experimental results revealed a good sensitivity of 99.1% and an accuracy of 0.9.


INTRODUCTION
Heart diseases are one of the major causes of human death worldwide. In the last 15 years, heart disease and stroke have been the leading killers and causes of death on a global scale (1). Heart failure has no cure, but early detection of its related symptoms helps in properly diagnosing heart diseases and, thus, reducing the death rate. In the modern technological revolution, many heart diagnosis methods like phonocardiography (PCG), ECG, echocardiogram (Echo), cardiac MRI (CMRI), and CT heart scan are available to detect early heart failure. Each method has its own advantages and disadvantages; for example, ECGs are widely used in diagnosis but have trouble detecting the structural abnormalities of the heart valves, which can be detected through heart murmurs. Furthermore, echo, CMRI, and CT scans provide accurate results but need a lot of pre-evaluation, are high cost, and are not affordable for many people.
Phonocardiography is one of the non-invasive methods for the detection of major heart sounds and murmurs (2). The stethoscope was the primary PCG instrument that was used to detect cardiac auscultation; however, it had limitations in terms of clinical expertise for the analysis of the low-frequency amplitudes formed during heart failures, such as systolic and diastolic murmurs (3,4). A great deal of research has been conducted in the field of heart sound segmentation and classification in order to detect normal and abnormal components. Many signal processing algorithms are proposed to extract heart sound components for detecting pathological events (5)(6)(7). Artificial neural networks (ANNs) are widely used in cardiology (8)(9)(10). A neural network model is a well-known method for separating the normal and abnormal pathological events in heart sounds (11)(12)(13)(14). A welltrained neural network model can be used to detect the complex relationship between abnormal and normal cardiac sounds. Deep neural networks extract high-level features from low-level features. The methods proposed (15)(16)(17) to play significant roles in solving the non-linear functions in medical applications.
Cardiac auscultations are non-linear in nature and are analyzed using ANNs, which are more useful for the approximation of non-linear functions. The neural networks models are mainly classified by their architecture, activation function, and learning algorithm. Instead of directly emulating biological behavior, the traditional neuron network models translate this behavior in terms of time-averaging techniques (9,18,19). The inverse delayed (ID) function model proposed by Nakajima (20)(21)(22) is a universal neuron model that includes characteristics of both the Bonhoeffer Van der Pol model (19) and the Hopfield model (23). In addition, the ID function model uses the inverse function of the activation function rather than the traditional activation function and features a finite conversion time from the internal state of the element to the output.
The energy function of the ID function model with symmetric synapse weights is similar to that of the Hopfield model. Through selective destabilization, the negative resistance of the ID model can free the neural network state from such local minima (20). Unlike the chaotic neural network, the ID model does not need to transform the output vector, record the output vector during calculation, or control the dynamics by changing the network parameters (22). We only need to wait for it to enter an inactive state in order to find a solution using the ID neural network and a simple method of implementation. The ID model is capable of resolving combinatorial optimization problems (21). The negative resistance of the ID model can destabilize the stable equilibrium points of a neural network, reducing the possibility of unknown values in suboptimal synaptic weight solutions obtained using an ANN based on a traditional neuron model. The ANN implementation using the ID neuron model needs numerous parallel computations to solve the real-time complex data for the extraction of components. The advances in field-programmable gate arrays (FPGAs) handle these real-time complex computations effectively and improve the performance of the system.
The current study concentrated on the non-invasive detection of cardiac component abnormalities in raw samples collected with micro-electromechanical system (MEMS)-based microphones. Wavelet decomposition algorithms were used to generate the featured set. The ID neuron function was used to create the ANN model, which extracted the cardiac components from the feature set obtained. The ID function model of the neuron was used to optimize the weights of the synapses between the neurons. The entire algorithm was implemented on a Xilinx SoC FPGA XC7Z020CLG400 (Xilinx, USA). The proposed system has been validated using the sensitivity, specificity, and accuracy of the cardiac components, and justified using receiver operating characteristic curve analysis. The study is organized as follows: Section Theory focuses on the theory supporting the proposed method using ID function model of the neuron and illustrates the feature sets for cardiac sound assessment. Section Materials and Methods focuses on the materials and methods for the proposed technology. The results and discussions are presented in Section Experimental Results and Discussion.

Inverse Delayed Neuron Function Model of the Neuron
The ANN was realized using the ID function model of the neuron. Nakajima and Hayakawa proposed the ID function model of the neuron by the following set of equations: Where u i is the ith neuron internal state, x j is the jth neuron output, W ij is the synaptic weight between jth and ith neurons, h i is the bias input, a ii is the self-connection synaptic weight, τ is the internal state time constant, and τ x is the neuron output time constant. From Equation (3), f(x) is the neural network sigmoid function, then g(x) = f −1 (x) is the N-shaped inverse output function (24). g(x) can be changed with a positive value of K times output of the neuron. The transition time from u to x is less than τ and thus τ x < < τ . In general, the transition time should be taken into account if it is significantly less than τ . For the present problem, we used the self-connectionless neurons to devoid the hysteresis effects (24). So a ii = 0 in Equation (1).
Differentiating Equation (2) with respect to time t, we get Let us consider Substituting Equation (5) and (4) gives Where  U i denotes the potential of the ID function model of the neuron. In Equation (7), the first term denotes the inertia and the second term denotes the friction. If g(x i ) is an N-shaped function, then the area where dg(x) dx i is less than −τ x τ for specific values of x i is called the negative resistance region. From the Lyapunov function, the energy of the ID function model is Since the proposed neuron network has self-connectionless neurons, the self-connections between neurons are ignored. The last term in Equation (8) shows the time delay in the ID function model. Differentiating both sides of Equation (8) with respect to time t, we get From Equation (11), the energy (E) of the ID model, similar to that of the Hopfield model, decreases with time if the network state is in the positive resistance region (φi > 0). However, in the negative resistance region (φi < 0), the energy (E) increases with time; thus, even if the state is in the minima region, it quickly exits this region. It is necessary to have this feature in order to avoid local minima. As a result, if the network is an ID function model, the likelihood of escaping the local minima is expected to increase.

Feature Sets for Cardiac Sound Assessment
The assessment of the cardiac sound components involved different parameters that were deferred by a set of features to sort out the components from the heart sounds. In the current research, the Springer segmentation algorithm by Springer et al. (25) was used to differentiate the heart sounds using timing intervals of S1, S2, systole, and diastole. The following parameters were used as feature sets to identify the low-frequency abnormal components and normal heart sound components. Mean of the Systolic to diastolic time interval ratio of each heart sound.

F3
Mean of the heart sound peak energy in systolic cycle to total cardiac cycle energy of each heart beat 4 F4 Mean of the heart sound peak energy in diastolic cycle to total cardiac cycle energy of each heartbeat.

F5
Mean of spectral frequencies from 10 to 900 Hz with window resolution of 10 Hz in systole cycle of each heartbeat.

F6
Mean of spectral frequencies from 10 to 900 Hz with window resolution of 10 Hz in diastolic cycle of each heartbeat.

High-Performance Phonocardiography System Hardware
The proposed high-performance phonocardiography system was developed based on the advanced MEMS microphone to capture the low-frequency components and analyze the captured data using a proposed algorithm based on the inverse delayed neuron model. The proposed algorithm was implemented on the Xilinx Zynq-7 System on-chip FPGA, which has a dual-core ARM cortex-A9 for application software and programmable logic for algorithm complex computations. The detailed block diagram for the high-performance phonocardiography system is as shown in Figure 1.
A cardiac sound detection unit consists of a MEMS microphone, which is a tiny integrated circuit with a sound transducer, an analog front end, and a signal conditioning circuit (7). The MEMS microphone has a high Signal-to-Noise Ratio (SNR) at 70 dB and a good frequency response from 10 Hz to 10 kHz as shown in Figure 2. Due to its flat response and high SNR in the lower region, a cardiac sound detection (CSD) is more suitable for the detection of the third and fourth heart sounds and murmurs. The CSD module has four microphones, the placements of which were on the basis of sound source localization to cover the four heart valves (aortic valve, tricuspid valve, mitral valve, and pulmonary valve). These valves are the origins of cardiac sounds.  A high-speed data processing unit consists of a Xilinx Zynq-7 System on-chip FPGA, which has a dual-core ARM cortex-A9 processor for the application software and programmable logic for algorithm complex computations. The high-speed data processing (HSDP) module is responsible for the separation of cardiac sounds based on the frequencies and processes used in the proposed ANN-based ID model neuron. The Graphical Display Recorder (GDR) unit consists of an LCD touchpad for parameter configuration and a display for analyzing the results for further diagnosis. The application software was developed on an ARM cortex-A9 processor and interfaced to the GDR module.
The current system implemented was in two phases. In the first phase, the proposed algorithm was modeled using the MATLAB, simulated with different test parameters, and baselined as a golden reference for further hardware system development. The proposed neural network model was implemented on FPGA in the second phase to achieve performance comparable to the MATLAB model. The cardiac sounds detected by the CSD unit were then passed to the HSDP unit for the extraction of the cardiac feature set mentioned in Table 1. Afterward, these feature sets were given as inputs to the hidden layer and known spectrograms, which would train the network and predict the error deviation to assess the features of the cardiac sounds.

Prediction Model Using ID Function Model of the Neuron
The proposed ANN algorithm is based on a feedforward network with three layers as shown in Figure 3. The first layer has six inputs for the six feature sets mentioned in Table 1. The second layer is a hidden layer that consists of 12 hidden neurons that compute the delayed weighted sum of inputs and the inverse tangent sigmoid non-linear function for the feature extraction. The third layer is an output layer that consists of five output neurons. It is a logical net to reduce the error in extraction and sends the output based on the input from the hidden layer.

Realization of Inverse Activation Function
An activation function is used to present non-linearity into the output of the neuron. For the current work, an inverse tan hyperbolic function is taken as the activation function.
f (x) = x+ x 3 3 Neglecting the higher terms f(x) becomes Equation (16) is realized using a constant (1/3), adder, and multiplier. The activation limits the output in the range of (1,−1). Figure 4 shows that the activation function in the ID model is an N-shaped transfer function.
The ID network, which consisted of 12 hidden neurons in the hidden layer, was trained using MATLAB R . Learning was accomplished through the use of the Levenberg-Marquardt backpropagation algorithm. Backpropagation was used to obtain the input and layer weight matrices by incorporating the derivatives of the inverse functions. These matrices were used to replicate the ID network onto the FPGA. The neural network was trained using the ID function model. The result of the regression coefficient "R" is shown in Figure 13.

System-Level Implementation
The experimental setup with PCG sensors and the Cardiac Health Monitoring System (CHMS) is shown in Figure 5. The proposed  algorithm implementation in FPGA is shown in Figure 6. The prediction model, which was based on the ID function neuron model, was developed in the system generator tool, and the Verilog netlist for RTL integration was generated. The VIVADO design suite 2018.2 was used for simulation and synthesis. The model was successfully dumped onto the Zynq7 -XC7Z010-1CLG400 device FPGA board. The PCG sensor data captured in FPGA internal Block RAM memory were passed through the preprocessing module to remove the unwanted noise. The cleaned data were then processed through the cardiac cycle separation engine to differentiate the systole and diastole cycles. The feature extraction module extracted the features mentioned in Table 1 from systole and diastole. The extracted six featured sets were passed through the ANN model to classify them into five heart sound component groups. The decision logic outputted the true-negative (TN), true-positive (TP), false-negative (FN), and false-positive (FP) for corresponding heart sound components. Figure 7 shows the FPGA hardware board developed for the proposed system.
The ANN with the inverse delayed neuron model was implemented in MATLAB R using the neural network toolbox and the fixed-point toolbox as discussed in previous sections. The model was simulated to check for functionality and then used to calculate the synaptic weights and bias required for the hidden and output layers. The simulated fixed-point MATLAB model was taken as a golden reference for further hardware realization using FPGA.
The Xilinx System generator tool was used for the implementation and generation of the Verilog code for the system integration. As discussed in earlier sections, the ANN model was realized in three stages. For the first stage, in the input layer, the inputs were scaled with weights and passed to hidden neurons. For the second stage, the bias was added to the summed weights by the hidden layer, which then passed through the delayed activation function. For the third stage, the output layer computed the output value from all hidden neurons, the output bias, and the activation function. The sigmoid activation function was a building block that was used in both the hidden and output layers (26). Equation (16) was used to implement the inverse activation function, which consists of an adder, multiplier, and constant value. As shown in Figure 8, the neuron model was realized using a Mult-Add block and a constant block for bias, an activation function. Figure 9 depicts the proposed ANN model implementation. The estimated sigmoid value is closer to the real sigmoid value obtained from MATLAB, allowing the approximation effect to be reduced for improved accuracy when implemented in hardware.

EXPERIMENTAL RESULTS AND DISCUSSION
The extraction of cardiac sounds from the MEMS-based highperformance phonocardiography system using a neural network was based on the ID function model of the neuron. The neural networks based on the ID function model and the     conventional neuron models were realized on FPGA, and the hardware requirements and performance of the two models were compared. The proposed system was validated using the data of 30 patients in accordance with the Declaration of Helsinki. After obtaining informed consent, a total of 60 patients were made available for evaluation, with the data of 30 patients being used for training the neural networks and the data of the remaining 30 patients used for testing the proposed system.
The neural networks were trained offline using data from the 30 patients in MATLAB R , and then implemented on FPGA to reduce design circuitry. The MATLAB R model weights and biases were used as hard-coded values in the FPGA ANN model to reduce computational cycles and achieve the accuracy obtained in the simulation. The FPGA implementation of the inverse tangent sigmoid function, which requires the realization of an N-shaped activation function, involved multiplication but not division. The multiplier was all that was required for the functional units of the ID model. This greatly reduced hardware complexity. The system generator FPGA netlist files were used to run the synthesis, and implement and generate the bit file needed to program the FPGA. Figure 10 depicts the RTL schematic for the neuron model following RTL synthesis. The physical layout of the proposed system is depicted in Figure 11. Figure 12 shows the FPGA resource utilization of the proposed layout after place and route.
The mean square error in the extraction of cardiac sound components detection rate using neural networks with 12 neurons in the hidden layer was .9. Regression analysis was performed on the input and target datasets, and the mean square error was found to be 4.4 × 10 −5 . Figure 13 depicts the regression analysis of the training and validation of the network. In the regression analysis, the parameter "R" = 0.99 represents the correlation between extracted cardiac components and actual cardiac components.
The regression analysis showing the training and validation of the network is depicted in Figure 13. The parameter "R, " which is equal to 0.99 in the regression analysis, signifies the correlation between extracted cardiac components with actual cardiac components. Figure 14 depicts the training state analysis of a neural network based on the ID function model of a neuron. Figure 15 depicts a neural network for performance analysis. Training analysis was performed for epoch 11, with a gradient factor of 9.0661 × 10 −5 and validation checks equal to 6. In accordance with the Declaration of Helsinki, clinical trials were conducted on 30 patients using the proposed highperformance phonocardiography system, and the results were compared to the known test results from the medical practitioner. The prevalence of disease in the tested population, the outcome of the diagnostic test, and the sensitivity and specificity of the test all had an impact on the reliability of any diagnostic test result. The sensitivity, the specificity rate, and the accuracy can be computed as follows where true-negative (TN) represents the number of correct heart sound components rejected, true-positive (TP) represents the number of correct heart sound components detected by the proposed system, false-negative (FN) represents the number of incorrect heart sound components rejected, and false-positive (FP) represents the number of incorrect heart sound components detected by the proposed system. Table 2 displays the sensitivity, specificity, and accuracy of cardiac component detection for a healthy individual under specific disease conditions.   The performance of the proposed method was evaluated using the receiver operating characteristic (ROC) curve area under curve (AUC) value. This validated the extraction of cardiac components from the captured data using the proposed algorithm. The ROC curve for the extraction of cardiac component accuracy for the proposed ID neuron model system is shown in Figure 16. The accuracy of S1 and S2, S3 and S4, aortic stenosis, mitral Stenosis, and mitral regurgitation is 99.3, 98.6, 98.7, 98.7, and 98.6%, respectively, based on the AUC values in Figures 16A-E.

CONCLUSION
The current research focused on the development of multichannel MEMS-based phonocardiography system to capture heart sounds and process the acquired sample to remove unwanted noise and derive a feature set using wavelet transforms. Thereafter, the low frequency cardiac sounds were extracted using the ANN based on the ID function of the neuron model. The neural network was trained using real, known data, and the proposed system was tested using patient test data. The developed ANN-based phonocardiography system was useful to the physician for recognizing abnormal, low-frequency heart sounds with a simple diagnosis setup similar to the stethoscope and visualizing graphical data for better medical diagnoses. The performance of the phonocardiography system was evaluated using 2,150 cardiac cycles of PCG from a cohort of 30 patients with different pathophysiological conditions, resulting in a sensitivity of 99% and an accuracy of 0.9.

DATA AVAILABILITY STATEMENT
The datasets presented in this article are not readily available because of Non-disclosure Agreement with the patients.