A learning-based image processing approach for pulse wave velocity estimation using spectrogram from peripheral pulse wave signals: An in silico study

Carotid-to-femoral pulse wave velocity (cf-PWV) is considered a critical index for evaluating arterial stiffness. For this reason, estimating cf-PWV is essential for diagnosing and analyzing different cardiovascular diseases. Despite its broad adoption in clinical routine, measuring cf-PWV is a demanding task for clinicians and patients, which makes the estimate prone to inaccuracies and errors. A smart, non-invasive, peripheral measurement of cf-PWV could overcome the challenges of the classical assessment process and improve the quality of patient care. This paper proposes a novel methodology for cf-PWV estimation based on the spectrogram representation of a single non-invasive peripheral pulse wave signal [photoplethysmography (PPG) or blood pressure (BP)]. The methodology was tested using three feature extraction methods based on the semi-classical signal analysis (SCSA) method, Laws' masks for texture energy extraction, and the central statistical moments. Finally, the features from each method were fed into different machine learning models for cf-PWV estimation. The proposed methodology obtained an $R^2\geq0.90$ for all the peripheral signals in the noise-free case using the MLP model; for the different noise levels added to the original signal, the SCSA-based features with the MLP model yielded an $R^2\geq0.91$ for all the peripheral signals. These results provide evidence of the capacity of the spectrogram representation to estimate cf-PWV efficiently using different feature methods. Future work will test the proposed methodology on in-vivo signals.


Feature selection
Feature selection was performed by first ranking the features with the FQC feature ranking method and then running a sensitivity analysis based on the mean $R^2$ value.
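The two-step procedure (rank the features, then track the mean $R^2$ as the top-ranked features are added one at a time) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the absolute Pearson correlation stands in for the FQC ranking, whose details are not given in this section, and `models` is a hypothetical list of fit-and-predict callables (here a single least-squares regressor rather than the ML models evaluated in the paper).

```python
import numpy as np

def rank_features(X, y):
    """Rank columns of X, best first.

    ASSUMPTION: |Pearson correlation| with y is used as a stand-in score;
    it is NOT the FQC method referred to in the text.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    score = np.abs(Xc.T @ yc) / np.where(denom == 0, 1.0, denom)
    return np.argsort(-score)

def r2_score(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def fit_predict_linear(X_train, y_train, X_test):
    """Hypothetical model: ordinary least squares with an intercept."""
    A = np.column_stack([X_train, np.ones(len(X_train))])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([X_test, np.ones(len(X_test))]) @ coef

def sensitivity_curve(X, y, order, models, n_train):
    """Mean test R^2 over `models` using the top-k ranked features, k = 1..n."""
    curve = []
    for k in range(1, len(order) + 1):
        cols = order[:k]
        scores = [
            r2_score(y[n_train:],
                     model(X[:n_train, cols], y[:n_train], X[n_train:, cols]))
            for model in models
        ]
        curve.append(float(np.mean(scores)))
    return np.array(curve)
```

In this sketch the ranking would be computed on the training rows only, for example `order = rank_features(X[:n_train], y[:n_train])`, so that the held-out rows do not influence which features are tried first.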

SCSA features selection
Tables S5 and S6 show the top 5 SCSA features from the PPG and BP spectrograms, respectively. As shown, the matrix contributing the most features for both the BP and PPG spectrograms was the sum matrix, which combines information from the row eigenvalues and the column eigenvalues and therefore gives more information about the general spectrum of the image. In addition, the features based on the ratio between $h$ and $\kappa$ (R and MR) were the most repeated features in the top 5, showing the relevance of the relation between the parameter $h$ and the eigenvalues for predicting the cf-PWV.
The best features for the PPG spectrogram were $\mathrm{INV3}_{sum}$, $\mathrm{INV2}_{sum}$, and $\mathrm{Mean}(\kappa)_{sum}$ for the Radial, Digital, and Brachial locations, respectively. Additionally, the best features for the BP spectrogram were $\mathrm{STD}(\kappa)_{column}$, $\mathrm{E2}_{column}$, and $\mathrm{INV1}_{row}$ for the Radial, Digital, and Brachial locations, respectively. These results show the importance of the different invariants (INV) for describing changes in the spectrograms related to the estimation of the cf-PWV values.

Figure S1. Sensitivity analysis for the PPG spectrogram using SCSA-based features

Number of features
After obtaining the feature ranking, a sensitivity analysis was performed to select the number of features to use (Figures S1 and S2). Figure S1 shows the result for the PPG spectrograms, where the best $R^2$ values obtained were 0.91 with the best 11 features for the Radial location, 0.91 with the best 26 features for the Digital location, and 0.92 with the best 13 features for the Brachial location. For the Radial and Brachial locations, the numbers of features selected were those stated above, at which $R^2$ reaches its maximum and which represent less than 40% of the original feature space. However, for the Digital spectrogram the best 10 features were selected, given that the value obtained (0.90) was only 0.01 below the maximum while using at least 16 fewer features. This selection helped to reduce the model complexity and the computational cost caused by SCSA.

Figure S2. Sensitivity analysis for the BP spectrogram using SCSA-based features
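The trade-off described for the Digital PPG spectrogram (accepting an $R^2$ that is 0.01 below the maximum in exchange for far fewer features) can be expressed as a simple rule: pick the smallest number of features whose mean $R^2$ lies within a tolerance of the best value. The sketch below is only an illustration of that idea; the selection in this study appears to have been made by inspecting the curves, and the function name, the tolerance parameter, and every curve point other than (10, 0.90) and (26, 0.91) are hypothetical.

```python
def smallest_k_within(curve, tol=0.01):
    """Return the smallest feature count whose R^2 is within `tol` of the maximum.

    curve: dict mapping number of features k -> mean R^2.
    A tiny epsilon guards against floating-point round-off in `best - tol`.
    """
    best = max(curve.values())
    return min(k for k, r2 in curve.items() if r2 >= best - tol - 1e-9)

# Only (10, 0.90) and (26, 0.91) come from the text; the rest is made up.
digital_ppg = {5: 0.88, 10: 0.90, 16: 0.90, 26: 0.91, 30: 0.90}
selected = smallest_k_within(digital_ppg, tol=0.01)
```

With `tol=0` the rule reduces to taking the feature count at the maximum, which matches the choice described for the Radial and Brachial PPG locations.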
Finally, it can be seen in Figure S2 that the Radial and Digital BP spectrograms presented a similar behavior, where the maximum value of $R^2$ was 0.94. For both signals, the number of features selected was 11, with an $R^2$ of 0.92; this uses at least 15 and 20 fewer features, respectively, at the cost of an $R^2$ lower by only 0.02. Additionally, the best 14 features, with an $R^2$ of 0.90, were selected for the Brachial location, given that the maximum value of 0.92 (just 0.02 more) was obtained using 33 features. As mentioned for the PPG feature selection, reducing the number of features reduces the computational cost of the method, making it more suitable for future applications where computational cost can be a limitation. Table S7 shows the number of SCSA features selected.

Energy features selection
Table S8 shows the top 5 ranked features from the 102 computed for each PPG spectrogram. It can be noticed that only the Radial PPG spectrogram has features extracted from the 3x3 masks, the standard deviation from the E3L3 mask being the best feature obtained. In contrast, the Digital and Brachial signals did not have any 3x3 mask in the top five; their best features were the standard deviation for the L5W5 and W5L5 5x5 masks, respectively. It is important to note that the kernel W, used to extract wave features, is the most repeated kernel in the top-five ranking for the three locations. This indicates that wave patterns are relevant in the prediction of the cf-PWV.
For the BP signals, Table S9 shows that the entropy of the image filtered with the S5L5 mask obtained the best ranking for the Radial and Digital locations, whereas for the Brachial location the best feature was the standard deviation of the image filtered with the L5W5 mask. The kernel S, used to extract spot features, is the most repeated kernel in the top five features for the locations studied.

Figure S3. Sensitivity analysis for the PPG spectrogram using energy-based features

Figure S3 shows the sensitivity analysis performed to select the number of features to use for the PPG spectrogram. The three locations show a similar behavior, where the highest $R^2$ was 0.98 using 36 features for Radial, 37 features for Digital, and 35 for Brachial. However, in order to reduce the possibility of overfitting, 25, 19, and 21 features were selected for the respective locations, using 11, 18, and 14 fewer features while obtaining an $R^2$ only 0.01 below the maximum.
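For readers unfamiliar with the mask names used in this section (E3L3, L5W5, W5L5, S5L5), the sketch below builds Laws' 2-D masks as outer products of the standard 1-D level (L), edge (E), spot (S), wave (W), and ripple (R) kernels, filters an image with one mask, and computes a standard deviation and an entropy from the result. It is a hedged illustration only: the helper names are hypothetical, the convention that the first kernel acts along the rows is an assumption, the filtering is a "valid" cross-correlation without any preprocessing, and the histogram-based Shannon entropy is one possible definition that may differ from the one used to obtain the 102 features.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Standard 1-D Laws kernels (3-tap and 5-tap).
KERNELS = {
    "L3": np.array([1, 2, 1], dtype=float),          # level
    "E3": np.array([-1, 0, 1], dtype=float),         # edge
    "S3": np.array([-1, 2, -1], dtype=float),        # spot
    "L5": np.array([1, 4, 6, 4, 1], dtype=float),    # level
    "E5": np.array([-1, -2, 0, 2, 1], dtype=float),  # edge
    "S5": np.array([-1, 0, 2, 0, -1], dtype=float),  # spot
    "W5": np.array([-1, 2, 0, -2, 1], dtype=float),  # wave
    "R5": np.array([1, -4, 6, -4, 1], dtype=float),  # ripple
}

def laws_mask(first, second):
    """2-D mask named first+second, e.g. laws_mask("L5", "W5") for L5W5.

    ASSUMPTION: the first kernel varies along rows, the second along columns.
    """
    return np.outer(KERNELS[first], KERNELS[second])

def filter_valid(image, mask):
    """Cross-correlate image with mask, keeping only fully overlapping positions."""
    windows = sliding_window_view(image, mask.shape)
    return np.einsum("ijkl,kl->ij", windows, mask)

def std_and_entropy(filtered, bins=32):
    """Standard deviation and histogram-based Shannon entropy (assumed definition)."""
    hist, _ = np.histogram(filtered, bins=bins)
    p = hist[hist > 0] / hist.sum()
    return float(filtered.std()), float(-np.sum(p * np.log2(p)))
```

Every mask containing an E, S, W, or R kernel sums to zero, so it responds to local structure rather than to the mean intensity of the spectrogram, which is consistent with the interpretation of W and S as wave and spot detectors.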
Based on Figure S4, the number of features selected for the Brachial BP spectrogram was 5, obtaining an $R^2$ of 0.95. This decision was made given that the highest $R^2$ of 0.97, found for 46 to 102 features, was only 0.02 higher but required at least 41 more features, increasing the possibility of overfitting in the models. In the same way, the best 17 and 15 features were selected for the Radial and Digital locations, respectively, obtaining an $R^2$ of 0.97. This corresponds to using at least 47 and 44 fewer features than the 64 and 59 features needed to obtain an $R^2$ of 0.99, respectively. Table S10 shows the number of features selected for the BP and PPG spectrograms.

Figure S4. Sensitivity analysis for the BP spectrogram using energy-based features

Statistical feature selection
Table S11 shows the ranking obtained for the six statistical features computed for the PPG spectrogram. As shown, the first feature (F1), corresponding to the logarithm of the standard deviation of the spectrogram, obtained the best ranking for the Radial and Digital locations, being the most relevant feature in both cases. This feature also obtained third place for the Brachial spectrogram, making it the only feature in the top three for all three locations considered in this study. On the other hand, the fourth feature (F4), related to the standard deviation of the normalized spectrogram, was the only feature that never ranked in the top three.
Table S12 shows the feature ranking obtained for the BP spectrogram. In contrast with the result obtained for the PPG signals, the F4 feature is relevant for the three locations, being ranked first for the Radial and Brachial locations. Furthermore, Table S12 shows that the BP spectrogram obtained similar feature rankings for all the locations: F3, F5, and F6 were ranked fourth, fifth, and sixth for the three locations, and F1, F2, and F4 were in the top three in all the locations. It is important to note that the Radial and Digital spectrograms obtained the same feature ranking.

Figure S5. Sensitivity analysis for the PPG spectrogram using statistical-based features

Finally, Figure S5 shows the mean $R^2$ obtained from all the ML algorithms using different numbers of features. As can be seen, for the three locations the best results were obtained using the 6 best features, with an $R^2$ of 0.82. All 6 features were selected because they give an improvement over the other results, even if this improvement is only 0.01 compared with using 5 features. In contrast to the selection made for the energy-based features, in this case more weight was given to the $R^2$ value than to the number of features, because the number of features was still low compared to the other methods; for this reason, there is a low possibility of overfitting caused by model complexity even if all the features are used.
On the other hand, Figure S6 shows the results of the sensitivity analysis performed for the BP spectrogram. The Radial and Digital spectrograms behaved similarly to the PPG case, where the performance of the model increases with the number of features used until it reaches the best value of 0.90. For this reason, the maximum number of features (6) was selected for these locations. However, for the Brachial BP the best 5 features were selected, given that the maximum value of 0.86 was reached with these features, and when the sixth-ranked feature was added the models' performance did not improve, meaning that this feature has no impact on the final result. Table S13 shows the number of statistical features selected for each location.

Figure S6. Sensitivity analysis for the BP spectrogram using statistical-based features
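Of the six statistical features, only F1 (logarithm of the standard deviation of the spectrogram) and F4 (standard deviation of the normalized spectrogram) are described in this section. A minimal sketch of these two is given below, assuming a natural logarithm and a min-max normalization; neither choice is stated in the text, and the function name and the small `eps` guard are hypothetical. The remaining features (F2, F3, F5, F6) are not defined here and are therefore not reproduced.

```python
import numpy as np

def stat_features(spec, eps=1e-12):
    """F1 and F4 for a 2-D spectrogram array.

    ASSUMPTIONS: natural log for F1; min-max scaling to [0, 1] for F4.
    `eps` avoids log(0) for a constant spectrogram.
    """
    spec = np.asarray(spec, dtype=float)
    f1 = np.log(spec.std() + eps)
    span = spec.max() - spec.min()
    norm = (spec - spec.min()) / span if span > 0 else np.zeros_like(spec)
    f4 = norm.std()
    return {"F1": float(f1), "F4": float(f4)}
```

Under these assumptions F1 changes with the overall scale of the spectrogram, whereas F4 is invariant to it, so the two features carry different information even though both are based on a standard deviation.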