Diminished heart beat non-stationarities in congestive heart failure

Studies on heart rate variability (HRV) have become popular and the possibility of diagnosis based on non-invasive techniques compels us to overcome the difficulties originated on the environmental changes that can affect the signal. We perform a non-parametric segmentation which consists of locating the points where the signal can be split into stationary segments. By finding stationary segments we are able to analyze the size of these segments and evaluate how the signal changes from one segment to another, looking at the statistical moments given in each patch, for example, mean and variance. We analyze HRV data for 15 patients with congestive heart failure (CHF; 11 males, 4 females, age 56±11 years), 18 elderly healthy subjects (EH; 11 males, 7 females, age 50±7 years), and 15 young healthy subjects (YH; 11 females, 4 males, age 31±6 years). Our results confirm higher variance for YH, and EH, while CHF displays diminished variance with p-values <0.01, when compared to the healthy groups, presenting higher HRV in healthy subjects. Moreover, it is possible to distinguish between YH and EH with p < 0.05 through the segmentation outcomes. We found high correlations between the results of segmentation and standard measures of HRV analysis and a connection to results of detrended fluctuation analysis (DFA). The segmentation applied to HRV studies detects aging and pathological conditions effects on the non-stationary behavior of the analyzed groups, promising to contribute in complexity analysis and providing risk stratification measures.


INTRODUCTION
It has been a long time since studies on heart rate variability (HRV) have become as popular as the many devices available to record the cardiac and autonomic activity (Akselrod et al., 1985). HRV refers to heart beating signal itself as well to the time between consecutive heart beatings. As reported in literature, several diseases and conditions, as myocardial infarction, diabetic neuropathy, and myocardial dysfunction, for example, are related to decreasing HRV (Pagani et al., 1988;La Rovere et al., 1998), pointing to a connection between healthiness and complexity. Also, the effects of aging are known to present higher HRV in younger individuals, compared to elderly ones (Task Force, 1996). The possibility of diagnosis based on non-invasive techniques such as HRV studies is widely investigated (Kurths et al., 1995) and compels us to overcome the difficulties originated on the environmental changes that can affect the signal, eventually pointing out to a false disruption of the dynamics or leading to neglect a real one. Besides environmental conditions, nonstationarities can arise from multiple regulatory mechanisms, for example blood flow and respiration, operating concomitantly and varying over time, with each subsystem presenting its own time scale (Porta et al., 2009), and since the activation of only one control mechanism is not possible, the cardiac record will carry non-stationarities related to the superposition of different regulatory mechanisms (Magagnin et al., 2011). Spectral analysis, for example, relies on the assumption of weak stationarity, where the mean value is constant and the covariance is only dependent on a time shift, but even in controlled environments, it is questionable whether the efficiency of such control ensures these conditions.
The idea of the segmentation applied to time series is to provide patches of the signal where stationarity is verified. Bernaola-Galván et al. (2001) introduced a mean based segmentation algorithm which looks for points along the signal where the difference of the mean of the resultant segments is maximal. Since the mean based segmentation has t-statistics computed for equal variance samples as the criterion for locating the cutting points, no information about the variances could be inferred. Instead of testing the difference for the mean, we perform a non-parametric segmentation (Camargo et al., 2011), taking into account the whole distribution, with all moments, e.g., mean and variance. In the case of HRV, the known amplitude-frequency coupling of the dominant short-term oscillation, the respiratory sinus-arrhythmia, connects the changes in the variance with the time-dependent covariance, implying non-stationarity (Hirsch and Bishop, 1981).

MATERIALS AND METHODS
For analysis, we consider 24 h measurements of the electrocardiogram of three groups consisting of young and elderly subjects, and patients suffering from congestive heart failure. The groups are composed of 15 patients with congestive heart failure (CHF; 11 males, 4 females, age 56 ± 11 years), 18 elderly healthy subjects (EH; 11 males, 7 females, age 50 ± 7 years), and 15 young healthy subjects (YH; 11 females, 4 males, age 31 ± 6 years). Data for the CHF patients and the YH subjects are available from PhysioNet (Goldberger et al., 2000). Data for the EH were provided by the Charité in Berlin (Wessel et al., 2003). The series of time intervals between consecutive heart beats, the beat-to-beat intervals, are extracted from the electrocardiograms (Suhrbier et al., 2006). All resultant signals were filtered in order to avoid ectopic beats (Wessel et al., 2007).
We have chosen a non-parametric method based on the comparison of the two segments resulting from a hypothetical cutting, by building the empirical cumulative distribution of the points to the left and to the right of this cut and looking for the cut which would provide the maximal distance between the empirical cumulative distributions, namely the Kolmogorov-Smirnov distance. After determining the position of the cut, we check for its statistical significance, and if necessary, we check for a minimum length. If the required critical distance value, for a given significance level, and the minimum length are verified, then the signal is split into two segments, that on their turns, will be examined in order to find other cutting points. Segmentation ends when no significant difference is found or when the remaining segment is smaller than the minimum. The segmentation procedure can be summarized as follows: given a time series, {x i , i 1 ≤ i ≤ i n }, a sliding pointer is moved from i = i 1 , . . . , i n − 1 in order to compare the two fragments of sizes n L = i p − i 1 + 1, and n R = i n − i p , where i p is the position of the pointer. Then one selects the position i max that maximizes the Kolmogorov-Smirnov (KS) statistics where D KS is the maximal distance between the cumulative distributions of the samples. After determining the position i max of the maximal distance D, D max , one checks statistical significance (at a chosen significance level α = 1 − P 0 ) of a potentially relevant cut at i max by comparison with the result that would be obtained was the sequence random. The critical value is given by the empirical expression (a, b, c) = (1.52, 1.80, 0.14) for P 0 = 0.95, n = n L + n R . Equation (2) was obtained by determining D max of a large number of sequences of n independent and identically distributed Gaussian numbers, and then, using the complementary cumulative distribution of D max , we found the critical values for each significance level (Camargo et al., 2011). The potential cut ticks the first stage if D max exceeds its critical value for the selected significance level D max crit (n). If each resulting segment is greater than a defined minimum 0 , then the pointer is set and the procedure is recursively applied starting from the left patches until no patch is segmented. See Camargo et al. (2011) for further details. We performed the KS-segmentation with 0 = 30 and P 0 = 0.95. This value of 0 is chosen in correspondence to the defined higher edge frequency of the very low frequency (VLF) band of heart rate with 0.03 Hz (Task Force, 1996) providing at least a half period of this frequency in each segment. Figure 1 presents a schematic first step of the segmentation: in (A) we see the original signal, that will be examined point to point, as indicated in (B), splitting the signal into two parts and computing the distance between the empirical cumulative distribution of each side (C). Once this step is completed, one will have a distance D associated to each point of the signal (D) and we look then to the point where D is maximal. We verify the significance of this cut by the comparison with the critical value, and the segment minimum length required, then the signal is cut. As a result, we have the signal split into two segments (E). Repeating the procedure, for each new segment resulting from a cut, until no significant differences found or the length reaches a value smaller than the minimum, one will have the signal fragmented in segments (F).

RESULTS
By finding stationary segments we are able to analyze the size of these segments and evaluate how the signal changes from one segment to another, looking at the statistical moments given in each patch, for example, mean and variance, in what we can call the intrinsic time scale of the signal (in opposition to arbitrary divisions of the signal). To quantify the non-stationarity, we obtain the length of the segments, L, the mean RR interval in each segment, μ, and the variance, σ 2 , associated to each segment, as well as the distributed jump size. When considering individual distributions of the segment sizes, the empirical cumulative distribution of segment sizes is described by the double exponential curve Ae −(L− 0 )/L 1 + (1 − A)e −(L− 0 )/L 2 . The Mann-Whitney U test for L 2 provides significant differences between CHF and YH (p < 0.01) as well as CHF and EH (p < 0.05), pointing to differences in the occurrences of large segments between the pathological group and the healthy ones, while the occurrence of short segments shows no significant differences. Comparing the means of the segments does not improve the basic comparison of the means of the complete signal, thus the importance of accessing the second and further moments, if desirable, of each segment. Looking at the distribution of σ 2 of each individual, we were able to fit the curve log[PDF(σ 2 )] = α[log (σ 2 )] 2 + β[log (σ 2 )] + γ, aiming at individual parameters capable of distinguishing the groups. Performing a Mann-Whitney U test for γ values showed in Figure 3 we obtain p < 10 −4 when considering groups CHF-YH and CHF-EH, and p < 0.05 for EH-YH. Although significant differences were found between CHF and the healthy groups when comparing means, the aging effect in the healthy groups is harder to detect, being pointed through symbolic dynamics Once this step is completed, one will have a distance D associated to each point of the signal (D) and the point where D is maximal is determined. The empirical expression given by Equation (2) is used to verify the significance level of the cut, and if D exceeds D crit and the segment results not smaller than the minimum, the signal is cut. As a result, the signal is split into two segments (E). Repeating the procedure, for each new segment resulting from a cut, until no significant differences found or the length reaches a value smaller than the minimum, one will have the signal fragmented in segments (F). POLVAR20 (Wessel et al., 2007) and now through the distribution of variances.
Since it is usual to work with signals subtracted their mean, we compare how the signal subtracted by the segments mean, RR , can improve the understanding of the non-stationary behavior of the groups. Figure 4A shows the trends given by the mean in the segments, μ for a section of the segmented signal and in (B) we see the signal RR resulting from the subtraction of the mean in each segment, μ. Then, we plot in (C) the variance in the detrended signal, σ 2 , against the variance of μ, σ 2 T , for each subject. The results show correlations between the variability of the trends and the variability of the detrended signals, still displaying higher variability for healthy subjects, yet not providing differentiation due to aging effects.
In order to understand the non-stationary behavior of the groups, we compute the mean variance of the segments σ 2 (L) given a segment length, L, displayed in Figure 5. One can see that CHF always presents the smaller values of σ 2 (L) , while YH and EH systematically have higher values. It is worth to mention here the similarity of Figure 5 with plots given by detrended fluctuation analysis (DFA) (Peng et al., 1995), where the signal is integrated and then linearly detrended according to fixed sized boxes and then one verifies how the fluctuations behave with varying box sizes. With the segmentation, we have the box sizes replaced by the segments sizes and the fluctuations are given by the mean variances of the segments with given lengths. Computing the scaling exponents, obtained through the numerical fittings indicated in Figure 5 by solid lines, we obtain the slopes ξ CHF = 0.6, ξ EH = 0.4, and ξ YH = 0.2. Results from DFA have shown that, in the range of 10 2 ∼ 10 4 beats, the scaling exponent of the pathological group approaches to 1.5, corresponding to Brownian noise, that is integrated white noise. For random walk, DFA provides a scaling exponent of 0.5. In our segmentation analysis, as we do not integrate the signal, the correspondent pathological scaling exponent ξ CHF = 0.6 indicates that random walk fluctuations dominate the dynamics of this group, in good accordance with DFA interpretation. For the EH and YH groups, the exponents ξ EH = 0.4, and ξ YH = 0.2 indicate power law correlations associated to the interchange of large and small RR intervals. In this sense, we can confirm the loss of complexity due to pathological condition and aging.
To the extent of standard HRV analysis, in Figure 6 we plot the number of segments with L > 300, L >300 , corresponding to approximately 5 min, the shortest required segment length of HRV analysis (Task Force, 1996) and %μ 50 , the percentage of differences of the mean of two consecutive segments |μ i + 1 − μ i | > 50 ms in an analogy to the standard HRV measure pNN50, the percentage of consecutive RR intervals differing by more than 50 ms (Task Force, 1996). Figure 6 reveals that the pathological group, CHF, presents smaller %μ 50 than the healthy groups EH and YH, with the last exhibiting also higher number of long segments. The aging affects the occurrence of larger segments as we know that the oldest subjects of EH are clustered around L >300 = 40 and the oldest patients of CHF group are clustered around L >300 = 30. We found high positive correlations between %μ 50 and standard deviation of the signal, sdNN, as well as a negative correlation between L >300 and the normalized very low frequency power, VLF/P, linking the segmentation outcomes to standard measures. The first one shows that sdNN is dominated by the large jumps whereas the second correlation reflects the reduction of very low frequencies for longer segments. For the pathological group, CHF, the correlation between sdNN and %μ 50 is 0.94, while healthy groups, YH and EH, present a correlation of 0.90. For VLF/P and L >300 we have a correlation of −0.77 for CHF and −0.81 for YH and EH.

DISCUSSION
In this paper we present the analysis of non-stationarities in heart rate by means of a non-parametric segmentation algorithm, being able to detect significant differences between CHF and age-matched EH, as well as CHF and YH. Also, significant differences between YH and EH can be detected, showing the aging effect in the loss of complexity of the heart rate. We found a correspondence between DFA results, suggesting that random walk fluctuations dominate the dynamics of the pathological group. For the healthy groups, EH and YH, the long term correlations in the interchange of large and small RR intervals appears to be larger for EH and smaller for YH, denoting diminished non-stationarities due also to aging and pathological condition. Although KS-segmentation is limited to long term observations, since short data provide poor statistics of the segments, when considering previous segmentation methods, this choice for using the Kolmogorov-Smirnov distance instead of the mean of the segments provides the possibility of characterizing all the statistical moments of the segments, particularly the changes in the variance. According to our results, comparing means or standard deviation is not efficient to distinguish EH and YH, but the aging effect can be indicated by the variance in the segments. The aging effect has been detected when applying POLVAR20 (Wessel et al., 2003), which gives a good classification of the groups; however, no information about the dynamics underlying the systems is provided. Additionally, this segmentation procedure differs from arbitrarily splitting the signal in equal size segments and testing whether they come from the same distribution or not, rather it is the test for significant differences that determines the lengths of segments.
In terms of standard HRV analysis, a high positive correlation between %μ 50 and sdNN shows that the standard deviation is dominated by the large jumps. A negative correlation between L >300 and VLF/P reflects the reduction of very low frequencies for longer segments. Periodic breathing is prevalent in 30-50% of the patients with CHF (Hall et al., 1996;Lorenzi-Filho et al., 2005), and it has been reported (Lorenzi-Filho et al., 1999) that periodic breathing leads to modulation of the RR intervals in the VLF band, what would cause the algorithm to detect a larger number of short segments for CHF.
Through the outcomes of segmentation we have access to time characteristics of the signal that were no longer available, making possible a different approach to quantify non-stationarities in HRV analysis. Results are in agreement with previous knowledge and do not require arbitrary thresholds or exclude fragments of the time series. The individual risk stratification ability of this method relies on further applications to cardiological data bases.