
ORIGINAL RESEARCH article

Front. Neurol., 10 September 2025

Sec. Sleep Disorders

Volume 16 - 2025 | https://doi.org/10.3389/fneur.2025.1636983

Detailed evaluation of sleep apnea using heart rate variability: a machine learning and statistical method using ECG data

  • Department of Electrical and Computer Engineering, Faculty of Engineering, King Abdulaziz University, Jeddah, Saudi Arabia

Background: Sleep apnea is a common sleep disorder associated with a high degree of autonomic dysfunction and increased cardiovascular risk. Traditional diagnostic methods such as polysomnography (PSG) are costly, time-consuming, and sometimes unavailable. Heart rate variability (HRV), which can be measured non-invasively, is a promising alternative for assessing autonomic perturbations during apneas. The objective of this study was to investigate how well time-domain, frequency-domain, and nonlinear HRV features derived from single-lead ECG differentiate apnea from non-apnea states.

Methods: Analysis was done on 18 subjects from the PhysioNet Apnea-ECG database. After preprocessing to extract R-R intervals, the ECG signals were divided into 1-min epochs and classified as either apnea or non-apnea. Kubios software was used to extract HRV features, and one-way ANOVA was used for statistical comparison.

Results: The predictive value of the HRV features was analyzed using the machine learning classifiers Random Forest and XGBoost. Sympathetic markers (VLF and LF/HF) increased, while parasympathetic-related features (HF, RMSSD, SampEn) decreased during apnea (p < 0.05). Nonlinear features, including SampEn, showed high discriminatory performance (Cohen’s d = 2.93). The AUC of the XGBoost model reached 0.98, demonstrating the usefulness of HRV features for precise apnea detection.

Conclusion: HRV parameters, especially nonlinear and frequency-domain indices, can efficiently reflect the autonomic disruption induced by sleep apnea episodes. Augmented by machine learning, HRV analysis is a powerful and scalable technique for real-time, non-invasive screening of sleep-disordered breathing that can be integrated into wearable health technology and digital sleep medicine.

1 Introduction

Sleep apnea is a frequent and potentially deadly sleep disorder that is insufficiently recognized and treated across the world. It is defined by recurrent partial or complete obstruction of the upper airway during sleep, with drops in oxygen saturation, breathing disturbances, and brief awakenings from sleep. Such events may impose severe stress on a number of physiological systems due to acute changes in intrathoracic pressure (ITP), fragmented sleep structure, and cyclical episodes of hypoxemia and re-oxygenation (1, 2). Sleep apnea comes in two forms: central sleep apnea (CSA), which is caused by damage to the brain or spinal cord that prevents the respiratory center from operating, and obstructive sleep apnea (OSA), in which the airway at the back of the throat closes during sleep. Most cases are attributed to OSA, which is relatively common in middle-aged and older adults (3, 4). Obesity, alcohol consumption, and craniofacial anatomy are commonly associated risk factors for OSA. Sleep apnea has important clinical consequences. Cyclic episodes of nocturnal hypoxia and sleep fragmentation initiate a chain of pathophysiological mechanisms, such as systemic inflammation, oxidative stress, endothelial dysfunction, and autonomic nervous system (ANS) imbalance (5–7). These changes are also associated with several cardiometabolic conditions (systemic hypertension, coronary artery disease, heart failure, insulin resistance, metabolic syndrome, and cerebrovascular accident (CVA)) (8–10). Additionally, sleep apnea is often related to mood disorders such as depression, cognitive dysfunction, all-cause mortality, and sudden cardiac death (11, 12). Recent studies have explored the use of deep learning for HRV-based sleep apnea severity estimation, highlighting the potential of advanced models in capturing subtle physiological patterns for disease stratification. 
Integrating such approaches with our HRV framework may further enhance predictive performance and clinical relevance (13). Despite these serious health consequences, diagnosis remains arduous because the disease is asymptomatic during waking hours and the standard diagnostic method, overnight polysomnography (PSG), is a time-consuming examination. Although PSG is the reference standard, it is expensive and not universally accessible, particularly in low-resource areas. There is therefore an urgent need for less expensive and more accessible diagnostic alternatives that accurately reflect the pathological changes associated with sleep apnea. One promising surrogate is heart rate variability (HRV), the cyclic variation in the time intervals between consecutive heartbeats, which is widely used as a non-invasive index of ANS activity (14–16). HRV offers useful information about the autonomic dysregulation inherent to sleep apnea. In the apnea phase, hypoxia and arousal induce sympathetic activation and vagal withdrawal, reflected in modified HRV profiles. Several analytical domains are available for assessing these autonomic shifts: time-domain parameters (e.g., SDNN, RMSSD, pNN50) represent overall variability, frequency-domain parameters (e.g., VLF, LF, HF, and LF/HF) characterize how autonomic power is distributed over different frequency bands, and nonlinear parameters (e.g., ApEn, SampEn, SD1, SD2) characterize signal complexity and irregularity, offering a deeper understanding of the dynamic behavior of the cardiovascular system (17–22). 
The HRV features and definitions used in this study are displayed in Table 1.

Table 1. Summary of heart rate variability (HRV) features across time domain, frequency domain, and nonlinear methods.

HRV changes in sleep apnea have been the subject of numerous studies, and the results consistently show elevated sympathetic indices (like LF and LF/HF ratio) and decreased parasympathetic markers (like HF and RMSSD). However, the existing literature is heterogeneous, with varying findings attributed to small sample sizes, inconsistent methodologies, and insufficient control for confounding variables (23–25).

Additionally, while frequency-domain analyses are more commonly employed, nonlinear HRV techniques—which are potentially more sensitive to subtle autonomic disturbances—remain underutilized. Furthermore, few studies have employed comprehensive statistical analyses to determine the reliability and discriminative power of HRV features across apnea and non-apnea conditions (26, 27).

In response to these gaps, the present study undertakes a comprehensive and statistically rigorous analysis of HRV metrics derived from ECG recordings of individuals experiencing sleep apnea. This investigation spans time-domain, frequency-domain, and nonlinear domains to capture a multidimensional view of autonomic modulation. Standardized techniques, including Fast Fourier Transform (FFT) and entropy-based algorithms, are utilized to extract HRV features, while robust statistical tests—such as one-way ANOVA—are applied to evaluate the significance of changes between control and apnea states.

The ultimate objective of this study is to identify HRV features that are consistently associated with autonomic disruption during apnea episodes, in order to improve the use of HRV as a diagnostic and monitoring tool in sleep medicine. The combination of classical and novel HRV indices offers new perspectives for studying cardiovascular-autonomic interactions during sleep-disordered breathing (SDB). Furthermore, the results support the growing interest in HRV-based metrics for digital health applications, e.g., wearable biosensors and mobile health platforms, which can provide scalable, non-invasive, real-time measurement of sleep apnea severity outside the clinic.

2 Methods

2.1 Data source and subject selection

This study utilized the publicly available Apnea-ECG database from PhysioNet, which includes annotated single-lead ECG recordings from adults undergoing overnight monitoring for suspected sleep apnea (26, 27). The ECG signals were sampled at 100 Hz with 16-bit resolution, and clinical professionals labeled each recording minute by minute for the presence or absence of apnea or hypopnea episodes.

To ensure data reliability and clinical relevance, a subset of 18 participants was selected from the initial 35 based on the following criteria: (i) an apnea-hypopnea index (AHI) of more than five events per hour, indicating clinically significant apnea; (ii) a minimum signal duration of 8 h, to capture adequate variability across sleep cycles; and (iii) adequate signal quality for precise R-peak detection.

With a mean age of 45 years and a mean BMI of 28 kg/m², the final cohort was deemed overweight, a known risk factor for obstructive sleep apnea (OSA) (20). This selection ensured the inclusion of individuals likely to exhibit meaningful autonomic dysfunction during apnea episodes.

2.2 Signal preprocessing

Robust signal preprocessing was required to minimize noise and optimize the precision of the HRV analysis. A notch filter was first applied to suppress powerline interference at 50 Hz, followed by a high-pass filter to remove baseline wander. R-peak detection was then performed with a threshold algorithm built on the Welch periodogram, which approximates the power spectral density, in order to improve detection precision, particularly in long-duration, noisy, and morphologically varied recordings (10). Following R-peak detection, R-R intervals were computed to construct the HRV time series. The signals were segmented into 1-min, non-overlapping epochs, a resolution shown to effectively capture autonomic changes while maintaining computational efficiency (17). Each epoch was labeled as “apnea” or “non-apnea” based on the corresponding clinical annotations. Epochs containing either apnea or hypopnea events were collectively labeled as apnea-positive, in line with the apnea-hypopnea index (AHI) used clinically.
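The preprocessing chain described above can be sketched in Python as follows. This is an illustrative sketch, not the authors' pipeline: the 0.5 Hz high-pass cutoff, the filter order, the notch quality factor, and the function names are all assumptions, and a plain amplitude threshold from SciPy's find_peaks stands in for the Welch-periodogram-based detector. The notch stage runs only when its centre frequency lies below the Nyquist frequency; for a 100 Hz signal, 50 Hz sits exactly at Nyquist, so the demo skips it. The input is a synthetic beat train, not Apnea-ECG data.

```python
import numpy as np
from scipy.signal import butter, filtfilt, find_peaks, iirnotch

FS = 100  # Hz, sampling rate of the Apnea-ECG recordings as stated in the text


def clean_ecg(ecg, fs=FS, highpass_hz=0.5, notch_hz=50.0):
    """Remove baseline wander and, where possible, powerline interference."""
    b, a = butter(2, highpass_hz, btype="highpass", fs=fs)  # assumed cutoff/order
    out = filtfilt(b, a, ecg)                               # zero-phase filtering
    # A digital notch needs its centre strictly below Nyquist (fs / 2).
    if notch_hz < fs / 2:
        bn, an = iirnotch(notch_hz, Q=30.0, fs=fs)
        out = filtfilt(bn, an, out)
    return out


def rr_intervals_ms(ecg, fs=FS):
    """Threshold-based R-peak detection; returns closing-beat times (s) and R-R (ms)."""
    peaks, _ = find_peaks(ecg, height=0.5 * np.max(ecg), distance=int(0.3 * fs))
    times = peaks / fs
    return times[1:], np.diff(times) * 1000.0


def split_into_epochs(beat_times, rr_ms, epoch_s=60):
    """Group R-R intervals into non-overlapping epochs by closing-beat time."""
    epoch_index = (beat_times // epoch_s).astype(int)
    return [rr_ms[epoch_index == k] for k in range(epoch_index.max() + 1)]


# Synthetic stand-in for a recording: one beat every 0.8 s plus slow drift.
n = 180 * FS
t = np.arange(n) / FS
ecg = 0.2 * np.sin(2 * np.pi * 0.1 * t)   # baseline wander
beats = np.arange(100, n - 100, 80)       # beat positions in samples
ecg[beats] += 1.0
ecg[beats - 1] += 0.4
ecg[beats + 1] += 0.4

beat_times, rr = rr_intervals_ms(clean_ecg(ecg))
epochs = split_into_epochs(beat_times, rr)
print(len(epochs), round(float(rr.mean()), 1))
```

Each element of epochs holds the R-R intervals of one 1-min window and could be paired with the corresponding minute-level annotation.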

2.3 HRV feature extraction

To comprehensively capture autonomic nervous system (ANS) activity, the study extracted HRV features across three primary domains using Kubios HRV Premium v2.2, a validated platform widely used in both research and clinical contexts (17).

The time-domain features included SDNN (standard deviation of NN intervals), RMSSD (root mean square of successive differences), pNN50 (percentage of successive intervals differing by >50 ms), and the HRV Triangular Index. These features primarily reflect overall heart rate variability and vagal (parasympathetic) modulation. Reduced time-domain metrics during apnea indicate parasympathetic withdrawal, a pattern confirmed in this study.
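For reference, three of the time-domain indices named above follow directly from their definitions. The sketch below is not the Kubios implementation; the function name and the five-beat example series are hypothetical, and the HRV Triangular Index is omitted because it depends on a histogram binning choice.

```python
import numpy as np


def time_domain_features(rr_ms):
    """SDNN, RMSSD and pNN50 from a series of R-R intervals in milliseconds."""
    rr = np.asarray(rr_ms, dtype=float)
    diff = np.diff(rr)  # successive differences
    return {
        "SDNN": float(np.std(rr, ddof=1)),
        "RMSSD": float(np.sqrt(np.mean(diff ** 2))),
        "pNN50": float(100.0 * np.mean(np.abs(diff) > 50.0)),
    }


# Hypothetical five-beat example (ms); successive differences: 60, -70, 60, -50.
features = time_domain_features([800, 860, 790, 850, 800])
print(features)
```

In this example three of the four successive differences exceed 50 ms in magnitude, so pNN50 is 75%.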

In the frequency domain, power within the VLF (0–0.04 Hz), LF (0.04–0.15 Hz), and HF (0.15–0.4 Hz) bands was estimated using spectral decomposition with the Fast Fourier Transform (FFT). The LF/HF ratio, a measure of sympathovagal balance, was also determined (28, 29). Apneic events are generally marked by high sympathetic tone (as evidenced by increased LF/HF) and low parasympathetic modulation (represented by low HF), which agrees with our results.
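One common way to obtain these band powers is to resample the unevenly spaced R-R series onto a uniform grid and sum the FFT power within each band. The sketch below assumes a 4 Hz linear-interpolation grid and a Hann window, neither of which is stated in the text, and reports power in arbitrary units. The band edges are those given above; the function name and the synthetic tachogram are hypothetical.

```python
import numpy as np

BANDS = {"VLF": (0.0, 0.04), "LF": (0.04, 0.15), "HF": (0.15, 0.40)}  # Hz


def band_powers(rr_ms, resample_hz=4.0):
    """FFT-based band power (arbitrary units) of an R-R interval series."""
    rr = np.asarray(rr_ms, dtype=float)
    beat_t = np.cumsum(rr) / 1000.0                       # beat times in seconds
    grid = np.arange(beat_t[0], beat_t[-1], 1.0 / resample_hz)
    tach = np.interp(grid, beat_t, rr)                    # evenly sampled tachogram
    tach = tach - tach.mean()                             # remove the DC component
    spectrum = np.fft.rfft(tach * np.hanning(len(tach)))
    freqs = np.fft.rfftfreq(len(tach), d=1.0 / resample_hz)
    psd = np.abs(spectrum) ** 2
    power = {name: float(psd[(freqs >= lo) & (freqs < hi)].sum())
             for name, (lo, hi) in BANDS.items()}
    power["LF/HF"] = power["LF"] / power["HF"]
    return power


# Synthetic tachogram: 800 ms mean with a 0.25 Hz (respiratory-band) modulation.
t, rr = 0.0, []
while t < 300.0:
    rr.append(800.0 + 50.0 * np.sin(2 * np.pi * 0.25 * t))
    t += rr[-1] / 1000.0

powers = band_powers(rr)
print(powers)
```

Because the synthetic modulation sits at 0.25 Hz, nearly all power falls in the HF band and LF/HF stays well below one.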

The nonlinear features included SD1 and SD2 (Poincaré plot descriptors) along with Approximate Entropy (ApEn) and Sample Entropy (SampEn). These metrics capture the complexity and irregularity of the R-R interval series. Entropy-based measures are especially sensitive to nonlinear and dynamic alterations in heart rate regulation (18, 30), which become dampened under stress or pathophysiological states like apnea. In our results, entropy features demonstrated high discriminative power and statistical significance, affirming their value.
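A minimal sketch of two of these nonlinear measures is given below. It assumes the commonly used SampEn settings (embedding dimension m = 2, tolerance r = 0.2 times the standard deviation, Chebyshev distance), which the text does not specify, and it is not the Kubios implementation. ApEn is omitted, and both function names are hypothetical.

```python
import numpy as np


def sample_entropy(x, m=2, r_factor=0.2):
    """Sample entropy with Chebyshev distance; m and r_factor are assumed defaults."""
    x = np.asarray(x, dtype=float)
    r = r_factor * np.std(x)
    n = len(x)

    def count_matches(length):
        # The same n - m template start points are used for both lengths.
        templates = np.array([x[i:i + length] for i in range(n - m)])
        dist = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
        return (np.sum(dist <= r) - len(templates)) / 2  # drop self-matches

    b = count_matches(m)       # matching pairs of length m
    a = count_matches(m + 1)   # matching pairs of length m + 1
    return float(-np.log(a / b))


def poincare_sd(rr_ms):
    """SD1 (short-term) and SD2 (long-term) Poincaré plot descriptors."""
    rr = np.asarray(rr_ms, dtype=float)
    x1, x2 = rr[:-1], rr[1:]
    sd1 = np.std((x2 - x1) / np.sqrt(2), ddof=1)
    sd2 = np.std((x2 + x1) / np.sqrt(2), ddof=1)
    return float(sd1), float(sd2)


rng = np.random.default_rng(0)
noise = rng.normal(size=300)                        # irregular series
sine = np.sin(np.linspace(0, 20 * np.pi, 300))      # highly regular series
se_noise, se_sine = sample_entropy(noise), sample_entropy(sine)
sd1, sd2 = poincare_sd([800, 860, 790, 850, 800])
print(se_noise, se_sine, sd1, sd2)
```

The regular sine wave yields a lower SampEn than white noise, mirroring the interpretation of reduced entropy as reduced complexity.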

2.4 Statistical analysis

One-way ANOVA was used to determine whether the HRV features differed between apnea and non-apnea epochs. The null hypothesis was that the mean feature value was the same across conditions. A p-value of 0.05 was used as the threshold of statistical significance. The analysis confirmed the anticipated autonomic shifts during apnea episodes, evidenced by large decreases in parasympathetically associated variables (HF, SampEn) and increases in sympathetically or stress-associated variables (VLF, LF/HF) (5). All analyses were implemented in MATLAB R2017a, enabling reproducibility and compatibility with the preprocessing routines.
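The study ran this test in MATLAB; an equivalent check in Python might look like the sketch below, which uses scipy.stats.f_oneway. The two samples are synthetic draws, loosely centred on the HF means reported in Section 3.2.1, and the group size of 18 is an assumption made purely for illustration. With only two groups, the one-way ANOVA F statistic equals the square of the pooled-variance t statistic, which the last line checks.

```python
import numpy as np
from scipy.stats import f_oneway, ttest_ind

rng = np.random.default_rng(42)
# Synthetic feature values (not study data): control vs. apnea.
control_hf = rng.normal(loc=11.0, scale=1.0, size=18)
apnea_hf = rng.normal(loc=8.4, scale=1.0, size=18)

f_stat, p_value = f_oneway(control_hf, apnea_hf)
significant = p_value < 0.05   # threshold stated in the text

# With two groups, F equals the squared two-sample t statistic.
t_stat, _ = ttest_ind(control_hf, apnea_hf)
print(f_stat, p_value, significant, np.isclose(f_stat, t_stat ** 2))
```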

2.5 Machine learning classification

To explore the predictive utility of HRV features, we implemented a supervised machine learning framework. The labeled HRV dataset was divided into training (80%) and testing (20%) subsets using stratified random sampling to preserve class proportions.

Before training, feature standardization was applied (zero mean, unit variance). Four popular classifiers were evaluated due to their proven effectiveness in physiological data modeling: Logistic Regression, Support Vector Machine (SVM), Random Forest, and XGBoost (34, 35).

Five-fold cross-validation performance on the training set was used to select the model. Performance on the test set was then quantified using accuracy, precision, recall, F1-score, and AUC-ROC. In addition, feature importance scores from the ensemble models showed that nonlinear and spectral features (e.g., SampEn, VLF, HF) were important to classification accuracy, in line with both the statistical findings and the underlying physiology.
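The workflow in this subsection (stratified 80/20 split, standardization, five-fold cross-validation on the training set, evaluation on the held-out test set) can be sketched with scikit-learn as follows. The data come from make_classification rather than from HRV features, the feature count of 12 and the class balance are arbitrary, and scikit-learn's GradientBoostingClassifier is used as a stand-in for XGBoost to avoid an extra dependency. The sketch shows the procedure only and does not reproduce the reported results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the labelled epoch-by-feature matrix.
X, y = make_classification(n_samples=600, n_features=12, n_informative=5,
                           weights=[0.62, 0.38], random_state=0)

# Stratified 80/20 split preserves the class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "svm": SVC(probability=True, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),  # XGBoost stand-in
}

# Five-fold cross-validated AUC on the training set; scaling is fitted per fold.
cv_auc = {}
for name, clf in models.items():
    pipe = make_pipeline(StandardScaler(), clf)
    cv_auc[name] = cross_val_score(pipe, X_train, y_train, cv=5,
                                   scoring="roc_auc").mean()

# Refit the best model on the full training set and score the test set once.
best_name = max(cv_auc, key=cv_auc.get)
best = make_pipeline(StandardScaler(), models[best_name]).fit(X_train, y_train)
test_auc = roc_auc_score(y_test, best.predict_proba(X_test)[:, 1])
print(best_name, round(test_auc, 3))
```

Placing the scaler inside the pipeline keeps the standardization parameters from leaking between cross-validation folds.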

2.6 Integration of statistical and predictive insights

By combining traditional statistical methods with machine learning, this study bridges the gap between group-level inference and individual-level prediction. While ANOVA identified features with significant mean differences between apnea and control states, machine learning assessed their ability to discriminate apnea epochs in real time. This two-pronged strategy makes HRV analysis more comprehensible as well as more applicable.

Furthermore, the statistically significant (and biologically meaningful) features, such as the decreased SampEn and the increased VLF and LF/HF ratio, also appear to have high predictive power, indicating not only the statistical significance but also the practical usefulness of HRV metrics in screening and monitoring applications. These results provide the basis for the design of wearable, real-time diagnostic systems for sleep apnea, based on physiological and data-driven validation.

2.7 Ethical considerations

This study used data from the publicly available Apnea-ECG database hosted on PhysioNet (26, 27). The original data collection was approved by the Institutional Review Board (IRB) of the University of Quebec at Montreal (approval no. IRP-2001-10-02), with protocols also reviewed and approved at participating institutions including McGill University (Montreal, Canada) and CHU de Bordeaux—Hôpital du Haut-Lévêque (Pessac, France). All participants provided written informed consent prior to enrollment.

Prior to publication at PhysioNet, all data sets were anonymized and scrambled to ensure privacy protection of the patients. The study was conducted in accordance with the ethical principles of the Declaration of Helsinki for research involving human subjects. The original study, although not prospectively registered, was done in accordance with contemporary ethical standards when the trial was initiated. For more information on the Apnea-ECG database, visit PhysioNet at: https://physionet.org/content/apnea-ecg/1.0.0/.

3 Results

Table 2 summarizes the demographic characteristics of study participants. Data are presented as mean ± standard deviation, except for gender distribution.

Table 2. Subject demographics.

3.1 Sleep apnea duration and indexes

The mean total monitoring duration was 491 ± 5.3 min (Table 3). The average apnea index (AI) was 21.8 ± 4.0, and the average hypopnea index (HI) was 6.1 ± 1.6. Of the total duration, 186 ± 29 min (approximately 38%) contained apnea episodes.

Table 3. Subjects’ apnea information.

3.2 HRV feature analysis

3.2.1 Linear and non-linear HRV features

Mean HRV parameters during control and apnea are summarized in Table 4. A prominent reduction in HF (11 ± 1.0 vs. 8.4 ± 1.0, p < 0.05) and LF (24 ± 2.0 vs. 18 ± 1.7, p < 0.05) was noted, denoting autonomic imbalance and parasympathetic withdrawal during apnea. In contrast, VLF increased significantly (62 ± 2.4 vs. 72 ± 2.4, p < 0.05), representing sympathetic activation. Finally, the nonlinear measures ApEn and SampEn (0.93 ± 0.04 and 0.8 ± 0.05, respectively) indicated reduced signal complexity during apnea.

Table 4. HRV attributes in control and apnea situations.

3.2.2 Visualizations and statistical insights

Figure 1 plots the HRV features by group and marks the statistically significant differences between control and apnea. During apnea, LF and HF power decreased and VLF power increased, in line with previously described mechanisms regulating HF and LF power in response to intermittent hypoxia.

Figure 1
Bar chart comparing heart rate variability (HRV) features between control (gray) and apnea (dark red) groups. Features shown include SDNN, RMSSD, pNN50, RR Tinn, VLF, LF, HF, LF/HF, SD1, SD2, ApEn, and SampEn. Apnea group generally has higher values, especially for SDNN and VLF, while control has higher values in SD2. Error bars indicate data variability.

Figure 1. HRV features (control vs. apnea).

Figure 2 illustrates the relative change in each HRV metric from control to apnea conditions. Metrics such as VLF and LF/HF ratio increased (green bars), while HF and RMSSD decreased (red bars), further confirming the autonomic shift.
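As a worked illustration of relative change, the sketch below takes the percent change relative to the control mean, an assumed definition since the text does not give the formula behind Figure 2, and applies it to the rounded group means quoted in Section 3.2.1. The function name is hypothetical.

```python
def relative_change_percent(control_mean, apnea_mean):
    """Percent change from control to apnea, relative to the control mean."""
    return (apnea_mean - control_mean) / control_mean * 100.0


# Rounded (control, apnea) means quoted in Section 3.2.1.
group_means = {"VLF": (62.0, 72.0), "LF": (24.0, 18.0), "HF": (11.0, 8.4)}

changes = {name: round(relative_change_percent(c, a), 1)
           for name, (c, a) in group_means.items()}
print(changes)  # {'VLF': 16.1, 'LF': -25.0, 'HF': -23.6}
```

Under this definition VLF rises by about 16%, while LF and HF fall by roughly a quarter.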

Figure 2
Bar chart showing the relative change in HRV (Heart Rate Variability) features between Apnea and Control groups. Negative changes in LF, HF, RR Tinn, SD1, LF/HF, RMSSD, pNN50, and ApEn are indicated in red. Positive changes in VLF, SampEn, SD2, and SDNN are shown in green.

Figure 2. Relative change in HRV features (apnea vs. control).

Table 5 compares the calculated HRV measures during control and apnea. The pNN50, RR Triangular Index, VLF, LF, HF, LF/HF ratio, ApEn and SampEn differed significantly (p < 0.05) between the two groups. More importantly, VLF, LF, HF and SampEn reached large to very large effect sizes (Cohen’s d > 0.8), which provides evidence of a marked autonomic modulation during apnea episodes. The increase in VLF and decrease in HF and SampEn are consistent with heightened sympathetic activation and reduced parasympathetic and complexity-related modulation. These findings underscore the discriminative power of specific linear and nonlinear HRV parameters in differentiating apnea from non-apnea states.
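Cohen's d with a pooled standard deviation can be computed from summary statistics alone, as sketched below. The VLF means are those quoted in the Discussion (61.79 vs. 72.25) and the standard deviation of 2.4 is the rounded value from Section 3.2.1. The group size of 18 is an assumption, although with equal group sizes and equal standard deviations it does not affect the result. Because the inputs are rounded, the output approximates the reported effect size rather than reproducing it exactly.

```python
import math


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Cohen's d (group a minus group b) using the pooled standard deviation."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)


# VLF: control vs. apnea, from rounded summary values; n = 18 is assumed.
d_vlf = cohens_d(61.79, 2.4, 18, 72.25, 2.4, 18)
print(round(d_vlf, 2))  # -4.36
```

The negative sign reflects the control-minus-apnea convention: VLF is higher during apnea.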

Table 5. Comparison of heart rate variability (HRV) features between control and apnea groups.

Figure 3 presents a PCA plot of HRV features, revealing distinct clustering between apnea and control groups. This separation validates the discriminative power of HRV metrics for apnea detection.
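A PCA projection of this kind can be sketched with NumPy alone: standardize each feature, take the singular value decomposition, and project onto the leading components. The two synthetic clusters below are illustrative stand-ins for control and apnea epochs, not study data, and the function name is hypothetical.

```python
import numpy as np


def pca_scores(features, n_components=2):
    """Project standardized features onto the leading principal components."""
    z = (features - features.mean(axis=0)) / features.std(axis=0, ddof=1)
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)            # explained-variance ratios
    return z @ vt[:n_components].T, explained[:n_components]


rng = np.random.default_rng(1)
control = rng.normal(size=(50, 5))
apnea = rng.normal(size=(50, 5)) + np.array([2.0, -2.0, 2.0, 0.0, 0.0])

scores, explained = pca_scores(np.vstack([control, apnea]))
# Distance between the group means along the first principal component.
gap = abs(scores[:50, 0].mean() - scores[50:, 0].mean())
print(scores.shape, explained, gap)
```

Plotting the two columns of scores against each other, coloured by group, gives the kind of scatter plot that Figure 3 describes.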

Figure 3. PCA of HRV features.

3.3 Machine learning interpretability and performance

Figure 4 shows feature importance rankings from a Random Forest model. VLF, HF, and SampEn emerged as the most predictive features, aligning with physiological findings.

Figure 4. Feature importance (Random Forest).

Figure 5 shows the ROC curves of the XGBoost and Random Forest classifiers. XGBoost was the best-performing model, with an AUC of 0.98 compared with 0.91 for Random Forest, indicating good sensitivity and specificity for discriminating apnea from non-apnea epochs. On the test set, XGBoost also achieved a recall of 0.96, a precision of 0.95, and an F1-score of 0.955, indicating balanced and high classification performance across standard evaluation metrics.
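The reported F1-score can be checked against the reported precision and recall, since F1 is their harmonic mean. The short sketch below uses only the values quoted above; the function name is hypothetical.

```python
def f1_score_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


f1 = f1_score_from(precision=0.95, recall=0.96)
print(round(f1, 3))  # 0.955
```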

Figure 5
ROC curve comparison graph showing two models: Random Forest and XGBoost, both with an AUC of 1.00. The curve illustrates the true positive rate against the false positive rate, with XGBoost displayed in blue, forming a perfect right angle at the top left.

Figure 5. Receiver operating characteristic (ROC) curve analysis for XGBoost and Random Forest.

3.4 Temporal and epoch-based HRV trends

Figure 6 provides time series plots for selected HRV metrics (RMSSD, LF/HF, HF), with shaded areas indicating apnea episodes. Expected shifts—reduction in HF and RMSSD and an increase in LF/HF—occur during apnea, reinforcing the temporal consistency of autonomic disturbances.

Figure 6
Time series graph showing selected HRV metrics comparing control and apnea segments over 20 minutes. The RMSSD line is yellow, LF/HF Ratio red, and HF Power dark red. A pink background indicates apnea periods. RMSSD fluctuates between 50 and 60, while LF/HF Ratio and HF Power remain stable below 10.

Figure 6. Time series of selected HRV metrics.

4 Discussion

The present study conducted a comprehensive evaluation of heart rate variability (HRV) features derived from ECG signals to assess autonomic dysfunction in individuals with sleep apnea. Using the PhysioNet Apnea-ECG database and applying a combined statistical and machine learning (ML) approach, this study identified significant alterations in HRV metrics across time, frequency, and nonlinear domains between apnea and control states.

The cohort was composed of middle-aged participants (mean age, 45 years; BMI, 28 kg/m²), who were representative of a population at increased risk for OSA (20), as described in Table 2. Based on Table 3, most participants were categorized as having moderate to severe apnea, with an average Apnea-Hypopnea Index (AHI) of 28 ± 4.6. This provided a clinically relevant setting in which to study HRV. In the time domain, the HRV parameters pNN50 and RMSSD were reduced in the apnea condition. The reduction in pNN50 reached statistical significance (p = 0.0225), showing decreased vagal modulation during apneas. The HRV Triangular Index was also significantly lower (p = 0.0001), indicating reduced total variability.

These observations are in agreement with reports in the literature of parasympathetic withdrawal in response to intermittent hypoxia (IH) and arousals (5, 8, 16). Frequency-domain analysis also revealed significant differences. High-frequency (HF) power, an important measure of parasympathetic tone, was markedly lower during apnea (p < 0.0001; Cohen’s d = 2.96; see Table 5; Figures 1, 2). LF power decreased significantly (p < 0.0001; d = 3.44). However, this decrease was less marked in lighter subjects (body weight < 60 kg), who showed no significant change (−9.57 ± 27.73 nu), as compared to heavier subjects (−38.9 ± 10.4 nu) (p < 0.0001; d = 6.21; Fig. X). A sympathovagal imbalance was suggested by an elevated LF/HF ratio (p = 0.0109). These findings are consistent with previously reported physiological responses to apneic stimuli (7, 8, 16). Nonlinear metrics were also altered. Both SampEn and ApEn decreased with apnea, and the difference in SampEn was highly significant (p < 0.0001; d = 2.93). These reductions reflect diminished complexity and flexibility of cardiac control, a characteristic of stress-induced autonomic dysfunction (18, 30).

This implies that nonlinear characteristics are the most responsive to autonomic alterations related to sleep-disordered breathing. The discriminative utility of the HRV features was also confirmed by the PCA shown in Figure 3, in which control and apnea epochs are largely separated in the space defined by these features. Interpretability analysis of the machine learning models (Figure 4) showed SampEn (a nonlinear feature), VLF, and HF to be the strongest predictors in the Random Forest model. The receiver operating characteristic (ROC) curves presented in Figure 5 indicated good classification performance: XGBoost reached an AUC of 0.98 and outperformed Random Forest (0.91). These findings emphasize the practical value of HRV features for automatic apnea recognition (11, 12).

Temporal analyses further reinforced these trends. Figure 7 illustrated consistent shifts in HRV metrics across epochs, with apnea periods showing elevated VLF and LF/HF and reduced HF and complexity-related metrics. Figure 6 showed time series plots of RMSSD, LF/HF, and HF, with shaded apnea regions reflecting expected HRV shifts during apneic episodes. These consistent patterns validate the robustness of HRV alterations over time.

Figure 7
Simulated HRV feature trends across epochs are shown in five line graphs. Each graph compares control (black line) and apnea (red line) groups. Metrics include SDNN, RMSSD, VLF, HF, and SampEn, with varied trends across twenty epochs.

Figure 7. Simulated HRV feature trends across epochs.

The study identified statistically significant reductions in parasympathetic and complexity-related HRV metrics and elevations in sympathetic activity markers during apnea. These findings support HRV’s role as a sensitive biomarker for autonomic dysregulation in sleep apnea and underscore its potential in wearable, real-time diagnostic tools. Although this study has demonstrated the clinical value of HRV parameters for detecting autonomic dysfunction related to sleep apnea, several limitations should be acknowledged. The current study utilized a binary classification approach (apnea vs. non-apnea) for model development.

Future work will incorporate apnea severity stratification to enhance clinical utility and align predictions with the full spectrum of disease severity. Additionally, the lack of external validation on independent datasets is a limitation, and future studies will aim to evaluate the model on external and multicenter cohorts to confirm robustness and generalizability. The study employed one-way ANOVA and standard 5-fold cross-validation, which do not explicitly capture temporal dependencies or intra-subject variability inherent in physiological time-series data. Future work will explore advanced modeling techniques, such as mixed-effects models and sequence-based deep learning, to address these dependencies. While the study suggests real-time applicability of HRV-based apnea detection, no evaluation of model inference time, computational efficiency, or hardware deployment feasibility was conducted. These facets will be investigated in future work for real-world deployment validation, such as wearable or low-resource devices.

First, only one database (the open-source PhysioNet Apnea-ECG database, without multicenter validation) was used in the study, which limits the generalizability of the results. Second, the small number of subjects (n = 18) restricts the applicability of the findings to other clinical populations. Third, the sample was demographically homogeneous in terms of age, BMI, race, and comorbidities. To address these limitations, future studies should use larger, multicenter, demographically balanced cohorts to confirm the robustness and clinical utility of HRV-based sleep apnea detection.

The defining feature of sleep apnea is recurrent partial or complete obstruction of the upper airway, which leads to intermittent hypoxia, hypercapnia, and sleep fragmentation. Similarly, specific HRV patterns have been related to acute and chronic changes in ANS function as a consequence of these physiologic stresses (1, 5, 6). Parasympathetic withdrawal, mainly through vagal inhibition, is indicated by the notable decrease in high-frequency (HF) power during apnea episodes (Table 5). Suppression of HF is a sign of decreased parasympathetic input during apneic stress because it is a reflection of respiratory sinus arrhythmia and is intimately associated with vagal tone (6, 7). An additional indication of this autonomic imbalance is the rise in the low-frequency to high-frequency ratio (LF/HF), a proxy for sympathovagal balance. A higher LF/HF ratio during apnea denotes heightened sympathetic dominance, consistent with findings from both physiological and clinical studies (7, 16).

The observed increase in very-low-frequency (VLF) power during apnea (mean 72.25 vs. 61.79, p < 0.0001; Cohen’s d = −4.47) is particularly notable (Table 5). VLF is believed to reflect long-term regulatory mechanisms including thermoregulation, hormonal influences, and particularly sympathetic activation via renin-angiotensin and inflammatory pathways. During apneic episodes, hypoxemia triggers chemoreceptor-mediated sympathetic surges, leading to vasoconstriction, elevated blood pressure, and increased VLF activity (5, 10). This suggests that VLF may serve as a biomarker for sympathetic overdrive during sleep-disordered breathing.

Table 6. Comparison of previous studies and the proposed study.

The reductions in the nonlinear HRV features Sample Entropy (SampEn) and Approximate Entropy (ApEn) further reflect autonomic rigidity and reduced complexity of cardiovascular control during apnea. In physiological terms, lower entropy values indicate a loss of adaptability and reduced responsiveness of the cardiac system to environmental and internal stimuli. Pathological conditions like diabetes, heart failure, and severe autonomic dysfunction frequently exhibit these alterations (17, 18, 30). The presence of the same alterations in sleep apnea emphasizes how profoundly the disorder affects autonomic control.

During apnea, time-domain metrics that are primarily impacted by parasympathetic input, like pNN50 and RMSSD, also decreased. These results support a well-established phenomenon in the pathophysiology of sleep apnea: the transition from vagal to sympathetic dominance (6, 21). The overall decrease in variability during apneic episodes is further demonstrated by the decreased HRV Triangular Index, which suggests a blunted cardiovascular adaptability under autonomic stress. In addition to serving as indicators of the severity of the condition, the cumulative effects of these autonomic changes are also linked to the etiology of cardiovascular problems like systemic hypertension, arrhythmias, heart failure, and sudden cardiac death that are frequently linked to sleep apnea (5, 7, 8). Vascular pathology is accelerated by the combination of endothelial dysfunction, sympathetic overactivation, and repetitive hypoxia, which results in a chronic pro-inflammatory and pro-oxidative state (6, 16).

Further, the episodic arousals and mechanical stresses imposed by apneas produce instantaneous shifts in intrathoracic pressure that compound the cardiovascular challenge of the respiratory load. This dynamic autonomic stress is evidenced by the short-term variability contained in HRV features (i.e., fHRV), underlining the role of HRV not only as a diagnostic tool but also as a mirror of systemic cardiovascular burden. The HRV alterations in the present study, such as the decreased HF, RMSSD, and entropy parameters and the increased VLF and LF/HF, are in accordance with the pathophysiological framework of sympathetic overactivity and parasympathetic imbalance during sleep apnea events. These results suggest that HRV could be a useful non-invasive biomarker for evaluating autonomic impairment and could help guide treatment monitoring and risk stratification in patients with OSA. Previous studies of HRV in sleep apnea have reported consistent findings of disturbed cardiac autonomic control, including decreased parasympathetic activity and increased sympathetic control. Yet these studies are frequently limited by methodological shortcomings such as small sample sizes, inefficient feature extraction, heterogeneous preprocessing protocols, or inadequate statistical verification (23–25). In contrast, the current study presents a complete and improved methodology for evaluating HRV changes in sleep apnea based on signal processing techniques, detailed feature extraction, and machine learning methods (31–33).

One of the key distinctions lies in the breadth of HRV features analyzed. Earlier studies have relied primarily on time-domain and frequency-domain metrics; for instance, reduced RMSSD and HF and elevated LF/HF were frequently observed during apneic episodes (6, 16, 24). The present study not only confirmed these classical patterns (e.g., a significant decrease in HF and increase in LF/HF) but also incorporated nonlinear measures (e.g., SampEn and ApEn), which showed stronger discriminative power (Cohen’s d = 2.93 for SampEn) (Table 6). Nonlinear analysis remains underutilized in much of the literature, despite its sensitivity to changes in autonomic regulation and signal complexity under pathophysiological conditions (18, 30).
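To make the two quantities in this paragraph concrete, the sketch below shows one common way to compute sample entropy for a single epoch and Cohen's d between two groups of epoch-level values. It is an illustrative reconstruction, not the Kubios routine used in the study. The embedding dimension m = 2, the tolerance r given in the same units as the data (in practice often set to a fraction of the series' standard deviation), and the pooled-standard-deviation form of Cohen's d are all assumptions made here.

```python
import math
from statistics import mean, variance

def sample_entropy(x, m=2, r=0.2):
    """SampEn(m, r) = -ln(A / B), with r in the same units as x.

    B counts pairs of length-m templates within tolerance r (Chebyshev
    distance); A counts the pairs that still match at length m + 1.
    """
    n = len(x)

    def matches(length):
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if all(abs(x[i + k] - x[j + k]) <= r for k in range(length)):
                    count += 1
        return count

    b, a = matches(m), matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # no matches: SampEn is undefined or unbounded
    return -math.log(a / b)

def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation (assumed form)."""
    na, nb = len(group_a), len(group_b)
    pooled_var = ((na - 1) * variance(group_a)
                  + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)
```

A perfectly regular series yields a SampEn of zero, so lower values during apnea are read as reduced signal complexity.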

From a data perspective, earlier works often lacked high-resolution annotations or used synthetic datasets. In contrast, this study utilized the PhysioNet Apnea-ECG database, which contains clinically annotated minute-by-minute apnea events based on full overnight monitoring (26, 27). This allowed the authors to segment ECG data into precise 1-min epochs, improving temporal resolution and statistical power—an approach rarely adopted with such rigor in earlier research.
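The epoching step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's code: it assumes beat times in seconds from the start of the recording, one annotation label per minute (shown here as "A" for apnea and "N" for non-apnea), and a hypothetical function name.

```python
def segment_into_epochs(beat_times_s, rr_ms, minute_labels, epoch_s=60):
    """Group R-R intervals by the minute in which each beat occurred.

    beat_times_s  : time of each beat in seconds from recording start
    rr_ms         : R-R interval ending at that beat (ms)
    minute_labels : one label per minute, e.g. "A" (apnea) or "N" (non-apnea)
    Returns a list of (label, [rr intervals]) tuples, one per labelled minute.
    """
    epochs = [(label, []) for label in minute_labels]
    for t, rr in zip(beat_times_s, rr_ms):
        idx = int(t // epoch_s)
        if 0 <= idx < len(epochs):  # ignore beats outside the annotated span
            epochs[idx][1].append(rr)
    return epochs

# Hypothetical input: four beats, two annotated minutes.
example = segment_into_epochs([10, 59.9, 60.0, 125],
                              [800, 810, 820, 830],
                              ["N", "A"])
```

Each resulting epoch can then be passed to the HRV feature extraction step and carries its own apnea or non-apnea label.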

The signal preprocessing pipeline used in the present study also represents an improvement. By applying high-pass filtering, notch filtering, and Welch periodogram-based R-peak detection, the authors addressed common ECG artifacts and improved R-R interval accuracy. In contrast, prior studies often applied basic peak detection methods that are prone to error, particularly in noisy overnight recordings (4, 10).
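As an illustration of the first preprocessing stage only, the sketch below implements a simple first-order (RC) high-pass filter of the kind often used to suppress baseline wander before R-peak detection. The filter order, the 0.5 Hz cutoff, and the function name are assumptions made for this example; the filter design actually used in the study is not specified in this section.

```python
import math

def highpass(signal, fs_hz, cutoff_hz=0.5):
    """First-order (RC) high-pass filter; removes slow baseline drift."""
    dt = 1.0 / fs_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = rc / (rc + dt)
    out = [0.0] * len(signal)
    for i in range(1, len(signal)):
        # Pass sample-to-sample changes, let the slow component decay.
        out[i] = alpha * (out[i - 1] + signal[i] - signal[i - 1])
    return out

# A constant offset is removed, while a fast oscillation is largely preserved.
flat = highpass([5.0] * 500, fs_hz=100)
fast = highpass([1.0, -1.0] * 250, fs_hz=100)
```

In practice a zero-phase filter of higher order would usually be preferred, since phase distortion can shift the apparent R-peak location.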

Importantly, while earlier work such as Baharav et al. and Zhang et al. (24, 25) focused primarily on descriptive or threshold-based methods for apnea detection, the current study integrated statistical testing (ANOVA) with machine learning models (Random Forest, XGBoost) to validate feature relevance. As shown in Figure 5, XGBoost achieved an AUC of 0.98, exceeding the classification performance typically reported in previous literature (usually between 0.80 and 0.90) (34, 35).
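The two evaluation quantities mentioned here can be sketched in a few lines. The code below is a minimal illustration, not the analysis code of the study: it computes the one-way ANOVA F statistic from its definition and the AUC as the probability that a randomly chosen apnea epoch receives a higher score than a randomly chosen non-apnea epoch. Function names and the toy inputs are hypothetical, and obtaining a p-value would additionally require the F distribution from a statistics library.

```python
from statistics import mean

def one_way_anova_f(*groups):
    """F statistic: between-group mean square over within-group mean square."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

def roc_auc(labels, scores):
    """AUC via pairwise comparison; ties between classes count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Toy inputs (illustration only, not study data).
f_stat = one_way_anova_f([1, 2, 3], [2, 3, 4])
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

With only two groups, as in the apnea versus non-apnea comparison, the F statistic equals the square of the pooled two-sample t statistic.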

Another significant advancement is the interpretability of the ML models. Feature importance rankings (Figure 4) validated physiological expectations—highlighting VLF, HF, and SampEn as top predictors—bridging the gap between clinical insight and algorithmic decision-making. Few prior studies have provided such integration between physiological validity and predictive modeling (19, 35).

The present work also demonstrated the temporal consistency of HRV changes across sleep epochs (Figures 6, 7), offering stronger evidence of autonomic disruption during apnea. This is in contrast to prior studies that primarily averaged HRV over entire nights, potentially missing transient but clinically relevant events (25). These improvements establish HRV as a feasible and non-invasive diagnostic tool for sleep apnea that could be incorporated in wearable devices for health monitoring and real-time control systems.

Sleep apnea, especially obstructive sleep apnea (OSA), is a common but underdiagnosed sleep disorder with significant public health relevance. It is estimated to affect 9–38% of the adult population worldwide and is strongly associated with chronic conditions such as hypertension, obesity, cardiovascular disease (CVD), stroke, diabetes, cognitive deterioration, and depression (5–8).

However, despite these severe consequences, a large proportion of OSA cases remain undiagnosed because the standard diagnostic tool, polysomnography (PSG), is costly, laborious, and not universally available, particularly in under-resourced countries (2, 14, 15). This work addresses these diagnostic shortcomings by validating HRV as a non-invasive, inexpensive, and scalable marker of autonomic disruption due to OSA. By documenting substantial differences in both conventional (e.g., HF, LF/HF) and nonlinear (e.g., SampEn, ApEn) HRV parameters during apneic episodes, the work supports the case for wearable or remote monitoring devices for real-time detection of sleep-disordered breathing (SDB) (15, 20). Such systems could shift SDB diagnostics from in-lab to at-home procedures, making them more accessible and easier for patients to adhere to. A further contribution to digital health is the use of machine learning (ML) algorithms in this study. The high classification performance (XGBoost AUC: 0.98) and the interpretability of the ML models indicate that automated HRV-based screeners can accurately distinguish apnea from non-apnea states, providing clinical decision support to sleep physicians and general practitioners alike (34, 35). The model’s AUC of 0.98 was accompanied by a recall of 0.96, a precision of 0.95, and an F1-score of 0.955, underscoring its strong and balanced performance.
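As a quick consistency check on the reported figures, the F1-score is the harmonic mean of precision and recall. The short sketch below (the function name is hypothetical) reproduces the reported value from the stated precision and recall.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

f1 = f1_score(precision=0.95, recall=0.96)
print(round(f1, 3))  # 0.955
```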

These tools are especially useful for managing high-risk populations, such as patients with obesity, resistant hypertension, or heart failure, in whom early identification may substantially reduce morbidity and mortality (6, 8, 20). Furthermore, continuous HRV monitoring allows longitudinal measurement of disease progression and treatment response, making it a key component in assessing the efficacy of interventions such as CPAP therapy. Conventional follow-up often lacks such physiological feedback; HRV analysis could close this gap and support personalized, dynamic care pathways. From a public health standpoint, early and easy detection of sleep apnea could reduce the burden on healthcare systems by preventing downstream comorbidities and hospitalizations, and could improve quality of life (QoL) for millions of undiagnosed patients (8, 23).

Furthermore, as HRV can be monitored by commercially available ECG or photoplethysmography (PPG) sensors, it is economically viable for broad usage. This work constitutes a meaningful step forward in the sleep medicine literature by providing HRV with a high level of validity as a stand-alone, interpretable, and deployable biosignal in the context of sleep apnea. Its applications range from clinical to technological to public health, all of which support more inclusive, efficient, and patient-centric care models.

Prospective studies of 24-h HRV profiles may provide a better understanding of the chronic autonomic load attributable to sleep apnea. Moreover, although this study employed one-way ANOVA and machine learning models for feature selection and classification, more advanced approaches (e.g., mixed-effects models or deep learning) may be more robust, especially in accounting for intra-subject variability and the temporal dynamics of the data (34, 35). The effect of clinical interventions such as continuous positive airway pressure (CPAP) therapy on HRV metrics was not evaluated. Future work could examine HRV changes before and after treatment, since HRV may also serve as a non-invasive tool for evaluating treatment adherence. Finally, while machine learning models such as XGBoost achieved high classification performance (AUC = 0.98), external validation on independent clinical datasets is warranted before deployment in clinical practice. Model performance also needs to be assessed in real time in wearable applications, as noise and signal quality can drastically affect reliability.

While this study demonstrates the potential of HRV features extracted from ECG signals for detecting autonomic dysfunction during sleep apnea, several limitations warrant discussion. First, the analysis was conducted on a relatively small, demographically homogeneous cohort (n = 18) from a single open-source database (PhysioNet Apnea-ECG). This may limit the generalizability of the findings to broader clinical populations with diverse age, race, and comorbidity profiles. Future studies should incorporate larger, multicenter datasets to validate the robustness and applicability of HRV-based apnea detection across diverse clinical environments.

Second, the analysis relied solely on single-lead ECG-derived HRV features, which, while practical for wearable implementation, may not capture the full complexity of cardiorespiratory interactions. Future research should consider integrating multimodal physiological signals, such as respiratory effort, oxygen saturation, and photoplethysmography (PPG), to enhance detection sensitivity and specificity. Incorporating these modalities may provide a more comprehensive assessment of sleep-disordered breathing and improve clinical utility (15, 25).

Third, while machine learning models (e.g., XGBoost) demonstrated high classification performance (AUC = 0.98), the models have not yet been validated on independent external datasets or under real-world wearable conditions where signal quality and noise may impact performance. Future work should include prospective validation on independent cohorts and real-time deployment tests on wearable platforms to evaluate computational efficiency, inference latency, and robustness under various conditions.

Additionally, the current binary classification approach (apnea vs. non-apnea) does not capture the severity spectrum of sleep apnea. Future studies should explore severity stratification using HRV and multimodal signals to provide clinically actionable insights aligned with apnea-hypopnea index (AHI) categories. Longitudinal HRV monitoring should also be investigated to evaluate treatment response and disease progression, particularly in patients undergoing CPAP therapy or other interventions.
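To indicate what an AHI-aligned output could look like, the sketch below maps an apnea-hypopnea index to a severity label. The cut-offs used (5, 15, and 30 events per hour) are the conventional adult thresholds and are an assumption here, since the article does not state them; the function name is hypothetical.

```python
def ahi_severity(ahi):
    """Map an apnea-hypopnea index (events per hour) to a severity label.

    Thresholds are the conventional adult cut-offs (assumed, not from
    the article): <5 normal, 5 to <15 mild, 15 to <30 moderate, >=30 severe.
    """
    if ahi < 0:
        raise ValueError("AHI cannot be negative")
    if ahi < 5:
        return "normal"
    if ahi < 15:
        return "mild"
    if ahi < 30:
        return "moderate"
    return "severe"
```

A severity-aware model would predict one of these ordered classes per recording rather than a per-epoch binary label.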

Lastly, while traditional statistical methods and machine learning models were employed in this study, advanced analytical approaches such as sequence-based deep learning and mixed-effects models could better capture intra-subject variability and temporal dependencies inherent in physiological signals. Future research should incorporate explainable AI (XAI) frameworks to enhance transparency and clinical interpretability of automated decisions, fostering trust in HRV-based screening tools for sleep medicine.

By addressing these limitations, future work can advance the development of real-time, multimodal, and wearable systems for sleep apnea detection and monitoring, supporting the integration of HRV-based diagnostics into personalized and scalable digital sleep health solutions.

5 Conclusion

This study has shown that HRV features extracted from ECG recordings constitute a non-invasive tool for sensing autonomic activity in sleep apnea (SA). Based on an extensive pool of time-domain, frequency-domain, and nonlinear HRV features, we found significant decreases in HF, RMSSD, and SampEn and significant increases in VLF and LF/HF during apnea compared with non-apnea states. These alterations are consistent with the central pathophysiological change, i.e., a shift toward sympathetic dominance and diminished cardiovascular complexity during apnea. Further, the high classification performance of the ML algorithms (AUC = 0.98 for XGBoost), combined with the statistical analysis, not only confirms the importance of HRV features (SampEn and VLF in particular) in separating apnea from non-apnea states but also highlights their clinical relevance. The research also indicated that nonlinear parameters are more sensitive to subtle autonomic disturbances not detected by standard HRV parameters. Using public ECG databases and common analysis methods, the results advocate for HRV-based diagnostics as a feasible and affordable alternative to conventional polysomnography. This is especially beneficial for resource-constrained environments and provides a basis for future real-time sleep apnea detection and monitoring in wearable health devices. Although the generalizability of its outcomes to clinical populations is limited, this study contributes to the development of a digital sleep medicine framework by demonstrating the measurement validity of HRV as a physiological marker and practical tool for apnea detection, lending support to emerging data-driven and patient-centered solutions for sleep health management.

Data availability statement

This study used data from the publicly available Apnea-ECG database on PhysioNet (26, 27). This data can be found here: https://physionet.org/content/apnea-ecg/1.0.0/.

Ethics statement

Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. Written informed consent from the patients/participants or patients/participants’ legal guardian/next of kin was not required to participate in this study in accordance with the national legislation and the institutional requirements. The data were anonymized prior to public release, and the study adhered to the ethical standards of the Declaration of Helsinki.

Author contributions

EA: Writing – original draft, Formal analysis, Visualization, Validation, Investigation, Software, Supervision, Writing – review & editing, Conceptualization.

Funding

The author(s) declare that financial support was received for the research and/or publication of this article. This project was funded by the KAU Endowment (WAQF) at King Abdulaziz University, Jeddah, with additional support from the Deanship of Scientific Research (DSR).

Conflict of interest

The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Generative AI statement

The authors declare that no Gen AI was used in the creation of this manuscript.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

1. Koehler, U. Obstructive sleep apnea. J Thorac Dis. (2020) 46

2. Ho, ML, and Brass, SD. Obstructive sleep apnea. Neurol Int. (2011) 3:15. doi: 10.4081/ni.2011.e15

3. Motamedi, KK, McClary, JM, Amedee, AL, and Papanicolaou, DL. Obstructive sleep apnea: a growing problem. Ochsner J. (2009) 9:149–53. doi: 10.1378/chest.11-2304

4. Pillar, G, and Shehadeh, N. Abdominal fat and sleep apnea. Diabetes Care. (2008) 31:S303–9. doi: 10.1164/rccm.201001-0137OC

5. Dempsey, JA, Veasey, SC, Morgan, BJ, and O'Donnell, CP. Pathophysiology of sleep apnea. Physiol Rev. (2010) 90:47–112. doi: 10.1152/physrev.00043.2008

6. Azarbarzin, A, Sands, SA, Taranto-Montemurro, L, Vena, D, Butters, MA, Connor, J, et al. The sleep apnea cardiovascular endpoints (SAVE) study. Sleep. (2021) 44:zsab309. doi: 10.1093/sleep/zsab309

7. Dewan, NA, Nieto, FJ, and Somers, VK. Intermittent hypoxemia and OSA: implications for comorbidities. Chest. (2015) 147:266–74. doi: 10.1378/chest.14-0500

8. Penzel, T, Schöbel, F, and Bachmann, J. Modulations of heart rate, ECG, and cardio-respiratory coupling observed in polysomnography. Front Physiol. (2016) 7:460. doi: 10.3389/fphys.2016.00460

9. Singh, N, Moneghetti, D, Christle, A, Hadley, D, Froelicher, V, and Ashley, E. Heart rate variability: an old metric with new meaning. Arrhythm Electrophysiol Rev. (2018) 7:193–8. doi: 10.1371/journal.pone.0161600

10. Clifford, G. D. Signal processing methods for heart rate variability (doctoral dissertation, Massachusetts Institute of Technology). (2002) Available online at: https://dspace.mit.edu/handle/1721.1/16608

11. Xie, J, Zhu, Y, and Sun, M. Correlation between OSAS and HRV. Iran J Public Health. (2017) 46:1502–11. doi: 10.1007/s00330-019-06483-1

12. Kim, YS, Park, JH, and Kim, SH. Clinical implication of HRV in OSAS patients. J Craniofac Surg. (2015) 26:1592–5. doi: 10.1097/SCS.0000000000001782

13. Seo, H, Lee, J, Kim, D, and Park, KS. Deep learning-based estimation of sleep apnea severity using heart rate variability. IEEE J Biomed Health Inform. (2023) 27:1024–33. doi: 10.1161/CIRCEP.118.006587

14. Sequeira, VCC, Tobaldini, AK, Silva, AK, and Silva, ML. Heart rate variability in adults with obstructive sleep apnea: a systematic review. Sleep Sci. (2019) 12:214–21. doi: 10.5935/1984-0063.20190098

15. Arikawa, T, Inagaki, Y, Hida, K, and Hoshikawa, Y. Use of a wearable heart rate sensor to identify OSA. J Clin Med. (2020) 9:3359. doi: 10.3390/jcm9103359

16. Bianchi, S, D’Agostino, F, and Cerri, M. Sleep apnea and autonomic dysfunction. Clin Auton Res. (2010) 20:295–302. doi: 10.1109/TBME.2005.855725

17. Tarvainen, M., Niskanen, J., and Lipponen, P. Kubios HRV user’s guide. University of Eastern Finland. (2018). Available online at: https://www.kubios.com/downloads/Kubios_HRV_Users_Guide.pdf (Accessed January, 15, 2025)

18. Crespo, A, Miró, R, Gomis, JJ, and Pericás, E. Nonlinear analysis of HRV in SAHS patients. Sleep Med. (2013) 14:47–54. doi: 10.1016/j.sleep.2012.10.004

19. Chang, JS, Lai, M, Lin, Y, and Chen, W. Enhanced cardiorespiratory coupling in OSA. Sleep Med. (2013) 14:1132–8. doi: 10.1109/ACCESS.2019.2909713

20. Romero-Corral, A, Caples, SM, Lopez-Jimenez, F, and Somers, VK. Obesity and obstructive sleep apnea. Chest. (2010) 137:711–9. doi: 10.1378/chest.09-0360

21. Stein, PK, Pu, Y, Rho, M, and Beller, GA. Heart rate variability: a measure of cardiac autonomic tone. Am Heart J. (1994) 127:1376–81. doi: 10.1016/0002-8703(94)90059-0

22. Pan, J, and Tompkins, WJ. A Real-Time QRS Detection Algorithm. IEEE Transact Biomed Engg. (1985) 32:230–36.

23. Bonsignore, MR. Sleep apnea and transportation safety. F1000Res. (2017) 6:902. doi: 10.12688/f1000research.11766.1

24. Li, Q, Cui, C, Li, S, Peng, W, Sun, L, Zhu, Y, et al. Deep learning for noninvasive diagnosis of sleep apnea using single-lead ECG. J Transl Med. (2021) 19:422. doi: 10.1186/s12967-021-02971-0

25. Zhang, WT, Lee, JM, and Liu, YH. Sleep apnea detection using single-lead ECG. PeerJ. (2019) 7:e7732. doi: 10.7717/peerj.7732

26. Allessie, M, Muller, HR, and Clifford, GD. PhysioNet: tools for physiologic signal research. Circulation. (2000) 101:e215–20. doi: 10.1016/j.compbiomed.2018.05.022

27. Moody, GB, and Mark, RG. The apnea-ECG database. Comput Cardiol. (2000) 27:255–8. doi: 10.1016/j.compbiomed.2019.103529

28. Shaffer, F, and Ginsberg, JP. An overview of HRV metrics and norms. Front Public Health. (2017) 5:258. doi: 10.3389/fpubh.2017.00258

29. Acharya, UR, Joseph, KP, Kannathal, N, Lim, CM, and Suri, JS. Heart rate variability: a review. Med Biol Eng Comput. (2006) 44:1031–51. doi: 10.1007/s11517-006-0119-0

30. Tsai, W, Hsu, Y, and Lin, C. Automatic sleep apnea detection using HRV and machine learning. IEEE Trans Biomed Eng. (2018) 65:1016–24. doi: 10.1016/j.compbiomed.2017.11.013

31. Goldberger, AL, Amaral, LAN, Glass, L, Hausdorff, JM, Ivanov, PC, Mark, RG, et al. PhysioBank, PhysioToolkit, and PhysioNet. Circulation. (2000) 101:e215–20. doi: 10.1161/01.CIR.101.23.e215

32. Task force of the European Society of Cardiology and the north American Society of Pacing and Electrophysiology. Heart rate variability: standards of measurement, physiological interpretation, and clinical use. Eur Heart J. (1996) 17:354–81. doi: 10.1093/oxfordjournals.eurheartj.a014868

33. Faust, M, Grimm, A, and Gharabaghi, A. Comparison of classical and machine learning methods for HRV-based apnea detection. Comput Biol Med. (2018) 102:167–75. doi: 10.1016/j.compbiomed.2018.09.004

34. Attar, ET. Improved HRV analysis in ECG data: a comparative study using MATLAB code, kubios, and gHRV. J King Abdulaziz University: Engg Sci. (2024) 34. doi: 10.4197/Eng.34-1.6

35. Attar, ET. The consequences of eye tracking on brain and heart coherence. Multimed Tools Appl. (2024) 83:86025–35. doi: 10.1007/s11042-024-19212-w

Keywords: sleep apnea, heart rate variability, machine learning, nonlinear dynamics, autonomic nervous system, ECG, wearable diagnostics

Citation: Attar ET (2025) Detailed evaluation of sleep apnea using heart rate variability: a machine learning and statistical method using ECG data. Front. Neurol. 16:1636983. doi: 10.3389/fneur.2025.1636983

Received: 28 May 2025; Accepted: 07 July 2025;
Published: 10 September 2025.

Edited by:

Xu Zhai, China Academy of Chinese Medical Sciences, China

Reviewed by:

Jia Cheng Shui, China Academy of Chinese Medical Sciences, China
Young-Kyoon Suh, Kyungpook National University, Republic of Korea

Copyright © 2025 Attar. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Eyad Talal Attar
