Longer and Deeper Desaturations Are Associated With the Worsening of Mild Sleep Apnea: The Sleep Heart Health Study

Study Objectives: Obesity, older age, and male sex are recognized risk factors for sleep apnea. However, it is unclear whether the severity of hypoxic burden, an essential feature of sleep apnea, is associated with the risk of sleep apnea worsening. Thus, we investigated our hypothesis that the worsening of sleep apnea is expedited in individuals with more severe desaturations.
Methods: The blood oxygen saturation (SpO2) signals of 805 Sleep Heart Health Study participants with mild sleep apnea [5 ≤ oxygen desaturation index (ODI) < 15] were analyzed at baseline and after a mean follow-up time of 5.2 years. Linear regression analysis, adjusted for relevant covariates, was utilized to study the association between baseline SpO2-derived parameters and change in sleep apnea severity, determined by a change in ODI. SpO2-derived parameters, consisting of ODI, desaturation severity (DesSev), desaturation duration (DesDur), average desaturation area (avg. DesArea), and average desaturation duration (avg. DesDur), were standardized to enable comparisons between the parameters.
Results: In the group consisting of both men and women, avg. DesDur (β = 1.594, p = 0.001), avg. DesArea (β = 1.316, p = 0.004), DesDur (β = 0.998, p = 0.028), and DesSev (β = 0.928, p = 0.040) were significantly associated with sleep apnea worsening, whereas ODI was not (β = −0.029, p = 0.950). In sex-stratified analysis, avg. DesDur (β = 1.987, p = 0.003), avg. DesArea (β = 1.502, p = 0.024), and DesDur (β = 1.374, p = 0.033) were significantly associated with sleep apnea worsening in men.
Conclusion: Longer and deeper desaturations are more likely to expose a patient to the worsening of sleep apnea. This information could be useful in the planning of follow-up monitoring or lifestyle counseling in the early stage of the disease.


INTRODUCTION
Sleep apnea is a common nocturnal breathing disorder in which breathing is interrupted numerous times during sleep. These interruptions are usually associated with transient drops in blood oxygen saturation (SpO2) and/or arousals from sleep. The apnea-hypopnea index (AHI) is the most widely used parameter in sleep apnea diagnostics and is derived from polysomnography (PSG) (Kapur et al., 2017). However, PSG is labor-intensive and expensive, and experienced technicians are required to set up and monitor patients during in-laboratory PSGs or to instruct patients using in-home PSG equipment. Therefore, alternative recording setups containing fewer channels have been developed (Krishnaswamy et al., 2015). The oxygen desaturation index (ODI), determined from the SpO2 signal, could be used as an alternative parameter in sleep apnea screening (AASM, 1999). The ODI is a good predictor of the AHI because the two are highly correlated (Tsai et al., 1999). Furthermore, the AHI and ODI can be accurately determined from the SpO2 signal using neural networks (Nikkonen et al., 2019). Therefore, screening of sleep apnea and monitoring of disease progression could be based on a simple and low-cost pulse oximetry measurement.
Male sex, obesity, and older age are known risk factors for sleep apnea (Young et al., 1993; Young et al., 2004). In addition, neck circumference (NC) (Ahbab et al., 2013; Caffo et al., 2010) and the neck circumference/height ratio (NC/H) (Davies et al., 1992; Ho et al., 2016) are independent risk factors for sleep apnea. However, assessing the risk of sleep apnea progression by body mass index (BMI), NC, age, snoring, and/or upper airway structure is challenging. For example, both positive (Tishler et al., 2003; Lin et al., 2015) and negative (Sforza et al., 1994; Pendlebury et al., 1997; Berger et al., 2009) results on whether the BMI is a factor for sleep apnea worsening have been reported. Similarly, there are conflicting results about whether higher baseline ODI values are associated with an expedited worsening of sleep apnea (Sforza et al., 1994; Lin et al., 2015).
Patients with mild sleep apnea are not systematically treated, especially if symptomless, despite being the most prone to the worsening of the disease (Sforza et al., 1994; Berger et al., 2009). Moreover, even though the severities of individual respiratory events vary within mild sleep apnea patients (Kulkas et al., 2013a,b) and are generally associated more strongly with severe health consequences than the AHI (Muraja-Murro et al., 2013, 2014), the severity of individual events is ignored in current sleep apnea diagnostics. To address these shortcomings, we have introduced novel SpO2-derived parameters (Kulkas et al., 2013a) to quantify the severity of the hypoxic burden and physiological stress experienced by a patient. An elevated hypoxic burden has been associated with several sleep apnea-related comorbidities (Stone et al., 2016; Azarbarzin et al., 2019, 2020), while the AHI and ODI have not. Therefore, novel parameters considering the severities of individual desaturation events could describe the true severity of sleep apnea better than the AHI and ODI (Otero et al., 2012; Kulkas et al., 2013a). However, it is unknown whether mild sleep apnea patients with deep and long desaturations have an elevated risk of expedited worsening of the disease. We hypothesize that mild sleep apnea patients with severe desaturations at baseline experience an expedited worsening of sleep apnea severity. To investigate this, we evaluated the effect of baseline hypoxemia markers on the progression of mild sleep apnea in 805 Sleep Heart Health Study participants.

MATERIALS AND METHODS

Dataset
The Sleep Heart Health Study (SHHS) is a multicenter cohort study implemented by the National Heart, Lung, and Blood Institute to determine the consequences of sleep-disordered breathing, such as cardiovascular diseases (CVD). The SHHS dataset is available through the National Sleep Research Resource (Quan et al., 1997; Zhang et al., 2018; The National Sleep Research Resource, 2021). Participants were recruited from nine existing parent cohort studies and provided informed consent for data collection. Between 1995 and 1998, a successful baseline PSG examination was performed for 6,441 participants who met the following inclusion criteria: (1) age ≥40 years; (2) no history of sleep apnea treatment; (3) no tracheostomy; and (4) no current home oxygen therapy. The follow-up PSG was performed between 2001 and 2003 for 3,295 participants who had not been treated for sleep apnea with continuous positive airway pressure, an oral device, or oxygen therapy in the 3 months prior to the follow-up PSG. Due to sovereignty issues with one of the parent studies (Strong Heart Study), data from approximately 600 participants are not available. Moreover, due to data corruption over time, data from a few participants have been lost. Therefore, 5,793 baseline and 2,651 follow-up PSGs are available; out of these, 2,647 participants have both recordings available. More details on the SHHS dataset are available elsewhere (Quan et al., 1997; Redline et al., 1998; Dean et al., 2016).

Polysomnography and Covariates
In-home PSGs were performed with Compumedics P-series portable monitors (Abbotsford, Australia) (Quan et al., 1997; Redline et al., 1998). Finger pulse oximeters (Nonin XPOD model 3011, Minneapolis, MN, United States) were used to record SpO2 with a 1-Hz sampling frequency. Mercury gauge sensors were used to record the body position during sleep. Total sleep time was determined based on 30-s epochs in which the sleep stage was scored as non-rapid eye movement sleep (N1, N2, or N3) or rapid eye movement sleep (REM).
Each PSG recording was supplemented with a sleep habits questionnaire, medical history, medication usage, blood pressure, and anthropometric measurements. NC was measured just below the laryngeal prominence. Hypertension was defined as systolic blood pressure ≥140 mmHg, diastolic blood pressure ≥90 mmHg, or the use of medication for hypertension. At the medical history interview, participants were asked about their history of CVD, consisting of myocardial infarction, heart failure, stroke, coronary angioplasty, and coronary artery bypass graft. In addition, diabetes was defined based on self-reported diabetes status and usage of insulin or oral hypoglycemic agents.
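The hypertension covariate described above is a simple three-way rule, and a minimal sketch of it may make the definition easier to check. The function and argument names below are hypothetical and are not taken from the SHHS data dictionary.

```python
def has_hypertension(systolic_mmhg: float, diastolic_mmhg: float,
                     on_hypertension_medication: bool) -> bool:
    """Return True if any of the three criteria stated in the text is met."""
    return bool(
        systolic_mmhg >= 140            # systolic blood pressure criterion
        or diastolic_mmhg >= 90         # diastolic blood pressure criterion
        or on_hypertension_medication   # medication for hypertension in use
    )
```

For example, a participant with a blood pressure of 128/92 mmHg and no medication would be flagged, because the diastolic criterion alone is sufficient.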

Oxygen Desaturation Parameters
Oxygen saturation signals were reanalyzed due to known issues of data corruption and loss of scored event data in the SHHS (The National Sleep Research Resource, 2021). To improve data consistency, desaturations were automatically re-scored using Noxturnal software (version 5.1.19824, Nox Medical, Reykjavík, Iceland). The scoring criteria for desaturations were: (1) minimum of 3% drop in the SpO2 signal; (2) minimum event duration of 3 s; (3) maximum plateau duration of 45 s; and (4) values lower than 50% were considered artifacts (no desaturations were scored in these parts of the signal). The maximum plateau duration denotes the maximum period within the desaturation event during which the SpO2 signal values do not change. If this period is exceeded, the end point of the desaturation is determined to be the starting point of the plateau. It was observed that the software systematically started automatic event scorings one data point too early, and thus, this was corrected in the parameter calculations. To validate the accuracy of the automatic scorings, 30 SpO2 signals were randomly selected from the available SHHS dataset of 8,444 recordings and scored manually. Correlations and Bland-Altman plot agreements between manual and automatic scorings of the desaturation events were calculated. In addition to the ODI, novel SpO2 signal-based parameters consisting of desaturation severity (DesSev), desaturation duration (DesDur), average desaturation duration (avg. DesDur), and average desaturation area (avg. DesArea) were calculated (Table 1; see Kulkas et al., 2013a). These parameters describe the hypoxic burden by taking into account the duration and depth of the desaturation events.
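To make the relationship between the five parameters concrete, the following is a minimal Python sketch of how they could be computed from already-scored desaturation events. The formulas of Table 1 are not reproduced in this text, so the normalizations used here are assumptions: ODI as events per hour of total sleep time (TST), DesSev as summed desaturation area divided by TST, DesDur as summed desaturation duration as a percentage of TST, and the two averages taken per event. The reference level for the area (the SpO2 value at the event start, t1) and the function name are also assumptions.

```python
from typing import Dict, List, Sequence, Tuple


def desaturation_parameters(
    spo2: Sequence[float],
    events: List[Tuple[int, int]],
    tst_seconds: float,
    fs_hz: float = 1.0,
) -> Dict[str, float]:
    """Compute ODI and four severity parameters from scored desaturations.

    `events` holds (start_index, end_index) sample pairs, i.e. t1 and t2.
    All normalizations are assumptions (see the lead-in paragraph).
    """
    dt = 1.0 / fs_hz  # seconds per sample (1 s at a 1-Hz sampling frequency)
    durations = []    # event durations in seconds
    areas = []        # event areas in seconds x percentage points
    for start, end in events:
        reference = spo2[start]  # assumed reference level: SpO2 at t1
        durations.append((end - start) * dt)
        # Rectangle-rule area between the reference level and the signal.
        areas.append(
            sum(max(reference - spo2[i], 0.0) for i in range(start, end + 1)) * dt
        )
    n_events = len(events)
    return {
        "ODI": n_events / (tst_seconds / 3600.0),        # events per hour
        "DesSev": sum(areas) / tst_seconds,              # percentage points
        "DesDur": 100.0 * sum(durations) / tst_seconds,  # % of TST
        "avg_DesArea": sum(areas) / n_events if n_events else 0.0,
        "avg_DesDur": sum(durations) / n_events if n_events else 0.0,
    }
```

Under these assumptions, two recordings with the same ODI can differ markedly in DesSev and avg. DesArea, which is the property the parameters are intended to capture.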

Sleep Apnea Severity Classification
Table 1 notes: n_desat events is the number of desaturation events and TST is the total sleep time. t1 and t2 denote the start and end time points of a single desaturation event, respectively, in the SpO2 signal. ODI, oxygen desaturation index; DesSev, desaturation severity parameter; DesDur, desaturation duration parameter; avg. DesArea, average area of individual desaturation events; avg. DesDur, average duration of individual desaturation events.
The severity of sleep apnea was determined based on the ODI 4% criterion for several reasons. First, only desaturation events fulfilling the minimum transient drop of 4% were included in the analysis, as the 4% criterion was considered more reliable than the 3% criterion because the desaturations were scored automatically and separately from respiratory events. Second, the ODI is known to be a good predictor of AHI (Tsai et al., 1999; Chung et al., 2012; Fabius et al., 2019). Third, apneas and hypopneas were originally scored based on the thermistor, respiratory belts, or some combination of them (The National Sleep Research Resource, 2021). Therefore, the scored respiratory events are not in line with the current standards. In addition, the hypoxic burden is an important feature of sleep apnea pathophysiology (Dempsey et al., 2010), and thus, the usage of ODI in the assessment of sleep apnea severity can be justified. In the present study, the term "progression" refers to a change in ODI (either an increase or a decrease) between the two PSG recordings, whereas "worsening" refers to an increase in ODI.
Out of the 2,647 participants with both PSG recordings, 832 had mild sleep apnea (5 ≤ ODI < 15) at baseline, of whom 27 were excluded due to missing covariate data. Therefore, 805 participants (441 men and 364 women) were included in further analyses (Table 2).
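The ODI-based severity categories and the distinction between "progression" and "worsening" can be summarized in a short sketch. The thresholds are those stated in the text (5, 15, and 30 events per hour); the function names and the returned dictionary layout are hypothetical.

```python
def classify_severity(odi: float) -> str:
    """Map an ODI value (events per hour) to a severity category."""
    if odi < 5:
        return "healthy"
    if odi < 15:
        return "mild"
    if odi < 30:
        return "moderate"
    return "severe"


def describe_progression(baseline_odi: float, followup_odi: float) -> dict:
    """Summarize the change in ODI between the two PSG recordings."""
    change = followup_odi - baseline_odi  # "progression": either direction
    return {
        "change_in_odi": change,
        "worsened": change > 0,           # "worsening": an increase in ODI
        "baseline_class": classify_severity(baseline_odi),
        "followup_class": classify_severity(followup_odi),
    }
```

A participant moving from an ODI of 8.0 to 17.5 would thus be described as having worsened from mild to moderate sleep apnea, with a change in ODI of 9.5.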

Statistical Analysis
The statistical significance of the differences in the demographic and desaturation parameters between baseline and follow-up was evaluated within men and women using the Wilcoxon signed-rank test, and between men and women with the Mann-Whitney U test and the Chi-squared test for continuous and categorical variables, respectively. Linear regression was used to investigate the association between the baseline desaturation parameters and the progression of mild sleep apnea with and without covariate adjustment. Change in the ODI between the PSG recordings was used as a continuous dependent variable. Baseline BMI, change in BMI during the follow-up, age, NC/H, the existence of hypertension, diabetes, and CVD, percentage of time slept in the supine position, percentage of time slept in REM, change in the time slept in REM between the PSGs, and follow-up time were used as covariates in the adjusted models. Desaturation parameters at baseline were standardized to enable comparisons between parameters. Thus, regression coefficients (β values) correspond to the expedited increase in ODI between the PSG recordings that was associated with a one standard deviation (SD) change in the desaturation parameter values at baseline. In addition, we investigated whether the desaturation parameter values at baseline differed between the participants whose sleep apnea severity remained in the healthy-to-mild state (i.e., ODI < 15) and the participants whose disease worsened to moderate (15 ≤ ODI < 30) or severe (ODI ≥ 30) sleep apnea during the follow-up. Finally, to address the possibility of selection bias, we investigated whether there were differences in the baseline parameter values between the participants with mild sleep apnea who underwent only baseline PSG and those with both PSGs. Analyses were conducted in MATLAB (version 2018b, MathWorks, Natick, MA, United States).
To address multiple comparisons arising from the five investigated desaturation parameters, a Bonferroni-corrected p-value threshold of <0.01 was used to indicate statistical significance, whereas p-values < 0.05 were considered nominal evidence.
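The interpretation of the standardized β values and the two evidence thresholds can be illustrated with a minimal Python sketch. It covers only the unadjusted, single-predictor case using the standard library; the covariate-adjusted models would require a multivariable fit, and the original analyses were run in MATLAB. The function names are hypothetical.

```python
import statistics
from typing import Sequence


def standardize(values: Sequence[float]) -> list:
    """Convert values to z-scores (mean 0, sample SD 1)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def unadjusted_beta(baseline_param: Sequence[float],
                    change_in_odi: Sequence[float]) -> float:
    """Least-squares slope of ODI change on a standardized baseline parameter.

    The result is the change in ODI associated with a one-SD increase
    in the baseline parameter.
    """
    z = standardize(baseline_param)
    mean_y = statistics.fmean(change_in_odi)
    sxy = sum(zi * (yi - mean_y) for zi, yi in zip(z, change_in_odi))
    sxx = sum(zi * zi for zi in z)  # z already has mean 0
    return sxy / sxx


def evidence_level(p_value: float, n_tests: int = 5, alpha: float = 0.05) -> str:
    """Apply the Bonferroni threshold (alpha / n_tests) and the nominal one."""
    if p_value < alpha / n_tests:
        return "significant"
    if p_value < alpha:
        return "nominal"
    return "not significant"
```

Because every predictor is rescaled to unit SD before fitting, the β values of parameters with very different units (for example, events per hour versus seconds) become directly comparable.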

RESULTS
Linear regression analyses revealed that the baseline ODI was not associated with the worsening of mild sleep apnea in either the unadjusted or the adjusted model (Table 3). However, in men and in the group consisting of both sexes, all novel desaturation parameters were significantly (p < 0.05) associated with sleep apnea worsening in the unadjusted models. In the covariate-adjusted models for the group consisting of both sexes, avg. DesDur (p = 0.001) and avg. DesArea (p = 0.004) were significantly associated with sleep apnea worsening by fulfilling the Bonferroni-corrected threshold, while DesSev (p = 0.040) and DesDur (p = 0.028) reached nominal significance. Moreover, in men, avg. DesDur was associated with sleep apnea worsening at the Bonferroni-corrected threshold (p = 0.003), while avg. DesArea (p = 0.024) and DesDur (p = 0.033) reached nominal association. Overall, in men and in the group consisting of both sexes, a one SD unit increase in avg. DesDur was associated with the greatest expedited increase in ODI during the follow-up (i.e., largest β values), followed by avg. DesArea.
Men and women whose mild sleep apnea worsened to moderate sleep apnea during the follow-up had significantly higher ODI (p < 0.001 for men, p = 0.008 for women), DesSev (p < 0.001 for men, p < 0.001 for women), and DesDur (p < 0.001 for men, p < 0.001 for women) at baseline compared to the participants who remained in the healthy-to-mild state (Table 4). Similar findings were observed in men (p < 0.001 for ODI, DesSev, and DesDur) and women (p < 0.001 for ODI, p = 0.001 for DesSev, and p = 0.001 for DesDur) whose mild sleep apnea worsened to severe sleep apnea during the follow-up. In addition, avg. DesArea (p < 0.001) and avg. DesDur (p = 0.021) were significantly higher at baseline in women whose disease worsened to moderate sleep apnea. The only statistically significant difference between the participants who worsened to moderate sleep apnea and those who worsened to severe sleep apnea was observed in ODI (p = 0.039) in the group consisting of both sexes.
No statistically significant differences in the baseline desaturation parameters were observed between the participants with mild sleep apnea who underwent only baseline PSG and those who underwent both PSGs (Table 5).
The automatic scoring of the desaturation events agreed closely with the manual scoring. For all five desaturation parameters, the correlations between the manual and automatic scorings were excellent (ρ ≥ 0.94), the median differences in the parameter values were minimal (Table 6), and agreements in the parameter values were strong (Figure 1).
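For readers unfamiliar with Bland-Altman agreement, the quantities behind such a plot can be sketched in a few lines of Python. This is a generic illustration of the technique rather than the code used in the study, and the 1.96 SD limits of agreement are the conventional choice, assumed here.

```python
import statistics
from typing import Sequence, Tuple


def bland_altman(manual: Sequence[float],
                 automatic: Sequence[float]) -> Tuple[float, float, float]:
    """Return (bias, lower limit, upper limit) of agreement.

    The bias is the mean of the paired differences (automatic - manual);
    the limits are the bias -/+ 1.96 sample standard deviations.
    """
    diffs = [a - m for m, a in zip(manual, automatic)]
    bias = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```

A bias close to zero with narrow limits indicates that the automatic scoring neither systematically over- nor under-estimates a parameter relative to manual scoring.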

DISCUSSION
Table 3 notes: β values correspond to the expedited increase in ODI between the PSG recordings that is associated with a one standard deviation change in the desaturation parameter values at baseline. Standard deviations for men, women, and the group consisting of both sexes were, respectively: for ODI = 2.8, 2.7, and 2.7; for DesSev = 0.11, 0.11, and 0.12; for DesDur = 2.9, 2.8, and 2.9; for avg. DesArea = 29.6, 29.9, and 29.9; and for avg. DesDur = 6.8, 6.9, and 6.9. ODI, oxygen desaturation index; DesSev, desaturation severity parameter; DesDur, desaturation duration parameter; avg. DesArea, average area of individual desaturation events; avg. DesDur, average duration of individual desaturation events. a: Adjusted for age, body mass index, change in body mass index during the follow-up, neck circumference/height ratio, the existence of hypertension, diabetes, and cardiovascular diseases (consisting of heart failure, stroke, myocardial infarction, coronary artery bypass graft, and coronary angioplasty), percentage of time slept in the supine position, percentage of time slept in rapid eye movement sleep, change in rapid eye movement sleep between the polysomnography recordings, and follow-up time.
Table 4 (fragment): Men: [...].7 (7.1-12.0) a; 9.9 (7.5-12.2) a. Women: 7.5 (6.1-9.8); 8.5 (6.4-10.9) a; 9.8 (7.6-12.0) a. Both: 7.8 (6.2-10.0); 9.2 (6.8-11.7) a; 9.9 (7.5-12.2) ab.
Table 6 notes: For this analysis, 30 oxygen saturation signals were randomly selected from the Sleep Heart Health Study dataset of 8,444 available recordings. Values are presented as medians (interquartile range). Statistical significance (p < 0.05) of the observed difference between the scorings was investigated with the Wilcoxon signed-rank test (no statistically significant differences were observed). ODI, oxygen desaturation index; DesSev, desaturation severity parameter; DesDur, desaturation duration parameter; avg. DesArea, average area of individual desaturation events; avg. DesDur, average duration of individual desaturation events.
In this study, we investigated whether the desaturation parameters at baseline were associated with the worsening of mild sleep apnea. We provide novel evidence showing that especially avg. DesDur and avg. DesArea are associated with the expedited worsening of sleep apnea. Notably, the baseline ODI values did not appear to be associated with the worsening of mild sleep apnea. These findings suggest that a detailed analysis of the oxygen desaturation signal that considers the morphology of the desaturation events is relevant in the risk assessment of sleep apnea progression. More importantly, this study focused on patients with mild sleep apnea, as these patients are not systematically treated, especially when symptomless. Therefore, our results imply that mild sleep apnea patients with deeper and longer desaturation events might benefit from regular follow-up monitoring.
Previously, Lin et al. (2015) demonstrated that baseline ODI is a significant predictor of the worsening of sleep apnea, which is partially contradictory to our findings. On the one hand, both findings reflect the same biological concept, in which increased nocturnal hypoxemia is a predictor of disease worsening. However, based on the present results, the ODI alone might not be a robust marker for assessing the progression of mild sleep apnea; a more detailed morphological assessment of individual desaturation events could provide more accurate estimates. Furthermore, the opposing findings could be partly explained by differences in the study populations and the lengths of the follow-up periods. Our study population is considerably larger than that in the study by Lin et al. (n = 805 vs. n = 50), and we had a longer follow-up period (5.2 vs. 3 years). In addition, their study population consisted of patients who were suffering from more severe sleep apnea at baseline (ODI mean ± SD = 20.8 ± 13.4), whereas we focused only on patients with mild sleep apnea. Nevertheless, the present results are consistent with other previously reported findings (Sforza et al., 1994) with a similar follow-up period (5.7 years), but with a small population (n = 32) of patients with more severe sleep apnea (mean AHI = 52.2 at baseline).
Moreover, it has been shown that AHI does not change over time (mean follow-up period of 5.1 years) in severe sleep apnea patients, and that those whose AHI increased initially had mild or moderate disease (Berger et al., 2009). It was suggested (Berger et al., 2009) that this is due to the ceiling effect of sleep apnea. Another explanatory factor could be the "regression toward the mean" phenomenon, where initially extreme AHI values get closer to the mean at the follow-up measurement, and vice versa. Therefore, whether baseline ODI or AHI values can be used in the risk assessment of sleep apnea progression seems to depend strongly on the severity of sleep apnea; thus, our findings should be generalized with caution.
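The "regression toward the mean" argument can be made tangible with a small simulation. The sketch below is purely illustrative: the severity scale, the noise level, and all names are arbitrary assumptions and are not fitted to SHHS or any other data. Each simulated subject has a stable underlying severity, and two noisy measurements of it are taken.

```python
import random
import statistics
from typing import Tuple


def simulate_regression_to_mean(n: int = 10000, seed: int = 0) -> Tuple[float, float]:
    """Return mean first and second measurements for the most extreme 10%.

    The underlying severity does not change between measurements; only
    independent measurement noise (e.g., night-to-night variability) differs.
    """
    rng = random.Random(seed)
    true_severity = [rng.gauss(20.0, 5.0) for _ in range(n)]  # arbitrary units
    first = [t + rng.gauss(0.0, 5.0) for t in true_severity]
    second = [t + rng.gauss(0.0, 5.0) for t in true_severity]
    cutoff = statistics.quantiles(first, n=10)[-1]  # 90th percentile
    selected = [(f, s) for f, s in zip(first, second) if f >= cutoff]
    mean_first = statistics.fmean([f for f, _ in selected])
    mean_second = statistics.fmean([s for _, s in selected])
    return mean_first, mean_second
```

In such a simulation, subjects selected for an extreme first measurement show a lower average at the second measurement even though nothing about their underlying severity has changed, which is why apparent stability or improvement in initially severe patients should be interpreted carefully.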
We observed that mild sleep apnea patients with longer and deeper desaturations experience an expedited worsening of the disease. However, in the sex-stratified analyses, significant findings were observed in men when the ODI 4% criterion was used, whereas the associations were stronger in women with the ODI 3% criterion (Supplementary Material). In addition, we noted that men had more severe desaturations than women, which is supported by previous studies (Ware et al., 2000; Schwartz et al., 2008; Peppard et al., 2009). Therefore, it could be speculated that the 4% criterion might be too strict to assess the progression of sleep apnea severity in women. Furthermore, the ODI values increased, while the avg. DesArea decreased for both sexes during the follow-up. Thus, our findings suggest that individuals whose desaturation events are more severe at baseline go on to develop a larger number of events that are individually less severe. Azarbarzin et al. have also shown that the severity of hypoxic burden predicts cardiovascular mortality (Azarbarzin et al., 2019) and incident heart failure (Azarbarzin et al., 2020). Therefore, the nocturnal hypoxic burden seems to play a potential prognostic role in the worsening of sleep apnea and the development of related comorbidities.
The present study has limitations. First, the desaturations were autoscored using commercial software without further manual adjustment by specialists. However, the correlations and agreements between the subset of manual and automatic scorings were excellent and the median differences minimal (Table 6 and Figure 1). Therefore, we consider the automatic desaturation scoring method used here to be valid. New scoring was required since part of the manually scored desaturation events had been lost due to the SHHS data corruption over time (The National Sleep Research Resource, 2021). Moreover, there are currently no standardized criteria for scoring desaturation events beyond a minimum transient drop of 3 or 4% in the SpO2 signal. For example, the minimum or the maximum durations of the desaturation events are not specified in the rules of the American Academy of Sleep Medicine (AASM), unlike in the case of hypopneas and apneas (Berry et al., 2017). Furthermore, no instructions for the maximum duration of the plateau in the middle of the event exist. In this study, the minimum event duration was set to 3 s and the maximum plateau duration to 45 s based on visual inspection, in which these criteria were observed to be appropriate. However, no fine-tuning to obtain an optimized desaturation scoring criterion was performed. With a shorter plateau length, some of the events might not have fulfilled the minimum transient-drop criterion in the SpO2 signal, or one longer event might have been split into multiple shorter events. In contrast, with a longer plateau criterion, short events could fuse into a longer one. All these aspects affect the number (ODI) and severity (novel parameters) of the events and, therefore, the determined parameter values. Furthermore, we decided to use the 4% criterion for our primary analysis because, without associating desaturation events with respiratory events, the 3% criterion was assumed to be too sensitive.
The apnea-hypopnea index was not used for the severity categorization in this study because the scoring criteria have changed since the apneas and hypopneas were originally scored in the late 1990s and early 2000s. The biggest difference is in the channels used for hypopnea scoring. At the time the recordings were conducted, hypopneas could have been scored based on signals from the thermistor, respiratory belts around the thorax or abdominal region, or some combination of them (The National Sleep Research Resource, 2021). The current AASM recommendation for apnea scoring is an oronasal thermal airflow sensor, while a nasal pressure transducer is recommended for hypopnea scoring (Berry et al., 2017). Moreover, in addition to the desaturation events, the Noxturnal software is capable of scoring apneas and hypopneas automatically. However, the accuracy of the detection of hypopnea and apnea events without manual adjustment was found to be insufficient, and manual re-scoring of the massive SHHS dataset was not feasible. Furthermore, the oximetry used in the SHHS data acquisition differs from the current clinical recommendations. For example, the oximetry used in the SHHS utilized a sampling frequency of 1 Hz, while the current minimum recommendation for routine clinical recordings is 10 Hz (Berry et al., 2017). However, a sampling frequency of 1 Hz can be considered sufficient, as it has been shown not to affect the accuracy of sleep apnea detection (Nigro et al., 2011).
Using the ODI to characterize the severity of obstructive sleep apnea (OSA) has certain limitations. As the SpO2 is not a direct measure of breathing, the ODI cannot be used as a direct measure of the frequency of respiratory events. In addition, the ODI cannot distinguish between obstructive and central respiratory events. Furthermore, no standardized criteria exist to score desaturations, as discussed above. The ODI can also potentially underestimate the AHI, as hypopneas can be scored based on an association with either a desaturation or an arousal (Berry et al., 2017). In addition, apneas can be scored without desaturation or arousal, or multiple respiratory events can be associated with a single desaturation. However, using the ODI for the severity categorization is adequate, as it has been shown that only 6.3% of apneas are not associated with desaturations and 4.7% of desaturations are not associated with respiratory events (Fabius et al., 2019). Thus, misclassification due to unmatched events can be assumed to be minor.
Another limitation is the potential influence of night-to-night variability in the assessment of sleep apnea severity. It has been shown that there is significant intra-patient variability in the AHI between two consecutive nights (Roeder et al., 2020). However, no such night-to-night variability was observed at the group level (Roeder et al., 2020). Therefore, it is likely that such variations are averaged out in our relatively large study population. Finally, the study population was relatively old and could be enriched with participants with CVD, due to the study design of the SHHS, thus potentially causing selection bias.
Our present findings give a new perspective on the risk assessment of whether mild sleep apnea will worsen over time. The consideration of individual desaturation event severities could be an additional tool in the planning of regular follow-up monitoring, initiation of treatment, or lifestyle counseling. With these preventive actions, sleep apnea worsening could be slowed or prevented earlier. Adequate management of mild sleep apnea patients with severe desaturation events could further lower the risks of sleep apnea-related comorbidities and generally improve the quality of life. However, regular monitoring of all mild sleep apnea patients would be complicated and expensive with the current diagnostic methods (i.e., polysomnography). In addition, treating a large number of mild sleep apnea patients would be costly while not providing significant benefits for many of the patients, and would thus waste resources. Therefore, using novel SpO2-derived parameters in the risk assessment of sleep apnea worsening and in the planning of follow-up monitoring and interventions could be a practical approach. This would allow cost-efficient regular monitoring of sleep apnea using a simple pulse oximeter, which is included, e.g., in many consumer-grade wearable devices. This could also reduce the effect of night-to-night variability on sleep apnea severity estimation (Stöberl et al., 2017; Roeder et al., 2020), allowing more reliable diagnosis and prognosis.

CONCLUSION
The present results indicate that, in the risk assessment of mild sleep apnea worsening, the severity of the desaturation events is more useful than the exact number of the events. Based on the present findings, sleep apnea can be understood as a progressive disease, and many of the mild patients develop more severe disease in 5 years.

DATA AVAILABILITY STATEMENT
Publicly available datasets were analyzed in this study. These data can be found here: National Sleep Research Resource: https://sleepdata.org/datasets/shhs/. Access to the full Sleep Heart Health Study data was granted by the National Sleep Research Resource as a part of Mazzotti's proposal (agreement #2731).

ETHICS STATEMENT
Each patient/participant provided written informed consent and the study protocol was reviewed and approved by the institutional review boards of each participating site of the Sleep Heart Health Study (SHHS). Participating institutions in the Sleep Heart Health Study are (https://sleepdata.org/datasets/shhs/pages/full-description.md): Boston University, Case Western Reserve University, Johns Hopkins University, Missouri Breaks Research, Inc., New York University Medical Center, University of Arizona, University of California at Davis, University of Minnesota - Clinical and Translational Science Institute, University of Washington.

AUTHOR CONTRIBUTIONS
TK contributed to the study design, data analysis, and interpretation, and wrote the manuscript. SM contributed to the study design and writing of the manuscript. SN contributed to the writing of the manuscript and data analysis. DM contributed to data interpretation and writing of the manuscript. JT and TL contributed to the study design, data interpretation, and writing of the manuscript. All authors contributed to the article and approved the submitted version.

FUNDING
This work was funded by the Academy of Finland (decision numbers 313697 and 323536), the Research Committee of the Kuopio University Hospital Catchment Area for the State Research Funding (projects 5041767, 5041768, 5041770, 5041781, 5041787, 5041794, and 5041797), Business Finland (decision number 5133/31/2018), The Research Foundation of the Pulmonary Diseases, Tampere Tuberculosis Foundation, Orion Research Foundation, Instrumentarium Science Foundation, Finnish Anti-Tuberculosis Association, Päivikki and Sakari Sohlberg Foundation, and The Finnish Cultural Foundation-North Regional Fund. This manuscript was supported by the American Academy of Sleep Medicine Foundation and the NHLBI (P01HL094307).