Phenotype-Based and Self-Learning Inter-Individual Sleep Apnea Screening With a Level IV-Like Monitoring System

Purpose: We propose a phenotype-based artificial intelligence system that can self-learn and is accurate for screening purposes, and test it on a Level IV-like monitoring system. Methods: Based on physiological knowledge, we hypothesize that the phenotype information will allow us to find subjects from a well-annotated database that share similar sleep apnea patterns. Therefore, for a new-arriving subject, we can establish a prediction model from the existing database that is adaptive to the subject. We test the proposed algorithm on a database consisting of 62 subjects with the signals recorded from a Level IV-like wearable device measuring the thoracic and abdominal movements and the SpO2. Results: With the leave-one-subject-out cross validation, the accuracy of the proposed algorithm to screen subjects with an apnea-hypopnea index greater than or equal to 15 is 93.6%, the positive likelihood ratio is 6.8, and the negative likelihood ratio is 0.03. Conclusion: The results confirm the hypothesis and show that the proposed algorithm has potential to screen patients with SAS.


METRIC DESIGN
In this section, we detail the phenotype-based metric designed to determine the similarity between subjects, and hence the modified KNN scheme. We consider the commonly available phenotype information for each subject, including gender, age, and body-mass index (BMI). It is well known that gender, age and BMI are closely related to the sleep apnea pattern and severity [1,2]. Gender is saved as 0 or 1 to indicate male or female, respectively. While there is no ground truth guiding this similarity, we rely on the physician's clinical experience to design the metric. We weight the standardized Euclidean distances between the gender, BMI and age by 4, 2 and 1 to form the phenotype distance between subjects; that is, we view gender as the most important parameter, followed by BMI and then age. In clinics, gender, BMI and age are not the only parameters considered in relation to the SAS severity. The comorbidities of hypertension, diabetes, and hypothyroidism are all relevant.
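As an illustration of the phenotype distance described above, the following minimal sketch combines the three standardized differences with the weights 4, 2 and 1. The text does not spell out how the weighted terms are combined, so the square root of the weighted sum of squares used here is an assumption, as are the dictionary keys, the function names, and the use of the population standard deviation over the database for standardization.

```python
import math
from statistics import pstdev

# Weights stated in the text: gender 4, BMI 2, age 1.
WEIGHTS = {"gender": 4.0, "bmi": 2.0, "age": 1.0}

def feature_scales(subjects):
    """Standard deviation of each phenotype over the database (assumed scaling).

    A zero standard deviation is replaced by 1.0 to avoid division by zero.
    """
    return {k: pstdev([s[k] for s in subjects]) or 1.0 for k in WEIGHTS}

def phenotype_distance(a, b, scales):
    """Weighted standardized Euclidean distance between two subjects.

    Assumption: the weights multiply the squared standardized differences.
    """
    total = 0.0
    for key, weight in WEIGHTS.items():
        diff = (a[key] - b[key]) / scales[key]
        total += weight * diff * diff
    return math.sqrt(total)
```

With this form, a gender mismatch contributes more to the distance than a comparable standardized difference in BMI or age, which reflects the stated ordering of importance.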
To further take physicians' wisdom into account, we use the comorbidity of hypertension [3], diabetes [4], and hypothyroidism [5] to better determine the similarity between subjects.
We view each comorbidity as a categorical variable, and save it as 1 or 0 to indicate the presence or absence of the disease. The Euclidean distance between the vectors saving the comorbidity status of two subjects is called the correction distance. Following the clinical practice, we can find the K nearest neighbors (KNN). If we want to find the K subjects most similar to the new-arriving subject, we first determine the K + K most similar subjects in terms of the phenotype metric, and remove the K subjects with the largest correction distance. If there are fewer than K subjects that have a correction distance greater than 0, we remove the subjects with the largest phenotype distance to determine the K most similar subjects. This is the modified KNN scheme considered in the main article.
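The modified KNN scheme above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the dictionary-based interface are hypothetical, and ties in the correction distance are broken by removing the subject with the larger phenotype distance first, which is an assumption.

```python
import math

def correction_distance(c1, c2):
    """Euclidean distance between two 0/1 comorbidity vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(c1, c2)))

def modified_knn(phen_dist, corr_dist, k):
    """Select k neighbors of the new subject.

    phen_dist, corr_dist: dicts mapping subject id -> distance to the new subject.
    Step 1: keep the 2k subjects with the smallest phenotype distance.
    Step 2: remove k of them, largest correction distance first; once only
            subjects with zero correction distance remain, the ordering falls
            back to the largest phenotype distance.
    """
    candidates = sorted(phen_dist, key=lambda s: phen_dist[s])[: 2 * k]
    n_remove = max(0, len(candidates) - k)
    removal_order = sorted(
        candidates, key=lambda s: (corr_dist[s], phen_dist[s]), reverse=True
    )
    removed = set(removal_order[:n_remove])
    # Return the survivors ordered by increasing phenotype distance.
    return [s for s in candidates if s not in removed]
```

Sorting the removal order by the pair (correction distance, phenotype distance) covers both cases in the text with a single rule, since subjects with a zero correction distance are only removed after all subjects with a positive one.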

SELECTED FEATURES
In this work, we consider features extracted from the thoracic movement recorded from the triaxial accelerometer sensor (TAA-THO), the abdominal movement recorded from the triaxial accelerometer sensor (TAA-ABD) [6], and the peripheral capillary oxygen saturation (SpO2). We extract apnea-related features from TAA-THO and TAA-ABD, and the desaturation features from SpO2. For the calibration between PSG and the sensors, the PSG signals recorded during initial tests, such as normal-stop breaths and left-right turns, are taken as the reference. The calibrated TAA signals are then treated as the baseline for the subsequent signal recording. However, the automatic calibration procedure has not been implemented.

2.1. Features from the Respiratory Signal - apnea or not
The proposed algorithm extracts the features from TAA-THO, denoted as Y_tho, and TAA-ABD, denoted as Y_abd, to identify apnea events.
Based on the physiological phenomenon, two features, amplitude ratio (AR) and frequency ratio (FR) of TAA-THO and TAA-ABD, are used to identify apnea events [7].
TAA-ABD and TAA-THO are down-sampled from 226 Hz to 4 Hz. We segment the signals into overlapping windows of 10-s duration with 9.5-s overlap. We call these windows current windows (CW), and denote the n-th CW as CW(n). The CWs provide information about the apnea event, and the goal is to determine if there is a sleep apnea event over each CW. Due to the nature of the triaxial accelerometer sensor and possible body movement, the raw amplitude of TAA-THO and TAA-ABD might not be meaningful. Instead, it is found in [7] that the ratio of the amplitude in the current window to that in the previous window is more informative. Therefore, we consider another set of windows to define the feature. For each CW, we call the closest window of 60-s duration in which no apnea was reported by the sleep expert the pre-window (PW). The TAA-THO and TAA-ABD signals over the PW contain the baseline information for the amplitude.
Denote PW(n) to be the n-th PW associated with the n-th CW. Note that the PWs might be the same for different CWs, particularly during an apnea event. In other words, the PW is fixed when the CW moves forward, if there is a sleep apnea event on the CW annotated by the sleep expert. The relationship between CW and PW is illustrated in Figure 1.
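The windowing described above can be sketched as follows, under stated assumptions. At 4 Hz, a 10-s CW has 40 samples and a 9.5-s overlap corresponds to a step of 2 samples. The per-sample boolean annotation `apnea` and the function names are hypothetical, and "closest" is interpreted here as the most recent apnea-free 60-s window that ends before the CW starts.

```python
FS = 4              # sampling rate in Hz after down-sampling
CW_LEN = 10 * FS    # 10-s current window: 40 samples
HOP = FS // 2       # 10 s - 9.5 s overlap = 0.5-s step: 2 samples
PW_LEN = 60 * FS    # 60-s pre-window: 240 samples

def cw_starts(n_samples):
    """Start indices of all complete current windows in a recording."""
    return list(range(0, n_samples - CW_LEN + 1, HOP))

def pre_window_start(apnea, cw_start):
    """Start index of the latest apnea-free 60-s window ending before cw_start.

    apnea: sequence of booleans, True where the expert annotated an event.
    Returns None when no such window exists yet.
    """
    run = 0      # length of the apnea-free run ending at the current sample
    best = None
    for i in range(cw_start):
        run = 0 if apnea[i] else run + 1
        if run >= PW_LEN:
            best = i - PW_LEN + 1
    return best
```

Under this interpretation the PW stays fixed while the CW advances through an annotated event, which matches the behavior described in the text.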
In this study, the lengths of PW and CW, 60 and 10 seconds respectively, are chosen based on the rules used by the sleep expert to mark apnea events. In practice, the sleep expert compares the amplitudes of the oral-nasal flow and the abdominal and thoracic movement signals in the current 10 s with those of the previous signals, up to 120 s, to determine an event. To reduce the possibility of mixing up information, we selected a 60-s duration for our PW.
The FR features quantify the oscillatory pattern of the respiratory signal. When the subject has a normal breathing pattern, the spectrum should be well concentrated in the low-frequency region.
When apnea events happen, the distorted respiratory pattern in the time domain causes a different concentration in the spectrum. The FR of TAA-THO over the n-th CW is defined as the log of the ratio of the TAA-THO energy over the frequency band ranging from 0.8 Hz to 1.5 Hz to that over the frequency band ranging from 0.1 Hz to 0.8 Hz. The FR of TAA-ABD over the n-th CW is defined in the same way. We follow the suggestion in [7] and take the integration range from 0.8 to 1.
Note that physiologically there is a latency between an apnea/hypopnea event and its associated oxygen desaturation event. Therefore, we need to shift all features of our oxygen desaturation detection result. Comparing the sensitivity of detecting the oxygen desaturation events over the whole group, the optimal latency is 20 seconds, at which the mean of the sensitivities of all subjects is maximal and the variance is minimal.
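The frequency ratio (FR) defined earlier in this subsection can be illustrated with a minimal sketch using a plain discrete Fourier transform over one 10-s CW sampled at 4 Hz. Several choices here are assumptions not stated in the text: the natural logarithm, the removal of the window mean, the half-open band edges (0.8 Hz is assigned to the upper band), and the small constant `eps` that guards against division by zero. The function names are hypothetical.

```python
import cmath
import math

def band_energy(x, fs, f_lo, f_hi):
    """Sum of squared DFT magnitudes of x for frequencies f_lo <= f < f_hi."""
    n = len(x)
    mean = sum(x) / n
    energy = 0.0
    for k in range(1, n // 2 + 1):
        f = k * fs / n
        if f_lo <= f < f_hi:
            coeff = sum(
                (x[t] - mean) * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n)
            )
            energy += abs(coeff) ** 2
    return energy

def frequency_ratio(x, fs=4.0, eps=1e-12):
    """Log of the energy in 0.8-1.5 Hz over the energy in 0.1-0.8 Hz."""
    high = band_energy(x, fs, 0.8, 1.5)
    low = band_energy(x, fs, 0.1, 0.8)
    return math.log((high + eps) / (low + eps))
```

For a regular breathing oscillation around 0.3 Hz the energy sits in the lower band and the FR is negative; a signal dominated by components above 0.8 Hz yields a positive FR.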

LEARNING APNEA CLASSIFIERS BY THE SUPPORT VECTOR MACHINE
For the new-arriving subject x, we find K subjects from X that are most similar to x in the sense of D. To learn the relationship between the apnea events and the extracted respiratory features from those K subjects for the prediction purpose, the kernel support vector machine (SVM) [8,9] with the standard radial basis function kernel is considered. Denote by v(n) the ten-dimensional vector of features extracted from TAA-THO, TAA-ABD, and SpO2 on CW(n), which include AR_tho, FR_tho, AR_abd, FR_abd, and six desaturation features. Call v(n) the SAS feature over CW(n). The SAS features are divided into three groups, denoted as N, A and H, by the following rules: (1) if s_PSG is dominated by NOR over CW(n), v(n) ∈ N; (2) if s_PSG is dominated by APN over CW(n), v(n) ∈ A; (3) if s_PSG is dominated by HYP over CW(n), v(n) ∈ H.
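The grouping rules above can be sketched as follows. The text does not define "dominated" precisely; this sketch assumes it means the most frequent expert label among the samples of s_PSG inside the CW, with ties resolved in favor of the label that appears first in the window. The function names and the per-sample label lists are hypothetical, and any label other than NOR, APN or HYP would raise a `KeyError`.

```python
from collections import Counter

def dominant_label(labels):
    """Most frequent label in one window (assumed meaning of 'dominated')."""
    return Counter(labels).most_common(1)[0][0]

def group_features(features, psg_labels_per_cw):
    """Split SAS feature vectors v(n) into the groups N, A and H.

    features: list of feature vectors, one per CW.
    psg_labels_per_cw: list of per-sample label lists ('NOR', 'APN', 'HYP'),
                       one list per CW, taken from the expert annotation.
    """
    groups = {"NOR": [], "APN": [], "HYP": []}
    for v, labels in zip(features, psg_labels_per_cw):
        groups[dominant_label(labels)].append(v)
    return groups
```

The three resulting groups are the training sets from which pairwise binary classifiers can then be learned.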

PREDICT APNEA EVENTS BY THE STATE MACHINE
We apply the SVM classifiers and Cov to design a state machine for an online prediction system.
Like the designed features, the state machine is designed to be as simple as possible to demonstrate the notion. There are three states, NOR, APN and HYP, and four rules guiding the state transition, (1)-(4), stated below. The state machine initially stays in the NOR state, and outputs a prediction of sleep apnea events every 0.5 seconds.
The rules depend on the established binary SVM classifiers, CLF_NA, CLF_NH, and CLF_AH, and the status of the previous L CWs, where L ∈ N. We call L the tap number for the state machine.
While the tap number is a tunable parameter, following the suggestion in [7, Section V.B], the tap number is fixed to L = 12 in this study. At the n-th CW, depending on the corresponding state, the transition occurs according to the following four rules, (1)-(4).
(1) If one of these conditions holds, the transition from NOR to APN happens.
(a) More than half of CLF_NH(i), n − L + 1 ≤ i ≤ n, are HYP, and CLF_NA(i) = AH and CLF_AH(i) = A for all n − L + 1 ≤ i ≤ n.
(b) More than half of CLF_NA(i), n − L + 1 ≤ i ≤ n, are AH, and CLF_NH(i) = HYP and CLF_AH(i) = A for all n − L + 1 ≤ i ≤ n.

(3) If one of these conditions holds, the transition from NOR to HYP happens.
(a) More than half of CLF_NH(i), n − L + 1 ≤ i ≤ n, are HYP, and CLF_NA(i) = AH and CLF_AH(i) = HYP for all n − L + 1 ≤ i ≤ n.