A New Berlin Questionnaire Simplified by Machine Learning Techniques in a Population of Italian Healthcare Workers to Highlight the Suspicion of Obstructive Sleep Apnea

Obstructive sleep apnea (OSA) syndrome is a condition characterized by repeated complete or partial collapse of the upper airways during sleep, associated with episodes of intermittent hypoxia and leading to fragmentation of sleep, sympathetic nervous system activation, and oxidative stress. To date, one of the major aims of research is to find a simplified non-invasive screening system for this still underdiagnosed disease. The Berlin questionnaire (BQ) is the most widely used questionnaire for OSA and is a beneficial screening tool devised to select subjects with a high likelihood of having OSA. We administered the original ten-question Berlin questionnaire, enriched with a set of questions purposely prepared by our team to complete the socio-demographic, clinical, and anamnestic picture, to a sample of Italian professional nurses in order to investigate the possible impact of OSA on healthcare systems. According to the Berlin questionnaire, respondents were categorized as being at high or low risk of having OSA. For both risk groups, baseline characteristics, work information, clinical factors, and symptoms were assessed. Anthropometric data, work information, health status, and symptoms were significantly different between the OSA high-risk and low-risk groups. Through supervised feature selection and Machine Learning, we also reduced the original BQ to a very limited set of items which seem capable of reproducing the outcome of the full BQ: this reduced group of questions may be useful to determine the risk of sleep apnea in screening cases where questionnaire compilation time must be kept as short as possible.

INTRODUCTION
Obstructive Sleep Apnea (OSA) is a syndrome characterized by partial or complete obstruction of the upper airways during sleep. This phenomenon, in turn, causes numerous and repetitive arousals from sleep to restore airway patency, leading to disrupted sleep, daytime hypersomnolence, and sympathetic activation. The obstruction of the airways may also lead to blood oxygen desaturation (1) during sleep, and cardiovascular lesions (2). OSA is associated with numerous conditions including stroke, hypertension and death (3,4). These comorbidities are particularly evident in obese patients and vary in severity according to gender and age.
The prevalence of OSA varies widely in the general population, ranging from 9 to 38%, with older age, male gender, and obesity as known risk factors (1,5,6). In advanced age groups, prevalence can even increase to 84% (1).
According to a worldwide epidemiological prevalence study (5) there are an estimated 936 million patients aged 30-69 years with mild-moderate OSA and 425 million patients aged 30-69 years with severe OSA who need Continuous Positive Airway Pressure (CPAP) treatment. In Italy, one study estimated the prevalence of moderate-to-severe OSA at 27% of the general population, with an overall prevalence of mild and moderate-to-severe OSA of more than 24 million people aged 15-74 years (54% of the adult population). From a practical perspective, however, Italian NHS physicians diagnosed only 460,000 moderate-to-severe patients (4% of estimated prevalence) and 230,000 patients were treated (2% of estimated prevalence), highlighting a substantial gap between diagnosis and treatment. Considering that each patient is diagnosed many years after the onset of the disease, the direct and indirect healthcare costs impose a significant burden on the National Health System (NHS), which affects every single citizen. Prevention and early diagnosis are the only ways to achieve cost containment and improved quality of life.
Although studies have considerably increased in recent years, to date OSA is still a highly underdiagnosed disease. The gold standard for OSA diagnosis is nocturnal polysomnography (PSG) in the sleep laboratory. However, since this is not practical for large numbers of patients, the Home Sleep Test (HST) is also an accepted, validated ambulatory diagnostic method. Among non-invasive screening tools for OSA in the general population, the Berlin questionnaire (BQ) (7) is the most widely used to identify patients at risk for OSA. It was employed for the first time in the US: it contains ten questions related to risk factors and symptoms of OSA, with the purpose of selecting high-risk patients who may undergo polysomnography, thereby increasing the number of diagnosed patients.
The main purpose of this study was to find the risk factors that are best correlated with being at high risk for OSA (according to the BQ) in professional nurses, in order to investigate the possible impact of OSA on healthcare systems by considering one of the most important categories in the health and assistance fields. We also assessed the capability of a reduced BQ to predict the high-risk OSA group as defined by the result of the standard BQ. For this purpose, we used techniques related to supervised feature selection and Machine Learning.

Design
From May 2020 to September 2021 a cross-sectional, multicenter study was conducted among professional nurses. Four hundred and five Italian subjects agreed to participate in the study. No eligibility criteria were applied to the volunteers. The survey was conducted by means of an anonymous electronic questionnaire distributed on a voluntary basis. All subjects were asked to answer the BQ (7) and an additional set of 38 questions including items about baseline socio-demographic characteristics, work information, clinical status, and symptoms category. In particular, socio-demographic characteristics included gender, age, BMI, smoking, and neck circumference. Work information comprised years of work experience, working hours, work shift, and work shift regularity. For health status, we assessed the presence of arrhythmias, sleep disturbances, hypo/hyperthyroidism, hypertension, transient ischemic attack or stroke, diabetes mellitus, chronic obstructive pulmonary disease (COPD), asthma, anxiety, depression, frequent confusion or agitation, craniofacial morphological alterations, and alcohol and drug abuse. The symptoms category included difficulty staying awake during an activity, difficulty concentrating, difficulty in expressing oneself, use of stimulants, interference with work, interference with social relationships, slow reactions and difficulty keeping attention up, difficulty in paying attention to several tasks at once, striving not to make mistakes, and need to doze off.

The Berlin Questionnaire
The BQ (7) is the most widely used non-invasive screening tool for OSA, devised to identify subjects with a high likelihood of having OSA based on the frequency, loudness, and disturbance of nocturnal snoring and the associated breathing interruptions, on daytime sleepiness, and on the presence of high blood pressure or obesity. The BQ consists of three categories of questions related to the risk of having sleep apneas. Patients can be classified as high-risk or low-risk based on their responses to the individual items and their overall scores in the symptom categories. Category 1 contains five items and incorporates questions about snoring; Category 2 contains three items investigating daytime somnolence; Category 3 contains one item assessing hypertension and information about the Body Mass Index (BMI). Scores from the first two categories were positive if the responses indicated frequent symptoms, such as more than 3-4 times per week, whereas the score from the third category was positive if there was a history of hypertension or a BMI > 30 kg/m² (7). The overall score was determined from the responses to the three categories. Patients were scored as being at high risk of OSA when they had a positive score on two or more categories; otherwise they were considered to be at low risk (7).
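The final classification rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and function names are hypothetical, and the positivity of Categories 1 and 2 is taken as an input, since the item-level point scheme of the BQ is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class BerlinCategories:
    """Category-level Berlin questionnaire outcome for one respondent (hypothetical container)."""
    snoring_positive: bool      # Category 1 (snoring items) scored positive
    sleepiness_positive: bool   # Category 2 (daytime somnolence items) scored positive
    hypertension: bool          # Category 3: history of high blood pressure
    bmi: float                  # Category 3: body mass index in kg/m^2


def category3_positive(resp: BerlinCategories) -> bool:
    # Category 3 is positive with a history of hypertension or a BMI above 30 kg/m^2.
    return resp.hypertension or resp.bmi > 30.0


def berlin_risk(resp: BerlinCategories) -> str:
    # High risk when at least two of the three categories are positive.
    positives = sum(
        [resp.snoring_positive, resp.sleepiness_positive, category3_positive(resp)]
    )
    return "high" if positives >= 2 else "low"
```

For instance, a respondent with positive snoring items and a BMI of 31 kg/m² would be classified as high-risk even without daytime sleepiness, because two categories are positive.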

Statistical Analysis
The answers of all respondents to the BQ were analyzed using descriptive statistics. To identify items associated with being at high risk of OSA, baseline characteristics, work information, health status, and symptoms category were studied separately in the two OSA risk groups. Continuous variables were summarized by mean and standard deviation (SD) and categorical variables by frequencies and percentages. The Kruskal-Wallis test and the Mann-Whitney U-test were used to assess differences between the high-risk and low-risk groups. Contingency tables were also analyzed, and chi-square and Fisher's exact tests were carried out to ascertain the presence of associations with the two OSA risk groups. A p-value < 0.05 was considered statistically significant. BQ scoring and statistical analyses were conducted for all qualitative and quantitative variables using MATLAB software.
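As an illustration of the contingency-table analysis, the following minimal Python sketch computes a Pearson chi-square test on a 2x2 table. It is not the MATLAB code used in the study: the function name is hypothetical, no continuity correction is applied, and the example counts are invented purely to show the call.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) on a 2x2 table.

    Returns (statistic, p_value) for one degree of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_sums = (a + b, c + d)
    col_sums = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_sums[i] * col_sums[j] / n
            statistic += (observed - expected) ** 2 / expected
    # Survival function of a chi-square variable with 1 d.o.f.:
    # P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts (rows: high-risk, low-risk; columns: condition present, absent).
stat, p = chi_square_2x2([[12, 30], [8, 90]])
significant = p < 0.05  # same threshold as in the text
```

When expected cell counts are small, Fisher's exact test is the usual alternative, as noted in the text.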

Predictive Value
Calculating group statistics is important to establish the statistical relevance of variables in a diagnostic problem so that risk factors or relationships with comorbidities can be assessed. Nonetheless, it is well known (8-10) that relevance is not a synonym for discriminant power, the latter being most useful in classification and prediction: significant variables in a statistical model do not guarantee prediction performance, and non-significant attributes might prove predictive. For this reason, we decided to also study both the Berlin questionnaire and our own from the point of view of their prediction capabilities, using techniques related to supervised feature selection and Machine Learning.
It must be noted that prediction in this case is not related to actual OSA diagnosis, because the only data on which we worked are the responses to the questionnaires: therefore, the target variable was simply the high risk of being affected by OSA according to the result of the BQ. As the latter is not a perfect test and can give false positives and false negatives (11,12), our conclusions are valid within the same limits.
XGBoost (13) in Python was chosen as the classifier model. A relevant reason was that the responses to the questionnaires unfortunately had a certain number of missing answers, and out-of-the-box XGBoost deals quite satisfactorily with missing data thanks to the algorithm called "sparsity-aware split finding": therefore no explicit imputation mechanism (14) had to be implemented. Moreover, XGBoost is fast and reliable, as also shown by the frequent wins in Kaggle competitions obtained with this classifier.
After converting the ordered response scales to numeric values, the following analyses were performed. First, the Fisher score (15) was calculated for each variable. This index measures the ratio between the inter-class distance and the total intra-class variance, $F = (\bar{x}_1 - \bar{x}_2)^2 / (\sigma_1^2 + \sigma_2^2)$, where $\bar{x}_j$ and $\sigma_j^2$ are the mean and the variance of a variable for class j. F is clearly related to the discrimination power of each attribute. Similarly, the area under the ROC (Receiver Operating Characteristics) curve (AUROC) was computed for each variable, directly measuring its predictive power. The Fisher score and the AUROC have similar meaning but they are independent, so they complement each other. These two figures of merit are important because they assess the discriminant power of each feature individually, but they only partially characterize the dataset, as they neglect combinations of features, that is, the evaluation of two or more features together: it often happens that the scores for single features are low but their combination is strongly discriminant, so some mechanism for scoring groups of features is necessary. For this purpose, we employed the backward Sequential Feature Selector (bSSF) from scikit-learn, with XGBoost as the scorer, to build a plot of AUROC vs. the cardinality of the optimal subset of features, from which we could infer interesting conclusions on the prediction power of feature combinations. We finally performed some ad hoc calculations on particular subsets of features, which we considered interesting.
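The two single-variable figures of merit can be illustrated with a short, dependency-free Python sketch. It takes the Fisher score as the squared difference of the class means divided by the sum of the class variances, and computes the AUROC as the fraction of (positive, negative) pairs that are correctly ordered. The function names are hypothetical, the use of the sample variance is an assumption, and missing answers are assumed to have been removed beforehand.

```python
from statistics import mean, variance


def fisher_score(class1, class2):
    """Squared difference of class means over the sum of class (sample) variances."""
    return (mean(class1) - mean(class2)) ** 2 / (variance(class1) + variance(class2))


def auroc(positives, negatives):
    """AUROC as the probability that a positive case scores above a negative one.

    Ties count one half; this equals the Mann-Whitney U statistic divided
    by the number of (positive, negative) pairs.
    """
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Illustrative numeric answers for one item in the two risk groups (made-up values).
high_risk_answers = [3, 4, 5]
low_risk_answers = [1, 2, 3]
f_value = fisher_score(high_risk_answers, low_risk_answers)
auc_value = auroc(high_risk_answers, low_risk_answers)
```

An AUROC of 0.5 corresponds to a variable with no discriminating ability, while values near 1 indicate that the variable alone almost separates the two groups.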
The feature selection procedure based on bSSF was built as follows. We started from the whole dataset of feature vectors containing n attributes. The dataset was randomly split into two parts, one for feature selection (P1) and the other for quality assessment (P2) of each subset of selected features. Proportions between selection and quality assessment datasets were arbitrarily set to 70 and 30% of the whole dataset, respectively.
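A random 70/30 partition of this kind can be sketched as follows. The function name and the seed handling are assumptions made for illustration; only the proportions come from the text.

```python
import random


def split_indices(n_rows, selection_fraction=0.7, seed=0):
    """Randomly split row indices into a selection set (P1) and an assessment set (P2)."""
    rng = random.Random(seed)
    indices = list(range(n_rows))
    rng.shuffle(indices)
    cut = round(selection_fraction * n_rows)
    return indices[:cut], indices[cut:]


# Changing the seed gives a different P1/P2 partition of the same rows.
p1_rows, p2_rows = split_indices(10, seed=1)
```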
At the m-th step (m going from 0 to n - 2), feature selection by bSSF, from n - m to n - m - 1 features, was applied on the P1 dataset, followed by prediction quality measurement on the selected features. Therefore, each iteration took as its input the dataset containing the "best" features, as selected by the preceding iteration. At each iteration (with fixed m), instead of performing feature selection just once, we preferred to study the robustness of the selected subset of features by applying bSSF a given number of times (typically 100), each time recording which feature was considered the least important (downvoted). As the P1 vectors were shuffled before each bSSF application, the selected features showed a certain variability and, at the end of this internal loop, we removed the feature that had been downvoted most often.
At this point, with a robust subset of features, we calculated the AUROC (arbitrarily with 50 iterations) on the quality assessment dataset P2 and assigned the average AUROC (with an uncertainty calculated as the standard deviation) to the feature set.
The loop on m then continued, until there was just one feature in the dataset.
The results of this process were:
• A graph showing AUROC as a function of the number of selected features.
• A list of features, ordered by importance (considering that the least predictive variables, in a multivariate framework, were discarded first).
The whole procedure was repeated many times, each time modifying the initial split between P1 and P2, so that the influence of random splitting might be judged.
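The voting loop of the procedure can be sketched in plain Python as follows. This is a simplified illustration, not the code used in the study: the scorer is a pluggable function (in the study its role is played by scikit-learn's backward sequential selector with an XGBoost classifier on the shuffled P1 data), the AUROC assessment on P2 is omitted, and the toy weights and all names are made up.

```python
import random
from collections import Counter


def backward_elimination_with_votes(features, score_subset, n_repeats=100, seed=0):
    """Drop one feature per step, chosen by majority vote over repeated runs.

    score_subset(subset, rng) must return a quality score (higher is better)
    for a tuple of feature names; rng lets the scorer shuffle or resample.
    Returns the features ordered from most to least important.
    """
    rng = random.Random(seed)
    remaining = list(features)
    removed = []
    while len(remaining) > 1:
        votes = Counter()
        for _ in range(n_repeats):
            # One backward step: score every subset obtained by dropping a
            # single feature and downvote the one whose removal hurts least.
            scores = {
                f: score_subset(tuple(x for x in remaining if x != f), rng)
                for f in remaining
            }
            votes[max(scores, key=scores.get)] += 1
        loser = votes.most_common(1)[0][0]
        remaining.remove(loser)
        removed.append(loser)
    # Features discarded last are the most important ones.
    return remaining + removed[::-1]


# Toy scorer with made-up weights standing in for a cross-validated AUROC.
TOY_WEIGHTS = {"B10": 0.5, "B1": 0.3, "B6": 0.1, "B2": 0.01}


def toy_score(subset, rng):
    return sum(TOY_WEIGHTS[f] for f in subset) + rng.uniform(-0.001, 0.001)


ranking = backward_elimination_with_votes(TOY_WEIGHTS, toy_score, n_repeats=20)
```

Taking the majority vote over repeated runs, rather than a single selection, is what makes the discarded feature robust to the random shuffling of the data.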

Ethical Considerations
The ethical aspects of the study were set out in the questionnaire presentation, which was designed in accordance with the principles of the Italian data protection authority (DPA). It was emphasized that participation was voluntary and that the participant could refuse participation in the protocol whenever he or she wished. Those who were interested in participating were given an informed consent form, which recalled the voluntary nature of participation, as well as the confidentiality and anonymous nature of the information.

Berlin Questionnaire Score and Metrics
The BQ was evaluated for all respondents and data were collected (Table 1). According to the questionnaire, the subjects were stratified into low vs. high OSA risk groups by means of a score calculation. Among all subjects, 76 (20%) were categorized as having a high likelihood of OSA. Table 2 shows the BQ answer counts subdivided between low and high Berlin score subjects. Respondents were also asked if they had already been diagnosed with OSA through a gold standard test (e.g., polysomnography). Among the subjects identified as high-risk, 24% (n = 18, 5% of the complete sample) had already been diagnosed with OSA whereas 76% (n = 58, 15% of the sample) had not undergone any diagnostic test. Among the subjects categorized as low-risk for OSA, 1% (n = 2) had received a diagnosis of OSA (false negatives) whereas 99% had not been tested.
As reported in the literature (16), the dominant symptom of OSA is snoring, with a prevalence of 75-90%. Accordingly, in our sample the high-risk OSA group had a significantly larger proportion of respondents reporting frequent snoring (95%) compared to the low-risk group (21%). Nocturnal snoring was also more frequent and louder in high-risk OSA cases than in low-risk ones, and this difference was statistically significant (p < 0.001 for both). Specifically, 28% of the high-risk group reported snoring very loudly compared with 3% of the low-risk group. The percentage of those who snored every night also increased from 10% in the low-risk group to 63% in the high-risk group.
Nocturnal symptoms may also include apnea and dyspnea, generally observed by bed partners, and this was confirmed by the percentage of bothersome snoring, which rose from 22% in the low-risk group to 80% in the high-risk group. These differences were statistically significant (p < 0.001).
The high-risk group also reported more breathing interruptions than the low-risk subjects (p < 0.001).
Fatigue, somnolence at awakening and during daytime are also symptoms significantly present in the high-risk group compared to the low risk group (p = 0.0018 and 0.0029, respectively). The percentage of those who reported falling asleep while driving a vehicle was also higher in the high-risk group (24%) than for the low-risk subjects (9%), with a statistically significant difference (p < 0.001).
This significance is also present in the frequency of episodes (p < 0.001).
High blood pressure was also reported in half of the high risk subjects (51%) compared with 5% of the low risk ones, and this difference was statistically significant.
Socio-demographic characteristics, work information, clinical factors, and symptoms category were compared between the two OSA risk groups. The results are summarized in Table 3.

Predictive Value of the Berlin Questionnaire Variables
Fisher Indices and AUROC for Single Variables
The ten variables from the BQ plus BMI were considered. The most discriminant variables were the four related to snoring (B1 to B4 in Table 1), with B1 being the most important overall (AUROC = 0.88, F = 1.9) and snoring loudness B2 being the least predictive of the four. As to the two variables with relatively objective measurement, i.e., having high blood pressure, B10, and the body mass index (computed from the subject's physical data), the former had high predictivity (AUROC = 0.74, F = 0.80) while the latter showed lower discriminant power (AUROC = 0.64, F = 0.03). This result was quite surprising when compared with the one reported in (18), where BMI is found to be quite a strong predictor.

Sequential Feature Selection
The typical relationship between the number of features and AUROC we obtained by the bSSF procedure is shown in Figure 1.
Repeating the run with different random splits of P1 vs. P2 partitioning did not appreciably change the result, with AUROC for sets of ≥ 3 features always attaining values near 1. Reaching such a high AUROC with the full set of variables, of course, has no particular meaning because the target variable (high risk of OSA) is obtained from the BQ variables (the answers to the questions), so there exists a well-established a priori relationship between the variables and the target, which the classifier finds. On the other hand, what is surprising is the fact that a subset of three variables achieves predictive power comparable to the whole questionnaire.
The subset of three variables was reasonably robust and did not depend too much on the particular dataset split; after about 60 runs, the subset was found to contain the variables computed from B10 (selected at every run), B1 (present in 73% of the "best" feature subsets), B6 (present in 38%), B7 (37%), B3 (25%), and B4 (2%). We remark that hypertension B10 is always among the most useful features [which was already known from the single-variable calculations; this result confirms what was found in (18)]. Considering now the remaining five features, three concern snoring (B1, the most voted after B10; then B3 and B4) while two concern feeling tired in the daytime, either at wake-up or during the day (B6 and B7), with similar presence in the subsets. By calculating the (normalized) co-occurrence matrix of these five
Anamnesis factors, work information, clinical status, and symptoms category were assessed. A p-value < 0.05 was considered statistically significant. * p < 0.05; ** p < 0.01; *** p < 0.001.

Predictive Value for the Proprietary Questionnaire Variables
The proprietary questionnaire was also examined from the Machine Learning point of view, with a similar approach but very different results. The target variable was, as in the preceding analysis, the BQ output in terms of high vs. low risk of OSA. The global AUROC was moderate, with values of about 0.80, which attests to a relationship between the questions and the pathology, but also to the limited usefulness of the proprietary questionnaire in an ML context, at least with the data we possess. No variable derived from the questionnaire items proved to be strikingly discriminant per se. Moreover, the partially stochastic nature of the feature selection process (due to the different random choices of the selection and quality assessment sets, P1 and P2, respectively) led to quite different AUROC vs. number-of-features curves at each run (in which AUROC slowly decreased from 0.80 to 0.60 with the progressive depletion of the feature set).

DISCUSSION
Of 387 screened subjects who completed the BQ, about 20% (n = 76) fell within the high-risk group. Socio-demographic characteristics, work information, clinical factors, and symptoms category were compared between the two groups and are reported in Table 3.

Socio-Demographic Baseline Characteristics
Age is a well-established risk factor for OSA (19,20). The increase in the prevalence of OSA with age could be explained in part by the increase in comorbidities, menopause, hypertension, and BMI, but also by the decline in tongue and palate muscle function and activity that occurs in older adults (21,22). Regarding the age of the sample, in the high-risk group 67% (n = 51) were ≥41 years old compared to 41% (n = 133) in the low-risk group. We have to consider that our cohort is predominantly composed of young subjects, more than half being <40 years old and <2% of subjects being more than 60 years old. In our cohort, age was also found to be a risk factor significantly associated with high risk of OSA (p < 0.0001). With respect to gender, epidemiological studies reported a prevalence ranging from 13 to 31% in men and 4 to 21% in women (17,23-27). It is difficult to confirm this prevalence in our analysis, considering that our sample is predominantly female (76%). Despite this, we found a statistically significant difference between low-risk and high-risk groups with respect to gender (p = 0.0011). In particular, the percentage of men increases from 21% in the low-risk group to 39% in the high-risk group. In contrast, the percentage of women is 79% in the low-risk group and decreases to 61% in the high-risk group.
Obesity is the strongest known risk factor for OSA. Generally, almost 60% of patients with OSA are obese (28). The risk of OSA increases progressively with BMI and also with neck circumference (29). In our analysis, the mean BMI was significantly higher in the high-risk group than in the low-risk group (p < 0.001). Regarding neck circumference, half of the subjects did not know their measurement. However, a neck circumference above the chosen cut-off was more frequent in the high-risk group (14%) than in the low-risk group (6%).
No association was found between smoking and OSA in our sample, and this reflects what is found in the literature (30). However, inhalation of cigarette smoke increases oxidative stress and systemic inflammation, which are typically present in OSA (30). Thus, the concomitant presence of OSA in smokers could worsen disease progression.

Work Information
Regarding work information, only the number of years of work experience seems to be associated with a high risk of OSA. However, rather than being a risk factor per se, this variable could be significant just because it is correlated with increasing age, an important risk factor previously discussed. Distribution of working time (full time/part time), work shift (day shift only or 24 h shift) and work shift regularity (yes/no) were not found to be associated with a high risk of OSA. Interestingly, professional category and instruction level appear to differ significantly between the two groups (p = 0.039 and p = 0.049, respectively).

Health Status
Among all the clinical factors investigated, only the presence of craniofacial morphological alterations was not found to be a risk factor associated with an elevated risk of OSA, contrary to what is reported in the literature (31). However, we must consider that only 8 subjects declared these alterations, which makes the sample too small to draw conclusions. Sleep disorders, instead, differed significantly between the two groups, as expected (p < 0.001), supporting the reliability of the sample.
Hypertension was already known to be associated with OSA (32,33). Normally, 50% of hypertensive patients have OSA and this percentage rises to 85% in patients with hypertension who have at least one other OSA symptom (34,35). Subjects with OSA have a 1.8-fold increased risk of resistant hypertension compared to non-OSA individuals (36). Our sample confirmed these data, since 51% of high-risk persons were hypertensive compared with 5% of low-risk subjects.
Arrhythmias and transient ischemic attack or stroke were found to be associated with a high OSA risk score (p < 0.001 and p = 0.0013, respectively). This is in line with the literature, which attests that the prevalence of OSA is estimated to be between two and three times higher in patients with cardiovascular diseases (37).
The percentage of OSA patients who suffer from type 2 diabetes was about 30% (n = 118). The link between diabetes and OSA seems bidirectional but has not been fully evaluated yet. In our cohort, 14% of the high-risk group reported diabetes mellitus, compared to 2% of the low-risk group. The association between diabetes mellitus and being at high risk is statistically significant (p < 0.001).
OSA and asthma are closely related. Numerous studies have consistently reported higher OSA burden among subjects with asthma (38,39) and in relation to asthma severity (38,40). In our sample, the percentage of individuals with asthma in the low-risk group was 6% rising to 24% in high-risk group. Asthma was also found to be a strong risk factor for OSA (p = 0.0018).
Chronic obstructive pulmonary disease (COPD) is also highly associated with OSA. COPD is one of the most prevalent respiratory diseases worldwide. There exists what is called COPD-OSA overlap syndrome that represents a distinct clinical diagnosis, where clinical outcomes are even worse than in each disease alone (41). Based on this evidence, we found a significant difference between the low and high-risk groups (p < 0.001).
Recent systematic reviews and meta-analyses reported that OSA is linked to depression (42) and anxiety (43). Other longitudinal studies suggested that patients with OSA are about twice as likely to be depressed than those without OSA (44,45). In our sample, the rate of depression increased from 10% in the low-risk group to 21% in the high-risk OSA group, while the rate of anxiety increased from 33 to 54%. We also found a strong correlation between being at high-risk of OSA and having both depression and anxiety (p = 0.0014 and p = 0.0082, respectively).
Frequent confusion and agitation also proved to be an important risk factor (p = 0.0022) in our cohort. In particular, 11% of the high-risk subjects reported confusion and agitation, compared to 3% of those in the low-risk group. This phenomenon could be related to anxious behavior, but further work is needed to understand this association.
Excessive alcohol consumption and drug abuse were also compared between the low-risk and high-risk groups. Results from the literature revealed that alcohol consumption is associated with a 25% increased risk of OSA (46). To the best of our knowledge, no data have been reported for drug abuse. We found that 7% of the high-risk group declared alcohol and drug abuse, compared to 1% of subjects in the low-risk group. Alcohol and drug abuse were also found to be two independent risk factors for the high-risk group (p = 0.0022 and p = 0.0061, respectively).

Symptoms Category
Daytime OSA symptoms consist of unexplained fatigue and excessive sleepiness. Patients also report repetitive problems with concentration and memory as well as depressive symptoms (47) and impairment of cognitive functions (48). Moreover, a study of men and women aged 60 years and older showed memory impairment related to OSA and hypertension (49). All this evidence is in line with our findings: difficulty staying awake during an activity, difficulty concentrating, difficulty in expressing oneself, use of stimulants, interference with work, interference with social relationships, slow reactions and difficulty keeping attention up, difficulty in paying attention to several tasks at once, striving not to make mistakes, and need to doze off are all significantly associated with being at high risk of OSA. These symptoms fully describe the OSA patient during his/her daily activity, including working and social activities.

Predictive Value of Questionnaire Items
As concerns the predictive value of the variables acquired by the BQ, our conclusion was that a reduced set of questions, i.e., a reduced set of selected features, composed only of the items listed in Table 4, is sufficient to obtain an output close to that of the BQ, by using a trained XGBoost classifier.
This reduced questionnaire shows some similarity with the one proposed in Arunsurat et al. (18) with the important difference that (as already remarked) BMI is not preserved in the reduced set. The discrepancy might partly come from the different group considered, i.e., the high percentage of young and prevalently female respondents in our sample compared to the all-male healthcare workers investigated in Arunsurat et al. (18).
From the Results section, it is also evident that the proprietary questionnaire is interesting from the point of view of risk factor assessment, but the ML approach gave no hint on the possibility of replacing/integrating the original Berlin test with (parts of) it. In order to clarify this possibility, a dataset with ground truth coming from PSG or HST is needed along with the questionnaire itself.

Limits
The results of our study must be considered taking into account some limitations that concern the sample size, the lack of the actual disease diagnosis for most subjects, the absence of disease follow-up and long-term effect investigation for the subjects who declared to suffer from OSA and, finally, the possible reluctance of the respondents to faithfully declare their health status since they are professional nurses. Moreover, our survey group does not fully represent the general population, because of the high percentage of young and prevalently female respondents. Finally, we are also aware that the study might give different conclusions in different ethnic groups, depending on language, habits, lifestyles or physical conformation.

Conclusions
In conclusion, there are numerous risk factors associated with a high risk of having OSA in a population of nurses. Given the high percentage of people with OSA who are still undiagnosed and the lack of knowledge about this disease, our study helps highlight an alarming result that may be just the tip of the iceberg. This study could be helpful to expand awareness of OSA, especially among professional nurses, who are one of the most important categories in health care. It could also allow more professionals to identify suspected patients who could undergo overnight polysomnography, as well as to explore possible alternative screening tests and treatments for this still largely hidden disease.
Further efforts should be made to increase the number of diagnoses but also, more importantly, to refer these subjects for screening. In this regard, our simplified test might also allow easier administration of the questionnaire, facilitating the orientation of the subject at risk toward the diagnostic pathway. We indeed plan a prospective clinical trial that will use the simplified Berlin test together with our proprietary questions on the general population, with the aim of possibly creating a richer questionnaire with better sensitivity and specificity.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

ETHICS STATEMENT
Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. The participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
GDN, LC, RL, and LDB contributed to conception and design of the study. LC, GDN, AC, ME, and MC organized the database. LC, GDN, and EV performed the statistical analysis. LC and GDN wrote the first draft of the manuscript. LC, GDN, MA, DT, and LDB wrote sections of the manuscript. All authors contributed to manuscript revision, read, and approved the submitted version.