Endotyping Seasonal Allergic Rhinitis in Children: A Cluster Analysis

Background: Seasonal Allergic Rhinitis (SAR) is a heterogeneous inflammatory disease. We hypothesized that a cluster analysis based on the evaluation of cytokines in nasal lavage (NL) could characterize distinctive SAR endotypes in children. Methods: This cross-sectional study enrolled 88 children with SAR. Detailed medical history was obtained by well-trained physicians. Quality of life and sleep quality were assessed through standardized questionnaires [Pediatric Rhinoconjunctivitis Quality of Life Questionnaire (PRQLQ) and Pittsburgh Sleep Quality Index (PSQI), respectively]. Children were grouped through K-means clustering using Interleukin (IL)-5, IL-17, IL-23, and Interferon (IFN)-γ in NL. Results: Out of the 88 patients enrolled, 80 were included in the cluster analysis, which revealed three SAR endotypes. Cluster 1 showed lower levels of IL-5 and IL-17 and intermediate levels of IL-23 and IFN-γ; Cluster 2 had higher levels of IL-5 and intermediate levels of IL-17, IL-23, and IFN-γ; Cluster 3 showed higher levels of IL-17, IL-23, and IFN-γ and intermediate levels of IL-5. Cluster 1 showed intermediate values of nasal pH and nasal nitric oxide (nNO), and a lower percentage of neutrophils at nasal cytology than Clusters 2 and 3. Cluster 2 had a lower level of nasal pH, a higher nNO, higher scores in the ocular domain of PRQLQ, and worse sleep quality than Clusters 1 and 3. Cluster 3 showed a higher percentage of neutrophils at nasal cytology than Clusters 1 and 2. Conclusions: Our study identified three endotypes based on the evaluation of cytokines in NL, highlighting that childhood SAR is characterized by heterogeneous inflammatory cytokines.


INTRODUCTION
The diversity of cytokines in nasal lavage (NL) fluid offers the opportunity of assessing the underpinning immune-inflammatory network of allergic rhinitis (AR) (1). Indeed, measuring such mediators in NL may contribute to describing the mucosal activity profile (2) and gaining insight into the pathophysiologic processes (3). A recent study on adult patients assessed nasal secretions, searching for potentially relevant mediators related to different rhinitis endotypes; however, despite a broad panel of inflammatory mediators, no clear profile could be found (4).
The pivotal role of Th2-derived cytokine polarization in seasonal AR (SAR) has been emphasized in a previous study in children, highlighting a close connection between Th2 cytokines, such as Interleukin-5 (IL-5), and eosinophil infiltration in the nasal mucosa (5). Interestingly, significantly increased levels of IL-17 were found in NL from adults with SAR, suggesting that the upregulation of Th17 may be involved in the inflammatory pathways of nasal disease in these patients (6). Moreover, the progressive decrease in the expression of IL-23p19 mRNA in response to specific allergen observed in the peripheral blood mononuclear cells of children with tree pollen-induced AR after sublingual immunotherapy suggests a role for this cytokine in the pathogenesis of SAR (7). Finally, low levels of IFN-γ have been described in NL from children with SAR, indicating that the Th1 immune response in the nasal mucosa is reduced in these patients (8). Overall, these data suggest that allergic inflammation is characterized by the activation of a complex immunological network, including a heterogeneous range of mediators. However, no study investigated whether different endotypes of childhood SAR could be characterized based on the cytokines that may be involved in the pathophysiological mechanisms underlying AR.
Data-driven approaches such as clustering methods could be useful for characterizing heterogeneous features of diseases among patients. In particular, the K-means algorithm is one of the most popular iterative descent clustering methods; it aims to minimize the sum of within-cluster variances and to maximize cluster separation, thereby identifying distinct groups within the population (9).
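The criterion described above can be written compactly. The following is a sketch in standard textbook notation rather than a formula taken from this study: the symbols $x_i$ (the standardized cytokine vector of subject $i$), $C(i)$ (the cluster assigned to subject $i$), $\bar{x}_k$ (the centroid of cluster $k$), $\bar{x}$ (the overall mean), and $N_k$ (the size of cluster $k$) are introduced here for illustration.

```latex
% Within-cluster sum of squares minimized by K-means (notation assumed, see lead-in)
W(C) = \sum_{k=1}^{K} \sum_{i:\,C(i)=k} \lVert x_i - \bar{x}_k \rVert^{2}

% The total sum of squares T is fixed by the data and splits into
% within-cluster (W) and between-cluster (B) parts:
T = \sum_{i=1}^{N} \lVert x_i - \bar{x} \rVert^{2} = W(C) + B(C),
\qquad
B(C) = \sum_{k=1}^{K} N_k \, \lVert \bar{x}_k - \bar{x} \rVert^{2}
```

Because $T$ does not depend on the cluster assignment, lowering $W(C)$ necessarily raises $B(C)$, which is why minimizing within-cluster variance and maximizing cluster separation are two views of the same objective.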
We hypothesized that a cluster analysis based on the evaluation of cytokines such as IL-5, IL-17, IL-23, and IFN-γ in NL could characterize distinctive SAR endotypes in children.

Participants
We report the results of the first phase (aimed at characterizing the subjects at baseline and discriminating groups of children based on IL-5, IL-17, IL-23, and IFN-γ in NL) of a cross-sectional study approved by the local Institutional Ethics Committee (Palermo 1, Italy, Approval Number: 10/2017). Once approved, the study was registered on ClinicalTrials.gov (NCT03349619). The study was conducted in accordance with Good Clinical Practice and the Declaration of Helsinki; informed consent was obtained from all parents before study entry.
The study population comprised 88 children with SAR. Children underwent a physical examination and were assessed for eligibility during their first consultation at the Institute for Biomedical Research and Innovation of the National Research Council (Palermo, Italy), between March 2018 and July 2018. The inclusion criteria were: (1) age 6-16 years; (2) diagnosis of AR in the previous year; (3) mono-sensitization to grass pollen, identified by positive skin prick test and specific immunoglobulin E (IgE >0.70 kU/l). The exclusion criteria were: (1) upper or lower respiratory tract infections (having taken antibacterial therapy in the 4 weeks before the study entry); (2) lifetime history of asthma (doctor diagnosis); (3) use of systemic/topical corticosteroids, systemic/topical decongestants, or antihistamines in the 4 weeks before the study entry; (4) anatomic nasal defects (i.e., septum deviation), or nasal polyps evaluated by a well-trained physician (VM) who performed a visual inspection through a headlamp with a nasal speculum in order to evaluate the anterior third of the nasal airway, including the anterior tip of the inferior turbinates and portions of the nasal septum; (5) active smoking. Pollen counts were monitored throughout the study period, showing that grass pollen levels on average were below 30 grains/m³.

Procedures
Detailed medical history was obtained by well-trained physicians (VM, GF, SLG) through a standardized questionnaire administered to parents (10)(11)(12) to investigate current AR symptoms and lifetime comorbidities, parental history of rhinitis, parental education, household crowding index (HCI, defined as the total number of co-residents per household, divided by the total number of rooms, excluding the kitchen and bathrooms), and information about exposure to current indoor (mold/pet/smoke) and outdoor risk factor (traffic at residential address). AR diagnosis was confirmed according to ARIA guidelines (13).
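As an illustration of the household crowding index defined above, the following minimal Python sketch divides the number of co-residents by the number of rooms. The function name and the example household are hypothetical; the caller is assumed to have already excluded the kitchen and bathrooms from the room count, as the definition requires.

```python
def household_crowding_index(co_residents, rooms):
    """Co-residents per room; `rooms` must already exclude kitchen and bathrooms."""
    if rooms <= 0:
        raise ValueError("rooms must be a positive number")
    if co_residents < 0:
        raise ValueError("co_residents cannot be negative")
    return co_residents / rooms


# Hypothetical household: 5 co-residents living in 4 eligible rooms.
hci_example = household_crowding_index(5, 4)
print(hci_example)
```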
Current exposure to mold/pet/smoke at home was assessed through the questions: "Have you currently seen mold/dampness/fungi on the walls or on the ceiling of your child's bedroom?"; "In the past 12 months has your child had a pet (dogs or cats) at home?"; "Are there smokers at home"? Self-reported traffic exposure was recorded as the frequency of trucks passing on the street of residence on weekdays (never/rare/frequent/constant), and subjects were considered exposed if they answered "frequent" or "constant."

Nasal Parameters
The temporal sequence of nasal procedures was the following: nasal nitric oxide (nNO), nasal cytology and NL. All the procedures were performed on the same day within 30 min. Following ATS/ERS recommendations (14), nNO was measured 'off line' by an electrochemical sensor (Hypair FeNO, MediSoft, Belgium). Air from the nasal cavity was continuously analyzed by the sensor at a sample flow rate of 350 ml/min, during 30 s tidal breathing through resistance, so that the velum was closed to prevent any contamination of nasal with bronchial air. Only measurements with a variability <10% were retained. The mean nNO level was determined after three exhalations performed at >30-s intervals. nNO values were expressed as log nNO.
Nasal cytology was performed using a small plastic curette (Rhinoprobe™) in anterior rhinoscopy, scraping from the middle portion of the inferior turbinate. The cellular material was spread on a glass slide, fixed by air drying, and then stained through the May-Grünwald-Giemsa method. Slides were read by a well-trained physician (ML) using a standard optical microscope equipped with a digital camera at ×1,000 magnification in oil immersion. The analysis of rhinocytograms involved the reading of no fewer than 50 fields. Granulocytes were assessed based on previously published recommendations (15).
NL was performed by a well-trained physician (VM). Subjects were asked to tilt their head back at a 45° angle and close the nasopharynx with the soft palate. NL fluid was obtained by instilling 3 mL of isotonic saline (0.9% NaCl) prewarmed to 37 °C in each of the patient's nostrils, using a syringe. After 10 s, the subject blew their nose forcefully into a sterile plastic container. The recovered NL volume was comparable to the volume of saline solution introduced. The average recovery of fluid from NL was approximately 70%. Obtained samples were transferred into conical polypropylene tubes and processed as previously described by Pizzichini et al. (16), with minor modifications (2). Briefly, dithiothreitol (DTT) (Sputolysin, Calbiochem Corp., San Diego, CA, USA), freshly prepared in a 10% dilution with distilled water, was added to the recovered NL fluid at one-tenth of its volume.
After homogenization and centrifugation at 500 g for 10 min at 4 °C, the supernatant was stored at −80 °C for later ELISA assay and pH measurements. Absolute concentrations of IL-5, IL-17, IL-23, and IFN-γ in NL fluid were determined using commercial ELISA kits, according to the manufacturer's instructions. The IL-5 sensitivity limit was 0.29 pg/ml (R&D Systems, Oxon, UK), the IL-17 sensitivity limit was 0.01 pg/ml (Invitrogen, Thermo Fisher Scientific, Waltham, MA), the IL-23 sensitivity limit was 4 pg/ml (Affymetrix, eBioscience, part of Thermo Fisher Scientific, Waltham, MA), and the IFN-γ sensitivity limit was 5 pg/ml (Abcam, Cambridge, MA). A stable pH was achieved in all cases after deaeration/decarbonation of the NL samples by bubbling with argon (350 ml/min) for 10 min. pH was measured in the sample using a pH meter (Corning 240, Science Products Division, New York, N.Y., USA) with a pH range of 0-14.00. The pH meter was calibrated before each measurement using solutions with pH values of 4, 7, and 9.
All procedures were performed in a room maintained at a constant ambient temperature (23 °C) and relative humidity (65%).

Total 5 Symptom Score
The Total 5 Symptom Score (T5SS) is a subjective scoring system for the determination of symptom severity based on five domains: rhinorrhoea, nasal obstruction, nasal itching, sneezing, and eye itching. Each symptom is scored on a 4-point scale from 0 to 3 (0, absent; 1, mild: any symptom that is present but not particularly bothersome; 2, moderate: any symptom that is bothersome but does not interfere with daily activities or disturb sleep; 3, severe: any symptom that interferes with daily activity or disturbs sleep). The total score is calculated by adding the scores for all five domains, resulting in a range of 0-15 (17).
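The scoring rule above can be sketched in a few lines of Python. This is an illustrative helper, not part of the study's software: the dictionary keys, the function name, and the example child are hypothetical, and only the five symptoms, the 0-3 scale, and the summation come from the description above.

```python
T5SS_SYMPTOMS = (
    "rhinorrhoea",
    "nasal_obstruction",
    "nasal_itching",
    "sneezing",
    "eye_itching",
)


def t5ss_total(scores):
    """Sum the five symptom scores (each 0-3), giving a total between 0 and 15."""
    total = 0
    for symptom in T5SS_SYMPTOMS:
        value = scores[symptom]  # KeyError if a symptom is missing
        if value not in (0, 1, 2, 3):
            raise ValueError(f"{symptom} must be scored 0, 1, 2, or 3")
        total += value
    return total


# Hypothetical child: moderate rhinorrhoea and sneezing, severe obstruction,
# mild nasal itching, no eye itching.
t5ss_example = t5ss_total({
    "rhinorrhoea": 2,
    "nasal_obstruction": 3,
    "nasal_itching": 1,
    "sneezing": 2,
    "eye_itching": 0,
})
print(t5ss_example)
```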

Pediatric Rhinoconjunctivitis Quality of Life Questionnaire
Quality of life was measured through the Italian validated version of the Pediatric Rhinoconjunctivitis Quality of Life Questionnaire (PRQLQ). PRQLQ is a self-administered disease-specific questionnaire, referring to the previous week, for assessing physical, emotional, and social problems in children with AR aged 6 to 12 years. It includes 23 items in five domains: nose symptoms, eye symptoms, practical problems, activity limitation, and other symptoms. Each item is scored on a 7-point scale (from 0, not troubled, to 6, extremely troubled). The overall score is obtained as the mean score of all items, and each domain score is the mean of the corresponding items (18).
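The two PRQLQ summary scores described above (overall mean of all items, and per-domain means) can be sketched as follows. This is an illustrative helper under stated assumptions: the function name, the dictionary keys, and the example responses are hypothetical, and the number of items placed in each domain here is only a way of reaching 23 items for the example, not a statement of the questionnaire's official item allocation.

```python
from statistics import mean


def prqlq_scores(items_by_domain):
    """Return (overall score, {domain: score}) from item scores on a 0-6 scale."""
    for domain, items in items_by_domain.items():
        if any(not 0 <= score <= 6 for score in items):
            raise ValueError(f"items in {domain} must lie between 0 and 6")
    domain_scores = {d: mean(items) for d, items in items_by_domain.items()}
    # The overall score is the mean of ALL items, not the mean of domain means.
    all_items = [s for items in items_by_domain.values() for s in items]
    return mean(all_items), domain_scores


# Hypothetical responses for one child (23 items in total).
prqlq_example = {
    "nose_symptoms": [3, 3, 2, 4],
    "eye_symptoms": [4, 5, 5, 6],
    "practical_problems": [1, 2, 1, 2, 4],
    "activity_limitation": [2, 2, 3, 1],
    "other_symptoms": [0, 1, 1, 2, 1, 1],
}
prqlq_overall, prqlq_domains = prqlq_scores(prqlq_example)
print(round(prqlq_overall, 2), prqlq_domains)
```

Because domains contain different numbers of items, averaging the items directly weights each item equally, which is why the overall score can differ from the average of the five domain scores.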

Pittsburgh Sleep Quality Index
The Pittsburgh Sleep Quality Index (PSQI) was used to assess sleep quality and disturbances. PSQI is a generic questionnaire completed by parents, with a 4-week recall, which includes 19 questions in 7 domains: subjective sleep quality, sleep latency, sleep duration, habitual sleep efficiency, sleep disturbance, use of sleep medications, and daytime dysfunction. Each domain is scored from 0 to 3. The overall score, ranging from 0 to 21, is obtained as the sum of each domain score. A total score above 5 indicates poor sleep quality (19).
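The PSQI total described above can be sketched as a sum of the seven domain scores with the stated cut-off. The sketch assumes the seven domain scores have already been derived from the 19 questions (that derivation is not described here); the key names, function name, and example values are hypothetical.

```python
PSQI_DOMAINS = (
    "subjective_sleep_quality",
    "sleep_latency",
    "sleep_duration",
    "habitual_sleep_efficiency",
    "sleep_disturbance",
    "use_of_sleep_medications",
    "daytime_dysfunction",
)


def psqi_total(domain_scores):
    """Return (total 0-21, poor_sleep flag) from the seven 0-3 domain scores."""
    total = 0
    for domain in PSQI_DOMAINS:
        value = domain_scores[domain]  # KeyError if a domain is missing
        if value not in (0, 1, 2, 3):
            raise ValueError(f"{domain} must be scored 0, 1, 2, or 3")
        total += value
    return total, total > 5  # a total above 5 indicates poor sleep quality


# Hypothetical child with mildly disturbed sleep.
psqi_example = psqi_total({
    "subjective_sleep_quality": 1,
    "sleep_latency": 2,
    "sleep_duration": 1,
    "habitual_sleep_efficiency": 0,
    "sleep_disturbance": 1,
    "use_of_sleep_medications": 0,
    "daytime_dysfunction": 1,
})
print(psqi_example)
```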

Statistical Analysis
Data were presented as n (%) or mean ± SD. Differences in categorical variables were analyzed using the Chi-squared test. Quantitative variables were compared using the Kruskal-Wallis test.
We applied K-means clustering (9) using standardized IL-5, IL-17, IL-23, and IFN-γ as the input variables. The optimal number of clusters was selected according to the Elbow method (20), which runs K-means clustering on the dataset for a range of values of k (from 1 to 10 in our case) and plots a line chart of the total within-cluster sum of squares for each value of k. If the line chart looks like an arm, then the "elbow" of the arm indicates the best value of k. However, the determination of the number of clusters should also consider a combination of factors, including fit indices, cluster size, and interpretability (21). For cluster visualization, a "cluster plot" was drawn, plotting subjects in two dimensions after multidimensional scaling.
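The standardize, cluster, and inspect-the-elbow sequence described above can be sketched in plain Python. This is not the study's code: the cytokine values are synthetic toy numbers invented for illustration, the function names are hypothetical, and the deterministic farthest-point seeding is an assumption made so the example is reproducible (the seeding used in the study is not stated).

```python
import statistics


def standardize(rows):
    """Z-score every column so each cytokine has mean 0 and SD 1."""
    columns = list(zip(*rows))
    means = [statistics.mean(c) for c in columns]
    sds = [statistics.stdev(c) for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centers(points, k):
    """Farthest-point seeding: start at the first point, then repeatedly add
    the point farthest from the centers chosen so far (illustrative choice)."""
    centers = [list(points[0])]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(farthest))
    return centers


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns (labels, total within-cluster sum of squares)."""
    centers = init_centers(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = [statistics.mean(col) for col in zip(*members)]
    wss = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, wss


# Synthetic toy data (NOT study data): columns are IL-5, IL-17, IL-23, IFN-gamma.
toy_cytokines = [
    [1.0, 1.0, 10, 10], [1.2, 0.9, 11, 9], [0.8, 1.1, 9, 11],     # low profile
    [5.0, 2.0, 10, 10], [5.5, 2.1, 11, 10], [4.8, 1.9, 9, 11],    # high IL-5
    [2.5, 6.0, 30, 40], [2.6, 6.5, 32, 42], [2.4, 5.8, 29, 38],   # high IL-17/23/IFN
]
z = standardize(toy_cytokines)

# Elbow inspection: total within-cluster sum of squares for k = 1..5.
wss_by_k = {k: kmeans(z, k)[1] for k in range(1, 6)}
for k, wss in wss_by_k.items():
    print(k, round(wss, 2))

labels_k3, _ = kmeans(z, 3)
print(labels_k3)
```

In this toy example the within-cluster sum of squares drops sharply up to k = 3 and only marginally afterwards, which is the "elbow" pattern the method looks for. On real data the curve is rarely this clean, which is why cluster size and interpretability are weighed alongside it.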
Between-cluster differences were tested using the Kruskal-Wallis test. To check the robustness of our findings, a sensitivity analysis was performed by imputing the missing values. Missing data were imputed using the mice (22) package in R, which creates multiple imputations (replacement values) for multivariate missing data. The Benjamini & Hochberg method was used to adjust the p-values for multiple comparisons. P-values lower than 0.05 were considered to indicate a statistically significant difference.
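To make the two testing steps above concrete, the following Python sketch computes a Kruskal-Wallis statistic for exactly three groups and then applies the Benjamini & Hochberg adjustment to a set of p-values. It is an illustration under stated assumptions, not the analysis code of the study: the function names and all input numbers are hypothetical, and the p-value shortcut exp(-H/2) is valid only because three groups give a chi-square reference distribution with 2 degrees of freedom.

```python
import math


def midranks(values):
    """Rank values 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = average
        i = j + 1
    return ranks


def kruskal_wallis_3groups(g1, g2, g3):
    """Return (H, p) for three groups; assumes not all values are identical."""
    groups = [g1, g2, g3]
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = midranks(pooled)
    h, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        h += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * h - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    h /= 1 - sum(c ** 3 - c for c in counts.values()) / (n ** 3 - n)
    return h, math.exp(-h / 2)  # chi-square survival function with 2 df


def benjamini_hochberg(p_values):
    """BH-adjusted p-values, returned in the same order as the input."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, index in enumerate(order):
        rank = n - position  # rank of this p-value in ascending order
        running_min = min(running_min, p_values[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


# Toy inputs (NOT study data).
h_demo, p_demo = kruskal_wallis_3groups([1, 2, 3], [4, 5, 6], [7, 8, 9])
adjusted_demo = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(round(h_demo, 3), round(p_demo, 4), adjusted_demo)
```

In practice one p-value would be obtained per compared variable and the whole set passed to the adjustment at once, since the Benjamini & Hochberg correction depends on how many comparisons are made.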

RESULTS
Demographic and clinical characteristics of the study sample are reported in Table 1. Out of 88 NL samples, 80 were included in the analysis (eight samples were below the detection limit of the ELISA kit).

Cluster Analysis
According to the best trade-off between the Elbow method, cluster size, and interpretability, the optimal number of clusters was 3 (Figure 1A). Figure 1B highlights a good cluster separation. Multiple comparisons among clusters are reported in Table 2.

Cluster Characterization
Table 3 reports the characterization of the clusters. Cluster 1 showed intermediate values of nasal pH and nNO, and a lower percentage of neutrophils at nasal cytology than Clusters 2 and 3. Cluster 2 had a lower level of nasal pH, a higher nNO, and higher scores of "eye symptoms" (as a domain of the PRQLQ), and PSQI total score, than Clusters 1 and 3. Cluster 3 showed a higher percentage of neutrophils at nasal cytology than Clusters 1 and 2.

Sensitivity Analysis
After multiple imputation of missing values, the cluster analysis was re-run, and the new clusters were compared. Findings obtained with the new clustering did not change (Table 4 and Figure 3).

DISCUSSION
To the best of our knowledge, this is the first study showing that SAR endotypes in children could be grouped into three distinct clusters based on the evaluation of cytokines in NL. In particular, unlike the current study, the authors of a previous study in adults failed to detect IL-5, IL-17, or IFN-γ in NL of pollen-sensitized AR patients (4). Such a discrepancy could be ascribed to differences in study population that contribute to the variation observed between patients. In this regard, our results add further evidence toward defining the NL cytokine profile in children with SAR.
Non-invasive sampling of airway epithelial-lining-fluid by NL is a sensitive method to monitor inflammation in patients with respiratory diseases (28). The detection of inflammatory parameters in the nasal compartment may provide information about the general inflammatory status and might support the clinician in the framework of precision medicine targeted to pediatric allergic respiratory diseases (29,30). Indeed, by measuring cytokines in NL, we were able to identify three distinct clusters of SAR in children.

Cluster 1
Cluster 1 included 40 (50%) children, predominantly with moderate/severe intermittent and persistent severity level according to ARIA guidelines; this group showed intermediate values of nasal pH and nNO, and a lower percentage of neutrophils at nasal cytology than Clusters 2 and 3. Cluster 1 was characterized by a disease with a low overall inflammatory burden that did not correspond to a distinct Th1-, Th2-, or Th17-associated signature; therefore, it does not seem to be associated with a specific polarized immune-inflammatory response.

Cluster 2
Cluster 2 comprised 25 (31.2%) children, predominantly with moderate/severe persistent severity level according to ARIA guidelines; this group had a lower level of nasal pH, a higher nNO, and higher scores of "eye symptoms" (as a domain of the PRQLQ), and PSQI total score, than Clusters 1 and 3. Cluster 2 carried a predominant Th2 signature in the NL, given the significantly higher levels of IL-5 in this group than in Clusters 1 and 3. IL-5 concentrations in both the upper and lower airways have been demonstrated to be related to the degree of eosinophilic inflammation in airway disease (31). Indeed, a previous study in children with SAR demonstrated the close connection between Th2 cytokines such as IL-5 and eosinophil infiltration in the nasal mucosa, emphasizing the pivotal role of Th2-derived cytokine polarization in SAR (5). It is accepted that lower pH values are associated with eosinophilic inflammation in the airways (31). Indeed, lower pH values were observed in the oral and nasal exhaled breath condensate (EBC) of children with asthma, atopic dermatitis, and AR with respect to healthy controls (29). Interestingly, the higher score of the domain "eye symptoms" in PRQLQ as well as the higher PSQI total score highlight the higher burden of disease in this cluster in comparison with Clusters 1 and 3, which might be ascribed to a Th2 polarized immune-inflammatory response. These results are not surprising, as we previously demonstrated that QoL and sleep quality may be impaired in children with perennial and seasonal AR (32,33). In particular, ocular allergic symptoms are strongly associated with AR and are increasingly recognized as a disorder that imposes its burden on patients' QoL (34). A significant association between AR and poor sleep quality has also been reported (35). The finding of higher nNO levels in Cluster 2 is in line with a predominant Th2 signature in this group of patients.
Indeed, allergen exposure and consequent inflammation in the nose and paranasal sinuses lead to the activation of mast cells and antigen-specific Th2 cells with the simultaneous production of cytokines, including IL-5 (36). In patients with AR, nNO may be used as a biomarker of eosinophilic inflammation because nasal eosinophils displayed the best correlation with symptoms and inflammation in AR (37). In addition, the levels of nNO positively correlated with the levels of IL-5 in nasal EBC of children with AR, as we previously demonstrated (29). Nonetheless, we could not observe a significantly different percentage of eosinophils at nasal cytology. We can hypothesize that a low accumulation of eosinophils in the nasal mucosa of our children could be due to a low extent of epithelial damage, which in turn could be ascribed to the low (below 30 grains/m³) grass pollen levels monitored throughout the study period.

Cluster 3
Cluster 3 included 15 (18.8%) children, mainly with mild intermittent severity level according to ARIA guidelines, and carried a predominant Th1/Th17 signature in the NL, as suggested by the significantly higher levels of IL-17, IL-23, and IFN-γ in this group than Clusters 1 and 2. The IL-23/IL-17 pathway is related to the local tissue inflammatory response. It is thought that IL-23 plays a significant role in the early stages of allergic inflammatory responses, as it directly induces IL-17 production and neutrophil migration and accumulation (38). Indeed, IL-23 is significant in the antigen-dependent activation of both Th2 and Th17 cells as well as in the active phase of allergic respiratory tract inflammation (39,40). Here, the significance of IL-17 and IL-23 in Cluster 3 and the relevant increase of nasal neutrophils in NL suggest a potential regulatory role of IL-23-Th17 axis in nasal inflammation in this group of children. The concomitant overexpression of IFN-γ, IL-23, and IL-17 in Cluster 3 suggests a potential involvement of IFN-γ in reducing Th2 profile and an endotypic switch toward IL-23/Th17 axis in this group of children with SAR. Indeed, IFN-γ acts not only as a potent activator of the Th1 phenotype but also as a suppressor of Th2 development (41). The suppressive effects of IFN-γ on allergic diseases have been shown to be mediated by various mechanisms, such as the regulation of allergen presentation to T lymphocytes, the differentiation of naive T cells toward Th1 phenotype and/or inhibition of Th2 cell recruitment/differentiation, the suppression of Th2 cytokine release from activated T cells, and the induction of apoptosis in T cells and eosinophils (42)(43)(44)(45).
Overall, the inflammatory endotypes outlined in this study confirm that SAR is a heterogeneous inflammatory disease and suggest that clustering methods could be useful for characterizing heterogeneous features of AR within distinct patients.

Strengths and Limitations
The main strength of our study is the application of an unsupervised statistical method, i.e. cluster analysis, to a population of subjects with a wide range of nasal parameters to identify the possible underlying endotypes. With respect to hierarchical clustering, K-means is computationally faster, relatively scalable, and simple, and can produce tighter clusters. Moreover, clinical characterization followed the international ARIA classification and standardized questionnaires were used to evaluate the disease burden.
We also recognize some study limitations. First, we did not test peripheral specific and total IgE as markers of systemic allergic inflammation; however, comparable levels of eosinophil inflammation in the blood and airways have been previously reported in patients with AR regardless of the pollen season (46). Moreover, we could not perform the cluster analysis on the overall study sample, given that eight samples were considered non-viable due to little or no protein content upon analysis; however, after sensitivity analysis, results did not change. No other Th2 cytokines, such as IL-4 and IL-13, were measured in nasal secretions. A control group could be considered in further studies in order to have a normal reference group. To our knowledge, there is no evidence that rhinoprobe scraping may alter the wash content; however, testing to confirm this hypothesis would have been useful. Finally, our subjects were cross-sectionally analyzed; therefore, validation of the three clusters identified in this study will be necessary to assess whether the results obtained are maintained, especially with higher pollen exposure. Moreover, more extensive longitudinal studies are required to assess cluster stability over time, also in relation to treatment and environmental exposures, and the generalizability of our findings to other populations.

CONCLUSIONS
In conclusion, this is the first study reporting different inflammatory SAR endotypes in a pediatric population based on K-means clustering method, and the first one using a minimally invasive collection of nasal cytokines for this purpose. Endotypes may provide a more accurate description of the inflammatory patterns than phenotypes only; in this regard, SAR-related inflammation in children should be considered multi-dimensionally heterogeneous on the Th1, Th2, Th17 axes. The clinical implications of the current findings would benefit from future studies that are required to assess the stability of inflammatory signatures over time.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by EC Palermo 1, Italy, Approval Number: 10/2017. Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin.

AUTHOR CONTRIBUTIONS
SLG designed the study. GC contributed to data analysis, interpretation, and to draft of the article. SLG, GF, and AL wrote the initial draft. SF and GLM contributed to the interpretation of the results and reviewed the manuscript. VM, ML, RG, MP, and LM mainly contributed to data collection. All authors actively participated in all phases, and agreed to be accountable for the accuracy and integrity of any part of the work.