CLEC16A variants conferred a decreased risk to allergic rhinitis in the Chinese population

Background: Allergic rhinitis (AR) is a chronic respiratory disease. Hereditary factors played a key role in the pathogenesis of the AR. This study investigated the association between CLEC16A variants and AR risk in the Chinese population. Methods: We applied Agena MassARRAY technology platform to genotype five single nucleotide polymorphisms (SNPs) located in CLEC16A in 1004 controls and 995 cases. The association between CLEC16A SNPs (rs2286973, rs887864, rs12935657, rs11645657 and rs36045143) and AR risk were calculated by logistic regression analysis, with odds ratio (OR) and 95% confidence interval (CI). False-positive report probability (FPRP) was also used to assess the significant results to reduce false positives. Multifactor dimensionality reduction (MDR) was completed to assess the interaction between CLEC16A variants to predict AR risk. Results: Totally, CLEC16A (rs887864, rs12935657, rs2286973, rs11645657 and rs36045143) were significantly associated with AR risk. Therein, rs2286973, rs11645657 and rs36045143 were related to a decreased risk of AR in the people ≤ 43 years old, females and the people with BMI≤24, respectively. And rs887864 and rs12935657 were also associated with a decreased susceptibility of AR in the people >43 years old. Meanwhile, in the results of region stratification, rs887864 conferred a reduced risk to AR in the people from loess hilly area. Conclusion: CLEC16A variants conferred a decreased risk to AR in the Chinese population.


Introduction
Allergic rhinitis (AR) is a chronic respiratory disease characterized by allergen sensitization mainly mediated by immunoglobulin E (IgE). It is characterized by runny nose, sneezing, stuffy nose, pruritus and other triggers. Several factors have been implicated in the etiology of AR, including pollen, dust mites, mold, animal dander and other factors (Skoner, 2001). The disease has a major impact on patients' quality of life and healthcare spending (Canonica et al., 2008). Globally, AR affects approximately 10%-30% of the population. And due to genetic susceptibility, and environmental exposure, it was steadily increasing in the world (Passali et al., 2018). Although the etiology of AR has not been elucidated, there was sufficient evidence that cytokines played an important role in the development and process of AR (Sio et al., 2022).
CLEC16A (C-Type Lectin Domain Family 16 Member A) was formerly known as KIAA0350. It is located in the chromosome 16p13.13. This gene encodes a family member containing a C-type lectin domain. The protein CLEC16A was shown to be a cytosolic protein that binds to Vps16A and regulates the receptor expression through autophagy. In a CLEC16A knockout mice through Cre/loxP system, they found upregulation of cytokine and chemokine secretion, imbalance of dendritic cell subsets and change of receptor expression in mice (Pandey et al., 2019). Some researchers also found that the deletion of CLEC16A leaded to the increase of Nrdp1 targeting parkin (Soleimanpour et al., 2014), and Golgi-related CLEC16A negatively regulated autophagy by regulating mTOR pathway (Tam et al., 2017). At present, CLEC16A was recognized as a susceptibility gene for autoimmune diseases, such as multiple sclerosis (Mero et al., 2011), juvenile idiopathic arthritis and rheumatoid arthritis (Skinningsrud et al., 2010). This suggested that CLEC16A may be the main regulator of abnormal autoimmune response.
Single nucleotide polymorphisms (SNPs) were reported to be an important genetic factor in abnormal autoimmune response. In one previous study, Hakonarson et al. identified CLEC16A as a protective gene for type 1 diabetes gene (Hakonarson et al., 2007). Also, Hirschfield et al. observed the association between primary biliary cirrhosis and variants of immune regulatory gene CLEC16A (Hirschfield et al., 2012). Besides, CLEC16A polymorphisms may involve in pathogenesis of allergic diseases (Revez et al., 2016;Ashley et al., 2017), which can increase or decrease susceptibility to AR through interleukin, and other genes (Black et al., 2009). In an article reported by Ferreira et al., they observed CLEC16A rs62026376 associated with increased asthma with hay fever risk (Ferreira et al., 2014). So, it may be a risky variant. However, CLEC16A variants have been rarely reported in AR population from China.
In the present study, we selected five variants of CLEC16A (rs2286973, rs887864, rs12935657, rs11645657 and rs36045143) based on the 1000 genome project. Based on the NCBI website, we found that rs887864, rs12935657, rs11645657 and rs36045143 were located in the intron region of CLEC16A, while rs2286973 was located in the last exon region of CLEC16A. When rs2286973 G mutates into rs2286973 A, the encoded amino acid has not changed. Subsequently, we assessed the relationship between CLEC16A variants (rs2286973, rs887864, rs12935657, rs11645657 and rs36045143) with AR risk in the Han population from Shaanxi, China.

Study population
The study was approved by the Ethics Committee of Shenmu Hospital and the authorization number was sm004. In this study, all the involved subjects were the Chinese Han population. Totally, 1004 controls (346 males and 658 females) and 995 patients with AR (371 males and 624 females) were consecutively recruited from 5 towns (Daliuta Town in plain lands area, Jinjie Town, Langanbao Town, Hejiachuan Town and central urban area in hilly areas) in Shenmu city. The selected criteria for patients with AR were as follows. ①Clinical symptoms: the patient has two or more symptoms such as sneezing, runny nose, itchy nose, blocked nose and so on. ②Nasal symptoms: pale and swollen bilateral nasal mucosa, inferior turbinate edema, and watery nasal discharge. ③Allergen testing: Blood was drawn to examine specific IgE, and the diagnosis of AR was confirmed if the patient had high specific IgE in serum. The exclusion criteria for AR patients were: no history of asthma, no comprehensive disease such as lung cancer, and no infectious disease such as tuberculosis. Also, based on the routine physical examination, the controls had no history of respiratory diseases, including sinusitis, rhinitis and other inflammatory disease. By reviewing the participant's hospital records, we collected the information of participants, such as age, sex, height, weight, place of origin, residence, occupation, past medical history, family medical history, etc. In addition, we also collected the blood routine indicators obtained by the blood analyzer, such as red blood cell distribution width_SD (RDW_SD), basophil count (BASO), eosinophil count (EO), hemoglobin (HGB), mean hemoglobin concentration (MCHC), neutrophil ratio (NEUT_per), lymphocyte ratio (LYMPH_per), eosinophil ratio (EO_per), basophil ratio (BASO_per) and red blood cell distribution width_CV (RDW_CV). Biochemical indicators (fasting blood glucose, FBG) and renal function indicators (uric acid, UA) were measured by drawing venous blood. Written informed consent was collected from all subjects. All procedures comply with the regulations of the Department of Health and Human Services (DHHS) on the protection of human research objects.

Genotyping of SNPs
According to the SNPs data downloaded from the 1000 genome project (http://www.internationalgenome.org/), CLEC16A variants (rs2286973, rs887864, rs12935657, rs11645657 and rs36045143) with minor allele frequencies> 5% and Hardy-Weinberg equilibrium (HWE) > 0.01 were chosen in the global population. According to the specificity principle of primers, these sites meet the requirements. The SNP-related primers (amplification and extension primers) were finished and listed in Supplementary Table S1. 3ml peripheral blood samples were collected from participants in EDTA pretreated tubes. Subsequently, genomic DNA extracted by the GoldMag Whole Blood Genomic DNA Purification Kit (GoldMag Co. Ltd., Xi'an, China), which was used as the PCR amplification template. Subsequently, DNA concentration was estimated by NanoDrop 2000 (Thermo Scientific, Waltham, Massachusetts, United States). Then, using the Agena MassARRAY platform with iPLEX gold chemistry (Agena Bioscience, San Diego, CA, United States), genotyping of CLEC16A variants were proceeded in a 384-well plate. And data management was performed by Agena Bioscience TYPER software, Version 4.0 (Ding et al., 2015).

Statistical analysis
IBM SPSS statistical software version 24 and Microsoft Excel were used to analyze the socio demographic descriptive data of the case group and the control group. The continuous variables were expressed as mean ± standard error (SE) and compared by Student's t-test. Age and sex distribution differences of AR patients and controls were assessed by t-test and chi-square test, respectively. Meanwhile, in the controls, we analyzed whether the selected SNP sites met the Hardy-Weinberg equilibrium by chi-square test. Chi-square test was also used to analyze the genotype and allele distribution of AR patients and controls. The relationship between CLEC16A variants and AR risk was assessed by logistic regression analysis calculated by the PLINK software, version 1.07 (Harvard, Boston, MA, United States). Generally, OR and 95% CI were calculated by logistic regression to analyze the association under four genetic models (codominant, dominant, recessive and log-additive models). To further eliminate false positive in the data, we used FPRP to analyze the significant results. Multifactor dimensionality reduction (MDR) was completed to assess the interaction between CLEC16A variants to predict AR risk. P less than 0.05 was considered as significative.

Results
The basic information of study population and the selected variants Noteworthy, 1004 controls and 995 patients with AR were collected in this study. The mean age of the 1004 controls (346 males and 658 females) and 995 patients (371 males and 624 females) were 43.77 ± 0.26 years old and 42.81 ± 0.33 years old, respectively (Table 1). There was a difference in age between the two groups (p = 0.002), but there was no difference in sex and BMI between the two groups (p = 0.188, p = 0.796, respectively). In addition, there was no significant difference between the two groups in the distribution of wind beach area and loess hilly area (p = 0.738). Supplementary Table S2 listed the clinical parameters of cases and controls.
In Table 2, the basic information of CLEC16A variants (rs2286973, rs887864, rs12935657, rs11645657 and rs36045143) were listed, including SNP-ID, gene, chromosome, position, localization, allele, minor allele frequency, major allele frequency and HWE p-value. The genotype distribution of all loci in the control group met Hardy-Weinberg equilibrium. We listed the function of the selected SNPs predicated by Regulome DB scores and HaploReg. And the primers of CLEC16A variants were shown in Supplementary Table S1.
Also, we conducted clinical parameter analysis of different genotype carriers (Supplementary Table S3). The results showed that the EO content of the patients with the three genotypes (A/ A, G/A, G/G) was significantly different (p = 0.023). As for rs11645657, controls with the three genotypes (A/A, G/A, G/G) had different clinical parameters-HGB and UA (p = 0.007, p = 0.028), whereas, cases with A/A, G/A, G/G genotype had significantly different LYMPH_per (p = 0.035).
The correlation between CLEC16A variants and AR risk was analyzed in the overall population In order to analyze the above differences, we used logistic regression to evaluate the association of CLEC16A variants (rs2286973, rs887864, rs12935657, rs11645657 and rs36045143) with AR risk (Table 3). The results of the allele model showed that rs887864-A conferred a remarkable  0.74-0.99, log-additive: p = 0.043). The above results illustrated that allele-A and genotype-AA may play a protective role in the pathogenesis of AR. And, a significant correlation between rs12935657 and AR susceptivity was observed in the codominant (G/G genotype: OR: 0.24, 95%CI: 0.07-0.83, p = 0.025) and recessive (G/G genotype: OR: 0.24, 95%CI: 0.07-0.84, p = 0.025) models, suggesting that genotype-GG may play a protective role in the pathogenesis of AR. We also did the FPRP analysis for the significant associations of CLEC16A variants with AR risk (Supplementary Table S4). When the statistical power assumed to be 1.00, the noteworthy association between rs887864 and AR risk in the allele, codominant, recessive and log-additive models (FPRP = 0.097, 0.136 and 0.147, respectively), with noteworthiness for OR of 1.50/0.67 and the FPRP cut-off value 0.2.
In Table 5, the sex stratification displayed that rs2286973, rs887864 and rs11645657 were related to AR risk in females (rs2286973: codominant-G/G genotype: p = 0.027; recessive-G/G genotype: p = 0.040; log-additive: p = 0.041; rs887864: allele-A: p = 0.023; codominant-A/A genotype: p = 0.044; dominant-G/A-A/A genotype: p = 0.040; log-additive: p = 0.018; rs11645657: allele-C: p = 0.038; codominant-C/C genotype: p = 0.044; log-additive: p = 0.025), but these loci had no significant association with AR risk in males. The BMI stratification demonstrated that rs2286973, rs11645657 and rs36045143 were correlated with AR risk in the people with BMI≤24 (rs2286973: allele-G: p = 0.012; codominant-G/ G genotype: p = 0.029; dominant-G/A-G/G genotype: p = 0.023; logadditive: p = 0.013; rs887864: allele-A: p = 0.014; codominant-A/A genotype: p = 0.037; dominant-G/A-A/A genotype: p = 0.035; logadditive: p = 0.015; rs36045143: allele-A: p = 0.018; log-additive: p = 0.020). Whereas, these SNPs were not found to be associated with AR risk in the people with BMI>24.  Meanwhile, we evaluated the association between CLEC16A variants and AR risk stratified by region (wind beach area and loess hilly area) in Table 6. In the people from loess hilly area, rs887864 conferred a reduced risk to AR in the codominant (A/A genotype: p = 0.015) and recessive (A/A genotype: p = 0.016) models. Yet, the site was not found to be associated with the risk of AR in the people from wind beach area.

SNP-SNP interaction was used to predict AR risk
To predict the AR risk, we did the SNP-SNP interaction among CLEC16A variants by MDR method (Supplementary Table S5, Figure 1 and Figure 2). In the Supplementary Table S5, the best model was composed of rs2286973 G/A, rs887864 G/A, rs12935657 G/A, rs11645657 C/G and rs36045143 A/G, with cross-validation consistency (CVC) of 10/10 and testing balanced accuracy of 55% (adjusted OR: 2.05, 95%CI: 1.67-2.51, p < 0.000). In the Supplementary Table S6, we have listed all possible genotype combinations. Among them, the high-risk genotypes (ratio greater than 1) were AG*GA*GG*CC*AA, AG*GA*GG*GC*AA, AG*GA*GA*GC*AG, AA*GA*GA*GG*AG, AA*GA*GA*GC*AG, AA*GG*GG*CC*AA, GG*AA*GG*CC*AA and GG*AA*GG *GC*AA. In addition, we showed the interaction between each site with the dendrogram (Figure 1) and the circle graph ( Figure 2). Antagonism was indicated by blue, green and brown. In Figure 2, the interaction between rs2286973 and rs887864 was the strongest, with the information gain (IG) value 0.00%.

Discussion
CLEC16A is found in the 16p13 area next to CIITA gene. It was reported that CLEC16A lacks a functional Ag recognition domain, and Schuster et al. observed that it can induce autoimmune responses in mice, possibly by stimulating Ag expression in thymic epithelial cells (Schuster et al., 2015). More importantly, CLEC16A may be the main regulator of autoimmunity, like Type 1 diabetes (Barrett et al., 2009), primary biliary cirrhosis (Hirschfield et al., 2012), alopecia areata (Jagielska et al., 2012). Also, CLEC16A variations were observed to affect the selection and reactivity of T cells (Schuster et al., 2015). The study provided a link between CLEC16A variants and immune disorders underlying autoimmune risk. Based on the genome wide association analysis, Ferreira et al. identified the risk variations associated with hay fever phenotype asthma, including the sites of CLEC16A (Ferreira et al., 2014). Among them, rs62026376 C allele was observed to be associated with an increased risk of hay fever asthma. In addition, Gao et al. noticed that CLEC16A rs7203459 was a specific susceptibility locus of AR in the Han Chinese population (Gao et al., 2020). In our research, we also observed the significant association between CLEC16A SNPs (rs2286973, rs887864, rs12935657, rs11645657 and rs36045143) and AR risk. It showed the importance of CLEC16A polymorphisms in AR. The analysis of the interaction between rs2286973, rs887864, rs12935657,  rs11645657 and rs36045143 is helpful to discover the risk factors of AR. Our results showed that the best model was composed of rs2286973 G/A, rs887864 G/A, rs12935657 G/A, rs11645657 C/G and rs36045143 A/G for AR. The incidence of AR increased with age (Westman et al., 2012). It has been reported that there is a certain prevalence of AR in childhood (Wise et al., 2018). Once allergic symptoms appear, the symptoms generally persist into adulthood. As much as 80% of AR patients develop symptoms before the age of 20 years (Roditi and Shin, 2018). Not only that, some researchers observed that the influence of some susceptible genes on AR was related to age (Zhang and Xu, 2020;Yin et al., 2021). In this study, we took the average age of 43 as the critical value for stratified analysis. The results showed rs2286973, rs11645657 and rs36045143 were related to a decreased risk of AR in the people ≤43 years old, while no such association was found in people over 43 years of age. These findings suggest that rs2286973, rs11645657 and rs36045143 polymorphisms have agerelated effect on AR risk.
AR was a sex-specific disease (Osman et al., 2007). In one article reported by Rosário et al., boys often appeared allergic reactions in childhood, while girls were more likely to develop allergic disorders (including AR) during sexual development (Rosário et al., 2021). This may be related to sex hormones, lifestyle, dietary differences, professional choice and treatment compliance and other factors (Stübner et al., 1999). What's more, genetic susceptibility to AR varied by sex (Tian et al., 2015). In the present study, rs2286973, rs887864 and rs11645657 were related to AR risk in females, but these loci had no significant association with AR risk in males. The findings reemphasized the importance of sex in the study of the association between genetics and AR risk.
BMI was an influential factor for the high prevalence of AR (Lokaj-Berisha et al., 2015). Not only that, some reporters found

FIGURE 1
The interaction between rs2286973, rs887864, rs12935657, rs11645657 and rs36045143 were shown in the dendrogram. Antagonism was indicated by blue, and green.

FIGURE 2
The circle graph showed that the interaction between rs2286973 and rs887864 was the strongest, with the information gain (IG) value 0.00%. Antagonism was indicated by blue, green and brown.
Frontiers in Genetics frontiersin.org that variants of genes were associated with the susceptibility to AR. Wang et al. found that LPP (rs2030519, rs6780858 and rs2990220) were related to AR risk . Lian et al. observed that Related Orphan Receptor A (RORA)-rs10519067, rs10519068, and rs11071559-were associated with AR patients with BMI ≤24 kg/m 2 (Lian et al., 2022). In the study, rs2286973, rs11645657 and rs36045143 were correlated with AR risk in the people with BMI≤24, illustrated that AR risk was influenced by BMI. Regionality was also a factor affecting AR. In our previous research (Gao et al., 2021), the prevalence of asthma without AR was 1.55 times higher in people living in plain areas than that in people living in hilly areas. Moreover, we found that the prevalence of AR with asthma was 2.00 times higher in plains than that in hills. These results indicated that the incidence of asthma and AR was closely related to the place of residence of the patients. In order to further study the impact of the region on AR, we assessed the association between CLEC16A variants and AR risk stratified by region (wind beach area and loess hilly area). In the people from loess hilly area, rs887864 conferred a reduced risk to AR. Yet, the site was not found to be associated with the risk of AR in the people from wind beach area. A large number of samples were required for subsequent verification.
There are still some deficiencies in this study. Firstly, the study population focused on the Han population, and different ethnic populations should be collected to verify the results. Secondly, the other risk factors (occupational exposures, working condition, family background, physical activity, etc.) were not included and should be considered for future assessment. Thirdly, we used Regulome DB (http://www.regulomedb.org/) and HaploReg predict the function of SNPs (Table 2). Further theoretical experiments were needed to verify that these sites may affect the risk of AR.

Conclusion
In a short, CLEC16A (rs887864, rs12935657, rs2286973, rs11645657 and rs36045143) conferred a decreased risk to AR in the Chinese population.

Data availability statement
The original contributions presented in the study are publicly available. This data can be found here: https://zenodo.org/7284973.

Author contributions
YN and HW completed genotyping and performed the manuscript. ZL and ML took part in genotyping. BS, JL and QW participated in the statistical analysis of the data and modified the manuscript. YL designed the study, cosupervised the project and finalized the manuscript. All the authors have read and approved the final manuscript.

Funding
The study was supported by Natural Science Foundation of Shaanxi Province (2021SF-075), Science and Technology Plan Project of Yulin City (YF-2020-191) and Shenmu Municipal Government Scientific Research Project (2019) No.5.