A combination of HLA-DP α and β chain polymorphisms paired with a SNP in the DPB1 3’ UTR region, denoting expression levels, are associated with atopic dermatitis

Introduction: Components of the immune response have previously been associated with the pathophysiology of atopic dermatitis (AD), specifically the Human Leukocyte Antigen (HLA) Class II region via genome-wide association studies, however the exact elements have not been identified. Methods: This study examines the genetic variation of HLA Class II genes using next generation sequencing (NGS) and evaluates the resultant amino acids, with particular attention on binding site residues, for associations with AD. The Genetics of AD cohort was used to evaluate HLA Class II allelic variation on 464 subjects with AD and 384 controls. Results: Statistically significant associations with HLA-DP α and β alleles and specific amino acids were found, some conferring susceptibility to AD and others with a protective effect. Evaluation of polymorphic residues in DP binding pockets revealed the critical role of P1 and P6 (P1: α31M + (β84G or β84V) [protection]; α31Q + β84D [susceptibility] and P6: α11A + β11G [protection]) and were replicated with a national cohort of children consisting of 424 AD subjects. Independently, AD susceptibility-associated residues were associated with the G polymorphism of SNP rs9277534 in the 3’ UTR of the HLA-DPB1 gene, denoting higher expression of these HLA-DP alleles, while protection-associated residues were associated with the A polymorphism, denoting lower expression. Discussion: These findings lay the foundation for evaluating non-self-antigens suspected to be associated with AD as they potentially interact with particular HLA Class II subcomponents, forming a complex involved in the pathophysiology of AD. It is possible that a combination of structural HLA-DP components and levels of expression of these components contribute to AD pathophysiology.


Introduction
Atopic dermatitis (AD) is one of the most common dermatologic illnesses with a known genetic predisposition (Abuabara et al., 2019a;Abuabara et al., 2019b;Chiesa Fuxench et al., 2019;Silverberg et al., 2019). It is often hypothesized to be a disorder involving both skin barrier and immune system dysfunction, the latter of which is thought to be mediated by antigen and centered on the activation of the T H 2 pathway (Gough and Simmonds, 2007).
The Human Leukocyte Antigen (HLA) region of chromosome 6 is frequently associated with immune mediated illnesses (Gough and Simmonds, 2007). Using next-generation sequencing technology (NGS), we recently evaluated associations between HLA Class I genetic variation and AD (Margolis et al., 2021a;Margolis et al., 2021b). HLA Class II molecules, as part of the adaptive immune response, play a major role in the presentation of antigens to CD 4 T cells, so we extended our study to include the characterization by NGS of HLA Class II polymorphisms. Presentation of antigen through the skin occurs via antigen presenting cells that contain receptors formed by highly polymorphic molecules (epitopes) in HLA Class II as HLA-DR, HLA-DQ, and HLA-DP (Couture et al., 2019). These genes and the subsequent receptor epitopes from these genes create highly variable HLA Class II binding groves that interact with antigen(s) (Couture et al., 2019;Gong et al., 2020). The formed epitopes within these grooves have varying affinities for antigen(s) thereby playing an important role in antigen presentation (van Lith et al., 2010;Couture et al., 2019;Gong et al., 2020). The functioning of HLA Class II is, therefore, consistent with the hypothesized immunologic basis of AD.
In the past, large genome wide association studies (GWAS) relied on imputation protocols for HLA and evaluated HLA associations in cohorts of primarily Europeans and Asians with AD (Paternoster et al., 2011;Sun et al., 2011;Weidinger et al., 2013;Paternoster et al., 2015). These reports identified HLAs associated with AD or other allergic illnesses, such as asthma and food allergy, which are diseases commonly seen with AD. These HLA associations were most frequently in the HLA Class II region (Weidinger et al., 2010;Paternoster et al., 2011;Hirota et al., 2012;Weidinger et al., 2013;Paternoster et al., 2015). Given the critical role of HLA Class II genes in the adaptive immune response, prior identification of specific HLA polymorphisms relevant to AD, and the potential role that HLA Class II polymorphisms may play in AD susceptibility/protection, deeper exploration of HLA Class variation is warranted.
Several smaller studies using older immunogenotyping technology have explored the relationship between HLA allelic variation and AD as well as comorbid "atopic" illness like asthma, food allergies, and seasonal allergies with conflicting results (Paternoster et al., 2011;Sun et al., 2011;Weidinger et al., 2013;Margolis et al., 2015;Paternoster et al., 2015). For example, Aron et al. found a strong positive relationship between HLA Class II DR4 and DR7 alleles with atopy and asthma in a Finnish cohort (Aron et al., 1996). HLA-DRB1, -DQB1 and -DPB1 genotypes were shown to be correlated with peanut allergy in a British study (Howell et al., 1999). In a small Japanese study, HLA-DRB1 and -DQB1 alleles were associated with severe AD with high IgE levels (Saeki et al., 1995). In contrast, Affes et al. found no association between AD and HLA-B, -DR and -DQ, whereas HLA-A had a protective effect in a cohort of Tunisian patients (Affes et al., 2007). Park et al. showed an association of HLA-DRB1 with AD in Korean children with food allergy (Park et al., 2012). In a small study of African-Americans, HLA-DRB1 allelic variation was specifically associated with HLA-DR receptor binding-pocket changes that were associated with both the onset and persistence of AD; these associations were not found in Whites (Margolis et al., 2015). Madore et al. reported an association between HLA-DQB1 and peanut allergy (Madore et al., 2013).
Employing advances in sequencing technologies, we performed HLA gene targeted sequencing to fully characterize the HLA genes in an AD case-control cohort [Genetics of Atopic Dermatitis (GAD) cohort], where the goal of this study was to determine whether genetic variation of these genes is associated with the likelihood of susceptibility to or protection from AD. Next-generation sequencing (NGS)-based technologies is now common in clinical practice for HLA genotyping labs supporting transplant programs but is still rarely performed for large-scale disease association cohort studies. We conducted a comprehensive and detailed characterization of HLA genes (including exonic, intronic, and some 3' UTR sequences). Previously, we utilized this cohort to study the Class I HLA genes and found one allele associated with AD, B*53:01, and four alleles (A*01:01, A*02:01, B*07:02 and C*07:02) and six HLA protein residues (HLA-A 9F, HLA-A 97I, HLA-A 152V, HLA-A 156R, HLA-B 163E, and HLA-C 116S) that were associated with protection from AD (Margolis et al., 2021b). Our current study undertook a different approach to study the HLA Class II genes, taking into account the nature of the Class II genes to form dimers of the alpha and beta genes on the cell surface to present peptides. First, we evaluated allelic variation in the GAD cohort for the HLA Class II genes HLA-DRB1, DQA1, DQB1, DPA1, and DPB1. Based on the allelic variation, we assessed the differential frequency of specific amino acid residues comprising the binding pockets of the -DR molecules (Zerva et al., 1996;Raychaudhuri et al., 2012;van Deutekom and Keşmir, 2015). Next, we evaluated the differential frequency of HLA Class II A1-B1 haplotypes formed for the DQ and DP dimers and the pocket residues, critical for peptide binding, within each of the DQ or DP dimers. HLA-DRA is not polymorphic so it is not necessary to characterize the HLA-DRA1/-DRB1 dimers (Matern et al., 2020). Additionally, we also assessed the polymorphisms of the other, non-antigen recognition domains, DQ or DP α2, DQ or DP β2 domains, the transmembrane and cytoplasmic regions, and a 3' UTR of DPB1 single nucleotide polymorphism (SNP) rs9277534, known to be associated with DP expression (Thomas et al., 2012;Schöne et al., 2018). A replication cohort, The Pediatric Eczema Elective Registry (PEER) was also HLA characterized by NGS and used to confirm the associations identified with the GAD cohort. Our set of analyses, which relied on the detailed characterization of HLA genes via NGS, help to further our understanding of the genetic basis of AD. Moreover, this analytic framework can potentially serve as a model for future studies of HLA polymorphisms identified through NGS in a range of diseases.
Pennsylvania Perelman School of Medicine, Children's Hospital of Philadelphia, Pennsylvania State University/Hershey Medical Center, and Washington University School of Medicine in St Louis. All subjects had a history and an exam consistent with AD (cases) or no history of AD (controls). There was no age restriction for enrollment. HLA polymorphisms, as well as AD genetic variation, can be highly dependent on race, so we focused on White and Black Americans in our study . All subject-related information was obtained using a standard case report form that was completed by the subject or by an investigator after subject interview and/or medical record review.
All subjects or legal guardians provided written informed consent or, if appropriate, assent approved by their appropriate Institutional Review Board.
After completion of the study, a replication cohort of 424 AD subjects, called the Pediatric Eczema Elective Registry (PEER) was HLA characterized and used to assess the findings in the GAD cohort. This registry is described in detail is previous publications and compared to the 385 GAD controls (Margolis et al., 2012).

HLA genotyping
DNA was collected using Oragene DNA collection kits (DNA Genotek, Ottawa Canada) as previously reported (Margolis et al., 2012). The five HLA Class II genes (DRB1, DPA1, DPB1, DQA1, and DQB1) for individuals in the GAD cohort were sequenced using targeted amplicon-based NGS with Omixon Holotype HLA ™ V2 kits (Budapest, Hungary). HLA genes were amplified (Qiagen LR PCR kits, Valencia, CA) on a Veriti thermal cycler (ThermoFisher, Waltham, MA), then amplicons from each gene were pooled per sample with library preparation occurring according to the manufacturer's protocol. The final library was sequenced on an Illumina MiSeq (San Diego, CA) using paired-end 2 × 150 V2 chemistry. Omixon Twin ™ (7,000 pairs/locus, v, 2.5.1) and GenDx NGSengine ® (Utrecht, Netherlands, 300,000 pairs/sample) analyzed each set of Fastq files. Genotyping was conducted in the Immunogenetics Laboratory of Children's Hospital of Philadelphia, a CLIA and ASHI accredited clinical laboratory, using clinical protocols with appropriate quality controls and standards.

Data analysis
The NGS sequencing included full genomic characterization of HLA-DPA1, -DQA1, -DQB1 genes and partial characterization for HLA-DPB1 and -DRB1 (exon 2 to the 3' UTR). All results were presented at two field resolution. Protein variations were determined using the IPD-IMGT/HLA database (Robinson et al., 2020). We focused on known binding pocket residues for -DR, -DQ, and -DP as being critical for peptide binding and therefore possibly involved in disease processes.
Allelic frequencies (AF) were based on the number of chromosomes with alleles that coded for the residue variant and were estimated along with 95% confidence intervals (CI). Epitope or Residue frequencies (RF) were based on the number of chromosomes with alleles that coded for the residue variant. Frequencies were estimated separately for those with and without AD. Logistic regression was used to estimate the odds ratio (OR) of having AD, assuming an additive genetic model for the allele or residue. Additional analyses were conducted within racial subgroups. Amino acid residue analysis was restricted to polymorphic residues within the binding pocket of the HLA molecules. Binding pocket residues were identified based on published crystal structures for each HLA molecule, whereby the residues that were found to be within a 4-Å neighborhood of the presented peptide were included (Stern et al., 1994;Chicz et al., 1997;Henderson et al., 2007;Dai et al., 2010;Tollefsen et al., 2012;Kusano et al., 2014;Jiang et al., 2019;Ting et al., 2020).
The mature Class II molecules are composed of an α and a β chain, products of the respective A1 and B1 genes of each of the DQ and DP Class II genes. The α1 domain of the α chain and the β1 domain of the β chain form the binding site. Since both chains are polymorphic, at least for DQ and DP molecules, we explored the possible contributions generated upon formation of the dimer, and therefore the concept of haplotypic analysis. Haplotype analysis was conducted using the BIGDAWG package in R (version 3.6.2), which utilizes the R-routine haplo.stats and haplo.em to estimate haplotypes, to account for the eventual dimerization of particular pairs of α and β chains (Pappas et al., 2016). Significance was assessed using a chisquared test. The results of using the BIGDAWG package was further confirmed by using an independent approach for haplotype generation. Haplotype analysis was only performed for the HLA-DPA1/-DPB1 and HLA-DQA1/-DQB1 pairs of genes. Note that the haplotype analysis refers to the A1 and B1 genes of a single isotype and does not refer to the extended DP~DQ haplotype. Haplotype analysis was not performed for the HLA-DRA/DRB1 pair of genes since DRA is not polymorphic (Matern et al., 2020) and not characterized in our study. The DRB1 gene was therefore evaluated assessing all polymorphisms located on the DRβ chain and participating in the formation of the DR binding pockets. To identify relevant residues on the α/β chains of DP and DQ molecules influencing AD, we used alleles or haplotypes with significant p-values showing opposing directionality for their association with AD, and identified polymorphic residues, focusing on those that participate in pocket formation. The described approach has been previously used to identify residues of relevance for tuberculoid leprosy on the DR molecules (Zerva et al., 1996).
For completeness, all remaining polymorphic residues that were outside of exon 2 of each A1 and B1 genes (exons 1, 3, 4, 5, 6, depending on the particular gene), were analyzed and evaluated for association using logistic regression in R. Specifically, for the DPB1 gene, an additional polymorphism SNP rs9277534 at the 3' UTR was inferred based on HLA-DPB1 genotyping (Schöne et al., 2018). SNP rs9277534 is previously reported to be associated with expression levels of DP molecules and was evaluated as high or low HLA-DPB1 expression using logistic regression for its contribution to protection or disease (Thomas et al., 2012).
To further explore the relationship and the possible independent effect of significant amino acid positions and residues found in different regions of the DPA1 and DPB1 gene, including the A/G polymorphism of rs9277534 SNP, we assessed linkage disequilibrium (LD) among these entities. LD was calculated in the whole GAD cohort using the LD coefficient D (D = p ABp A p B , where p AB is the frequency of the amino acid residue of interest and rs9277534 SNP of interest co-occuring on the same haplotype, p A = frequency of the amino acid residue of interest, p B = frequency of rs9277534 SNP of interest; generally ranging from −0.25 to 0.25 but is dependent upon Frontiers in Genetics frontiersin.org 03 allele frequency). Further, the correlation coefficient, R 2 , was calculated using D to account for the frequency of the alleles in the population (R 2 = D/(p A (1-p A )p B (1-p B ) (Slatkin, 2008). The same strategy was used to evaluate the LD between two amino acid residues in different DPA1 or DPB1 regions, replacing the polymorphisms of rs9277534 SNP with a second amino acid residue of interest in the above calculations.
An additional type of analysis was undertaken involving the possible relationship of T cell epitope (TCE) groups with AD. Each TCE group is defined by a distinct combination of amino acid residues that influence the peptides bound to the HLA-DP protein and presented to T cells. Originally, and in the context of hematopoietic stem cell transplantation, the HLA-DPB1 alleles were organized into TCE groups by experimental methods by Zino et al. (Zino et al., 2007). The definitions of TCE groups was further extended to all alleles using a functional distance method by Crivello et al. . The TCE classification is based on exon 2 amino acid composition and an updated TCE classification is maintained for all alleles as part of the IPD-IMGT/HLA database (Robinson et al., 2020) using the Crivello functional distance method . HLA-DPB1 alleles from both the GAD and PEER cohorts were assigned the TCE group that is defined by the Crivello functional distance method available through the IPD-IMGT/ HLA database. For the novel DPB1 alleles that exist in either the GAD or PEER cohort, all novelties are outside of exon 2, and were assigned a TCE group that corresponds to the known allele that shares exact exon 2 sequence with the novel allele.
The odds ratios were not adjusted for other atopic illnesses like asthma, seasonal allergies, or food allergies because these illnesses are likely on the same causal pathway as noted in studies of the atopic March (Kapoor et al., 2008;Davidson et al., 2019). Our final analysis was at the amino acid level within the binding pocket of the HLA molecules (e.g., a phenotype). We report the Bonferroni correct threshold by number of variants or residues analyzed per HLA gene. We also report the uncorrected p-value. All statistical analyses were conducted using Stata Version 17.0 (College Station, TX) or R (R Foundation, version 3.6.2).
Since the A1 and B1 genes of -DP and -DQ are both polymorphic and exist in particular haplotype combinations (i.e., DQA1~DQB1 and DPA1~DPB1) resulting in heterodimers of α and β chains, the binding sites formed from the α and β chains of a given combination are functionally meaningful. The relative frequencies and p-values of the different DQA1~DQB1 and DPA1~DPB1 haplotypes in our control and AD populations are shown in Table 3. No remarkable differences are observed in the distribution of DQA1~DQB1 haplotypes. A number of DPA1~DPB1 haplotypes appear to have significant differences between the two groups (Table 3). Of note is that one of the DPA1~DPB1 haplotypes (DPA1*01:03~DPB1*04:01) is associated with protection and highly significant (p = 4.76 × 10 −07 ). The same DPA1*01:03 and DPB1*04:01 alleles were found to be associated with protection with significant differences when evaluated as independent alleles (Table 2). Additionally, we noticed (Table 3; Figure 1A) that there are several DPB1 alleles (DPB1*06:01, 18:01 and 104:01) found in a haplotype with the DPA1*01:03 allele that have opposing direction (OR>1), suggesting association with AD.
Adapting the same approach previously applied in a study of tuberculoid leprosy (Zerva et al., 1996), whereby alleles with opposite directionality of association can guide our search for the relevant residues that influence disease process, we identified 14 residues (8, 9, 11, 36, 55, 56, 57, 65, 69, 76, 84, 85, 86 and 87) in the β1 domain to be different between the DPB1*04:01 and the group of DPB1*06:01, 18: 01 and 104:01 alleles ( Figure 1A). Of those residues in the β chain, the residue at position 84 is part of pocket 1, residue at position 76 is part of pocket 2, residues 69 and 76 are part of pocket 4, residues 65 and 69 are part of pocket 7, residues 9, 36 and 55 are part of pocket 9 and residue 11 is part of pocket 6 ( Table 4). In a similar fashion we identified the pocket residues that are different between the DPA1 alleles associated with disease and protection ( Figure 1B). The DPA1*01:03 allele was associated with protection and the other 3 alleles, DPA1*02:01, *02:02 and *03:01, were associated with disease (Table 2). For the DPα chain we find that residues 11 and 66 are part of pocket 6, while residue 31 is part of pocket 1 Frontiers in Genetics frontiersin.org 04  (Table 4). Thereafter, and knowing the different residues of the pockets associated with AD or protection, we evaluated the distribution of these pocket residues in the whole population of control and AD subjects. In the analysis of pocket residue combinations shown in Table 5 we found that amino acid combinations of pocket residues of the DP dimer may be associated with protection or AD. In terms of AD association or protection, it appears that while there are pockets influenced by both DP α and β chain residues, like pocket 1 (α31, β84) and 6 (α11, β11), there are others that are influenced only by residues of the β chain, like pocket 4 (β69 and 76), pocket 7 (β65 and 69) and pocket 9 (β36 and 55). More specifically the combinations of α31M + (β84G or β84V) at pocket 1, α11A+ β11G at pocket 6, β36A+ β55A at pocket 9, and β69K + β76M at pocket 4, are associated with lower risk and therefore protective, while the combination of α31Q + β84D is associated with increased risk of disease (Table 5). The performed analysis was targeted and involved predetermined positions and amino acids generated from the described analytical approach, hence, the only correction factor here is the number of different estimates, that is 10, and shown in Table 5.
Regarding the analysis of the DRB1 gene, we focused on the polymorphic pocket residues of DRβ chain. The DRB1 polymorphisms specific to pocket residues are shown in Supplemental Table 2, none of these differences were statistically significant after p-value correction (n = 63, threshold p-value = 0.0008). Also considering that the DQA1~DQB1 haplotyping did not reveal any significant associations, we evaluated the polymorphic pocket residues of the DQA1 and DQB1 genes independently (Supplemental Table 2). Accounting for the 25 DQA1 and 45 DQB1 comparisons, none of these residues demonstrated a statistically significant difference after correction.
Assessing polymorphisms in the non-exon 2 sequences of the DRB1, DQA1, and DQB1 alleles revealed no significant differences (Supplemental Table 2). However, DPα and β chains, did reveal other additional residues being of significance located in exon 3 and other segments of the molecule like transmembrane and cytoplasmic domains (Supplemental Table 2). These residues were the following: DPα111, 127, 160 and 228 and DPβ96 and 170. To assess whether any of these polymorphisms outside of the α1 and β1 domains were conferring independent risk from those identified within the α1 and β1 domains in the previous analysis we performed an assessment of LD. The amino acids in these positions of the non-α1 and β1 domains were found to be in variable LD with polymorphisms of the α1 and β1 domains (Supplemental Table 3). Since very strong LD is demonstrated between the non-α1 and β1 residues with at least one of the α1 or β1 residues it is unclear as to whether they confer an independent contribution to susceptibility or protection.
Additionally, the frequency of SNP rs9277534 polymorphisms, that denote relative expression of the DPB1 gene, were evaluated. The G polymorphism for rs9277534 SNP located in the 3' UTR of the DPB1 gene, reflecting higher expression, was associated with AD (OR = 1.45, p = 4.71e-04), while the alternative, A, reflecting lower expression, was associated with protection (OR = 0.69, p = 4.71e-04) ( Table 6). It was also noted that neither G nor A was associated with any of the two racial groups included in our study.
Considering the strong linkage disequilibrium of these A/G polymorphisms with DPB1 alleles, the LD of the A/G SNP polymorphism with each of the positions from Table 5 and  Supplemental Table 2 with at least one significant association, was assessed (Table 7) (Schöne et al., 2018). The underlying principle here is that, while DPB1 alleles may be in LD with the 3' UTR SNP rs9277534 polymorphism, that does not mean that individual residues of the DP molecule will necessarily be in LD with either G or A of the rs9277534 SNP. A particular residue may be present on several alleles, some of which may be in LD with the G and some others with A SNP. Indeed, it was observed that while some of these residues are in LD with the A (DPβ84G and β76M) or G polymorphism (DPβ84D), not all residue polymorphisms of the binding pockets, with significance for either susceptibility or protection, are in LD with the rs9277534 SNP (DPβ36A, β55A, β69K) (Table 7). Upon assessing the LD between exon 3 polymorphisms and the rs9277534 SNP, it was found that there is very strong LD between the two (DPβ96, β170) ( Table 7), suggesting interrelated functionalities between the expression of DPβ molecule and the β2 domain itself.
Regarding the analysis of our data using the TCE groups for the DP alleles we found moderate correlation between TCE group 3 and multiple pocket residues to be associated with protection from AD: P1 (α31M + (β84G or β84V); r = 0.53), P4 (β69K + β76M; r = 0.44), P6 (α11A + β55G; r = 0.59) and P9 (β 36A + β84V; r = 0.43), while we also found moderate correlation between TCE group 1 and P1 residues associated with AD (α31Q + β84D; r = 0.40). Table 8 shows the relative association of each one of the TCE groups for protection or susceptibility to AD, whereby TCE group 3 is associated with protection from AD (OR = 0.64, p = 0.00100) and TCE group 1 is associated with susceptibility to AD (OR 1.76, p = 0.0097).
In PEER, a cohort of children with AD used as a replication cohort, we confirmed that children with DPA1*01:03 and DPB1*04:01, and the DPA1*01:03~DPB1*04:01 haplotype, were less likely to have AD as compared to the GAD controls. For this comparison we used controls from the GAD cohort because the PEER cohort was comprised of only AD cases which were used to study the progression of disease over time. As in many genetic association studies, such as GWAS, data from a standard set of controls are often used for different case comparisons because the underlying premise is that the "public" control group is representative of the underlying relevant non-diseased population (Mitchell et al., 2014). We also confirmed the susceptibility role of P1: α31Q + β84D and protective role of P1: α31M + (β84G or β84V) and P6: α11A + β11G. However, the DP β chain residues of pockets 4 and 9 were not confirmed. (Table 9). Furthermore, it was confirmed that the distribution of the SNP rs9277534 polymorphisms A/G in the PEER cohort was such that the A remains to be associated with protection and 'G' with susceptibility. None of the associations between TCE groups and AD were replicated in the PEER cohort. No correction was applied to the p values, as this study was a replication confirming the findings of the GAD cohort.

Discussion
HLA Class II genes are involved in the formation of molecules integral to the presentation of antigens to CD 4 T cells and as a result are part of immune mediated responses and illnesses. We conducted a casecontrol study of individuals with and without AD using high-resolution NGS sequencing of genes in the HLA Class II region. We also evaluated individuals by White or Black race. After statistical adjustment for multiple comparisons, we found no associations between HLA-DRB1, DQA1 or DQB1 alleles or any of their positional residues or amino acids with AD. Analysis of DPA1 and DPB1 alleles, DPA1/DPB1 dimers and pocket residues had significantly different distributions between the control and disease cases. This evaluation allows for assessing the role of positional residues and of specific amino acids in a way that is independent of specific -DP alleles. This evaluation is meaningful because individual alleles may not be significantly associated with AD, while specific positional amino acids that are shared among multiple alleles can demonstrate significant associations between control and disease cases. Considering that the operational entities influencing peptide binding are the pocket residues, this type of analysis allows for a better assessment of the sub-molecular components of an HLA molecule that contributes to disease or protection. The findings were replicated in a second group of AD cases from the PEER cohort for the P1 and P6 pockets and their respective residues. Replication did not confirm the involvement of the P4 and P9 pocket residues. More specifically, P1; α31Q + β84D are associated with a susceptibility role and P1: α31M + (β84G or β84V) and P6: α11A + β11G are associated with a protective role. In summary, -DPA1 and B1 genomic variation is associated with AD but no other Class II genes are associated with AD.
Accounting for the relative expression of the DPs on the cell surface, whereby the residue β84G, which was associated with protection, is in LD with the rs9277534 3' UTR SNP polymorphism indicating lower expression (A), while the residue (β84D) associated with disease is in LD with the SNP reflecting higher expression (G), it becomes apparent that a combination of polymorphisms in the coding region, together with cell surface expression of the DP molecule and the presented non-self antigen, may set the stage for the ensuing immune response that results in protection or disease.
It should also be mentioned that the absolute LD identified between exon 3transmembrane region-3 UTR SNP rs9277534 (residue 96R-170T-SNP A or residue 96K-170I-SNP G) of DPB1 in our population (Table 7), denotes a possible interdependence between these polymorphisms and expression as reflected by SNP rs9277534 in the 3' UTR of the DPB1 gene. It is unclear as to how these non-exon 2 polymorphisms may coordinate functionalities critical for the molecule, but this LD character contrasts to the somehow strong but not absolute LD between the rs9277534 SNP and exon 2 polymorphisms that comprise the part of the DP that serves as a receptor for peptides and T cell receptor interactions (Table 7). Furthermore, residue DPβ11G that is associated with protection is in rather weak LD with exon 3 polymorphisms (β96 and β170, Table 8) and also the rs9277534 SNP A (Table 7). It therefore appears that association with protection may or may not depend entirely on the low levels of expression of the DP molecules. If we assume that protection can be an active state of the immune response, like susceptibility is, then there may be T cell responses of regulatory nature that actively diminish the immune response; these mechanisms may be influenced not only by structural components but also expression patterns and the peptides presented by the DP molecule. Lower expression of these DP structural components can be compatible with the protective role; nevertheless, the details of the interactions between structure and expression are not entirely clear. It should be clarified that in our study we claim associations of either pocket residues or SNPs denoting expression with susceptibility or protection but we do not demonstrate that these residues or SNP polymorphisms are engaged directly in specific mechanisms resulting in disease susceptibility or protection.
In conclusion, the LD observed among polymorphisms in exonic sequences (particularly exon 3 and to lesser extent exon 2) and noncoding regions (intron 2 or 3' UTR) is an indicator suggesting coordinated functionalities maintained through the evolutionary history of the DPB1, connected through the different segments of the DPB1 genomic sequences. A hypothesis, therefore, can be developed that those particular structural components of DPs, along with their increased or reduced expression, form the basis for the pathophysiology of AD. It remains, that because of the LD between the residues and expression components, it is unclear what the exact mechanistic role each may play and what their exact interactions may be. It should be noted that the HLA-DP loci are not in LD with the other HLA Class I loci or HLA Class II loci, as there   are at least four recombination hotspots located between the HLA-DP locus and the next closest HLA locus studied in this project, HLA-DQB1 (Jeffreys et al., 2001;Cullen et al., 2002;Miretti et al., 2005). The DPB1 alleles, amino acid residues and the expression marker are not correlated with the HLA Class I alleles or residues (Margolis et al., 2021b) that were either protective of or associated with AD from this same dataset (data not shown). Therefore, the involvement/ engagement of the Class I and the Class II associated or protective elements in AD pathophysiology are rather independent and do not reflect any linkage disequilibrium effect. In our study, the HLA typing was done by NGS. Unlike previous studies of AD, we identified and fully characterized both the A and the B genes of DQ and DP loci, allowing a better identification of pocket residues. This combined evaluation is important because polymorphisms on both A and B genes contribute to the binding site and therefore influence peptide binding. In the past, in HLA and disease association studies the characterization of the DQA1 and DPA1 genes was not performed routinely, either because it was not feasible at the time or because there was a belief that their polymorphism was limited and therefore inconsequential. Since the majority of HLA and disease associated studies lack this DPA1, DQA1 gene characterization, our understanding and contribution of the possible role of α pocket residues is unclear for the different HLA and disease association studies performed thus far. In our study, the β84D is in combination with α31Q (Table 5) and therefore it is likely that, this combined structural features of the DP molecule, along with its high expression may contribute to disease susceptibility. Recent studies of Type 1 diabetes mellitus provide another example of the importance of characterizing both A and B gene polymorphisms which show associations with both A and B genes of DQ and DP loci (Erlich et al., 2008;Noble, 2015;Enczmann et al., 2021).
The DP pockets and residues associated, in our study, with AD P1β84 and P6β11, have been previously identified by Castelli et al as influencing the binding of synthetic peptides originating from allergens, viral and tumor antigens to DP molecules (Castelli et al., 2002). The P1 and P6 pockets accommodate the main anchor residues of foreign peptides interacting with DPs. Even though the Castelli et al study (Castelli et al., 2002) does not address the contributions of the α chain residues, considering that these pockets are composed of polymorphic α chain residues as well, it is not unlikely that whatever the effect of the pockets, is a resultant of influences originating from both chains. It therefore becomes likely that a combination of DP α and β pocket polymorphisms, as they interact with different antigens, form the complex that initiates T cell responses. Indeed, in our study we find that not only β chain residues are associated with protection or disease, but also α chain residues (α11A, α31Q, and α31M).
Furthermore, the different HLA-DPB1 alleles areorganizedinto T cell epitope (TCE) functional groups (Zino et al., 2007;Crivello et al., 2015). Each one of these TCE groups of DPB1 alleles is characterized by distinct structural features that differentiate one from the other group and influence the peptides bound to the DP protein. Most recently, the immunopeptidome of the alleles that belong to each one of these groups has been characterized (van Balen et al., 2020;Meurer et al., 2021;Laghmouchi et al., 2022). Considering that in our study of AD we have identified HLA-DPA1 and DPB1 alleles and specific residues relevant to the disease, we were interested to investigate as to whether any of the structural elements (alleles or residues) of the DP molecules correlated with the TCE groups, therefore, establishing a relationship   TABLE 7 Relationship between DPB1 positions with at least one residue of significance for protection/association of AD (Table 5; Supplemental Table 2) with DPB1 alleles found in this study and the 3' UTR SNP rs9277534. Alleles with frequency >0.05 are bold and underlined. (p) = protection; (s) = susceptibility. between the DP structural elements of our study with the functional grouping of DPs including the peptide repertoire that characterizes each one of the TCE groups. Indeed, we found in the GAD cohort that those particular alleles belonging to a specific TCE group, having common structuralfeaturesandauniqueimmunopeptidomeboundtothesealleles, correlate with the structural elements we found through our independent approach in the same cohort. However, due to lack of reproducibility with the PEER group we do not expand on this observation and refrain from a final conclusion. Nevertheless, it is very likely that the structural elements, identifiedasbeingrelevantwiththedisease,correlatetoparticularDPTCE groups. As such, the immunopeptidome characterizing the different TCE groups may be instructive in identifying peptides involved with atopic dermatitis; the exact structural details within a particular TCE group that may play a role in atopic dermatitis remain unclear. Atopic illnesses, like atopic dermatitis, are associated with T H 2 dominant cellular response and often IgE antibody production in response to the presence and presentation of specific allergens. The activation of the T H 2 results in the production of IL-4 and IL-13 cytokines. These cytokines are associated with the acute phase of AD and can also result in epidermal barrier dysfunction (Gong et al., 2020). The physiologic effect of these cytokines are diminished by IL-4 blocking agents that are currently being used to treat moderate to severe AD (Beck et al., 2014). The strength of the immune response to an antigen is likely related to genetic and environmental factors as well as the type and intensity of allergen.
Allergen presentation is influenced by HLA Class II and, in this study, we demonstrate that HLA Class II genetic variation at the level of allele and receptor epitope is associated with both an increased risk and decreased risk of AD. We do not currently know which allergens bind and with what intensity to the epitopes effected by the described genetic variation. We did previously show in an in silico study that HLA Class II epitope variation could result in differential binding of auto allergens suspected of being  (Gong et al., 2020). However, when an allergen is suspected to be associated with AD, it should now be possible to determine how it might interact with the epitopes described in this study and more precisely determine if it is likely to be associated with AD. Furthermore, agents that influence -DP binding might have an impact on AD.
As with all epidemiologic studies, this study has limitations. Although this study is the largest study of its kind to use NGS to genotype HLA Class II genes and then evaluate their associations with AD, we were underpowered to evaluate less common HLA alleles. For this reason, we a priori limited our analyses to alleles with a frequency of ≥0.05. We believe that this strategy is important for understanding the effect of HLA on the population at large. However, it is possible that important information about how HLA Class II allelic variation affects immune response with respect to AD might be gleaned by evaluating less frequent alleles. To evaluate these alleles, large cohorts will be needed that are designed to focus on less common alleles. Study subjects were classified by self-described race and not by genetic ancestry. It is possible that genetic admixture is not fully accounted for by self-described race and could have added bias to our results. However, it is important to note that we have previously shown that in an evaluation of a similar population, self-described race was highly concordant with an assessment of genetic ancestry (Margolis et al., 2012;Abuabara et al., 2020;Biagini et al., 2022). In addition, many of our findings had similar effect estimates in both races. The origin of the GAD is primarily from academic dermatology offices and as such patients were more likely to have treatment resistant AD than patients seen in general practice. As a result, it is possible that our results may not generalize to all clinical sites and to individuals with less treatment resistant AD. Our genotyping did not phase the A/B genes of the DQ or DP loci, so we used a well-known algorithm, haplo.stat to form our DPA1~DPB1 and DQA1~DQB1 haplotypes. It is possible that there was some inaccuracy in the estimation of haplotypes. However, as a sensitivity analysis, we used an alternative approach for coding the haplotypes that does not rely on the EM algorithm and found very similar results (see Supplemental Table 4). Finally, it is possible that our findings are not primarily associated with AD but other co-morbid illnesses that are part of the atopic March like asthma and seasonal allergies (Zino et al., 2007). However, it is generally believed that AD occurs before these illnesses (Kapoor et al., 2008).
In summary, we conducted a case-control study of individuals with AD and controls using NGS for the characterization of the three classic HLA Class II genes. NGS allows for a more thorough interrogation of Class II genes and found that the DP molecule is relevant to AD and that specific pockets and residues within these pockets, along with their expression levels, play a critical role in both susceptibility and protection.

Data availability statement
The HLA Class II genotypes for the GAD and PEER cohorts presented in the study are deposited in the Zenodo repository, https:// zenodo.org/record/7565999 (DOI: 10.5281/zenodo.7565999).

Ethics statement
The studies involving human participants were reviewed and approved by University of Pennsylvania Institutional Review Board. Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin.

Author contributions
DJM and DSM designed and oversaw the study. DJM, OH, and AY oversaw the collection of samples. OH prepared the samples. JW, AD, NT, and DF performed the HLA genotyping. AD, TM, DF, IK, GD, and JD analyzed the sequencing data. DJM, JD, NM, RB, OH, and TH conducted the statistical analyses. DJM, JD, NM, and DSM wrote the paper. All authors reviewed the results and approved the final version of the manuscript.

Funding
This work was supported in part by grants from the National Institutes for Health (NIAMS) R01-AR060962 (PI: DJM), R01-AR070873 (MPI: DJM/DSM), and School of Medicine Designated funds (DJM). The PEER study is funded as the Atopic Dermatitis Registry by Valeant Pharmaceuticals International (PI: DJM). The sponsors had no role in the design and conduct of the study; collection, management, analysis and interpretation of the data; preparation, review or approval of the manuscript; and the decision to submit the manuscript for publication.