Genome-Wide Association Study of Non-syndromic Orofacial Clefts in a Multiethnic Sample of Families and Controls Identifies Novel Regions

Orofacial clefts (OFCs) are among the most prevalent craniofacial birth defects worldwide and create a significant public health burden. The majority of OFCs are non-syndromic and vary in prevalence by ethnicity. Africans have the lowest prevalence of OFCs (~ 1/2,500), Asians have the highest prevalence (~1/500), Europeans and Latin Americans lie somewhere in the middle (~1/800 and 1/900, respectively). Thus, ethnicity appears to be a major determinant of the risk of developing OFC. The Pittsburgh Orofacial Clefts Multiethnic study was designed to explore this ethnic variance, comprising a large number of families and individuals (~12,000 individuals) from multiple populations worldwide: US and Europe, Asians, mixed Native American/Caucasians, and Africans. In this current study, we analyzed 2,915 OFC cases, 6,044 unaffected individuals related to the OFC cases, and 2,685 controls with no personal or family history of OFC. Participants were grouped by their ancestry into African, Asian, European, and Central and South American subsets, and genome-wide association run on the combined sample as well as the four ancestry-based groups. We observed 22 associations to cleft lip with or without cleft palate at 18 distinct loci with p-values < 1e-06, including 10 with genome-wide significance (<5e-08), in the combined sample and within ancestry groups. Three loci - 2p12 (rs62164740, p = 6.27e-07), 10q22.2 (rs150952246, p = 3.14e-07), and 10q24.32 (rs118107597, p = 8.21e-07) are novel. Nine were in or near known OFC loci - PAX7, IRF6, FAM49A, DCAF4L2, 8q24.21, NTN1, WNT3-WNT9B, TANC2, and RHPN2. The majority of the associations were observed only in the combined sample, European, and Central and South American groups. We investigated whether the observed differences in association strength were (a) purely due to sample sizes, (b) due to systematic allele frequency difference at the population level, or (c) due to the fact certain OFC-causing variants confer different amounts of risk depending on ancestral origin, by comparing effect sizes to observed allele frequencies of the effect allele in our ancestry-based groups. While some of the associations differ due to systematic differences in allele frequencies between groups, others show variation in effect size despite similar frequencies across ancestry groups.


INTRODUCTION
Orofacial clefts are among the most common birth defects in all populations worldwide and pose a significant health burden. Surgical treatment along with ongoing orthodontia, speech and other therapies, are very successful in ameliorating the physical health effects of OFC, but there is still a significant social, emotional, and financial burden for individuals with OFC, their families, and society (Wehby and Cassell, 2010;Nidey et al., 2016). Furthermore, there are disparities in access to such therapies for OFCs (Nidey and Wehby, 2019), similar to other malformations with complex medical and surgical needs. Some studies have suggested a reduced quality of life for individuals with OFCs (Naros et al., 2018), while other studies have identified higher risk to certain types of cancers (Bille et al., 2005;Taioli et al., 2010;Bui et al., 2018). Thus, it is critical to identify etiologic factors leading to OFCs to improve diagnostics, treatments, and outcomes.
OFCs manifest themselves in many forms -cleft lip alone (CL), cleft palate alone (CP), combination of the two (CLP), and vary in their severity. A large proportion of the causal genes involved with syndromic OFCs (i.e., OFCs that are part of other syndromes) are known (OMIM, https://www.omim.org/ search/advanced/geneMap). However, the majority of OFC cases -including about 70% of CL with or without CP (CL/P) and 50% of CP alone -are considered non-syndromic, i.e., occurring as the sole defect without any other apparent cognitive or structural abnormalities (Dixon et al., 2011). The genetic causes of nonsyndromic OFCs are still largely undiscovered.
There are differences in birth prevalence around the world with respect to OFC, that may be the result of etiological differences among these different populations, both genetic and non-genetic. Populations of Native American and Asian ancestry have the highest prevalence of ∼2 per 1,000 live births, European ancestry populations have an intermediate prevalence of ∼1 per 1,000, and African-ancestry populations have the lowest prevalence, ∼1 per 2,500 (Dixon et al., 2011). Most of the reported GWAS loci are observed within specific ancestry groups, and there are only a few studies that conducted a systematic comparison of genetic differences between populations of different ancestry, largely due to the lack of sufficiently large samples.
In the current study, we expanded this analysis to other loci and additional populations from the Pittsburgh Orofacial Clefts (POFC) study to describe differences in the genetic etiology of OFC across populations of varying ancestral origin. The POFC Multiethnic study is a geographically diverse, family-based study comprising a large number of families and individuals (∼12,000 participants) from multiple populations worldwide including those of European ancestry from the US and Europe, Asians from China and the Philippines, mixed Native American, European and African ancestry from Central and South America, and Africans from Nigeria and Ethiopia. The POFC Multiethnic study sample is therefore a rich resource for examining the genetic heterogeneity across populations. In two previous GWAS studies (Leslie et al., , 2017, a subset of the total POFC participants were analyzed, including 1,319 independent parentoffspring trios, 823 unrelated cases and 1,700 controls. These previous GWASs have focused on unrelated cases and controls, or parent-offspring trios, not fully taking advantage of the fact that OFCs often segregate within families. Here, we include all available participants from simplex and multiplex pedigrees, along with singleton cases and controls for a total sample size of 2,915 affected and 8,729 unaffected individuals. As a result of the increased sample size, our study should provide more power to detect novel OFC loci, while providing stronger evidence for previously reported associations that are common to all OFCs.

Study Sample
The current study sample consists of 2,915 participants with OFCs, 6,044 unaffected individuals related to the OFC cases, and 2,685 controls with no personal or family history of OFC. The participants were recruited from multiple sites from Asia (China, India, Philippines, and Turkey), Europe (Spain, Hungary, Denmark), Africa (Nigeria and Ethiopia), and the Americas (Argentina, Colombia, Guatemala, mainland USA, and Puerto Rico). All study sites provided local IRB approval, then the University of Pittsburgh Coordinating Center (Marazita, PD/PI) received approval from the University of Pittsburgh IRB (FWA # 00006790) for the overall project (IRB approval number CR19080127-001). Genotyping was performed at the Center for Inherited Disease Research (CIDR) at Johns Hopkins University, on the Illumina platform on ∼580 K variants. QC was carried out by the CIDR genotyping coordination center at the University of Washington. Genotypes were then imputed using the 1,000 genome project phase three reference panel, at ∼35,000,000 variants of the GrCH37 genome assembly. More details on the POFC Multiethnic study can be found in the Database of Genotypes and Phenotypes (dbGaP; study accession number phs000774.v2.p1) and at the FaceBase resource for craniofacial research under dataset OFC4: Genetics of Orofacial Clefts and Related Phenotypes (URL https://www.facebase.org/id/1-50DT) (Marazita, 2019).
Ancestry of a subset of unrelated participants from the POFC Multiethnic study was determined based on principal components using genotyped variants, which were then projected onto the larger sample. The first three principal components of ancestry correspond well with the broad geographical origin of the POFC participants, except for those from United States, and were used to classify participants as being of African, Asian, European, or Central and South American origin; this same classification was used for group analyses. Participants from the United States were included in African, European, and Central and South American ancestry groups. Participants from Turkey are part of the European ancestry group. A more detailed description of ancestry determination can be found in the previously published GWASs of POFC Multiethnic study participants Marazita, 2019). The first three PCs provide a workable, although not perfect, separation of participants into the four broad ancestry groups within the POFC Multiethnic study sample. Note that the PCs were not used to account for population substructure in the actual association analysis, therefore we did not need to utilize the higher order PCs.
Genome-wide association was run on the combined study sample (ALL), as well as the four ancestry-based groups, Africans (AFR), Asians (ASIA), Europeans (EUR), and Central and South Americans (CSA). The sample sizes for each ancestry-based group are provided in Table 1. Table 1 contains the median and maximum pedigree sizes, as well as the median and maximum number of affected individuals per pedigree. The AFR group contains mainly singleton CL/P cases and controls, whereas ASIA and EUR have a larger proportion of extended families; the CSA group's pedigree sizes lie somewhere in between.

Phenotype Definition
In our study, we ran GWAS of CL/P on participants from pedigrees in which the OFC-affected members have a cleft lip only (CL), or a cleft lip and a cleft palate (CLP). Pedigrees with members who are affected with a cleft palate only, or a reported history of cleft palate only, were excluded from our analysis. Within these pedigrees, any member with an OFC was considered to be affected for CL/P; unaffected pedigree members from CL/P pedigrees as well as control pedigrees that do not have any history of OFCs were considered to be unaffected for CL/P. The final combined sample (ALL) included 2,221 affected and 7,226 unaffected participants.

Statistical Method
In our study, we selected all available participants from pedigrees ranging from singletons to 35 members in a single pedigree, for conducting case-control association. The program EMMAX (Kang et al., 2010) was used to run the seven GWAS. EMMAX uses a variance component linear mixed model approach to detect association at each variant. The model includes a genetic relationship matrix (GRM) estimated from the observed genotype data to simultaneously correct for both population structure and familial relatedness. Therefore, ancestry PCs are not separately included in the association analysis. This same GRM is also used to estimate the polygenic variance component. Strength of association is then measured using a score test of comparison between the maximum likelihood conditional on observed genotypes at each variant vs. the unconditional maximum likelihood model. Due to the nature of the score test, the reported effect size is not interpretable for a binary phenotype, therefore we do not utilize the effect sizes reported by EMMAX. Instead, another mixed-model association program, GENESIS (Gogarten et al., 2019) was used to calculate approximate effect sizes (betas and standard errors). The same effect allele is used to report effect size and direction consistently across groups, namely the minor allele at each variant as identified in the combined POFC sample.

Filtering of Variants and Identification of Associated Loci
Variants that met genotyping and imputation quality control filters including significant deviation from Hardy-Weinberg equilibrium (HWE p-value 1.0e-05 in European controls) were considered for analysis. These were further filtered based on minor allele frequencies; those with MAF of 2% or more within their respective ancestry groups are included in the GWAS. In this study, loci containing variants with association p-value < 1.0e-06 were reported as positive associations. The observed minor allele frequencies of reported loci were compared to the respective population frequencies from the gnomAD database (Karczewski et al., 2020) to ensure that minor allele frequencies were not impacted by the imputation (Supplementary Table 1).
Note that major and minor alleles may be switched between groups, but this impacts only the direction of effect, not the strength of association.

Identification of Novel OFC Loci
We compared significant and suggestive associations from our GWASs to genes and regions that have been identified through genome-wide linkage or association, as well as by candidate gene studies by multiple previous genetic studies of OFCs. Our reference loci consist of the 29 genomic regions listed in the review article by Beaty et al. (2016), combined with loci reported by six recently published GWASs. The six newer GWASs include (1) combined meta-analysis of parent-offspring trio and casecontrol cohorts from the current POFC multiethnic study sample , (2)  For each comparison, we derived search intervals for detecting overlaps between published/known OFC loci and the associations reported in this current study. For each known OFC gene, GrCH37 base pair positions of start and end transcription sites from the UCSC genome browser were used to measure distance between our locus and the gene. For the 8q24.21 locus, which is a gene desert, we checked whether any of our associated SNPs were located within the 8q24.21 chromosome band. A positive replication is reported where the minimum p-value observed within 500 KB on either side of the gene exceeds 1e-05. For comparing published GWAS loci, separate 1 MB search intervals were created around the reported lead variants, and a positive replication noted if any of our selected associations were within 500 KB of these intervals.

Comparison Between Ancestry Groups
For comparing GWASs of ancestry groups, we first compared association results, namely, effect size and direction, and association p-value of each lead variant within each ancestry group, as well as variants nearby. However, comparing association outcomes at a single variant is not always feasible, nor advisable, as association outcomes are impacted by differences in variant allele frequencies and LD, which differ by ancestry. We therefore looked at intervals of 500 KB to either side of the lead SNPs, and selected the 30 top associations observed within these intervals within each GWAS. Approximate effect sizes for selected variants were computed using the same effect allele at each variant, which is the minor allele for that variant in ALL. GENESIS (Gogarten et al., 2019) reports the beta coefficient for each SNP in terms of allele dosage. In our analysis, a positive effect size indicates that the effect allele increases risk of CL/P and vice-versa.
We hypothesized that the main reason for difference in association outcomes between ancestry groups is due to the fact that variants are common in one population, but rare in others. If an OFC causing variant is observed with similar frequency in multiple groups, we expect to see elevated association signals in all these groups as well. Although the strength of association indicated by a p-value is dependent on sample size and may be reduced in the smaller groups, the magnitude and direction of effect sizes should be similar. If this is not the case, it implies that variant's contribution to the development of OFCs varies depending on the participants' ancestral origin. Thus, we compared the effect size and effect allele frequencies of the top 30 associations observed close to each lead SNP. In order to assess whether allele frequencies are different between groups, we calculated the fixation index F ST for the top 30 associations, using the ancestry-based groups in our ALL sample, and used the 95% percentile of the distribution of F ST as a threshold value to decide whether the effect allele frequencies were similar or different. To compare effect size estimates, we examined whether the point estimate was in the same direction and if the confidence intervals overlapped. We also ran a meta-analysis of the selected variants as a statistical means for measuring whether the effect allele affects CL/P risk similarly across ancestry groups. The meta-analysis procedure reports the p-value for Cochrane's Q statistic at each analyzed variant, testing the hypothesis that effect sizes differ across the groups. PLINK (Chang et al., 2015) was used to run meta-analysis and for calculation of Wright's F ST fixation index.
We used these comparisons to categorize each locus into one of three categories: Category 1: The effect alleles have similar frequencies within the groups, and association p-values are estimates are also similar. Category 2: Effect allele frequency differs across groups, effect size estimates may or may not differ. Category 3: Allele frequencies are similar between groups, but the effect size estimates differ.

RESULTS
GWASs of the combined sample and the four ancestrybased groups yielded 22 associations with p-values < 1e-06, including 10 genome-wide significant associations (i.e., below the Bonferroni threshold of 5e-08) at 18 distinct loci. In some cases, these loci are observed in more than one ancestry-based group (Figure 1, Table 2). Of the 18 loci, 15 are in or near reported OFC genes and reported associations from the published OFC GWASs listed in Methods, and the other three are novel. The novel associations include (i) chromosome 2p12 (rs62164740, p-value 6.27E-07); (ii) 10q22.2 (rs150952246, p-value 3.14E-07); and (iii) 10q24.32 (rs118107597, p-value 8.21E-07). All three showed the strongest association in the combined ALL sample. Four of the 22 associations were observed within a specific ancestry group: these included (i) chromosome 9q33.1 in EUR, rs9408874, p-value 2.59E-07; (ii) 12q13.2 in CSA, rs34260065, p-value 3.87E-07; (iii) TANC2 in EUR, rs1588366, p-value 1.09E-08; and (iv) 17q25.3 in AFR, rs1975866, p-value 3.13E-08. The ASIA subgroup did not yield any association with a significance p-value less than our threshold of 1e-06. The strongest associations were seen at the 8q24 locus with p-value 2.80E-29 in ALL, as well as exceeding genome-wide significance in EUR and CSA. Table 2 lists the 18 associations, with the three novel loci are listed in bold. Figure 1 shows the Manhattan plots of the five GWASs.

Novel Associations and Replication
There were three novel associations that did not correspond to known OFC GWAS loci, including the previously published GWASs of POFC Multiethnic Study participants (Leslie et al., , 2017, GWAS of Europeans (van Rooij et al., 2019), and GWAS of Africans . Along with the three novel loci, we also investigated four additional associations that were reported in Han Chinese GWASs with significance below the genomewide suggestive threshold of 1e-05 for possible roles in the development of CL/P. Interestingly, the four loci reported in Chinese samples do not show association with the ASIA group in our study, rather they are observed with ancestry groups other than Asians in our current study. These 7 loci are listed in Table 3 along with the gene nearest the lead SNP. The nearest gene was identified based on gene locations, specifically, transcription start and end positions obtained from the UCSC genome browser. In Table 3, we also note whether there are regulatory elements such as super-enhancers involved in craniofacial development,  Peak numbers in bold indicate novel loci, other loci showed modest association (p < 1e-04) in GWAS of Chinese (Huang et al., 2019). or if these regions appear to be expressed in fetal craniofacial tissue, as listed in Epigenomic Atlas of Early Human Craniofacial Development (Wilderman et al., 2018;Justin Cotney, 2020).

Comparison of Associations by Ancestry Group
Consistent with previous studies, we observed that many loci show significant association in one ancestry group, but are well below the suggestive threshold in the others. We wanted to explore the possible explanations for these findings and categorized each locus into one of three categories, which we describe below. We were able to easily categorize some of the associated loci into one of these three groups, while others were not as definitive. Table 4 lists the category that best fits each associated locus and summarizes the characteristics of top associations at each locus in brief, with respect to their allele frequencies, main SNP effects, and significance of the Q-statistics. The effect sizes are noted as being different if their confidence intervals do not overlap, and allele frequencies are considered to vary across groups if the F ST value falls above 0.065, which is the 95th percentile of the genome-wide distribution of F ST values in ALL. From these results, one can conclude that consistently significant Q statistics within a locus is indicative of a category 3 locus (when allele frequencies are similar), wherein the same variant contributes differently to CL/P risk in populations with different ancestry. The reverse is almost always true where the variant has similar effects irrespective of ancestry, i.e., those located within category 1 loci. Category 2 loci appear to lie somewhere in between.

Category 1
The significance of association appears to be a function of the sample size. In this category, the effect sizes are similar in all four ancestry groups; therefore, larger ancestry groups (e.g., CSA) and/or larger effect allele frequencies will have greater power to detect an association than the smaller ancestry groups (e.g., Asian and African), and/or if the effect allele is rare. We found this to be the case for six loci, which included 2p12, 5q35.1-q35.2, 8q24.21, 10q24.32 (novel locus), 17p13.1 (NTN1), and 17q21.31-q21.32 (WNT3 and WNT9B). The Q statistic p-value for metaanalysis is not significant at the nominal threshold of 5% at the top associated positions, and F ST values are generally small (0.05 or less). The 8q24.21 locus is a special case where effect allele frequencies of top associated SNPs are much lower in the ASIA group than the others.

Category 3
Association p-values differ between groups. Effect sizes and direction vary substantially indicating that the same allele increases risk of OFC in certain ancestry groups, but has the opposite effect or has no effect on risk of OFC in others. However, effect allele frequency match across groups. Four associations, 9q33.1, 12q13.2, 17q23.2-q23.3 (TANC2), and 17q25.3 are consistent with this category.

Uncertain Category
The 5p13.2-p13.1 locus appears to contain associated SNPs from categories 1 and 3. The effect sizes are not the same for all groups at a few SNPs, however, effect allele frequencies appear to be mostly similar, with F ST values around 0.05. The 8q21.3 (DCAF4L2) locus also contains a few loci that belong in category 3, where effect allele frequencies are the same across all groups (F ST <0.05), while effects differ significantly (p-value of Q statistic < 1e-03). Other variants conform to category 2, where the effect sizes are statistically similar, but allele frequencies differ. An example from each of the three categories are shown in Figures 2-4 below, the 17p13.1 (NTN1) locus from category 1, the 1q32.2 (IRF6) locus from category 2, and the novel association in AFR at 17q25.3 from category 3. Each figure shows a regional plot of association p-values (1st row), effect size and confidence intervals (forest plot in 2nd row left), and allele frequencies of the effect allele in each group (heatmap in 2nd row right). The regional plot spans the top 30 associations observed at that locus, with the top 10 SNPs identified. The forest plots show effect sizes and confidence intervals on the effect sizes (x axis) of the top 10 SNPs (listed on the y-axis) for each group. The Q-statistic p-value (precision above 0.001) is noted below each variant in the forest plots. The heatmap shows effect allele frequencies of the top 10 variants for each group (y-axis); the lead variant is highlighted in red within each heatmap. Variants are ordered by base-pair position from top to bottom in each forest plot of effect size, and allele frequency heatmap.
In Figure 2, the largest sample, CSA produces the highest association p-values, followed by EUR and ASIA, and AFR. The effect sizes are very similar across the groups. The combined ALL sample shows the strongest association, and all four of the ancestry-based groups appear to contribute evidence for association at the 17p13.1 (NTN1) locus. This association conforms to the characteristics that describe category 1.
In Figure 3, CSA and ASIA show elevated association p-values at the 1q32.2 (IRF6) locus, and are the major contributors to the significant association observed for the combined ALL sample, whereas EUR and AFR are weakly associated. The variants' effect alleles are common in CSA and ASIA, and rare in the other two samples. The effect sizes are all similar, therefore, it is likely that the heterogeneity observed at this locus is due to difference in allele frequencies. The IRF6 association conforms to category 2.
In Figure 4, at the 17q25.3 locus, the effect alleles are common in all four groups, however, a significant association is observed only in AFR. This is consistent with the small effect sizes observed in all groups except AFR. The 17q25.3 locus conforms is consistent with category 3.

DISCUSSION
The current study extends previously published GWAS studies by using a substantially larger study sample than the previous GWAS study that included unrelated case trios, and singleton cases and controls. The larger sample size allowed us to identify three new potential loci that confer risk of CL/P, however, the addition of participants also contributed to added genetic heterogeneity, that diminished the strength of association in some cases.
One of the striking findings from OFC GWAs to date is the difference in association signals between populations. Although some loci such as 1q32 (IRF6) have been replicated in almost every population, most loci have not replicated between different populations. The lack of replication of associated loci between different populations can be attributed to differences in phenotype, sample size, or differences in allele frequency of the variants themselves. For example, the 8q24 locus is strongly associated in European populations but has not been detected in Asian populations because the risk alleles in Europeans have low Frontiers in Cell and Developmental Biology | www.frontiersin.org 11 April 2021 | Volume 9 | Article 621482 frequencies in Asian populations, limiting the power to detect associations (Murray et al., 2012). In our current study, we explored the genetic heterogeneity of cleft lip with or without cleft palate as a result of difference in ancestry, by running GWASs of CL/P within participants classified into four ancestry groups. As a result, we found that association outcomes differed substantially by ancestry. Our hypothesis was that variants controlling the risk of CL/P differs by ancestry, although these variants are distributed similarly. To test this hypothesis, we compared the main SNP effects of top associations in each locus to their observed allele frequencies. Our comparison showed evidence that some loci confer risk in every population, and the likelihood of observing an association is low in populations where these variants occur at a low frequency.
More importantly, we identified loci at which, the variants produced significantly different association outcomes, even though they were similarly distributed across the ancestry-based groups. It is likely that these loci do not modulate genetic risk of CL/P directly, but through gene-by-gene and gene-byenvironment interactions.
An analysis of genetic diversity across the sites within each ancestry group showed that the ASIA and AFR groups are most diverse (median F ST 0.02), indicating more distinct population admixture within those samples, while the EUR and CSA groups show lower diversity (median F ST 0.003 and 0.009, respectively). It is likely that the genetic diversity of the ASIA sample was responsible for the lack of genome-wide significant or even suggestive associations, although the sample size is adequate (445 affected and 1,246 unaffected participants). The ASIA sample is geographically diverse, consisting of participants from China, India, and Philippines. The AFR and CSA groups are also geographically diverse (although CSA has a low F ST genetic diversity coefficient), which could have led to weaker associations; there are only two genome-wide significant associations in CSA, even though this is our largest group consisting of 1,050 affected and 2,988 unaffected participants. In a previous whole genome sequence study that included only the CSA participants from Colombia, we observed a genome-wide significant association in chromosome 21q22.3 (Mukhopadhyay et al., 2020), but no corresponding association is seen either in this current study or the two previous studies that used CSA participants (Leslie et al., , 2017. The variants in the chromosome 21q22.3 in Colombian CL/P trios were observed to have similar allele frequencies across the subpopulations that make up the CSA group, but yielded no significant association when non-Colombian CSA participants were analyzed separately. That locus now appears to fit the characteristics of our current study's category 3 loci.
Another contributing factor to the genetic heterogeneity is due to our phenotype itself. CL/P combines the two cleft subtypes: cleft lip only, and cleft lip with cleft palate. It has been shown by previous studies that the two subtypes are etiologically different (Carlson et al., , 2019. In the POFC multiethnic study sample, the frequency of these two subtypes differ among the sub-populations present, which can further reduce the power to detect association.
The exploration of the genetic differences between cleft subtypes is the focus of our ongoing investigation. This current study on the characterization of ancestry-level differences is an important step toward a more fine-grained and detailed investigation into the genetic heterogeneity of OFCs.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by University of Pittsburgh IRB. Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin.