Exploring the Pool of Pathogenic Variants of Amelogenesis Imperfecta: An Approach to the Understanding of Its Genetic Architecture

Objective: To identify which genes are associated with the clinical phenotype of amelogenesis imperfecta (AI) and to elucidate which of these genes participate in the determination of isolated and syndromic forms. Methods: In this review, all data on mutations described in AI-related genes were obtained from HGMD® Professional. The data in relation to the mutations, inheritance, phenotype, type of AI and country were supplemented with information from the literature. The identity codes and frequency values were obtained from the dbSNP, ClinVar and OMIM databases. The percentage of specificity (PE) was determined for each gene. Results: HGMD® describes 27 genes involved in AI, which we propose to group into 5 categories: (1) genes whose mutations are associated only with isolated AI, (2) genes whose mutations cause only syndromic AI, (3) genes with both mutations that cause isolated AI and mutations responsible for other pathologies, (4) genes with mutations responsible for syndromic AI and mutations that cause other pathologies, and (5) genes with mutations that cause isolated AI and mutations that cause AI associated with syndromes and other pathologies. Using the PE calculation, the genes were ranked into 5 specificity groups. The genes of category 1 are specific for isolated AI, while the genes of categories 2 and 4 are non-specific. Interestingly, we observed that mutations in some genes were associated with different types of cancer. Conclusion: The ACP4, AMTN, MMP20, ODAPH, RELT, SLC24A4 and SP6 genes participate in causing isolated AI, and the CNNM4, DLX3 and FAM20A genes participate in causing syndromic forms of AI.


INTRODUCTION
Amelogenesis imperfecta (AI) is a group of hereditary malformations of dental development that affect the structure and chemical composition of the enamel of most teeth (1). They occur during the process of odontogenesis, specifically when tooth enamel is formed. This condition is rare, and the different kinds of AI have a combined prevalence ranging from 1/4,000 in Sweden to 1/14,000 in the United States, depending on the demographics of the population studied (1,2).
Defects in the secretory stage of amelogenesis lead to the formation of enamel with decreased or hypoplastic thickness. If failure occurs in the maturation stage, the enamel can retain proteins or present defects in mineralization, resulting in opaque, whitish tissue with lower hardness than normal tissue, that is, in hypomineralized enamel (2,3).
According to the classification of Witkop C. (1989), four main types of AI are distinguished: hypoplastic, hypomature, hypocalcified, and a combined hypomature/hypoplastic AI phenotype with taurodontism. When considering specific enamel traits and inheritance patterns, there can be 15 different AI subtypes (3). Currently, hypomature and hypocalcified AI are included in a single type called hypomineralized, and genetic testing is necessary to distinguish them (4).
In families affected by AI, the trait may present as a single phenotype, or it may be associated with other oral or extraoral traits, forming part of a syndrome (6,7). Given that complete deciduous dentition is present in the mouth at ∼2 years of age, it is clinically possible to observe hereditary defects in enamel development very early in most cases, before the appearance of systemic manifestations. At the time of the initial diagnosis, affected enamel may be the only sign present (8). This has great relevance for affected families because even when it is not possible to discriminate between isolated or syndromic AI through clinical-radiographic examination, patients with AI associated with a syndrome benefit from an early diagnosis that specifies the genetic cause of their disease, which contributes to improving its treatment and prognosis (8).
To date, the literature has not discussed which genes participate strictly in isolated AI and which genes are responsible for syndromic AI, whether a certain gene can be involved in cases of both isolated and syndromic AI, or which genes can cause other pathologies related or unrelated to AI, as reported in the clinical phenotype of dental agenesis (9).
In the context described above, we set the following specific objectives: (1) to identify the genes associated with the clinical phenotype of AI in the Human Gene Mutation Database (HGMD®) (10); (2) to construct a catalog to describe and characterize the pathogenic variants associated with each gene in relation to the type of inheritance, clinical phenotype, country/ethnicity of the carriers, identification number in the single-nucleotide polymorphism database (dbSNP), and population frequency; (3) to establish what other pathologies/conditions are associated with each of these genes; (4) to determine the percentage specificity of each gene in determining the clinical phenotypes of isolated and syndromic AI; and (5) to show that some genes involved in AI can present mutations associated with some types of cancer.

Source of Information
This review was carried out using the professional database of human gene mutations (HGMD® Professional 2020.2 trial version) (10), which was consulted in October 2020, as the main source of information. The literature was obtained from this database, and the articles associated with causal mutations present in each gene involved in AI were selected and analyzed. Only the first report for each mutation was included. Subsequently, the information collected from the database and that obtained from the literature were analyzed in detail by the authors to build the catalog of AI mutations. This research was conducted in full accordance with the ethical principles of the Declaration of Helsinki and with local regulations.

Catalog of AI Genes and Causal Mutations
To fill out the catalog of AI genes and causal mutations, the HGMD® database was consulted for the "amelogenesis imperfecta" phenotype, which showed that 27 genes were associated with this clinical phenotype (10). For each of the 27 genes, data were collected on all associated diseases/phenotypes; the complete record and number of mutations classified by HGMD® as DM, FP, DP, and DFP [DM = mutation reported to be disease-causing, FP = functional polymorphism in vitro/laboratory or in vivo, DP = polymorphism associated with disease, and DFP = polymorphism associated with a disease with additional supporting functional evidence]; changes at the nucleotide level and at the protein level, according to the standard nomenclature of the Human Genome Variation Society (HGVS); and the bibliographic reference in which the mutation was cited for the first time. The inheritance pattern and the clinical diagnosis of the type and subtype of AI reported for each mutation, as well as the country/ethnicity of those affected, were manually corroborated by the authors through an analysis of the literature. When available, the reference SNP (rs) number from dbSNP or the identification number (VCV) in ClinVar, and the population frequency of the mutations, were obtained from the dbSNP, ClinVar, and OMIM databases.

Specificity of AI Genes
To calculate the percentage specificity (PE) of a causal gene in determining the AI phenotype in isolation, the number of mutations underlying isolated AI reported for that gene (numerator) was divided by the total number of mutations reported for the same gene across all associated clinical conditions/diseases/phenotypes, isolated AI included (denominator), and the result was expressed as a percentage.
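The calculation can be sketched in a few lines of Python. This is only an illustration, not the authors' script: the function name and the example counts are hypothetical, and the denominator is read as the gene's total mutation count (isolated AI included), an assumption chosen because it reproduces the 100% and 0% values reported for categories 1 and 2.

```python
def percentage_specificity(isolated_ai: int, total: int) -> float:
    """Percentage of a gene's reported mutations that underlie isolated AI.

    Assumption: `total` counts every mutation reported for the gene,
    including those that cause isolated AI.
    """
    if total <= 0:
        raise ValueError("total must be a positive mutation count")
    if not 0 <= isolated_ai <= total:
        raise ValueError("isolated_ai must lie between 0 and total")
    return 100.0 * isolated_ai / total


# Hypothetical counts, for illustration only (not taken from the catalog).
only_isolated = percentage_specificity(isolated_ai=5, total=5)
only_syndromic = percentage_specificity(isolated_ai=0, total=7)
mixed = percentage_specificity(isolated_ai=3, total=4)
print(only_isolated, only_syndromic, mixed)
```

A gene whose mutations all cause isolated AI scores 100%, and a gene with no isolated-AI mutations scores 0%, whatever else its mutations cause.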

Catalog of Pathogenic Variants of AI
The catalog shown in Supplementary Table 1 was constructed with each of the above genes ordered alphabetically (except the SP6 gene) and each of the pathogenic variants reported by HGMD®. A total of 304 mutations constituted the pool of pathogenic variants associated with isolated AI (181 mutations; 22.4%) and syndromic AI (123 mutations; 15.3%) reported to date. The total number of mutations in the 27 genes, including those associated with pathologies/conditions other than AI (502 mutations, 62.3%), was 806.
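As a quick consistency check on these counts, the totals and shares can be recomputed directly. The sketch below uses only the three counts stated in the text; the variable names are illustrative. The recomputed shares agree with the reported percentages to within about 0.1 percentage point.

```python
# Mutation counts as stated in the text.
pool = {
    "isolated AI": 181,
    "syndromic AI": 123,
    "other pathologies/conditions": 502,
}

total = sum(pool.values())
ai_pool = pool["isolated AI"] + pool["syndromic AI"]

# Share of each group relative to all mutations in the 27 genes.
shares = {label: 100.0 * n / total for label, n in pool.items()}

print(f"AI pool: {ai_pool} mutations; all mutations: {total}")
for label, share in shares.items():
    print(f"{label}: {share:.2f}%")
```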
Supplementary Table 1 also allows us to determine which types and subtypes of AI were linked to certain syndromes and/or pathologies/conditions. For example, hypoplastic AI was part of renal enamel gingival syndrome (FAM20A), AI with brachyolmia (LTBP3) and hypomagnesemia with hypercalciuria and familial nephrocalcinosis (CLDN16). In trichodento-osseous syndrome (DLX3), we found hypoplastic AI and a combined phenotype of hypoplastic/hypomature AI. Jalili syndrome (CNNM4) presented hypoplastic AI, hypomature AI, and a combination of hypomature/hypoplastic AI (Supplementary Table 2).
The pool of isolated and syndromic AI mutations came from cases studied in approximately 47 countries. The countries where the most cases/mutations have been reported were Turkey (18.4%), the United States (9.9%), France (7.9%), Pakistan (6.3%), Korea (5.3%), China (4.9%), and the United Kingdom (4.6%) (Supplementary Tables 2, 3). The remaining countries reported between 0.3 and 3% of the cases/mutations. The distribution of mutations described in AI worldwide was quite broad, covering five continents (Supplementary Table 1, Supplementary Figure 1). Most cases were concentrated on the continents of Asia, Europe, and North America. In Central and South America, the cases contributed 2.6-3% of the mutations reported (Supplementary Figure 1). Oceania and Africa represented 0.7 and 3% of the mutations described, respectively (Supplementary Table 4).
In terms of frequency, several mutations of genes that cause isolated and syndromic AI were not yet reported in the databases, as was the case with the pathogenic variants of the AMTN, LAMA3, STIM1, TP63, and SP6 genes (Supplementary Table 1). Other genes had mutations with frequency values of 0.0000, attributable to the small size of the sampled population. In several genes, the minor allele frequency was 0.000004; however, there were mutations in the AMBN, FAM83H, MMP20, and RELT genes that had frequency values equal to 4, 0.2, 0.16, and 0.12%, respectively, much higher than expected for a low-prevalence pathology such as AI (Supplementary Table 1).

Findings Based on the Mutational Spectrum of AI Genes
Categorization of AI Genes According to the Distribution of Causal Mutations
Table 1 and Figure 2 show the detailed analysis of the 27 genes and the distribution of their mutations causing isolated AI, AI associated with syndromes, and other pathologies/conditions (syndromic and non-syndromic). According to this distribution, the 27 genes could be categorized as follows:
1) Genes whose mutations were associated only with the isolated AI phenotype, such as ACP4, AMTN, MMP20, ODAPH, RELT, SLC24A4, and SP6 (light blue bars in Figure 2).
2) Genes whose mutations caused only syndromic AI, such as CNNM4, DLX3, and FAM20A (pink bars in Figure 2).
3) Genes that had mutations causing isolated AI and mutations responsible for pathologies/conditions other than AI, such as AMBN, AMELX, COL17A1, ENAM, FAM83H, GPR68, ITGB6, KLK4, LAMA3, and LAMB3 (light blue and purple bars in Figure 2).
4) Genes with mutations responsible for syndromic AI and mutations that caused other pathologies/conditions, such as CLDN16, CLDN19, LTBP3, SLC10A7, STIM1, and TP63 (pink and purple bars in Figure 2).
5) A fifth category formed by the WDR72 gene, which had mutations that caused isolated AI, AI associated with syndromes, and other pathologies/conditions (light blue, pink, and purple bars in Figure 2).
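The five categories amount to a decision rule on three mutation counts per gene. A minimal Python sketch of that rule follows; the function name and the example counts are hypothetical, and the handling of combinations the text does not describe (for instance, isolated and syndromic AI mutations with no other pathologies) is an assumption.

```python
from typing import Optional


def ai_gene_category(isolated: int, syndromic: int, other: int) -> Optional[int]:
    """Map a gene's mutation counts to categories 1-5 described in the text.

    Returns None for combinations the text does not describe
    (an assumption of this sketch).
    """
    has_isolated, has_syndromic, has_other = isolated > 0, syndromic > 0, other > 0
    rules = {
        (True, False, False): 1,  # isolated AI only
        (False, True, False): 2,  # syndromic AI only
        (True, False, True): 3,   # isolated AI plus other pathologies
        (False, True, True): 4,   # syndromic AI plus other pathologies
        (True, True, True): 5,    # isolated AI, syndromic AI, and other pathologies
    }
    return rules.get((has_isolated, has_syndromic, has_other))


# Hypothetical counts, for illustration only.
print(ai_gene_category(isolated=6, syndromic=0, other=0))
print(ai_gene_category(isolated=4, syndromic=2, other=3))
```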

Percentage Specificity of AI Genes
Categories 1 and 2 are all or nothing: they separate the genes that determine the clinical entity of isolated AI from those genes that participate in syndromic AI, a more complex phenotype where other tissues or organs may be affected in addition to craniofacial structures. Categories 3-5 all include genes with mutations that cause isolated and/or syndromic AI in addition to mutations that cause other pathologies/conditions, making these genes less specific in the determination of the pathology, since their mutational spectra include conditions other than AI. As shown in Table 1, Figure 1, and Supplementary Table 5, we determined the percentage specificity for each of the 27 genes to discern how many of them participate in the determination of AI pathology in isolation or non-syndromically. The ACP4, AMTN, MMP20, ODAPH, RELT, SLC24A4, and SP6 genes were in category 1, with 100% specificity. These were the only "totally specific" genes whose mutations caused isolated AI (Figure 1, Table 1; Supplementary Table 5). The genes ENAM, AMELX, and FAM83H, which belonged to category 3 and presented values of 92-97% specificity, made up the "very specific" group. The genes GPR68, AMBN, ITGB6 (category 3) and WDR72 (category 5) had specificity values of 50 to 83%, which would allow them to be considered "moderately specific" (Figure 1, Table 1; Supplementary Table 5). Likewise, the COL17A1, LAMB3, KLK4, and LAMA3 genes, included in category 3, could be considered "not very specific", with specificity percentages of 3-40%. In contrast, the genes CNNM4, DLX3, FAM20A (category 2), CLDN16, CLDN19, LTBP3, SLC10A7, STIM1, and TP63 (category 4) did not show specificity to determine the AI phenotype in isolation (Figure 1, Table 1; Supplementary Table 5).
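The five specificity groups can be expressed as a simple banding of the percentage specificity. The sketch below is illustrative only: the text reports observed ranges (100%, 92-97%, 50-83%, 3-40%, and 0%) but no explicit cut-offs, so the boundaries used for values falling between those ranges (90% and 50%) are assumptions, as is the function name.

```python
def specificity_group(pe: float) -> str:
    """Assign a percentage specificity to one of the five groups.

    Observed ranges come from the text; the cut-offs between them
    (90 and 50) are assumptions of this sketch.
    """
    if not 0.0 <= pe <= 100.0:
        raise ValueError("percentage specificity must be between 0 and 100")
    if pe == 100.0:
        return "totally specific"
    if pe >= 90.0:   # observed: 92-97%
        return "very specific"
    if pe >= 50.0:   # observed: 50-83%
        return "moderately specific"
    if pe > 0.0:     # observed: 3-40%
        return "not very specific"
    return "non-specific"


for value in (100.0, 95.0, 50.0, 40.0, 0.0):
    print(value, specificity_group(value))
```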
A notable observation from Table 1 is that, according to HGMD®, some AI-related genes in category 3 (GPR68, KLK4, and LAMB3) and in category 4 (STIM1 and TP63) had mutations associated with susceptibility to lung cancer, prostate cancer, cervical cancer, Kaposi's sarcoma, and lung adenocarcinoma, respectively.

DISCUSSION
In this study, we report the results of a detailed analysis of the genes related to isolated and syndromic AI performed with HGMD®, considering all the pathogenic variants of these genes described to date. The main purpose was to elucidate which genes are specifically associated with the clinical phenotypes of isolated and syndromic AI.

Genes Involved in AI According to HGMD®
For each gene, HGMD® provides the mutations expressed in the standard nomenclature, the total number of reported mutations, and the type of mutation. Information on the number of mutations described for each disease or phenotype associated with the gene of interest and the pathogenicity and population frequency of each mutation can also be obtained from several databases, if available. Using these data for the 27 genes that, according to the data extracted from HGMD®, participate in AI, we constructed a catalog of mutations that informs about the genetic architecture of this enamel pathology.

Findings Gleaned From the Catalog of AI Mutations
When genes involved in AI are mentioned, generally no distinction is made between those that cause isolated AI and syndromic AI. It is assumed that the genes ENAM, DLX3, FAM83H, KLK4, MMP20, WDR72, FAM20A, ODAPH (C4orf26), SLC24A4, and AMELX are responsible for non-syndromic conditions that represent hereditary enamel development defects (5,11-15). However, the findings of our study and those of new reports show that mutations in these genes also cause clinical phenotypes of AI associated with syndromes and other pathologies/conditions (8,16,17). In this regard, it is important to explain that, based on the percentage specificity determined, the AMELX, ENAM, and FAM83H genes did not belong to the "total specificity" group as expected. This is because AMELX has mutations associated with cleft palate/lip and with unspecified enamel defects, probably of environmental origin, that do not correspond to AI (Table 1). Some variants of the ENAM gene (and other genes that encode enamel proteins) have been associated with susceptibility to dental caries, a multifactorial oral pathology with a completely different etiopathogenesis from AI, and on the other hand, variants of the FAM83H gene may be associated with increased intraocular pressure (Table 1).
In our opinion, the problem of association with syndromes lies in the fact that many articles that reported cases of AI, especially the oldest ones, did not contain an exhaustive clinical analysis of biochemical parameters and other tests that would have allowed them to diagnose a syndrome. Some cases reported by dentists contain very good descriptions of oral phenotypes but lack medical examinations, while some cases reported by physicians include many specialized medical examinations but lack a complete oral clinical-radiographic examination. This increasingly reveals the urgent need for multidisciplinary teams to collaborate, sharing information for the benefit of the patient.
Specifically, among the 80 pathologies and syndromes described by Wright et al. (6) that present enamel defects, the diagnosis of AI is mentioned only in Jalili syndrome (OMIM 217080) and rhizomelic dysplasia (OMIM 610319), with the hypomature type specified in the latter. The most mentioned conditions are hypoplasia, defects, hypocalcification, hypomaturation, hypomineralization, abnormalities, dysplasia, discoloration, and pitting of the enamel. It is worth asking how many of these conditions truly represent AI. Could AI form a clinical spectrum going from defects in the development of the enamel (DDE, hypoplasia, diffuse and demarcated opacities) to localized circumscribed hypoplastic enamel defects (LHED) such as those caused by ENAM mutations: c.1258_1259insAG and c.1020_1021ins21bp in the heterozygous state (18,19), then molar-incisor hypomineralization (MIH) and finally to AI?
The catalog shown in Supplementary Table 1, which provides rapid access to all the genes of interest, is important for those who perform genetic counseling, as well as for clinicians in general, whether they are doctors or dentists, because the presence of AI can indicate association with syndromes when the genes in category 2 (CNNM4, DLX3, FAM20A) or category 4 (CLDN16, CLDN19, LTBP3, SLC10A7, STIM1, and TP63) are mutated. Mutations in the category 3 genes AMBN, AMELX, COL17A1, ENAM, FAM83H, GPR68, ITGB6, KLK4, LAMA3 and LAMB3, and the category 5 gene WDR72 can be suspected to be associated with other pathologies/conditions of varying severity.
Approximately half of the AI genes have compound heterozygous mutations, with CLDN19, ACP4, LTBP3, and MMP20 showing the most. These types of mutations occur when different variants at the same locus are inherited from each parent by a descendant. For this situation to occur, the allelic frequencies of both variants (or at least one) must be "relatively high" in the population; for example, in the MMP20 gene, the mutation c.126+6T>G has a frequency of 0.002176 and is found in compound heterozygosity with the variant c.954-2A>G of unknown frequency (Supplementary Table 1). It is possible that compound heterozygosity in AI is underestimated since many of the published cases have been resolved using only Sanger sequencing. In this context, it would be interesting to re-evaluate the genotype of these cases using next-generation sequencing (NGS).
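The dependence on allele frequency can be illustrated under Hardy-Weinberg assumptions (random mating, no selection), where the expected frequency of a genotype carrying two different alleles with frequencies p and q at the same locus is 2pq. The sketch below is not part of the original analysis: the function name is hypothetical, and because the frequency of c.954-2A>G is unknown, the second frequency is a purely hypothetical placeholder.

```python
def expected_compound_het_frequency(p: float, q: float) -> float:
    """Expected frequency (2pq) of a compound heterozygous genotype
    for two distinct alleles at one locus, assuming Hardy-Weinberg
    equilibrium."""
    for freq in (p, q):
        if not 0.0 <= freq <= 1.0:
            raise ValueError("allele frequencies must be between 0 and 1")
    if p + q > 1.0:
        raise ValueError("allele frequencies at one locus cannot sum above 1")
    return 2.0 * p * q


p_known = 0.002176        # MMP20 c.126+6T>G, frequency given in the text
q_hypothetical = 0.0001   # placeholder only; the real value is unknown

print(expected_compound_het_frequency(p_known, q_hypothetical))
```

Because the expected genotype frequency scales with the product of the two allele frequencies, it falls off quickly when both variants are very rare, which is consistent with the requirement that at least one of them be relatively common.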
Another gene that illustrates the complexity of AI is ENAM, which can present autosomal dominant and autosomal recessive inheritance patterns (20-22) that show dose effects (19), include a mutational hot spot in its sequence (20,23), and exhibit incomplete penetrance (24) or variable expressivity (25). Additionally, it has been found that the ENAM: c.588+1delG mutation can interact with a defective allele of another gene (LAMA3: c.1559G>A), causing digenic inheritance with a generalized hypoplastic AI phenotype and enamel pitting (7). This type of inheritance is considered a link of greater complexity in a pathology since the additive effect of the genotypes in two or more loci is required. Another case of digenic inheritance is exemplified by the finding of the genotype [COL17A1: c.1141+1G>A; LAMA3: c.6477_6486del10], which notably results in hypoplastic AI as well (26).
Another interesting finding of this work is the possible connection of some of the AI genes with cancer (STIM1, TP63, GPR68, KLK4, LAMB3). Although HGMD® had no information that would relate them to cancer through the clinical phenotype of AI, there is evidence in the current literature that AMBN (prostate cancer) (27), CNNM4 as a PRL3 binding target (liver cancer) (28), DLX3 (cutaneous squamous-cell carcinoma) (29), FAM83H (liver cancer, osteosarcoma) (30,31) and MMP20 (lung adenocarcinoma, oral squamous-cell carcinoma) (32,33) could also play roles in the development and progression of cancer. In vitro or in vivo functional assays will be required to determine which mutations have the potential to predispose patients to the cellular changes necessary for the initiation and progression of cancer.
Most cases of AI shown in the catalog originate from Asian, European, and North American countries, but according to the literature consulted, molecular diagnosis has been performed mainly in the United States, United Kingdom, France, Korea, and China, revealing the large gap and inequality of resources in the countries of Central and South America, possibly as a result of being developing countries that have small budgets for research. Unfortunately, the latter will be accentuated in the future due to the restrictions applied to these budgets, which will be further reduced by the priorities imposed by the COVID-19 pandemic.
From the molecular perspective, the classification of the genes involved in AI into five categories, from totally specific to non-specific, could represent a gradient from lower to higher expression in tissues; the number, type, and location of mutations; the number of functions a gene has; the structural diversity and the ability to interact with other genes. The best example of this is WDR72, a gene with medium AI specificity whose mutations are involved in the determination of isolated AI, syndromic AI, and other pathologies/conditions at the same time (8,34).
The strengths of this work are the construction of a catalog of mutations that informs about the genetic architecture of this enamel pathology and that allows the researcher to determine which types and subtypes of AI are associated with certain syndromes and pathologies/conditions. This study shows a panoramic view of the geographical distribution of AI in the world. In addition, the catalog provides rapid access to all the genes of interest, which is important for those who perform genetic counseling, particularly considering the possible relationship of AI genes with cancer. Finally, this investigation contributes to the determination of phenotype-genotype correlations in AI.
Among the limitations of this work is its reliance on a frequently updated database, which can make the information presented here out of date. In addition, this study would be enriched by a complementary analysis of other pathologies/conditions related to AI genes with information from other databases. Moreover, considering only publications included in the HGMD® database could exclude publications in which a disease is phenotypically analyzed without determining the underlying mutation.
We think that the information obtained in this study is relevant for the determination of phenotype-genotype correlations in AI, for early consideration of the presence of mutations in certain genes as markers of association with syndromes, and for understanding how some of these genes could be related to each other and how this relationship could increase the risk of cancer development.

CONCLUSIONS
The main conclusions of this work based on our specific objectives are as follows: (1) 27 AI-related genes were identified in the Human Gene Mutation Database (HGMD®). (2) The constructed catalog showed that 7 genes (ACP4, AMTN, MMP20, ODAPH, RELT, SLC24A4, SP6) were associated with isolated AI and 3 genes (CNNM4, DLX3, FAM20A) with syndromic AI. (3) Furthermore, it was observed that AI genes can be associated with multiple pathologies/conditions other than AI, either syndromic or non-syndromic. (4) The PE calculation allowed us to corroborate that the 7 genes in category 1 (100% specificity) determine isolated AI, and the 3 genes in category 2 (0% specificity) participate in syndromic AI. (5) From the analysis of phenotypes caused by the different AI mutations, it was observed that the GPR68, KLK4, LAMB3, STIM1 and TP63 genes are associated with both AI and cancer.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author/s.