The First Report of a Missense Variant in RFX2 Causing Non-Syndromic Tooth Agenesis in a Consanguineous Pakistani Family

Background: The syndromic and non-syndromic congenital missing teeth phenotype is termed tooth agenesis. Since tooth agenesis is a heterogeneous disorder hence, the patients show diverse absent teeth phenotypes. Thus identifying novel genes involved in the morphogenesis of ectodermal appendages, including teeth, paves the way for establishing signaling pathways. Methods and Results: We have recruited an autosomal recessive non-syndromic tooth agenesis family with two affected members. The exome sequencing technology identified a novel missense sequence variant c.1421T > C; p.(Ile474Thr) in a regulatory factor X (RFX) family member (RFX2, OMIM: 142,765). During the data analysis eight rare variants on various chromosomal locations were identified, but the co-segregation analysis using Sanger sequencing confirmed the segregation of only two variants RFX2: c.1421T > C; p.(Ile474Thr), DOHH: c.109C > G; p.(Pro37Ala) lying in a common 7.1 MB region of homozygosity on chromosome 19p13.3. Furthermore, the online protein prediction algorithms and protein modeling analysis verified the RFX2 variant as a damaging genetic alteration and ACMG pathogenicity criteria classified it as likely pathogenic. On the other hand, the DOHH variant showed benign outcomes. Conclusion: RFX2 regulates the Hedgehog and fibroblast growth factor signaling pathways, which are involved in the epithelial and mesenchymal interactions during tooth development. Prior animal model studies have confirmed the expression of rfx2 at a developmental stage governing mouth formation. Moreover, its regulatory role and close association with ciliary and non-ciliary genes causing various dental malformations makes it a potential candidate gene for tooth agenesis phenotype. Further studies will contribute to exploring the direct role of RFX2 in human tooth development.


INTRODUCTION
Tooth agenesis (TA) is a craniofacial malformation characterized by the absence of one or more teeth due to failure in the early stages of odontogenesis (Letra et al., 1993). TA is one of the most commonly occurring orofacial congenital anomalies in humans, with a prevalence of ∼3-11% depending upon the ethnicity and geography of the population (Larmour et al., 2005;Shimizu and Maeda, 2009). Populationbased studies have shown that the incidence of third molar agenesis (20%) is the most commonly known missing teeth phenotype. Excluding third molars, the prevalence of permanent teeth agenesis is higher (1.6-9.6%) than the primary dentition (0.5-0.9%) (Vastardis, 2000). Based on the number of missing teeth, TA can be classified into three categories: hypodontia (<six missing teeth), oligodontia (>six missing teeth), and anodontia (congenital absence of all primary and permanent teeth) (Stockton et al., 2000;De Coster et al., 2009). Hypodontia is relatively a common condition (1.6-6.9%) as compared to oligodontia (0.14%), while anodontia is extremely rare (Schalk-van der Weide et al., 1992;Polder et al., 2004;Al-Ani et al., 2017). Environmental (infections, trauma, chemotherapy, and radiotherapy) and genetic factors have been reported to cause TA phenotypes, but the latter is the most common cause of TA (Chhabra et al., 2014). Tooth agenesis shows autosomal recessive/dominant (Murakami et al., 2017;Park et al., 2019) and X-linked recessive/dominant patterns of inheritances (Rasool et al., 2008;Sarkar et al., 2014). Formerly, the monogenic inheritance was considered for TA in the previous studies, but recently several studies have proposed oligogenic patterns Du et al., 2018), supporting the concept of mutational load in human genetic disorders (Posey et al., 2017).
Tooth development is accomplished in a highly coordinated and genetically controlled fashion (Kapadia et al., 2007;Brook, 2009) and is regulated via some molecules involved in the signaling cascades (Kapadia et al., 2007), initiating a series of reciprocal interactions between the epithelium and underlying mesenchyme (Thesleff, 2006). The regular expression of several hundred genes is essential for tooth development (Thesleff, 2014). Genetic alterations in numerous genes have been reported to arrest tooth development in mice and/or humans (Kapadia et al., 2007;Fleischmannova et al., 2008;Brook, 2009;Thesleff, 2014). Four signaling pathways (FGF, BMP, Wnt, and SHH) lead the human tooth formation. In addition, a large number of genes encoding various components of these pathways are either directly or indirectly involved in various tooth conditions (Neubüser et al., 1997;Jussila and Thesleff, 2012). Their loss of function or gain of function may distort the signaling cascades and cause various orofacial anomalies (Nieminen, 2009).
We aimed to characterize the clinical features of an autosomal recessive Pakistani family displaying tooth agenesis and to identify an underlying genetic cause. We present a comprehensive clinical investigation followed by exome sequencing analysis. The exome sequencing data revealed a novel biallelic variant c.1421T > C; p.(Ile474Thr) in 13-exon of RFX2.

Family Recruitment, Pedigree Construction, and Blood Collection
A four-generation consanguineous sixteen membered family with two affected individuals (IV-4 and IV-5) with missing teeth phenotype was recruited from the southern region of Khyber Pakhtunkhwa province, Pakistan. Pedigree was constructed based on the information provided by the father (III-1) of the affected members ( Figure 1A). Informed written consent was taken from all the participants. The Research and Ethical Committee (REC) of Kohat University of Science and Technology (KUST), Kohat, Pakistan, approved the study protocols, strictly following the recommendations of the Declarations of Helsinki.

DNA Extraction
Genomic DNA was isolated from the whole peripheral blood of the patients (IV-4 and IV-5) and other family members (III-1, III-2, IV-1 and IV-6) by using the GenElute ™ blood genomic DNA kit (Sigma-Aldrich MO, United States). Qubit Fluorometer (ThermoFisher Scientific, United States) was used for the quantification of DNA.

Exome Sequencing, Alignment and Variant Calling
A 100 ng/μl DNA of two members (IV-1 and IV-4) was used for exome sequencing. The sequencing libraries of the DNA were prepared with the SeqCap EZ human exome library v2.0 kit. The sequencing was done on Illumina HiSeq 4000 sequencing machine via a paired-end 100-bp protocol (Hussain et al., 2013). The filtration of primary data was carried out by the Illumina real-time analysis (RTA) software v1.8. Afterward, the mapping of the reads to the human reference genome build GRCh37/hg19 (http://www.genome.ucsc.edu/) was performed using the BWA-SW alignment algorithm. Picard tools were used to improve the read quality, and Genome Analysis Toolkit (GATK) was used for realignment and base quality score recalibration. The calling of single nucleotide polymorphisms (SNPs) and short insertions/deletions (INDELs) was performed by Platypus, Haplotype Caller, and Mpileup programs and further filtration was carried out through variant quality score calibration (VQSR) using GATK. The ALLEGRO program identified several large runs of homozygosity (ROH) based on multipoint linkage analysis.
For the coverage analysis of CNV detection, CNMOPS and ExomeDepth algorithms were utilized. In addition, the COMBINE and FUNC algorithms were used to combine the data and the annotation of functional variants.

Variant Search, Classification and Sanger Sequencing
The Varbank pipeline v2.26 (https://varbank.ccg.uni-koeln.de/) of Cologne Center for Genomics (CCG), Cologne, Germany, was utilized for the exome data analysis. The mean coverage of the data was 85%, while at 20X and 10X, the coverage of the targeted bases was 93.6 and 96.6%, respectively. A panel of genes, including MSX1, PAX9, AXIN2, FGFR1, IRF6, LRP6, WNT10A, WNT10B, EDAR, and EDARADD, underlying various dental malformations, was also filtered out to exclude the involvement of any copy number variations, missense, nonsense, or compound heterozygous variants in these genes. Considering the consanguinity among the parents, ROH in the affected members (≥5 Mb) were identified. Later, a variant search was carried out in the ROH to find out rare homozygous variants. Furthermore, an exome-wide search irrespective of ROH was also executed to search for rare biallelic variants. Based on the autosomal recessive inheritance pattern, allele read frequency (75%-100%) for homozygous changes and allele frequency (<1%) for recessive variants was used. VarSome (Kopanos et al., 2019), Human Gene Mutation Database (HGMD) Professional 2019.4, and Database of Single Nucleotide Polymorphisms (dbSNP) were utilized for the evaluation of variants. Genome Aggregation Database v.2.1.1 (gnomAD; https://gnomad.broadinstitute.org/) was consulted to establish the minor allele frequency (MAF; value < 0.01) of the variants. An in-house database of 511 exomes of patients with diverse phenotype and another dataset of 21-exomes of ethically matched Pakhtun patients along with 90-exomes of other Pakistani patients of various ethnic backgrounds were employed as a control. ACMG guidelines were consulted for the classification of pathogenic, likely pathogenic, uncertain significance and benign variants (Richards et al., 2015).
In addition, a thorough search was performed on the variant data of chromosome X, considering the gender of the affected members in the pedigree diagram, but no bonafide variant, neither in EDA nor in other X-linked genes, was identified.
The reference sequences were obtained from the University of California Santa Cruz (UCSC) genome database browser (http://genome.ucsc.edu/cgi-bin/hgGateway). Amplifx version 2.1 (https://inp.univ-amu.fr/en/amplifx-manage-testand-design-your-primers-for-pcr) was used for designing the primers for the amplification of the regions of interest. First, a genomic sequence of 700 bp up-and-downstream from the position of the rare variant was scanned to design an appropriate primer pair. Then, the PCR amplification of the regions of interest was carried out, and the Exo-Sap protocol (https://www.thermofisher.com) was used for purifying the PCR products. Next, the DNA sequencing was performed on the ABI3730 genetic analyzer with BigDye chemistry v3.1. For aligning the sequences against the reference sequence, a sequence alignment tool, BioEdit version 6.0.7 (http://www.mbio.ncsu.edu/BioEdit/bioedit.htm) was utilized.

Protein Structure Prediction
The homology modeling technique was utilized to construct the three-dimensional structures of RFX2 and DOHH proteins using I-TASSER (Iterative Threading ASSEmbly Refinement) server (Yang et al., 2015). For wild-type and mutant (p.Ile474Thr) RFX2 modelling, we used PDB (Protein Data Bank) ID: IDP7 (intrinsically disordered protein-7) structure as a template with an 84% structure homology. In the case of Frontiers in Genetics | www.frontiersin.org January 2022 | Volume 12 | Article 782653 DOHH-p.Pro37Ala, we used the crystal structure of human DOHH (PDB ID: 4D50). A similar energy minimization strategy was utilized for the wild-type and mutant structures. Finally, structures were visualized with PyMOL software (www.pymol. org) (Delano, 2002).

Clinical Report
The affected individuals  showing variable missing teeth conditions were recruited at the dentistry department, Khyber Medical University, Peshawar, Khyber Pakhtunkhwa (KP), Pakistan. A dental specialist performed a clinical evaluation of the patients. The unaffected members (III-1, III-2, IV-1, IV-6) available for the study ( Figure 1A) were also evaluated at the same clinic.
The patients showed spacing in the upper jaw between the two central incisors. In addition, intra-oral examination showed the generalized spacing in the anterior region of the maxilla and mandible. During diagnosis, the missing third molars were excluded. In the case of patient-IV-4, a total of nine permanent teeth were missing (maxillary lateral incisors, left maxillary first premolar, all 4 s premolars, the right maxillary second molar, and right mandibular second molar) ( Figure 1B a, b). However, in patient-IV-5, five permanent teeth including, maxillary lateral incisors, maxillary second premolars, and right maxillary first premolar, were missing ( Figure 1B d, e). Maxillary central incisors were proclined labially in both patients with patchy chalky white discoloration on the labial surface and around the root of the deciduous first molar. All maxillary teeth, except the central incisors, were small in size, and there was a lower arch spacing due to missing mandibular molars on both sides. Deep attrition in the case of maxillary incisors, canines, and premolars with enamel loss from their incisal surfaces and a taurodontism in the left mandibular third molar was observed. In addition, generalized horizontal bone loss and deep biting were also discerned. Furthermore, there was a clinical midline diastema due to the missing lateral incisors and a thin enamel covering over the dentine, especially in the lower incisor and lower premolars. Periapical radiolucency around the root of the deciduous first molar was also reported ( Figure 1B c, f). The unaffected members showed normal permanent dentition in both arches (maxillary and mandibular).
The patients were of standard heights (IV-4 152.4 cm, IV-5 153.6 cm), and they did not show any neurological, skeletal, and visceral organ defects upon medical examinations. Likewise, the abnormalities of other ectodermal appendages, including hypotrichosis, hyponychia/anonychia, cleft lip/ palate, hypo/hyperpigmentation, and hypohidrosis, were ruled out by the medical consultants during the physical investigations of these patients. In addition, the patients did not record any complaints of muscular weakness or any progressive physical lethargies.  Table 1. Therefore, these variants were considered for the co-segregation analysis.
The fine-mapped variants were sequenced via Sanger sequencing for the validation and confirmation of segregation. During the co-segregation analysis, the variants in RFX2 and DOHH, located in a common ROH of 7.1 MB on chromosome 19 segregated in the family (Figures 1A, 2C). The DOHH variant c.109C > G; p.(Pro37Ala) was predicted benign and tolerated by the online prediction tools, while the variant c.1421T > C; p.(Ile474Thr) in RFX2 was predicted damaging. Moreover, the RFX2 variant achieved highly significant CADD (27.3) and GERP (5.1399) scores ( Table 2). The ACMG variant classification criteria classified the RFX2 variant as likely pathogenic, while the DOHH variant was benign ( Table 2). Hence, the RFX2 variant is the most likely candidate causing non-syndromic tooth agenesis. An allele frequency (AF) (0.0001272) of this variant has been calculated in gnomAD. This variant is 32 times monoallelically identified in South Asian (17), European (non-Finnish, 13), Latino/Admixed Americans (1), and in a minor population (1), where the total number of alleles is 251,482 while the number of homozygous alleles is zero. From the South Asian population, 30,616 alleles are included with an AF of 0.0005553 in this database. A dbSNP ID rs769861701 has been assigned to this variant. According to our knowledge, this is the first report of a biallelic variation at c.1421T > C in RFX2, which is likely to cause human absent teeth phenotype (Figures 2A-C).

Protein Structure Prediction
Multiple sequence alignment revealed conservation of the RFX2-Isoleucine-474 residue, as shown in Figure 3A, and the conserved residue is essential for protein structure determination, stability, and functionality. The structural relevance of the highly conserved Isoleucine residue was determined using the homology modeling technique. We examined and studied both wild-type and mutant structures to discover the structural differences, depicted in Figure 3B i and ii. As illustrated in the zoomed-up view, we detected differences in intramolecular distance among residues in the immediate mutation vicinity. Because of the variation in bonding, there is a local difference in the structure, as indicated by the arrows. We also noticed a difference in surface charge distribution between the two forms of the RFX2 protein, as indicated by the arrows ( Figure 3B iii and iv).
On the other hand, Pro37 and Ala37 are located on the surface of wild and mutant DOHH proteins, respectively, suggesting the DOHH variant is substantially insignificant (Figures 4A,B). Furthermore, electrostatic analysis of wildtype and mutant DOHH showed that the residue is not present at the interface of DOHH dimeric form and is not involved in any inter-protein and intra-protein interactions ( Figures 4C,D).

DISCUSSION
In this study, we have investigated the genetic cause of the NSTA in a Pakistani family through exome and Sanger sequencing. The affected individuals manifested a generalized spacing in the upper jaw between the two central incisors in addition to congenitally missing permanent maxillary lateral incisors, left and right maxillary first premolars, and mandibular and maxillary second premolars. Besides, the exome data revealed a novel homozygous sequence variant c.1421T > C; p.(Ile474Thr) in RFX2, which lies in an extended dimerization domain C of RFX2 protein. According to the ACMG guidelines (2015), this variant has been categorized as a likely pathogenic variant ( Table 2).
RFX2 comprises 18 exons (Figures 2A,B), encoding a 723 amino acids protein (RFX2), and has two isoforms, P48378 and P48378-2 (https://www.uniprot.org/uniprot/). In humans, eight RFX genes (RFX1-RFX8) have been characterized (Aftab et al., 2008;Sugiaman-Trapman et al., 2018). RFX genes encode transcription factors (TFs) termed RFX transcription factors (RFX-TFs), performing multiple functions in different organisms and have been argued to be involved in the regulation of various cellular and developmental mechanisms (Zaim et al., 2005;Garg et al., 2015). As the RFX genes have essential activities in development hence, the alterations in their sequence may lead to severe pathogenic consequences (Senti and Swoboda, 2008;Choksi et al., 2014). According to Human Gene Mutation Database (HGMD) Professional 2021.2, five missense variants associated with autism (Sanders et al., 2012;Kosmicki et al., 2017;Satterstrom et al., 2020), cerebral palsy (McMichael et al., 2015), and congenital heart defect (Jin et al., 2017) have already been reported in RFX2. Here we report a novel sequence variant c.1421T > C; p.(Ile474Thr) in this gene in an autosomal recessive family exhibiting tooth agenesis. As Frontiers in Genetics | www.frontiersin.org January 2022 | Volume 12 | Article 782653 predicted by the bioinformatics tools, this missense variant results in structural deviations in RFX2 protein, leading to significant perturbations or abolishing its function. rfx2 expresses in various ciliated tissues in Xenopus, including the neural tube, which leads to the development of the brain, spinal cord and epidermis, gastrocoel roof plate, inner ear, and renal tissues (Chung et al., 2014). In addition, severe cilia-defective embryonic phenotypes were observed in rfx2 -/-Xenopus (Chung et al., 2014). Rfx-TFs are involved in the hearing mechanism of mice (Elkon et al., 2015). rfx2 is highly expressed in testes (Reith et al., 1994), and Wu et al. (2016) have studied the defective spermatogenesis and severe growth retardation in rfx2 -/mice (Wu et al., 2016). In a mouse model study, Bisgrove et al. (2012) have observed the expression of rfx2 at the late headfold stage (Bisgrove et al., 2012). The headfold stage governs the formation of the heart (Varner et al., 2010) and an oral membrane in the jawed vertebrates, which leads to mouth morphogenesis, including the development of jaws and teeth (Soukup et al., 2013). Considering the role of RFX2 as a transcriptional factor and its significant expression in various tissues, we hypothesize that it regulates the morphogenesis of multiple tissues and organs in animals and humans, including the formation of craniofacial tissues. Besides, the sequence alterations of RFX2 may cause various intellectual disabilities, heart anomalies, renal defects, otologic disorders, and abnormal growth of ectodermal appendages, including teeth.
RFX proteins play a central role in the morphogenesis of cilia on the surface of polarized cells (Senti and Swoboda, 2008;Senti et al., 2009). In humans and animals, the primary cilia are involved in dental development, located in the epithelial and mesenchymal tissues during initial stages, differentiation, and formation of complex tissues during tooth development (Kero et al., 2014;Hampl et al., 2017). As the primary cilia are the important signaling centers during vertebrate development (Goetz and Anderson, 2010), any genetic or experimental alteration in their structure or Frontiers in Genetics | www.frontiersin.org January 2022 | Volume 12 | Article 782653 function may lead to defective odontogenesis (Liu et al., 2014), resulting in a change in the tooth formula, size, morphology, position (Hardcastle et al., 1998;Lan et al., 2014), and odontoblast/ameloblast differentiation (Bei, 2009). Moreover, several ciliopathies, including Bardet-Biedl syndrome (BBS), Ellis-van Creveld syndrome (EVC; OMIM: 225,500), Weyers acrofacial dysostosis (WAD; OMIM: 183,530), have been reported in the literature displaying syndromic genetic anomalies, including the dental defects, hence, demonstrating the principal role of primary cilia in tooth morphogenesis (Hampl et al., 2017). Recently, it has been studied that Rfx2 regulates the expression of genes encoding axonemal dynein subunits, proteins involved in epithelial-mesenchymal transition, and the BBSome elements (Chung et al., 2014). These genes have been reported in human ciliopathies (BBS, EVC, WAD) (Sharma et al., 2008;Reiter et al., 2012). Furthermore, Rfx2 directly influences the downstream expression of 911 genes, and 180 genes out of those are only the ciliary genes (Chung et al., 2014).
RFX2 is essential for the expression of numerous genes, regulating the Hedgehog (Hh) signaling (Chung et al., 2012;Chung et al., 2014). Sonic hedgehog (Shh) is expressed in the dental epithelium and regulates the formation of enamel, Frontiers in Genetics | www.frontiersin.org January 2022 | Volume 12 | Article 782653 dentin, cementum, and soft tissues (Hosoya et al., 2020). A recent study by Neugebauer et al. (2009) revealed that the disruption of FGF signaling through fgfr1 reduces the expression of ift88 and two ciliogenic transcription factors foxj1 and rfx2 (Neugebauer et al., 2009). An ift88 null mice exhibit skeletal defects and hypoplastic maxilla, mandible, and supernumerary teeth (Zhang et al., 2003;Ohazama et al., 2009). The pathogenic genetic changes in fgfr1 also cause phenotypes consistent with those seen in ift88. The pathogenic variants in human FGFR1 have been reported to cause various tooth malformations (Dodé et al., 2003;Pitteloud et al., 2006). In human ciliated cells, IFT88 (OMIM: 600,595) co-localizes with OFD1 (OMIM: 300,170), and genetic alterations in OFD1 cause Orofaciodigital Syndrome-1 (Singla et al., 2010). As the expression of the genes discussed above is dependent on each other and they govern the same signaling pathway, we firmly believe that IFT88, FOXJ1 (OMIM: 602,291), and RFX2 are the strong candidate genes associated with human development and the genetic variations in these genes may lead to skeletal, oral and maxillofacial anomalies.

CONCLUSION
In this study, we report a clinical and molecular diagnosis of a consanguineous NSTA Pakistani family. Exome sequencing and Sanger sequencing identified a novel homozygous missense variant c.1421T > C; p.(Ile474Thr) in a novel candidate gene RFX2 causing NSTA in this family. We claim that RFX2, based on its critical regulatory role in ciliary development, SHH, and FGF signaling, is an essential molecular player in human odontogenesis.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: ClinVar, SCV001934213.1.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by the Research and Ethical Committee (REC) of Kohat University of Science and Technology (KUST), Kohat, Pakistan. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
SAK, NM, ZR, UK, AK, and HK contributed to study design and synopsis writing. MK did a clinical evaluation of the Frontiers in Genetics | www.frontiersin.org January 2022 | Volume 12 | Article 782653 patients and wrote a report about it. NW and SAK contributed to data generation and analysis. AN has performed the bioinformatics analysis. SAK, SK, NM, ZR, MK, AN, UK, AK, HK, and NW have critically reviewed clinical and molecular data, and agreed upon writing a manuscript. SAK, SK, and NW contributed to writing, revising, and finalizing the draft.