Pathogenic and Low-Frequency Variants in Children With Central Precocious Puberty

Background: Central precocious puberty (CPP) due to premature activation of GnRH secretion results in early epiphyseal fusion and a significant compromise in final adult height. Currently, few genetic determinants of CPP have been described. In this translational study, rare sequence variants in the MKRN3, DLK1, KISS1, and KISS1R genes were investigated in patients with CPP.
Methods: Fifty-four index girls and two index boys with CPP were first tested by Sanger sequencing for the MKRN3 gene. All children who tested negative for MKRN3 (n = 44) were further investigated by whole exome sequencing (WES). In the latter analysis, the status of variants in genes known to be related to pubertal timing was compared with an in-house Cypriot control cohort (n = 43). The identified rare variants were initially examined by in silico computational algorithms and confirmed by Sanger sequencing. Additionally, a genetic network for the MKRN3 gene, providing a holistic regulatory depiction of the crosstalk between MKRN3 and other genes, was designed.
Results: Three previously described pathogenic MKRN3 variants located in the coding region of the gene were identified in 12 index girls with CPP. The most prevalent pathogenic MKRN3 variant, p.Gly312Asp, was found exclusively in the Cypriot CPP cohort, suggesting a founder effect. Seven other CPP girls harbored rare likely pathogenic upstream variants in the MKRN3 gene. Among the 44 CPP patients submitted to WES, nine rare DLK1 variants were identified in 11 girls, two rare KISS1 variants in six girls, and two rare MAGEL2 variants in five girls. Interestingly, the frequent variant rs10407968 (p.Gly8=) of the KISS1R gene appeared to be less frequent in the cohort of patients with CPP.
Conclusion: The results of the present study confirm the importance of the imprinted MKRN3 gene in the genetics of CPP and its key role in pubertal timing. Overall, the results emphasize the importance of an approach that aligns genetic and clinical aspects, which is necessary for the management and treatment of CPP.


INTRODUCTION
Central precocious puberty (CPP) results from the premature activation of the hypothalamic-pituitary-gonadal (HPG) axis. It is clinically defined by the development of progressive secondary sexual characteristics before the age of 8 years in girls and 9 years in boys, mainly breast development in girls and testicular enlargement in boys, together with acceleration in linear growth and pubertal mood changes (1,2). CPP has been reported to account for 80% of cases of precocious puberty and to be predominant in girls (2)(3)(4). The benchmark for hormonal diagnosis is a predominance of luteinizing hormone (LH) over follicle-stimulating hormone (FSH) levels after administration of exogenous luteinizing hormone-releasing hormone (LHRH) agonists (5), and/or elevated estradiol in girls. In boys, elevated testosterone is an additional hormonal criterion. Nevertheless, basal LH levels are also used in some settings, as improved laboratory methodologies for LH assays have become accessible (6). Additionally, in patients with CPP, hand X-rays can show bone age advancement, and pelvic ultrasound in girls can confirm the progression of ovarian function and increased uterine size (7).
Family history of CPP has been recognized in up to 27.5% of cases with an autosomal mode of inheritance. Few genes have been described as causative of CPP, involving both excitatory and inhibitory pathways of GnRH secretion (8)(9)(10). At this time, genetic aberrations associated with the Makorin Ring Finger Protein 3 (MKRN3) gene are the leading genetic etiology of CPP (9). Since the groundbreaking discovery of loss-of-function mutations in the MKRN3 gene (11), numerous other studies have followed and reported more than 40 novel variants, including missense, nonsense, and frameshift mutations in MKRN3, across families with CPP in a broad spectrum of geographical regions (9,12). The MKRN3 gene is located in the Prader-Willi syndrome (PWS)-related region (15q11-q13) on chromosome 15. The maternal allele of the gene is imprinted; therefore, the gene is expressed only from the paternal allele. All affected patients reported with familial CPP inherited the MKRN3 mutations from their fathers (9,13-28). MKRN3 is composed of five zinc-finger domains: three C3H1 motifs, one C3HC4 RING motif, and one MKRN-specific Cys-His domain (29). C3H1 zinc-finger motifs are responsible for RNA binding, while the RING motif is found in E3 ubiquitin ligases; MKRN3 is therefore anticipated to have ubiquitin-ligase activity (30). Recently, it has been demonstrated that auto-ubiquitination of the MKRN3 protein is diminished as a result of mutations located in the C3HC4 RING motif (31). Additional work by Li et al. (32) demonstrated that MKRN3 epigenetically regulates the transcription of GNRH1 through ubiquitination of MBD3 and controls the onset of mammalian puberty.
MKRN3 is highly expressed in the arcuate nucleus of the developing brain and is believed to be involved in protein degradation, thereby exerting an inhibitory effect on pulsatile GnRH secretion (30). The exact mechanism that accomplishes this effect, and the mechanism by which MKRN3 deficiency results in early reactivation of GnRH secretion, are still under investigation. Although the majority of the reported studies refer to loss-of-function mutations in the coding region of MKRN3, defects in the regulatory regions of the gene were recently described in three studies (23,33,34). In the first study, a four-nucleotide deletion (c.-150_-147delTCAG) in the proximal promoter region of the MKRN3 gene was found to be responsible for causing CPP (23). In the second study, a single nucleotide substitution at position 19 (MKRN3:g.+19C>T) from the transcription start site (TSS) in the 5′-UTR region of the MKRN3 gene was also associated with CPP (33). Finally, in a recent third study, our group reported three novel heterozygous mutations located in the proximal promoter and one in the 5′-UTR region of the MKRN3 gene in a total of seven nonrelated girls with CPP. Four girls carried the MKRN3:g.-865G>A mutation, one the MKRN3:g.-166G>A, and one the MKRN3:g.-886C>T mutation; all three variants were located in the proximal promoter of the gene. Interestingly, a 7.6-year-old girl with CPP was identified with the novel MKRN3:g.+13C>T mutation in the 5′-UTR region (34).
To date, only two gain-of-function mutations in the KISS1/KISS1R pathway have been reported: a KISS1R missense mutation in a girl and a KISS1 missense mutation in a boy. Their presence caused upregulation of the KISS1/KISS1R system, leading to GnRH secretion and HPG activation (9,35,36). Subsequently, several studies did not identify gain-of-function KISS1 and KISS1R mutations in cohorts of children with CPP, suggesting that these may be extremely rare causes of the disorder (37)(38)(39). Lately, several mutations in a second maternally imprinted and paternally expressed gene, Delta-like noncanonical Notch ligand 1 (DLK1), have also been associated with CPP (9). DLK1 is also known as preadipocyte factor 1 (Pref-1) and is involved in the Notch signaling pathway as an adipocyte modulator (40). Studies identified loss-of-function mutations in the DLK1 gene (deletions and frameshifts) as a rare cause of CPP (41)(42)(43), supporting a significant role of this factor in human pubertal timing and the age of menarche (44).
In the present study, we have assembled a cohort of patients with CPP to investigate genetic determinants implicated in the disorder, based on the recent global literature data. By using an expanded GnRH/CPP-associated gene panel, we examined the involvement of such genes.

Patients
Fifty-four index girls (96.4%) and two index boys (3.6%) with CPP were referred for genetic investigation to the Department of Molecular Genetics, Function and Therapy at the Cyprus Institute of Neurology and Genetics. All children included in the study fulfilled the criteria for CPP diagnosis. Girls presented with breast development at Tanner stage 2 before the age of 8 years, and boys presented with testicular enlargement of more than 4 ml in volume, measured with a Prader orchidometer, before the age of 9 years. Elevated basal or stimulated gonadotrophins with LH predominance confirmed the diagnosis. Elevated estradiol levels in girls or testosterone levels in boys, together with imaging studies (bone age X-ray evaluated by the Greulich and Pyle method (45), pelvic ultrasound, and hypothalamus-pituitary MRI with contrast), were used as additional diagnostic tools.
In the first stage, all index patients with CPP were tested by Sanger sequencing using appropriately designed primers for the MKRN3 (RefSeq NM_005664.4) gene. Further testing by whole exome sequencing (WES) was performed on the patients who tested negative for MKRN3 variants. Written informed consent was obtained from the parents of all patients under the age of 16 who participated in the study. The project was approved by the Cyprus National Ethics Committee, and all methods were performed in accordance with the relevant guidelines and regulations.

Genetic Analysis
Genomic DNA was extracted from peripheral blood using the Gentra Puregene Kit (Qiagen, Valencia, CA, USA) according to the manufacturer's instructions. DNA purity was measured using the Nanodrop ND-1000 spectrophotometer (NanoDrop Technologies, Wilmington, DE, USA). Prior to library preparation for whole exome sequencing (WES), genomic DNA was quantified using the Qubit dsDNA BR Assay Kit (Invitrogen, Life Technologies, Eugene, OR, USA) on a Qubit® 2.0 Fluorometer (Invitrogen, Life Technologies, Eugene, OR, USA). WES was performed using the TruSeq Exome Kit (Illumina Inc., San Diego, CA, USA) with paired-end 150-bp reads. NGS was performed using the NextSeq 500/550 High Output Kit v2.5 (150 cycles) on a NextSeq500 system (Illumina Inc., San Diego, CA, USA). The FastQC quality control tool (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) was used to evaluate the quality of the WES procedure. The mean target coverage of the whole exome was 62.13×. Specifically, 10× coverage was reached for 92.34% of the nucleotides, 20× coverage for 86.03%, and 30× coverage for 76.96%, indicating that the WES data were of sufficiently high quality for subsequent analysis.
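The coverage figures above (mean depth and the percentage of nucleotides reaching 10×, 20×, and 30×) can be illustrated with a minimal sketch. It assumes per-base depths are already available as a list of integers; the function name `coverage_summary` and the toy depths are hypothetical and are not part of the pipeline used in the study.

```python
from typing import Dict, Iterable, Sequence


def coverage_summary(depths: Sequence[int],
                     thresholds: Iterable[int] = (10, 20, 30)) -> Dict[str, float]:
    """Return mean depth and the percentage of bases at or above each threshold."""
    if not depths:
        raise ValueError("depths must not be empty")
    n = len(depths)
    summary = {"mean": sum(depths) / n}
    for t in thresholds:
        covered = sum(1 for d in depths if d >= t)
        summary[f">={t}x"] = 100.0 * covered / n
    return summary


# Hypothetical per-base depths for a tiny target region (illustration only).
toy_depths = [5, 12, 18, 25, 31, 40, 62, 75, 9, 33]
print(coverage_summary(toy_depths))
```

In practice such values are usually taken from a dedicated tool's report rather than recomputed by hand; the sketch only makes the definition of the reported percentages explicit.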

Variant Analysis
The fastq data obtained by WES were processed using an in-house bioinformatics pipeline. Briefly, reads were aligned to the human reference genome (GRCh38.p12, hg38 assembly), and the resulting variants were loaded into the VarApp Browser and filtered. VarApp is a graphical user interface that supports GEMINI (18). For genes previously associated with pubertal onset and precocious puberty, target coverage was calculated using the Qualimap v2.2.1 tool (46). Mean target coverage of the selected genes was 60× (Supplementary Table S1). Variants in these genes were additionally filtered using the VarApp Browser for minor allele frequencies of less than 1% in public databases such as 1000 Genomes, the ExAC browser, and the Exome Sequencing Project (ESP). Moreover, variants were filtered and selected according to their impact: frameshift, splice acceptor, splice donor, start lost, stop gained, stop lost, inframe deletion, inframe insertion, missense, protein altering, and splice region. In addition, variants were filtered by the VarApp Browser for their pathogenicity using two in silico tools, SIFT and PolyPhen-2. Population-specific data from an in-house WES library composed of 43 randomly selected samples of Cypriot origin were used as a control cohort. All variants identified were confirmed by Sanger sequencing. Finally, the variants were categorized for their pathogenicity using the standards and guidelines of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology (47). Any additional variants identified in other genes by the more in-depth WES analysis were also analyzed using the methodology described above.
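The filtering logic described above (minor allele frequency below 1% and a retained impact class, with in silico predictions as an additional layer) can be sketched as follows. This is an illustration under stated assumptions, not the VarApp or GEMINI implementation: the `Variant` record, its field names, the prediction labels, and the rule that a variant is flagged when either tool predicts damage are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

# Impact classes retained in the text above (labels are illustrative).
RETAINED_IMPACTS = {
    "frameshift", "splice_acceptor", "splice_donor", "start_lost",
    "stop_gained", "stop_lost", "inframe_deletion", "inframe_insertion",
    "missense", "protein_altering", "splice_region",
}


@dataclass
class Variant:
    gene: str
    impact: str
    max_population_maf: Optional[float]  # None = absent from public databases
    sift: Optional[str] = None           # hypothetical label, e.g. "deleterious"
    polyphen: Optional[str] = None       # hypothetical label, e.g. "probably_damaging"


def is_rare(v: Variant, cutoff: float = 0.01) -> bool:
    """True when the variant is unobserved or below the MAF cutoff (1%)."""
    return v.max_population_maf is None or v.max_population_maf < cutoff


def flagged_in_silico(v: Variant) -> bool:
    """Assumed rule: flagged if either tool predicts a damaging effect."""
    return v.sift == "deleterious" or v.polyphen in {
        "possibly_damaging", "probably_damaging"}


def filter_variants(variants: List[Variant]) -> List[Variant]:
    """Keep rare variants whose impact class is in the retained set."""
    return [v for v in variants if is_rare(v) and v.impact in RETAINED_IMPACTS]


# Hypothetical records for illustration only; not data from the study.
candidates = [
    Variant("GENE_A", "missense", 0.0004, sift="deleterious",
            polyphen="probably_damaging"),
    Variant("GENE_B", "synonymous", 0.0002),
    Variant("GENE_C", "missense", 0.23, sift="tolerated", polyphen="benign"),
    Variant("GENE_D", "stop_gained", None),
]
kept = filter_variants(candidates)
print([(v.gene, flagged_in_silico(v)) for v in kept])
```

Keeping the in silico result as a flag rather than a hard filter is a design choice of this sketch; the text does not state which predictions led to exclusion.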

Genetic Network Modeling
The genetic network was constructed using the GeneMANIA platform (https://genemania.org). MKRN3 was used as the anchor gene, and a full association network was designed including all genes that crosstalk with MKRN3. At the time of the analysis, the GeneMANIA database indexed 2,830 association networks containing 660,554,667 interactions mapped to 166,691 genes. For the purposes of this study, only human genes were studied for protein and genetic interactions, similar protein domains, genetic and biochemical pathways, colocalization, and coexpression of genes.

Clinical Findings
The most prevalent clinical, biochemical, and imaging features of the cohort of 56 index patients with CPP (54 girls and two boys) are summarized in Table 1. All patients fulfilled the clinical criteria of central precocious puberty: breast development in girls before the age of 8 years and testicular enlargement of more than 4 ml in boys before the age of 9 years. The majority of the girls presented at Tanner stage 2 of breast development. Only two girls presented with premature menarche, both at age 9.5 years. The majority of the girls underwent an LHRH test, which showed a predominance of LH over FSH levels. Bone age was advanced by more than 2 years compared with chronological age in the majority of the patients. MRI of the hypothalamus-pituitary region was reported as normal in all patients included in the study. Twelve of the patients had a confirmed positive family history of CPP on either parental side.

Pathogenic Genetic Defects in MKRN3 Gene
In 12 index girls with CPP, we identified three different mutations in the coding region of the MKRN3 gene. More specifically, the missense p.Gly312Asp was identified in seven index girls, the frameshift p.Met268ValfsTer23 in four index girls, and the nonsense p.Glu298Ter in one index girl with CPP (Table 2). Segregation analysis of all 12 index girls with MKRN3 mutations identified paternal inheritance. Family history of early menarche was reported in nine of the 12 paternal grandmothers of the CPP girls identified with MKRN3 gene mutations. The upstream variants of uncertain significance (VUS) with evidence of likely pathogenicity, g.23564798C>T (rs74005577) and g.23564819G>A (rs139233681), were identified in two and four CPP index girls, respectively, and DNA sequencing of both parents revealed these variants only in the fathers. The low-frequency VUS (0.000016, TOPMED) g.23565172G>A (rs1315899420), a 2-kb upstream variant, was also detected in one female CPP index patient. Finally, the likely pathogenic 5′-UTR g.23565172G>A (rs184950120) was detected in a CPP index girl, and segregation analysis also identified paternal inheritance (Figure 1 and Table 2). The clinical and hormonal findings of the above eight CPP girls identified with variants in the noncoding regions of the MKRN3 gene are summarized in Table 3.
Additionally, in the cohort of CPP patients under investigation, a number of other, more frequent variants were observed in a series of genes (KISS1R, TAC3, GNRH1, GNRHR, LHCGR, MAGEL2, and FSHR) that are directly or indirectly involved in pubertal control and activation (Supplementary Table S2). The more in-depth WES analysis did not identify any variants in other candidate genes.

Genetic Network Modeling for the MKRN3 Gene
The genetic network designed for the MKRN3 gene is a holistic regulatory depiction of the crosstalk between MKRN3 and other genes (Figure 2). Notably, MKRN3 directly interacts with TAC3 and is only one node (TAC3) away from KISS1 and KISS1R. TAC3 encodes a member of the tachykinin family of secreted neuropeptides. The mature peptide results from proteolytic processing of the encoded preproprotein and is primarily expressed in the central and peripheral nervous systems. TAC3 acts as a neurotransmitter, and it is well established that TAC3 mutations can lead to normosmic hypogonadotropic hypogonadism. Likewise, the KISS1 and KISS1R genes are linked not only to cancer, as they act as metastasis suppressor genes (mainly suppressing melanoma and breast carcinoma metastases), but also to central precocious puberty, as kisspeptin, the product of KISS1 that signals through KISS1R, regulates the pubertal activation of GnRH neurons by stimulating gonadotropin-releasing hormone (GnRH)-induced gonadotropin secretion.
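The statement that MKRN3 is one node away from KISS1 and KISS1R can be made concrete with a breadth-first search over a toy undirected graph. This is a minimal sketch, not a reproduction of the GeneMANIA network: only the MKRN3-TAC3, TAC3-KISS1, and TAC3-KISS1R edges follow the description above, the KISS1-KISS1R edge is an added assumption, and `shortest_path` is a hypothetical helper.

```python
from collections import deque
from typing import Dict, List, Optional, Set


def shortest_path(graph: Dict[str, Set[str]], start: str,
                  goal: str) -> Optional[List[str]]:
    """Breadth-first search; returns one shortest path or None if unreachable."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbour in sorted(graph[node]):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


# Toy network; the KISS1-KISS1R edge is an assumption made for this sketch.
toy_network = {
    "MKRN3": {"TAC3"},
    "TAC3": {"MKRN3", "KISS1", "KISS1R"},
    "KISS1": {"TAC3", "KISS1R"},
    "KISS1R": {"TAC3", "KISS1"},
}

path = shortest_path(toy_network, "MKRN3", "KISS1")
intermediates = path[1:-1]  # nodes strictly between the two genes
print(path, intermediates)
```

A path with a single intermediate node corresponds to the "one node away" relationship described for the network in Figure 2.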

DISCUSSION
The present study investigated the genetic basis of CPP by using conventional sequencing and high-throughput whole exome sequencing. The study included 56 unrelated children (54 girls and two boys) with CPP, whose clinical picture at presentation varied with regard to the stage of puberty (Table 1). The majority presented with thelarche at Tanner stage 2 of puberty before the age of 8 years. However, two of them presented with premature menarche and Tanner stage 5 of puberty before the age of 9.5 years. The two boys presented with testicular enlargement (testicular size of more than 5 ml) and Tanner stage 2 of puberty before the age of 9 years. Results from the present study, in conjunction with previous studies from our group (14,34), indicate that pathogenic variants (Table 2) or variants of uncertain significance (VUS) (Table 3) in the MKRN3 gene are the most prevalent cause of CPP in our cohort of Cypriot patients, in line with the current published literature (12). The MKRN3 p.Gly312Asp pathogenic variant, first reported in 2016 by Neocleous et al. (14), was the most predominant genetic defect (seven cases) in the present study, followed by the previously reported p.Met268ValfsTer23 pathogenic variant, identified in four cases (25) (Table 2). The two heterozygous VUS with evidence of likely pathogenicity previously reported by Fanis et al. (34), g.23564798C>T (rs74005577) and g.23564819G>A (rs139233681), and a VUS not previously described in association with CPP, g.23565172G>A (rs1315899420), were identified in seven nonrelated girls with CPP (Table 2). The three variants rs74005577, rs139233681, and rs1315899420 are located −886, −865, and −512 nt upstream of the transcription start site, respectively, in the proximal promoter region of the MKRN3 gene. In the same study by Fanis et al. (34), a novel 5′-UTR mutation (+13 nt downstream of the transcription start site) was also identified in a CPP girl.
Most of the reported studies describe loss-of-function pathogenic variants in the coding region of the MKRN3 gene, and to date, only a couple of studies have reported pathogenic variants linked to CPP outside the coding region of the gene. These include a recent study that described a single nucleotide substitution at position 19 from the transcription start site (TSS) in the 5′-UTR region of MKRN3 and another that reported a four-nucleotide deletion (c.-150_-147delTCAG) in the proximal promoter region of the gene (23,33). In the last two decades, studies have reported monogenic causes of CPP, with MKRN3 loss-of-function mutations being the most common (11,35,36,43). In the present study, all patients who tested negative for pathogenic variants in the MKRN3 gene were further investigated by whole exome NGS analysis. We identified numerous single nucleotide variants (SNVs) in various genes known to be involved in pubertal control. Seven SNVs in the DLK1 gene were observed in eight different female CPP patients of our cohort (Supplementary Table S2). Rare pathogenic variants in the DLK1 gene have recently been reported as an infrequent cause of CPP (41)(42)(43); therefore, a contribution of the rare SNVs observed in our cohort as possible associated factors cannot be excluded. To date, DLK1 and MKRN3 are two of the four known monogenic causes of CPP, and both are imprinted. This raises the possibility that imprinting plays a significant role in the regulation of puberty and that other imprinted genes might also be involved in pubertal control (9,48).
Interestingly, the previously reported single nucleotide polymorphism (SNP) rs10407968 (p.Gly8=) in the KISS1R gene (49) was detected with a minor allele frequency (MAF) of 8.33% in the CPP cohort of the present study. A notable and statistically significant difference was observed for KISS1R rs10407968 between the CPP cohort (MAF = 8.33%) and the Cypriot control subjects (MAF = 23.25%). One hypothesis arising from the lower rs10407968 MAF in CPP patients is that the variant could act as a protective factor, reducing the risk of CPP. This hypothesis rests primarily on the fact that activating mutations in the genes encoding the G protein-coupled receptor (KISS1R) and its ligand (KISS1) have been reported as causes of early GnRH secretion in patients with CPP (35,36). Additional investigation is needed to examine this possibility and the potential functional effects of the rs10407968 SNP. The genetic network modeling indicated a possible interplay between those genes (Figure 2). However, the roles of the remaining genes identified by the clustering algorithm merit further investigation, as they may provide valuable insights into the regulatory elements underlying the genetic, epigenetic, and developmental traits of central precocious puberty, as well as putative novel pharmacological targets in the context of personalized and precision medicine.
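A comparison of allele frequencies between two cohorts, such as the one described above for rs10407968, can be sketched with a two-sided Fisher's exact test on allele counts. The choice of Fisher's exact test is an assumption of this sketch, since the test used in the study is not stated here. The allele counts (7 of 84 and 20 of 86) are hypothetical values chosen only because they approximately reproduce the reported MAFs; they are not taken from the study data.

```python
from math import comb


def maf_percent(minor_alleles: int, total_alleles: int) -> float:
    """Minor allele frequency expressed as a percentage."""
    return 100.0 * minor_alleles / total_alleles


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical allele counts (minor, major) for cases and controls.
case_minor, case_total = 7, 84
ctrl_minor, ctrl_total = 20, 86

p_value = fisher_exact_two_sided(case_minor, case_total - case_minor,
                                 ctrl_minor, ctrl_total - ctrl_minor)
print(round(maf_percent(case_minor, case_total), 2),
      round(maf_percent(ctrl_minor, ctrl_total), 2),
      p_value)
```

With real data, the genotype counts per cohort would replace the hypothetical values, and a correction for multiple testing would be needed if many variants were compared.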
In the present study, we confirmed the key role of the imprinted MKRN3 gene as the most common cause of monogenic CPP in the Cypriot cohort. The newly described p.Gly312Asp missense loss-of-function pathogenic variant was identified as the most prevalent in the tested Cypriot CPP cohort. This is most likely due to a founder effect, a frequent phenomenon in the Cypriot population (50)(51)(52)(53)(54). Additionally, other findings of the present study indicate that causative variants can also exist in the MKRN3 proximal promoter and 5′-UTR region and can be considered contributing factors to CPP. Overall, the results of the present study emphasize the importance of an approach that aligns genetic and clinical aspects.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found below: (https://datadryad.org/stash), DOI (https://doi.org/10.5061/dryad.8pk0p2nnj).

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Cyprus National Ethics Committee. Written informed consent to participate in this study was provided by the participants' legal guardian/next of kin.