Candidate Genes for Age at Menarche Are Associated With Uterine Leiomyoma

Age at menarche (AAM) is an important marker of the pubertal development and function of the hypothalamic–pituitary–ovarian system. It was reported as a possible factor for a risk of uterine leiomyoma (UL). However, while more than 350 loci for AAM have been determined by genome-wide association studies (GWASs) to date, no studies of these loci for their association with UL have been conducted so far. In this study, we analyzed 52 candidate loci for AAM for possible association with UL in a sample of 569 patients and 981 controls. The results of the study suggested that 23 out of the 52 studied polymorphisms had association with UL. Locus rs7759938 LIN28B was individually associated with the disease according to the dominant model. Twenty loci were associated with UL within 11 most significant models of intergenic interactions. Nine loci involved in 16 most significant models of interactions between single-nucleotide polymorphism (SNP), induced abortions, and chronic endometritis were associated with UL. Among the 23 loci associated with UL, 16 manifested association also with either AAM (7 SNPs) or height and/or body mass index (BMI) (13 SNPs). The above 23 SNPs and 514 SNPs linked to them have non-synonymous, regulatory, and expression quantitative trait locus (eQTL) significance for 35 genes, which play roles in the pathways related to development of the female reproductive organs and hormone-mediated signaling [false discovery rate (FDR) ≤ 0.05]. This is the first study reporting associations of candidate genes for AAM with UL.


INTRODUCTION
Uterine leiomyoma (UL), commonly known as fibroids, is a benign tumor of smooth muscle in the uterus (Wise and Laughlin-Tommaso, 2016;Stewart et al., 2017). The symptoms include excessive menstrual bleeding, abdominal pain, and pregnancy complications (Zimmermann et al., 2012;Stewart et al., 2017). About 30 percent of reproductive-age women are diagnosed with UL, which is considered as a major factor of gynecologic morbidity (Wise et al., 2004), one of the primary causes of hospitalizations for gynecological disorders, and the most frequent reason for hysterectomy in the United States (Wise and Laughlin-Tommaso, 2016). ULs bear a substantial economic burden: the annual cost of treating this disorder has been estimated to be about $34 billion in the United States alone, which exceeded the combined cost of treating breast and colon cancers (Cardozo et al., 2012).
The biological mechanisms of UL development are not well understood (Rafnar et al., 2018). Family history is an important risk factor for UL (Commandeur et al., 2015;Wise and Laughlin-Tommaso, 2016). A risk to develop the disease is 2.5-fold greater for first-degree relatives of affected women than the population average (Vikhlyaeva et al., 1995), and the concordance among monozygotic twins is almost twice that of dizygotic twins (Luoto et al., 2000). Heritability estimates of UL from twin studies vary from 26 to 69% in European populations (Snieder et al., 1998;Luoto et al., 2000). More estimates come from recent genome-wide association studies (GWASs). Cha et al. (2011) reported candidate loci for UL in chromosome regions 10q24.33 (OBFC1), 11p15.5 (BET1L), and 22q13.1 (TNRC6B) in a cohort of Japanese patients. One more locus, 17q25.3 (FASN, CCDC57, and SLC16A3), was suggested by the genome-wide linkage and follow-up association studies in a meta-analysis of white women (Eggert et al., 2012). The 11p15.5 (BET1L) locus was later replicated in a sample of Caucasians . The 22q13.1 (TNRC6B) locus was replicated in Caucasian, American, and Saudi Arabian cohorts Aissani et al., 2015;Bondagji et al., 2017). However, none of the above loci was replicated in African-American women (Wise et al., 2012). A GWAS of a risk for leiomyoma in African-American women suggested a locus on 22q13.1 (CYTH4) (Hellwege et al., 2017). In 2018, two teams presented results of the first GWAS of UL in European populations (Rafnar et al., 2018;Välimäki et al., 2018) reporting 21 and 22 loci associated with the increased risk of UL.
Age at menarche (AAM) has been suggested as a risk factor for UL (Wise et al., 2004;Terry et al., 2010;Velez Edwards et al., 2013;Wise and Laughlin-Tommaso, 2016). Women with an early AAM have, on average, longer period of menstrual cycling in their life and thus a greater lifelong exposure to estrogens, which may promote growth of UL (Wise et al., 2004). Cell division rate in the myometrium is the highest during the luteal phase of the menstrual cycle; therefore, a longer history of cycling may increase a risk of UL (Wise and Laughlin-Tommaso, 2016). On the other hand, some studies reported no significant association between age of menarche and UL (Samadi et al., 1996;Zimmermann et al., 2012).
The significantly different heritability estimates for UL [from 26 to 69% according to twin studies (Snieder et al., 1998;Luoto et al., 2000) and only 13% according to GWAS data singlenucleotide polymorphism (SNP) heritability (Rafnar et al., 2018)] bring about a problem of so-called "missing heritability." Studies of candidate genes associated with specific risk factors for UL [e.g., AAM and body mass index (BMI)] may help to solve this problem. For example, more than 350 loci have been reported by GWAS for association with AAM (Day et al., 2017). There are data that candidate genes for AAM (e.g., FTO, LIN28B, MAP2K5, TNNI3K, GPRC5B, and FANCL) may also contribute to various anthropometric traits (e.g., BMI, height, and others) (Ong et al., 2009;Elks et al., 2010;Fernandez-Rhodes et al., 2013;Perry et al., 2014), while some of these traits (e.g., an increased BMI) may be risk factors for UL (Luoto et al., 2000;Wise and Laughlin-Tommaso, 2016;McWilliams and Chennathukuzhi, 2017). However, there have been no studies on the possible association of the candidate genes for AAM with a risk of UL.
The purpose of the present study was to analyze 52 candidate loci for menarcheal age for their possible association with UL. Recently, we studied association of these loci with some phenotypic traits (including AAM, height, and BMI) in the same sample of Caucasian females (Ponomarenko et al., 2019). Using the same sample for the analysis of different traits eliminates between-sample heterogeneity and makes comparison and interpretation of the results more meaningful and robust.

Study Participants
The study protocol was approved by the Regional Ethics Committee of Belgorod State University. All participants signed informed consent documents before entering the study. The participants were recruited through the Perinatal Center of the Belgorod Regional Clinical Hospital of St. Joasaph during 2008-2013. All patients with UL underwent hysterectomy followed by morphological verification of the diagnosis. The control group included women without clinical (asymptomatic women) and ultrasound signs of benign proliferative diseases of the reproductive organs. They were enrolled during regular medical examinations at the above Perinatal Center. A total of 1,620 women were recruited: 600 patients with UL (ICD-10 code D25) and 1,020 controls.
The participants were enrolled under the following exclusion criteria: self-declared non-Russian descent, a birthplace outside of Central Russia (Sorokina et al., 2018), chronic severe disorders of the vital organs (heart, respiratory, or renal failure), cancers of pelvis and breast, and severe autoimmune disorders.
The following data were collected for each participant: physical characteristics (height, weight, and BMI), AAM (determined as described elsewhere; Ponomarenko et al., 2019), characteristics of the menstrual cycle (length and duration of menses), other data about reproductive health (number of pregnancies and childbirths, age at first birth, time since last birth, spontaneous and induced abortions, and infertility), marital status, family history of UL, use of oral contraceptives and age at first oral contraceptive use, and smoking and alcohol use. The participants were also examined for the presence of gynecological disorders (endometrial hyperplasia, endometriosis, adenomyosis, etc.).

Blood Sample Collection and DNA Handling
The phlebotomy was performed by a certified nurse. DNA extraction from buffy coat was performed according to the protocol used in our previous gene association studies (Ponomarenko et al., 2019).
In total, 52 loci were selected for the present study ( Supplementary Tables 1-4). All SNPs appeared to have a significant regulatory potential (Supplementary  Table 4). Seventeen SNPs were associated with anthropometric characteristics (Supplementary Table 4). Ten more polymorphisms were not directly associated with AAM but manifested association or tagged with the traits related to menarche [e.g., polycystic ovary syndrome (PCOS), vitamin D metabolism, and physical characteristics; Supplementary Table 4]. These SNPs included rs1884051 ESR1, rs3020394 ESR1, rs12324955 FTO, rs4633 COMT, rs222020 GC, rs222003 GC, rs1544410 VDR, rs3756261 EGF, rs7766109 F13A1, and rs2252673 INSR. All these SNPs have a significant regulatory potential, nine of them are eSNPs, and eight are tagSNPs.
Several of the selected loci were previously reported to be associated with AAM (14 SNPs), BMI (15 SNPs), and height (16 SNPs) in the studied sample (Ponomarenko et al., 2019).

Single-Nucleotide Polymorphism Genotyping and Data Quality Control
DNA samples were genotyped using the Sequenom MassARRAY R iPLEX platform at the Centre of Genomic Sciences (University of Hong Kong). The procedure for DNA sample preparation and data quality control are described elsewhere (Ponomarenko et al., 2019). All DNA samples successfully met the following quality control criteria: call rate > 95%, the success rate of duplicate check > 99.5%, and the success rate of the blank check > 90%. Finally, the proportion of the determined genotypes for the 52 SNPs was 98.84%. The samples with a proportion of determined genotypes < 95% were excluded from the analysis (n = 70). Thus, the final study sample included 1,550 participants: 569 women with UL and 981 control subjects.

Statistical Analysis
Association of clinical anamnestic risk factors with UL was assessed using logistic regression analysis as implemented in the epicalc package in the R software environment [version 3.4.0 (2017-04-21)].
The correspondence of the SNPs to the Hardy-Weinberg equilibrium (HWE) was checked using the chi-square test. The logistic regression method was used to analyze associations of the SNPs with UL assuming additive, recessive, and dominant genetic models. To account for possible confounding effects, the following set of covariates was applied: family history of UL, history of infertility, the presence of induced abortions in the anamnesis, and chronic endometritis as qualitative variables (yes/no), whereas age, BMI, and the number of induced abortions in the anamnesis as quantitative variables (value of the trait) ( Table 1). The adaptive permutation test was applied to adjust for multiple comparisons (Che et al., 2014). The significance level was set at p perm < 0.01 (after the Bonferroni correction based on the numbers of genetic models studied).
The given sample size (569 patients with UL and 981 controls) was sufficient to detect differences in allelic frequencies between affected subjects and controls at OR = 1.23-1.57 for the additive model, OR = 1.35-1.60 for the dominant model, and OR = 1.40-8.70 for the recessive model (at 80% power, A= 0.05 for twosided test).
The haplotype blocks were identified using the algorithm implemented in HaploView v.4.2 (Barrett et al., 2005). The association analyses and permutation test were performed using the respective algorithms implemented in the PLINK v. 2.050 software (Purcell et al., 2007) 7 . The significance level was set at p perm < 0.05.
The interactions between genes were analyzed assuming the two-, three-, and four-locus models and using the model-based multifactor dimensionality reduction (MB-MDR) method (Calle et al., 2008(Calle et al., , 2010Mahachie et al., 2012) in the mbmdr package (version 2.6) in the R software environment [version 3.4.0 (2017-04-21)]. The permutation test was demonstrated to be efficient for analysis of large massifs of GWAS data without reduction of the power (Che et al., 2014). For the permutation test, the following threshold p-values (after the Bonferroni correction based on the numbers of combinations studied for 52 SNPs) were adopted for models of gene-gene interactions: p < 3.8 * 10 −5 (<0.05/1,326) for two-locus models, p < 2.3 * 10 −6 (<0.05/22,100) for threelocus models, and p < 1.8 * 10 −7 (<0.05/270,725) for four-locus models. The significance level was set at p perm ≤ 0.001.
The interactions of the genes with induced abortions and chronic endometritis were analyzed for their possible effect on UL. Induced abortions and chronic endometritis were included in the analysis because (i) they were determined as risk factors for UL in the study sample (Table 1), (ii) induced abortions are a common birth control method of women in Russia as compared with other countries (Sedgh et al., 2007), and (iii) induced abortions are a common cause of post-abortion endometritis and, subsequently, chronic endometritis in Russian women (Douglas et al., 2014). The analysis was performed using MB-MDR with adjustment for covariates (age, BMI, family history of UL, and history of infertility) and multiple comparisons (1,000 permutations) as described above. The permutation test was applied to the selected best models of gene-environment interactions with the significance level of p < 1 * 10 −19 . The significance level was set at p perm < 0.001.
The most significant models of gene-gene and geneenvironment interactions associated with UL were crossvalidated using generalized multifactor dimensionality reduction (GMDR) (Lou et al., 2007;Chen et al., 2011) 8 and the respective software (Beta 0.9) 9 . The main parameters of the validation were adjusted for covariates and multiple comparisons using the permutation test (1,000 permutations with 10-fold crossvalidation, which provided statistical significance at p perm < 0.001 for a validated model).
The identified interactions and proportion of their contribution to the total variance of the trait were visualized using the MDR method 10 and the MDR v. 3.0.2 software 11 .

Functional Single-Nucleotide Polymorphisms
The HaploReg v. 4.1 12 and the data of the European population from the 1000 Genomes Project Phase were utilized to determine SNPs linked to the UL-associated polymorphisms. Then all these loci were analyzed for their functional significance (nonsynonymous SNPs, regulatory potential, and eQTLs).

Non-synonymous Single-Nucleotide Polymorphisms
The SIFTonline tool 13 was used to analyze non-synonymous SNPs and their functional predictions.

Expression Quantitative Trait Loci
The data from the Blood eQTL browser 17 were utilized to assess the eQTL (cis-and trans-eQTL) significance of the studied SNPs in peripheral blood. The eQTL value in other organs and tissues was estimated using the GTEx Portal 18 (data release V7 updated on 09/05/2017, dbGaP Accession phs000424.v7.p2). The false discovery rate (FDR) ≤ 0.05 was applied.

Pathway Analyses
The functional significance of the UL-associated genes in metabolic pathways was analyzed using the Gene Ontology Portal tools (PANTHER Overrepresentation Test accessed on 04/13/2017; PANTHER version 12.0 accessed on 07/10/2017 19 ) and the FDR test to correct the results for multiple comparisons.
GeneMANIA 20 (version 3.5.0, accessed on 03/13/2017) and the automatic weighting were used to construct the gene interaction networks. Gene Ontology Portal provides a possibility to evaluate functional significance of genes in various metabolic pathways using seven different data bases (Gene Ontology molecular function, Gene Ontology biological process, PANTHER protein class, PANTHER pathway, PANTHER molecular function, PANTHER biological process, and Reactome pathway) and makes it possible to construct the gene interaction networks based on seven different variants of gene interaction, i.e., physical interactions, predicted interaction, pathways, co-expression, colocalization, genetic interactions, and shared protein domains.

RESULTS
The phases of the study and main results are outlined in Figure 1.

Study Participants Characteristics
The UL patients (n = 569) were 2.49 years older and had higher BMI as compared with the controls (n = 981) (p < 0.001) ( Table 1). They also had more often a family history of UL (adjusted for age and BMI, OR adj = 2.03, 95% CI 1.46-2.98, p < 0.001), history of infertility (OR adj = 2.36, 95% CI 1.75-3.97, p < 0.001), presence (OR adj = 2.78, 95% CI 2.23-3.48, p < 0.001), and the number of induced abortions in the anamnesis (OR adj = 1.69, 95% CI 1.54-1.86, p < 0.001), chronic endometritis (OR adj = 1.59, 95% CI 1.08-2.35, p = 0.02) ( Table 1). Therefore, these factors (in addition to age and BMI) were applied as covariates in the association analyses. The higher number of pregnancies in the patients (3.34 vs. 2.45 in cases and controls, respectively) was due to the higher number of the induced abortions (1.59 vs. 0.67 in cases and controls, respectively) in this group as compared with the controls.

Single-Nucleotide Polymorphism and Haplotype Association Analysis
The data about the analyzed loci are presented in Supplementary Tables 5, 6. All loci had MAF > 5% and were in correspondence with the HWE (ð bonf > 0.001).

Single-Nucleotide Polymorphism × Single-Nucleotide Polymorphism Interactions
In total, 11 most significant two-, three-, and four-locus models of gene-gene interactions associated with UL were determined ( Table 3 and Supplementary Table 8). These models included 20 SNPs. Loci of the FSHB, LIN28B, and POMC genes appeared in the largest number of models (seven, six, and five, respectively). FIGURE 1 | Schematically displayed the phases of the associations study single-nucleotide polymorphism (SNP) candidate genes for age at menarche and uterine leiomyoma (UL) and at each phase at short note the results obtained.
Locus rs314276 LIN28B contributed to the most significant genegene interaction models at all the levels considered. Among the obtained models of SNP-SNP interactions, the most optimal for prediction was rs314276 LIN28B × rs1782507 FSHB × rs1544410 VDR × rs7589318 POMC (OR = 2.69, Test. Bal. Acc. 54.70, sensitivity 68.72, specificity 55.05, Table 3). It had the highest values of sensitivity and specificity as compared with the others.
The graph of SNP-SNP interactions of 21 polymorphisms (20 SNPs that make up the 11 best models and rs7759938 LIN28B independently associated with UL) (Figure 2) suggests that these interactions are concerted; the highest contribution to the entropy is made by interactions rs3020394 ESR1 × rs7766109 F13A1 (0.51%), rs2164808 POMC × rs7753051 IGF2R (0.50%), and polymorphism rs7759938 LIN28B (0.51%).   The results were obtained using the model-based multifactor dimensionality reduction (MB-MDR) method with adjustment for covariates. NH, number of significant highrisk genotypic combinations associated with uterine leiomyoma; beta H, coefficient of the logistic regression for significant high-risk genotypic combinations associated with uterine leiomyoma; WH, the Wald test value for significant high-risk genotypic combinations associated with uterine leiomyoma; NL, number of significant low-risk genotypic combinations associated with the uterine leiomyoma; beta L, coefficient of the logistic regression for significant low-risk genotypic combinations associated with the uterine leiomyoma; WL, the Wald test value for significant low-risk genotypic combinations associated with uterine leiomyoma; p perm , p-value for the permutation test (1,000 permutations); SNP, single-nucleotide polymorphism.

Gene-Environment Interactions
In total, nine loci were determined to contribute to geneenvironment interactions with induced abortions and chronic endometritis, which were significantly associated with UL. The analysis yielded 16 best two-, three-, and four-order models ( Table 4 and Supplementary Table 10). Two loci, rs12324955 FTO and rs11031010 FSHB, were included in the largest number of the best models (11 and 10, respectively) ( Table 4). Loci rs4633 COMT and rs314280 LIN28B were significantly associated with UL through the interaction with induced abortions and chronic endometritis. Among the obtained models of gene-environment interactions, the most optimal for prediction was rs12324955 FTO × rs11031010 FSHB × chronic endometritis × aborts (OR = 3.90, Test. Bal. Acc. 64.99, sensitivity 62.57, specificity 70.03, Table 4).
The graph of the interactions between the studied loci, induced abortions, and chronic endometritis suggests that the abortions account for the largest proportion (5.02%) of the trait entropy (Figure 3). In general, the contribution of the most significant gene-environment interactions to UL is 0.31-0.41%, on average.

Single-Nucleotide Polymorphisms Associated With Uterine Leiomyoma Are Also Associated With Age at Menarche, Height, and Body Mass Index in Adults
The analyses determined 23 loci associated with UL either individually or through gene-gene and gene-environment interactions. We also analyzed whether these SNPs were  The results were obtained using the model-based multifactor dimensionality reduction (MB-MDR) method with adjustment for covariates. NH, number of significant high-risk genotypic, aborts chronic and endometritis combinations associated with uterine leiomyoma; beta H, coefficient of the logistic regression for significant high-risk genotypic, aborts and chronic endometritis combinations associated with uterine leiomyoma; WH, the Wald test value for significant high-risk genotypic, aborts and chronic endometritis combinations associated with uterine leiomyoma; NL, number of significant low-risk genotypic, aborts and chronic endometritis combinations associated with the uterine leiomyoma; beta L, coefficient of the logistic regression for significant low-risk genotypic, aborts and chronic endometritis combinations associated with the uterine leiomyoma; WL, the Wald test value for significant low-risk genotypic, aborts and chronic endometritis combinations associated with uterine leiomyoma; p perm , p-value for the permutation test (1,000 permutations).
associated with the other phenotypic characteristics, namely, menarcheal age, height, and BMI in the studied sample (the respective results were reported elsewhere; Ponomarenko et al., 2019). Out of the 23 SNPs, 16 SNPs (69.57%) also manifested association with AAM, height, and/or BMI (Supplementary Table 12). Locus rs4633 COMT manifested association with all the above phenotypes, and three loci (rs2164808 POMC, rs4374421 LHCGR, and rs4946651 LIN28B) were also associated with at least two of the three phenotypes analyzed (i.e., age of menarche and/or height and/or BMI).

Non-synonymous Single-Nucleotide Polymorphisms
None of the 23 polymorphisms associated with UL was missense. However, two of these were in linkage disequilibrium with nonsynonymous SNPs. Polymorphism rs4633 was linked to rs4680, which results in an amino acid substitution Val158Met in the COMT protein. This replacement has SIFT Score = 0.02 that corresponds to the predictive value "deleterious." Another locus, rs2241423 (15q23), was linked to rs7170185, a missense variant (Trp200Arg) in the SKOR1 protein. This amino acid change also has a predictive value of "deleterious" (SIFT Score = 0).

Regulatory Effects
The data on regulatory effects of the UL-associated SNPs are given in Supplementary Table 13. Locus rs4633 COMT appeared to have the most pronounced effect; the significant regulatory potential was also determined for rs314280 LIN28B, rs2164808 POMC, rs12324955 FTO, and rs10769908 STK33 (Supplementary Table 13).
The 514 SNPs linked to the UL-associated loci were analyzed for their regulatory potential (Supplementary Table 14). Among these, more than 460 had regulatory significance. The most pronounced regulatory potential was determined for loci linked to polymorphisms rs314276 and rs7759938 of the LIN28B gene (two SNPs), rs11031010, rs555621, and rs1782507 of the FSHB gene (four SNPs), rs2241423 MAP2K (>10 SNPs), rs10769908 STK33 (two SNPs), and rs4633 COMT (three SNPs) (Supplementary Table 14).
Importantly, the loci that are linked to the UL-associated SNPs showed their regulatory effects in organs and tissues related to pathogenesis of UL, i.e., ovaries, muscle tissue, adipose tissue, various brain regions (hypothalamus, pituitary, etc.), and liver (Commandeur et al., 2015).
According to the data of the Genotype-Tissue Expression (GTEx) project, 16 SNPs of the 23 associated with UL had cis-eQTL significance in various tissues and organs, including those that are related to pathogenesis of UL (p < 8 * 10 −5 , p FDR ≤ 0.05) (Supplementary Table 17). Allele T (ref) of rs7759938 LIN28B is associated with the lower expression of the gene in hypophysis (β = -0.50, ð = 1.3 * 10 −11 , p FDR ≤ 0.05), while alternative allele C has a protective effect for UL (OR = 0.74).
Sixteen UL-associated loci manifested strong linkage to more than 380 polymorphisms affecting gene expression (p < 8.5 * 10 −5 , p FDR ≤ 0.05) in various organs and tissues (Supplementary Table 18). Loci rs1782507, rs11031010, and rs555621 of FSHB appeared to have the most pronounced effect as they were linked to more than 120 SNPs affecting expression of the FSHB, ARL14EP, and RP4-710M3.1 genes in more than 25 organs and tissues.
Overall, 18 of the 23 UL-associated loci have cis-eQTL value: they affect expression of 28 genes. Among these, two SNPs are associated with the mRNA transcription level independently, and 16 SNPs are both individually associated with and linked to other loci affecting gene expression levels.

Pathway Analyses
The roles in biological processes or molecular functions of the UL-associated genes (Supplementary Table 13) and those whose expression is affected by the UL-associated 23 loci according to the eQTL analysis (Supplementary Tables 15-18) were analyzed in silico using annotations from the Gene Ontology databases. We found evidence of enrichment for pathways involved in the follicle-stimulating hormone signaling pathway (FDR = 0.002), hormone ligand-binding receptors (FDR = 0.002), ovarian follicle development (FDR = 0.003), and other pathways related to development of the female reproductive organs and hormonemediated signaling pathways (FDR ≤ 0.05) (Supplementary Table 19).

DISCUSSION
This study reports for the first time the associations of 23 candidate loci for AAM with UL. Locus rs7759938 of the LIN28B gene demonstrated the most significant associations: its allele Ñ decreased a risk for UL according to the dominant model (OR = 0.74). This locus is also associated with UL within most significant four-order model of polymorphisms interactions with induced abortions.
According to the GTEx Portal database, rs7759938 and 27 SNPs linked to it have the cis-eQTL significance and may affect expression of LIN28B in the pituitary gland (allele Ñ of rs7759938 increases expression of LIN28B). Association of this SNP with menarcheal age was reported previously (Perry et al., 2009;Elks et al., 2010;Delahanty et al., 2013;Perry et al., 2014;Ponomarenko et al., 2019). Several studies suggested that rs7759938 might play a role in the pubertal development and adult height (Elks et al., 2010;Lango Allen et al., 2010;Widén et al., 2010;Leinonen et al., 2012;Cousminer et al., 2013;Perry et al., 2014). Two polymorphisms, rs9391253 and rs314263, which are linked to rs7759938 (r 2 ≥ 0.94), were associated with height (Berndt et al., 2013;Wood et al., 2014). Importantly, allele C of rs7759938 was previously determined as associated with later menarche, delayed pubertal growth, and high height (Perry et al., 2009, Elks et al., 2010Delahanty et al., 2013). On the other hand, the present study suggested this allele as a protective factor for UL. That means that AAM and a risk for UL may have a shared genetic basis.
Two SNPs of the LIN28B gene (rs314276 and rs4946651) were involved in five out of 11 most significant models of epistatic interactions, and polymorphism rs314280 LIN28B was associated with UL only through interaction with induced abortions and chronic endometritis. All these polymorphisms were reported as associated with AAM (He et al., 2009;Ong et al., 2009;Sulem et al., 2009;Carty et al., 2013). There is evidence that rs314276 may play a role in the pubertal development and height (Ong et al., 2009;Cousminer et al., 2014), BMI, and weight in females (Ong et al., 2011). Previously, we reported the association of the rs314280 and rs4946651 polymorphisms with BMI of adults (Ponomarenko et al., 2019). Two polymorphisms, rs9391253 and rs314263, which are linked to rs314276 (r 2 ≥ 0.94), were associated with height (Berndt et al., 2013;Wood et al., 2014). The data of GTEx Portal suggest that the UL-associated SNPs of LIN28B have cis-eQTL significance: rs314276 (with 20 SNPs linked to it) and rs4946651 (with 15 SNPs linked to it) are associated with expression level of LINC00577 in the cortex and basal ganglia of the brain, LIN28B in the pituitary gland, and HACE1 in skeletal muscle; rs314276 and 27 are linked to SNPs that may affect expression of the LIN28B gene in the pituitary gland.
The above four UL-associated SNPs of the LIN28B gene are located in region 6q16.3-q21 about 38 kb from each other and are linked to about 40 SNPs with the regulatory potential and cis-eQTL values (affect transcription of the LIN28B, LINC00577, and HACE1 genes).
LIN28B (lin-28 homolog B) is a member of the LIN28 family of proteins and functions as a repressor of the let-7 family of miRNA (Tsialikas and Romer-Seibert, 2015). These miRNAs contribute to developmental processes in invertebrate animals, while their role in vertebrates is not well understood. Some data suggest that the let-7 miRNAs may contribute to cell cycle control FIGURE 4 | The interaction networks of the candidate genes for uterine leiomyoma inferred using GeneMANIA (http://genemania.org). The candidate genes determined in the present study are cross-shaded. and oncogenesis (Tsialikas and Romer-Seibert, 2015). The target genes of the let-7 miRNA, Kras, Myc, Hmga2, and Igf2bp1 are involved in body size control and metabolism in mammals . The function of LINC00577 (LIN28B antisense RNA 1) is unknown; according to the information from the Ensembl database, it belongs to the genes encoding long intergenic non-coding RNAs (lincRNAs), which interact with proteins, DNA, and other RNAs, and thus perform important regulatory functions (Ulitsky and Bartel, 2013). A protein product of the HACE1 (HECT domain and ankyrin repeat containing E3 ubiquitin protein ligase 1) gene is a potential tumor suppressor and participates in the specific tagging of target proteins that subsequently results in proteasome degradation 21 .
The rs555621, rs11031010, and rs1782507 polymorphisms of the FSHB gene appeared in the largest number of the most significant models of intergenic interactions (7 out of 11 models); rs11031010 FSHB was presented in 10 out of 16 most significant models gene-environment interactions associated with UL. Recently, we determined association of rs555621 with BMI of adults (Ponomarenko et al., 2019). The above three loci were also associated with AAM , and rs11031010 was suggested to be a risk factor of PCOS and affect a level of luteinizing hormone (LH) in patients (Tian et al., 2016). 21 http://www.genecards.org/ The FSHB gene encodes a beta-subunit of the folliclestimulating hormone. This hormone has multiple functions in the female reproductive system, including stimulation proliferation of follicular granulosa cells, rescue of follicles from apoptosis, and synthesis of receptors for the LH on these cells before ovulation.
The recent meta-analysis of 11 GWAS determined the region on chromosome 11 (11p14.1, rs74485684) harboring the FSHB gene as that associated with endometriosis (Sapkota et al., 2017). Importantly, rs11031010 FSHB analyzed in the present study is located just 1.9 kb from the above locus. On the other hand, the first GWAS of UL in European populations (Rafnar et al., 2018) replicated three of the 19 previously reported endometriosis variants (Sapkota et al., 2017). Several loci associated with endometriosis, including rs11031006 located in 11p14.1, were reported for their association with UL in the recent GWAS (Gallagher et al., 2019). Importantly, the rs11031010 polymorphism of the FSHB gene analyzed in the present study is linked to rs11031006 (r 2 = 0.64).
An epidemiological meta-analysis of 402,868 women suggested at least twofold higher risk for UL in patients with a history of endometriosis (Gallagher et al., 2019). It was hypothesized that both UL and endometriosis were underlain by hormone-related factors (Rafnar et al., 2018) and thus might have a shared genetic basis (Gallagher et al., 2019). This assumption was supported by the recent findings of syntropic genes for these diseases (Rafnar et al., 2018;Gallagher et al., 2019).
There is evidence that some loci linked to rs11031010 may contribute to various reproductive characters, such as concentration of the luteinizing (rs11031002, r 2 = 0.76) and follicle-stimulating (rs11031005, r 2 = 0.64) hormones in blood plasma (Ruth et al., 2016), and age at menopause (rs12294104, r 2 = 0.61) (Stolk et al., 2012). The SNPs upstream of the FSHB transcription start were implicated into a key role in many reproductive processes (Gajbhiye et al., 2018). It is generally acknowledged that women with late menopause or/and early menarche have, on average, an increased lifetime exposure to estrogen due to the larger number of ovulatory cycles. Since mitotic activity of myometrium is the highest during the luteal phase of the cycle, a longer history of menses may theoretically increase a risk for UL. Likewise, this risk may be higher due to the increasing LH level (Wise and Laughlin-Tommaso, 2016).
The present study determined that the rs2164808 polymorphism of the POMC gene was involved in the largest number of the most significant models of gene-gene (four out of 11 models) and gene-environment (five out of 16 models) interactions associated with UL. Previously, we reported the association of the rs2164808 locus with BMI of adults (Ponomarenko et al., 2019). The association of this locus with AAM was first reported by He et al. (2010). Polymorphism rs1561288, which is linked to rs2164808 (r 2 = 0.25), was associated with BMI (Graff et al., 2013). Locus rs2164808 and linked to it rs4665765 are both associated with the transcription level of DNAJC27 (according to the Blood eQTL browser) and RP11-509E16.1 (according to the GTEx Portal database). The data from the SNP Function Prediction and rSNPBase suggest a significant regulatory effect of rs2164808 (proximal regulation and RNA binding protein mediated regulation effects, regulatory potential score 0.334). The POMC (pro-opiomelanocortin-alpha) gene encodes a preproprotein that undergoes tissue-specific posttranslational modifications resulting in up to 10 biologically active peptides, which participate in various cellular processes, including those related to UL (e.g., lipolysis, mobilization of fatty acids, steroidogenesis, and adrenal development) (see footnote 21). The product of the DNAJC27 [DnaJ heat shock protein family (Hsp40) member C27] gene is a GTPase that is associated with the MEK/ERK pathway and may induce cell transformation (see footnote 21). The function of RP11-509E16.1 is unknown; according to the information from the Ensembl database 22 , it belongs to the genes encoding lincRNAs, which interact with proteins, DNA, and other RNAs and thus perform important regulatory functions (Ulitsky and Bartel, 2013). The polymorphisms located in the regions of the POMC, DNAJC27 (RBJ), and ADCY3 genes were associated with pubertal development, height, BMI, obesity, and type I diabetes mellitus (Gudbjartsson et al., 2008;Lango Allen et al., 2010;Speliotes et al., 2010;Bradfield et al., 2011;Berndt et al., 2013;Cousminer et al., 2013;He et al., 2015). 22 http://www.ensembl.org/ One more finding of the present study is that associations with UL were determined not only for candidate SNPs for AAM (Supplementary Table 4) but also for seven polymorphisms manifesting association or tagged with the traits related to menarche (e.g., sex steroid hormone and vitamin D metabolism, and physical characteristics; Supplementary Tables 4, 12). These menarche-related traits are important for pathogenesis of UL (Commandeur et al., 2015;Wise and Laughlin-Tommaso, 2016). The latter are involved in the sex steroid hormone pathway (rs3020394 and rs1884051 ESR1, rs4633 COMT, and rs12324955 FTO and rs7766109 F13A1) and vitamin D metabolism (rs1544410 VDR and rs222020 GC) (Supplementary Table 4, see footnote 21), which are important for UL development (Commandeur et al., 2015;Wise and Laughlin-Tommaso, 2016). For example, rs4633 COMT, which is also associated with AAM, height, and BMI in the studied sample (Ponomarenko et al., 2019) and linked to it rs4680 is associated with the COMT expression level in peripheral blood (according to the Blood eQTL browser). The activity of the COMT enzyme is elevated in human leiomyoma tissue as compared with normal myometrium; since this enzyme is essential for estrogen metabolism, the observed association suggests a possible causal role of COMT in UL formation (Commandeur et al., 2015). In addition, the rs4680 polymorphism encodes a replacement substitution Val158Met in COMT: the Met variant has 40% lower activity as compared with the Val one (Chen et al., 2004). Further support of the possible contribution of rs4680 to UL comes from the metaanalysis by Feng et al. (2013), who reported the association of rs4680 with UL.
We did not find a direct association of AAM with UL but determined that 23 candidate loci for AAM might be associated with UL. Among these, several were also associated with age of menarche, height, and/or BMI of adults in the study sample. This may suggest that menarcheal age per se is not a primary risk factor for UL in the studied sample (the population of Russia) and that other factors may be of greater importance. Specifically, we found that the history of induced abortions significantly increased the risk of UL in the study sample (OR adj = 2.78). Our results support the recent findings by other researchers Song et al., 2017). Induced abortions are common birth control method in Russia (and generally in the former USSR) (Sedgh et al., 2007;David et al., 2007;Douglas et al., 2014), and Russia (along with Cuba) is a leader by this indicator in the world (Sedgh et al., 2007). Induced abortions in the first trimester of pregnancy may cause multiple health complications, including PCOS, hyperprolactinemia, anovulatory menstruation, disorders of the luteal phase of the menstrual cycle, and postabortion endometritis (Douglas et al., 2014). Induced recurring abortions result in injuries of the uterus. The patient cohort in the present study included 44.99% individuals who had at least two induced abortions. Uterine injuries may induce the expression of the growth factor that speed ups cellular proliferation, suppresses apoptosis, and increases extracellular matrix production (Wise and Laughlin-Tommaso, 2016). Indeed, our results suggest that the risk for UL directly correlates with the number of the abortions in the anamnesis (OR adj = 1.69): it increases from OR = 3.85 (95% CI 2.79-5.32, p < 0.001) in a case of two abortions up to , p < 0.001) with four and more abortions. Similar results were reported by Song et al. (2017) who showed that the higher number of induced abortions was associated with an increased risk of UL (one induced abortion, OR = 1.32; two induced abortions, OR = 1.45; and ≥ 3 induced abortions, OR = 1.62). PCOS was associated with a 65% increased risk of UL (Wise and Laughlin-Tommaso, 2016). The mechanisms by which PCOS may elevate risk for UL include increased levels of LH or unopposed estrogens and its association with hyperinsulinemia (Wise and Laughlin-Tommaso, 2016). Induced abortions may also be associated with other hormone-related diseases (e.g., breast cancer) (Schneider et al., 2014). Inflammation in the uterus (stress from chronic inflammation) seems a quite possible etiologic factor of UL: studies showed that inflammation increases the production of extracellular matrix and decreases apoptosis in UL (Commandeur et al., 2015;Wise and Laughlin-Tommaso, 2016).

CONCLUSION
Candidate genes for AAM are associated with UL. Locus rs7759938 LIN28B is associated with UL independently; 20 loci are associated within the 11 most significant intergenic interaction models, and nine loci may contribute to the risk of the disorder through the 16 most significant gene-environment interactions. These associations probably result from multiple effects of more than 510 polymorphisms linked to the above SNPs and implicated in various metabolic processes in organs and tissues that are related to pathogenesis of UL.

DATA AVAILABILITY STATEMENT
The datasets generated for this study can be found in EVA at the following accessions Project: PRJEB40782, Analyses: ERZ1645102.

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Regional Ethics Committee of Belgorod State University. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
IP, MC, and VD made substantial contributions to conception and design, drafted the manuscript, and gave the final approval of the version to be published. ER, AP, IV, IS, and AY analyzed and interpreted of data, drafted the manuscript, and gave the final approval of the version to be published. All the authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.