Repeat-mediated epigenetic dysregulation of the FMR1 gene in the fragile X-related disorders

The fragile X-related disorders are members of the Repeat Expansion Diseases, a group of genetic conditions resulting from an expansion in the size of a tandem repeat tract at a specific genetic locus. The repeat responsible for disease pathology in the fragile X-related disorders is CGG/CCG and the repeat tract is located in the 5′ UTR of the FMR1 gene, whose protein product, FMRP, is important for the proper translation of dendritic mRNAs in response to synaptic activation. There are two different pathological FMR1 allele classes that are distinguished only by the number of repeats. Premutation alleles have 55–200 repeats and confer risk of fragile X-associated tremor/ataxia syndrome and fragile X-associated primary ovarian insufficiency. Full mutation alleles, on the other hand, have >200 repeats and result in fragile X syndrome, a disorder that affects learning and behavior. Different symptoms are seen in carriers of premutation and full mutation alleles because the repeat number has paradoxical effects on gene expression: epigenetic changes increase transcription from premutation alleles and decrease transcription from full mutation alleles. This review will cover what is currently known about the mechanisms responsible for these changes in FMR1 expression and how they may relate to other Repeat Expansion Diseases that also show repeat-mediated changes in gene expression.


Introduction
The fragile X-related disorders belong to a group of more than 20 human genetic conditions known as the Repeat Expansion Diseases (Mirkin, 2007). The disease-causing mutation in all cases is an expansion, or increase in the number of repeats, in a specific tandem repeat tract. In the case of the fragile X-related disorders, the repeat unit is CGG/CCG and the repeat tract is located in the 5′ untranslated region of the fragile X mental retardation 1 (FMR1) gene (Verkerk et al., 1991). The FMR1 gene encodes FMRP, a protein important for the regulation of translation of dendritic mRNAs in response to synaptic activation (Weiler et al., 1997; Weiler and Greenough, 1999). Two different pathological FMR1 allele size classes are distinguished. Premutation alleles have 55–200 repeats, while alleles with >200 repeats are referred to as full mutation alleles. These two size classes are distinguished because they confer risk of different clinical conditions. Carriers of premutation alleles are at risk of an adult-onset neurodegenerative disorder known as fragile X-associated tremor/ataxia syndrome (reviewed in Hagerman, 2013). Female carriers are also at risk of fragile X-associated primary ovarian insufficiency, a condition that is associated with fertility problems and an earlier than normal menopause (reviewed in Sullivan et al., 2011; Sherman et al., 2014). In contrast, full mutation alleles are associated with fragile X syndrome, the leading heritable cause of intellectual disability and the major monogenic cause of autism (Oberle et al., 1991; Fu et al., 1991; Verkerk et al., 1991).
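The two pathological size classes defined above can be written as a simple lookup. The following Python sketch is illustrative only: the function name and the label used for alleles below the premutation range are assumptions, and only the 55–200 and >200 thresholds come from the text.

```python
def classify_fmr1_allele(cgg_repeats: int) -> str:
    """Assign an FMR1 allele to a size class from its CGG-repeat number."""
    if cgg_repeats < 0:
        raise ValueError("repeat number cannot be negative")
    if cgg_repeats > 200:
        # >200 repeats: associated with fragile X syndrome.
        return "full mutation"
    if cgg_repeats >= 55:
        # 55-200 repeats: risk of tremor/ataxia syndrome and
        # primary ovarian insufficiency.
        return "premutation"
    # Assumed label: the text defines only the two pathological classes.
    return "below premutation range"


if __name__ == "__main__":
    for n in (30, 55, 200, 201):
        print(n, classify_fmr1_allele(n))
```

Repeat number alone determines the class in this sketch; it says nothing about the clinical outcome in any individual carrier.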
Premutation and full mutation carrier symptoms are different, in part, because the repeat has paradoxical effects on gene expression. Premutation alleles are hyper-expressed (Tassone et al., 2000a) and disease symptoms result from the deleterious consequences of high levels of expression of the transcript with large numbers of CGG-repeats (reviewed in Sellier et al., 2014). In contrast, the symptoms of fragile X syndrome result from repeat-mediated heterochromatin formation that causes transcriptional silencing and a subsequent deficiency of FMRP (Sutcliffe et al., 1992). Interestingly, it is becoming increasingly apparent that heterochromatin formation is a common feature of other Repeat Expansion Diseases that involve large repeat tracts. These include myotonic dystrophy type 1, Friedreich ataxia and amyotrophic lateral sclerosis/frontotemporal dementia (ALS/FTD; Otten and Tapscott, 1995; Steinbach et al., 1998; Herman et al., 2006; Greene et al., 2007; Belzil et al., 2013; Xi et al., 2013).
How the repeats cause hyper-expression of premutation alleles and silencing of full mutation alleles is not well understood. This review will cover recent findings that suggest how the repeats are able to have such different effects on gene expression.

The CGG-repeat Tract Forms Stable Secondary Structures
The CGG-repeat that is present in both the FMR1 gene and its transcript can form a variety of secondary structures. In vitro the DNA repeats form a stem-loop/hairpin and a folded-hairpin-like structure known as a G-tetraplex or quadruplex (Fry and Loeb, 1994; Kettani et al., 1995; Mitas et al., 1995; Usdin and Woodford, 1995; Patel et al., 2000). The hairpin contains a mixture of Watson-Crick G:C base pairs and Hoogsteen G:G base pairs in a 2:1 ratio. The quadruplex is stabilized primarily by G-quartets. The RNA forms a similar set of structures (Handa et al., 2003; Zumwalt et al., 2007; Malgowska et al., 2014). Similar structures are also formed by many of the repeats responsible for other members of the Repeat Expansion Disease family that show repeat-mediated epigenetic changes (reviewed in Usdin et al., 2015), and quadruplex formation has been reported for the very GC-rich ALS/FTD repeats (Fratta et al., 2012; Reddy et al., 2013; Haeusler et al., 2014) and the repeats responsible for progressive myoclonus epilepsy type 1 (EPM1; Saha and Usdin, 2001). There is evidence for the formation of stem-loop structures in vivo by the CGG-strand of the fragile X repeat (Loomis et al., 2014) and the CAG- and CTG-repeats responsible for diseases like myotonic dystrophy type 1 and Huntington disease (Liu et al., 2010). Transient unpairing of the repeat region during DNA replication, DNA repair or transcription is thought to provide the opportunity for these structures to form.
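The 2:1 ratio of G:C to G:G pairs in the hairpin follows from the repeat register, which a toy calculation can make explicit. The Python sketch below is a simplified illustration, not a structure prediction: it assumes an idealized fold in which a (CGG)n strand pairs with itself in perfect antiparallel register around a short unpaired loop, and the function name and the `loop_len` parameter are hypothetical.

```python
from collections import Counter

WATSON_CRICK = {("C", "G"), ("G", "C")}


def hairpin_pair_counts(n_repeats: int, loop_len: int = 6) -> Counter:
    """Count base-pair types in an idealized hairpin folded from (CGG)n.

    Toy model (assumption): base i pairs with base L-1-i, and the central
    `loop_len` bases are left unpaired to form the loop.
    """
    if loop_len < 0:
        raise ValueError("loop_len cannot be negative")
    strand = "CGG" * n_repeats
    length = len(strand)
    n_pairs = (length - loop_len) // 2
    if n_pairs < 1:
        raise ValueError("strand too short to form a stem")

    counts = Counter()
    for i in range(n_pairs):
        pair = (strand[i], strand[length - 1 - i])
        if pair in WATSON_CRICK:
            counts["G:C"] += 1
        elif pair == ("G", "G"):
            counts["G:G"] += 1
        else:
            # Not expected in this register; kept for completeness.
            counts["other"] += 1
    return counts


if __name__ == "__main__":
    print(hairpin_pair_counts(10))
```

In this register every C faces a G and every third position places two Gs opposite each other, so each three-pair stretch of stem contributes two G:C pairs and one G:G pair.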
In addition to the above-mentioned structures, many disease-associated repeats including the fragile X repeats (Groh et al., 2014; Loomis et al., 2014), the Friedreich ataxia repeat (Groh et al., 2014), the ALS/FTD repeat (Haeusler et al., 2014) and the CTG/CAG-repeats responsible for a number of Repeat Expansion Diseases (Lin et al., 2010) form stable R loops during transcription. An R loop contains an RNA:DNA hybrid formed between the nascent transcript and the transcribed strand. This leaves the displaced non-template strand unpaired. Such R loops form in regions with strand-asymmetry with regard to the distribution of purines and pyrimidines, particularly when the pyrimidine-rich strand is being transcribed. R loop formation may be facilitated by hairpin formation by the non-template strand that would reduce the likelihood of reannealing of the duplex behind the advancing transcription complex. Conversely, the persistence of the hybrid might also favor the formation of the hairpin on the non-template strand. R loops are the only structures identified to date that are formed by all repeats that become heterochromatinized. In bacteria and yeast R loops can also be formed in trans (e.g., by a distally transcribed mRNA) in the presence of RecA and Rad51 respectively (Zaitsev and Kowalczykowski, 2000; Wahba and Koshland, 2013), but evidence for such R loops in mammalian cells is lacking.
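The strand asymmetry referred to above is easy to quantify for the fragile X repeat. The Python sketch below is a minimal illustration with hypothetical function names; it assumes that the CGG-strand is the non-template strand (consistent with a transcript that carries CGG-repeats) and simply reports the purine fraction of each DNA strand.

```python
def purine_fraction(strand: str) -> float:
    """Return the fraction of bases in a DNA strand that are purines (A or G)."""
    strand = strand.upper()
    if not strand:
        raise ValueError("empty sequence")
    return sum(base in "AG" for base in strand) / len(strand)


def reverse_complement(strand: str) -> str:
    """Return the reverse complement of a DNA strand (A, C, G, T only)."""
    complement = str.maketrans("ACGT", "TGCA")
    return strand.upper().translate(complement)[::-1]


if __name__ == "__main__":
    non_template = "CGG" * 20                    # assumed non-template strand
    template = reverse_complement(non_template)  # the CCG-strand
    print("non-template purine fraction:", round(purine_fraction(non_template), 3))
    print("template purine fraction:", round(purine_fraction(template), 3))
```

Under these assumptions the template strand is the pyrimidine-rich one (two of every three bases are C), which is the orientation the text describes as favoring R loop formation.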
The number of repeats required to form an R loop on the FMR1 gene and the timing of R loop formation are the subject of some debate. One study suggests that they are only formed on full mutation alleles relatively late in neuronal differentiation (Colak et al., 2014), while other work suggests that R loops at the FMR1 locus also form on normal and premutation alleles in a number of different cell types (Groh et al., 2014; Loomis et al., 2014). This discrepancy may reflect differences in the stability of the structures being measured. Formation of these structures in vivo likely reflects some combination of the effect of repeat number, transcription rate (Groh et al., 2014; Loomis et al., 2014) and the expression of proteins that affect R loop stability (Colak et al., 2014).

Repeat-mediated Epigenetic Effects on the Premutation Allele
There is a direct relationship between the repeat number and FMR1 mRNA levels in humans and mice carrying the premutation allele, such that premutation carriers have 2–10 times more FMR1 mRNA than individuals with repeat numbers in the normal range (Tassone et al., 2000b; Entezam et al., 2007; Brouwer et al., 2008). The increased RNA levels are the result of increased transcription initiation rather than increased transcript stability (Tassone et al., 2007). The promoter of premutation alleles is enriched for acetylated histones (Todd et al., 2010) and premutation alleles initiate transcription from upstream start sites more frequently than is seen in normal cells (Beilina et al., 2004). While hyper-expression has not been reported for other Repeat Expansion Diseases, a similar increased usage of more 5′ start sites has been reported in individuals with ALS (Sareen et al., 2013).
Very little is currently known about why premutation alleles are overexpressed. CGG/CCG-repeats exclude nucleosomes in vitro (Wang et al., 1996). In principle, this could result in the 5′ end of the FMR1 gene being more accessible to transcription factors or to the transcription complex. However, to date there is no evidence for repeat-mediated nucleosome exclusion on FMR1 alleles in vivo. The FMR1 promoter is very CpG rich and has many of the hallmarks of a CpG-island promoter. CpG-island promoters act as transcription-independent nucleation sites for zinc finger CxxC domain-containing chromatin modifying proteins like CxxC finger protein 1 (CFP1; Thomson et al., 2010). CFP1 is a component of the SET1A/B-containing methyltransferase complex that facilitates H3K4 trimethylation (Clouaire et al., 2012), a histone mark typically associated with the 5′ ends of active genes. This has led to the suggestion that such proteins provide a self-reinforcing loop of unmethylated CpG recognition and subsequent protection from DNA methylation (Blackledge et al., 2013). In this view, hyper-expression of premutation alleles would be related to the high density of CpGs in the repeat that favors recruitment of factors that inhibit gene silencing. Alternatively, proteins like ATRX, a member of the SNF2 family of helicases/ATPases, could also contribute to hyper-expression of premutation alleles. ATRX colocalizes with G-rich regions that, like the FMR1 locus (Fry and Loeb, 1994; Kettani et al., 1995; Usdin and Woodford, 1995; Patel et al., 2000), have quadruplex-forming potential. ATRX facilitates transcription elongation through these regions by reducing transcription stalling (Levy et al., 2015).
R loops are a characteristic feature of unmethylated CpG-islands where they have been suggested to play a role in preventing gene silencing (Ginno et al., 2012). There is evidence to suggest that R loops provide some protection from de novo methylation by DNMT3B1, the primary de novo DNA methyltransferase active during early development (Ginno et al., 2012). How they do so is currently unknown, but one possibility is that the single-stranded region of the R loop is a preferential binding site for a number of epigenetic modifiers that are positive regulators of transcription. These modifiers include members of the H3K4 methyltransferase family (Krajewski et al., 2005), whose activity is thought to inhibit de novo methylation (Ooi et al., 2007). They also include the activation-induced cytosine deaminase (Chaudhuri et al., 2003) that is thought to be important for DNA demethylation (Popp et al., 2010). R loops may also be less capable of properly binding nucleosomes (Dunn and Griffith, 1980) and there is evidence to suggest that R loop formation causes chromosome decondensation (Powell et al., 2013). R loops formed on longer repeat tracts have been shown to extend further into the flanking regions (Loomis et al., 2014). This could perhaps favor transcription initiation by increasing the likelihood that promoter melting will occur or by facilitating the binding of additional transcription factors or chromatin modifiers to the promoter that in turn promote transcription initiation.
Treatment of fragile X patient cells with DNA methylation inhibitors leads to partial gene reactivation that is not associated with the loss of H3K9me2 or H3K9me3 (Kumari and Usdin, 2014). This suggests that DNA methylation occurs downstream of, or is independent of, the deposition of these chromatin modifications. This would be consistent with the observation that in rare full mutation carriers in which gene silencing does not occur, the FMR1 gene is enriched for H3K9me2 but shows no CpG methylation (Tabolacci et al., 2008). The FMR1 gene can also be reactivated by inhibition of Sirtuin 1 (SIRT1), the enzyme responsible for the deacetylation of H3K9 and H4K16 on fragile X alleles (Biacsi et al., 2008). Since DNA demethylation leads to acetylation of H4K16 but not H3K9 (Biacsi et al., 2008), this would be consistent with the idea that deacetylation of H3K9 precedes DNA methylation while deacetylation of H4K16 is a late event in the silencing process acting downstream of DNA methylation.
The earliest epigenetic change associated with silencing of the fragile X allele is unknown. However, the fact that the histone marks H3K9me3 and H4K20me3 show a peak of enrichment in the region of the fragile X repeat would be consistent with the idea that the repeats themselves are the site of nucleation for the silencing process (Kumari and Usdin, 2010). FMR1 mRNA knockdown blocks FMR1 gene silencing during neuronal differentiation in fragile X embryonic stem cells (Colak et al., 2014) and decreases the recruitment of polycomb repressive complex 2 (PRC2) to full mutation alleles that have been reactivated with 5-azadeoxycytidine (Kumari and Usdin, 2014). These findings suggest that the FMR1 transcript plays a key role in gene silencing by facilitating, either directly or indirectly, the recruitment of repressive histone modifying complexes like PRC2.
The FMR1 transcript may facilitate gene silencing by recruiting repressive histone modifiers as many long non-coding RNAs do. Examples are known where such RNA acts in cis or trans to recruit PRC2 along with either the LSD1-CoREST complexes responsible for H3K4me2 demethylation (Tsai et al., 2010), or the polycomb repressive complex 1 (Yap et al., 2010), or the H3K9 methylase G9a (Pandey et al., 2008). Long non-coding RNAs are also involved in recruiting the H4K20 trimethylase Suv4-20h (Bierhoff et al., 2014). The FMR1 transcript may be acting like these RNAs to recruit repressors to the fragile X locus. These factors may in turn act as a molecular scaffold to which other chromatin modifiers bind to ultimately generate the histone modification signature found on fragile X alleles (Kumari and Usdin, 2010). Of interest in this regard is the fact that the Fmr1 transcript is one of the transcripts most commonly associated with PRC2 in normal mouse embryonic stem cells (Zhao et al., 2010). This may be related to the ability of the 5′ end of the transcript to form stem-loop structures as assessed by the RNAfold algorithm (http://rna.tbi.univie.ac.at/cgi-bin/RNAfold.cgi). Such structures are thought to be important for PRC2 binding (Zhao et al., 2008). The human FMR1 transcript is predicted to form similar structures and may thus also recruit PRC2 even to normal and premutation alleles. However, not only would the amount of PRC2 recruited to the FMR1 locus be limited by the relatively low stability of the R loop on small repeat tracts, but any activity of PRC2 would also be inhibited by the presence of nascent RNA (Cifuentes-Rojas et al., 2014) or H3K36me3 (Schmitges et al., 2011), a histone modification deposited co-transcriptionally by the SETD2 protein. Even should some trimethylation of H3K27 occur, its spread would be limited by the high levels of H3K36me3 present in the region (Yuan et al., 2011).
In some cases targeting of the long non-coding RNA is thought to be accomplished by a bivalent protein that binds both the RNA and the target locus (Jeon and Lee, 2011), whilst in others the RNA is tethered to its target gene via the formation of an RNA:DNA:DNA triplex (Schmitz et al., 2010), or an RNA:DNA hybrid (Rinn and Chang, 2012). Tethering via the formation of an RNA:RNA hybrid has also been suggested (Rinn and Chang, 2012). Recruitment of repressive complexes via an RNA:DNA hybrid may be relevant for FMR1 gene silencing given the persistent R loop that is found coincident with the repeat on FMR1 alleles (Groh et al., 2014; Loomis et al., 2014) and the effect of R loop depletion on gene silencing in fragile X ESC-derived neurons (Colak et al., 2014). R loops have also been implicated in gene silencing in Friedreich ataxia (Groh et al., 2014). In principle, either a cis or trans association of the transcript with the FMR1 locus could tether the transcript to the promoter, allowing it to recruit repressive epigenetic modifiers like PRC2 that ultimately result in FMR1 gene silencing.
Gene silencing may also be triggered by the R loops themselves in ways that are independent of the FMR1 transcript per se. The free DNA strand in the fragile X R loop forms secondary structures (Loomis et al., 2014) that have been suggested to directly recruit DNA methyltransferases (Smith et al., 1994). However, our data suggest that DNA methylation is not the first step in the fragile X gene silencing process (Biacsi et al., 2008; Kumari and Usdin, 2014). Furthermore, Friedreich ataxia repeats do not form such secondary structures and methylatable CpG residues are not present in either the Friedreich ataxia repeats or the myotonic dystrophy type 1 repeats. Thus if gene silencing in these disorders has a common molecular basis, the use of this pathway to initiate gene silencing seems unlikely. R loops also facilitate Pol II pausing and termination (Skourti-Stathaki et al., 2011; Groh et al., 2014). In the β-actin locus pausing results in the production of an antisense transcript. The sense-antisense RNA pair leads to the RNAi-dependent deposition of H3K9me2 and the recruitment of heterochromatin protein 1γ (HP1γ; Skourti-Stathaki et al., 2014). However, knockdown of Dicer, Ago1, and Ago2, genes important for RNAi, did not prevent FMR1 gene silencing in neurons differentiated from fragile X embryonic stem cells (Colak et al., 2014). This would seem to rule out a role for RNAi in silencing of full mutation alleles, whether the RNAi is R loop-mediated or mediated independently of R loops by the interaction of the FMR1 sense and antisense transcript (Ladd et al., 2007) or CGG-RNA hairpins that are Dicer substrates (Handa et al., 2003). R loops are also prone to chromatin compaction via a H3S10 phosphorylation-dependent mechanism that is still unknown (Castellano-Pozo et al., 2013).
R loops have also been shown to result in double-strand breaks (Sordet et al., 2010) that could lead to the recruitment of a variety of epigenetic modifiers including DNA methyltransferase 1 and SIRT1 (O'Hagan et al., 2008).

Unresolved Questions and Future Directions
More work is needed before we understand the epigenetic regulation of the FMR1 gene. Until we do, the question of why shorter repeat tracts favor increased gene expression while longer ones favor gene silencing remains unanswered. The paradoxical effect of the repeats presumably reflects the equilibrium between interactions of the repeat tract with factors that favor the accumulation of positive chromatin modifications and those that favor the generation of repressive chromatin. For example, as illustrated in Figure 1, smaller repeat tracts may be more likely to form RNA:DNA hybrids that have a short half-life and that may leave the non-template strand single-stranded just long enough to bind transcriptional activators or DNA demethylases that have a preference for single-stranded regions. In contrast, longer repeat tracts would be more likely to form stable RNA:DNA hybrids that were better able to recruit complexes like G9a and PRC2 that deposit repressive chromatin marks. The longer hybrids may also be more prone to transcription termination that could result in the loss of H3K36me3 and H3K4me3 and thus the protection from encroachment of repressive chromatin (Yuan et al., 2011). The non-template strand may also be more likely to form stable secondary structures that reduce binding of transcription activators with a preference for single-stranded regions.
FIGURE 1 | Model for repeat-mediated gene dysregulation in fragile X premutation and full mutation carriers. A metastable R loop formed by the premutation allele would leave the non-template strand transiently unpaired and thus able to recruit transcription activators that show a preferential binding to single-stranded regions (Chaudhuri et al., 2003; Krajewski et al., 2005). However, since the RNA:DNA hybrid is relatively short, transcription termination is low and the transcript is not tethered to the FMR1 locus long enough to recruit transcriptional repressors that might bind the hairpins formed by the 5′ end of the FMR1 transcript. The net result is that the premutation allele would be associated with elevated levels of active histone modifications and thus hyper-expressed. In contrast, on full mutation alleles the RNA:DNA hybrid is likely to be more stable. It may thus be able to effectively recruit repressive chromatin modifiers to the 5′ end of the FMR1 gene that result in the deposition of repressive histone marks. The hybrid may also be long enough to cause significant transcription termination (Skourti-Stathaki et al., 2011; Groh et al., 2014). This would result in a drop in the levels of co-transcriptionally deposited active chromatin modifications. This could result in the loss of the protective effect that these histone marks provide against the deposition of repressive histone marks (Schmitges et al., 2011; Yuan et al., 2011). The non-template strand in the R loop may also be more likely to form secondary structures and thus less likely to bind transcription activators with a preference for single-stranded regions. The net result would be transcriptional silencing of the full mutation allele.
Frontiers in Genetics | www.frontiersin.org June 2015 | Volume 6 | Article 192
In addition, while the idea that a co-transcriptionally formed R loop is responsible for silencing is appealing, the ability of siRNA or shRNA to the FMR1 transcript to reduce repression of fragile X alleles (Colak et al., 2014; Kumari and Usdin, 2014) suggests that the situation may not be quite so simple. Despite reports of active RNAi factors in the nucleus (Gagnon et al., 2014), the ability of RNAi to deplete RNAs that are exclusively nuclear is still controversial. An alternative explanation consistent with the available data is that the R loop forms in trans from a transcript that has transitioned through the cytoplasm. If so, then R loop formation may be facilitated by the ability of the CGG-strand of the DNA to form secondary structures during transcription, thus leaving the CCG-DNA strand free.
The question of the timing of gene silencing in the fragile X embryo is also still unresolved. This is significant for our understanding as to whether FMRP is present during early embryonic development in individuals with fragile X syndrome. Silencing has been reported to be a relatively late event in embryonic development, occurring >45 days after differentiation of neurons from embryonic stem cells is initiated (Colak et al., 2014). However, in many fragile X ESC lines significant gene silencing has already occurred (Avitzour et al., 2014). This includes the ESC lines used in the Colak et al. study, as evidenced by the fact that a significant fraction of the fragile X alleles in these cells are resistant to digestion by Eag I, a methylation-sensitive restriction enzyme (Supplementary Figure 1; Colak et al., 2014). How much silencing occurs in the early embryo and how this silencing compares to the silencing observed in neurons are thus important open questions.
There is also the question of what, if anything, can be done to prevent or reverse the repeat-mediated epigenetic changes in premutation and full mutation carriers. A number of compounds that are able to inhibit repressive epigenetic modifications are already in clinical trials for treatment of Friedreich ataxia. It is of interest that Vitamin B3 (nicotinamide), an inhibitor of SIRT1, has shown promise in Phase I trials for Friedreich ataxia (Libri et al., 2014). The identification of FMR1 transcription as the trigger for silencing opens up a whole new potential approach to gene reactivation. A small molecule that blocks hybrid formation and prevents silencing (Colak et al., 2014) was unable to reactivate a silenced allele. However, it may be able to do so in combination with other epigenetic modifiers. Of course, any benefit to be gained by reactivation of the fragile X allele would have to be weighed against the risk posed by expression of FMR1 mRNA with long CGG-repeat tracts (Loesch et al., 2012). It may also be possible to ameliorate the symptoms seen in premutation carriers by using epigenetic modifying compounds that reduce the FMR1 hyper-expression. For example, since SIRT1 is involved in downregulation of FMR1 expression, SIRT1 activators may help normalize the levels of the toxic transcript produced from premutation alleles.