Alterations in Polyadenylation and Its Implications for Endocrine Disease

Introduction: Polyadenylation is the process in which the pre-mRNA is cleaved at the poly(A) site and a poly(A) tail is added – a process necessary for normal mRNA formation. Genes with multiple poly(A) sites can undergo alternative polyadenylation (APA), producing distinct mRNA isoforms with different 3′ untranslated regions (3′ UTRs) and in some cases different coding regions. Two thirds of all human genes undergo APA. The efficiency of the polyadenylation process regulates gene expression and APA plays an important part in post-transcriptional regulation, as the 3′ UTR contains various cis-elements associated with post-transcriptional regulation, such as target sites for micro-RNAs and RNA-binding proteins. Implications of alterations in polyadenylation for endocrine disease: Alterations in polyadenylation have been found to be causative of neonatal diabetes and IPEX (immune dysfunction, polyendocrinopathy, enteropathy, X-linked) and to be associated with type I and II diabetes, pre-eclampsia, fragile X-associated premature ovarian insufficiency, ectopic Cushing syndrome, and many cancer diseases, including several types of endocrine tumor diseases. Perspectives: Recent developments in high-throughput sequencing have made it possible to characterize polyadenylation genome-wide. Antisense elements inhibiting or enhancing specific poly(A) site usage can induce desired alterations in polyadenylation, and thus hold the promise of new therapeutic approaches. Summary: This review gives a detailed description of alterations in polyadenylation in endocrine disease, an overview of the current literature on polyadenylation and summarizes the clinical implications of the current state of research in this field.


INTRODUCTION BACKGROUND
Almost all eukaryotic pre-messenger RNAs (pre-mRNAs) and several non-coding transcripts have poly(A) sites and are polyadenylated (Shepard et al., 2011; Derti et al., 2012; Lin et al., 2012). Several studies have mapped poly(A) sites genome-wide in humans (Yan and Marr, 2005; Ozsolak et al., 2010; Fu et al., 2011; Shepard et al., 2011; Derti et al., 2012; Lin et al., 2012). One of the latest found that more than two thirds of all genes have multiple poly(A) sites [...] implications of the current state of research in this field. We focus on co-transcriptional, nuclear polyadenylation, as post-transcriptional and cytoplasmic polyadenylation [alterations to the already added poly(A) tail] are beyond the scope of this review.

mRNA TRANSCRIPTION AND POLYADENYLATION
Pre-mRNAs are transcribed from protein-coding genes by RNA polymerase II (Pol-II). The nascent pre-mRNA molecule emerging from Pol-II goes through the co-transcriptional processes of 5′ capping, splicing, and polyadenylation before transcription is terminated. Polyadenylation consists of two steps: (1) cleavage of the pre-mRNA at a poly(A) site and (2) addition of an untemplated 3′ poly(A) tail to the upstream cleavage product. The specific cleavage position and the efficiency of the process depend on the interaction between trans-acting polyadenylation factors and cis-elements present in the pre-mRNAs. All eukaryotic pre-mRNAs, except some replication-dependent histone pre-mRNAs, as well as several non-coding transcripts, including micro-RNAs (miRNAs), have poly(A) sites and are polyadenylated (Shepard et al., 2011; Derti et al., 2012; Lin et al., 2012). Polyadenylation is vital for normal mRNA formation, transcription termination and mRNA export from the nucleus, and affects mRNA stability, subcellular localization, and translational efficiency. As polyadenylation is required for transcription termination, the efficiency of the polyadenylation process affects gene expression quantitatively (West and Proudfoot, 2009; Yang et al., 2009; Mapendano et al., 2010).

POST-TRANSCRIPTIONAL REGULATION OF GENE EXPRESSION
Tight regulation of gene expression is essential for cells to perform their normal functions and disturbances in gene expression underlie many diseases. Gene expression can be regulated at any step, from the pre-transcriptional epigenetic modifications of the chromatin to the post-translational modifications of the protein product. A quickly responding part of gene expression regulation is the post-transcriptional regulation exerted by RNA-binding proteins (RBPs) and miRNAs. RBPs and miRNAs mainly bind target sequences in the 3′ UTR of mRNAs, thereby controlling mRNA turnover and translation rate (Keene, 2007; Friedman et al., 2009). An example of post-transcriptional regulation is seen in the rapid regulation of insulin level and signaling in response to changes in glucose level. This regulation is exerted through altered stability and translation of insulin mRNA and insulin receptor mRNA, and helps to maintain glucose homeostasis (reviewed in Lee and Gorospe, 2010).

ROLE OF ALTERNATIVE POLYADENYLATION IN POST-TRANSCRIPTIONAL REGULATION
As RBPs and miRNAs mainly target the 3′ UTR, alterations to the 3′ UTR affect post-transcriptional regulation. More than two thirds of all mammalian genes encode alternative mRNA isoforms with different 3′ UTRs through APA (Derti et al., 2012). APA therefore plays an important part in post-transcriptional regulation, as it determines which isoform of the 3′ UTR is expressed (Thomsen et al., 2010). This is seen for the insulin receptor mRNA from the example above (Levy et al., 1995). Characterization of polyadenylation is therefore essential for understanding both normal cell biology and disease. Recent advances in sequencing technologies have made it possible to map poly(A) site usage genome-wide (Mangone et al., 2010; Ozsolak et al., 2010; Fox-Walsh et al., 2011; Fu et al., 2011; Jan et al., 2011; Shepard et al., 2011; Derti et al., 2012; Jenal et al., 2012; Lin et al., 2012; Martin et al., 2012; Pelechano et al., 2012; Yoon et al., 2012; Hoque et al., 2013; Wang et al., 2013; Wilkening et al., 2013). These techniques could be implemented in diagnostics and improve them.

Constitutive polyadenylation
Genes with a single poly(A) site can only be constitutively polyadenylated (Figure 1, top), as seen for the INS gene encoding insulin (Garin et al., 2010). Approximately one third of all human genes have only one poly(A) site (Derti et al., 2012). For these genes the efficiency of polyadenylation regulates gene expression through changes in the transcription termination efficiency (West and Proudfoot, 2009; Yang et al., 2009; Mapendano et al., 2010). Weakened polyadenylation can lead to impaired gene expression and read-through transcription (Higgs et al., 1983), while enhanced polyadenylation can lead to upregulated gene expression (Danckwardt et al., 2004).

Alternative polyadenylation
For genes with multiple poly(A) sites, APA can take place (for review of APA see Neilson and Sandberg, 2010; Di Giammartino et al., 2011; Proudfoot, 2011). Only one of the possible poly(A) sites in a pre-mRNA is used for polyadenylation per transcription event. APA is typically divided into two categories, UTR-APA and coding region (CR)-APA.

Untranslated region alternative polyadenylation.
Alternative polyadenylation occurring at alternative poly(A) sites located in the 3′ UTR of the last exon is called UTR-APA. It results in mRNAs with the same CR, but with different 3′ UTR lengths (Figure 1, middle), as seen for the INSR gene encoding the insulin receptor (Levy et al., 1995). UTR-APA is the most abundant type of APA, accounting for more than half of the APA events (Yan and Marr, 2005; Shepard et al., 2011). About 70% of human genes have multiple poly(A) sites in their 3′ UTR and undergo UTR-APA (Derti et al., 2012). The 3′ UTR contains various cis-elements associated with post-transcriptional gene regulation, such as target sites for miRNAs and RBPs, as well as AU-rich elements (AREs) and GU-rich elements (GREs). The 3′ UTR and the factors interacting with it largely determine mRNA stability, subcellular localization, and translational efficiency. It is therefore not surprising that mutations in the 3′ UTR are associated with many diseases (reviewed in Chen et al., 2006a,b). The general length of the 3′ UTRs has been found to be inversely correlated with cellular proliferation (Elkon et al., 2012). Individual mRNAs with shorter 3′ UTRs are more stable (Mayr and Bartel, 2009; Hogg and Goff, 2010; Yepiskoposyan et al., 2011) and generally produce more protein (Mayr and Bartel, 2009; Singh et al., 2009). In one study 45% of the mRNAs with a general 3′ UTR shortening also had significantly changed expression levels, the majority being upregulated (de Klerk et al., 2012). This would correlate well with a heightened mRNA stability for these transcripts. However, no such significant change in mRNA levels was found in several other studies, where general changes in 3′ UTR length were also seen (Fu et al., 2011; Elkon et al., 2012; Morris et al., 2012). The effects of general changes in 3′ UTR length thus remain unclear.
Frontiers in Endocrinology | Genomic Endocrinology
FIGURE 1 | [...] Proximal polyadenylation (blue arrow) leads to 3′ UTR shortening, less post-transcriptional regulation, and enhanced protein translation. Bottom: coding region (CR)-APA: gene contains additional poly(A) sites located in the CR of exons and in introns. APA results in mRNAs with different 3′ UTRs and C-terminal CRs, producing distinct protein isoforms. Proximal polyadenylation (blue arrow) produces an mRNA with a different C-terminal CR and 3′ UTR, producing a C-terminally truncated protein isoform.
During normal development, escape of miRNA-mediated regulation by APA is seen (Thomsen et al., 2010; Boutet et al., 2012). In addition, escape of both miRNA- and RBP-mediated regulation by APA is seen in cancer cells (Boutaud et al., 2003; Sandberg et al., 2008; Mayr and Bartel, 2009). Genes with UTR-APA have common parts of the 3′ UTR expressed in all mRNA isoforms and alternative parts of the 3′ UTR, only expressed in the mRNAs with longer 3′ UTR isoforms. The alternative parts of the 3′ UTR are normally longer, more AU-rich and contain more cis-elements than the common parts. Poly(A) sites preferentially flank cis-elements in the 3′ UTR (Yoon et al., 2012), as seen for miRNA target sites. Interestingly, more than half of the miRNA targets are found downstream of the most proximal poly(A) site in the 3′ UTR (Legendre et al., 2006). The energy expenditure required for synthesizing the longer 3′ UTR isoforms is used to provide a more refined control of gene expression through post-transcriptional regulation, as seen for StAR mRNA (Zhao et al., 2005). Surprisingly, the 3′ UTRs of a large number of genes are also expressed alone, separately from their associated protein-coding sequences (Mercer et al., 2011). These expressed 3′ UTRs are suggested both to have trans-acting roles and to function as decoys, which can sequester trans-acting factors, such as miRNAs. UTR-APA in these transcripts would therefore affect these functions, for example by generating multiple isoforms of a decoying 3′ UTR, each with a different miRNA-binding profile. For genes undergoing UTR-APA, the efficiency of polyadenylation contributes to gene expression through the transcription termination efficiency. In addition, the choice of different poly(A) sites through APA plays an important part in post-transcriptional regulation, as it defines which isoform of the 3′ UTR is expressed (Thomsen et al., 2010). Dysfunctional polyadenylation can lead to changes in the APA pattern, altering the 3′ UTR, and to changes in polyadenylation efficiency, both affecting gene expression.
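The escape from miRNA- and RBP-mediated regulation through proximal poly(A) site usage can be illustrated with a minimal sketch. The site names, coordinates, and the helper `retained_sites` are hypothetical and chosen only for illustration; target sites are reduced to single positions, and a site is assumed to be retained when it lies upstream of the poly(A) site that is used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetSite:
    name: str      # hypothetical label for a miRNA or RBP target site
    position: int  # nt offset from the start of the 3' UTR


def retained_sites(polya_site, targets):
    """Return the names of target sites lying upstream of the chosen poly(A) site."""
    return [t.name for t in targets if t.position < polya_site]


# Toy 3' UTR with two poly(A) sites and three cis-elements (all values invented).
targets = [TargetSite("miR-A", 120), TargetSite("miR-B", 640), TargetSite("ARE-1", 900)]
proximal, distal = 300, 1100

short_isoform = retained_sites(proximal, targets)  # common part of the 3' UTR only
long_isoform = retained_sites(distal, targets)     # common plus alternative part
escaped = [name for name in long_isoform if name not in short_isoform]
print(short_isoform, long_isoform, escaped)
```

In this toy case the short isoform keeps only the element in the common part of the 3′ UTR, while the elements in the alternative part are lost, which is the situation described above for proximal polyadenylation.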
Coding region alternative polyadenylation. Alternative polyadenylation occurring at alternative poly(A) sites located in introns or in the CR of exons is called CR-APA. It results in mRNAs with both different C-terminal CRs and 3′ UTRs (Figure 1, bottom), as seen for the CALCA gene encoding calcitonin/CGRP (Amara et al., 1984; Zhou et al., 2007). CR-APA often occurs together with alternative splicing of the last exon and it is suggested that there is a dynamic competition between splicing and polyadenylation (Evsyukova et al., 2012). About 20% of human genes have at least one utilized intronic poly(A) site, usually located in large introns with weak 5′ splice sites. Thousands of dormant intronic poly(A) sites have been characterized, which are normally suppressed and therefore not utilized (Yao et al., 2012a). Enhanced usage of intronic poly(A) sites is associated with proliferation and with a general transcriptional upregulation (Berg et al., 2012). Polyadenylation within the CR of an exon can convert a Tyr codon to a stop codon and thereby give rise to a functional mRNA (Yao et al., 2012b). Unlike mRNAs with acquired premature termination codons, which also generate C-terminally truncated proteins, the mRNAs generated by CR-APA are not targeted by nonsense-mediated decay (Lejeune and Maquat, 2005). For genes undergoing CR-APA, the efficiency of polyadenylation contributes to gene expression through the transcription termination efficiency. Additionally, the choice of different poly(A) sites through APA determines both the mRNA coding potential and the 3′ UTR isoform. Dysfunctional polyadenylation can lead to changes in the APA pattern, altering the protein product and 3′ UTR, and to changes in polyadenylation efficiency, both affecting gene expression.
www.frontiersin.org
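The distinction between constitutive polyadenylation, UTR-APA, and CR-APA can be summarized as a small classification sketch. It assumes a plus-strand gene, that all listed poly(A) sites lie within the gene, and that the 3′ UTR of the last exon is given as a coordinate interval; the function name `apa_categories` and all coordinates are hypothetical.

```python
def apa_categories(sites, utr_start, utr_end):
    """Classify a gene by the location of its poly(A) sites.

    sites: genomic positions of poly(A) sites (plus strand assumed).
    utr_start, utr_end: bounds of the 3' UTR in the last exon (inclusive).
    Sites upstream of utr_start are taken to lie in introns or coding exons.
    """
    if len(sites) < 2:
        return {"constitutive"}
    in_terminal_utr = [s for s in sites if utr_start <= s <= utr_end]
    upstream = [s for s in sites if s < utr_start]
    categories = set()
    if len(in_terminal_utr) >= 2:
        categories.add("UTR-APA")  # same CR, different 3' UTR lengths
    if upstream:
        categories.add("CR-APA")   # different C-terminal CR and 3' UTR
    return categories


# Invented coordinates for four hypothetical genes.
single_site = apa_categories([1500], 1000, 2000)
utr_only = apa_categories([1200, 1800], 1000, 2000)
intronic_plus_utr = apa_categories([400, 1800], 1000, 2000)
both = apa_categories([400, 1200, 1800], 1000, 2000)
print(single_site, utr_only, intronic_plus_utr, both)
```

A gene can fall into both categories at once, which is why the sketch returns a set rather than a single label.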

Cis-elements guiding polyadenylation
The cis-elements guiding polyadenylation of the nascent pre-mRNA are positioned upstream or downstream relative to a poly(A) site (Figure 2) (reviewed in Tian and Graber, 2012). The poly(A) site is where the pre-mRNA molecule is cleaved, usually immediately 3′ of a CA dinucleotide, and where the poly(A) tail on the mature mRNA starts. Poly(A) sites normally contain two core cis-elements: (1) The polyadenylation signal (PAS) placed 10-30 nt upstream of the poly(A) site, consisting of the canonical sequence element, AAUAAA, found in more than half of mammalian genes, or a weaker non-canonical variant (Beaudoing et al., 2000; Zhang et al., 2005; Ho and Gunderson, 2011; Wang et al., 2013). (2) The U/GU-rich downstream sequence element (DSE), located up to 30 nt downstream of the poly(A) site. Additional elements such as U-rich upstream sequence elements (USE), located just upstream of the PAS (Moreira et al., 1998), G-rich auxiliary downstream elements (Aux-DSE) (Chen and Wilusz, 1998), located downstream of the DSE, and UGUA motifs (Brown and Gilmartin, 2003; Venkataraman et al., 2005; Yang et al., 2011) are found around some poly(A) sites and all act as enhancers of polyadenylation. The strength of a poly(A) site depends on the exact sequence of its surrounding cis-elements, as well as their spatial placement, which together determine the affinity for the trans-acting factors. In genes with only one poly(A) site, the PAS is typically canonical (Beaudoing et al., 2000). In genes with multiple poly(A) sites, the most distal poly(A) site generally has a canonical PAS whereas the proximal poly(A) sites generally have a non-canonical PAS (Beaudoing et al., 2000; Tian et al., 2005, 2007; Shepard et al., 2011; de Klerk et al., 2012; Lin et al., 2012; Yoon et al., 2012; Wang et al., 2013). The proximal poly(A) sites of skipped terminal exons, however, tend to be associated with a canonical PAS as well.
Poly(A) sites with non-canonical cis-elements are weaker and require additional factors for efficient utilization, as seen for the weak proximal poly(A) site in the CALCA gene encoding calcitonin/CGRP (Van Oers et al., 1994; Lou et al., 1998). The distal poly(A) sites with canonical cis-elements work as a secure last option for polyadenylation, ensuring that transcription is terminated. This minimizes potentially damaging read-through transcription into downstream genes, which normally occurs for a small proportion of transcripts (Dresser et al., 1995). The arrangement of the cis-elements with one A-rich element associated with one or more U-rich elements around the poly(A) site is generally conserved across eukaryotes (Millevoi and Vagner, 2010). A comparison of orthologous human, rhesus, dog, mouse, and rat genes showed that poly(A) site usage is strikingly similar in orthologous tissues between species (Derti et al., 2012). The canonical PASs are positionally conserved between species (Derti et al., 2012), as opposed to the non-canonical PASs, which have been found to be less conserved (Ara et al., 2006). Intronic poly(A) sites are also less conserved than poly(A) sites in the 3′ UTR of terminal exons. The non-canonical proximal poly(A) sites tend to have more conserved flanking regions, in which conserved motifs have been found, possibly explaining the relatively frequent usage of these poly(A) sites (Nunes et al., 2010; Ozsolak et al., 2010).
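The spatial arrangement of the core cis-elements described above can be sketched as a simple sequence scan. This is an illustrative simplification and not a validated poly(A) site predictor: it considers only the canonical AAUAAA hexamer, measures the 10-30 nt distance from the end of the hexamer to the cut (an assumption about how the distance is counted), and uses the G/U fraction of the downstream 30 nt as a crude stand-in for a U/GU-rich DSE. The function names and the toy sequence are hypothetical.

```python
import re

CANONICAL_PAS = "AAUAAA"  # canonical polyadenylation signal hexamer


def candidate_cleavage_sites(rna, min_gap=10, max_gap=30):
    """Return (pas_start, cut) pairs for a canonical PAS followed by a CA cut site.

    `cut` is the 0-based index of the first nucleotide 3' of a CA dinucleotide.
    The gap is counted from the end of the hexamer to the cut (an assumption).
    """
    rna = rna.upper().replace("T", "U")
    hits = []
    for match in re.finditer(f"(?={CANONICAL_PAS})", rna):
        pas_end = match.start() + len(CANONICAL_PAS)
        last_cut = min(pas_end + max_gap, len(rna))
        for cut in range(pas_end + min_gap, last_cut + 1):
            if rna[cut - 2:cut] == "CA":
                hits.append((match.start(), cut))
    return hits


def gu_fraction(rna, cut, window=30):
    """Crude DSE proxy: fraction of G or U in the window downstream of the cut."""
    segment = rna.upper().replace("T", "U")[cut:cut + window]
    return sum(base in "GU" for base in segment) / len(segment) if segment else 0.0


# Invented toy sequence: PAS, 13-nt spacer, CA, then a GU-rich stretch.
toy = "GGC" + "AAUAAA" + "G" * 13 + "CA" + "UGUGUUUGUU" + "CCC"
sites = candidate_cleavage_sites(toy)
dse_score = gu_fraction(toy, sites[0][1])
print(sites, round(dse_score, 2))
```

Non-canonical PAS variants, USEs, Aux-DSEs, and UGUA motifs are deliberately left out; extending the scan to them would require position-specific scoring rather than exact matching.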

Trans-acting polyadenylation factors (the human 3′-end processing complex)
The human 3′-end processing complex consists of more than 80 proteins, which are highly conserved between species (Darmon and Lutz, 2012). An important part of this complex, the core polyadenylation machinery, consists of five multisubunit protein factors: cleavage and polyadenylation specificity factor (CPSF), cleavage stimulation factor (CstF), cleavage factor I (CF-I), cleavage factor II (CF-II), and poly(A) polymerase (PAP) (Table 1; Figure 2) (reviewed in Chan et al., 2011). The endonucleolytic cleavage of the pre-mRNA is made by the CPSF-73 subunit at the poly(A) site (Figure 2) (Mandel et al., 2006). The pre-mRNA is typically cleaved immediately 3′ of a CA dinucleotide (Danckwardt et al., 2008), although the exact cleavage site is usually clustered within a few nucleotides (Beaudoing et al., 2000; Pauws et al., 2001; Tian et al., 2005; Lin et al., 2012). After cleavage an untemplated poly(A) tail of about 250 adenosine nucleotides is added to the upstream cleavage product by PAP (Figure 2). This is stimulated by CPSF through the nuclear poly(A)-binding protein (PABPN1), which binds the growing poly(A) tail and controls its length (Kühn et al., 2009). Many of the other proteins in the human 3′-end processing complex do not play a direct role in 3′-end formation, but instead couple polyadenylation to other cellular processes.

REGULATION OF POLYADENYLATION
Polyadenylation is influenced by three factors: (1) The strength of the cis-elements. (2) The concentration and activity of polyadenylation factors. (3) The concentration and activity of regulatory factors (Barabino and Keller, 1999). These determine the efficiency of polyadenylation and the pattern of APA. The regulation of constitutive and alternative polyadenylation differs and even CR-APA and UTR-APA are regulated differently. This differentiated regulation of APA is seen during T-cell activation, where CR-APA occurs at both early and late stages of activation, utilizing both more proximal and more distal poly(A) sites. In contrast, UTR-APA is most evident in late stages of activation and predominantly utilizes more proximal poly(A) sites. The differentiated regulation of APA is also seen during stem cell differentiation, where UTR-APA generally switches toward utilizing more distal poly(A) sites, as opposed to CR-APA, which switches toward utilizing both more proximal and more distal poly(A) sites (Shepard et al., 2011). CR-APA often occurs together with alternative splicing of the last exon and it is suggested that there is a dynamic competition between splicing and polyadenylation (Evsyukova et al., 2012). CR-APA is thus regulated by both polyadenylation and splicing. The regulation of polyadenylation is complex and the involved factors interact with several other cellular processes. Part of the regulation is mediated through mechanistic interactions with polyadenylation factors, enhancing or inhibiting their function, competing for the binding to their cis-elements or reducing their free levels in the nucleus (reviewed in Millevoi and Vagner, 2010).
FIGURE 2 | Cis-elements and polyadenylation factors. Light-green boxes represent cis-elements in the pre-mRNA and light-purple boxes represent polyadenylation factors. The endonucleolytic cleavage of the pre-mRNA is made by the CPSF subunit CPSF-73 at the poly(A) site, typically immediately 3′ of a CA dinucleotide. After cleavage, an untemplated poly(A) tail of about 250 adenosine nucleotides is added to the upstream cleavage product by PAP. This is stimulated by CPSF through the nuclear poly(A)-binding protein (PABPN1), which binds the growing poly(A) tail and controls its length.

Regulation by levels and variants of polyadenylation factors
It is hypothesized that higher levels of polyadenylation factors lead to improved utilization of weaker proximal poly(A) sites, encountered earlier during transcription (Figure 3). This is seen during proliferation, where upregulated levels of polyadenylation factors correlate with proximal poly(A) site usage. It is also seen for many of the individual core polyadenylation factors (see sections below). Every component, but not every subunit, of the core polyadenylation machinery is regulated by post-translational modifications, such as methylation, sumoylation, acetylation, and phosphorylation (reviewed in Ryan and Bauer, 2008).
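The hypothesis that higher factor levels favor weak proximal sites can be made concrete with a toy "first come, first served" model. The model is an assumption introduced here for illustration and is not taken from the cited studies: each poly(A) site is given an invented strength, read as the probability that it is processed when the polymerase passes it, and a hypothetical `factor_level` multiplier scales all strengths up to a cap of 1.

```python
def site_usage(strengths, factor_level=1.0):
    """Fraction of transcripts processed at each site, ordered 5' to 3'.

    strengths: per-site processing probabilities at factor_level 1.0 (invented).
    factor_level: hypothetical multiplier for polyadenylation factor abundance.
    A site can only be used by transcripts that passed all upstream sites.
    """
    usage = []
    remaining = 1.0  # fraction of transcripts not yet cleaved
    for strength in strengths:
        p = min(1.0, strength * factor_level)
        usage.append(remaining * p)
        remaining *= 1.0 - p
    return usage  # whatever is left over corresponds to read-through


# Weak proximal site (0.2) followed by a strong distal site (0.9).
low_factors = site_usage([0.2, 0.9], factor_level=1.0)
high_factors = site_usage([0.2, 0.9], factor_level=2.0)
print(low_factors, high_factors)
```

Under these assumptions, doubling the factor level raises the share of the proximal site and lowers that of the distal site, simply because fewer transcripts reach the distal site uncleaved. The sketch ignores elongation rate, factor-specific effects, and the opposing behavior reported for CF-I and PABPN1.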
Cleavage and polyadenylation specificity factor. During colorectal carcinoma progression and in breast cancer cell lines, upregulated levels of the CPSF subunits CPSF-73, CPSF-160, CPSF-30, and symplekin correlate with proximal poly(A) site usage (Mayr and Bartel, 2009; Morris et al., 2012). Interestingly, symplekin is also highly overexpressed in colon, lung, muscle, and prostate tumors where it promotes tumorigenesis (Buchert et al., 2010). CPSF-160 and symplekin are expressed at high levels in germ cells, where the general 3′ UTR length is the shortest (Dass et al., 2001; Liu et al., 2007). CPSF-100 was conversely downregulated. During the induction of stem cells from somatic cells, all CPSF subunits except CPSF-100 and hFip1 are upregulated. This correlates with a general shift toward proximal poly(A) site usage. During embryonic development all CPSF subunits except hFip1 are downregulated. This correlates with a general shift toward distal poly(A) site usage.
The examples indicate that upregulated levels of CPSF enhance polyadenylation, leading to proximal poly(A) site usage. Interestingly, knockdown of the CPSF subunit hFip1 increases DNA damage and chromosome breakage (Stirling et al., 2012). This is probably caused by dysfunctional polyadenylation, leading to defective transcription termination, read-through transcription and formation of R-loops (see section Sequence changes in genes encoding polyadenylation factors).

Cleavage stimulation factor.
Upregulated levels of the CstF subunit CstF-64 correlate with proximal poly(A) site usage during colorectal carcinoma progression and in breast cancer cell lines (Mayr and Bartel, 2009; Morris et al., 2012). Upregulated levels of CstF-64 enhance utilization of proximal poly(A) sites in the mouse Testis Brain RNA-Binding Protein (TB-RBP) gene in male germ cells (Chennathukuzhi et al., 2001). CstF-64 is also upregulated in activated immune cells, enhancing the usage of proximal poly(A) sites (Shell et al., 2005; Sandberg et al., 2008). All CstF subunits are upregulated during the induction of stem cells from somatic cells, correlating with a general shift toward proximal poly(A) site usage. Additionally, these subunits are downregulated during embryonic development, correlating with a general shift toward distal poly(A) site usage. When comparing the used poly(A) sites in genes that lengthen and genes that shorten during the induction of stem cells from somatic cells, there were major differences in the DSE. The DSE is where CstF binds, indicating that CstF plays a crucial role in the regulation of poly(A) site usage. The CstF-64 paralog τCstF-64, expressed at highest levels in testis and brain, is necessary for spermatogenesis and fertilization (Hockert et al., 2011). Both CstF-64 and τCstF-64 are expressed at high levels in germ cells, which have the shortest general 3′ UTR length (Dass et al., 2001; Liu et al., 2007). Loss of τCstF-64 changes gene expression genome-wide in germ cells, leads to more frequent usage of distal poly(A) sites and causes more read-through transcription due to aberrant transcription termination (Li et al., 2012c). RNAi-mediated depletion of CstF-64 has a small effect on APA, but leads to upregulation of τCstF-64, suggesting that CstF-64 and τCstF-64 play redundant roles (Yao et al., 2012a).
In support of this, during differentiation of C2C12 myoblasts τCstF-64 is also upregulated at the same time as the three subunits of CstF are downregulated. Codepletion of CstF-64 and τCstF-64 leads to significant changes in APA, with enhanced usage of distal poly(A) sites (Yao et al., 2012a). Together this indicates that upregulation of CstF enhances polyadenylation, leading to the usage of proximal poly(A) sites. A CstF-64 splice variant, βCstF-64, is expressed specifically in nervous tissue and is thought to have regulatory functions there (Shankarling et al., 2009). Interestingly, CstF-64 increases fivefold during the cell cycle, at the G0 to S phase transition (Martincic et al., 1998), and low levels of CstF-64 cause reversible cell cycle arrest while depletion causes apoptosis.
Cleavage factor I. Cleavage factor I has been found to be important for non-canonical poly(A) site usage by enhancing the recruitment of CPSF (Brown and Gilmartin, 2003; Venkataraman et al., 2005; Sartini et al., 2008; Yang et al., 2011). In the germ cells, where the general 3′ UTR length is the shortest, the two CF-I subunits CF-I-25 and CF-I-68 are upregulated (Sartini et al., 2008). These two subunits are also upregulated during the induction of stem cells from somatic cells, correlating with a general shift toward proximal poly(A) site usage. These examples correlate well with the proposed function of CF-I in promoting non-canonical poly(A) site usage. However, reduced levels of CF-I have been found to enhance more proximal poly(A) site usage (Kubo et al., 2006; Kim et al., 2010). Loss of function of CF-I-25 and CF-I-68, but not of CF-I-59, has also been found to enhance genome-wide proximal poly(A) site usage (Martin et al., 2012). This indicates that CF-I has functions other than enhancing the recruitment of CPSF, which are yet to be fully explained. Interestingly, downregulation of CF-I-68 has also been found to increase tumor invasiveness (Yu et al., 2008).
Cleavage factor II. The CF-II subunit hClp1 is upregulated during the induction of stem cells from somatic cells, correlating with a general shift toward proximal poly(A) site usage. Likewise, hClp1 is downregulated during embryonic development, correlating with a general shift toward distal poly(A) site usage. In the germ cells, where there is a short general 3′ UTR length, hClp1 is also upregulated. Low levels of the CF-II subunit hPcf11 reduce transcription termination efficiency, due to weakened polyadenylation (West and Proudfoot, 2008). Taken together, this shows that upregulation of CF-II enhances polyadenylation, leading to proximal poly(A) site usage.

Poly(A) polymerase.
In breast cancer cell lines, upregulated levels of PAP correlate with proximal poly(A) site usage (Mayr and Bartel, 2009). PAP is also upregulated during the induction of stem cells from somatic cells, correlating with a general shift toward proximal poly(A) site usage. These examples indicate that upregulation of PAP enhances polyadenylation, leading to usage of proximal poly(A) sites. PAP is post-translationally modified during cell cycle progression, with the modifications reflecting the proliferative state of the cell (Thomadaki et al., 2008). PAP modifications also correlate with apoptosis (Thomadaki et al., 2008). One of the PAP regulators, poly(ADP-ribose) polymerase 1 (PARP1), modifies PAP during heat shock, leading to an inhibition of polyadenylation (Di Giammartino et al., 2013). In addition to PAP, three nuclear non-canonical PAPs exist: Neo-PAP, TRAP, and Star-PAP. Neo-PAP has a function equal to that of PAP (Topalian et al., 2001). TRAP is only expressed in testis and has functions that are not fully understood (Lee et al., 2000). Star-PAP is necessary for the 3′-end formation of selected mRNAs, such as those encoding the cytoprotective enzyme heme oxygenase-1, which is upregulated in response to oxidative stress (Mellman et al., 2008), and the BIK protein, an initiator of mitochondrial apoptosis (Li et al., 2012b). Star-PAP activity is modulated directly by phosphatidylinositol 4,5-bisphosphate (PI4,5P2) and Star-PAP associates with both PIPKIα, which can generate new PI4,5P2, and with the PI4,5P2-sensitive protein kinase CKIα, which can phosphorylate Star-PAP (Laishram et al., 2011). The phosphorylation of Star-PAP is critical for its activity and PI4,5P2 sensitivity.
Other polyadenylation factors. The nuclear poly(A)-binding protein (PABPN1) is upregulated during the induction of stem cells from somatic cells, correlating with a general shift toward proximal poly(A) site usage. However, PABPN1 has recently been shown to suppress polyadenylation at weaker proximal poly(A) sites, and loss of PABPN1 results in genome-wide 3′ UTR shortening (de Klerk et al., 2012; Jenal et al., 2012). Loss of PABPN1 also increases apoptotic markers in affected cells (Bhattacharjee and Bag, 2012).

Regulation by non-polyadenylation factors
Splice factors. Many known splice factors have been shown to affect polyadenylation (Table 2).

Non-splice factors.
Many non-splice factors have also been shown to affect polyadenylation (Table 3).

Regulation by transcription and co-transcriptional processes
Transcription. Polyadenylation is necessary for transcription termination (West and Proudfoot, 2009). It also affects transcription re-initiation at upstream promoters (Mapendano et al., 2010). The C-terminal domain (CTD) of Pol-II is necessary for the cleavage step in mammals (Licatalosi et al., 2002). It recruits CPSF, CstF, and CF-II and transfers them to their cis-elements in the pre-mRNA as it emerges from Pol-II (McCracken et al., 1997; Lunde et al., 2010). The recruitment is mediated by phosphorylation of the Pol-II CTD serine 2 (Ser2) residues, which are progressively phosphorylated during Pol-II elongation (Ahn et al., 2004). Poly(A) sites located near the 5′-end of genes are normally not used, preventing premature 3′-end processing. This is probably because of the low Ser2 phosphorylation in the Pol-II CTD right after transcription initiation, resulting in inefficient recruitment of polyadenylation factors (Guo et al., 2011). The general transcription factor TFIIB is required for recruiting CPSF and CstF to the promoter (Wang et al., 2010). The transcription elongation rate affects the choice of poly(A) site, with a lower rate enhancing usage of proximal poly(A) sites (Pinto et al., 2011). Transcription elongation factor ELL2 also enhances polyadenylation (Martincic et al., 2009). Pol-II occupancy is found to be higher at distal poly(A) sites than at proximal poly(A) sites (Lin et al., 2012). Transcriptional activity regulates APA, with highly expressed genes expressing short 3′ UTR isoforms and lowly expressed genes expressing long 3′ UTR isoforms. This is because Pol-II is more likely to pause at poly(A) sites in the highly expressed genes (Ji et al., 2011). As the mRNAs with short 3′ UTRs generated from the highly expressed genes are more stable and produce more protein, this effect potentiates the gene expression difference. Transcriptional activators have also been shown to enhance polyadenylation (Nagaike et al., 2011). The choice of alternative 5′ UTRs and promoters affects the choice of poly(A) site and thus APA (Winter et al., 2007; Ji et al., 2011).
Capping. The presence of a 5′-end m7GpppG cap has been shown to positively affect the efficiency of polyadenylation (Cooke and Alwine, 1996). The nuclear cap-binding complex (CBC), bound to the 5′ cap, is necessary for assembling a stable 3′-end processing complex. Depletion of CBC thus strongly reduces the efficiency of the cleavage step (Flaherty et al., 1997).
Splicing. The patterns of APA and alternative splicing correlate across tissues, indicating coordinated regulation of these processes. Because the intronic poly(A) sites that are used are usually located in large introns with weak 5′ splice sites, which require more time to be spliced out, it has been suggested that there is a dynamic competition between splicing and polyadenylation (Evsyukova et al., 2012). Many of the trans-factors involved in polyadenylation are also known to take part in splicing (Evsyukova et al., 2012). Similarly, a number of known splice factors have also been shown to affect polyadenylation (Table 2).

Regulation by other processes
RNA structure. The secondary structure of RNA can affect polyadenylation. This is seen in HIV-1 strains, where the proximal poly(A) site predominantly folds into a hairpin structure, thereby inhibiting polyadenylation (Gee et al., 2006). Highly used poly(A) sites are positively associated with an energetically favorable mRNA structure near the poly(A) site that exposes the PAS (Khaladkar et al., 2011).

Nucleosome positioning.
Decreased nucleosome density is found around poly(A) sites, most significantly around the actively used ones (Spies et al., 2009). Increased nucleosome density is found downstream of poly(A) sites, most significantly downstream of the actively used ones (Spies et al., 2009; Khaladkar et al., 2011).
Highly expressed genes have a lower nucleosome density around poly(A) sites than genes expressed at low levels (Ji et al., 2011).

Epigenetic modifications.
Allele-specific poly(A) site usage is seen in an imprinted mouse gene (H13) as a result of DNA methylation. A CpG island separates the poly(A) sites utilized in H13. Alleles without methylation of the CpG island utilize a proximal poly(A) site, generating a truncated H13, whereas alleles with methylation of the CpG island utilize downstream poly(A) sites (Wood et al., 2008). Histone methylation (H3K36me3) decreases downstream of used poly(A) sites (Lian et al., 2008; Wang et al., 2013). Highly expressed genes have increased levels of histone methylation (H3K4me3 and H3K36me3) around proximal poly(A) sites compared with distal poly(A) sites (Ji et al., 2011; Lin et al., 2012). The pattern of histone modifications can thus characterize poly(A) sites and even discriminate between poly(A) sites with high and low usage, suggesting a link between chromatin structure and mRNA structure (Khaladkar et al., 2011).
Circadian rhythms. The suppressor of cytokine signaling 3 (SOCS3) mRNA encodes a protein important for the leptin signaling pathway. SOCS3 mRNA is expressed as two isoforms due to APA. The two isoforms are expressed in an oscillating pattern and in opposite phases (Ptitsyn and Gimble, 2007).

METHODS FOR CHARACTERIZING POLY(A) SITE USAGE

COMPLEMENTARY DNAs, EXPRESSED SEQUENCE TAGS, AND MICROARRAYS
Early studies to identify poly(A) sites in humans were done by aligning complementary DNAs (cDNAs) and expressed sequence tags (ESTs) to the genome, thereby identifying the 3′ ends of genes. These analyses revealed that more than half of human genes have multiple poly(A) sites and undergo APA (Beaudoing et al., 2000; Tian et al., 2005; Yan and Marr, 2005). Later studies used microarray platforms to characterize APA in different cell types, in cancers, and at different developmental stages. These analyses showed that a general 3′ UTR shortening is associated with increased proliferation, dedifferentiation, and cell transformation (Singh et al., 2009), whereas a general 3′ UTR lengthening is associated with cell differentiation. Poly(A) site usage was also found to differ according to tissue type, developmental stage, genotype, and cancer subtype (Kwan et al., 2008; Sandberg et al., 2008; Singh et al., 2009). The cDNA and EST methods have limited statistical power, while microarrays can only interrogate already known poly(A) sites and have difficulty quantifying the different 3′ UTR isoforms produced by APA. Thus, new methods for characterizing poly(A) site usage were needed.

HIGH-THROUGHPUT SEQUENCING TECHNIQUES
With the development of new high-throughput sequencing techniques, several methods have been developed to specifically characterize poly(A) site usage. These new techniques provide higher coverage than any of the previous methods, allow better detection of poly(A) sites, and give a more precise quantification of their relative usage. High-throughput sequencing studies have confirmed that more than half of human genes have multiple poly(A) sites and undergo APA (Derti et al., 2012). They have also confirmed that 3′ UTR shortening is associated with proliferation and that 3′ UTR lengthening is associated with differentiation (Shepard et al., 2011). However, they have not unanimously confirmed that 3′ UTR shortening is associated with cell transformation; instead, they found that APA patterns can shift toward both more proximal and more distal poly(A) site usage during cell transformation (Elkon et al., 2012; Lin et al., 2012; Morris et al., 2012). Several computational studies have characterized patterns in the cis-elements around experimentally determined poly(A) sites (Beaudoing et al., 2000; Legendre and Gautheret, 2003; Tian et al., 2005; Ara et al., 2006; Nunes et al., 2010; Ozsolak et al., 2010). Computational methods have also been developed to identify poly(A) sites (Tabaska and Zhang, 1999; Graber et al., 2002; Hajarnavis et al., 2004; Bajic et al., 2005; Cheng et al., 2006; Retelska et al., 2006; Akhtar et al., 2010).
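
To make the notion of relative poly(A) site usage concrete, the following is a minimal sketch of how it could be computed from 3′-end read counts for a single gene. It does not reproduce the pipeline of any of the cited studies; the function names, the site labels, and the read counts are hypothetical and chosen only for illustration.

```python
from typing import Dict


def relative_usage(read_counts: Dict[str, int]) -> Dict[str, float]:
    """Fraction of 3'-end reads supporting each poly(A) site of one gene."""
    total = sum(read_counts.values())
    if total == 0:
        # No reads for this gene: report zero usage for every site.
        return {site: 0.0 for site in read_counts}
    return {site: count / total for site, count in read_counts.items()}


def proximal_shift(usage_a: Dict[str, float],
                   usage_b: Dict[str, float],
                   proximal_site: str = "proximal") -> float:
    """Change in proximal-site usage from condition A to condition B.

    A positive value means a shift toward the proximal site
    (3' UTR shortening); a negative value means lengthening.
    """
    return usage_b[proximal_site] - usage_a[proximal_site]


# Hypothetical read counts for one gene with two poly(A) sites.
arrested = relative_usage({"proximal": 20, "distal": 80})
proliferating = relative_usage({"proximal": 60, "distal": 40})
shift = proximal_shift(arrested, proliferating)
print(f"Change in proximal site usage: {shift:+.2f}")
```

In this toy example the proximal site accounts for a larger share of reads in the second condition, which is the direction of change described above for proliferating cells. Real analyses would also need read normalization and a statistical test across replicates.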

THE POLYADENYLATION PATTERN

CHANGED PATTERNS OF POLYADENYLATION UNDER DIFFERENT CELLULAR CONDITIONS
Changes in cellular conditions can cause a widespread dynamic shift in the use of poly(A) sites by APA, altering the 3′ UTRs and in some cases the coding potential of the mRNAs. A shift toward more proximal poly(A) site usage is associated with increased proliferation (Elkon et al., 2012) and with dedifferentiation. In contrast, a shift toward more distal poly(A) site usage is associated with cell differentiation (Figure 3; Mangone et al., 2010; Hilgers et al., 2011; Shepard et al., 2011). The developmental potency of a cell seems to be inversely correlated with 3′ UTR length, with the 3′ UTR being shortest in germ cells < stem cells < partly differentiated cells < terminally differentiated cells (Figure 4; Shepard et al., 2011). 3′ UTR shortening has also been reported to be associated with cell transformation (Mayr and Bartel, 2009; Singh et al., 2009; Lin et al., 2012; Morris et al., 2012). Recent studies have, however, found that this association is inconsistent (Elkon et al., 2012). Interestingly, the promoter regions of many of the genes coding for polyadenylation factors are enriched in binding sites for several transcription factors involved in proliferation and differentiation, including E2F, c-Myc, and p53 (Elkon et al., 2012). High levels of E2F have been shown to enhance the expression of polyadenylation factors, and E2F transcription factors are upregulated during proliferation. This might explain the shift in APA during proliferation, while changes in the levels of other transcription factors might explain the shift in APA during differentiation. Activation of neurons also induces 3′ UTR shortening for multiple genes (Flavell et al., 2008). This is due to the relative shortage of splice factor U1 generated by the transcriptional upregulation found in activated neurons, as a shortage of U1 leads to enhanced usage of proximal poly(A) sites (Berg et al., 2012).

During cell differentiation/dedifferentiation
Generally, 3′ UTRs shorten when pluripotent stem cells are induced from somatic cell types, the opposite of what happens during embryonic development. When pluripotent stem cells are induced from germ cells, 3′ UTRs generally lengthen, the opposite of what happens during postnatal testis development. The genes with 3′ UTR shortening during the induction were more likely to have had 3′ UTR lengthening during embryonic development. Conversely, the genes with 3′ UTR lengthening during the induction were more likely to have had 3′ UTR shortening during embryonic development. The proximal poly(A) sites that were highly responsive to cell state change during the induction from somatic cells were also highly regulated during embryonic development, but in the opposite direction. Compared with less responsive proximal poly(A) sites in other genes, these poly(A) sites were more conserved, had stronger DSEs, and led to longer alternative 3′ UTRs. A recent high-throughput sequencing study also found a correlation between 3′ UTR lengthening and differentiation (Shepard et al., 2011).

FIGURE 4 | 3′ UTR length and developmental potential. The developmental potency of a cell seems to be inversely correlated with 3′ UTR length. The 3′ UTR is shortest in germ cells < stem cells < partly differentiated cells < terminally differentiated cells, due to different APA patterns in these cells.

During proliferation and cell transformation
A general shift toward more proximal poly(A) site usage is seen during increased proliferation, leading to 3′ UTR shortening. Enhanced usage of proximal intronic poly(A) sites is also seen, leading to changes in mRNA CRs (Elkon et al., 2012). 3′ UTR shortening has also been reported to be associated with cell transformation (Mayr and Bartel, 2009; Singh et al., 2009; Morris et al., 2012), with transformed cells expressing mRNAs with shorter 3′ UTRs than non-transformed cells with a similar proliferation rate (Mayr and Bartel, 2009). 3′ UTR shortening has also been associated with poor cancer prognosis (Lembo et al., 2012). 3′ UTR shortening is seen, for example, for fibroblast growth factor 2 (FGF-2), which is involved in tumor neovascularization. The distal poly(A) site is primarily utilized in fibroblasts, whereas more proximal poly(A) sites are utilized in transformed cell lines (Touriol et al., 1999). The 3′ UTR shortening is stronger during the transition from the arrested to the proliferative state than during the transition from the proliferative to the transformed state. Recent studies have found the association between 3′ UTR shortening and cell transformation to be inconsistent. One study found an even distribution between genes with 3′ UTR shortening and lengthening in one transformed cell line. Another study found a shift toward usage of more distal poly(A) sites for the majority of genes in another transformed cell line. Both of these studies did, however, find a general 3′ UTR shortening in other transformed cell lines (Elkon et al., 2012). Yet other studies found a general 3′ UTR shortening compared with matched normal tissue, both in tumor samples from five different tissues (Lin et al., 2012) and in several colorectal carcinomas (Morris et al., 2012). These examples illustrate that the shifts in poly(A) site usage in transformed cells are more complex than just a shift toward usage of more proximal poly(A) sites.

THE POLYADENYLATION PATTERN AS A BIOMARKER
Poly(A) site usage differs according to tissue type, developmental stage, genotype, and cancer subtype (Breton et al., 2001; Zhang et al., 2005; Kubo et al., 2006; Kwan et al., 2008; Sandberg et al., 2008; Wang et al., 2008; Singh et al., 2009; Thomsen et al., 2010; Derti et al., 2012; MacIsaac et al., 2012). This makes the polyadenylation pattern a candidate biomarker, which could be used for, e.g., cancer classification. Tissue-specific APA is common and often associated with proximal non-canonical poly(A) sites (Beaudoing et al., 2000). It has been estimated that 52% of all CR-APA events and 80% of all UTR-APA events are regulated differentially between tissues, making tissue-specific UTR-APA events even more differentially regulated than tissue-specific alternative splicing.

ALTERATIONS IN POLYADENYLATION IN ENDOCRINE DISEASE

SEQUENCE CHANGES IN AND AROUND CIS-ELEMENTS
Changes in the sequence in and around cis-elements can disrupt normal polyadenylation and cause disease (reviewed in Danckwardt et al., 2008). Single nucleotide polymorphisms (SNPs) have been identified that can either create or disrupt PASs (Thomas and Saetrom, 2012). SNPs in PASs can affect mRNA length and expression, and are significant predictors of gene expression levels (Thomas and Saetrom, 2012; Yoon et al., 2012). SNPs can also affect APA. A significantly high fraction of the SNPs that affect APA are linked to SNPs found by genome-wide association studies (GWAS). A high proportion of the APA alleles created by these SNPs are positively correlated with disease-risk alleles (Thomas and Saetrom, 2012). GWAS-identified trait-associated SNPs are generally overrepresented in the 3′ UTRs of genes, possibly creating or affecting the cis-elements involved in polyadenylation (Arnold et al., 2012).

Loss of function changes
A functional poly(A) site is required for polyadenylation and thus for transcription termination (Connelly and Manley, 1988; West and Proudfoot, 2009). Loss of function changes in the cis-elements of poly(A) sites can therefore lead to reduced gene expression caused by weakened polyadenylation efficiency. Such loss of function changes are associated with various diseases. They were first described in the PAS of the only poly(A) site in the HBA2 gene, causing α-thalassemias (Higgs et al., 1983; Harteveld et al., 1994), and in a PAS in the HBB gene, causing β-thalassemias (Orkin et al., 1985; Jankovic et al., 1990; Rund et al., 1992; van Solinge et al., 1996). Loss of function changes causing disease frequently seem to affect the PAS, which is one of the two core cis-elements found at poly(A) sites. Any possible base change in the canonical PAS (AAUAAA) significantly reduces polyadenylation efficiency (Sheets et al., 1990). The PAS is responsible for binding the core polyadenylation factor CPSF, which is required for polyadenylation (Chan et al., 2011). The reduced efficiency of polyadenylation associated with changes to the canonical PAS is thus caused by weakened recruitment of CPSF to poly(A) sites. Examples of endocrine diseases affected by loss of function changes in and around cis-elements of poly(A) sites are described below.
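
As an illustration of how a base change in the canonical PAS might be flagged in a sequence, the sketch below compares a hexamer with AAUAAA and searches for the canonical hexamer upstream of a cleavage site. The function names and the 40-nucleotide search window are assumptions made for this example and are not taken from the cited studies.

```python
CANONICAL_PAS = "AAUAAA"  # canonical polyadenylation signal, as given in the text


def pas_differences(hexamer: str, reference: str = CANONICAL_PAS):
    """List (1-based position, reference base, observed base) per mismatch."""
    hexamer = hexamer.upper().replace("T", "U")  # accept DNA or RNA letters
    if len(hexamer) != len(reference):
        raise ValueError("PAS hexamer must be 6 nucleotides long")
    return [(i + 1, ref, obs)
            for i, (ref, obs) in enumerate(zip(reference, hexamer))
            if ref != obs]


def find_canonical_pas(rna: str, cleavage_index: int, max_upstream: int = 40):
    """0-based start positions of AAUAAA that lie entirely within
    max_upstream nucleotides before the cleavage site.

    max_upstream is an illustrative window, not a value from the text.
    """
    rna = rna.upper().replace("T", "U")
    start = max(0, cleavage_index - max_upstream)
    region = rna[start:cleavage_index]
    hits = []
    pos = region.find(CANONICAL_PAS)
    while pos != -1:
        hits.append(start + pos)
        pos = region.find(CANONICAL_PAS, pos + 1)
    return hits


# Hypothetical sequence with one canonical PAS upstream of the cleavage site.
sequence = "GGG" + "AAUAAA" + "G" * 15 + "CA"
print(find_canonical_pas(sequence, len(sequence)))
print(pas_differences("AAUAAG"))
```

A variant hexamer that yields a non-empty list from `pas_differences` differs from the canonical signal and would, according to the text above, be expected to polyadenylate less efficiently; the size of the effect has to come from experimental data.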

Neonatal diabetes.
A loss of function mutation in the INS gene PAS has been shown to cause neonatal diabetes. This disruptive mutation changes the PAS of the only poly(A) site (AAUAAA to AAUAAG), reducing insulin mRNA expression to less than 3 × 10⁻⁴ percent (Garin et al., 2010). The authors hypothesize that this is due to impaired mRNA stability. However, such a low expression level is more likely a result of weakened polyadenylation efficiency at the only poly(A) site in the INS gene, leading to impaired transcription termination and thus reduced gene expression, as seen for other genes (Higgs et al., 1983; Bennett et al., 2001; Stacey et al., 2011). The change of the PAS sequence (AAUAAA to AAUAAG) reduces polyadenylation efficiency by about 98% (Sheets et al., 1990). As the PAS is crucial for binding CPSF, the reduced polyadenylation efficiency in the mutated INS gene is due to weakened CPSF recruitment (Chan et al., 2011).

Immune dysfunction, polyendocrinopathy, enteropathy, X-linked.
A loss of function mutation in the PAS of the only poly(A) site in the FOXP3 gene causes immune dysfunction, polyendocrinopathy, enteropathy, X-linked (IPEX). The mutation changes the PAS (AAUAAA to AAUGAA), leading to diminished FOXP3 mRNA expression (Bennett et al., 2001). This is probably due to impaired polyadenylation of the only poly(A) site in the FOXP3 gene, leading to non-termination of transcription, as seen similarly for other genes (Higgs et al., 1983;Stacey et al., 2011). The change of the PAS sequence (AAUAAA to AAUGAA) reduces polyadenylation efficiency about 95% (Sheets et al., 1990). The PAS is vital for binding CPSF, hence the impaired polyadenylation of the mutated FOXP3 gene is due to weakened binding of CPSF (Chan et al., 2011).

Mayer-Rokitansky-Küster-Hauser syndrome.
The AMH gene encoding anti-Müllerian hormone is located only 739 bp downstream of the housekeeping gene encoding splicing factor 3a subunit 2 (SF3a2) (Dresser et al., 1995), a component of the splice factor U2 (Bennett and Reed, 1993). Read-through transcription of the SF3a2 gene into the AMH gene, due to failed polyadenylation of the SF3a2 gene, is normally seen for a small proportion of transcripts (Dresser et al., 1995). This leads to constitutive expression of AMH along with SF3a2 and results in higher than normal AMH levels. It is hypothesized that polymorphisms around the SF3a2 gene poly(A) site that reduce polyadenylation and increase read-through transcription could cause Mayer-Rokitansky-Küster-Hauser (MRKH) syndrome. MRKH syndrome is characterized by failure of the Müllerian ducts to develop into the uterus, cervix, and upper vagina. Polymorphisms around the SF3a2 gene poly(A) site were, however, not found in 30 MRKH patients examined in one study (Oppelt et al., 2005).

Hypothalamic-pituitary-adrenal axis dysregulation. In the serotonin transporter (SERT) gene, a U/G polymorphism exists in the PAS of the distal, more canonical poly(A) site. The G-allele changes the PAS (AUUAAC to AGUAAC), leading to lower usage of the distal poly(A) site. This reduces total SERT expression in the brain to about half (Gyawali et al., 2010). This is plausible, as the distal, canonical poly(A) site is the most efficient for polyadenylation and the most frequently utilized, therefore yielding higher mRNA expression levels (Wang et al., 2013). The G-allele is associated with an increased risk of panic disorder (Gyawali et al., 2010), as well as heightened anxiety and depressive symptoms (Hartley et al., 2012). Interestingly, the SSRI fluoxetine increases the usage of the distal poly(A) site, suggesting that the therapeutic effect of this compound depends on alterations in polyadenylation (Hartley et al., 2012). Serotonin has many functions in the endocrine system, e.g., in the regulation of the hypothalamic-pituitary-adrenal axis (Heisler et al., 2007). This suggests that the SERT gene polymorphism could have endocrine implications as well.
Cancer. An A/C polymorphism exists in the PAS of the only poly(A) site in the TP53 gene. The C-allele changes the PAS (AAUAAA to AAUACA), leading to impaired polyadenylation and transcription termination. This results in decreased p53 mRNA expression and read-through transcription. The change of the PAS sequence (AAUAAA to AAUACA) reduces polyadenylation efficiency by about 89% (Sheets et al., 1990). CPSF binds the PAS, so the reduced polyadenylation efficiency is due to weakened binding of CPSF (Chan et al., 2011). The C-allele is associated with various cancers (Stacey et al., 2011; Zhou et al., 2012). This polymorphism could be important for endocrine tumor diseases as well, as loss of p53 is an important event in the pathogenesis of many tumors, e.g., thyroid tumors (Yoshimoto et al., 1992).
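
The percentage reductions quoted in this section can be put on a common scale with simple arithmetic. The sketch below converts the approximate reductions cited above from Sheets et al. (1990) into the fraction of wild-type efficiency that remains and the corresponding fold decrease. The dictionary and function names are hypothetical, and the input values are the rounded figures from the text, so the results are only as precise as those figures.

```python
# Approximate reductions in polyadenylation efficiency relative to AAUAAA,
# as quoted in the text (Sheets et al., 1990). Gene labels follow the text.
REPORTED_REDUCTION_PERCENT = {
    "AAUAAG": 98.0,  # INS, neonatal diabetes
    "AAUGAA": 95.0,  # FOXP3, IPEX
    "AAUACA": 89.0,  # TP53, cancer
}


def residual_efficiency(percent_reduction: float) -> float:
    """Fraction of wild-type efficiency left after the given reduction."""
    if not 0.0 <= percent_reduction <= 100.0:
        raise ValueError("percent_reduction must be between 0 and 100")
    return 1.0 - percent_reduction / 100.0


def fold_decrease(percent_reduction: float) -> float:
    """Wild-type efficiency divided by the residual efficiency."""
    residual = residual_efficiency(percent_reduction)
    if residual == 0.0:
        raise ValueError("a 100% reduction has no finite fold decrease")
    return 1.0 / residual


for hexamer, reduction in REPORTED_REDUCTION_PERCENT.items():
    print(f"{hexamer}: residual {residual_efficiency(reduction):.2f}, "
          f"about {fold_decrease(reduction):.0f}-fold lower")
```

Read this way, a reduction of about 98% corresponds to roughly 2% residual efficiency, which is about a 50-fold decrease, while a reduction of about 89% corresponds to roughly a 9-fold decrease.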

Gain of function changes
Gain of function changes in the cis-elements of poly(A) sites can lead to enhanced gene expression caused by improved polyadenylation efficiency, and to changes in APA. Such gain of function changes are associated with various diseases. They were first described in the CA dinucleotide immediately 5′ of the cleavage site in the F2 gene encoding thrombin (Gehring et al., 2001; Ceelie et al., 2004; Danckwardt et al., 2004, 2006, 2007). Later they were also described in the DSE of the F2 gene (Danckwardt et al., 2004), as well as in the DSE of the FGG gene encoding fibrinogen gamma (Lisman et al., 2005; Uitte de Willige et al., 2007). All of these mutations cause thrombophilia through enhanced polyadenylation efficiency, leading to upregulated gene expression. Examples of endocrine diseases affected by gain of function changes in and around cis-elements of poly(A) sites are described below.

Type I diabetes.
The GIMAP5 gene has two poly(A) sites in its 3′ UTR. A G/A polymorphism exists in the PAS of the proximal poly(A) site. The minor A-allele changes the PAS to a canonical one (AAUAGA to AAUAAA), leading to enhanced proximal poly(A) site usage (Shin et al., 2007). The change of the PAS sequence (AAUAGA to AAUAAA) increases polyadenylation efficiency about 25-fold (Sheets et al., 1990). The A-allele is associated with high levels of tyrosine phosphatase (IA-2) autoantibodies, which are reported to be associated with the clinical onset of type I diabetes (Shin et al., 2007). Interestingly, the G-allele is associated with an increased risk of systemic lupus erythematosus (SLE) (Hellquist et al., 2007). The A-allele induces a higher proportion of mRNAs with the short 3′ UTR isoform, but was not found to have an effect on total GIMAP5 mRNA levels (Hellquist et al., 2007). Differences in translational efficiency between the mRNAs with the long and short 3′ UTR isoforms are thus believed to affect the levels of IA-2 autoantibodies.
Type II diabetes. A proximal intronic poly(A) site has been found in the transcription factor 7-like 2 (TCF7L2) gene. Usage of this poly(A) site leads to production of a C-terminally truncated protein isoform that represses T-cell factor/lymphoid-enhancer factor (TCF/LEF)-dependent target genes (Locke et al., 2011). The authors hypothesize that this poly(A) site might explain why intronic SNPs in the TCF7L2 gene are associated with a risk of type II diabetes, as they could enhance the utilization of this poly(A) site.

IGF-1 deficiency. A U/A polymorphism in a proximal poly(A) site PAS of the insulin-like growth factor 1 (IGF-1) gene 3′ UTR has been described. It changes the PAS (AAUAUA to AAAAUA), leading to expression of a shorter than normal IGF-1 mRNA. The U/A polymorphism has been reported in a child born small for gestational age with IGF-1 deficiency (Bonapace et al., 2003). A later study, however, reported that this polymorphism, along with other polymorphisms in this PAS of the IGF-1 gene, causes neither IGF-1 deficiency nor growth impairment (Coutinho et al., 2007).

Endocrine tumor diseases.
A C/T polymorphism exists in the RET proto-oncogene, six nucleotides upstream of the PAS of an intronic proximal poly(A) site in intron 19 (Gartner et al., 2005). The adjacent poly(A) site is commonly used for polyadenylation by CR-APA. The C/T polymorphism lies within a binding site of the nucleic acid binding protein Pbx1 (AAATTAG(C/T)T), which might affect polyadenylation at the adjacent poly(A) site. The T-allele carriers have the more canonical binding motif for Pbx1. Heterozygosity for the T-allele was found in a significantly higher number of patients with various endocrine tumors than in healthy controls. Homozygosity for the T-allele was found exclusively in DNA from endocrine tumors with a high malignant potential, as a result of loss of heterozygosity (LOH) in these tumors (Gartner et al., 2005). This suggests that Pbx1 enhances polyadenylation at the adjacent poly(A) site, leading to higher expression levels of RET.
Cancer. Gain of function mutations have been found in the CCND1 gene encoding Cyclin D1 in mantle cell lymphoma (MCL) tumors. These mutations create new proximal poly(A) sites, giving rise to expression of truncated mRNAs lacking most of the 3′ UTR. These truncated mRNAs are associated with strongly proliferative MCL tumors and inferior survival (Wiestner et al., 2007). Such mutations could be important for endocrine tumor diseases as well, as high expression of Cyclin D1 is associated with many cancers, e.g., pancreatic endocrine tumors (Guo et al., 2003).

Cancer
Mutations in yeast polyadenylation factors have been shown to disrupt normal transcription termination (Birse et al., 1998) and to cause inefficient transcription elongation (Luna et al., 2005). They have also been shown to cause genome instability through the formation of R-loops (Stirling et al., 2012). R-loops are persistent, transcription-associated RNA:DNA hybrids that expose damage-prone single-stranded DNA (ssDNA) on the non-template strand and may block replication fork progression (Stirling et al., 2012). Non-terminated pre-mRNA transcripts, which form due to defective cleavage at the poly(A) site (Birse et al., 1998), could form the basis of the RNA:DNA hybrids found in R-loops. Genome instability is found in many human cancers, in part due to increased chromosome instability (CIN), which is linked to R-loops (Wahba et al., 2011). CIN is normally found as an early event in the oncogenic process and can lead to loss or gain of whole chromosomes or chromosomal fragments. Researchers working with yeast strains, each carrying mutations in one of 305 genes involved in CIN, found that mutations in 44 of the genes gave indications of increased CIN. Seven of these 44 genes encoded subunits of polyadenylation factors. They were homologs of the CPSF subunits CPSF-100, hFip1, and Wdr33, the CstF subunit CstF-64, both subunits of CF-II, and the yeast factor HRP1, which lacks a human homolog. The mutant strains showed indications of increased CIN in all cell cycle stages (Stirling et al., 2012). Genome-wide analysis of fragile chromatin sites in these mutant strains supported a transcription-dependent mechanism of DNA damage, characteristic of R-loop formation. Increased RNA:DNA hybrid formation was found in the polyadenylation factor mutant strains, and the resulting CIN could be suppressed by RNaseH, which degrades the RNA and thereby removes the R-loops (Stirling et al., 2012).
The researchers found that small interfering RNA (siRNA) knockdown of the CPSF subunit hFip1 in human cells increased DNA damage and chromosome breakage (Stirling et al., 2012). Interestingly, hFip1 also comprises the N-terminal part of a fusion protein with platelet-derived growth factor receptor alpha (PDGFRα), which causes 14-60% of the cases of hypereosinophilic syndrome/chronic eosinophilic leukemia (Cools et al., 2003; Gotlib and Cools, 2008; Li et al., 2012a). Truncation fusions of yeast FIP1, analogous to those found for hFip1 in chronic eosinophilic leukemia, were found to cause genome instability in yeast (Stirling et al., 2012). All these findings indicate that the normal 3′-end processing complex maintains genome integrity by suppressing R-loop formation, thereby suppressing CIN, a function that may be impaired in many human cancers (Stirling et al., 2012). Mutations in genes encoding polyadenylation factors could be important for endocrine tumor diseases, as CIN is already known as a powerful prognostic indicator for endocrine pancreatic tumor patients (Jonkers et al., 2007).

Hereditary and sporadic parathyroid tumors
The tumor suppressor protein Cdc73 is mutationally inactivated in hereditary and sporadic parathyroid tumors (Rozenblatt-Rosen et al., 2009). It normally associates with CPSF-CstF complexes and enhances polyadenylation of specific target genes. The inactivation of Cdc73 leads to decreased association of CPSF-CstF complexes with these target genes, reducing their expression levels (Rozenblatt-Rosen et al., 2009). This suggests that impaired polyadenylation of these target genes plays a part in the parathyroid tumorigenesis.

Fragile X-associated premature ovarian insufficiency
A shift in poly(A) site usage due to UTR-APA is seen in the fragile X mental retardation 1 (FMR1) gene premutation alleles compared with normal alleles (Tassone et al., 2011). The FMR1 gene 5′ UTR contains a CGG repeat element that is expanded to more than 200 CGG repeats and methylated in fragile X syndrome. The premutation alleles of the FMR1 gene, with 55-200 CGG repeats, are associated with various phenotypes, including fragile X-associated premature ovarian insufficiency (Toniolo and Rizzolio, 2007). These phenotypic manifestations could be a result of the UTR-APA of the FMR1 pre-mRNA. The choice of alternative 5′ UTRs affects the choice of poly(A) site (Winter et al., 2007; Ji et al., 2011). This could explain how the CGG expansion in the 5′ UTR of the FMR1 gene affects the choice of poly(A) site by altering the transcription start site (Tassone et al., 2011).

Diabetic nephropathy
A high-glucose-regulated gene, HGRG-14, found in human mesangial cells, expresses a longer mRNA isoform due to UTR-APA under high-glucose conditions (Abdel Wahab et al., 1998). This results in lower HGRG-14 protein levels. The switch from the shorter to the longer isoform is detected within 2 h of exposure to high glucose levels, and the authors hypothesize that this may be involved in the pathogenesis of diabetic nephropathy (Abdel Wahab et al., 1998).

Medullary thyroid carcinoma
Human thyroid C cells express an alternative isoform of calcitonin mRNA, containing all exons of the CALCA gene, which produces calcitonin and calcitonin carboxyl terminal peptide II (CCP II). This isoform is expressed at low levels in normal subjects. In medullary thyroid carcinomas expression is more abundant, leading to higher plasma levels of CCP II (Minvielle et al., 1991). CCP II could thus be used as a diagnostic marker for medullary thyroid carcinomas. As the alternative processing of the CALCA gene occurs through APA, the regulation of polyadenylation may be aberrant in medullary thyroid carcinomas.

Pre-eclampsia
Soluble fms-like tyrosine kinase 1 (sFlt-1) arises from CR-APA utilizing an intronic poly(A) site in the FLT1 gene encoding VEGF receptor 1 (Thomas et al., 2007). sFlt-1 binds VEGF and thereby reduces its free circulating level. Expression of sFlt-1 is upregulated in cytotrophoblasts during hypoxia and is suggested to induce endothelial damage (Nagamatsu et al., 2004; Zhao et al., 2012). sFlt-1 expression is increased in pre-eclampsia, and evidence suggests that it causes the development of the disease (Shibata et al., 2005; Zhao et al., 2012). Selective expression of sFlt-1 can be induced by targeting the 5′ splice site with an antisense oligonucleotide, thereby inhibiting U1 binding (Vorlová et al., 2011). This hints that U1 plays a role in the upregulation of sFlt-1. Several techniques using antisense elements can inhibit poly(A) site usage (see section Treatment by altering polyadenylation) (Beckley et al., 2001; Fortes et al., 2003; Goraczniak et al., 2009; Vickers et al., 2011; Vorlová et al., 2011; Blazquez et al., 2012; Vickers and Crooke, 2012). These techniques could be used to lower the expression of sFlt-1, possibly alleviating the symptoms of pre-eclampsia or even treating it in the future. Interestingly, sFlt-1 is also increased in type II diabetes (Nandy et al., 2010).

Ectopic Cushing syndrome
Ectopic Cushing syndrome is caused by adrenocorticotropic hormone (ACTH) production in non-pituitary tumors, such as small cell lung cancers, and this production is normally non-suppressible (Parks et al., 1998). An ACTH-producing small cell lung cancer was shown to express a C-terminally truncated isoform of the glucocorticoid receptor due to CR-APA using an intronic poly(A) site. This caused the ACTH production to be non-suppressible by exogenous glucocorticoid administration (Parks et al., 1998). Several techniques using antisense elements can inhibit poly(A) site usage (see section Treatment by altering polyadenylation) (Beckley et al., 2001; Fortes et al., 2003; Goraczniak et al., 2009; Vickers et al., 2011; Vorlová et al., 2011; Blazquez et al., 2012; Vickers and Crooke, 2012). If these techniques were used to target the intronic poly(A) site utilized by CR-APA, the normal full-length isoform of the glucocorticoid receptor would be expressed. This would render the ACTH production suppressible by glucocorticoids, possibly treating the Cushing syndrome.

Adrenal and gonadal dysfunction
The gene encoding the steroidogenic acute regulatory (StAR) protein expresses two mRNA isoforms (1.6 and 3.5 kb) that differ only in the length of their 3′ UTR due to UTR-APA (Duan et al., 2009). StAR regulates the rate-limiting step in steroid biosynthesis. ACTH stimulates StAR function by increasing cAMP and cAMP-protein kinase A (PKA) activity, leading to preferential expression of the longer, less stable 3.5 kb StAR mRNA isoform (Zhao et al., 2005). PKA also stimulates transcription of the RBP TIS11b, which selectively binds the 3.5 kb mRNA isoform and destabilizes it, while at the same time enhancing StAR protein translation (Duan et al., 2009). Aberrant expression of StAR due to alterations in polyadenylation could cause abnormal steroid production and adrenal/gonadal dysfunction, as seen for other disruptive mutations leading to enhanced or diminished StAR expression (Okuhara et al., 2008; Sahakitrungruang et al., 2010).

Enhancing poly(A) site usage
Specific usage of selected intronic poly(A) sites can be induced by targeting the upstream 5′ splice site with an antisense oligonucleotide, thereby inhibiting U1 binding (Vorlová et al., 2011). As U1 associating with 5′ splice sites represses adjacent downstream intronic poly(A) sites, this enhances the usage of these intronic poly(A) sites and leads to expression of C-terminal truncated proteins (Kaida et al., 2010;Vorlová et al., 2011;Berg et al., 2012).
Unlike mRNAs with acquired premature termination codons, which also generate C-terminal truncated proteins, the mRNAs generated by usage of intronic poly(A) sites are not targeted by nonsense-mediated decay (Lejeune and Maquat, 2005). Thousands of dormant intronic poly(A) sites have been characterized (Yao et al., 2012a). Induced usage of any of these sites can lead to the expression of specific C-terminal truncated protein isoforms.

Inhibiting poly(A) site usage
Different techniques using antisense elements have been used to effectively inhibit poly(A) site usage. They include antisense oligonucleotides (Vorlová et al., 2011), siRNAs (Vickers and Crooke, 2012), and U1 modifications (Beckley et al., 2001;Fortes et al., 2003;Goraczniak et al., 2009;Vickers et al., 2011;Blazquez et al., 2012). The techniques can both induce APA and lead to significantly reduced gene expression, depending on the poly(A) site targeted.

DRUGS TARGETING POLYADENYLATION FACTORS
The specific PAP inhibitor cordycepin is an adenosine-nucleotide analog. It reduces polyadenylation efficiency and causes defects in transcription termination (Kondrashov et al., 2012). Cordycepin has been shown to inhibit proliferation and induce apoptosis in various cancer cell lines (Thomadaki et al., 2008;Imesch et al., 2011). Cordycepin also inhibits the induction of inflammatory genes mediated by cytokines (Kondrashov et al., 2012).

PERSPECTIVES
Recently, the field of polyadenylation has seen major progress. Advances in high-throughput sequencing have made it possible to characterize polyadenylation genome-wide and lowered the price to ∼$100 per sample (Wang et al., 2013). This has stimulated the field and more research on polyadenylation is now being published. This will help to find more endocrine pathways regulated by polyadenylation and thus to discover novel implications of alterations in polyadenylation for endocrine disease. The lowered cost of genome-wide characterization of polyadenylation will also push forward the implementation of this technique into the clinic to improve diagnostics. For instance, cancers show characteristic changes in APA patterns (Mayr and Bartel, 2009;Singh et al., 2009;Fu et al., 2011;Elkon et al., 2012;Lin et al., 2012;Morris et al., 2012). The inclusion of APA screenings in diagnostics could therefore help to produce more accurate diagnoses by, e.g., differentiating between cancer subtypes. It is therefore crucial to increase awareness of the importance of polyadenylation among physicians and scientists working in endocrinology. Alternative polyadenylation is increasingly being recognized as an important regulator of the human transcriptome, like alternative splicing. It has been estimated that 52% of all CR-APA events and 80% of all UTR-APA events are regulated differentially between tissues, making tissue-specific UTR-APA events even more differentially regulated than tissue-specific alternative splicing. Interestingly, changes in the APA pattern are seen during changes in cellular conditions, such as proliferation, differentiation, cellular transformation, and dedifferentiation (Mayr and Bartel, 2009;Singh et al., 2009;Fu et al., 2011;Elkon et al., 2012;Lin et al., 2012;Morris et al., 2012).
Unlike mRNAs created by alternative splicing, mRNAs generated by CR-APA are not targeted by nonsense-mediated decay and thus lead to the production of C-terminally truncated protein isoforms (Lejeune and Maquat, 2005). Such isoforms can have significantly altered functions (Vorlová et al., 2011) and have been suggested to cause diseases such as pre-eclampsia (Zhao et al., 2012). C-terminally truncated protein isoforms generated by CR-APA could be causative of many other diseases. UTR-APA alters the 3′ UTR length and induces changes in stability and translational efficiency for individual mRNAs (Mayr and Bartel, 2009;Singh et al., 2009;Hogg and Goff, 2010;Yepiskoposyan et al., 2011). However, the effect of global changes in 3′ UTR length remains unclear, as no significant changes in mRNA levels were seen in several cases with general changes in 3′ UTR length (Fu et al., 2011;Elkon et al., 2012;Morris et al., 2012). The latter findings could be explained by the mRNAs competing for trans-acting factors. For instance, in activated neurons, a general upregulation of pre-mRNAs leads to a relative shortage of splice factor U1, which in turn leads to changes in APA (Berg et al., 2012). Similarly, a general change in mRNA 3′ UTR length could lead to a relative shortage/abundance of trans-acting factors, giving rise to a relatively weaker effect on gene expression levels than that seen for a single gene expressing an alternative 3′ UTR isoform. It would, however, still be interesting to investigate the changes in protein translation efficiency in cells with general changes in 3′ UTR length.
Both loss of function and gain of function changes in and around poly(A) sites have been shown to cause disease (reviewed in Danckwardt et al., 2008). It is probable that loss of function changes most frequently cause disease when they affect the stronger, more canonical poly(A) site of a gene. Firstly, these sites are the most efficient for polyadenylation and the most commonly utilized. Usage of these sites therefore gives rise to higher mRNA expression levels (Wang et al., 2013). Secondly, these sites are often found as the most distal poly(A) site, which works as the last option for polyadenylation and ensures that transcription is terminated (Beaudoing et al., 2000;Tian et al., 2005, 2007;Shepard et al., 2011;de Klerk et al., 2012;Lin et al., 2012;Yoon et al., 2012;Wang et al., 2013). When loss of function changes affect these more canonical poly(A) sites, the effect on gene expression is therefore strong. This is seen for the SERT gene, which encodes the serotonin transporter. Here, a loss of function change in the PAS of the distal, canonical poly(A) site (AUUAAC to AGUAAC) leads to lesser usage of the poly(A) site, reducing total SERT expression to about half (Gyawali et al., 2010). When loss of function changes affect the poly(A) site in genes where only one poly(A) site is found, the effect on gene expression is even stronger, as no alternative poly(A) sites can be used. This is seen for the INS, FOXP3, and TP53 genes, where loss of function changes in the PAS of the only poly(A) site in these genes lead to severely diminished gene expression (Bennett et al., 2001;Garin et al., 2010;Stacey et al., 2011). When loss of function changes affect the weaker poly(A) sites in genes with multiple poly(A) sites, the effect is a change in the APA pattern, altering the 3′ UTR and in some cases the protein product. Gene expression is also affected, but often in a more subtle way.
Interestingly, loss of function changes causing disease also commonly seem to affect the PAS (Higgs et al., 1983;Bennett et al., 2001;Garin et al., 2010;Gyawali et al., 2010;Stacey et al., 2011). This is predictable, as the PAS is one of the two core cis-elements found at poly(A) sites. It is essential for binding CPSF, which is required for both the cleavage and the polyadenylation step (Chan et al., 2011). It is important to note that loss of function changes can also lead to subclinical manifestations, as seen in the metachromatic leukodystrophy pseudodeficiency phenotype. Here a change in the PAS (AAUAAC to AGUAAC) of a predominantly used proximal poly(A) site in the ARSA gene reduces gene expression by 90% (Gieselmann et al., 1989). Conversely, it seems that gain of function changes associated with disease often affect proximal poly(A) sites. This is expected, as the proximal poly(A) sites are generally weaker and therefore have the biggest potential to be enhanced (Beaudoing et al., 2000;Tian et al., 2005, 2007;Shepard et al., 2011;de Klerk et al., 2012;Lin et al., 2012;Yoon et al., 2012;Wang et al., 2013). Interestingly, gain of function changes causing disease also commonly seem to affect cis-elements other than the PAS, as seen for the CA dinucleotide immediately 5′ of the cleavage site and for the DSE (Gehring et al., 2001;Ceelie et al., 2004;Danckwardt et al., 2004, 2006, 2007;Uitte de Willige et al., 2007). This is also anticipated, as improvement of auxiliary cis-elements can enhance polyadenylation both at poly(A) sites with a canonical PAS and at poly(A) sites with a non-canonical PAS.
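The PAS changes discussed above are single-nucleotide substitutions within a hexamer, and a short script can make them explicit. The following is a minimal illustrative sketch, not a method used in the cited studies: it assumes the canonical PAS hexamer AAUAAA as the reference, and the helper names (`substitutions`, `mismatches_to_canonical`) are hypothetical. It only counts sequence differences and does not predict poly(A) site strength.

```python
from typing import List, Tuple

# Assumption for this sketch: the canonical PAS hexamer is used as the reference.
CANONICAL_PAS = "AAUAAA"


def substitutions(reference: str, variant: str) -> List[Tuple[int, str, str]]:
    """Return (1-based position, reference base, variant base) for each mismatch."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [
        (i + 1, r, v)
        for i, (r, v) in enumerate(zip(reference, variant))
        if r != v
    ]


def mismatches_to_canonical(hexamer: str) -> int:
    """Count positions at which a hexamer differs from the canonical PAS."""
    return len(substitutions(CANONICAL_PAS, hexamer))


# ARSA PAS change as given in the text (AAUAAC to AGUAAC).
arsa_change = substitutions("AAUAAC", "AGUAAC")
print(arsa_change)                        # [(2, 'A', 'G')]
print(mismatches_to_canonical("AAUAAC"))  # 1
print(mismatches_to_canonical("AGUAAC"))  # 2
```

In this example the ARSA change moves the hexamer from one to two mismatches relative to AAUAAA. The count is only a descriptive measure of sequence divergence; the functional effect on polyadenylation has to be determined experimentally, as in the cited work.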
Various techniques using specific antisense elements that enhance or inhibit the usage of specific poly(A) sites exist (Beckley et al., 2001;Fortes et al., 2003;Goraczniak et al., 2009;Vickers et al., 2011;Vorlová et al., 2011;Blazquez et al., 2012;Vickers and Crooke, 2012). These techniques hold great promise for novel therapeutic approaches, as they can both induce APA and change gene expression significantly. One obvious use would be to reverse the effect of disease-causing mutations affecting polyadenylation. However, they could also be used to treat diseases in which alterations in polyadenylation do not take part in the pathogenesis. One approach would be to induce the expression of C-terminally truncated protein isoforms. This primarily leads to loss of function of the full-length product, but the expressed C-terminally truncated protein isoforms can have beneficial functions of their own. This is seen for the VEGF receptor, where a C-terminally truncated isoform functions as a soluble decoy. Induction of this isoform resulted in an antiangiogenic effect both in targeted cells and in untreated cells exposed to the conditioned media (Vorlová et al., 2011). Another approach is to significantly reduce gene expression by inhibiting the more canonical poly(A) site (Goraczniak et al., 2009;Blazquez et al., 2012). A potentially side-effect-free use of these techniques would be to inhibit premature intronic poly(A) sites that are used in diseased cells but not in healthy cells. As polyadenylation is inhibited at the intronic poly(A) site, the targeted intron will be spliced out and degraded, leaving the healthy cells unaffected by the treatment.
(2) Many diseases and symptoms are caused by defective polyadenylation (reviewed in Danckwardt et al., 2008). Several endocrine diseases are either caused by or associated with alterations in polyadenylation. These are neonatal diabetes (Garin et al., 2010), IPEX (Bennett et al., 2001), type I and II diabetes (Shin et al., 2007;Locke et al., 2011), pre-eclampsia (Zhao et al., 2012), fragile X-associated premature ovarian insufficiency (Tassone et al., 2011), ectopic Cushing syndrome (Parks et al., 1998), and several types of endocrine tumor diseases (Minvielle et al., 1991;Gartner et al., 2005). Consequently, advances in this field lead to a better understanding of these diseases and expose new possible drug targets. (3) Novel techniques using antisense elements that can both enhance and inhibit the usage of specific poly(A) sites have been developed (Beckley et al., 2001;Fortes et al., 2003;Goraczniak et al., 2009;Vickers et al., 2011;Vorlová et al., 2011;Blazquez et al., 2012;Vickers and Crooke, 2012). These techniques hold the promise of novel therapeutic approaches through deliberately induced changes in polyadenylation.