Generation of 2′,3′-Cyclic Phosphate-Containing RNAs as a Hidden Layer of the Transcriptome

Cellular RNA molecules contain phosphate or hydroxyl ends. A 2′,3′-cyclic phosphate (cP) is one of the 3′-terminal forms of RNAs mainly generated from RNA cleavage by ribonucleases. Although transcriptome profiling using RNA-seq has become a ubiquitous tool in biological and medical research, cP-containing RNAs (cP-RNAs) form a hidden transcriptome layer, which is infrequently recognized and characterized, because standard RNA-seq is unable to capture them. Despite cP-RNAs’ invisibility in RNA-seq data, increasing evidence indicates that they are not accumulated simply as non-functional degradation products; rather, they have physiological roles in various biological processes, designating them as noteworthy functional molecules. This review summarizes our current knowledge of cP-RNA biogenesis pathways and their catalytic enzymatic activities, discusses how the cP-RNA generation affects biological processes, and explores future directions to further investigate cP-RNA biology.


INTRODUCTION
After transcription, newly synthesized RNA molecules must undergo maturation steps to become functional molecules, and unnecessary RNAs are subjected to turnover. In both the RNA maturation and turnover mechanisms, enzymatic cleavage of RNA molecules plays a crucial role. When cleaved, RNAs can generally possess a hydroxyl group (OH), a phosphate (P), or a 2′,3′-cyclic phosphate (cP) at their termini. While OH and P can be found at both the 5′- and 3′-ends of RNAs, a cP is present only at the 3′-end of RNAs, where the 2′- and 3′-positions of the ribose are bridged by the phosphate (Figure 1). The catalytic machinery of RNA cleavage determines the terminal phosphate state of the generated RNA molecules, which is not just a consequence of the cleavage but, in many cases, is critical for further RNA maturation and function. The current standard RNA-sequencing (RNA-seq) methods rely on 5′-P/3′-OH ends of RNAs; thus, RNAs with a cP (cP-containing RNAs: cP-RNAs) cannot be captured because a cP end cannot be ligated to the 3′-adapter by an ATP-dependent ligase. Consequently, cP-RNAs are "invisible" in RNA-seq data and therefore form a hidden component of the transcriptome. However, accumulating evidence indicates that cP-RNA generation is significant in various biological processes. Here, we summarize our knowledge of cP-RNAs' biogenesis mechanisms, expression, and molecular functions and discuss how to further interrogate cP-RNA biology.

POSSIBLE CATALYTIC MECHANISMS OF cP FORMATION
There are multiple situations in which a cP is formed at the 3′-end of RNA molecules. A cP frequently appears as an intermediate form during RNA cleavage by many endoribonucleases [e.g., pancreatic ribonuclease (RNase A), RNase T1, and RNase T2], which eventually generate RNAs with 3′-P/5′-OH ends (Cuchillo et al., 1997; Irie, 1997; Nichols and Yue, 2008). RNA cleavage by these enzymes is composed of two steps: (i) transesterification (transphosphorylation), forming an intermediate cP, and (ii) cP hydrolysis to generate a 3′-P (Fabian and Mantsch, 1995; Lilley, 2011) (Figure 2A). RNase A, the best-studied enzyme among such endoribonucleases, contains a catalytic triad of His12, Lys41, and His119, in which the two histidines in particular serve as general acid-base catalysts during both steps (Roberts et al., 1969; Thompson and Raines, 1994; Cuchillo et al., 2011). Step (i) is initiated with 2′-OH deprotonation by a base catalyst, His12, followed by nucleophilic attack on the phosphorus by the generated 2′-oxygen (O), which causes transesterification to form a 2′,3′-cP. His119 assists the reaction as an acid catalyst by donating a proton to the leaving group, forming a 5′-OH end. Lys41 forms a hydrogen bond with 2′-O to transiently stabilize the cP. In step (ii), to hydrolyze the cP, His119 serves as a base catalyst to remove a proton from the vicinal water molecule, while His12 serves as an acid catalyst by donating a proton to form 2′-OH, generating a 3′-P as a final form.
Although the above case produces a cP only as an intermediate form, many ribonucleases, as summarized in Table 1, generate a cP as a final form because their RNA cleavage conducts only step (i) without proceeding to step (ii) (Figure 2B). As a well-studied example, RNA cleavage by angiogenin (ANG), an endoribonuclease belonging to the RNase A superfamily (Dyer and Rosenberg, 2006; Sheng and Xu, 2016), yields a cP end (Shapiro et al., 1986). ANG contains the catalytic triad, His13, Lys40, and His114, which is well conserved among RNase A superfamily members, but shows 10^5- to 10^6-fold lower ribonucleolytic activity compared to RNase A (Shapiro et al., 1986; Harper and Vallee, 1989). Certain unique structural features of ANG can explain this low catalytic activity. ANG's substrate binding pocket is obstructed by Gln117 (Russo et al., 1994) (Figure 2C), which is stabilized by a hydrogen bond with Thr44 (Leonidas et al., 1999, 2002; Holloway et al., 2004). Two hydrogen bonds from Asp116 and Ser118 further stabilize Gln117's obstructive position (Russo et al., 1994). This steric hindrance would decrease substrate accessibility, possibly leading to low cleavage and cP hydrolysis activities. Indeed, a single mutation of Gln117 to Gly showed at least a ∼20-fold increase in ribonuclease activity, as well as a ∼28-fold increase in cP hydrolysis activity (Russo et al., 1994; Leonidas et al., 2002). In addition, Asp116 could contribute to low cP hydrolysis activity as well as low cleavage activity. While the corresponding Asp121 of RNase A forms a hydrogen bond with catalytic His119, presumably to support its imidazole ring orientation, Asp116 of ANG does not support catalytic His114 but forms two hydrogen bonds with Ser118 (Leonidas et al., 2002). The lack of support for His114 should have an adverse effect on the cP hydrolysis reaction, because His114 would otherwise initiate that reaction as a base catalyst; this possibly leaves a cP as a final form.
RNA cleavage by colicin E5, a cytotoxic endoribonuclease found in Escherichia coli, also yields a cP as a final form (Ogawa et al., 1999; Ogawa et al., 2006), presumably due to cP structure stabilization. The ribonuclease domain of colicin E5 does not contain histidines, the most frequently utilized catalytic residues (Bartlett et al., 2002), but possesses Arg33 and Lys25 (numbering from C-terminal domain) as catalytic residues (Inoue-Ito et al., 2012). Although the catalytic mechanism remains to be further examined, these residues, along with Ile94 that supports the orientation of Arg33, might stabilize a cP structure (Inoue-Ito et al., 2012), which would contribute to generating a cP as a final form. A cP structure may also be stabilized through interaction with a protein during RNA cleavage by a eukaryotic cP-forming exoribonuclease, U six biogenesis protein 1 (USB1), also known as mutated in poikiloderma with neutropenia protein 1 (MPN1). USB1 contains two well-conserved His-x-Ser (HxS) catalytic motifs in the active site cleft (Mroczek et al., 2012; Hilcenko et al., 2013). It is speculated that, while His120 and His208 in these motifs serve as general acid-base catalysts, Ser122 and Ser210 in these motifs coordinate the oxygens in a cP after transesterification, potentially stabilizing a cP structure as a final form by preventing further hydrolysis.
While a cP end is predominantly formed by ribonuclease-catalyzed transesterification, RNA 3′-terminal phosphate cyclase (RtcA) can catalyze de novo cP formation by a distinct molecular mechanism involving the following three steps (Genschik et al., 1997, 1998; Billy et al., 2000; Filipowicz, 2016). First, RtcA is autoadenylylated with ATP to form a covalent RtcA-AMP intermediate. The autoadenylylation is initiated by a His309 (in E. coli RtcA; His320 in human RtcA)-mediated nucleophilic attack on the ATP α-phosphorus, followed by covalent bond formation and pyrophosphate (PPi) release. Second, the holoenzyme then transfers the AMP to the 3′-P of the substrate RNA to form an RNA with 3′-PP-5′A. Third, the energetically unstable phosphoanhydride bond between the two phosphates is cleaved by 2′-OH-mediated attack, resulting in cP formation and AMP release.
Figure caption (fragment): Three catalytic residues are shown in pink, and the residues forming a substrate binding pocket and/or associating with Gln117 are shown in green. Orange dotted lines indicate the hydrogen bonds that particularly support Gln117's obstructive position (Leonidas et al., 1999, 2002; Holloway et al., 2004).

Ribonucleases
Although the detailed molecular basis of cP formation remains to be determined, RNA cleavage by many ribonucleases produces a cP as a final, predominant form, generating cP-RNAs (Table 1). A tRNA splicing endonuclease is one of the oldest ribonucleases known to generate cP-RNAs (Abelson et al., 1998; Hopper and Phizicky, 2003; Yoshihisa, 2014). In eukaryotes, precursors of some tRNAs, such as tRNALeuCAA, tRNAIleUAU, and tRNATyrGUA, contain an intronic region within their anticodon-loop (Chan and Lowe, 2016). Although the splicing activity to remove tRNA introns and cP formation during the splicing were discovered in the early 1980s (Peebles et al., 1983), many years and much effort were required to identify tRNA-splicing endonuclease subunit 2 (Sen2) as the endoribonuclease directly responsible for tRNA splicing (Trotta et al., 1997; Paushkin et al., 2004; Phizicky and Hopper, 2010), partly due to its membrane association and low cellular expression level. Sen2 is a subunit of the heterotetrameric SEN complex and cleaves the 5′-splice site of tRNAs to leave a cP at the 3′-end of 5′-exons (Figure 3A), whereas the 3′-splice site is cleaved by Sen34. As expected from its crucial role in tRNA splicing, SEN2 is an essential gene in yeast (Trotta et al., 1997). In humans, SEN2 gene mutations are associated with pontocerebellar hypoplasia (Budde et al., 2008; Namavar et al., 2011; Bierhals et al., 2013).
Angiogenin, originally identified as a protein factor promoting angiogenesis (Kurachi et al., 1985; Strydom et al., 1985), is another long-known enzyme that produces cP-RNAs (Shapiro et al., 1986). ANG has diverse physiological roles and is associated with various pathological conditions such as cancers and neurodegenerative diseases (Tello-Montoliu et al., 2006; Gao and Xu, 2008; Sheng and Xu, 2016). tRNAs were identified as major endogenous RNA targets of ANG in Xenopus oocytes (Saxena et al., 1992), and, subsequently, ANG-mediated cleavages of tRNA anticodon-loops were reported to generate functional tRNA half molecules in human cell lines (Fu et al., 2009; Yamasaki et al., 2009) (Figure 3B). In hormone-dependent cancer cells, ANG cleavage has been shown to occur for mature aminoacylated tRNAs, generating 5′-tRNA halves with a 5′-P and a 3′-terminal cP, and 3′-tRNA halves with a 5′-OH and a 3′-terminal amino acid. Although ANG homologs are only found in vertebrates (Cho and Zhang, 2006; Sheng and Xu, 2016), 5′-tRNA halves expressed in Bombyx mori cells still contain a cP (Honda et al., 2017), suggesting that, even in the absence of ANG homologs, those organisms express an unidentified cP-forming endoribonuclease to cleave tRNAs and generate tRNA halves as cP-RNAs.
In vertebrates, U6 snRNA mostly contains a cP (Lund and Dahlberg, 1992), whereas all other snRNAs do not (Maraia and Intine, 2002). During maturation of U6 snRNA, several uridines are added to the 3′-end of a precursor RNA by terminal uridylyl transferase 1 (TUT1). Subsequently, USB1 (also known as MPN1), a cP-forming 3′ to 5′ exoribonuclease, excises a 3′-terminal uridine stretch to generate a mature 3′-end with four or five uridines containing a cP (Shchepachev et al., 2012; Mroczek and Dziembowski, 2013; Didychuk et al., 2018) (Figure 3C). Although USB1 belongs to the 2H phosphoesterase superfamily and contains a cyclic phosphodiesterase (CPDase) motif (Mazumder et al., 2002; Myllykoski et al., 2013), human USB1 lacks the CPDase activity and thus generates a cP as a final form. In contrast, yeast Usb1 retains the CPDase activity (Didychuk et al., 2017), generating a 3′-P end of U6 snRNA (Lund and Dahlberg, 1992). It will be intriguing to address how and why the difference arose.
rRNA maturation requires a cP-forming endoribonuclease, Lethal in the absence of Ssd1 (Las1 in yeast; Las1L in human) (Castle et al., 2010; Schillewaert et al., 2012). In yeast, rRNA maturation starts from processing of a nascent 37S rRNA precursor into shorter precursors, including 27S rRNA (Henras et al., 2015; Gerstberger et al., 2017). The 27S rRNA is further cleaved by Las1 between 5.8S and 25S rRNA sequences, generating 7S rRNA as a 5′-cleavage product with a cP (Gasse et al., 2015; Pillon et al., 2017). The cleavage is catalyzed by an N-terminal α-helical 'higher eukaryotes and prokaryotes nucleotide-binding' (HEPN) domain of Las1, which has been defined as a conserved R xxxH catalytic motif ( : N, D, or H) (Anantharaman et al., 2013; Pillon et al., 2017). During further processing of 7S rRNA into mature 5.8S rRNA, the cP end of 7S rRNA is removed and processed by unknown mechanisms; therefore, a cP is absent in mature rRNAs.
A cP is also formed in an mRNA splicing event that plays a crucial role in activating the unfolded protein response (UPR) pathway upon endoplasmic reticulum (ER) stress. Inositol-requiring enzyme 1 (Ire1), a cP-forming endonuclease, is associated with the ER membrane with its C-terminal domain exposed to the cytosol (Urano et al., 2000; Zhang and Kaufman, 2004). While an interaction with 'binding immunoglobulin protein' (BiP), an ER chaperone protein, retains Ire1 as an inactive monomer under normal conditions, ER stress releases BiP, allowing Ire1 to form a homodimer that harbors an active nuclease domain. The activated Ire1 is involved in splicing of HAC1 mRNA (in yeast; XBP1 mRNA in human) by cleaving both the 5′- and 3′-splice sites, in which a cP is formed at the conserved 3′-terminal G of the 5′-cleavage products (Sidrauski and Walter, 1997; Gonzalez et al., 1999; Shinya et al., 2011) (Figure 3D). From the spliced, mature form of HAC1 mRNA, a basic-region leucine-zipper transcription factor HAC1 is expressed, eventually promoting the transcription of its target genes containing UPR-responsive elements (Sidrauski and Walter, 1997; Urano et al., 2000; Zhang and Kaufman, 2004).
cP-forming endoribonucleases are further found in colicins, toxic proteins that are encoded in plasmid DNAs in some E. coli strains to invade and kill other bacteria (Cascales et al., 2007). Among over 20 colicins identified thus far, colicin E5 and D have been shown to cleave the anticodon-loop of tRNAs and form a cP (Ogawa et al., 1999; Ogawa et al., 2006) (Figure 3E). While the endoribonuclease activity of those colicins is masked by immunity proteins in host E. coli, colicin E5 invades other bacteria and cleaves tRNATyrGUA, tRNAHisGUG, tRNAAsnGUU, and tRNAAspGUC between G at nucleotide position 34 (G34) and U35 (Ogawa et al., 1999), and colicin D cleaves all four isoacceptors of tRNAArg between A38 and G39/C39 (Tomita et al., 2000), contributing to bacterial lethality.
The E. coli genome also encodes cP-forming endoribonucleases involved in toxin-antitoxin (TA) systems. TA systems are involved in bacterial stress responses and are often considered "suicidal programs," comprising a stable toxin and an unstable antitoxin that neutralizes the cognate toxin in cells (Unterholzner et al., 2013). In the well-studied MazEF system (Figure 3F), toxic endoribonuclease MazF is neutralized by antitoxin MazE under normal conditions, but various stresses, such as nutrient limitation, DNA damage, and antibiotic exposure, degrade MazE and thereby release MazF (Jensen and Gerdes, 1995; Yarmolinsky, 1995; Engelberg-Kulka and Glaser, 1999), which cleaves cellular mRNAs globally to prevent further protein production (Zhang et al., 2003). MazF cleaves the 5′-side of an ACA motif within mRNAs and forms a cP (Zhang et al., 2003, 2005a; Vesper et al., 2011). Recent reports showed MazF-catalyzed cleavage of 16S and 23S rRNAs, and some tRNAs such as tRNALysUUU (Vesper et al., 2011; Moll and Engelberg-Kulka, 2012; Schifano et al., 2013; Schifano et al., 2014, 2016; Mets et al., 2017), indicating that MazF is a critical suicide factor causing perturbation of the whole cellular transcriptome. The ChpBIK system, another TA system, also uses a cP-forming enzyme as a toxin. When released from antitoxin ChpBI under stress conditions, toxic endoribonuclease ChpBK cleaves mRNAs at the 5′- or 3′-side of A in an ACY sequence motif to prevent further protein production (Christensen et al., 2003; Zhang et al., 2005b). The 5′-cleavage products contain a cP.
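The motif-directed cleavage described above can be illustrated with a short Python sketch. It is a simplified model under stated assumptions, not a description of the enzymes' real substrate preferences: the function names (`cleavage_sites`, `fragments`) and the example sequence are hypothetical, the model ignores RNA structure and sequence context, and it cuts every motif occurrence. MazF is modeled as cutting on the 5′-side of the A in ACA, and ChpBK as cutting on the 5′- or 3′-side of the A in ACY (Y = C or U).

```python
import re

def cleavage_sites(rna: str, motif_regex: str, offset: int = 0) -> list[int]:
    """Return cut positions as the 0-based index of the nucleotide just 3' of each cut.

    offset=0 cuts on the 5'-side of the first motif nucleotide;
    offset=1 cuts on its 3'-side. (Hypothetical helper for illustration.)
    """
    # A lookahead keeps matches zero-width so overlapping motifs are all found.
    pattern = re.compile(f"(?={motif_regex})")
    return [m.start() + offset for m in pattern.finditer(rna)]

def fragments(rna: str, sites: list[int]) -> list[str]:
    """Split the RNA at the given cut positions, listed 5' to 3'."""
    cuts = sorted({s for s in sites if 0 < s < len(rna)})
    bounds = [0, *cuts, len(rna)]
    return [rna[a:b] for a, b in zip(bounds, bounds[1:])]

# Hypothetical 12-nt sequence containing two ACA motifs.
mrna = "GGACAUUACACG"
mazf_sites = cleavage_sites(mrna, "ACA")        # MazF-like: 5'-side of ACA
mazf_products = fragments(mrna, mazf_sites)

# ChpBK-like cut on the 3'-side of A in an ACY motif (Y = C or U).
chpbk_sites = cleavage_sites("UACUG", "AC[CU]", offset=1)
```

In this model, every fragment except the last one ends at a cut site and would therefore carry the 3′-terminal cP, while each downstream fragment would begin with a 5′-OH.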
The genome of some E. coli isolates possesses a prr locus, encoding PrrC endonuclease (also known as anticodon nuclease: ACNase), which is considered to be another bacterial suicide program (Kaufmann, 2000). PrrC activity is usually silenced by interaction with a masking protein, but, upon T4 phage infection, it forms an ACNase complex and cleaves tRNALysUUU between U33 and U34, which serves as a host defense to inhibit translation of T4 proteins (Amitsur et al., 1987; Morad et al., 1993). The 5′-tRNALysUUU half resulting from the PrrC cleavage harbors a cP.
cP-forming cytotoxic endoribonucleases are also present in eukaryotes. Zymocin and PaT are toxin complexes secreted by the yeasts Kluyveromyces lactis and Pichia acaciae, respectively, to inhibit the growth of other yeasts (Lu et al., 2005, 2008; Klassen et al., 2008) (Figure 3E). Zymocin is composed of three subunits; two of them assist target cell binding and invasion, while the remaining γ-subunit cleaves tRNAs in targeted yeasts (Stark and Boyd, 1986). The γ-subunit of zymocin recognizes 5-methoxycarbonylmethyl-2-thiouridine (mcm5s2U), a specific modified RNA nucleotide present at nucleotide position 34 of tRNAGluUUC, tRNALysUUU, and tRNAGlnUUG, and cleaves between U34 and U35 of those tRNAs (Lu et al., 2005), leaving a cP at the ribose of mcm5s2U in the cleavage products. PaT is a heterodimer composed of PaOrf1, a cell invasion-assisting subunit, and PaOrf2, an endonuclease subunit (McCracken et al., 1994; Klassen et al., 2004). PaOrf2 recognizes 5-methoxycarbonylmethyl uridine (mcm5U) and cleaves between U34 and U35 of tRNAGlnUUG, leaving a cP at the ribose of mcm5U in the cleavage product (Klassen et al., 2008; Chakravarty et al., 2014).
cP-forming endoribonucleases are further found in viruses. DNA topoisomerase, encoded in vaccinia virus, belongs to the type IB family of eukaryotic DNA topoisomerases and uniquely harbors endoribonucleolytic activity, which forms a cP end at the cleaved RNAs (Sekiguchi and Shuman, 1997; Shuman, 1998). Analogous to yeast topoisomerase I, which can remove single ribonucleotides in DNA duplexes (Kim et al., 2011), topoisomerase's RNA cleavage activity might be involved in maintaining genome integrity during DNA replication. Replicative nidoviral uridylate-specific endoribonuclease (NendoU), encoded in nidovirus, is also a cP-forming endoribonuclease (Ivanov et al., 2004). While the functional role of the endoribonuclease activity in virus infection and replication is not fully understood, NendoU preferentially targets dsRNA and cleaves the 5′-side of uridine in a G-U or G-U-U sequence to generate cP-RNAs (Ivanov et al., 2004).

Ribozymes
Ribozymes are another class of cP-yielding biocatalysts. Among several distinct classes of ribozymes, a class of small, self-cleaving ribozymes is known to generate cPs (Scott and Klug, 1996; Doherty and Doudna, 2001; Serganov and Patel, 2007). Small self-cleaving ribozymes are widely found in bacterial, plant, and mammalian genomes, and are involved in gene regulation and expression (Shih and Been, 2002; Serganov and Patel, 2007). Out of 11 identified ribozymes in this class, 10 have been shown to form cP ends as a result of their cleavage of RNAs (Saville and Collins, 1990; Scott and Klug, 1996; Winkler et al., 2004; Salehi-Ashtiani et al., 2006; Roth et al., 2014; Harris et al., 2015; Li et al., 2015; Weinberg et al., 2015). In the case of the hepatitis delta virus (HDV) ribozyme, the 85-nt minimal self-cleavage domain cleaves between U−1 and G1 (Shih and Been, 2002; Puerta-Fernandez et al., 2003). While C75 is suggested to act as a general acid catalyst by donating a proton from its N3 in the pyrimidine ring to a leaving group, several different molecules have been proposed as potential base catalysts: water or hydroxide from the solvent, water molecules coordinated to the Mg2+, or the 2′-OH of G27 positioned closely adjacent to the catalytic site (Ward et al., 2014).

Enzymes That Act Directly on the 3′-End of RNAs
Two protein catalysts have been reported to form a cP by a molecular mechanism distinct from transesterification during RNA cleavage. As described above, RtcA can catalyze de novo cP formation by directly acting on the 3′-end of RNAs (Genschik et al., 1997, 1998; Billy et al., 2000; Filipowicz, 2016). Although the endogenous RNA target of RtcA is unknown, in the E. coli genome, rtcA and an RNA ligase gene, rtcB, form an rtcBA operon, which is implicated in an RNA repair pathway (Das and Shuman, 2013a; Burroughs and Aravind, 2016). An archaeal thermophilic RNA ligase from Methanobacterium thermoautotrophicum, MthRnl, is the other enzyme that can catalyze de novo cP formation (Zhelkovsky and McReynolds, 2014; Yoshinari et al., 2017). While MthRnl can ligate 5′-P and 3′-OH ends of RNAs (Torchia et al., 2008), when substrate RNAs contain a 3′-P, MthRnl converts it to a cP by a mechanism similar to that of RtcA (Zhelkovsky and McReynolds, 2014). In addition, MthRnl possesses a 3′-deadenylation activity that can remove a 3′-terminal adenosine with an OH end and form a cP (Yoshinari et al., 2017). The endogenous RNA target of MthRnl is unknown.

BIOLOGICAL SIGNIFICANCE OF cP FORMATION AND cP-RNA EXPRESSION
What is the significance of cP formation in RNAs? It has been shown that cP formation in U6 snRNA regulates RNA interaction with protein factors. While nascent U6 snRNA containing a 3′-OH end is bound by La protein (Maraia and Intine, 2002; Maraia and Bayfield, 2006), cP formation of mature U6 snRNA promotes interaction with Lsm2-8 complexes (Khusial et al., 2005; Licht et al., 2008) (Figure 3C). The affinity of the cP-containing RNA for Lsm2-8 is higher than that of 3′-OH-containing RNA, and the interactions of La/3′-OH and Lsm2-8/cP are mutually exclusive: even when both La and Lsm2-8 exist in the reaction solution, RNA with 3′-OH or with cP only binds to La or Lsm2-8, respectively (Licht et al., 2008). cP formation is, therefore, a critical factor for forming functional spliceosome complexes with Lsm2-8 (Didychuk et al., 2018). Although this is the only proven example of the cP-regulated formation of an RNA-protein complex, cP formation in other cP-RNAs may modulate RNA interaction with protein factors.
An RNA ligation reaction can depend on a cP in a substrate RNA. In tRNA splicing, Sen2-mediated cleavage forms a 3′-terminal cP in 5′-exons, which is then ligated to the 5′-OH end of 3′-exons by tRNA ligase (Popow et al., 2012; Yoshihisa, 2014) (Figure 3A). In Arabidopsis thaliana, the tRNA ligase AtRNL is able to ligate cP ends to 3′-exons but cannot use 3′-P ends as its ligation substrate (Schutz et al., 2010; Tanaka et al., 2011a). This cP-specific ligation activity was also observed in wheat germ extract (Konarska et al., 1982). In this plant ligation process, cP ends of 5′-exons are first converted to 2′-P and 3′-OH. 5′-OH ends of 3′-exons are phosphorylated, followed by ligation to the 3′-OH of 5′-exons (Popow et al., 2012; Yoshihisa, 2014). Other organisms employ distinct molecular mechanisms in ligation of cP-containing 5′-exons to 3′-exons in tRNA splicing (Popow et al., 2012; Yoshihisa, 2014). In humans, RtcB was identified as a tRNA ligase (Popow et al., 2011). Experiments using lysates and RtcB immunoprecipitates from HeLa cells suggest that human RtcB prefers cP and 5′-OH for ligation (Filipowicz et al., 1983; Popow et al., 2011). However, whether the substrate specificity extends to 3′-P and 5′-OH containing RNA still awaits analysis using a recombinant human tRNA ligase complex. In mammals, RtcB is involved in splicing of XBP1 mRNA in the UPR pathway (Filipowicz, 2014; Jurkin et al., 2014; Lu et al., 2014) (Figure 3D). E. coli RtcB is also able to ligate cP and 5′-OH, as well as 3′-P and 5′-OH (Tanaka and Shuman, 2011; Tanaka et al., 2011b). In the ligation, cP is first converted to 3′-P, then ligated to 5′-OH (Tanaka et al., 2011a). E. coli RtcB can catalyze the religation of 16S rRNA at the site cleaved by stress-induced MazF activity, which generates full-length 16S rRNA and contributes to restoration from the stress conditions (Temmel et al., 2017).
Besides influencing interaction and activity of proteins, cP formation may play a role in stabilizing RNA molecules by protecting them from degradation. Ehrlich exoribonuclease extracted from Ehrlich ascites cells and various mouse tissues, later defined as exoribonuclease II, was shown to degrade single-stranded RNAs with 3′-OH ends more rapidly than those with cP and 3′-P ends (Sporn et al., 1969), suggesting that cP formation is advantageous for RNA molecules to exist stably in cells. In contrast, because RNAs with 3′-P ends are more rapidly degraded by the exosome complex exoribonuclease, Rrp44, than those with 3′-OH ends (Zinder et al., 2016), cP formation could also negatively impact the stability of cP-RNAs. Thus, a cP structure might be able to regulate RNA stability in both directions by affecting degradation activity of nucleases or modulating RNA-protein interactions. Further study is required to shed more light on the potential function of cP formation in RNA stability.
The above-described advantages of cP formation may, in turn, suggest the biological significance of cellular cP-RNA expression. While the functional significance of U6 snRNA, which is itself a cP-RNA, and of tRNAs and rRNAs, whose biogenesis proceeds through cP-RNA intermediates, has been apparent for a long time, previously uncharacterized cP-RNAs are now being demonstrated to be functional molecules that play important roles in various biological processes. Representative examples of such functional cP-RNAs include the 5′-tRNA half molecules. In mammalian cells, various stress stimuli trigger ANG-mediated tRNA cleavage to produce functional tRNA halves, termed tRNA-derived stress-induced RNAs (tiRNAs) (Fu et al., 2009; Yamasaki et al., 2009) (Figure 3B). 5′-tiRNAs, corresponding to 5′-tRNA halves, have been shown to be functional molecules that can promote formation of stress granules and regulate translation via a YB-1 protein-involved pathway (Emara et al., 2010; Ivanov et al., 2011, 2014; Lyons et al., 2016).
ANG-mediated tRNA cleavage is also promoted by sex hormone signaling pathways in hormone-dependent breast and prostate cancer cells, generating a distinct class of tRNA halves termed sex hormone-dependent tRNA-derived RNAs (SHOT-RNAs) (Figure 3B). 5′-SHOT-RNAs, which are cP-RNAs, promote cell proliferation. The expression levels of SHOT-RNAs in tissues and serum of prostate cancer patients have been shown to be associated with pathological and prognostic parameters, suggesting the use of SHOT-RNAs as potential diagnostic biomarkers (Zhao et al., 2018). In terms of diseases, many different ANG gene mutations have been identified in patients with amyotrophic lateral sclerosis (ALS) and Parkinson's disease (Tello-Montoliu et al., 2006; Gao and Xu, 2008), implying that ANG-catalyzed production of tRNA halves could be involved in the pathogenesis of these neurodegenerative disorders (Thiyagarajan et al., 2012). Indeed, accumulation of tRNA halves contributes to the pathogenesis of a syndromic form of intellectual disability and Dubowitz-like syndrome (Blanco et al., 2014).
cP-RNAs can also function as direct precursors for shorter functional RNAs. In B. mori germ cells, some abundant species of Piwi-interacting RNAs (piRNAs), a germline-specific class of small regulatory RNAs, are produced directly from cP-containing 5′-tRNA halves (Honda et al., 2017) (Figure 3G). Although many microRNAs (miRNAs) are derived from tRNAs (Shigematsu and Kirino, 2015; Telonis et al., 2015), whether the tRNA-derived miRNAs are also generated from cP-containing tRNA halves has not been examined yet. Further research may reveal more evidence of cP-RNAs serving as direct precursors for functional RNAs.

SPECIFIC SEQUENCING AND QUANTIFICATION OF cP-RNAs
To further expand cP-RNA research, it is imperative to capture cP-RNA expression profiles accurately, which is not possible using standard RNA-seq methods. Specific cP-RNA sequencing can be achieved by cP-RNA-seq, which takes advantage of distinct properties of two widely used enzymes, T4 polynucleotide kinase (T4 PNK) and a phosphatase such as calf intestinal phosphatase (CIP). T4 PNK has 3′-terminal phosphatase activity that removes both a P and a cP from the 3′-end of RNAs (Amitsur et al., 1987; Das and Shuman, 2013b), whereas CIP removes only a P but not a cP. In cP-RNA-seq, RNAs are first treated with CIP to remove a P, followed by periodate oxidation. Because the oxidation cleaves the 3′-end of all RNAs other than cP-RNAs, subsequent cP removal, adapter ligation, and cDNA amplification steps are exclusively applied to cP-RNAs, leading to selective amplification and sequencing of cP-RNAs (Figure 4A). An advantage of cP-RNA-seq is that it requires only commercially available enzymes and reagents. As a limitation of the method, RNAs lacking a 2′,3′-diol structure of ribose, such as plant miRNAs and animal piRNAs that contain a 2′-O-methyl ribose modification (Yang et al., 2006; Kirino and Mourelatos, 2007; Ohara et al., 2007), can also be amplified despite the absence of a cP, because those RNAs would be resistant to periodate oxidation. This point should be kept in mind, especially when 20-30-nt small RNAs are used for the method. Thus far, cP-RNA-seq has been applied to only two cell lines, human BT-474 breast cancer cells and B. mori BmN4 cells (Honda et al., 2017). Although the high mapping ratio of the obtained reads to tRNA sequences showed the specificity and credibility of the method, both studies focused narrowly on a short RNA fraction containing tRNA halves. Further application of the method to broader RNA populations will enable more global identification of cP-RNA species.
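The selection logic of cP-RNA-seq described above can be sketched as a small Python model that tracks the 3′-terminal state of an RNA through the CIP, periodate, and T4 PNK steps. This is an illustrative simplification rather than a protocol: the names (`End`, `cip`, `periodate`, `t4_pnk`, `captured`) are hypothetical, every reaction is assumed to go to completion, and the oxidized terminus is treated simply as unligatable.

```python
from enum import Enum

class End(Enum):
    OH = "2',3'-diol (3'-OH)"
    P = "3'-phosphate"
    CP = "2',3'-cyclic phosphate"
    OME = "2'-O-methyl ribose with 3'-OH"
    OXIDIZED = "periodate-oxidized, unligatable"

def cip(end: End) -> End:
    # CIP removes a 3'-P but leaves a cP untouched.
    return End.OH if end is End.P else end

def periodate(end: End) -> End:
    # Periodate attacks only a free 2',3'-diol; cP and 2'-O-methyl ends resist.
    return End.OXIDIZED if end is End.OH else end

def t4_pnk(end: End) -> End:
    # T4 PNK 3'-phosphatase activity removes both P and cP.
    return End.OH if end in (End.P, End.CP) else end

def captured(end: End) -> bool:
    """True if the RNA can still accept a 3'-adapter after all three steps."""
    for step in (cip, periodate, t4_pnk):
        end = step(end)
    return end in (End.OH, End.OME)

starting_ends = [End.OH, End.P, End.CP, End.OME]
selected = [e.name for e in starting_ends if captured(e)]
```

Under these assumptions, `selected` contains the cP end and also the 2′-O-methylated end, which mirrors the limitation noted above for plant miRNAs and animal piRNAs.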
As an alternative method, Arabidopsis tRNA ligase AtRNL can be used for specific cP-RNA sequencing (Schutz et al., 2010). Because its ligation activity is specific to a cP but not to a 3′-P and 3′-OH, AtRNL selectively ligates a 3′-adapter to cP-RNAs among all RNA species. After the ligation, for efficient reverse transcription, a 2′-P formed at the substrate-adapter junction should be removed by 2′-phosphotransferase treatment. Therefore, two specific recombinant proteins, AtRNL and Saccharomyces cerevisiae 2′-phosphotransferase Tpt1, were purified and used in the method (Schutz et al., 2010). Application of the method to human brain total RNA identified numerous reads of cP-RNAs, including U6 snRNA. The 3′-ends of ∼90% of the U6 snRNA reads were identified as a consistent, mature form, validating the specificity and credibility of the method. Considering its ligation activity for cP-RNAs, RtcB can also be used for cP-RNA sequencing (Donovan et al., 2017). Because RtcB can ligate 3′-P ends, as well as cP ends, a phosphatase treatment to remove 3′-P prior to RtcB-mediated 3′-adapter ligation would be required for specific capture of cP-RNAs.
After cP-RNA sequencing, amplification and quantification of the representative identified cP-RNA species are necessary to validate their expression and analyze whether a cP end is the major 3′-end form of the identified sequences. Standard RT-qPCR, amplifying internal sequences of targeted RNAs, is inappropriate for specific amplification of cP-RNAs because it cannot distinguish between cP-RNAs and RNAs with other terminal states. To specifically analyze cP-RNAs, RNAs treated with T4 PNK or CIP can be subjected to 3′-adapter ligation, followed by TaqMan RT-qPCR targeting 3′-adapter-RNA ligation products (Honda et al., 2017; Shigematsu et al., 2018; Zhao et al., 2018) (Figure 4B). The dependency of amplification signals on RNA treatment with T4 PNK, but not with CIP, allows researchers to confirm that the detected signals are derived from cP-RNAs because they should be ligated to a 3′-adapter only after cP removal by T4 PNK treatment. As an alternative method for analyzing cP ends, T4 PNK- or CIP-treated RNAs can be subjected to a poly(A) polymerase reaction, which is able to add poly(A) tails to 3′-OH ends, but not to cP ends (Zaug et al., 1996). Moreover, northern blot can be used to observe slight differences in band mobility between cP-RNAs and RNAs with other terminal states.
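The treatment-dependency reasoning above can be written out as a small Python sketch. It assumes idealized, all-or-nothing ligation signals and adds an untreated control, which is an assumption of this example rather than something stated in the text; the function names (`expected_signal`, `infer_end`) are hypothetical. Real RT-qPCR data are quantitative and usually reflect mixtures of terminal states.

```python
TREATMENTS = ("none", "CIP", "T4 PNK")

def expected_signal(end: str, treatment: str) -> bool:
    """Predict whether a 3'-adapter ligation product (and so a signal) forms.

    end is one of "OH", "P", "cP"; ligation is assumed to require a 3'-OH.
    """
    if treatment == "CIP" and end == "P":
        end = "OH"              # CIP removes a 3'-P only
    elif treatment == "T4 PNK" and end in ("P", "cP"):
        end = "OH"              # T4 PNK removes both P and cP
    return end == "OH"

def infer_end(signals: dict[str, bool]) -> str:
    """Map an idealized signal profile back to the most likely 3'-end state."""
    profile = tuple(signals[t] for t in TREATMENTS)
    table = {
        (True, True, True): "3'-OH",
        (False, True, True): "3'-P",
        (False, False, True): "cP",
    }
    return table.get(profile, "inconclusive")

# Round trip: simulate each end state, then infer it from the profile.
inferred = {
    end: infer_end({t: expected_signal(end, t) for t in TREATMENTS})
    for end in ("OH", "P", "cP")
}
```

A signal that appears only after T4 PNK treatment, and not after CIP treatment, is the profile assigned to a cP end in this model.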

FUTURE PERSPECTIVES
Despite the findings described in this review, current information regarding cellular expression profiles of cP-RNAs is very limited and fragmented. Although the increasing accumulation of RNA-seq data has accelerated comparative analyses of transcriptomes and has therefore been critical to identifying significant RNA species in biological phenomena and diseases, the "invisibility" of cP-RNA expression in RNA-seq data has kept cP-RNA research at an initial stage. The immediate future focus should be on capturing the comprehensive repertoire of cP-RNAs expressed in different tissues and cells by using the above-described specific sequencing methods. Given that cP-RNAs are expressed as functional molecules, capturing the entire cP-RNA repertoire would broaden the catalog of functional non-coding RNAs and could reveal significant biological events that have been eluding standard RNA-seq. Besides cP-RNA expression, the molecular mechanisms behind cP-RNA biogenesis and function remain elusive. Presumably, not all cP-RNA-producing enzymes have been identified and characterized to date. Because identifying cP-RNA-generating enzymes solely from their amino acid sequences and protein motifs is impossible, discovering novel cP-RNA-generating enzymes will rely on detailed structural and biochemical characterization of each enzyme. Given the already-proven biological roles of cP formation and cP-RNA expression, it is not surprising that cP-RNAs are involved in a wide range of biological processes. Considering the "hidden" nature of cP-RNAs in conventional RNA-seq data, further research efforts to characterize cP-RNAs would likely reveal substantially greater biological significance of cP-RNAs, which will advance our understanding of the expanding realm of non-coding RNA molecules.

AUTHOR CONTRIBUTIONS
MS and YK conceptualized the theme and wrote the review with substantial help from TK in compiling reference papers. All authors reviewed and approved the final manuscript.

FUNDING
Work in the lab on this topic has been supported by the National Institutes of Health Grant (GM106047 and AI130496 to YK), American Cancer Society Research Scholar Grant (RSG-17-059-01-RMC to YK), the W. W. Smith Charitable Trust Grant (C1608 to YK), and a Japan Society for the Promotion of Science Postdoctoral Fellowship for Research Abroad (to MS).